@brief Neural Network: Self-Organized Feature Map
@details Implementation based on paper @cite article::nnet::som::1, @cite article::nnet::som::2.

@authors Andrei Novikov (pyclustering@yandex.ru)
@copyright GNU Public License

@cond GNU_PUBLIC_LICENSE
    PyClustering is free software: you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation, either version 3 of the License, or
    (at your option) any later version.

    PyClustering is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU General Public License for more details.

    You should have received a copy of the GNU General Public License
    along with this program.  If not, see <http://www.gnu.org/licenses/>.
@endcond

import math
import random
import warnings

try:
    import matplotlib.pyplot as plt
except Exception as error_instance:
    warnings.warn("Impossible to import matplotlib (please, install 'matplotlib'), pyclustering's visualization "
                  "functionality is not available (details: '%s')." % str(error_instance))

import pyclustering.core.som_wrapper as wrapper

from pyclustering.core.wrapper import ccore_library

from pyclustering.utils import euclidean_distance_square
from pyclustering.utils.dimension import dimension_info

from enum import IntEnum
    @brief Enumeration of connection types for SOM.

    @brief Enumeration of initialization types for SOM.

    @brief Represents SOM parameters.

        @brief Creates SOM parameters.

    @brief Represents self-organized feature map (SOM).
    @details The self-organizing feature map (SOM) method is a powerful tool for the visualization of
              high-dimensional data. It converts complex, nonlinear statistical relationships between
              high-dimensional data into simple geometric relationships on a low-dimensional display.

    @details The `ccore` option controls whether the C++ implementation of the pyclustering library is used. By
              default the C++ implementation is on. The C++ implementation improves performance of the
              self-organized feature map.

        import random

        from pyclustering.utils import read_sample
        from pyclustering.nnet.som import som, type_conn, type_init, som_parameters
        from pyclustering.samples.definitions import FCPS_SAMPLES

        # read sample 'Lsun' from file
        sample = read_sample(FCPS_SAMPLES.SAMPLE_LSUN)

        # create SOM parameters
        parameters = som_parameters()

        # create self-organized feature map with size 10x10
        rows = 10  # ten rows
        cols = 10  # ten columns
        structure = type_conn.grid_four  # each neuron has at most four neighbors
        network = som(rows, cols, structure, parameters)

        # train network on 'Lsun' sample for 100 epochs
        network.train(sample, 100)

        # simulate trained network using a randomly modified point from the input dataset
        index_point = random.randint(0, len(sample) - 1)
        point = sample[index_point]  # pick a random point from the data
        point[0] += random.random() * 0.2  # randomly shift the X-coordinate
        point[1] += random.random() * 0.2  # randomly shift the Y-coordinate
        index_winner = network.simulate(point)

        # check which objects from the input data are closest to the randomly modified point
        index_similar_objects = network.capture_objects[index_winner]

        # the neuron stores the indexes of the objects it encodes
        print("Point '%s' is similar to objects with indexes '%s'." % (str(point), str(index_similar_objects)))
        print("Coordinates of similar objects:")
        for index in index_similar_objects:
            print("\tPoint:", sample[index])

        # result visualization:
        # show distance matrix (U-matrix).
        network.show_distance_matrix()

        # show density matrix (P-matrix).
        network.show_density_matrix()

        # show winner matrix.
        network.show_winner_matrix()

        # show self-organized map.
        network.show_network()

    There is a visualization of 'Target' sample that was done by the self-organized feature map:
    @image html target_som_processing.png

        @brief Return size of self-organized map that is defined by total number of neurons.

        @return (uint) Size of self-organized map (number of neurons).

        @brief Return weight of each neuron.

        @return (list) Weights of each neuron.

        @brief Return number of objects captured by each neuron after training.

        @return (list) Number of objects captured by each neuron.

        @brief Returns indexes of objects captured by each neuron.
        @details For example, a network with size 2x2 has been trained on a sample with five objects. Suppose neuron #1
                  won an object with index `1`, neuron #2 won objects `0`, `3`, `4`, neuron #3 did not win anything and
                  finally neuron #4 won an object with index `2`. Thus, for this example we will have the following
                  output `[[1], [0, 3, 4], [], [2]]`.

        @return (list) Indexes of objects captured by each neuron.

    def __init__(self, rows, cols, conn_type=type_conn.grid_eight, parameters=None, ccore=True):
        @brief Constructor of self-organized map.

        @param[in] rows (uint): Number of neurons in the column (number of rows).
        @param[in] cols (uint): Number of neurons in the row (number of columns).
        @param[in] conn_type (type_conn): Type of connection between neurons in the network (grid four, grid eight, honeycomb, function neighbour).
        @param[in] parameters (som_parameters): Other specific parameters.
        @param[in] ccore (bool): If True, simulation is performed by the CCORE library (C++ implementation of pyclustering).

        self._size = cols * rows

        if self._params.init_radius is None:

        if (ccore is True) and ccore_library.workable():

            if conn_type != type_conn.func_neighbor:

        @brief Destructor of the self-organized feature map.

        @brief Returns size of the network, which is defined by the number of neurons in it.

        @return (uint) Size of self-organized map (number of neurons).

        @brief Returns state of SOM network that can be used to store network.

        @brief Set state of SOM network that can be used to load network.

        if som_state['ccore'] is True and ccore_library.workable():
    def __initialize_initial_radius(self, rows, cols):
        @brief Initialize initial radius using map sizes.

        @param[in] rows (uint): Number of neurons in the column (number of rows).
        @param[in] cols (uint): Number of neurons in the row (number of columns).

        @return (list) Value of initial radius.

        if (cols + rows) / 4.0 > 1.0:

        elif (cols > 1) and (rows > 1):

    def __initialize_locations(self, rows, cols):
        @brief Initialize locations (coordinates in SOM grid) of each neuron in the map.

        @param[in] rows (uint): Number of neurons in the column (number of rows).
        @param[in] cols (uint): Number of neurons in the row (number of columns).

        @return (list) List of coordinates of each neuron in map.

        for i in range(rows):
            for j in range(cols):
                location.append([float(i), float(j)])
    def __initialize_distances(self, size, location):
        @brief Initialize distance matrix in SOM grid.

        @param[in] size (uint): Number of neurons in the network.
        @param[in] location (list): List of coordinates of each neuron in the network.

        @return (list) Distance matrix between neurons in the network.

        sqrt_distances = [[[] for i in range(size)] for j in range(size)]
        for i in range(size):
            for j in range(i, size, 1):
                dist = euclidean_distance_square(location[i], location[j])
                sqrt_distances[i][j] = dist
                sqrt_distances[j][i] = dist

        return sqrt_distances
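The symmetric fill used by `__initialize_distances` can be tried in isolation. The following is a minimal sketch, not the library's code: `squared_distance` is a local stand-in for `euclidean_distance_square` (whose implementation is not shown in this listing), and `grid_locations` and `grid_distances` are hypothetical names chosen for illustration.

```python
def squared_distance(a, b):
    """Stand-in (assumption) for euclidean_distance_square: sum of squared differences."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def grid_locations(rows, cols):
    """Row-major list of [row, col] grid coordinates, one entry per neuron."""
    return [[float(i), float(j)] for i in range(rows) for j in range(cols)]


def grid_distances(location):
    """Squared-distance matrix between neurons; only the upper triangle is computed."""
    size = len(location)
    distances = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(i, size):
            dist = squared_distance(location[i], location[j])
            distances[i][j] = dist
            distances[j][i] = dist  # mirror the value into the lower triangle
    return distances


matrix = grid_distances(grid_locations(2, 2))
```

Because the distance is symmetric, computing `j` from `i` upward roughly halves the number of distance evaluations.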
    def _create_initial_weights(self, init_type):
        @brief Creates initial weights for neurons in line with the specified initialization.

        @param[in] init_type (type_init): Type of initialization of initial neuron weights (random, random in center of the input data, random distributed in data, distributed in line with uniform grid).

        dim_info = dimension_info(self._data)

        step_x = dim_info.get_center()[0]

            step_x = dim_info.get_width()[0] / (self._rows - 1)

        if dim_info.get_dimensions() > 1:
            step_y = dim_info.get_center()[1]

                step_y = dim_info.get_width()[1] / (self._cols - 1)

        random.seed(self._params.random_state)

        if init_type == type_init.uniform_grid:

            self._weights = [[[] for i in range(dim_info.get_dimensions())] for j in range(self._size)]
            for i in range(self._size):

                for dim in range(dim_info.get_dimensions()):

                            self._weights[i][dim] = dim_info.get_minimum_coordinate()[dim] + step_x * location[dim]

                            self._weights[i][dim] = dim_info.get_center()[dim]

                            self._weights[i][dim] = dim_info.get_minimum_coordinate()[dim] + step_y * location[dim]

                            self._weights[i][dim] = dim_info.get_center()[dim]

                        self._weights[i][dim] = dim_info.get_center()[dim]

        elif init_type == type_init.random_surface:

            self._weights = [
                [random.uniform(dim_info.get_minimum_coordinate()[i], dim_info.get_maximum_coordinate()[i]) for i in
                 range(dim_info.get_dimensions())] for _ in range(self._size)]

        elif init_type == type_init.random_centroid:

            self._weights = [[(random.random() + dim_info.get_center()[i]) for i in range(dim_info.get_dimensions())]
                             for _ in range(self._size)]

            self._weights = [[random.random() for i in range(dim_info.get_dimensions())] for _ in range(self._size)]
    def _create_connections(self, conn_type):
        @brief Create connections in line with input rule (grid four, grid eight, honeycomb, function neighbour).

        @param[in] conn_type (type_conn): Type of connection between neurons in the network.

        for index in range(0, self._size, 1):
            upper_index = index - self._cols
            upper_left_index = index - self._cols - 1
            upper_right_index = index - self._cols + 1

            lower_index = index + self._cols
            lower_left_index = index + self._cols - 1
            lower_right_index = index + self._cols + 1

            left_index = index - 1
            right_index = index + 1

            node_row_index = math.floor(index / self._cols)
            upper_row_index = node_row_index - 1
            lower_row_index = node_row_index + 1

            if (conn_type == type_conn.grid_eight) or (conn_type == type_conn.grid_four):
                if upper_index >= 0:
                    self._neighbors[index].append(upper_index)

                if lower_index < self._size:
                    self._neighbors[index].append(lower_index)

            if (conn_type == type_conn.grid_eight) or (conn_type == type_conn.grid_four) or (
                    conn_type == type_conn.honeycomb):
                if (left_index >= 0) and (math.floor(left_index / self._cols) == node_row_index):
                    self._neighbors[index].append(left_index)

                if (right_index < self._size) and (math.floor(right_index / self._cols) == node_row_index):
                    self._neighbors[index].append(right_index)

            if conn_type == type_conn.grid_eight:
                if (upper_left_index >= 0) and (math.floor(upper_left_index / self._cols) == upper_row_index):
                    self._neighbors[index].append(upper_left_index)

                if (upper_right_index >= 0) and (math.floor(upper_right_index / self._cols) == upper_row_index):
                    self._neighbors[index].append(upper_right_index)

                if (lower_left_index < self._size) and (math.floor(lower_left_index / self._cols) == lower_row_index):
                    self._neighbors[index].append(lower_left_index)

                if (lower_right_index < self._size) and (math.floor(lower_right_index / self._cols) == lower_row_index):
                    self._neighbors[index].append(lower_right_index)

            if conn_type == type_conn.honeycomb:
                if (node_row_index % 2) == 0:
                    upper_left_index = index - self._cols
                    upper_right_index = index - self._cols + 1

                    lower_left_index = index + self._cols
                    lower_right_index = index + self._cols + 1
                else:
                    upper_left_index = index - self._cols - 1
                    upper_right_index = index - self._cols

                    lower_left_index = index + self._cols - 1
                    lower_right_index = index + self._cols

                if (upper_left_index >= 0) and (math.floor(upper_left_index / self._cols) == upper_row_index):
                    self._neighbors[index].append(upper_left_index)

                if (upper_right_index >= 0) and (math.floor(upper_right_index / self._cols) == upper_row_index):
                    self._neighbors[index].append(upper_right_index)

                if (lower_left_index < self._size) and (math.floor(lower_left_index / self._cols) == lower_row_index):
                    self._neighbors[index].append(lower_left_index)

                if (lower_right_index < self._size) and (math.floor(lower_right_index / self._cols) == lower_row_index):
                    self._neighbors[index].append(lower_right_index)
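The grid-four and grid-eight rules above can be reproduced with explicit (row, column) bounds checks. This sketch swaps the listing's flat-index arithmetic and `math.floor` row tests for `divmod` and offset pairs; `grid_neighbors` and its `diagonal` flag are hypothetical names, and the honeycomb and function-neighbor types are not covered.

```python
def grid_neighbors(rows, cols, diagonal=False):
    """Neighbor index lists for a row-major rows x cols grid.

    diagonal=False mimics a four-neighbor grid, diagonal=True an eight-neighbor grid.
    """
    offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if diagonal:
        offsets += [(-1, -1), (-1, 1), (1, -1), (1, 1)]

    neighbors = [[] for _ in range(rows * cols)]
    for index in range(rows * cols):
        row, col = divmod(index, cols)
        for d_row, d_col in offsets:
            n_row, n_col = row + d_row, col + d_col
            # keep only offsets that stay inside the grid
            if 0 <= n_row < rows and 0 <= n_col < cols:
                neighbors[index].append(n_row * cols + n_col)
    return neighbors


four = grid_neighbors(3, 3)
eight = grid_neighbors(3, 3, diagonal=True)
```

Checking the row and column separately serves the same purpose as the listing's row-index comparisons: a neuron at the end of one row must not be linked to the first neuron of the next row.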
    def _competition(self, x):
        @brief Calculates neuron winner (distance, neuron index).

        @param[in] x (list): Input pattern from the input data set, for example, the coordinates of a point.

        @return (uint) Returns index of neuron that is winner.

        index = 0
        minimum = euclidean_distance_square(self._weights[0], x)

        for i in range(1, self._size, 1):
            candidate = euclidean_distance_square(self._weights[i], x)
            if candidate < minimum:
                index = i
                minimum = candidate

        return index
    def _adaptation(self, index, x):
        @brief Changes weights of neurons in line with the winning neuron.

        @param[in] index (uint): Index of neuron-winner.
        @param[in] x (list): Input pattern from the input data set.

        if self._conn_type == type_conn.func_neighbor:
            for neuron_index in range(self._size):

                    influence = math.exp(-(distance / (2.0 * self._local_radius)))

                    for i in range(dimension):

                                x[i] - self._weights[neuron_index][i])

            for i in range(dimension):

                    influence = math.exp(-(distance / (2.0 * self._local_radius)))

                    for i in range(dimension):

                                x[i] - self._weights[neighbor_index][i])
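The influence term `exp(-(distance / (2.0 * radius)))` in `_adaptation` weights how strongly a neuron follows the winner. The sketch below assumes the conventional SOM update `w += learn_rate * influence * (x - w)`; the full update line is not visible in this listing, so the `learn_rate` factor and the function name `adapt` are assumptions made for illustration.

```python
import math


def adapt(weights, grid_sq_distances, winner, x, learn_rate, radius):
    """Move every neuron toward x, scaled by its squared grid distance to the winner."""
    for neuron, weight in enumerate(weights):
        # influence is 1.0 for the winner itself and decays with grid distance
        influence = math.exp(-(grid_sq_distances[winner][neuron] / (2.0 * radius)))
        for i in range(len(x)):
            weight[i] += learn_rate * influence * (x[i] - weight[i])
    return weights


weights = [[0.0, 0.0], [1.0, 1.0]]
grid_sq_distances = [[0.0, 1.0], [1.0, 0.0]]
adapt(weights, grid_sq_distances, winner=0, x=[1.0, 0.0], learn_rate=0.5, radius=1.0)
```

Here the winner moves halfway toward the input, while its neighbor at squared grid distance 1 moves by a factor of `exp(-0.5)` less.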
    def train(self, data, epochs, autostop=False):
        @brief Trains self-organized feature map (SOM).

        @param[in] data (list): Input data - list of points where each point is represented by a list of features, for example, coordinates.
        @param[in] epochs (uint): Number of epochs for training.
        @param[in] autostop (bool): Automatic termination of the learning process when adaptation no longer occurs.

        @return (uint) Number of learning iterations.

        for i in range(self._size):

        previous_weights = None

        for epoch in range(1, epochs + 1):

            for i in range(self._size):

            for i in range(len(self._data)):

                if (autostop is True) or (epoch == epochs):

                if previous_weights is not None:

                    if maximal_adaptation < self._params.adaptation_threshold:

                previous_weights = [item[:] for item in self._weights]
        @brief Processes input pattern (no learning) and returns index of neuron-winner.
                Using the index of the neuron-winner, captured objects can be obtained via the property capture_objects.

        @param[in] input_pattern (list): Input pattern.

        @return (uint) Returns index of neuron-winner.

    def _get_maximal_adaptation(self, previous_weights):
        @brief Calculates the maximum change of weights by comparing previous weights with current weights.

        @param[in] previous_weights (list): Weights from the previous step of learning process.

        @return (double) Value that represents the maximum change of weights after the adaptation process.

        dimension = len(self._data[0])
        maximal_adaptation = 0.0

        for neuron_index in range(self._size):
            for dim in range(dimension):
                current_adaptation = previous_weights[neuron_index][dim] - self._weights[neuron_index][dim]

                if current_adaptation < 0:
                    current_adaptation = -current_adaptation

                if maximal_adaptation < current_adaptation:
                    maximal_adaptation = current_adaptation

        return maximal_adaptation
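The same quantity, the largest absolute per-coordinate weight change, can be written as a single `max` over paired weights. This is an equivalent standalone sketch rather than the library's method; the free function `maximal_adaptation` and the threshold value `0.5` are hypothetical.

```python
def maximal_adaptation(previous_weights, current_weights):
    """Largest absolute per-coordinate change between two weight snapshots."""
    return max(
        (abs(prev - cur)
         for prev_row, cur_row in zip(previous_weights, current_weights)
         for prev, cur in zip(prev_row, cur_row)),
        default=0.0,  # no neurons means no change
    )


previous = [[0.0, 0.0], [1.0, 1.0]]
current = [[0.1, -0.3], [1.0, 1.2]]
change = maximal_adaptation(previous, current)

# with autostop, training would end once the change drops below a threshold
should_stop = change < 0.5
```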
        @brief Calculates the number of winners at the last step of the learning process.

        @return (uint) Number of winners.

        for i in range(self._size):

        @brief Shows visualization of U-matrix (distance matrix) using the 'hot' colormap.

        @see get_distance_matrix()

        plt.imshow(distance_matrix, cmap=plt.get_cmap('hot'), interpolation='kaiser')
        plt.title("U-Matrix")
        @brief Calculates distance matrix (U-matrix).
        @details The U-matrix visualizes the distance in input space between a weight vector and its neighbors on the map.

        @return (list) Distance matrix (U-matrix).

        @see show_distance_matrix()
        @see get_density_matrix()

        if self._conn_type != type_conn.func_neighbor:

        distance_matrix = [[0.0] * self._cols for i in range(self._rows)]

        for i in range(self._rows):
            for j in range(self._cols):
                neuron_index = i * self._cols + j

                if self._conn_type == type_conn.func_neighbor:

                for neighbor_index in self._neighbors[neuron_index]:
                    distance_matrix[i][j] += euclidean_distance_square(self._weights[neuron_index],
                                                                       self._weights[neighbor_index])

                distance_matrix[i][j] /= len(self._neighbors[neuron_index])

        return distance_matrix
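The averaging step of the U-matrix can be checked on a tiny map. A minimal sketch, assuming a local `squared_distance` stand-in for `euclidean_distance_square` and a hypothetical free function `u_matrix`; unlike the listing, it leaves the cell at 0.0 when a neuron has no neighbors instead of dividing by zero.

```python
def squared_distance(a, b):
    """Stand-in (assumption) for euclidean_distance_square."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def u_matrix(weights, neighbors, rows, cols):
    """Mean squared distance in input space from each neuron to its grid neighbors."""
    matrix = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            index = i * cols + j
            linked = neighbors[index]
            if not linked:
                continue  # isolated neuron: keep 0.0
            total = sum(squared_distance(weights[index], weights[n]) for n in linked)
            matrix[i][j] = total / len(linked)
    return matrix


# a 1x3 map with one-dimensional weights; the middle neuron has two neighbors
weights = [[0.0], [1.0], [3.0]]
neighbors = [[1], [0, 2], [1]]
result = u_matrix(weights, neighbors, rows=1, cols=3)
```

Large values mark neurons whose weights lie far from their neighbors' weights, which is how cluster borders show up in the plot.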
        @brief Show density matrix (P-matrix) using kernel density estimation.

        @param[in] surface_divider (double): Divider in each dimension that affects the radius for density measurement.

        @see show_distance_matrix()

        plt.imshow(density_matrix, cmap=plt.get_cmap('hot'), interpolation='kaiser')
        plt.title("P-Matrix")

        @brief Calculates density matrix (P-Matrix).

        @param[in] surface_divider (double): Divider in each dimension that affects the radius for density measurement.

        @return (list) Density matrix (P-Matrix).

        @see get_distance_matrix()

        density_matrix = [[0] * self._cols for i in range(self._rows)]

        dim_max = [float('-Inf')] * dimension
        dim_min = [float('Inf')] * dimension

        for weight in self._weights:
            for index_dim in range(dimension):
                if weight[index_dim] > dim_max[index_dim]:
                    dim_max[index_dim] = weight[index_dim]

                if weight[index_dim] < dim_min[index_dim]:
                    dim_min[index_dim] = weight[index_dim]

        radius = [0.0] * len(self._weights[0])
        for index_dim in range(dimension):
            radius[index_dim] = (dim_max[index_dim] - dim_min[index_dim]) / surface_divider

        for point in self._data:
            for index_neuron in range(len(self)):
                point_covered = True

                for index_dim in range(dimension):
                    if abs(point[index_dim] - self._weights[index_neuron][index_dim]) > radius[index_dim]:
                        point_covered = False

                row = int(math.floor(index_neuron / self._cols))
                col = index_neuron - row * self._cols

                if point_covered is True:
                    density_matrix[row][col] += 1

        return density_matrix
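The counting rule behind the P-matrix is a per-dimension box test: a point counts for a neuron when it lies within `radius` of the neuron's weight in every dimension. The sketch below restates that rule as a hypothetical free function `density_matrix` operating on plain lists, so it can run without a trained network.

```python
def density_matrix(weights, data, rows, cols, surface_divider=20.0):
    """Count, for each neuron, the data points inside its per-dimension radius."""
    dimension = len(weights[0])
    dim_max = [max(w[d] for w in weights) for d in range(dimension)]
    dim_min = [min(w[d] for w in weights) for d in range(dimension)]
    # the radius is a fraction of the span covered by the weights in each dimension
    radius = [(dim_max[d] - dim_min[d]) / surface_divider for d in range(dimension)]

    matrix = [[0] * cols for _ in range(rows)]
    for point in data:
        for index, weight in enumerate(weights):
            covered = all(abs(point[d] - weight[d]) <= radius[d] for d in range(dimension))
            if covered:
                row, col = divmod(index, cols)
                matrix[row][col] += 1
    return matrix


# a 1x2 map: weights span [0, 10], so the radius is 10 / 20 = 0.5
weights = [[0.0], [10.0]]
data = [[0.2], [0.4], [9.8], [5.0]]
result = density_matrix(weights, data, rows=1, cols=2)
```

A larger `surface_divider` shrinks the radius, so fewer points are counted per neuron.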
        @brief Show a winner matrix where each element corresponds to a neuron and its value represents
                the number of objects won from the input data space at the last training iteration.

        @see show_distance_matrix()

        (fig, ax) = plt.subplots()
        winner_matrix = [[0] * self._cols for _ in range(self._rows)]

        for i in range(self._rows):
            for j in range(self._cols):
                neuron_index = i * self._cols + j

                winner_matrix[i][j] = self._award[neuron_index]
                # imshow draws columns along x and rows along y, so the label goes at (j, i)
                ax.text(j, i, str(winner_matrix[i][j]), va='center', ha='center')

        ax.imshow(winner_matrix, cmap=plt.get_cmap('cool'), interpolation='none')

        plt.title("Winner Matrix")
    def show_network(self, awards=False, belongs=False, coupling=True, dataset=True, marker_type='o'):
        @brief Shows neurons in the dimension of data.

        @param[in] awards (bool): If True - displays how many objects each neuron won.
        @param[in] belongs (bool): If True - marks each won object with the index of its neuron-winner (only when
                    the dataset is displayed too).
        @param[in] coupling (bool): If True - displays connections between neurons (except the case when function neighbor
                    is used).
        @param[in] dataset (bool): If True - displays the input data set.
        @param[in] marker_type (string): Defines the marker that is used to denote neurons on the plot.

        if (dimension == 1) or (dimension == 2):
            axes = fig.add_subplot(111)

            # fig.gca(projection='3d') is not accepted by recent matplotlib releases
            axes = fig.add_subplot(111, projection='3d')

            raise NotImplementedError('Impossible to show network in data-space that is differ from 1D, 2D or 3D.')

        if (self._data is not None) and (dataset is True):

                    axes.plot(x[0], 0.0, 'b|', ms=30)

                    axes.plot(x[0], x[1], 'b.')

                    axes.scatter(x[0], x[1], x[2], c='b', marker='.')

        for index in range(self._size):

            if self._award[index] == 0:

                axes.plot(self._weights[index][0], 0.0, color + marker_type)

                    location = '{0}'.format(self._award[index])
                    axes.text(self._weights[index][0], 0.0, location, color='black', fontsize=10)

                if belongs and self._data is not None:
                    location = '{0}'.format(index)
                    axes.text(self._weights[index][0], 0.0, location, color='black', fontsize=12)

                        axes.text(point[0], 0.0, location, color='blue', fontsize=10)

                axes.plot(self._weights[index][0], self._weights[index][1], color + marker_type)

                    location = '{0}'.format(self._award[index])
                    axes.text(self._weights[index][0], self._weights[index][1], location, color='black', fontsize=10)

                if belongs and self._data is not None:
                    location = '{0}'.format(index)
                    axes.text(self._weights[index][0], self._weights[index][1], location, color='black', fontsize=12)

                        axes.text(point[0], point[1], location, color='blue', fontsize=10)

                if (self._conn_type != type_conn.func_neighbor) and (coupling is True):

                if (self._conn_type != type_conn.func_neighbor) and (coupling != False):

        plt.title("Network Structure")
    def __get_dump_from_python(self, ccore_usage):
        return {'ccore': ccore_usage,
                'state': {'cols': self._cols,

    def __download_dump_from_ccore(self):

    def __upload_common_part(self, state_dump):
        self._cols = state_dump['cols']
        self._rows = state_dump['rows']
        self._size = state_dump['size']

        self._params = state_dump['params']

    def __upload_dump_to_python(self, state_dump):

        self._weights = state_dump['weights']
        self._award = state_dump['award']

    def __upload_dump_to_ccore(self, state_dump):

                                  state_dump['capture_objects'])
init_learn_rate
Rate of learning.
init_type
Defines an initialization way for neuron weights (random, random in center of the input data...
adaptation_threshold
Condition that defines when the learning process should be stopped.
random_state
Seed for random state (by default is None, current system time is used).
init_radius
Initial radius.