"""!

@brief Neural Network: Self-Organized Feature Map
@details Implementation based on papers @cite article::nnet::som::1, @cite article::nnet::som::2.

@authors Andrei Novikov (pyclustering@yandex.ru)

@copyright GNU Public License

@cond GNU_PUBLIC_LICENSE
    PyClustering is free software: you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation, either version 3 of the License, or
    (at your option) any later version.

    PyClustering is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
    GNU General Public License for more details.

    You should have received a copy of the GNU General Public License
    along with this program. If not, see <http://www.gnu.org/licenses/>.
@endcond

"""


import math
import random
import warnings

try:
    import matplotlib.pyplot as plt
except Exception as error_instance:
    warnings.warn("Impossible to import matplotlib (please, install 'matplotlib'), pyclustering's visualization "
                  "functionality is not available (details: '%s')." % str(error_instance))

import pyclustering.core.som_wrapper as wrapper

from pyclustering.core.wrapper import ccore_library

from pyclustering.utils import euclidean_distance_square
from pyclustering.utils.dimension import dimension_info

from enum import IntEnum


class type_conn(IntEnum):
    """!
    @brief Enumeration of connection types for SOM.
    """

    ## Each neuron is connected with its four nearest neighbors: upper, lower, left and right.
    grid_four = 0

    ## Each neuron is connected with its eight nearest neighbors: the four of 'grid_four' plus the diagonal neighbors.
    grid_eight = 1

    ## Each neuron is connected with its six nearest neighbors (honeycomb structure).
    honeycomb = 2

    ## Neurons are not connected explicitly; the neighborhood is defined by a function of the distance on the map.
    func_neighbor = 3


class type_init(IntEnum):
    """!
    @brief Enumeration of initialization types for SOM.
    """

    ## Neurons are initialized by random values.
    random = 0

    ## Neurons are initialized by random values near the center of the input data.
    random_centroid = 1

    ## Neurons are initialized by random values distributed over the surface covered by the input data.
    random_surface = 2

    ## Neurons are distributed over the input data in line with a uniform grid.
    uniform_grid = 3


class som_parameters:
    """!
    @brief Represents SOM parameters.
    """

    def __init__(self):
        """!
        @brief Constructor container of SOM parameters.
        """

        ## Type of initialization of initial neuron weights (random, random in center of the input data, random distributed in data, distributed in line with uniform grid).
        self.init_type = type_init.uniform_grid

        ## Initial radius (if not specified then it will be calculated by SOM).
        self.init_radius = None

        ## Rate of learning.
        self.init_learn_rate = 0.1

        ## Condition that defines when the learning process should be stopped (used only when autostop is enabled).
        self.adaptation_threshold = 0.001


class som:
    """!
    @brief Represents a self-organized feature map (SOM).
    @details The self-organizing feature map (SOM) method is a powerful tool for the visualization of
              high-dimensional data. It converts complex, nonlinear statistical relationships between
              high-dimensional data items into simple geometric relationships on a low-dimensional display.

              The CCORE option can be used to delegate processing to the pyclustering core - a C/C++ shared
              library that significantly increases performance.

    Example:
    @code
        import random

        from pyclustering.utils import read_sample
        from pyclustering.nnet.som import som, type_conn, type_init, som_parameters
        from pyclustering.samples.definitions import FCPS_SAMPLES

        # read sample 'Lsun' from file
        sample = read_sample(FCPS_SAMPLES.SAMPLE_LSUN)

        # create SOM parameters
        parameters = som_parameters()

        # create self-organized feature map with size 10x10
        rows = 10  # ten rows
        cols = 10  # ten columns
        structure = type_conn.grid_four  # each neuron has at most four neighbors
        network = som(rows, cols, structure, parameters)

        # train network on the 'Lsun' sample during 100 epochs
        network.train(sample, 100)

        # simulate trained network using a randomly modified point from the input dataset
        index_point = random.randint(0, len(sample) - 1)
        point = sample[index_point]        # obtain a random point from the data
        point[0] += random.random() * 0.2  # randomly change X-coordinate
        point[1] += random.random() * 0.2  # randomly change Y-coordinate
        index_winner = network.simulate(point)

        # check which objects from the input data are closest to the randomly modified point
        index_similar_objects = network.capture_objects[index_winner]

        # the winner neuron contains information about the encoded objects
        print("Point '%s' is similar to objects with indexes '%s'." % (str(point), str(index_similar_objects)))
        print("Coordinates of similar objects:")
        for index in index_similar_objects: print("\tPoint:", sample[index])

        # result visualization:
        # show distance matrix (U-matrix)
        network.show_distance_matrix()

        # show density matrix (P-matrix)
        network.show_density_matrix()

        # show winner matrix
        network.show_winner_matrix()

        # show self-organized map
        network.show_network()
    @endcode

    There is a visualization of the 'Target' sample that was produced by the self-organized feature map:
    @image html target_som_processing.png

    """

    @property
    def size(self):
        """!
        @brief Return size of self-organized map that is defined by total number of neurons.

        @return (uint) Size of self-organized map (number of neurons).

        """
        return self._size

    @property
    def weights(self):
        """!
        @brief Return weight of each neuron.

        @return (list) Weights of each neuron.

        """
        return self._weights

    @property
    def awards(self):
        """!
        @brief Return amount of captured objects by each neuron after training.

        @return (list) Amount of captured objects by each neuron.

        """
        return self._award

    @property
    def capture_objects(self):
        """!
        @brief Returns indexes of captured objects by each neuron.
        @details For example, suppose a network with size 2x2 has been trained on a sample with 5 objects: neuron #1 won the object
                  with index '1', neuron #2 won the objects with indexes '0', '3', '4', neuron #3 won nothing and neuron #4 won the
                  object with index '2'. Then the output is [ [1], [0, 3, 4], [], [2] ].

        @return (list) Indexes of captured objects by each neuron.

        """
        return self._capture_objects
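
    # Configuration sketch (illustrative only; `sample` stands for any input dataset): the
    # training behaviour is tuned through som_parameters before the network is created, and
    # the properties above expose the training results.
    #
    #   parameters = som_parameters()
    #   parameters.init_type = type_init.uniform_grid   # initial placement of the weights
    #   parameters.init_learn_rate = 0.1                # initial learning rate
    #   parameters.adaptation_threshold = 0.001         # used only together with autostop
    #
    #   network = som(5, 5, type_conn.grid_four, parameters, ccore=False)
    #   network.train(sample, 100)
    #   print(network.awards)           # number of objects captured by each neuron
    #   print(network.capture_objects)  # indexes of the captured objects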

    def __init__(self, rows, cols, conn_type=type_conn.grid_eight, parameters=None, ccore=True):
        """!
        @brief Constructor of self-organized map.

        @param[in] rows (uint): Number of neurons in the column (number of rows).
        @param[in] cols (uint): Number of neurons in the row (number of columns).
        @param[in] conn_type (type_conn): Type of connection between oscillators in the network (grid four, grid eight, honeycomb, function neighbour).
        @param[in] parameters (som_parameters): Other specific parameters.
        @param[in] ccore (bool): If True, simulation is performed by the CCORE library (C++ implementation of pyclustering).

        """

        self._rows = rows
        self._cols = cols
        self._size = cols * rows

        if parameters is not None:
            self._params = parameters
        else:
            self._params = som_parameters()

        if self._params.init_radius is None:
            self._params.init_radius = self.__initialize_initial_radius(rows, cols)

        if ccore and ccore_library.workable():
            ...
        else:
            self._location = self.__initialize_locations(rows, cols)
            self._award = [0] * self._size
            self._capture_objects = [[] for _ in range(self._size)]

            if conn_type != type_conn.func_neighbor:
                self._create_connections(conn_type)

    def __del__(self):
        """!
        @brief Destructor of the self-organized feature map.

        """
        ...

    def __len__(self):
        """!
        @brief Returns size of the network that is defined by the amount of neurons in it.

        @return (uint) Size of self-organized map (amount of neurons).

        """
        return self._size

    def __getstate__(self):
        """!
        @brief Returns state of the SOM network that can be used to store the network.

        """
        ...

    def __setstate__(self, som_state):
        """!
        @brief Sets state of the SOM network that can be used to load a stored network.

        """
        if som_state['ccore'] and ccore_library.workable():
            ...

    def __initialize_initial_radius(self, rows, cols):
        """!
        @brief Initialize initial radius using map sizes.

        @param[in] rows (uint): Number of neurons in the column (number of rows).
        @param[in] cols (uint): Number of neurons in the row (number of columns).

        @return (double) Value of initial radius.

        """

        if (cols + rows) / 4.0 > 1.0:
            return 2.0

        elif (cols > 1) and (rows > 1):
            return 1.5

        else:
            return 1.0
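
    # Worked example of the rule above: for a 10x10 map, (cols + rows) / 4.0 = 5.0 > 1.0, so
    # the initial radius is 2.0; a 2x2 map falls into the elif branch and gets 1.5; a map with
    # a single row or a single column gets the fallback value 1.0.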

    def __initialize_locations(self, rows, cols):
        """!
        @brief Initialize locations (coordinates in the SOM grid) of each neuron in the map.

        @param[in] rows (uint): Number of neurons in the column (number of rows).
        @param[in] cols (uint): Number of neurons in the row (number of columns).

        @return (list) List of coordinates of each neuron in the map.

        """

        location = list()
        for i in range(rows):
            for j in range(cols):
                location.append([float(i), float(j)])

        return location

    def __initialize_distances(self, size, location):
        """!
        @brief Initialize distance matrix in the SOM grid.

        @param[in] size (uint): Amount of neurons in the network.
        @param[in] location (list): List of coordinates of each neuron in the network.

        @return (list) Distance matrix between neurons in the network.

        """

        sqrt_distances = [[[] for i in range(size)] for j in range(size)]
        for i in range(size):
            for j in range(i, size, 1):
                dist = euclidean_distance_square(location[i], location[j])
                sqrt_distances[i][j] = dist
                sqrt_distances[j][i] = dist

        return sqrt_distances
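
    # Illustration of the two helpers above for a 2x2 map: the locations are
    # [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]], so the row of squared distances from
    # neuron 0 to neurons 0..3 is [0.0, 1.0, 1.0, 2.0].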

    def _create_initial_weights(self, init_type):
        """!
        @brief Creates initial weights for neurons in line with the specified initialization.

        @param[in] init_type (type_init): Type of initialization of initial neuron weights (random, random in center of the input data, random distributed in data, distributed in line with uniform grid).

        """

        dim_info = dimension_info(self._data)

        step_x = dim_info.get_center()[0]
        if self._rows > 1:
            step_x = dim_info.get_width()[0] / (self._rows - 1)

        if dim_info.get_dimensions() > 1:
            step_y = dim_info.get_center()[1]
            if self._cols > 1:
                step_y = dim_info.get_width()[1] / (self._cols - 1)

        if init_type == type_init.uniform_grid:
            # weights are placed on a uniform grid stretched over the input data
            self._weights = [[[] for i in range(dim_info.get_dimensions())] for j in range(self._size)]
            for i in range(self._size):
                location = self._location[i]
                for dim in range(dim_info.get_dimensions()):
                    if dim == 0:
                        if self._rows > 1:
                            self._weights[i][dim] = dim_info.get_minimum_coordinate()[dim] + step_x * location[dim]
                        else:
                            self._weights[i][dim] = dim_info.get_center()[dim]

                    elif dim == 1:
                        if self._cols > 1:
                            self._weights[i][dim] = dim_info.get_minimum_coordinate()[dim] + step_y * location[dim]
                        else:
                            self._weights[i][dim] = dim_info.get_center()[dim]

                    else:
                        self._weights[i][dim] = dim_info.get_center()[dim]

        elif init_type == type_init.random_surface:
            # random weights over the full surface covered by the input data
            self._weights = [[random.uniform(dim_info.get_minimum_coordinate()[i], dim_info.get_maximum_coordinate()[i])
                              for i in range(dim_info.get_dimensions())]
                             for _ in range(self._size)]

        elif init_type == type_init.random_centroid:
            # random weights near the center of the input data
            self._weights = [[(random.random() + dim_info.get_center()[i])
                              for i in range(dim_info.get_dimensions())]
                             for _ in range(self._size)]

        else:
            # random weights
            self._weights = [[random.random()
                              for i in range(dim_info.get_dimensions())]
                             for _ in range(self._size)]
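
    # Sketch of how a caller selects the initialization mode (illustrative; `sample` is an
    # assumed two-dimensional dataset):
    #
    #   parameters = som_parameters()
    #   parameters.init_type = type_init.random_surface  # spread weights over the data bounding box
    #   network = som(8, 8, type_conn.grid_eight, parameters, ccore=False)
    #   network.train(sample, 100)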

    def _create_connections(self, conn_type):
        """!
        @brief Create connections in line with the input rule (grid four, grid eight, honeycomb, function neighbour).

        @param[in] conn_type (type_conn): Type of connection between oscillators in the network.

        """

        self._neighbors = [[] for _ in range(self._size)]

        for index in range(0, self._size, 1):
            upper_index = index - self._cols
            upper_left_index = index - self._cols - 1
            upper_right_index = index - self._cols + 1

            lower_index = index + self._cols
            lower_left_index = index + self._cols - 1
            lower_right_index = index + self._cols + 1

            left_index = index - 1
            right_index = index + 1

            node_row_index = math.floor(index / self._cols)
            upper_row_index = node_row_index - 1
            lower_row_index = node_row_index + 1

            if (conn_type == type_conn.grid_eight) or (conn_type == type_conn.grid_four):
                if upper_index >= 0:
                    self._neighbors[index].append(upper_index)

                if lower_index < self._size:
                    self._neighbors[index].append(lower_index)

            if (conn_type == type_conn.grid_eight) or (conn_type == type_conn.grid_four) or (conn_type == type_conn.honeycomb):
                if (left_index >= 0) and (math.floor(left_index / self._cols) == node_row_index):
                    self._neighbors[index].append(left_index)

                if (right_index < self._size) and (math.floor(right_index / self._cols) == node_row_index):
                    self._neighbors[index].append(right_index)

            if conn_type == type_conn.grid_eight:
                if (upper_left_index >= 0) and (math.floor(upper_left_index / self._cols) == upper_row_index):
                    self._neighbors[index].append(upper_left_index)

                if (upper_right_index >= 0) and (math.floor(upper_right_index / self._cols) == upper_row_index):
                    self._neighbors[index].append(upper_right_index)

                if (lower_left_index < self._size) and (math.floor(lower_left_index / self._cols) == lower_row_index):
                    self._neighbors[index].append(lower_left_index)

                if (lower_right_index < self._size) and (math.floor(lower_right_index / self._cols) == lower_row_index):
                    self._neighbors[index].append(lower_right_index)

            if conn_type == type_conn.honeycomb:
                if (node_row_index % 2) == 0:
                    upper_left_index = index - self._cols
                    upper_right_index = index - self._cols + 1

                    lower_left_index = index + self._cols
                    lower_right_index = index + self._cols + 1
                else:
                    upper_left_index = index - self._cols - 1
                    upper_right_index = index - self._cols

                    lower_left_index = index + self._cols - 1
                    lower_right_index = index + self._cols

                if (upper_left_index >= 0) and (math.floor(upper_left_index / self._cols) == upper_row_index):
                    self._neighbors[index].append(upper_left_index)

                if (upper_right_index >= 0) and (math.floor(upper_right_index / self._cols) == upper_row_index):
                    self._neighbors[index].append(upper_right_index)

                if (lower_left_index < self._size) and (math.floor(lower_left_index / self._cols) == lower_row_index):
                    self._neighbors[index].append(lower_left_index)

                if (lower_right_index < self._size) and (math.floor(lower_right_index / self._cols) == lower_row_index):
                    self._neighbors[index].append(lower_right_index)
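
    # Illustration of the rule above for a 3x3 map with type_conn.grid_four: the central
    # neuron (index 4) receives the neighbors [1, 7, 3, 5] (upper, lower, left, right), while
    # the corner neuron 0 only receives [3, 1].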

    def _competition(self, x):
        """!
        @brief Calculates the winner neuron (by distance to the input pattern).

        @param[in] x (list): Input pattern from the input data set, for example it can be coordinates of a point.

        @return (uint) Returns index of the neuron that is the winner.

        """

        index = 0
        minimum = euclidean_distance_square(self._weights[0], x)

        for i in range(1, self._size, 1):
            candidate = euclidean_distance_square(self._weights[i], x)
            if candidate < minimum:
                index = i
                minimum = candidate

        return index
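
    # The competition step is simply an argmin over squared Euclidean distances; an equivalent
    # (illustrative) one-liner for the loop above:
    #
    #   index = min(range(self._size), key=lambda i: euclidean_distance_square(self._weights[i], x))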

    def _adaptation(self, index, x):
        """!
        @brief Change weights of neurons in line with the winner neuron.

        @param[in] index (uint): Index of the winner neuron.
        @param[in] x (list): Input pattern from the input data set.

        """

        dimension = len(self._weights[0])

        if self._conn_type == type_conn.func_neighbor:
            # the neighborhood is defined by a function of the distance on the map
            for neuron_index in range(self._size):
                distance = self._sqrt_distances[index][neuron_index]

                if distance < self._local_radius:
                    influence = math.exp(-(distance / (2.0 * self._local_radius)))

                    for i in range(dimension):
                        self._weights[neuron_index][i] += self._learn_rate * influence * (x[i] - self._weights[neuron_index][i])

        else:
            # adapt the winner itself
            for i in range(dimension):
                self._weights[index][i] += self._learn_rate * (x[i] - self._weights[index][i])

            # adapt the direct neighbors of the winner
            for neighbor_index in self._neighbors[index]:
                distance = self._sqrt_distances[index][neighbor_index]

                if distance < self._local_radius:
                    influence = math.exp(-(distance / (2.0 * self._local_radius)))

                    for i in range(dimension):
                        self._weights[neighbor_index][i] += self._learn_rate * influence * (x[i] - self._weights[neighbor_index][i])
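
    # The adaptation step follows the classic Kohonen update rule
    #   w_i <- w_i + a * h(c, i) * (x - w_i),
    # where a is the current learning rate and h(c, i) = exp(-d(c, i) / (2 * r)) is the
    # neighborhood influence of the winner c on neuron i; only neurons whose map distance to
    # the winner is below the local radius are moved towards the input pattern x.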

    def train(self, data, epochs, autostop=False):
        """!
        @brief Trains self-organized feature map (SOM).

        @param[in] data (list): Input data - list of points where each point is represented by a list of features, for example coordinates.
        @param[in] epochs (uint): Number of epochs for training.
        @param[in] autostop (bool): Automatic termination of the learning process when no adaptation occurs.

        @return (uint) Number of learning iterations.

        """

        self._data = data
        self._sqrt_distances = self.__initialize_distances(self._size, self._location)

        for i in range(self._size):
            self._award[i] = 0
            self._capture_objects[i].clear()

        # create initial weights in line with the requested initialization type
        self._create_initial_weights(self._params.init_type)

        previous_weights = None

        for epoch in range(1, epochs + 1):
            # the local radius and the learning rate both decay with the epoch number
            self._local_radius = (self._params.init_radius * math.exp(-(epoch / epochs))) ** 2
            self._learn_rate = self._params.init_learn_rate * math.exp(-(epoch / epochs))

            # clear statistics that are collected during the epoch
            if autostop:
                for i in range(self._size):
                    self._award[i] = 0
                    self._capture_objects[i].clear()

            for i in range(len(self._data)):
                # step 1: competition - find the winner neuron
                index = self._competition(self._data[i])

                # step 2: adaptation - move the winner and its neighbors towards the pattern
                self._adaptation(index, self._data[i])

                # update statistics
                if autostop or (epoch == epochs):
                    self._award[index] += 1
                    self._capture_objects[index].append(i)

            # check the stop condition of the learning process
            if autostop:
                if previous_weights is not None:
                    maximal_adaptation = self._get_maximal_adaptation(previous_weights)
                    if maximal_adaptation < self._params.adaptation_threshold:
                        return epoch

                previous_weights = [item[:] for item in self._weights]

        return epochs
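
    # Training sketch (illustrative; `sample` stands for any input dataset): with autostop
    # enabled the method may return before the requested number of epochs, as soon as the
    # largest weight change drops below som_parameters.adaptation_threshold.
    #
    #   network = som(10, 10, type_conn.grid_four, ccore=False)
    #   performed_epochs = network.train(sample, 300, autostop=True)
    #   print("training stopped after", performed_epochs, "epochs")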

    def simulate(self, input_pattern):
        """!
        @brief Processes the input pattern (no learning) and returns the index of the winner neuron.
                Using the index of the winner neuron, the captured objects can be obtained with the property capture_objects.

        @param[in] input_pattern (list): Input pattern.

        @return (uint) Returns index of the winner neuron.

        """

        return self._competition(input_pattern)

    def _get_maximal_adaptation(self, previous_weights):
        """!
        @brief Calculates the maximum change of weight by comparing the previous weights with the current weights.

        @param[in] previous_weights (list): Weights from the previous step of the learning process.

        @return (double) Value that represents the maximum change of weight after the adaptation process.

        """

        dimension = len(self._data[0])
        maximal_adaptation = 0.0

        for neuron_index in range(self._size):
            for dim in range(dimension):
                current_adaptation = previous_weights[neuron_index][dim] - self._weights[neuron_index][dim]

                if current_adaptation < 0:
                    current_adaptation = -current_adaptation

                if maximal_adaptation < current_adaptation:
                    maximal_adaptation = current_adaptation

        return maximal_adaptation

    def get_winner_number(self):
        """!
        @brief Calculates the number of winners at the last step of the learning process.

        @return (uint) Number of winners.

        """

        winner_number = 0
        for i in range(self._size):
            if self._award[i] > 0:
                winner_number += 1

        return winner_number

    def show_distance_matrix(self):
        """!
        @brief Shows gray visualization of the U-matrix (distance matrix).

        @see get_distance_matrix()

        """
        distance_matrix = self.get_distance_matrix()

        plt.imshow(distance_matrix, cmap=plt.get_cmap('hot'), interpolation='kaiser')
        plt.title("U-Matrix")
        plt.colorbar()
        plt.show()

    def get_distance_matrix(self):
        """!
        @brief Calculates distance matrix (U-matrix).
        @details The U-matrix visualization is based on the distance in the input space between a weight vector and its neighbors on the map.

        @return (list) Distance matrix (U-matrix).

        @see show_distance_matrix()
        @see get_density_matrix()

        """

        if self._conn_type != type_conn.func_neighbor:
            ...

        distance_matrix = [[0.0] * self._cols for i in range(self._rows)]

        for i in range(self._rows):
            for j in range(self._cols):
                neuron_index = i * self._cols + j

                if self._conn_type == type_conn.func_neighbor:
                    # no explicit connections are stored for 'func_neighbor', so build a grid for averaging
                    self._create_connections(type_conn.grid_eight)

                for neighbor_index in self._neighbors[neuron_index]:
                    distance_matrix[i][j] += euclidean_distance_square(self._weights[neuron_index], self._weights[neighbor_index])

                distance_matrix[i][j] /= len(self._neighbors[neuron_index])

        return distance_matrix
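
    # Reading the U-matrix (illustrative; assumes a trained `network` and matplotlib): large
    # values separate clusters on the map, small values indicate neurons with similar weights.
    #
    #   u_matrix = network.get_distance_matrix()
    #   plt.imshow(u_matrix, cmap='hot')
    #   plt.colorbar()
    #   plt.show()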

    def show_density_matrix(self, surface_divider=20.0):
        """!
        @brief Shows density matrix (P-matrix) using kernel density estimation.

        @param[in] surface_divider (double): Divider in each dimension that affects the radius for density measurement.

        @see show_distance_matrix()

        """
        density_matrix = self.get_density_matrix(surface_divider)

        plt.imshow(density_matrix, cmap=plt.get_cmap('hot'), interpolation='kaiser')
        plt.title("P-Matrix")
        plt.colorbar()
        plt.show()

    def get_density_matrix(self, surface_divider=20.0):
        """!
        @brief Calculates density matrix (P-Matrix).

        @param[in] surface_divider (double): Divider in each dimension that affects the radius for density measurement.

        @return (list) Density matrix (P-Matrix).

        @see get_distance_matrix()

        """

        density_matrix = [[0] * self._cols for i in range(self._rows)]
        dimension = len(self._weights[0])

        dim_max = [float('-Inf')] * dimension
        dim_min = [float('Inf')] * dimension

        for weight in self._weights:
            for index_dim in range(dimension):
                if weight[index_dim] > dim_max[index_dim]:
                    dim_max[index_dim] = weight[index_dim]

                if weight[index_dim] < dim_min[index_dim]:
                    dim_min[index_dim] = weight[index_dim]

        radius = [0.0] * len(self._weights[0])
        for index_dim in range(dimension):
            radius[index_dim] = (dim_max[index_dim] - dim_min[index_dim]) / surface_divider

        for point in self._data:
            for index_neuron in range(len(self)):
                point_covered = True

                for index_dim in range(dimension):
                    if abs(point[index_dim] - self._weights[index_neuron][index_dim]) > radius[index_dim]:
                        point_covered = False
                        break

                row = int(math.floor(index_neuron / self._cols))
                col = index_neuron - row * self._cols

                if point_covered:
                    density_matrix[row][col] += 1

        return density_matrix
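
    # Reading the P-matrix (illustrative; assumes a trained `network`): cells with a high count
    # lie in dense regions of the input data, so the U-matrix (cluster boundaries) and the
    # P-matrix (density) are usually inspected together.
    #
    #   p_matrix = network.get_density_matrix(surface_divider=20.0)
    #   print(max(max(row) for row in p_matrix))  # density of the most crowded map cell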

    def show_winner_matrix(self):
        """!
        @brief Shows the winner matrix where each element corresponds to a neuron and its value represents
                the amount of won objects from the input data space at the last training iteration.

        @see show_distance_matrix()

        """

        (fig, ax) = plt.subplots()
        winner_matrix = [[0] * self._cols for i in range(self._rows)]

        for i in range(self._rows):
            for j in range(self._cols):
                neuron_index = i * self._cols + j

                winner_matrix[i][j] = self._award[neuron_index]
                ax.text(i, j, str(winner_matrix[i][j]), va='center', ha='center')

        ax.imshow(winner_matrix, cmap=plt.get_cmap('cool'), interpolation='none')
        ax.grid(True)

        plt.title("Winner Matrix")
        plt.show()

    def show_network(self, awards=False, belongs=False, coupling=True, dataset=True, marker_type='o'):
        """!
        @brief Shows neurons in the dimension of data.

        @param[in] awards (bool): If True - displays how many objects were won by each neuron.
        @param[in] belongs (bool): If True - marks each won object with the index of its winner neuron (only when the dataset is displayed too).
        @param[in] coupling (bool): If True - displays connections between neurons (except the case when function neighbor is used).
        @param[in] dataset (bool): If True - displays the input data set.
        @param[in] marker_type (string): Defines the marker that is used for displaying neurons in the network.

        """

        dimension = len(self._weights[0])

        fig = plt.figure()

        if (dimension == 1) or (dimension == 2):
            axes = fig.add_subplot(111)
        elif dimension == 3:
            axes = fig.gca(projection='3d')
        else:
            raise NotImplementedError('Impossible to show network in data-space that differs from 1D, 2D or 3D.')

        # show the input data set
        if (self._data is not None) and dataset:
            for x in self._data:
                if dimension == 1:
                    axes.plot(x[0], 0.0, 'b|', ms=30)
                elif dimension == 2:
                    axes.plot(x[0], x[1], 'b.')
                elif dimension == 3:
                    axes.scatter(x[0], x[1], x[2], c='b', marker='.')

        # show neurons
        for index in range(self._size):
            # neurons that won nothing are drawn in a different color
            color = 'b'
            if self._award[index] == 0:
                color = 'y'

            if dimension == 1:
                axes.plot(self._weights[index][0], 0.0, color + marker_type)

                if awards:
                    location = '{0}'.format(self._award[index])
                    axes.text(self._weights[index][0], 0.0, location, color='black', fontsize=10)

                if belongs and self._data is not None:
                    location = '{0}'.format(index)
                    axes.text(self._weights[index][0], 0.0, location, color='black', fontsize=12)
                    for object_index in self._capture_objects[index]:
                        point = self._data[object_index]
                        axes.text(point[0], 0.0, location, color='blue', fontsize=10)

            elif dimension == 2:
                axes.plot(self._weights[index][0], self._weights[index][1], color + marker_type)

                if awards:
                    location = '{0}'.format(self._award[index])
                    axes.text(self._weights[index][0], self._weights[index][1], location, color='black', fontsize=10)

                if belongs and self._data is not None:
                    location = '{0}'.format(index)
                    axes.text(self._weights[index][0], self._weights[index][1], location, color='black', fontsize=12)
                    for object_index in self._capture_objects[index]:
                        point = self._data[object_index]
                        axes.text(point[0], point[1], location, color='blue', fontsize=10)

                if (self._conn_type != type_conn.func_neighbor) and coupling:
                    for neighbor in self._neighbors[index]:
                        if neighbor > index:
                            axes.plot([self._weights[index][0], self._weights[neighbor][0]],
                                      [self._weights[index][1], self._weights[neighbor][1]],
                                      'g', linewidth=0.5)

            elif dimension == 3:
                axes.scatter(self._weights[index][0], self._weights[index][1], self._weights[index][2], c=color, marker=marker_type)

                if (self._conn_type != type_conn.func_neighbor) and coupling:
                    for neighbor in self._neighbors[index]:
                        if neighbor > index:
                            axes.plot([self._weights[index][0], self._weights[neighbor][0]],
                                      [self._weights[index][1], self._weights[neighbor][1]],
                                      [self._weights[index][2], self._weights[neighbor][2]],
                                      'g', linewidth=0.5)

        plt.title("Network Structure")
        plt.show()

    def __get_dump_from_python(self, ccore_usage):
        return {'ccore': ccore_usage,
                'state': {'cols': self._cols,
                          'rows': self._rows,
                          'size': self._size,
                          'params': self._params,
                          'weights': self._weights,
                          'award': self._award,
                          'capture_objects': self._capture_objects}}

    def __download_dump_from_ccore(self):
        ...

    def __upload_common_part(self, state_dump):
        self._cols = state_dump['cols']
        self._rows = state_dump['rows']
        self._size = state_dump['size']
        self._params = state_dump['params']

    def __upload_dump_to_python(self, state_dump):
        self.__upload_common_part(state_dump)

        self._weights = state_dump['weights']
        self._award = state_dump['award']

    def __upload_dump_to_ccore(self, state_dump):
        ...
        wrapper.som_load(self.__ccore_som_pointer, state_dump['weights'], state_dump['award'], state_dump['capture_objects'])
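

# Serialization sketch (illustrative): because the network implements __getstate__ and
# __setstate__, a trained map can be stored and restored with the standard pickle module.
#
#   import pickle
#
#   with open('som_network.pkl', 'wb') as storage:
#       pickle.dump(network, storage)
#
#   with open('som_network.pkl', 'rb') as storage:
#       network = pickle.load(storage)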