pyclustering  0.10.1
pyclustering is a Python, C++ data mining library.
somsc.py
"""!

@brief Cluster analysis algorithm: SOM-SC (Self-Organized Feature Map for Simple Clustering)
@details No dedicated paper describes SOM-SC. The algorithm is a simple adaptation of SOM for cluster analysis.
         Basic idea: the number of clusters to allocate defines the number of neurons in the self-organized map.
         SOM-SC can be considered a neural network implementation of the K-Means algorithm.
         The underlying SOM implementation is based on paper @cite article::nnet::som::1.

@authors Andrei Novikov (pyclustering@yandex.ru)
@date 2014-2020
@copyright BSD-3-Clause

"""


from pyclustering.core.wrapper import ccore_library
from pyclustering.cluster.encoder import type_encoding
from pyclustering.nnet.som import som, som_parameters
from pyclustering.nnet.som import type_conn


class somsc:
    """!
    @brief Class represents a simple clustering algorithm based on the self-organized feature map.
    @details The algorithm uses the number of clusters to allocate as the size of the SOM. Objects captured
              by each neuron form a cluster. The algorithm is designed for data with a Gaussian distribution
              that forms spherical clusters.

    Example:
    @code
        from pyclustering.cluster import cluster_visualizer
        from pyclustering.cluster.somsc import somsc
        from pyclustering.samples.definitions import FCPS_SAMPLES
        from pyclustering.utils import read_sample

        # Load list of points for cluster analysis
        sample = read_sample(FCPS_SAMPLES.SAMPLE_TWO_DIAMONDS)

        # Create instance of SOM-SC algorithm to allocate two clusters
        somsc_instance = somsc(sample, 2)

        # Run cluster analysis and obtain results
        somsc_instance.process()
        clusters = somsc_instance.get_clusters()

        # Visualize clustering results.
        visualizer = cluster_visualizer()
        visualizer.append_clusters(clusters, sample)
        visualizer.show()
    @endcode

    """

    def __init__(self, data, amount_clusters, epouch=100, ccore=True, **kwargs):
        """!
        @brief Creates SOM-SC (Self Organized Map for Simple Clustering) algorithm for clustering analysis.

        @param[in] data (list): List of points that are used for processing.
        @param[in] amount_clusters (uint): Number of clusters that should be allocated.
        @param[in] epouch (uint): Number of epochs for training of SOM.
        @param[in] ccore (bool): If True, the CCORE implementation is used for clustering analysis when it is workable.
        @param[in] **kwargs: Arbitrary keyword arguments (available arguments: `random_state`).

        <b>Keyword Args:</b><br>
            - random_state (int): Seed for the random state (`None` by default, in which case the current system time is used).

        """

        self.__data_pointer = data
        self.__amount_clusters = amount_clusters
        self.__epouch = epouch
        self.__ccore = ccore
        self.__random_state = kwargs.get('random_state', None)

        self.__network = None

        # Fall back to the Python implementation if the CCORE library is not workable.
        if self.__ccore is True:
            self.__ccore = ccore_library.workable()

        self.__verify_arguments()


    def process(self):
        """!
        @brief Performs cluster analysis by competition between neurons in self-organized map.

        @return (somsc) Returns itself (SOM Simple Clustering instance).

        @see get_clusters()

        """

        params = som_parameters()
        params.random_state = self.__random_state

        # One row of neurons: each neuron corresponds to one cluster.
        self.__network = som(1, self.__amount_clusters, type_conn.grid_four, params, self.__ccore)
        self.__network.train(self.__data_pointer, self.__epouch, True)

        return self


    def predict(self, points):
        """!
        @brief Calculates the closest cluster to each point.

        @param[in] points (array_like): Points for which closest clusters are calculated.

        @return (list) List of closest clusters for each point. Each cluster is denoted by index. Returns an empty
                 collection if the 'process()' method was not called.

        """

        result = []

        # The network exists only after 'process()'; return the documented empty result otherwise.
        if self.__network is None:
            return result

        for point in points:
            index_cluster = self.__network.simulate(point)
            result.append(index_cluster)

        return result


    def get_clusters(self):
        """!
        @brief Returns the list of allocated clusters; each cluster contains indexes of objects in the input data.

        @see process()

        """

        return self.__network.capture_objects


    def get_cluster_encoding(self):
        """!
        @brief Returns clustering result representation type that indicates how clusters are encoded.

        @return (type_encoding) Clustering result representation.

        @see get_clusters()

        """

        return type_encoding.CLUSTER_INDEX_LIST_SEPARATION


    def __verify_arguments(self):
        """!
        @brief Verifies input parameters for the algorithm and raises an exception if they are incorrect.

        """
        if len(self.__data_pointer) == 0:
            raise ValueError("Input data is empty (size: '%d')." % len(self.__data_pointer))

        if self.__amount_clusters <= 0:
            raise ValueError("Amount of clusters (current value: '%d') should be greater than 0." %
                             self.__amount_clusters)

        if self.__epouch < 0:
            raise ValueError("Amount of epouch (current value: '%d') should be greater or equal to 0." %
                             self.__epouch)
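
The module docstring describes SOM-SC as a neural network counterpart of K-Means: one neuron per cluster, with neurons competing for each point. The sketch below illustrates that idea in plain Python. It is not the pyclustering implementation. The function names (`train_competitive`, `closest_neuron`, `capture_objects_by_neuron`), the `learning_rate` parameter, the linear decay schedule, and the initialization from data points are all assumptions made for illustration, and the neighbourhood update of a full SOM is deliberately left out.

```python
import random


def closest_neuron(weights, point):
    """Return the index of the neuron with the smallest squared Euclidean distance."""
    distances = [sum((w - x) ** 2 for w, x in zip(neuron, point)) for neuron in weights]
    return distances.index(min(distances))


def train_competitive(points, amount_clusters, epochs=100, learning_rate=0.5, seed=0):
    """Hypothetical helper: winner-take-all training with one neuron per cluster."""
    rng = random.Random(seed)
    # Assumption: initialise each neuron from a randomly chosen data point.
    weights = [list(point) for point in rng.sample(points, amount_clusters)]

    for epoch in range(epochs):
        # Assumption: decay the learning rate linearly so updates shrink over time.
        rate = learning_rate * (1.0 - epoch / epochs)
        for point in points:
            winner = closest_neuron(weights, point)
            # Only the winning neuron moves towards the point (no neighbourhood update).
            weights[winner] = [w + rate * (x - w) for w, x in zip(weights[winner], point)]

    return weights


def capture_objects_by_neuron(weights, points):
    """Group point indexes by winning neuron: one index list per neuron."""
    clusters = [[] for _ in weights]
    for index, point in enumerate(points):
        clusters[closest_neuron(weights, point)].append(index)
    return clusters


# Two well-separated groups of three points each.
points = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
weights = train_competitive(points, amount_clusters=2)
clusters = capture_objects_by_neuron(weights, points)
print(sorted(sorted(cluster) for cluster in clusters))
```

Each neuron ends up near the centre of the points it wins, which is the sense in which the approach resembles K-Means. The order of the clusters depends on the random initialization, so the example sorts them before printing.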
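
`get_cluster_encoding()` reports `CLUSTER_INDEX_LIST_SEPARATION`, and `get_clusters()` is described as returning clusters that contain indexes of objects in the data list. Assuming that means one list of point indexes per cluster, the sketch below converts such a result into one label per point. The helper `index_lists_to_labels` is hypothetical and is not part of pyclustering.

```python
def index_lists_to_labels(clusters, amount_points):
    """Hypothetical helper: convert per-cluster index lists into one label per point."""
    # None marks points that no cluster captured.
    labels = [None] * amount_points
    for cluster_index, cluster in enumerate(clusters):
        for point_index in cluster:
            labels[point_index] = cluster_index
    return labels


# Assumed shape of a get_clusters() result: two clusters over five points.
clusters = [[0, 2, 3], [1, 4]]
labels = index_lists_to_labels(clusters, amount_points=5)
print(labels)  # [0, 1, 0, 0, 1]
```

The label form is convenient when comparing against `predict()`, which returns one cluster index per point.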