 pyclustering 0.10.1: pyclustering is a Python and C++ data mining library.
silhouette.py
1 """!
2
3 @brief Silhouette - method of interpretation and validation of consistency.
4 @details Implementation based on paper @cite article::cluster::silhouette::1.
5
6 @authors Andrei Novikov (pyclustering@yandex.ru)
7 @date 2014-2020
9
10 """
11
12
13 from enum import IntEnum
14
15 import numpy
16
17 from pyclustering.cluster.kmeans import kmeans
18 from pyclustering.cluster.kmedians import kmedians
19 from pyclustering.cluster.kmedoids import kmedoids
20 from pyclustering.cluster.center_initializer import kmeans_plusplus_initializer
21
22 from pyclustering.utils.metric import distance_metric, type_metric
23
24 from pyclustering.core.wrapper import ccore_library
25 from pyclustering.core.metric_wrapper import metric_wrapper
26
27 import pyclustering.core.silhouette_wrapper as wrapper
28
29
30 class silhouette:
31  """!
32  @brief Represents the Silhouette method that is used for interpretation and validation of consistency.
33  @details The silhouette value is a measure of how similar an object is to its own cluster compared to other clusters.
34  Be aware that the Silhouette method is applicable to the K algorithm family, such as K-Means, K-Medians,
35  K-Medoids, X-Means, etc., and is not applicable to DBSCAN, OPTICS, CURE, etc. The Silhouette value is
36  calculated using the following formula:
37  \f[s\left ( i \right )=\frac{ b\left ( i \right ) - a\left ( i \right ) }{ max\left \{ a\left ( i \right ), b\left ( i \right ) \right \}}\f]
38  where \f$a\left ( i \right )\f$ is the average distance from object i to the other objects in its own cluster, and
39  \f$b\left ( i \right )\f$ is the average distance from object i to the objects in the nearest other cluster (the one with the smallest average distance).
40
41  Here is an example where Silhouette score is calculated for K-Means's clustering result:
42  @code
43  from pyclustering.cluster.center_initializer import kmeans_plusplus_initializer
44  from pyclustering.cluster.kmeans import kmeans
45  from pyclustering.cluster.silhouette import silhouette
46
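47  from pyclustering.samples.definitions import SIMPLE_SAMPLES
48  from pyclustering.utils import read_sample
49
50  # Read data 'SampleSimple3' from Simple Sample collection.
51  sample = read_sample(SIMPLE_SAMPLES.SAMPLE_SIMPLE3)
52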
53  # Prepare initial centers
54  centers = kmeans_plusplus_initializer(sample, 4).initialize()
55
56  # Perform cluster analysis
57  kmeans_instance = kmeans(sample, centers)
58  kmeans_instance.process()
59  clusters = kmeans_instance.get_clusters()
60
61  # Calculate Silhouette score
62  score = silhouette(sample, clusters).process().get_score()
63  @endcode
64
65  Let's perform clustering of the same sample by K-Means algorithm using different K values (2, 4, 6 and 8) and
66  estimate clustering results using Silhouette method.
67  @code
68  from pyclustering.cluster.kmeans import kmeans
69  from pyclustering.cluster.center_initializer import kmeans_plusplus_initializer
70  from pyclustering.cluster.silhouette import silhouette
71
72  from pyclustering.samples.definitions import SIMPLE_SAMPLES
73  from pyclustering.utils import read_sample
74
75  import matplotlib.pyplot as plt
76
77  def get_score(sample, amount_clusters):
78      # Prepare initial centers for K-Means algorithm.
79      centers = kmeans_plusplus_initializer(sample, amount_clusters).initialize()
80
81      # Perform cluster analysis.
82      kmeans_instance = kmeans(sample, centers)
83      kmeans_instance.process()
84      clusters = kmeans_instance.get_clusters()
85
86      # Calculate Silhouette score.
87      return silhouette(sample, clusters).process().get_score()
88
89  def draw_score(figure, position, title, score):
90      ax = figure.add_subplot(position)
91      ax.bar(range(0, len(score)), score, width=0.7)
92      ax.set_title(title)
93      ax.set_xlim(0, len(score))
94      ax.set_xticklabels([])
95      ax.grid()
96
97  # Read data 'SampleSimple3' from Simple Sample collection.
98  sample = read_sample(SIMPLE_SAMPLES.SAMPLE_SIMPLE3)
99
100  # Perform cluster analysis and estimation by Silhouette.
101  score_2 = get_score(sample, 2) # K = 2 (amount of clusters).
102  score_4 = get_score(sample, 4) # K = 4 - optimal.
103  score_6 = get_score(sample, 6) # K = 6.
104  score_8 = get_score(sample, 8) # K = 8.
105
106  # Visualize results.
107  figure = plt.figure()
108
109  # Visualize each result separately.
110  draw_score(figure, 221, 'K = 2', score_2)
111  draw_score(figure, 222, 'K = 4 (optimal)', score_4)
112  draw_score(figure, 223, 'K = 6', score_6)
113  draw_score(figure, 224, 'K = 8', score_8)
114
115  # Show a plot with visualized results.
116  plt.show()
117  @endcode
118
119  The figure below visualizes the results produced by the Silhouette method. K = 4 is the optimal amount of clusters
120  according to the Silhouette method because the score for each point is close to 1.0 and the average score for K = 4
121  is the highest among the considered K values.
122
123  @image html silhouette_score_for_various_K.png "Fig. 1. Silhouette scores for various K."
124
125  @see kmeans, kmedoids, kmedians, xmeans, elbow
126
127  """
128
129  def __init__(self, data, clusters, **kwargs):
130  """!
131  @brief Initializes Silhouette method for analysis.
132
133  @param[in] data (array_like): Input data that was used for cluster analysis and that is presented as list of
134  points or distance matrix (defined by parameter 'data_type', by default data is considered as a list
135  of points).
136  @param[in] clusters (list): Clusters that have been obtained after cluster analysis.
137  @param[in] **kwargs: Arbitrary keyword arguments (available arguments: 'metric', 'data_type', 'ccore').
138
139  <b>Keyword Args:</b><br>
140  - metric (distance_metric): Metric that was used for cluster analysis and should be used for Silhouette
141  score calculation (by default squared Euclidean distance).
142  - data_type (string): Data type of input sample 'data' that is processed by the algorithm ('points', 'distance_matrix').
143  - ccore (bool): If True then CCORE (C++ implementation of pyclustering library) is used (by default True).
144
145  """
146  self.__data = data
147  self.__clusters = clusters
148  self.__metric = kwargs.get('metric', distance_metric(type_metric.EUCLIDEAN_SQUARE))
149  self.__data_type = kwargs.get('data_type', 'points')
150
151  if self.__metric.get_type() != type_metric.USER_DEFINED:
152  self.__metric.enable_numpy_usage()
153  else:
154  self.__metric.disable_numpy_usage()
155
156  self.__score = [0.0] * len(data)
157
158  self.__ccore = kwargs.get('ccore', True) and self.__metric.get_type() != type_metric.USER_DEFINED
159  if self.__ccore:
160  self.__ccore = ccore_library.workable()
161
162  if self.__ccore is False:
163  self.__data = numpy.array(data)
164
165  self.__verify_arguments()
166
167
168  def process(self):
169  """!
170  @brief Calculates Silhouette score for each object from input data.
171
172  @return (silhouette) Instance of the method (self).
173
174  """
175  if self.__ccore is True:
176  self.__process_by_ccore()
177  else:
178  self.__process_by_python()
179
180  return self
181
182
183  def __process_by_ccore(self):
184  """!
185  @brief Performs processing using CCORE (C/C++ part of pyclustering library).
186
187  """
188  ccore_metric = metric_wrapper.create_instance(self.__metric)
189  self.__score = wrapper.silhoeutte(self.__data, self.__clusters, ccore_metric.get_pointer(), self.__data_type)
190
191
192  def __process_by_python(self):
193  """!
194  @brief Performs processing using python code.
195
196  """
197  for index_cluster in range(len(self.__clusters)):
198  for index_point in self.__clusters[index_cluster]:
199  self.__score[index_point] = self.__calculate_score(index_point, index_cluster)
200
201
202  def get_score(self):
203  """!
204  @brief Returns Silhouette score for each object from input data.
205
206  @see process
207
208  """
209  return self.__score
210
211
212  def __calculate_score(self, index_point, index_cluster):
213  """!
214  @brief Calculates Silhouette score for the specific object defined by index_point.
215
216  @param[in] index_point (uint): Index of the point in input data for which Silhouette score should be calculated.
217  @param[in] index_cluster (uint): Index of the cluster to which the point belongs.
218
219  @return (float) Silhouette score for the object.
220
221  """
222  if self.__data_type == 'points':
223  difference = self.__calculate_dataset_difference(index_point)
224  else:
225  difference = self.__data[index_point]
226
227  a_score = self.__calculate_within_cluster_score(index_cluster, difference)
228  b_score = self.__caclulate_optimal_neighbor_cluster_score(index_cluster, difference)
229
230  return (b_score - a_score) / max(a_score, b_score)
231
232
233  def __calculate_within_cluster_score(self, index_cluster, difference):
234  """!
235  @brief Calculates 'A' score for the specific object in the cluster to which it belongs.
236
237  @param[in] index_cluster (uint): Index of the cluster to which the object belongs.
238  @param[in] difference (array_like): Distances from the object to each object in input data.
239
240  @return (float) 'A' score for the object (NaN if the object is the only member of its cluster).
241
242  """
243
244  score = self.__calculate_cluster_difference(index_cluster, difference)
245  if len(self.__clusters[index_cluster]) == 1:
246  return float('nan')
247  return score / (len(self.__clusters[index_cluster]) - 1)
248
249
250  def __calculate_cluster_score(self, index_cluster, difference):
251  """!
252  @brief Calculates 'B*' score for the specific object with respect to the specified cluster.
253
254  @param[in] index_cluster (uint): Index of the cluster for which 'B*' score should be calculated.
255  @param[in] difference (array_like): Distances from the object to each object in input data.
256
257  @return (float) 'B*' score for the object: average distance from the object to the objects of the specified cluster.
258
259  """
260
261  score = self.__calculate_cluster_difference(index_cluster, difference)
262  return score / len(self.__clusters[index_cluster])
263
264
265  def __caclulate_optimal_neighbor_cluster_score(self, index_cluster, difference):
266  """!
267  @brief Calculates 'B' score for the specific object with respect to the nearest other cluster.
268
269  @param[in] index_cluster (uint): Index of the cluster to which the object belongs.
270  @param[in] difference (array_like): Distances from the object to each object in input data.
271
272  @return (float) 'B' score for the object (-1.0 if there is no other cluster).
273
274  """
275
276  optimal_score = float('inf')
277  for index_neighbor_cluster in range(len(self.__clusters)):
278  if index_cluster != index_neighbor_cluster:
279  candidate_score = self.__calculate_cluster_score(index_neighbor_cluster, difference)
280  if candidate_score < optimal_score:
281  optimal_score = candidate_score
282
283  if optimal_score == float('inf'):
284  optimal_score = -1.0
285
286  return optimal_score
287
288
289  def __calculate_cluster_difference(self, index_cluster, difference):
290  """!
291  @brief Calculates the total distance from the specified object to the objects in the specified cluster.
292
293  @param[in] index_cluster (uint): Index of the cluster whose objects are considered.
294  @param[in] difference (array_like): Distances from the object to each object in input data.
295  @return (float) Sum of distances from the specified object to each object in the specified cluster.
296
297  """
298  cluster_difference = 0.0
299  for index_point in self.__clusters[index_cluster]:
300  cluster_difference += difference[index_point]
301
302  return cluster_difference
303
304
305  def __calculate_dataset_difference(self, index_point):
306  """!
307  @brief Calculates the distance from the specified object to each object in input data.
308
309  @param[in] index_point (uint): Index of the point for which distances to the other points are calculated.
310
311  @return (list) Distances from the specified object to each object in input data.
312
313  """
314
315  if self.__metric.get_type() != type_metric.USER_DEFINED:
316  dataset_differences = self.__metric(self.__data, self.__data[index_point])
317  else:
318  dataset_differences = [self.__metric(point, self.__data[index_point]) for point in self.__data]
319
320  return dataset_differences
321
322
323  def __verify_arguments(self):
324  """!
325  @brief Verifies input parameters for the algorithm and raises an exception if they are incorrect.
326
327  """
328  if len(self.__data) == 0:
329  raise ValueError("Input data is empty (size: '%d')." % len(self.__data))
330
331  if len(self.__clusters) == 0:
332  raise ValueError("Input clusters are empty (size: '%d')." % len(self.__clusters))
333
334
335
336 class silhouette_ksearch_type(IntEnum):
337  """!
338  @brief Defines algorithms that can be used to find the optimal number of clusters using the Silhouette method.
339
340  @see silhouette_ksearch
341
342  """
343
344
345  KMEANS = 0
346
347
348  KMEDIANS = 1
349
350
351  KMEDOIDS = 2
352
353  def get_type(self):
354  """!
355  @brief Returns algorithm type that corresponds to specified enumeration value.
356
357  @return (type) Algorithm type for cluster analysis.
358
359  """
360  if self == silhouette_ksearch_type.KMEANS:
361  return kmeans
362  elif self == silhouette_ksearch_type.KMEDIANS:
363  return kmedians
364  elif self == silhouette_ksearch_type.KMEDOIDS:
365  return kmedoids
366  else:
367  return None
368
369
370
371 class silhouette_ksearch:
372  """!
373  @brief Represents an algorithm for searching the optimal number of clusters using a specified K-algorithm (K-Means,
374  K-Medians, K-Medoids) that is based on the Silhouette method.
375
376  @details This algorithm uses the average value of scores for estimation and is applicable to clusters that are well
377  separated. Here is an example where clusters are well separated (sample 'Hepta'):
378  @code
379  from pyclustering.cluster import cluster_visualizer
380  from pyclustering.cluster.center_initializer import kmeans_plusplus_initializer
381  from pyclustering.cluster.kmeans import kmeans
382  from pyclustering.cluster.silhouette import silhouette_ksearch_type, silhouette_ksearch
383  from pyclustering.samples.definitions import FCPS_SAMPLES
384  from pyclustering.utils import read_sample
385
386  sample = read_sample(FCPS_SAMPLES.SAMPLE_HEPTA)
387  search_instance = silhouette_ksearch(sample, 2, 10, algorithm=silhouette_ksearch_type.KMEANS).process()
388
389  amount = search_instance.get_amount()
390  scores = search_instance.get_scores()
391
392  print("Scores: '%s'" % str(scores))
393
394  initial_centers = kmeans_plusplus_initializer(sample, amount).initialize()
395  kmeans_instance = kmeans(sample, initial_centers).process()
396
397  clusters = kmeans_instance.get_clusters()
398
399  visualizer = cluster_visualizer()
400  visualizer.append_clusters(clusters, sample)
401  visualizer.show()
402  @endcode
403
404  Obtained Silhouette scores for each K:
405  @code
406  Scores: '{2: 0.418434, 3: 0.450906, 4: 0.534709, 5: 0.689970, 6: 0.588460, 7: 0.882674, 8: 0.804725, 9: 0.780189}'
407  @endcode
408
409  K = 7 has the highest average Silhouette score, which means that it is the optimal amount of clusters:
410  @image html silhouette_ksearch_hepta.png "Silhouette ksearch's analysis with further K-Means clustering (sample 'Hepta')."
411
412  @see silhouette_ksearch_type
413
414  """
415
416  def __init__(self, data, kmin, kmax, **kwargs):
417  """!
418  @brief Initialize Silhouette search algorithm to find out optimal amount of clusters.
419
420  @param[in] data (array_like): Input data that is used for searching optimal amount of clusters.
421  @param[in] kmin (uint): Minimum amount of clusters that might be allocated. Should be greater than or equal to 2.
422  @param[in] kmax (uint): Maximum amount of clusters that might be allocated. Should be less than or equal to the amount
423  of points in input data.
424  @param[in] **kwargs: Arbitrary keyword arguments (available arguments: 'algorithm', 'random_state', 'ccore').
425
426  <b>Keyword Args:</b><br>
427  - algorithm (silhouette_ksearch_type): Defines algorithm that is used for searching optimal number of
428  clusters (by default K-Means).
429  - ccore (bool): If True then CCORE (C++ implementation of pyclustering library) is used (by default True).
430
431  """
432  self.__data = data
433  self.__kmin = kmin
434  self.__kmax = kmax
435
436  self.__algorithm = kwargs.get('algorithm', silhouette_ksearch_type.KMEANS)
437  self.__random_state = kwargs.get('random_state', None)
438  self.__return_index = self.__algorithm == silhouette_ksearch_type.KMEDOIDS
439
440  self.__amount = -1
441  self.__score = -1.0
442  self.__scores = {}
443
444  self.__verify_arguments()
445
446  self.__ccore = kwargs.get('ccore', True)
447  if self.__ccore:
448  self.__ccore = ccore_library.workable()
449
450
451  def process(self):
452  """!
453  @brief Performs analysis to find optimal amount of clusters.
454
455  @see get_amount, get_score, get_scores
456
457  @return (silhouette_ksearch) Instance of the algorithm (self).
458
459  """
460  if self.__ccore is True:
461  self.__process_by_ccore()
462  else:
463  self.__process_by_python()
464
465  return self
466
467
468  def __process_by_ccore(self):
469  """!
470  @brief Performs processing using CCORE (C/C++ part of pyclustering library).
471
472  """
473  results = wrapper.silhoeutte_ksearch(self.__data, self.__kmin, self.__kmax, self.__algorithm, self.__random_state)
474
475  self.__amount = results[0]
476  self.__score = results[1]
477
478  scores_list = results[2]
479  self.__scores = {}
480  for i in range(len(scores_list)):
481  self.__scores[self.__kmin + i] = scores_list[i]
482
483
484  def __process_by_python(self):
485  """!
486  @brief Performs processing using python code.
487
488  """
489  self.__scores = {}
490
491  for k in range(self.__kmin, self.__kmax):
492  clusters = self.__calculate_clusters(k)
493  if len(clusters) != k:
494  self.__scores[k] = float('nan')
495  continue
496
497  score = silhouette(self.__data, clusters).process().get_score()
498
499  self.__scores[k] = sum(score) / len(score)
500
501  if self.__scores[k] > self.__score:
502  self.__score = self.__scores[k]
503  self.__amount = k
504
505
506  def get_amount(self):
507  """!
508  @brief Returns optimal amount of clusters that has been found during analysis.
509
510  @return (uint) Optimal amount of clusters.
511
512  @see process
513
514  """
515  return self.__amount
516
517
518  def get_score(self):
519  """!
520  @brief Returns silhouette score that belongs to optimal amount of clusters (k).
521
522  @return (float) Score that corresponds to the optimal amount of clusters.
523
524  @see process, get_scores
525
526  """
527  return self.__score
528
529
530  def get_scores(self):
531  """!
532  @brief Returns silhouette score for each K value (amount of clusters).
533
534  @return (dict) Silhouette score for each K value, where key is a K value and value is a silhouette score.
535
536  @see process, get_score
537
538  """
539  return self.__scores
540
541
542  def __calculate_clusters(self, k):
543  """!
544  @brief Performs cluster analysis using specified K value.
545
546  @param[in] k (uint): Amount of clusters that should be allocated.
547
548  @return (array_like) Allocated clusters.
549
550  """
551  initial_values = kmeans_plusplus_initializer(self.__data, k, random_state=self.__random_state).initialize(return_index=self.__return_index)
552  algorithm_type = self.__algorithm.get_type()
553  return algorithm_type(self.__data, initial_values).process().get_clusters()
554
555
556  def __verify_arguments(self):
557  """!
558  @brief Checks the algorithm's arguments and raises an exception if any of them is incorrect.
559
560  """
561  if self.__kmax > len(self.__data):
562  raise ValueError("K max value '" + str(self.__kmax) + "' is bigger than amount of objects '" +
563  str(len(self.__data)) + "' in input data.")
564
565  if self.__kmin <= 1:
566  raise ValueError("K min value '" + str(self.__kmin) + "' should be greater than 1 (impossible to provide "
567  "silhouette score for only one cluster).")
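
The per-point computation performed by `silhouette.__calculate_score` in the listing above can be sketched outside the library. This is a minimal NumPy illustration, not pyclustering's implementation: the function name `silhouette_scores` is hypothetical, squared Euclidean distance is hard-coded (the default metric in the listing), and the sketch assumes at least two clusters with at least two points each, so the singleton case that yields NaN in the listing is not handled.

```python
import numpy as np


def silhouette_scores(points, clusters):
    """Per-point silhouette scores with squared Euclidean distance.

    `clusters` is a list of lists of point indices, the same layout that
    get_clusters() returns in the examples above.
    Assumes >= 2 clusters and >= 2 points in every cluster.
    """
    data = np.asarray(points, dtype=float)
    scores = np.zeros(len(data))

    for own, members in enumerate(clusters):
        for i in members:
            # Squared Euclidean distance from point i to every point.
            dist = np.sum((data - data[i]) ** 2, axis=1)

            # a(i): average distance to the other members of the own cluster
            # (dist[i] is 0, so dividing by size - 1 excludes the point itself).
            a = dist[members].sum() / (len(members) - 1)

            # b(i): smallest average distance to any other cluster.
            b = min(dist[other].mean()
                    for j, other in enumerate(clusters) if j != own)

            scores[i] = (b - a) / max(a, b)

    return scores


# Two tight groups of points far away from each other.
points = [[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]]

good = silhouette_scores(points, [[0, 1], [2, 3]])  # clusters match the groups
bad = silhouette_scores(points, [[0, 2], [1, 3]])   # clusters mix the groups

print("good clustering:", good)
print("bad clustering: ", bad)
```

Scores close to 1.0 indicate that a point sits well inside its cluster, while negative scores indicate that the point is, on average, closer to another cluster than to its own.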
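
The search loop in `silhouette_ksearch.__process_by_python` evaluates K over the half-open range from `kmin` up to, but not including, `kmax`, and keeps the K with the highest average Silhouette score. The sketch below illustrates that selection rule under stated assumptions: `mean_silhouette`, `search_k` and `split_sorted` are hypothetical names, squared Euclidean distance is hard-coded, and `split_sorted` is a toy stand-in for a real clustering algorithm such as K-Means (it only splits sorted one-dimensional data into contiguous chunks).

```python
import numpy as np


def mean_silhouette(data, clusters):
    """Average silhouette score (squared Euclidean distance).

    Assumes >= 2 clusters and >= 2 points in every cluster.
    """
    total = 0.0
    for own, members in enumerate(clusters):
        for i in members:
            dist = np.sum((data - data[i]) ** 2, axis=1)
            a = dist[members].sum() / (len(members) - 1)
            b = min(dist[c].mean() for j, c in enumerate(clusters) if j != own)
            total += (b - a) / max(a, b)
    return total / len(data)


def search_k(points, kmin, kmax, cluster_fn):
    """Return (best_k, scores) for K in the half-open range [kmin, kmax)."""
    data = np.asarray(points, dtype=float)
    best_k, best_score, scores = -1, -1.0, {}

    for k in range(kmin, kmax):
        clusters = cluster_fn(data, k)
        if len(clusters) != k:
            # The algorithm could not allocate exactly k clusters.
            scores[k] = float('nan')
            continue

        scores[k] = mean_silhouette(data, clusters)
        if scores[k] > best_score:
            best_k, best_score = k, scores[k]

    return best_k, scores


def split_sorted(data, k):
    # Hypothetical stand-in for a clustering algorithm: sort the points by
    # their first coordinate and cut the order into k contiguous chunks.
    order = np.argsort(data[:, 0])
    return [chunk.tolist() for chunk in np.array_split(order, k)]


# Three well-separated pairs of one-dimensional points.
points = [[0.0], [1.0], [10.0], [11.0], [20.0], [21.0]]
best_k, scores = search_k(points, 2, 4, split_sorted)

print("scores:", scores)
print("best K:", best_k)
```

Because the upper bound is exclusive, calling the sketch with `kmin=2` and `kmax=4` scores only K = 2 and K = 3, which matches the listing's example where `kmax=10` produces scores for K = 2 through 9.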