Represents the Silhouette method, which is used for interpretation and validation of consistency within clusters of data.

Public Member Functions
def	__init__ (self, data, clusters, **kwargs)
	Initializes Silhouette method for analysis.

def	process (self)
	Calculates Silhouette score for each object from input data.

def	get_score (self)
	Returns Silhouette score for each object from input data.

Detailed Description

Represents the Silhouette method, which is used for interpretation and validation of consistency within clusters of data.

The silhouette value is a measure of how similar an object is to its own cluster compared to other clusters. Be aware that the Silhouette method is applicable to the K algorithm family, such as K-Means, K-Medians, K-Medoids, X-Means, etc., but not to DBSCAN, OPTICS, CURE, etc. The Silhouette value is calculated using the following formula:

$s\left ( i \right )=\frac{ b\left ( i \right ) - a\left ( i \right ) }{ \max\left \{ a\left ( i \right ), b\left ( i \right ) \right \}}$

where $a\left ( i \right )$ is the average distance from object i to objects in its own cluster, and $b\left ( i \right )$ is the average distance from object i to objects in the nearest cluster (the smallest such average among the other clusters).
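The formula above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: the helper names `euclidean` and `silhouette_scores` are hypothetical, the sketch uses plain Euclidean distance, it excludes object i itself when averaging a(i), and it assigns a score of 0 to objects in single-object clusters. The library may treat these details differently.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two points (assumed metric for this sketch)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))


def silhouette_scores(points, clusters, distance=euclidean):
    """Return s(i) for every point; clusters are lists of point indexes."""
    if len(clusters) < 2:
        raise ValueError("at least two clusters are required")

    scores = [0.0] * len(points)
    for own_index, own in enumerate(clusters):
        for i in own:
            if len(own) < 2:
                continue  # assumed convention: s(i) = 0 for a single-object cluster

            # a(i): average distance to the other objects of the same cluster
            a = sum(distance(points[i], points[j]) for j in own if j != i) / (len(own) - 1)

            # b(i): smallest average distance to the objects of any other cluster
            b = min(
                sum(distance(points[i], points[j]) for j in other) / len(other)
                for other_index, other in enumerate(clusters)
                if other_index != own_index and other
            )

            denominator = max(a, b)
            scores[i] = (b - a) / denominator if denominator > 0 else 0.0
    return scores


# Two compact, well-separated clusters: every score should be close to 1.
points = [[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]]
clusters = [[0, 1], [2, 3]]
scores = silhouette_scores(points, clusters)
```

Scores near 1 indicate that an object sits much closer to its own cluster than to the nearest other one; scores near -1 indicate the opposite.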

Here is an example where the Silhouette score is calculated for a K-Means clustering result:

from pyclustering.cluster.center_initializer import kmeans_plusplus_initializer
from pyclustering.cluster.kmeans import kmeans
from pyclustering.cluster.silhouette import silhouette
from pyclustering.samples.definitions import SIMPLE_SAMPLES
from pyclustering.utils import read_sample
# Read data 'SampleSimple3' from Simple Sample collection.
sample = read_sample(SIMPLE_SAMPLES.SAMPLE_SIMPLE3)
# Prepare initial centers
centers = kmeans_plusplus_initializer(sample, 4).initialize()
# Perform cluster analysis
kmeans_instance = kmeans(sample, centers)
kmeans_instance.process()
clusters = kmeans_instance.get_clusters()
# Calculate Silhouette score for each object
score = silhouette(sample, clusters).process().get_score()
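Because the result holds one score per object, it is often condensed into a few summary numbers before being interpreted. The sketch below shows one possible way to do that. The helper `summarize_scores` is hypothetical and not part of pyclustering, and `example_scores` contains made-up illustrative values, not output from the sample above.

```python
from statistics import mean


def summarize_scores(scores):
    """Condense per-object Silhouette scores into a few summary values."""
    return {
        "mean": mean(scores),   # overall quality of the clustering
        "min": min(scores),     # worst-placed object
        # indexes of objects that are closer to another cluster than to their own
        "negative": [index for index, value in enumerate(scores) if value < 0],
    }


# Illustrative values only (hypothetical, not produced by the library).
example_scores = [0.8, 0.6, -0.2, 0.4]
summary = summarize_scores(example_scores)
```

A higher mean suggests better separated clusters, and the list of negative indexes points at objects that may have been assigned to the wrong cluster.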

See also: kmeans, kmedoids, kmedians, xmeans, elbow

Definition at line 45 of file silhouette.py.

Constructor & Destructor Documentation

◆ __init__()

def pyclustering.cluster.silhouette.silhouette.__init__	(	self,
		data,
		clusters,
		**kwargs
	)

Initializes Silhouette method for analysis.

Parameters

[in]	data	(array_like): Input data that was used for cluster analysis and that is presented as list of points or distance matrix (defined by parameter 'data_type', by default data is considered as a list of points).
[in]	clusters	(list): Clusters that have been obtained after cluster analysis.
[in]	**kwargs	Arbitrary keyword arguments (available arguments: 'metric', 'data_type', 'ccore').

Keyword Args:

metric (distance_metric): Metric that was used for cluster analysis and should be used for Silhouette score calculation (by default squared Euclidean distance).
data_type (string): Data type of input sample 'data' that is processed by the algorithm ('points' or 'distance_matrix'; by default 'points').
ccore (bool): If True then CCORE (C++ implementation of pyclustering library) is used (by default True).
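To illustrate the two forms of input named by 'data_type', the sketch below builds a distance matrix from a list of points using squared Euclidean distance. The helpers `squared_euclidean` and `build_distance_matrix` are hypothetical and not part of pyclustering. The sketch only shows what a 'distance_matrix' input could look like, under the assumption that the matrix is a square list of lists indexed by object.

```python
def squared_euclidean(p, q):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(p, q))


def build_distance_matrix(points, distance=squared_euclidean):
    """Return a square matrix where entry [i][j] is the distance between points i and j."""
    return [[distance(p, q) for q in points] for p in points]


points = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
matrix = build_distance_matrix(points)

# With data_type='points' the list 'points' would be passed as 'data';
# with data_type='distance_matrix' the list 'matrix' would be passed instead.
```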

Definition at line 84 of file silhouette.py.

Member Function Documentation

◆ get_score()

def pyclustering.cluster.silhouette.silhouette.get_score ( self )

Returns Silhouette score for each object from input data.

See also: process

Definition at line 157 of file silhouette.py.

◆ process()

def pyclustering.cluster.silhouette.silhouette.process ( self )

Calculates Silhouette score for each object from input data.

Returns: (silhouette) Instance of the method (self).

Definition at line 123 of file silhouette.py.

The documentation for this class was generated from the following file:

pyclustering/cluster/silhouette.py