Class that represents the K-Medoids clustering algorithm (also known as PAM, Partitioning Around Medoids).

Public Member Functions
def	__init__ (self, data, initial_index_medoids, tolerance=0.001, ccore=True, **kwargs)
	Constructor of the K-Medoids clustering algorithm.

def	process (self)
	Performs cluster analysis according to the rules of the K-Medoids algorithm.

def	get_clusters (self)
	Returns the list of allocated clusters; each cluster contains indexes of objects in the input data list.

def	get_medoids (self)
	Returns the list of medoids of the allocated clusters, represented by indexes into the input data.

def	get_cluster_encoding (self)
	Returns the clustering result representation type, which indicates how clusters are encoded.

Detailed Description

Class that represents the K-Medoids clustering algorithm (also known as PAM, Partitioning Around Medoids).

The algorithm is less sensitive to outliers than K-Means. The principal difference between K-Medoids and K-Medians is that K-Medoids uses existing points from the input data space as medoids, whereas a median in K-Medians can be an artificial object (one that is not in the input data space).
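The distinction between a medoid and a median can be illustrated with a minimal pure-Python sketch. It does not use pyclustering; the helper names `medoid` and `coordinate_median` are hypothetical, and Euclidean distance is assumed.

```python
import math
import statistics

# Three 2-D points used only for illustration.
points = [(0.0, 0.0), (4.0, 1.0), (1.0, 4.0)]


def medoid(cluster):
    """Return the member of `cluster` with the smallest total distance to all members."""
    return min(cluster, key=lambda p: sum(math.dist(p, q) for q in cluster))


def coordinate_median(cluster):
    """Return the coordinate-wise median, which need not coincide with any member."""
    return tuple(statistics.median(axis) for axis in zip(*cluster))


m = medoid(points)
c = coordinate_median(points)
print(m, m in points)  # the medoid is always one of the input points
print(c, c in points)  # the median here is (1.0, 1.0), which is not an input point
```

Because the medoid is selected from the data rather than computed, it can also be returned as an index into the input, which is how get_medoids() reports it.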

The CCORE option enables the pyclustering core, a C/C++ shared library, for processing, which significantly increases performance.

Clustering example:

from pyclustering.cluster.kmedoids import kmedoids
from pyclustering.utils import read_sample

# load list of points for cluster analysis
sample = read_sample(path)
# set initial medoids (indexes of points in the sample)
initial_medoids = [1, 10]
# create instance of K-Medoids algorithm
kmedoids_instance = kmedoids(sample, initial_medoids)
# run cluster analysis and obtain results
kmedoids_instance.process()
clusters = kmedoids_instance.get_clusters()
# show allocated clusters
print(clusters)

The metric used to calculate the distance between points can be specified with the additional parameter 'metric':

from pyclustering.utils.metric import distance_metric, type_metric

# create Minkowski distance metric with degree 2
metric = distance_metric(type_metric.MINKOWSKI, degree=2)
# create K-Medoids algorithm with specific distance metric
kmedoids_instance = kmedoids(sample, initial_medoids, metric=metric)
# run cluster analysis and obtain results
kmedoids_instance.process()
clusters = kmedoids_instance.get_clusters()

A distance matrix can be used instead of a sequence of points to increase performance; in that case the parameter 'data_type' should be set to 'distance_matrix':

from pyclustering.utils import read_sample, calculate_distance_matrix

# calculate distance matrix for sample
sample = read_sample(path_to_sample)
matrix = calculate_distance_matrix(sample)
# create K-Medoids algorithm for processing distance matrix instead of points
kmedoids_instance = kmedoids(matrix, initial_medoids, data_type='distance_matrix')
# run cluster analysis and obtain results
kmedoids_instance.process()
clusters = kmedoids_instance.get_clusters()
medoids = kmedoids_instance.get_medoids()
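To make the 'distance_matrix' input more concrete, here is a small pure-Python sketch of a square pairwise distance matrix. The helper `pairwise_distance_matrix` is hypothetical and only stands in for the idea behind calculate_distance_matrix; Euclidean distance is an assumption.

```python
import math


def pairwise_distance_matrix(points):
    """Build a square matrix where entry [i][j] is the distance between points i and j."""
    return [[math.dist(p, q) for q in points] for p in points]


points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
matrix = pairwise_distance_matrix(points)

for row in matrix:
    print(row)
```

Row and column `i` of such a matrix both refer to point `i`, so medoid indexes keep the same meaning as for point input.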

Definition at line 41 of file kmedoids.py.

Constructor & Destructor Documentation

◆ __init__()

def pyclustering.cluster.kmedoids.kmedoids.__init__	(	self,
		data,
		initial_index_medoids,
		tolerance = `0.001`,
		ccore = `True`,
		**kwargs
	)

Constructor of the K-Medoids clustering algorithm.

Parameters

[in]	data	(list): Input data presented as a list of points (objects); each point should be represented by a list or tuple.
[in]	initial_index_medoids	(list): Indexes of initial medoids (indexes of points in the input data).
[in]	tolerance	(double): Stop condition: if the maximum change in distance of the cluster medoids is less than tolerance, then the algorithm stops processing.
[in]	ccore	(bool): If True, the CCORE library (C++ pyclustering library) is used for clustering instead of Python code.
[in]	**kwargs	Arbitrary keyword arguments (available arguments: 'metric', 'data_type').

Keyword Args:

metric (distance_metric): Metric that is used for distance calculation between two points.
data_type (string): Data type of input sample 'data' that is processed by the algorithm ('points', 'distance_matrix').

Definition at line 101 of file kmedoids.py.

Member Function Documentation

◆ get_cluster_encoding()

def pyclustering.cluster.kmedoids.kmedoids.get_cluster_encoding ( self )

Returns the clustering result representation type, which indicates how clusters are encoded.

Returns: (type_encoding) Clustering result representation.

See also: get_clusters()

Definition at line 187 of file kmedoids.py.

◆ get_clusters()

def pyclustering.cluster.kmedoids.kmedoids.get_clusters ( self )

Returns the list of allocated clusters; each cluster contains indexes of objects in the input data list.

See also: process(); get_medoids()

Definition at line 163 of file kmedoids.py.

Referenced by pyclustering.samples.answer_reader.get_cluster_lengths(), and pyclustering.cluster.optics.optics.process().
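Since each cluster is a list of indexes into the input data, a per-point label list can be derived from it when needed. The following is a minimal sketch; `clusters_to_labels` is a hypothetical helper, not part of pyclustering, and the sample cluster lists are made up for illustration.

```python
def clusters_to_labels(clusters, n_points):
    """Convert lists of point indexes (one list per cluster) into one label per point."""
    labels = [None] * n_points  # None marks a point that belongs to no cluster
    for cluster_id, members in enumerate(clusters):
        for index in members:
            labels[index] = cluster_id
    return labels


# Shape of the value described for get_clusters(): one index list per cluster.
clusters = [[0, 2, 3], [1, 4]]
labels = clusters_to_labels(clusters, 5)
print(labels)  # [0, 1, 0, 0, 1]
```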

◆ get_medoids()

def pyclustering.cluster.kmedoids.kmedoids.get_medoids ( self )

Returns the list of medoids of the allocated clusters, represented by indexes into the input data.

See also: process(); get_clusters()

Definition at line 175 of file kmedoids.py.

◆ process()

def pyclustering.cluster.kmedoids.kmedoids.process ( self )

Performs cluster analysis according to the rules of the K-Medoids algorithm.

Returns: (kmedoids) Returns itself (K-Medoids instance).

Remarks: Clustering results can be obtained using the corresponding get methods.

See also: get_clusters(); get_medoids()

Definition at line 130 of file kmedoids.py.
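As a rough illustration of the kind of iteration a K-Medoids run involves, the pure-Python sketch below implements the simple alternating variant (assign points, then re-pick medoids). It is not pyclustering's implementation, which may differ. The function name `kmedoids_sketch` and the `max_iterations` cap are hypothetical; the parameters `data`, `initial_index_medoids`, and `tolerance` mirror the constructor, the stop rule follows the tolerance description, and Euclidean distance is assumed.

```python
import math


def kmedoids_sketch(data, initial_index_medoids, tolerance=0.001, max_iterations=100):
    """Alternating K-Medoids sketch; returns (clusters, medoids) as index lists."""
    medoids = list(initial_index_medoids)
    clusters = []
    for _ in range(max_iterations):  # hypothetical safety cap, not a library parameter
        # Assignment step: every point index joins the cluster of its nearest medoid.
        clusters = [[] for _ in medoids]
        for index, point in enumerate(data):
            nearest = min(range(len(medoids)),
                          key=lambda k: math.dist(point, data[medoids[k]]))
            clusters[nearest].append(index)

        # Update step: the new medoid is the member with the smallest
        # total distance to all members of its cluster.
        new_medoids = []
        for old, members in zip(medoids, clusters):
            if not members:
                new_medoids.append(old)  # keep the old medoid if a cluster is empty
                continue
            new_medoids.append(min(
                members,
                key=lambda c: sum(math.dist(data[c], data[j]) for j in members)))

        # Stop condition: largest displacement of any medoid is below tolerance.
        change = max(math.dist(data[a], data[b])
                     for a, b in zip(medoids, new_medoids))
        medoids = new_medoids
        if change < tolerance:
            break
    return clusters, medoids


sample = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
clusters, medoids = kmedoids_sketch(sample, [1, 4])
print(clusters)  # [[0, 1, 2], [3, 4, 5]]
print(medoids)   # [0, 3]
```

Both return values are index-based, matching the shape described for get_clusters() and get_medoids().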

The documentation for this class was generated from the following file:

pyclustering/cluster/kmedoids.py
