pyclustering.cluster.birch.birch Class Reference

Class represents clustering algorithm BIRCH.

Public Member Functions

def __init__ (self, data, number_clusters, branching_factor=5, max_node_entries=5, initial_diameter=0.1, type_measurement=measurement_type.CENTROID_EUCLIDEAN_DISTANCE, entry_size_limit=200, diameter_multiplier=1.5, ccore=True)
 Constructor of clustering algorithm BIRCH.
 
def process (self)
 Performs cluster analysis in line with rules of BIRCH algorithm.
 
def get_clusters (self)
 Returns the list of allocated clusters; each cluster contains indexes of objects in the input data list.
 
def get_cluster_encoding (self)
 Returns the clustering result representation type, which indicates how clusters are encoded.
 

Detailed Description

Class represents clustering algorithm BIRCH.

Example:

from pyclustering.cluster.birch import birch
from pyclustering.container.cftree import measurement_type
from pyclustering.utils import read_sample

# load the sample for cluster analysis (a list of points)
sample = read_sample(path_to_sample)
# create a birch instance that uses CCORE for processing
birch_instance = birch(sample, 2, 5, 5, 0.05, measurement_type.CENTROID_EUCLIDEAN_DISTANCE, 200, ccore=True)
# run cluster analysis
birch_instance.process()
# obtain clustering results
clusters = birch_instance.get_clusters()

Definition at line 35 of file birch.py.

Constructor & Destructor Documentation

◆ __init__()

def pyclustering.cluster.birch.birch.__init__ (   self,
  data,
  number_clusters,
  branching_factor = 5,
  max_node_entries = 5,
  initial_diameter = 0.1,
  type_measurement = measurement_type.CENTROID_EUCLIDEAN_DISTANCE,
  entry_size_limit = 200,
  diameter_multiplier = 1.5,
  ccore = True 
)

Constructor of clustering algorithm BIRCH.

Parameters
[in]data(list): Input data presented as a list of points (objects), where each point is represented by a list or tuple.
[in]number_clusters(uint): Number of clusters that should be allocated.
[in]branching_factor(uint): Maximum number of successors that each non-leaf node in the CF-Tree may contain.
[in]max_node_entries(uint): Maximum number of entries that each leaf node in the CF-Tree may contain.
[in]initial_diameter(double): Initial diameter used for CF-Tree construction; it can be increased if entry_size_limit is exceeded.
[in]type_measurement(measurement_type): Type of measurement used for calculating distance metrics.
[in]entry_size_limit(uint): Maximum number of entries that can be stored in the CF-Tree; if it is exceeded during construction, the diameter is increased and the CF-Tree is rebuilt.
[in]diameter_multiplier(double): Multiplier used to increase the diameter when entry_size_limit is exceeded.
[in]ccore(bool): If True, then the CCORE DLL (C++ solution) is used for solving the problem.
Remarks
Although the constructor takes nine arguments, only the first two are mandatory; the others can be omitted, in which case default values are used for instance creation.

Example:

birch_instance1 = birch(sample1, 2)  # two clusters should be allocated
birch_instance2 = birch(sample2, 5)  # five clusters should be allocated
# three clusters should be allocated; each non-leaf node can have at most 5 successors,
# each leaf node can have at most 5 entries, and the initial diameter is 0.05.
birch_instance3 = birch(sample3, 3, 5, 5, 0.05)

Definition at line 56 of file birch.py.
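
The interplay of initial_diameter, entry_size_limit and diameter_multiplier described above can be sketched as follows. This is a simplified illustration, not the library's implementation: it replaces the CF-Tree with a flat list of entries, and the helper names build_entries() and build_with_limit() are hypothetical. Only the rebuild rule is taken from the parameter descriptions: if the number of entries exceeds the limit, the diameter is multiplied and the structure is rebuilt.

```python
import math


def build_entries(points, diameter):
    """Greedy, flat stand-in for CF-Tree insertion (an assumption, not pyclustering code).

    Each entry is [count, linear_sum]. A point is absorbed by the nearest entry
    if it lies within `diameter` of that entry's centroid; otherwise it starts
    a new entry.
    """
    entries = []
    for point in points:
        best, best_distance = None, math.inf
        for entry in entries:
            centroid = [s / entry[0] for s in entry[1]]
            distance = math.dist(point, centroid)
            if distance < best_distance:
                best, best_distance = entry, distance
        if best is not None and best_distance <= diameter:
            best[0] += 1
            best[1] = [s + x for s, x in zip(best[1], point)]
        else:
            entries.append([1, list(point)])
    return entries


def build_with_limit(points, initial_diameter=0.1, entry_size_limit=200,
                     diameter_multiplier=1.5):
    """Rebuild with a larger diameter until the entry count fits the limit."""
    if initial_diameter <= 0 or diameter_multiplier <= 1 or entry_size_limit < 1:
        raise ValueError("diameter must be positive, multiplier > 1, limit >= 1")
    diameter = initial_diameter
    entries = build_entries(points, diameter)
    while len(entries) > entry_size_limit:
        diameter *= diameter_multiplier  # grow the threshold
        entries = build_entries(points, diameter)  # rebuild from scratch
    return entries, diameter


points = [(0.0,), (1.0,), (2.0,), (3.0,)]
entries, final_diameter = build_with_limit(points, 0.1, 2, 2.0)
```

A larger diameter_multiplier reaches an acceptable entry count in fewer rebuilds, at the cost of coarser entries.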

Member Function Documentation

◆ get_cluster_encoding()

def pyclustering.cluster.birch.birch.get_cluster_encoding (   self)

Returns the clustering result representation type, which indicates how clusters are encoded.

Returns
(type_encoding) Clustering result representation.
See also
get_clusters()

Definition at line 143 of file birch.py.
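
The encoding matters when results are passed to code that expects a different representation. Since get_clusters() is documented as returning lists of object indexes, the sketch below shows one way such a result could be converted to a per-point label list. The function index_lists_to_labels() and the noise_label value are hypothetical and not part of pyclustering.

```python
def index_lists_to_labels(clusters, number_points, noise_label=-1):
    """Convert clusters given as lists of point indexes into one label per point.

    Points that appear in no cluster keep `noise_label` (an assumed convention).
    """
    labels = [noise_label] * number_points
    for cluster_id, cluster in enumerate(clusters):
        for point_index in cluster:
            labels[point_index] = cluster_id
    return labels


# Two clusters over five points; point 4 belongs to neither cluster.
labels = index_lists_to_labels([[0, 2], [1, 3]], 5)
```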

◆ get_clusters()

def pyclustering.cluster.birch.birch.get_clusters (   self)

Returns the list of allocated clusters; each cluster contains indexes of objects in the input data list.

Remarks
Allocated clusters can be returned only after data processing (call process() first); otherwise an empty list is returned.
Returns
(list) List of allocated clusters.
See also
process()
get_noise()

Definition at line 127 of file birch.py.

Referenced by pyclustering.samples.answer_reader.get_cluster_lengths(), and pyclustering.cluster.optics.optics.process().
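
Because each cluster holds indexes rather than the points themselves, a lookup into the original data is needed to recover coordinates. A minimal sketch, assuming `data` is the same list that was passed to the constructor; the helper name clusters_to_points() is hypothetical:

```python
def clusters_to_points(data, clusters):
    """Replace every index in every cluster with the corresponding point from data."""
    return [[data[index] for index in cluster] for cluster in clusters]


# Illustrative values only: three 2-D points and a result with two clusters.
data = [[0.0, 0.1], [5.0, 5.1], [0.2, 0.0]]
clusters = [[0, 2], [1]]
points_by_cluster = clusters_to_points(data, clusters)
```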

◆ process()

def pyclustering.cluster.birch.birch.process (   self)

Performs cluster analysis in line with rules of BIRCH algorithm.

Remarks
Clustering results can be obtained using the corresponding getter methods.
See also
get_clusters()

Definition at line 99 of file birch.py.


The documentation for this class was generated from the following file: