pyclustering.cluster.birch.birch Class Reference

Class that represents the BIRCH clustering algorithm.

Public Member Functions

def __init__ (self, data, number_clusters, branching_factor=5, max_node_entries=5, initial_diameter=0.1, type_measurement=measurement_type.CENTROID_EUCLIDEAN_DISTANCE, entry_size_limit=200, diameter_multiplier=1.5, ccore=True)
 Constructor of the BIRCH clustering algorithm.
 
def process (self)
 Performs cluster analysis according to the rules of the BIRCH algorithm.
 
def get_clusters (self)
 Returns the list of allocated clusters; each cluster contains indexes of objects from the input data.
 
def get_cluster_encoding (self)
 Returns the clustering result representation type, which indicates how clusters are encoded.
 

Detailed Description

Class that represents the BIRCH clustering algorithm.

Example of how to extract clusters from the 'OldFaithful' sample using the BIRCH algorithm:

from pyclustering.cluster.birch import birch, measurement_type
from pyclustering.cluster import cluster_visualizer
from pyclustering.utils import read_sample
from pyclustering.samples.definitions import FAMOUS_SAMPLES
# Sample for cluster analysis (represented by list)
sample = read_sample(FAMOUS_SAMPLES.SAMPLE_OLD_FAITHFUL)
# Create BIRCH algorithm
birch_instance = birch(sample, 2)
# Cluster analysis
birch_instance.process()
# Obtain results of clustering
clusters = birch_instance.get_clusters()
# Visualize allocated clusters
visualizer = cluster_visualizer()
visualizer.append_clusters(clusters, sample)
visualizer.show()

Definition at line 35 of file birch.py.

Constructor & Destructor Documentation

◆ __init__()

def pyclustering.cluster.birch.birch.__init__ (   self,
  data,
  number_clusters,
  branching_factor = 5,
  max_node_entries = 5,
  initial_diameter = 0.1,
  type_measurement = measurement_type.CENTROID_EUCLIDEAN_DISTANCE,
  entry_size_limit = 200,
  diameter_multiplier = 1.5,
  ccore = True 
)

Constructor of the BIRCH clustering algorithm.

Parameters
[in] data (list): Input data presented as a list of points (objects), where each point is represented by a list or tuple.
[in] number_clusters (uint): Number of clusters that should be allocated.
[in] branching_factor (uint): Maximum number of successors that each non-leaf node in the CF-Tree may contain.
[in] max_node_entries (uint): Maximum number of entries that each leaf node in the CF-Tree may contain.
[in] initial_diameter (double): Initial diameter used for CF-Tree construction; it can be increased if entry_size_limit is exceeded.
[in] type_measurement (measurement_type): Type of measurement used to calculate distance metrics.
[in] entry_size_limit (uint): Maximum number of entries that can be stored in the CF-Tree; if it is exceeded during creation, the diameter is increased and the CF-Tree is rebuilt.
[in] diameter_multiplier (double): Multiplier used to increase the diameter when entry_size_limit is exceeded.
[in] ccore (bool): If True, then CCORE (the C++ part of the library) is used to solve the problem.
Remarks
Although the constructor accepts nine arguments, only the first two (data and number_clusters) are mandatory; the others can be omitted, in which case default values are used for instance creation.
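The interaction between initial_diameter, entry_size_limit and diameter_multiplier can be illustrated with a small, self-contained sketch. It is not pyclustering's implementation: it assumes a flat list of entries instead of a real CF-Tree, rebuilds only after a full pass over the data, and always uses the distance between centroids to pick the nearest entry. The names Entry, build_entries and build_with_limit are hypothetical and exist only for this illustration.

```python
import math


class Entry:
    """Hypothetical clustering feature: point count, linear sum, sum of squared norms."""

    def __init__(self, point):
        self.n = 1
        self.ls = list(point)
        self.ss = sum(x * x for x in point)

    def centroid(self):
        return [x / self.n for x in self.ls]

    def merged_diameter(self, point):
        # Root-mean-square pairwise distance the entry would have after absorbing the point.
        n = self.n + 1
        ls = [a + b for a, b in zip(self.ls, point)]
        ss = self.ss + sum(x * x for x in point)
        value = (2.0 * n * ss - 2.0 * sum(x * x for x in ls)) / (n * (n - 1))
        return math.sqrt(max(value, 0.0))

    def absorb(self, point):
        self.n += 1
        self.ls = [a + b for a, b in zip(self.ls, point)]
        self.ss += sum(x * x for x in point)


def build_entries(data, diameter):
    """Assign each point to the nearest entry if the merged diameter stays within the limit."""
    entries = []
    for point in data:
        nearest = min(entries, key=lambda e: math.dist(e.centroid(), point), default=None)
        if nearest is not None and nearest.merged_diameter(point) <= diameter:
            nearest.absorb(point)
        else:
            entries.append(Entry(point))
    return entries


def build_with_limit(data, initial_diameter=0.1, entry_size_limit=200, diameter_multiplier=1.5):
    """Grow the diameter and rebuild until the entry count fits (assumes entry_size_limit >= 1)."""
    diameter = initial_diameter
    entries = build_entries(data, diameter)
    while len(entries) > entry_size_limit:
        diameter *= diameter_multiplier
        entries = build_entries(data, diameter)
    return entries, diameter


data = [(0.0, 0.0), (0.05, 0.0), (10.0, 10.0), (10.05, 10.0), (20.0, 20.0)]
entries, diameter = build_with_limit(data, initial_diameter=0.1, entry_size_limit=2)
print(len(entries), diameter)
```

In this sketch a small entry_size_limit forces the diameter to grow past its initial value, which in turn makes the entries coarser; a generous limit leaves initial_diameter untouched.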

Definition at line 70 of file birch.py.

Member Function Documentation

◆ get_cluster_encoding()

def pyclustering.cluster.birch.birch.get_cluster_encoding (   self)

Returns the clustering result representation type, which indicates how clusters are encoded.

Returns
(type_encoding) Clustering result representation.
See also
get_clusters()
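To show why the encoding matters, the sketch below converts clusters given as lists of point indexes (the form described for get_clusters()) into one label per point. It is plain Python and does not call the library; the helper name index_lists_to_labels and the use of -1 for unassigned points are assumptions made for this illustration.

```python
def index_lists_to_labels(clusters, number_points):
    """Convert index-list clusters into a list holding one cluster label per data point."""
    labels = [-1] * number_points  # -1 marks points that belong to no cluster
    for cluster_id, indexes in enumerate(clusters):
        for index in indexes:
            labels[index] = cluster_id
    return labels


clusters = [[0, 2, 3], [1, 4]]
labels = index_lists_to_labels(clusters, 5)
print(labels)  # [0, 1, 0, 0, 1]
```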

Definition at line 150 of file birch.py.

◆ get_clusters()

def pyclustering.cluster.birch.birch.get_clusters (   self)

Returns the list of allocated clusters; each cluster contains indexes of objects from the input data.

Remarks
Allocated clusters can be returned only after data processing (call process() first); otherwise an empty list is returned.
Returns
(list) List of allocated clusters.
See also
process()
get_noise()
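Because each cluster holds indexes rather than the points themselves, the result usually has to be mapped back onto the input data. A minimal sketch in plain Python, using made-up data and a hypothetical helper named clusters_as_points:

```python
def clusters_as_points(clusters, data):
    """Replace every index in every cluster with the corresponding point from data."""
    return [[data[index] for index in cluster] for cluster in clusters]


# Illustrative values only; real clusters would come from get_clusters() after process().
data = [(1.0, 1.0), (1.2, 0.9), (8.0, 8.1), (7.9, 8.3)]
clusters = [[0, 1], [2, 3]]

grouped = clusters_as_points(clusters, data)
sizes = [len(cluster) for cluster in clusters]
print(grouped)
print(sizes)
```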

Definition at line 134 of file birch.py.

Referenced by pyclustering.samples.answer_reader.get_cluster_lengths(), and pyclustering.cluster.optics.optics.process().

◆ process()

def pyclustering.cluster.birch.birch.process (   self)

Performs cluster analysis according to the rules of the BIRCH algorithm.

Returns
(birch) Returns itself (BIRCH instance).
See also
get_clusters()

Definition at line 105 of file birch.py.


The documentation for this class was generated from the following file: