pyclustering  0.10.1
pyclustering is a Python, C++ data mining library. Class Reference

Class that represents the genetic clustering algorithm.

Public Member Functions

def __init__ (self, data, count_clusters, chromosome_count, population_count, **kwargs)
 Initializes the genetic clustering algorithm.
def process (self)
 Performs the clustering procedure according to the rules of the genetic clustering algorithm.
def get_observer (self)
 Returns the genetic algorithm observer.
def get_clusters (self)
 Returns the list of allocated clusters; each cluster contains indexes of objects from the data.

Detailed Description

Class that represents the genetic clustering algorithm.

The search capability of genetic algorithms is used to find appropriate cluster centres.
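To make the idea concrete, here is a minimal sketch of how a candidate solution might be scored. It assumes that a chromosome stores one cluster label per data point and that fitness is the sum of Euclidean distances from each point to the centre of its assigned cluster (lower is better). The helper names cluster_centres and fitness are hypothetical and are not part of the pyclustering API.

```python
import math


def cluster_centres(data, chromosome, count_clusters):
    """Mean point of every cluster encoded by the chromosome (hypothetical helper)."""
    dims = len(data[0])
    sums = [[0.0] * dims for _ in range(count_clusters)]
    sizes = [0] * count_clusters
    for point, label in zip(data, chromosome):
        sizes[label] += 1
        for d in range(dims):
            sums[label][d] += point[d]
    # Sketch simplification: an empty cluster keeps the origin as its centre.
    return [[s / max(size, 1) for s in row] for row, size in zip(sums, sizes)]


def fitness(data, chromosome, count_clusters):
    """Sum of point-to-centre distances; lower means tighter clusters (assumption)."""
    centres = cluster_centres(data, chromosome, count_clusters)
    return sum(math.dist(point, centres[label])
               for point, label in zip(data, chromosome))


data = [[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]]
good = [0, 0, 1, 1]  # groups the two left points and the two right points
bad = [0, 1, 0, 1]   # mixes left and right points in each cluster

print(fitness(data, good, 2))  # 4.0
print(fitness(data, bad, 2))   # 20.0
```

A search procedure that prefers lower values of such a score is pushed towards labelings whose centres sit inside dense groups of points.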

Example of clustering using genetic algorithm:

from pyclustering.cluster.ga import genetic_algorithm, ga_observer
from pyclustering.utils import read_sample
from pyclustering.samples.definitions import SIMPLE_SAMPLES

# Read input data for clustering
sample = read_sample(SIMPLE_SAMPLES.SAMPLE_SIMPLE4)

# Create instance of observer that will collect all information
observer_instance = ga_observer(True, True, True)

# Create genetic algorithm for clustering
# (the numeric values below are illustrative, not prescribed settings)
ga_instance = genetic_algorithm(data=sample,
                                count_clusters=4,
                                chromosome_count=100,
                                population_count=200,
                                count_mutation_gens=1,
                                observer=observer_instance)

# Start processing
ga_instance.process()

# Obtain results
clusters = ga_instance.get_clusters()

# Print clusters to console
print("Amount of clusters: '%d'. Clusters: '%s'" % (len(clusters), clusters))

[Figure: example clustering results (fitness function evolution and allocated clusters) visualized by 'ga_visualizer'.]

Definition at line 304 of file

Constructor & Destructor Documentation

◆ __init__()

def __init__ (self, data, count_clusters, chromosome_count, population_count, **kwargs)

Initialize genetic clustering algorithm.

[in] data (numpy.array|list): Input data for clustering, represented by a two-dimensional array where each row is a point, for example, [[0.0, 2.1], [0.1, 2.0], [-0.2, 2.4]].
[in] count_clusters (uint): The number of clusters that should be allocated in the data.
[in] chromosome_count (uint): The number of chromosomes in each population.
[in] population_count (uint): The number of populations, which essentially defines the number of iterations.
[in] **kwargs: Arbitrary keyword arguments (available arguments: count_mutation_gens, coeff_mutation_count, select_coeff, crossover_rate, observer, random_state).

Keyword Args:

  • count_mutation_gens (uint): Number of genes in a chromosome that are mutated on each step.
  • coeff_mutation_count (float): Fraction of chromosomes selected for mutation, in the range (0, 1]; the number of mutated chromosomes is chromosome_count * coeff_mutation_count.
  • select_coeff (float): Exponential coefficient for the selection procedure, used as follows: math.exp(1 + fitness(chromosome) * select_coeff).
  • crossover_rate (float): Crossover rate.
  • observer (ga_observer): Observer that is used for collecting information about the clustering process on each step.
  • random_state (int): Seed for random state (None by default, in which case the current system time is used).
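The two formulas quoted in the list above can be tried out in isolation. The following is a minimal sketch, not the library's implementation: the helper names are hypothetical, truncating the mutation count to an integer is an assumption, and normalizing the exponential terms into weights that sum to 1 is also an assumption about how they might be used for selection.

```python
import math


def mutated_chromosome_count(chromosome_count, coeff_mutation_count):
    """Number of chromosomes picked for mutation (hypothetical helper)."""
    if not 0.0 < coeff_mutation_count <= 1.0:
        raise ValueError("coeff_mutation_count must be in the range (0, 1]")
    # Assumption: the product is truncated to a whole number of chromosomes.
    return int(chromosome_count * coeff_mutation_count)


def selection_weights(fitness_values, select_coeff):
    """Apply math.exp(1 + fitness * select_coeff) and normalize (hypothetical helper)."""
    raw = [math.exp(1 + value * select_coeff) for value in fitness_values]
    total = sum(raw)
    # Assumption: the raw terms are normalized so that they sum to 1.
    return [term / total for term in raw]


count = mutated_chromosome_count(chromosome_count=100, coeff_mutation_count=0.25)
weights = selection_weights([0.0, 1.0], select_coeff=1.0)
print(count)    # 25
print(weights)  # two weights whose ratio equals e
```

With select_coeff set to 1.0, a fitness difference of 1 changes the raw term by a factor of e, so the coefficient controls how strongly fitness differences are amplified during selection.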

Definition at line 350 of file

Member Function Documentation

◆ get_clusters()

def get_clusters (self)

Returns the list of allocated clusters; each cluster contains indexes of objects from the data.

Returns: (list) List of allocated clusters.

Definition at line 474 of file

Referenced by pyclustering.samples.answer_reader.get_cluster_lengths(), and pyclustering.cluster.optics.optics.process().
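
Because each cluster holds indexes rather than points, the original points have to be looked up in the input data. The sketch below shows that lookup on a hand-written result; the values in data and clusters are made up for illustration and only mimic the list-of-index-lists shape described above.

```python
# Input data in the documented form: one point per row.
data = [[0.0, 2.1], [0.1, 2.0], [-0.2, 2.4], [5.0, 5.1]]

# Hand-written stand-in for a get_clusters() result: lists of row indexes.
clusters = [[0, 1, 2], [3]]

# Resolve every index back to its point.
points_per_cluster = [[data[index] for index in cluster] for cluster in clusters]

for number, points in enumerate(points_per_cluster):
    print("Cluster %d has %d point(s): %s" % (number, len(points), points))
```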

◆ process()

def process (self)

Performs the clustering procedure according to the rules of the genetic clustering algorithm.

Definition at line 411 of file
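
For readers unfamiliar with genetic algorithms, the following toy loop sketches what such a procedure generally looks like. It is not the pyclustering implementation: the function toy_genetic_clustering is hypothetical, and keeping the better half of each population, one-point crossover, and a single random gene mutation per child are simplifications chosen for brevity. The parameter names mirror the constructor arguments, with population_count used as the number of iterations.

```python
import math
import random


def toy_genetic_clustering(data, count_clusters, chromosome_count,
                           population_count, random_state=None):
    """Toy generational loop (hypothetical); needs at least 2 points and 2 chromosomes."""
    rng = random.Random(random_state)

    def fitness(chromosome):
        # Sum of distances from every point to the mean of its cluster.
        total = 0.0
        for label in set(chromosome):
            members = [p for p, l in zip(data, chromosome) if l == label]
            centre = [sum(axis) / len(members) for axis in zip(*members)]
            total += sum(math.dist(p, centre) for p in members)
        return total

    # A chromosome assigns one cluster label to every data point.
    population = [[rng.randrange(count_clusters) for _ in data]
                  for _ in range(chromosome_count)]
    best = min(population, key=fitness)

    for _ in range(population_count):
        # Selection: keep the better half as parents (simplification).
        population.sort(key=fitness)
        parents = population[:max(2, chromosome_count // 2)]
        children = []
        while len(children) < chromosome_count:
            first, second = rng.sample(parents, 2)
            cut = rng.randrange(1, len(data))       # one-point crossover
            child = first[:cut] + second[cut:]
            child[rng.randrange(len(data))] = rng.randrange(count_clusters)  # mutation
            children.append(child)
        population = children
        best = min(population + [best], key=fitness)

    # Convert the best labeling into lists of point indexes, dropping empty clusters.
    clusters = [[i for i, l in enumerate(best) if l == label]
                for label in range(count_clusters)]
    return [cluster for cluster in clusters if cluster]


points = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [5.0, 5.0], [5.1, 5.2], [5.2, 5.1]]
result = toy_genetic_clustering(points, count_clusters=2, chromosome_count=10,
                                population_count=20, random_state=1)
print(result)
```

Passing a fixed random_state makes the run repeatable, which is helpful when comparing parameter settings.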

The documentation for this class was generated from the following file: