pyclustering 0.10.1
pyclustering is a Python and C++ data mining library.
pyclustering.cluster.ga.genetic_algorithm Class Reference

Class represents the genetic clustering algorithm.

Public Member Functions

def __init__(self, data, count_clusters, chromosome_count, population_count, **kwargs)
 Initializes the genetic clustering algorithm.

def process(self)
 Performs the clustering procedure according to the rules of the genetic clustering algorithm.

def get_observer(self)
 Returns the genetic algorithm observer.

def get_clusters(self)
 Returns the list of allocated clusters; each cluster contains indexes of objects from the data.
 

Detailed Description

Class represents the genetic clustering algorithm.

The search capability of genetic algorithms is used to find appropriate cluster centres.

Example of clustering using genetic algorithm:

from pyclustering.cluster.ga import genetic_algorithm, ga_observer
from pyclustering.utils import read_sample
from pyclustering.samples.definitions import SIMPLE_SAMPLES

# Read input data for clustering
sample = read_sample(SIMPLE_SAMPLES.SAMPLE_SIMPLE4)

# Create an observer instance that will collect all information
observer_instance = ga_observer(True, True, True)

# Create the genetic algorithm for clustering and attach the observer
# through the 'observer' keyword argument so that it actually collects data
ga_instance = genetic_algorithm(data=sample,
                                count_clusters=4,
                                chromosome_count=100,
                                population_count=200,
                                count_mutation_gens=1,
                                observer=observer_instance)

# Start processing
ga_instance.process()

# Obtain results
clusters = ga_instance.get_clusters()

# Print clusters to console
print("Amount of clusters: '%d'. Clusters: '%s'" % (len(clusters), clusters))

The clustering results (fitness function evolution and allocated clusters) can be visualized with 'ga_visualizer'.

See also
ga_visualizer
ga_observer

Definition at line 304 of file ga.py.

Constructor & Destructor Documentation

◆ __init__()

def pyclustering.cluster.ga.genetic_algorithm.__init__(self, data, count_clusters, chromosome_count, population_count, **kwargs)

Initialize genetic clustering algorithm.

Parameters
[in] data (numpy.array|list): Input data for clustering, represented by a two-dimensional array where each row is a point, for example, [[0.0, 2.1], [0.1, 2.0], [-0.2, 2.4]].
[in] count_clusters (uint): The number of clusters that should be allocated in the data.
[in] chromosome_count (uint): The number of chromosomes in each population.
[in] population_count (uint): The number of populations, which essentially defines the number of iterations.
[in] **kwargs: Arbitrary keyword arguments (available arguments: count_mutation_gens, coeff_mutation_count, select_coeff, crossover_rate, observer, random_state).

Keyword Args:

  • count_mutation_gens (uint): Number of genes in a chromosome that are mutated on each step.
  • coeff_mutation_count (float): Fraction of chromosomes selected for mutation, in the range (0, 1]; the number of mutated chromosomes is defined as chromosome_count * coeff_mutation_count.
  • select_coeff (float): Exponential coefficient for the selection procedure, used as follows: math.exp(1 + fitness(chromosome) * select_coeff).
  • crossover_rate (float): Crossover rate.
  • observer (ga_observer): Observer that is used for collecting information about the clustering process on each step.
  • random_state (int): Seed for the random state (None by default, in which case the current system time is used).
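
The two formulas quoted in the keyword arguments above can be checked with a few lines of plain Python. This is only a sketch of the arithmetic, not the library's code: the fitness values are made up, the parameter values are not library defaults, and both the truncation to an integer and the normalisation of weights into probabilities are assumptions.

```python
import math

# Illustrative parameter values (not library defaults).
chromosome_count = 100
coeff_mutation_count = 0.25
select_coeff = 1.0

# Number of chromosomes mutated on each step, as described for
# coeff_mutation_count. Truncating to int is an assumption.
mutated_chromosomes = int(chromosome_count * coeff_mutation_count)

# Hypothetical fitness values for three chromosomes.
fitness_values = [0.5, 1.0, 2.0]

# Selection weight, as written in the description of select_coeff.
weights = [math.exp(1 + f * select_coeff) for f in fitness_values]

# Assumption: the weights are normalised so that they sum to 1.
total = sum(weights)
probabilities = [w / total for w in weights]

print(mutated_chromosomes)
print(probabilities)
```

A larger select_coeff widens the gap between the weights of chromosomes with different fitness values, so selection becomes more sensitive to fitness differences.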

Definition at line 350 of file ga.py.

Member Function Documentation

◆ get_clusters()

def pyclustering.cluster.ga.genetic_algorithm.get_clusters(self)

Returns the list of allocated clusters; each cluster contains indexes of objects from the data.

Returns
(list) List of allocated clusters.
See also
process()

Definition at line 474 of file ga.py.

Referenced by pyclustering.samples.answer_reader.get_cluster_lengths(), and pyclustering.cluster.optics.optics.process().
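
Because get_clusters() returns indexes rather than points, the result usually has to be mapped back onto the input data. The sketch below shows one way to do that; the data and the clusters value are hypothetical and hand-written in the documented format, not output produced by the library.

```python
# Hypothetical input data: six two-dimensional points.
data = [[0.0, 2.1], [0.1, 2.0], [-0.2, 2.4],
        [5.0, 5.1], [5.2, 4.9], [4.8, 5.0]]

# Hypothetical result in the format described for get_clusters():
# a list of clusters, each one a list of indexes into 'data'.
clusters = [[0, 1, 2], [3, 4, 5]]

# Recover the actual points that belong to each cluster.
points_by_cluster = [[data[index] for index in cluster] for cluster in clusters]

# Build a flat label list: labels[i] is the cluster number of data[i].
labels = [None] * len(data)
for label, cluster in enumerate(clusters):
    for index in cluster:
        labels[index] = label

print(points_by_cluster)
print(labels)
```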

◆ process()

def pyclustering.cluster.ga.genetic_algorithm.process(self)

Performs the clustering procedure according to the rules of the genetic clustering algorithm.

See also
get_clusters()

Definition at line 411 of file ga.py.
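
To give a feel for how chromosome_count and population_count drive such a procedure, here is a toy genetic clustering loop in plain Python. It is a simplified sketch and not the implementation in ga.py: the function names (toy_genetic_clustering, within_cluster_error), the one-label-per-point encoding, the error measure, the selection weighting, and the crossover and mutation rules are all assumptions made for illustration.

```python
import random


def within_cluster_error(chromosome, data, count_clusters):
    """Sum of squared distances from every point to the centre of its cluster."""
    error = 0.0
    for label in range(count_clusters):
        members = [data[i] for i, gene in enumerate(chromosome) if gene == label]
        if not members:
            continue
        centre = [sum(axis) / len(members) for axis in zip(*members)]
        for point in members:
            error += sum((p - c) ** 2 for p, c in zip(point, centre))
    return error


def toy_genetic_clustering(data, count_clusters, chromosome_count,
                           population_count, seed=None):
    rng = random.Random(seed)

    def score(chromosome):
        return within_cluster_error(chromosome, data, count_clusters)

    # Assumed encoding: one gene (a cluster label) per data point.
    population = [[rng.randrange(count_clusters) for _ in data]
                  for _ in range(chromosome_count)]
    best = list(min(population, key=score))

    # One iteration per population, as population_count suggests.
    for _ in range(population_count):
        # Selection: chromosomes with a lower error get a larger weight.
        weights = [1.0 / (1.0 + score(c)) for c in population]
        parents = rng.choices(population, weights=weights, k=chromosome_count)

        # Crossover: every child gene is copied from one of two parents.
        children = []
        for first, second in zip(parents, parents[1:] + parents[:1]):
            children.append([rng.choice(pair) for pair in zip(first, second)])

        # Mutation: overwrite one randomly chosen gene in every child.
        for child in children:
            child[rng.randrange(len(child))] = rng.randrange(count_clusters)

        population = children
        candidate = min(population, key=score)
        if score(candidate) < score(best):
            best = list(candidate)

    return best


points = [[0.0, 2.1], [0.1, 2.0], [-0.2, 2.4],
          [5.0, 5.1], [5.2, 4.9], [4.8, 5.0]]
labels = toy_genetic_clustering(points, count_clusters=2, chromosome_count=20,
                                population_count=30, seed=1)
print(labels)
```

Passing a fixed seed makes the run repeatable, which mirrors the purpose of the random_state keyword argument described for the constructor.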


The documentation for this class was generated from the following file: ga.py