pyclustering.cluster.ga.genetic_algorithm Class Reference

Class represents Genetic clustering algorithm. More...

Public Member Functions

def __init__ (self, data, count_clusters, chromosome_count, population_count, count_mutation_gens=2, coeff_mutation_count=0.25, select_coeff=1.0, observer=ga_observer())
 Initialize genetic clustering algorithm for cluster analysis. More...
 
def process (self)
 Perform clustering procedure in line with rule of genetic clustering algorithm. More...
 
def get_observer (self)
 Returns genetic algorithm observer.
 
def get_clusters (self)
 Returns list of allocated clusters, each cluster contains indexes of objects from the data. More...
 

Detailed Description

Class represents Genetic clustering algorithm.

The searching capability of genetic algorithms is exploited in order to search for appropriate cluster centres.

Example of clustering using genetic algorithm:

# Read input data for clustering
sample = read_sample(SIMPLE_SAMPLES.SAMPLE_SIMPLE4);
# Create genetic algorithm for clustering
ga_instance = genetic_algorithm(data=sample,
count_clusters=4,
chromosome_count=100,
population_count=200,
count_mutation_gens=1);
# Start clustering process
ga_instance.process();
# Extract extracted clusters by the algorithm
clusters = ga_instance.get_clusters();
print(clusters);

There is an example of clustering results (fitness function evolution and allocated clusters) that were visualized by 'ga_visualizer':

ga_clustering_sample_simple_04.png
See also
ga_visualizer
ga_observer

Definition at line 314 of file ga.py.

Constructor & Destructor Documentation

◆ __init__()

def pyclustering.cluster.ga.genetic_algorithm.__init__ (   self,
  data,
  count_clusters,
  chromosome_count,
  population_count,
  count_mutation_gens = 2,
  coeff_mutation_count = 0.25,
  select_coeff = 1.0,
  observer = ga_observer() 
)

Initialize genetic clustering algorithm for cluster analysis.

Parameters
[in]data(numpy.array|list): Input data for clustering that is represented by two dimensional array where each row is a point, for example, [[0.0, 2.1], [0.1, 2.0], [-0.2, 2.4]].
[in]count_clusters(uint): Amount of clusters that should be allocated in the data.
[in]chromosome_count(uint): Amount of chromosomes in each population.
[in]population_count(uint): Amount of populations.
[in]count_mutation_gens(uint): Amount of genes in chromosome that is mutated on each step.
[in]coeff_mutation_count(float): Percent of chromosomes for mutation, destributed in range (0, 1] and thus amount of chromosomes is defined as follows: 'chromosome_count' * 'coeff_mutation_count'.
[in]select_coeff(float): Exponential coefficient for selection procedure that is used as follows: math.exp(1 + fitness(chromosome) * select_coeff).
[in]observer(ga_observer): Observer that is used for collecting information of about clustering process on each step.

Definition at line 351 of file ga.py.

Member Function Documentation

◆ get_clusters()

def pyclustering.cluster.ga.genetic_algorithm.get_clusters (   self)

Returns list of allocated clusters, each cluster contains indexes of objects from the data.

Returns
(list) List of allocated clusters.
See also
process()

Definition at line 469 of file ga.py.

Referenced by pyclustering.samples.answer_reader.get_cluster_lengths(), and pyclustering.cluster.optics.optics.process().

◆ process()

def pyclustering.cluster.ga.genetic_algorithm.process (   self)

Perform clustering procedure in line with rule of genetic clustering algorithm.

See also
get_clusters()

Definition at line 406 of file ga.py.


The documentation for this class was generated from the following file: