pyclustering  0.10.1
pyclustering is a Python, C++ data mining library.
pyclustering.cluster.clique.clique Class Reference

Class implements the CLIQUE grid-based clustering algorithm.

Public Member Functions

def __init__ (self, data, amount_intervals, density_threshold, **kwargs)
 Create CLIQUE clustering algorithm.
def process (self)
 Performs clustering process in line with rules of CLIQUE clustering algorithm.
def get_clusters (self)
 Returns allocated clusters.
def get_noise (self)
 Returns allocated noise.
def get_cells (self)
 Returns CLIQUE blocks that are formed during clustering process.
def get_cluster_encoding (self)
 Returns clustering result representation type that indicates how clusters are encoded.

Detailed Description

Class implements the CLIQUE grid-based clustering algorithm.

CLIQUE automatically finds subspaces with high-density clusters. It produces identical results irrespective of the order in which the input records are presented and it does not presume any canonical distribution for input data [1].

Here is an example where data in two-dimensional space is clustered using CLIQUE algorithm:

from pyclustering.cluster.clique import clique, clique_visualizer
from pyclustering.utils import read_sample
from pyclustering.samples.definitions import FCPS_SAMPLES
# read two-dimensional input data 'Target'
data = read_sample(FCPS_SAMPLES.SAMPLE_TARGET)
# create CLIQUE algorithm for processing
intervals = 10 # defines amount of cells in grid in each dimension
threshold = 0 # let's consider each point as a non-outlier
clique_instance = clique(data, intervals, threshold)
# start clustering process and obtain results
clique_instance.process()
clusters = clique_instance.get_clusters() # allocated clusters
noise = clique_instance.get_noise() # points that are considered as outliers (in this example should be empty)
cells = clique_instance.get_cells() # CLIQUE blocks that form the grid
print("Amount of clusters:", len(clusters))
# visualize clustering results
clique_visualizer.show_grid(cells, data) # show grid that has been formed by the algorithm
clique_visualizer.show_clusters(data, clusters, noise) # show clustering results

In this example six clusters are allocated, including four small clusters that consist of three points each. The clustering results are visualized below: the grid formed by the CLIQUE algorithm, with block density, and the clusters themselves:

Fig. 1. CLIQUE clustering results (grid and the clusters themselves).

Sometimes such small clusters should be considered outliers, given that the two clusters in the center are comparatively large. To treat them as noise, the threshold value should be increased:

intervals = 10
threshold = 3 # a block that contains 3 or fewer points is considered an outlier, as are its points
clique_instance = clique(data, intervals, threshold)

Two clusters are allocated, but in this case some points of the circle-shaped cluster are also considered outliers, because CLIQUE operates on blocks, not on individual points:

Fig. 2. Noise allocation by CLIQUE.
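
The block-level decision can be illustrated with a small pure-Python sketch. It is not the pyclustering implementation: the helpers assign_to_blocks and split_noise and the sample data are hypothetical, and the sketch only bins points into an equal-width grid and applies the threshold per block, following the rule stated in the threshold comment above (a block with density_threshold or fewer points is noise). Merging neighboring dense blocks into clusters is omitted.

```python
from collections import defaultdict


def assign_to_blocks(data, amount_intervals):
    """Map each point index to the logical coordinates of its grid block."""
    dimensions = len(data[0])
    lows = [min(p[d] for p in data) for d in range(dimensions)]
    highs = [max(p[d] for p in data) for d in range(dimensions)]

    blocks = defaultdict(list)
    for index, point in enumerate(data):
        location = []
        for d in range(dimensions):
            width = (highs[d] - lows[d]) / amount_intervals
            # a zero-width dimension collapses into block 0
            cell = int((point[d] - lows[d]) / width) if width > 0 else 0
            # the maximum value lies on the upper edge; keep it in the last block
            location.append(min(cell, amount_intervals - 1))
        blocks[tuple(location)].append(index)
    return blocks


def split_noise(blocks, density_threshold):
    """Return (dense_blocks, noise_indexes) based on per-block point counts."""
    dense, noise = {}, []
    for location, indexes in blocks.items():
        if len(indexes) > density_threshold:
            dense[location] = indexes
        else:
            noise.extend(indexes)
    return dense, sorted(noise)


# hypothetical data: a tight group of five points plus two stragglers
data = [[0.1, 0.1], [0.2, 0.1], [0.1, 0.2], [0.2, 0.2], [0.15, 0.15],
        [5.3, 5.3], [9.9, 9.9]]
blocks = assign_to_blocks(data, amount_intervals=10)
dense, noise = split_noise(blocks, density_threshold=3)
print(sorted(dense))  # [(0, 0)]
print(noise)          # [5, 6]
```

The two stragglers are rejected because their blocks hold a single point each, regardless of how close they are to other points; this is the block-level effect described above.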

Definition at line 415 of file

Constructor & Destructor Documentation

◆ __init__()

def pyclustering.cluster.clique.clique.__init__ (self, data, amount_intervals, density_threshold, **kwargs)

Create CLIQUE clustering algorithm.

[in] data (list): Input data (list of points) that should be clustered.
[in] amount_intervals (uint): Amount of intervals in each dimension, which defines the amount of CLIQUE blocks as

\[N_{blocks} = intervals^{dimensions}\]

[in] density_threshold (uint): Density threshold for a CLIQUE block: its points are considered non-outliers only if the block contains more than this number of points (a block with density_threshold or fewer points is treated as noise).
[in] **kwargs: Arbitrary keyword arguments (available arguments: 'ccore').

Keyword Args:

  • ccore (bool): True by default. If True, the C++ implementation is used for cluster analysis; otherwise the Python implementation is used.
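
Because the amount of blocks grows exponentially with the number of dimensions, it can be useful to estimate the grid size before choosing amount_intervals. A minimal sketch of the formula above in plain Python (the helper name count_clique_blocks is hypothetical and is not part of pyclustering):

```python
def count_clique_blocks(amount_intervals, dimensions):
    """Amount of grid blocks: amount_intervals ** dimensions."""
    if amount_intervals < 1 or dimensions < 1:
        raise ValueError("amount_intervals and dimensions must be positive")
    return amount_intervals ** dimensions


# 10 intervals per dimension, as in the two-dimensional example above
print(count_clique_blocks(10, 2))  # 100
# the same setting in three dimensions
print(count_clique_blocks(10, 3))  # 1000
```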

Definition at line 468 of file

Member Function Documentation

◆ get_cells()

def pyclustering.cluster.clique.clique.get_cells (self)

Returns CLIQUE blocks that are formed during clustering process.

CLIQUE blocks can be used for visualization purposes. Each CLIQUE block contains its logical location in the grid, its spatial location in data space, and the points that belong to the block.

Returns: (list) List of CLIQUE blocks.

Definition at line 551 of file

◆ get_cluster_encoding()

def pyclustering.cluster.clique.clique.get_cluster_encoding (self)

Returns clustering result representation type that indicates how clusters are encoded.

Returns: (type_encoding) Clustering result representation.

Definition at line 563 of file

◆ get_clusters()

def pyclustering.cluster.clique.clique.get_clusters (self)

Returns allocated clusters.

Allocated clusters are returned only after data processing (method process()); otherwise an empty list is returned.
Returns: (list) List of allocated clusters; each cluster contains indexes of objects in the input data list.
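
Since clusters are returned as lists of indexes rather than as points, a lookup into the input data is needed to recover the points themselves. A minimal sketch in plain Python; the data and the clusters value below are hypothetical stand-ins for the input list and for the result of get_clusters():

```python
# hypothetical input data and a hypothetical result of get_clusters()
data = [[1.0, 1.0], [1.2, 0.9], [5.0, 5.1], [5.2, 4.9], [9.0, 0.5]]
clusters = [[0, 1], [2, 3]]

# replace each index with the point it refers to
cluster_points = [[data[index] for index in cluster] for cluster in clusters]

for number, points in enumerate(cluster_points):
    print("Cluster", number, "->", points)
```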

Definition at line 521 of file

Referenced by pyclustering.samples.answer_reader.get_cluster_lengths(), and pyclustering.cluster.optics.optics.process().

◆ get_noise()

def pyclustering.cluster.clique.clique.get_noise (self)

Returns allocated noise.

Allocated noise is returned only after data processing (method process()); otherwise an empty list is returned.
Returns: (list) List of indexes of points that are marked as noise.

Definition at line 536 of file

◆ process()

def pyclustering.cluster.clique.clique.process (self)

Performs clustering process in line with rules of CLIQUE clustering algorithm.

Returns: (clique) The CLIQUE instance itself.

Definition at line 501 of file

The documentation for this class was generated from the following file: