pyclustering  0.10.1
pyclustering is a Python, C++ data mining library.
pyclustering.cluster.dbscan.dbscan Class Reference

Class represents clustering algorithm DBSCAN.

Public Member Functions

def __init__ (self, data, eps, neighbors, ccore=True, **kwargs)
 Constructor of clustering algorithm DBSCAN.

def __getstate__ (self)
 Returns current state of the algorithm.

def __setstate__ (self, state)
 Sets current state of the algorithm.

def process (self)
 Performs cluster analysis in line with rules of DBSCAN algorithm.

def get_clusters (self)
 Returns allocated clusters.

def get_noise (self)
 Returns allocated noise.

def get_cluster_encoding (self)
 Returns clustering result representation type that indicates how clusters are encoded.
 

Detailed Description

Class represents clustering algorithm DBSCAN.

This DBSCAN implementation is optimized with a KD-tree.

By default the C/C++ part of the pyclustering library is used for processing, which significantly increases performance.

Clustering example where the DBSCAN algorithm is used to process the Chainlink data from the FCPS collection:

from pyclustering.cluster.dbscan import dbscan
from pyclustering.cluster import cluster_visualizer
from pyclustering.utils import read_sample
from pyclustering.samples.definitions import FCPS_SAMPLES
# Sample for cluster analysis.
sample = read_sample(FCPS_SAMPLES.SAMPLE_CHAINLINK)
# Create DBSCAN algorithm.
dbscan_instance = dbscan(sample, 0.7, 3)
# Start processing by DBSCAN.
dbscan_instance.process()
# Obtain results of clustering.
clusters = dbscan_instance.get_clusters()
noise = dbscan_instance.get_noise()
# Visualize clustering results
visualizer = cluster_visualizer()
visualizer.append_clusters(clusters, sample)
visualizer.append_cluster(noise, sample, marker='x')
visualizer.show()

Definition at line 22 of file dbscan.py.

Constructor & Destructor Documentation

◆ __init__()

def pyclustering.cluster.dbscan.dbscan.__init__ (self, data, eps, neighbors, ccore=True, **kwargs)

Constructor of clustering algorithm DBSCAN.

Parameters
[in] data (list): Input data, presented either as a list of points or as a distance matrix (selected by the parameter 'data_type'; by default the data is treated as a list of points).
[in] eps (double): Connectivity radius between points; points may be connected if the distance between them is less than the radius.
[in] neighbors (uint): Minimum number of shared neighbors required to establish links between points.
[in] ccore (bool): If True, the CCORE DLL (C++ implementation) is used to solve the problem.
[in] **kwargs: Arbitrary keyword arguments (available arguments: 'data_type').

Keyword Args:

  • data_type (string): Data type of input sample 'data' that is processed by the algorithm ('points', 'distance_matrix').

Definition at line 58 of file dbscan.py.
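The 'distance_matrix' data type can be illustrated with a small sketch. The helper below is hypothetical (it is not part of pyclustering) and builds a full pairwise distance matrix in plain Python. The use of Euclidean distance is an assumption made for the example. A matrix of this shape is the kind of input the constructor is described as accepting when 'data_type' is set to 'distance_matrix'.

```python
import math


def build_distance_matrix(points):
    """Hypothetical helper: full pairwise Euclidean distance matrix.

    Row i, column j holds the distance between points[i] and points[j].
    """
    return [[math.dist(p, q) for q in points] for p in points]


points = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
matrix = build_distance_matrix(points)

# Assumed usage with the documented keyword argument (not executed here):
#   dbscan(matrix, eps, neighbors, data_type='distance_matrix')
```

With a distance matrix the algorithm never sees the coordinates, so any dissimilarity measure could in principle be substituted for the Euclidean distance used here.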

Member Function Documentation

◆ __getstate__()

def pyclustering.cluster.dbscan.dbscan.__getstate__ (self)

Returns current state of the algorithm.

It does not return internal temporary variables that are not visible to the user.

Returns
(tuple) Current state of the algorithm.

Definition at line 95 of file dbscan.py.

◆ __setstate__()

def pyclustering.cluster.dbscan.dbscan.__setstate__ (self, state)

Sets current state of the algorithm.

The method checks whether the C++ part of pyclustering is available on the current platform. As a result, the ccore state might differ if the state is moved between platforms.

Definition at line 107 of file dbscan.py.
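The state handling described for __getstate__() and __setstate__() follows the standard Python copy and pickle protocol. The sketch below shows the general pattern on a toy class, not the library's own code: ToyClusterer and native_backend_available() are hypothetical names introduced for illustration, and the backend check is hard-coded to False. copy.deepcopy is used because it goes through the same two hooks as pickle.

```python
import copy


def native_backend_available():
    """Hypothetical platform check; always reports 'not available' here."""
    return False


class ToyClusterer:
    """Toy class (not pyclustering) showing the state pattern."""

    def __init__(self, data, eps, neighbors, use_native=True):
        self.data = data
        self.eps = eps
        self.neighbors = neighbors
        # The native flag only stays True if the backend really exists.
        self.use_native = use_native and native_backend_available()
        # Internal temporary variable, not meant to be part of the state.
        self._scratch = {"visited": [False] * len(data)}

    def __getstate__(self):
        # Export user-visible state as a tuple; _scratch is left out.
        return (self.data, self.eps, self.neighbors, self.use_native)

    def __setstate__(self, state):
        self.data, self.eps, self.neighbors, use_native = state
        # Re-check the backend on the platform where the state is restored.
        self.use_native = use_native and native_backend_available()
        self._scratch = {}


original = ToyClusterer([[0.0, 0.0], [1.0, 1.0]], 0.7, 3)
restored = copy.deepcopy(original)
```

Re-evaluating the backend flag on restore is what allows a saved state to be moved to a platform where the native library is missing.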

◆ get_cluster_encoding()

def pyclustering.cluster.dbscan.dbscan.get_cluster_encoding (self)

Returns clustering result representation type that indicates how clusters are encoded.

Returns
(type_encoding) Clustering result representation.
See also
get_clusters()

Definition at line 188 of file dbscan.py.

◆ get_clusters()

def pyclustering.cluster.dbscan.dbscan.get_clusters (self)

Returns allocated clusters.

Remarks
Allocated clusters can be returned only after data processing (call method process() first). Otherwise an empty list is returned.
Returns
(list) List of allocated clusters; each cluster contains the indexes of objects in the input data list.
See also
process()
get_noise()

Definition at line 156 of file dbscan.py.

Referenced by pyclustering.samples.answer_reader.get_cluster_lengths(), and pyclustering.cluster.optics.optics.process().

◆ get_noise()

def pyclustering.cluster.dbscan.dbscan.get_noise (self)

Returns allocated noise.

Remarks
Allocated noise can be returned only after data processing (call method process() first). Otherwise an empty list is returned.
Returns
(list) List of indexes of objects that are marked as noise.
See also
process()
get_clusters()

Definition at line 172 of file dbscan.py.
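Since get_clusters() and get_noise() both return lists of indexes into the input data, it is often convenient to flatten them into one label per point. The sketch below is one way to do that. The function to_labels() and the noise label -1 are assumptions made for the example and are not part of pyclustering; the sample lists stand in for real results.

```python
def to_labels(clusters, noise, n_points, noise_label=-1):
    """Hypothetical helper: turn index lists into one label per point.

    clusters -- list of clusters, each a list of point indexes
    noise    -- list of point indexes marked as noise
    """
    labels = [None] * n_points
    for cluster_id, cluster in enumerate(clusters):
        for index in cluster:
            labels[index] = cluster_id
    for index in noise:
        labels[index] = noise_label
    return labels


# Stand-in values shaped like the documented return values.
clusters = [[0, 2, 1], [3, 5, 4]]
noise = [6]
labels = to_labels(clusters, noise, n_points=7)
```

Any entry left as None would indicate a point that appears in neither list, which can serve as a quick sanity check on the results.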

◆ process()

def pyclustering.cluster.dbscan.dbscan.process (self)

Performs cluster analysis in line with rules of DBSCAN algorithm.

Returns
(dbscan) Returns itself (DBSCAN instance).
See also
get_clusters()
get_noise()

Definition at line 120 of file dbscan.py.
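To make the rules of DBSCAN concrete, the following is a minimal brute-force sketch in plain Python. It is not the library's implementation (which is described above as KD-tree optimized and backed by C++). The function name brute_force_dbscan() is hypothetical. Two details are assumptions of this sketch and may differ from the library: a point is not counted as its own neighbor, and a point at distance exactly eps counts as a neighbor.

```python
import math


def brute_force_dbscan(points, eps, min_neighbors):
    """Return (clusters, noise) as lists of point indexes."""
    n = len(points)

    def region_query(i):
        # All other points within eps of point i (self excluded: assumption).
        return [j for j in range(n)
                if j != i and math.dist(points[i], points[j]) <= eps]

    visited = [False] * n   # neighborhood already examined
    assigned = [False] * n  # already placed in some cluster
    clusters = []

    for start in range(n):
        if visited[start]:
            continue
        visited[start] = True
        seeds = region_query(start)
        if len(seeds) < min_neighbors:
            continue  # not a core point; may still join a cluster as a border
        cluster = [start]
        assigned[start] = True
        queue = list(seeds)
        while queue:
            j = queue.pop()
            if not visited[j]:
                visited[j] = True
                j_neighbors = region_query(j)
                if len(j_neighbors) >= min_neighbors:
                    queue.extend(j_neighbors)  # j is core: keep expanding
            if not assigned[j]:
                assigned[j] = True
                cluster.append(j)
        clusters.append(cluster)

    # Points that never joined a cluster are noise.
    noise = [i for i in range(n) if not assigned[i]]
    return clusters, noise


points = [(0.0, 0.0), (0.0, 0.3), (0.3, 0.0),
          (5.0, 5.0), (5.0, 5.3), (5.3, 5.0),
          (10.0, 10.0)]
clusters, noise = brute_force_dbscan(points, eps=0.5, min_neighbors=2)
```

The return values mirror the shapes documented for get_clusters() and get_noise(): index lists rather than copies of the points. Each region query here scans all points, so the sketch runs in quadratic time; a spatial index such as a KD-tree is the usual way to avoid that cost.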


The documentation for this class was generated from the following file: dbscan.py