pyclustering
0.10.1
pyclustering is a Python, C++ data mining library.

Class represents clustering algorithm OPTICS (Ordering Points To Identify Clustering Structure) with KD-tree optimization (the ccore option is supported).
Public Member Functions
def __init__ (self, sample, eps, minpts, amount_clusters=None, ccore=True, **kwargs)
Constructor of clustering algorithm OPTICS.
def process (self)
Performs cluster analysis in line with rules of OPTICS algorithm.
def get_clusters (self)
Returns list of allocated clusters, where each cluster contains indexes of objects and each cluster is represented by list.
def get_noise (self)
Returns list of noise points, given as indexes of objects in the input data.
def get_ordering (self)
Returns clustering ordering information about the input data set.
def get_optics_objects (self)
Returns OPTICS objects where each object contains information about index of point from processed data, core distance and reachability distance.
def get_radius (self)
Returns connectivity radius that is calculated and used for clustering by the algorithm.
def get_cluster_encoding (self)
Returns clustering result representation type that indicates how clusters are encoded.
OPTICS is a density-based algorithm. The purpose of the algorithm is not to provide explicit clusters directly, but to create a clustering-ordering representation of the input data. The clustering-ordering contains information about the internal structure of the data set in terms of density, and a proper connectivity radius for allocating the required amount of clusters can be obtained from the ordering diagram. When the additional input parameter 'amount_clusters' is used, the connectivity radius should be bigger than the real one, because the radius is recalculated by the algorithm if the requested amount of clusters is not allocated.
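As a minimal sketch of the workflow described above, assuming pyclustering is installed, the listing below runs the algorithm on a small hand-made data set rather than one of the library's bundled samples. The names `SAMPLE`, `run_optics` and `demo` are hypothetical helpers introduced for illustration; only the constructor and the member functions listed on this page are called.

```python
# Small hand-made data set (an assumption for illustration, not a bundled
# sample): two tight groups of four points and one isolated point (index 8).
SAMPLE = [
    [0.0, 0.0], [0.0, 0.5], [0.5, 0.0], [0.5, 0.5],
    [5.0, 5.0], [5.0, 5.5], [5.5, 5.0], [5.5, 5.5],
    [10.0, 0.0],
]


def run_optics(sample, eps, minpts, amount_clusters=None):
    """Run OPTICS and return (clusters, noise, radius, ordering)."""
    # Imported inside the function so that defining this helper does not
    # require pyclustering to be loaded yet.
    from pyclustering.cluster.optics import optics

    # ccore=False selects the pure-Python implementation instead of CCORE.
    instance = optics(sample, eps, minpts, amount_clusters, ccore=False)
    instance.process()

    return (instance.get_clusters(), instance.get_noise(),
            instance.get_radius(), instance.get_ordering())


def demo():
    """Print the clustering results for the hand-made data set."""
    clusters, noise, radius, ordering = run_optics(SAMPLE, eps=1.0, minpts=2)
    print("clusters:", clusters)   # lists of point indexes
    print("noise:", noise)         # indexes of points not in any cluster
    print("radius:", radius)       # connectivity radius actually used
    print("ordering:", ordering)   # reachability distances
```

Calling `demo()` prints the results. Passing `amount_clusters` together with a deliberately large `eps` lets the algorithm recalculate the radius itself, in which case `get_radius()` may differ from the value passed in.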
Clustering example using sample 'Chainlink':
The amount of clusters that should be allocated can also be specified. In this case the connectivity radius should be greater than the real one, for example:
Here is an example where OPTICS extracts outliers from sample 'Tetra':
[Figure: visualization of the allocated clusters and outliers for sample 'Tetra'.]
def pyclustering.cluster.optics.optics.__init__ (self, sample, eps, minpts, amount_clusters=None, ccore=True, **kwargs)
Constructor of clustering algorithm OPTICS.
[in] sample (list): Input data presented as a list of points (objects), where each point is represented by a list or tuple.
[in] eps (double): Connectivity radius between points; points may be connected if the distance between them is less than the radius.
[in] minpts (uint): Minimum number of shared neighbors required for establishing links between points.
[in] amount_clusters (uint): Optional amount of clusters that should be allocated. When 'amount_clusters' is used, the connectivity radius can be greater than the real one; in other words, there is room for error in the choice of connectivity radius.
[in] ccore (bool): If True, then DLL CCORE (C++ solution) is used for solving the problem.
[in] **kwargs Arbitrary keyword arguments (available arguments: 'data_type').
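Keyword Args: data_type (string): Data type of the input sample that is processed by the algorithm ('points', 'distance_matrix').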
def pyclustering.cluster.optics.optics.get_cluster_encoding  (  self  ) 
Returns clustering result representation type that indicates how clusters are encoded.
def pyclustering.cluster.optics.optics.get_clusters  (  self  ) 
Returns list of allocated clusters, where each cluster contains indexes of objects and each cluster is represented by list.
Definition at line 508 of file optics.py.
Referenced by pyclustering.samples.answer_reader.get_cluster_lengths(), and pyclustering.cluster.optics.optics.process().
def pyclustering.cluster.optics.optics.get_noise  (  self  ) 
Returns list of noise points, given as indexes of objects in the input data.
def pyclustering.cluster.optics.optics.get_optics_objects  (  self  ) 
Returns OPTICS objects where each object contains information about index of point from processed data, core distance and reachability distance.
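To make the two distances concrete, here is a simplified pure-Python sketch of how an OPTICS ordering with core and reachability distances can be computed. It is not the library's implementation: it uses a brute-force neighbor search instead of a KD-tree, the function name `optics_order` is hypothetical, and it assumes that `minpts` counts neighbors excluding the point itself.

```python
import heapq
import math


def optics_order(points, eps, minpts):
    """Return [(index, core_distance, reachability_distance), ...] in
    OPTICS visiting order. Undefined distances are reported as None."""
    n = len(points)
    processed = [False] * n
    core = [None] * n    # distance to the minpts-th neighbor within eps
    reach = [None] * n   # smallest reachability distance found so far
    order = []

    def neighbors(i):
        # Brute-force search: all other points within eps of point i,
        # sorted by distance.
        found = []
        for j in range(n):
            if j != i:
                d = math.dist(points[i], points[j])
                if d <= eps:
                    found.append((d, j))
        found.sort()
        return found

    def visit(i, seeds):
        processed[i] = True
        near = neighbors(i)
        if len(near) >= minpts:
            core[i] = near[minpts - 1][0]
        order.append(i)
        if core[i] is None:
            return  # not a core point: nothing can be reached through it
        for d, j in near:
            if processed[j]:
                continue
            # Reachability of j from core point i.
            candidate = max(core[i], d)
            if reach[j] is None or candidate < reach[j]:
                reach[j] = candidate
                heapq.heappush(seeds, (candidate, j))

    for start in range(n):
        if processed[start]:
            continue
        seeds = []
        visit(start, seeds)
        while seeds:
            _, j = heapq.heappop(seeds)
            if not processed[j]:  # skip stale queue entries
                visit(j, seeds)

    return [(i, core[i], reach[i]) for i in order]
```

A point that starts a new expansion keeps an undefined reachability distance, and a point with fewer than `minpts` neighbors within `eps` keeps an undefined core distance.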
def pyclustering.cluster.optics.optics.get_ordering  (  self  ) 
Returns clustering ordering information about the input data set.
Clustering ordering of dataset contains the information about the internal clustering structure in line with connectivity radius.
Definition at line 540 of file optics.py.
Referenced by pyclustering.cluster.optics.optics.process().
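The relationship between the ordering and the connectivity radius can be sketched as follows. This is a simplified illustration of the cluster-extraction step commonly described for OPTICS, not the library's code; the function name `extract_clusters` and the tuple layout `(index, core_distance, reachability_distance)` are assumptions made for the example.

```python
def extract_clusters(ordered, radius):
    """Split an OPTICS ordering into clusters and noise at a given radius.

    `ordered` is a list of (index, core_distance, reachability_distance)
    tuples in visiting order; undefined distances are None.
    """
    clusters, noise = [], []
    for index, core, reach in ordered:
        if reach is None or reach > radius:
            # Not reachable from the preceding points at this radius.
            if core is not None and core <= radius:
                clusters.append([index])  # a core point starts a new cluster
            else:
                noise.append(index)
        elif clusters:
            # Reachable: the point joins the cluster currently being built.
            clusters[-1].append(index)
        else:
            noise.append(index)
    return clusters, noise
```

Lowering `radius` splits the same ordering into more, denser clusters and more noise, which is why a generous initial radius leaves room to search for the radius that yields a requested amount of clusters.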
def pyclustering.cluster.optics.optics.get_radius  (  self  ) 
Returns connectivity radius that is calculated and used for clustering by the algorithm.
The connectivity radius may be changed only when the additional parameter of the algorithm, the amount of clusters for allocation, is used.
def pyclustering.cluster.optics.optics.process  (  self  ) 
Performs cluster analysis in line with rules of OPTICS algorithm.