pyclustering  0.10.1
pyclustering is a Python, C++ data mining library.
pyclustering.cluster.cluster_visualizer_multidim Class Reference

Visualizer for clusters in multi-dimensional data.

Public Member Functions

def __init__ (self)
 Constructs cluster visualizer for multidimensional data.
 
def append_cluster (self, cluster, data=None, marker='.', markersize=None, color=None)
 Appends a cluster for visualization.
 
def append_clusters (self, clusters, data=None, marker='.', markersize=None)
 Appends a list of clusters for visualization.
 
def save (self, filename, **kwargs)
 Saves figure to the specified file.
 
def show (self, pair_filter=None, **kwargs)
 Shows (visualizes) clusters in multi-dimensional space.
 

Detailed Description

Visualizer for clusters in multi-dimensional data.

This cluster visualizer is useful for clusters in data whose dimension is greater than 3. It overcomes the main limitation of 'cluster_visualizer', which can display clusters only in 1D, 2D or 3D data space.

Example of clustering results visualization where 'Iris' is used:

from pyclustering.utils import read_sample
from pyclustering.samples.definitions import FAMOUS_SAMPLES
from pyclustering.cluster import cluster_visualizer_multidim
from pyclustering.cluster.center_initializer import kmeans_plusplus_initializer
from pyclustering.cluster.xmeans import xmeans
# load 4D data sample 'Iris'
sample_4d = read_sample(FAMOUS_SAMPLES.SAMPLE_IRIS)
# prepare 3 initial centers using the K-Means++ algorithm
centers = kmeans_plusplus_initializer(sample_4d, 3).initialize()
# perform cluster analysis using X-Means
xmeans_instance = xmeans(sample_4d, centers)
xmeans_instance.process()
clusters = xmeans_instance.get_clusters()
# visualize obtained clusters in multi-dimensional space
visualizer = cluster_visualizer_multidim()
visualizer.append_clusters(clusters, sample_4d)
visualizer.show(max_row_size=3)

Visualized clustering results of 'Iris' data (multi-dimensional data):

Fig. 1. X-Means clustering results (data 'Iris').

Sometimes there is no need to display results in all dimensions. The 'pair_filter' parameter can be used to display only the coordinate pairs of interest. Here is an example that visualizes the coordinate pairs (x0, x1) and (x0, x2) for the previous clustering results:

visualizer = cluster_visualizer_multidim()
visualizer.append_clusters(clusters, sample_4d)
visualizer.show(pair_filter=[[0, 1], [0, 2]])

Visualized results of specified coordinate pairs:

Fig. 2. X-Means clustering results (x0, x1) and (x0, x2) (data 'Iris').
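
To make the effect of 'pair_filter' and 'max_row_size' more concrete, here is a minimal sketch in plain Python. It is not the library's implementation: it assumes that without a filter every unordered coordinate pair gets its own canvas and that canvases are laid out row by row. The helper names coordinate_pairs and grid_shape are hypothetical.

```python
import itertools
import math


def coordinate_pairs(dimension, pair_filter=None):
    """Return the coordinate pairs to draw (hypothetical helper).

    Assumption: with no filter, every unordered pair (i, j), i < j, is drawn.
    """
    all_pairs = list(itertools.combinations(range(dimension), 2))
    if pair_filter is None:
        return all_pairs
    wanted = {tuple(pair) for pair in pair_filter}
    return [pair for pair in all_pairs if pair in wanted]


def grid_shape(pair_count, max_row_size=4):
    """Return (rows, columns) needed for pair_count canvases (hypothetical helper)."""
    columns = min(pair_count, max_row_size)
    rows = math.ceil(pair_count / max_row_size)
    return rows, columns


# 'Iris' has 4 coordinates, so there are 6 unordered coordinate pairs.
pairs = coordinate_pairs(4)
print(len(pairs), grid_shape(len(pairs), max_row_size=3))

# Restrict the view to (x0, x1) and (x0, x2), as in the pair_filter example.
filtered = coordinate_pairs(4, pair_filter=[[0, 1], [0, 2]])
print(filtered, grid_shape(len(filtered)))
```

Under these assumptions, the unfiltered 'Iris' view with max_row_size=3 needs two rows of three canvases, while the filtered view fits on a single row.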

Definition at line 56 of file __init__.py.

Constructor & Destructor Documentation

◆ __init__()

def pyclustering.cluster.cluster_visualizer_multidim.__init__ (   self)

Constructs cluster visualizer for multidimensional data.

The visualizer is suitable for data whose dimension is greater than 3.

Definition at line 103 of file __init__.py.

Member Function Documentation

◆ append_cluster()

def pyclustering.cluster.cluster_visualizer_multidim.append_cluster (   self,
  cluster,
  data = None,
  marker = '.',
  markersize = None,
  color = None 
)

Appends a cluster for visualization.

Parameters
[in] cluster (list): Cluster that may consist of indexes of objects from the data or of the objects themselves.
[in] data (list): If specified, each element of the cluster is treated as an index of an object from the data.
[in] marker (string): Marker that is used for displaying objects from the cluster on the canvas.
[in] markersize (uint): Size of the marker.
[in] color (string): Color of the marker.
Returns
Returns index of cluster descriptor on the canvas.

Definition at line 114 of file __init__.py.

Referenced by pyclustering.cluster.cluster_visualizer_multidim.append_clusters(), and pyclustering.cluster.cluster_visualizer.append_clusters().
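
The two ways of passing a cluster can be illustrated with a small sketch. It is plain Python and does not call pyclustering; the helper cluster_points is hypothetical and only mirrors the documented meaning of the 'cluster' and 'data' parameters.

```python
def cluster_points(cluster, data=None):
    """Return the points that belong to a cluster (hypothetical helper).

    If data is given, cluster holds indexes into data;
    otherwise cluster already holds the points themselves.
    """
    if data is None:
        return [list(point) for point in cluster]
    return [list(data[index]) for index in cluster]


data = [[5.1, 3.5, 1.4, 0.2], [4.9, 3.0, 1.4, 0.2], [6.3, 3.3, 6.0, 2.5]]

# Cluster given as indexes into 'data'.
by_index = cluster_points([0, 2], data)

# The same cluster given as the points themselves.
by_value = cluster_points([[5.1, 3.5, 1.4, 0.2], [6.3, 3.3, 6.0, 2.5]])

print(by_index == by_value)
```

With the visualizer itself, the first form corresponds to append_cluster(cluster, data) and the second to append_cluster(cluster).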

◆ append_clusters()

def pyclustering.cluster.cluster_visualizer_multidim.append_clusters (   self,
  clusters,
  data = None,
  marker = '.',
  markersize = None 
)

Appends a list of clusters for visualization.

Parameters
[in] clusters (list): List of clusters where each cluster may consist of indexes of objects from the data or of the objects themselves.
[in] data (list): If specified, each element of a cluster is treated as an index of an object from the data.
[in] marker (string): Marker that is used for displaying objects from the clusters on the canvas.
[in] markersize (uint): Size of the marker.

Definition at line 139 of file __init__.py.

◆ save()

def pyclustering.cluster.cluster_visualizer_multidim.save (   self,
  filename,
**  kwargs 
)

Saves figure to the specified file.

Parameters
[in] filename (string): File where the visualized clusters should be stored.
[in] **kwargs: Arbitrary keyword arguments (available arguments: 'visible_axis', 'visible_labels', 'visible_grid', 'max_row_size', 'show').

Keyword Args:

  • visible_axis (bool): Defines visibility of axes on each canvas. If True, axes are visible. By default, the axes of each canvas are not displayed.
  • visible_labels (bool): Defines visibility of labels on each canvas. If True, labels are displayed. By default, the labels of each canvas are displayed.
  • visible_grid (bool): Defines visibility of the grid on each canvas. If True, the grid is displayed. By default, the grid of each canvas is displayed.
  • max_row_size (uint): Maximum number of canvases in one row.

Definition at line 154 of file __init__.py.

◆ show()

def pyclustering.cluster.cluster_visualizer_multidim.show (   self,
  pair_filter = None,
**  kwargs 
)

Shows (visualizes) clusters in multi-dimensional space.

Parameters
[in] pair_filter (list): List of coordinate pairs that should be displayed. This argument is used as a filter.
[in] **kwargs: Arbitrary keyword arguments (available arguments: 'visible_axis', 'visible_labels', 'visible_grid', 'max_row_size', 'show').

Keyword Args:

  • visible_axis (bool): Defines visibility of axes on each canvas. If True, axes are visible. By default, the axes of each canvas are not displayed.
  • visible_labels (bool): Defines visibility of labels on each canvas. If True, labels are displayed. By default, the labels of each canvas are displayed.
  • visible_grid (bool): Defines visibility of the grid on each canvas. If True, the grid is displayed. By default, the grid of each canvas is displayed.
  • max_row_size (uint): Maximum number of canvases in one row. By default, the maximum value is 4.
  • show (bool): If True, the visualized clusters are displayed. By default, it is True.

Definition at line 184 of file __init__.py.

Referenced by pyclustering.cluster.cluster_visualizer_multidim.save(), and pyclustering.cluster.cluster_visualizer.save().
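
The documented defaults can be summarized in a short sketch. This is plain Python written for illustration, assuming the defaults listed under Keyword Args for show(); the function display_options is hypothetical and is not part of pyclustering.

```python
# Defaults as stated in the keyword argument list for show().
DEFAULTS = {
    'visible_axis': False,
    'visible_labels': True,
    'visible_grid': True,
    'max_row_size': 4,
    'show': True,
}


def display_options(**kwargs):
    """Merge caller keyword arguments with the documented defaults (hypothetical helper)."""
    unknown = set(kwargs) - set(DEFAULTS)
    if unknown:
        # Design choice of this sketch only: reject unknown names to catch typos.
        raise TypeError('unexpected keyword arguments: ' + ', '.join(sorted(unknown)))
    options = dict(DEFAULTS)
    options.update(kwargs)
    return options


# Override two options and keep the remaining defaults.
options = display_options(max_row_size=3, visible_axis=True)
print(options)
```

Rejecting unknown names is a choice made in this sketch so that a misspelling such as 'row_size' is noticed; how the library treats unrecognized keyword arguments is not stated on this page.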


The documentation for this class was generated from the following file: