pyclustering
0.10.1
pyclustering is a Python, C++ data mining library.

Utils that are used by modules of pyclustering.
Namespaces  
color  
Colors used by pyclustering library for visualization.  
graph  
Graph representation (uses format GRPR).  
metric  
Module provides various distance metrics - abstraction of the notion of distance in a metric space.
sampling  
Module provides various random sampling algorithms.  
Functions  
def  read_sample (filename)
Returns data sample from simple text file.
def  calculate_distance_matrix (sample, metric=distance_metric(type_metric.EUCLIDEAN))
Calculates distance matrix for data sample (sequence of points) using specified metric (by default Euclidean distance).
def  read_image (filename)
Returns image as N-dimensional (depends on the input image) matrix, where one element of list describes pixel.
def  rgb2gray (image_rgb_array)
Returns image as 1-dimensional (gray colored) matrix, where one element of list describes pixel.
def  stretch_pattern (image_source)
Returns stretched content as 1-dimensional (gray colored) matrix with size of input image.
def  gray_pattern_borders (image)
Returns coordinates of gray image content on the input image.
def  average_neighbor_distance (points, num_neigh)
Returns average distance for establishing links between specified number of nearest neighbors.
def  medoid (data, indexes=None, **kwargs)
Calculate medoid for input points.
def  euclidean_distance (a, b)
Calculate Euclidean distance between vectors a and b.
def  euclidean_distance_square (a, b)
Calculate squared Euclidean distance between vectors a and b.
def  manhattan_distance (a, b)
Calculate Manhattan distance between vectors a and b.
def  average_inter_cluster_distance (cluster1, cluster2, data=None)
Calculates average intercluster distance between two clusters.
def  average_intra_cluster_distance (cluster1, cluster2, data=None)
Calculates average intracluster distance between two clusters.
def  variance_increase_distance (cluster1, cluster2, data=None)
Calculates variance increase distance between two clusters.
def  calculate_ellipse_description (covariance, scale=2.0)
Calculates description of ellipse using covariance matrix.
def  data_corners (data, data_filter=None)
Finds maximum and minimum corner in each dimension of the specified data.
def  norm_vector (vector)
Calculates norm of an input vector that is known as a vector length.
def  heaviside (value)
Calculates Heaviside function that represents step function.
def  timedcall (executable_function, *args, **kwargs)
Executes specified method or function with measuring of execution time.
def  extract_number_oscillations (osc_dyn, index=0, amplitude_threshold=1.0)
Extracts number of oscillations of specified oscillator.
def  allocate_sync_ensembles (dynamic, tolerance=0.1, threshold=1.0, ignore=None)
Allocate clusters in line with ensembles of synchronous oscillators where each synchronous ensemble corresponds to only one cluster.
def  draw_clusters (data, clusters, noise=[], marker_descr='.', hide_axes=False, axes=None, display_result=True)
Displays clusters for data in 2D or 3D.
def  draw_dynamics (t, dyn, x_title=None, y_title=None, x_lim=None, y_lim=None, x_labels=True, y_labels=True, separate=False, axes=None)
Draw dynamics of neurons (oscillators) in the network.
def  set_ax_param (ax, x_title=None, y_title=None, x_lim=None, y_lim=None, x_labels=True, y_labels=True, grid=True)
Sets parameters for matplotlib ax.
def  draw_dynamics_set (dynamics, xtitle=None, ytitle=None, xlim=None, ylim=None, xlabels=False, ylabels=False)
Draw lists of dynamics of neurons (oscillators) in the network.
def  draw_image_color_segments (source, clusters, hide_axes=True)
Shows image segments using colored image.
def  draw_image_mask_segments (source, clusters, hide_axes=True)
Shows image segments using black masks.
def  find_left_element (sorted_data, right, comparator)
Returns the element's index at the left side from the right border with the same value as the last element in the range sorted_data.
def  linear_sum (list_vector)
Calculates linear sum of vector that is represented by list, each element can be represented by list - multidimensional elements.
def  square_sum (list_vector)
Calculates square sum of vector that is represented by list, each element can be represented by list - multidimensional elements.
def  list_math_subtraction (a, b)
Calculates subtraction of two lists.
def  list_math_substraction_number (a, b)
Calculates subtraction between list and number.
def  list_math_addition (a, b)
Addition of two lists.
def  list_math_addition_number (a, b)
Addition between list and number.
def  list_math_division_number (a, b)
Division between list and number.
def  list_math_division (a, b)
Division of two lists.
def  list_math_multiplication_number (a, b)
Multiplication between list and number.
def  list_math_multiplication (a, b)
Multiplication of two lists.
Variables  
float  pi = 3.1415926535 
The number \(\pi\) is a mathematical constant, the ratio of a circle's circumference to its diameter.
def pyclustering.utils.allocate_sync_ensembles  (  dynamic,  
tolerance = 0.1 , 

threshold = 1.0 , 

ignore = None 

) 
Allocate clusters in line with ensembles of synchronous oscillators where each synchronous ensemble corresponds to only one cluster.
[in]  dynamic  (dynamic): Dynamic of each oscillator. 
[in]  tolerance  (double): Maximum error for allocation of synchronous ensemble oscillators. 
[in]  threshold  (double): Amplitude trigger when spike is taken into account.
[in]  ignore  (set): Set of indexes that shouldn't be taken into account.
Definition at line 631 of file __init__.py.
Referenced by pyclustering.nnet.fsync.fsync_dynamic.allocate_sync_ensembles().
def pyclustering.utils.average_inter_cluster_distance  (  cluster1,  
cluster2,  
data = None 

) 
Calculates average intercluster distance between two clusters.
Clusters can be represented by list of coordinates (in this case data shouldn't be specified), or by list of indexes of points from the data (represented by list of points), in this case data should be specified.
[in]  cluster1  (list): The first cluster where each element can represent index from the data or object itself. 
[in]  cluster2  (list): The second cluster where each element can represent index from the data or object itself. 
[in]  data  (list): If specified then elements of clusters will be used as indexes, otherwise elements of clusters will be considered as points.
Definition at line 331 of file __init__.py.
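The two cluster representations described above (points given directly, or indexes into data) can be illustrated with a minimal sketch. The helper names resolve_cluster and mean_pairwise_distance are hypothetical, and the averaging used here (plain mean of pairwise Euclidean distances) is an assumption; the library's exact formula may differ.

```python
import math


def resolve_cluster(cluster, data=None):
    """Hypothetical helper: treat elements as indexes only when data is given."""
    if data is None:
        return cluster
    return [data[index] for index in cluster]


def mean_pairwise_distance(cluster1, cluster2, data=None):
    """Assumed averaging: mean Euclidean distance over all cross-cluster pairs."""
    points1 = resolve_cluster(cluster1, data)
    points2 = resolve_cluster(cluster2, data)
    total = 0.0
    for p in points1:
        for q in points2:
            total += math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))
    return total / (len(points1) * len(points2))


data = [[0.0, 0.0], [0.0, 2.0], [3.0, 0.0], [3.0, 2.0]]

# Same clusters, once as indexes into data and once as explicit points.
by_index = mean_pairwise_distance([0, 1], [2, 3], data)
by_points = mean_pairwise_distance([data[0], data[1]], [data[2], data[3]])
```

Both calls describe the same clusters, so they should yield the same value.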
def pyclustering.utils.average_intra_cluster_distance  (  cluster1,  
cluster2,  
data = None 

) 
Calculates average intracluster distance between two clusters.
Clusters can be represented by list of coordinates (in this case data shouldn't be specified), or by list of indexes of points from the data (represented by list of points), in this case data should be specified.
[in]  cluster1  (list): The first cluster. 
[in]  cluster2  (list): The second cluster. 
[in]  data  (list): If specified then elements of clusters will be used as indexes, otherwise elements of clusters will be considered as points.
Definition at line 362 of file __init__.py.
def pyclustering.utils.average_neighbor_distance  (  points,  
num_neigh  
) 
Returns average distance for establishing links between specified number of nearest neighbors.
[in]  points  (list): Input data, list of points where each point is represented by list.
[in]  num_neigh  (uint): Number of neighbors that should be used for distance calculation. 
Definition at line 180 of file __init__.py.
def pyclustering.utils.calculate_distance_matrix  (  sample,  
metric = distance_metric(type_metric.EUCLIDEAN) 

) 
Calculates distance matrix for data sample (sequence of points) using specified metric (by default Euclidean distance).
[in]  sample  (array_like): Data points that are used for distance calculation. 
[in]  metric  (distance_metric): Metric that is used for distance calculation between two points. 
Definition at line 54 of file __init__.py.
Referenced by pyclustering.utils.sampling.reservoir_x().
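As an illustration of what a distance matrix for a sample looks like, here is a minimal pure-Python sketch. The name distance_matrix_sketch and the callable metric argument are assumptions made for this example; the library itself takes a distance_metric object.

```python
import math


def distance_matrix_sketch(sample, metric=None):
    """Return the full pairwise distance matrix as a list of lists."""
    if metric is None:
        # Assumed default for this sketch: plain Euclidean distance.
        def metric(a, b):
            return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    # Element [i][j] holds the distance between sample[i] and sample[j].
    return [[metric(p, q) for q in sample] for p in sample]


matrix = distance_matrix_sketch([[0.0, 0.0], [3.0, 4.0]])
```

The result is square and symmetric for a symmetric metric, with zeros on the diagonal.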
def pyclustering.utils.calculate_ellipse_description  (  covariance,  
scale = 2.0 

) 
Calculates description of ellipse using covariance matrix.
[in]  covariance  (numpy.array): Covariance matrix for which ellipse area should be calculated. 
[in]  scale  (float): Scale of the ellipse. 
Definition at line 482 of file __init__.py.
def pyclustering.utils.data_corners  (  data,  
data_filter = None 

) 
Finds maximum and minimum corner in each dimension of the specified data.
[in]  data  (list): List of points that should be analysed. 
[in]  data_filter  (list): List of indexes of the data that should be analysed, if it is 'None' then whole 'data' is analysed to obtain corners. 
Definition at line 506 of file __init__.py.
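A minimal sketch of finding the corners, assuming the result is a pair (minimum corner, maximum corner); the name corners_sketch and the return order are assumptions for illustration only.

```python
def corners_sketch(data, data_filter=None):
    """Return per-dimension minimum and maximum over the selected points."""
    # When a filter is given, only the points at those indexes are analysed.
    points = data if data_filter is None else [data[i] for i in data_filter]
    dimensions = len(points[0])
    minimum = [min(p[d] for p in points) for d in range(dimensions)]
    maximum = [max(p[d] for p in points) for d in range(dimensions)]
    return minimum, maximum


sample = [[1, 5], [3, 2], [0, 4]]
whole = corners_sketch(sample)
filtered = corners_sketch(sample, data_filter=[0, 1])
```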
def pyclustering.utils.draw_clusters  (  data,  
clusters,  
noise = [] , 

marker_descr = '.' , 

hide_axes = False , 

axes = None , 

display_result = True 

) 
Displays clusters for data in 2D or 3D.
[in]  data  (list): Points that are described by their coordinates.
[in]  clusters  (list): Clusters that are represented by lists of indexes where each index corresponds to point in data.
[in]  noise  (list): Points that are regarded as noise.
[in]  marker_descr  (string): Marker for displaying points.
[in]  hide_axes  (bool): If True then axes are not displayed.
[in]  axes  (ax): Matplotlib axes where clusters should be drawn, if it is not specified (None) then new plot will be created.
[in]  display_result  (bool): If True then the plot is shown, otherwise the clusters are drawn without showing the plot.
Definition at line 727 of file __init__.py.
Referenced by pyclustering.utils.sampling.reservoir_x().
def pyclustering.utils.draw_dynamics  (  t,  
dyn,  
x_title = None , 

y_title = None , 

x_lim = None , 

y_lim = None , 

x_labels = True , 

y_labels = True , 

separate = False , 

axes = None 

) 
Draw dynamics of neurons (oscillators) in the network.
The plot is shown if matplotlib axes are not specified (None), otherwise showing it should be performed manually.
[in]  t  (list): Values of time (used by x axis). 
[in]  dyn  (list): Values of output of oscillators (used by y axis). 
[in]  x_title  (string): Title for X.
[in]  y_title  (string): Title for Y.
[in]  x_lim  (double): X limit.
[in]  y_lim  (double): Y limit.
[in]  x_labels  (bool): If True then X labels are shown.
[in]  y_labels  (bool): If True then Y labels are shown.
[in]  separate  (list): Consists of lists of oscillators where each such list consists of oscillator indexes that will be shown on separated stage. 
[in]  axes  (ax): If specified then matplotlib axes will be used for drawing and plot will not be shown. 
Definition at line 829 of file __init__.py.
Referenced by pyclustering.utils.draw_dynamics_set(), pyclustering.utils.sampling.reservoir_x(), and pyclustering.nnet.fsync.fsync_visualizer.show_output_dynamic().
def pyclustering.utils.draw_dynamics_set  (  dynamics,  
xtitle = None , 

ytitle = None , 

xlim = None , 

ylim = None , 

xlabels = False , 

ylabels = False 

) 
Draw lists of dynamics of neurons (oscillators) in the network.
[in]  dynamics  (list): List of network outputs that are represented by values of output of oscillators (used by y axis). 
[in]  xtitle  (string): Title for X.
[in]  ytitle  (string): Title for Y.
[in]  xlim  (double): X limit.
[in]  ylim  (double): Y limit.
[in]  xlabels  (bool): If True then X labels are shown.
[in]  ylabels  (bool): If True then Y labels are shown.
Definition at line 957 of file __init__.py.
Referenced by pyclustering.nnet.fsync.fsync_visualizer.show_output_dynamics().
def pyclustering.utils.draw_image_color_segments  (  source,  
clusters,  
hide_axes = True 

) 
Shows image segments using colored image.
Each color on result image represents allocated segment. The first image is initial and other is result of segmentation.
[in]  source  (string): Path to image. 
[in]  clusters  (list): List of clusters (allocated segments of image) where each cluster consists of indexes of pixel from source image. 
[in]  hide_axes  (bool): If True then axes will not be displayed. 
Definition at line 1002 of file __init__.py.
def pyclustering.utils.draw_image_mask_segments  (  source,  
clusters,  
hide_axes = True 

) 
Shows image segments using black masks.
Each black mask of allocated segment is presented on separate plot. The first image is initial and others are black masks of segments.
[in]  source  (string): Path to image. 
[in]  clusters  (list): List of clusters (allocated segments of image) where each cluster consists of indexes of pixel from source image. 
[in]  hide_axes  (bool): If True then axes will not be displayed. 
Definition at line 1054 of file __init__.py.
Referenced by pyclustering.utils.sampling.reservoir_x().
def pyclustering.utils.euclidean_distance  (  a,  
b  
) 
Calculate Euclidean distance between vector a and b.
The Euclidean distance between vectors (points) a and b is calculated by the following formula:
\[ dist(a, b) = \sqrt{ \sum_{i=0}^{N-1}(b_{i} - a_{i})^{2} }; \]
Where N is the length of each vector.
[in]  a  (list): The first vector. 
[in]  b  (list): The second vector. 
Definition at line 263 of file __init__.py.
Referenced by pyclustering.utils.average_neighbor_distance().
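The formula above can be sketched in plain Python as follows. The function names carry a _sketch suffix to mark them as illustrative, not the library's own implementation; the split into a squared helper mirrors the cross-reference to euclidean_distance_square() but is otherwise an assumption.

```python
import math


def euclidean_distance_square_sketch(a, b):
    """Sum of squared coordinate differences between vectors a and b."""
    return sum((bi - ai) ** 2 for ai, bi in zip(a, b))


def euclidean_distance_sketch(a, b):
    """Square root of the squared Euclidean distance."""
    return math.sqrt(euclidean_distance_square_sketch(a, b))


distance = euclidean_distance_sketch([0.0, 0.0], [3.0, 4.0])
```

When distances are only compared with each other, the squared form avoids the square root.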
def pyclustering.utils.euclidean_distance_square  (  a,  
b  
) 
Calculate squared Euclidean distance between vectors a and b.
[in]  a  (list): The first vector. 
[in]  b  (list): The second vector. 
Definition at line 287 of file __init__.py.
Referenced by pyclustering.utils.average_inter_cluster_distance(), pyclustering.utils.average_intra_cluster_distance(), pyclustering.utils.euclidean_distance(), and pyclustering.utils.variance_increase_distance().
def pyclustering.utils.extract_number_oscillations  (  osc_dyn,  
index = 0 , 

amplitude_threshold = 1.0 

) 
Extracts number of oscillations of specified oscillator.
[in]  osc_dyn  (list): Dynamic of oscillators. 
[in]  index  (uint): Index of oscillator in dynamic. 
[in]  amplitude_threshold  (double): Amplitude threshold when oscillation is taken into account, for example, when oscillator amplitude is greater than threshold then oscillation is incremented. 
Definition at line 592 of file __init__.py.
Referenced by pyclustering.nnet.fsync.fsync_dynamic.extract_number_oscillations().
def pyclustering.utils.find_left_element  (  sorted_data,  
right,  
comparator  
) 
Returns the element's index at the left side from the right border with the same value as the last element in the range sorted_data.
The element at the right is considered as target to search. sorted_data must be sorted collection. The complexity of the algorithm is O(log(n)). The algorithm is based on the binary search algorithm.
[in]  sorted_data  input data to find the element. 
[in]  right  the index of the right element from that search is started. 
[in]  comparator  comparison function object which returns True if the first argument is less than the second. 
Definition at line 1132 of file __init__.py.
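A minimal binary-search sketch of the behaviour described above: find the leftmost index whose value equals the element at index right. The name leftmost_equal_index is hypothetical and this is an illustration under the stated description, not the library's source.

```python
def leftmost_equal_index(sorted_data, right, comparator):
    """Return the leftmost index holding the same value as sorted_data[right]."""
    if not sorted_data:
        raise ValueError("sorted_data must not be empty")
    target = sorted_data[right]
    left = 0
    while left < right:
        middle = left + (right - left) // 2
        if comparator(sorted_data[middle], target):
            # The middle element is strictly less than the target: go right.
            left = middle + 1
        else:
            # The middle element equals the target (data is sorted): keep it.
            right = middle
    return left


values = [1, 2, 2, 2, 3]
first_two = leftmost_equal_index(values, 3, lambda a, b: a < b)
```

The search interval halves on every iteration, which gives the O(log(n)) complexity mentioned above.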
def pyclustering.utils.gray_pattern_borders  (  image  ) 
Returns coordinates of gray image content on the input image.
[in]  image  (Image): PIL Image instance that is processed. 
Definition at line 138 of file __init__.py.
Referenced by pyclustering.utils.stretch_pattern().
def pyclustering.utils.heaviside  (  value  ) 
Calculates Heaviside function that represents step function.
If input value is greater than 0 then returns 1, otherwise returns 0.
[in]  value  (double): Argument of Heaviside function. 
Definition at line 557 of file __init__.py.
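The rule above can be written as a one-line sketch (the name heaviside_sketch is illustrative). Note that under this convention an input of exactly 0 yields 0.

```python
def heaviside_sketch(value):
    """Step function: 1 for strictly positive input, otherwise 0."""
    return 1.0 if value > 0.0 else 0.0


steps = [heaviside_sketch(v) for v in (-2.0, 0.0, 0.5)]
```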
def pyclustering.utils.linear_sum  (  list_vector  ) 
Calculates linear sum of vector that is represented by list, each element can be represented by list - multidimensional elements.
[in]  list_vector  (list): Input vector. 
Definition at line 1169 of file __init__.py.
def pyclustering.utils.list_math_addition  (  a,  
b  
) 
Addition of two lists.
Each element from list 'a' is added to element from list 'b' accordingly.
[in]  a  (list): List of elements that support mathematical addition.
[in]  b  (list): List of elements that support mathematical addition.
Definition at line 1246 of file __init__.py.
Referenced by pyclustering.utils.variance_increase_distance().
def pyclustering.utils.list_math_addition_number  (  a,  
b  
) 
Addition between list and number.
Each element from list 'a' is added to number 'b'.
[in]  a  (list): List of elements that support mathematical addition.
[in]  b  (double): Value that supports mathematical addition.
Definition at line 1260 of file __init__.py.
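The list-with-list and list-with-number variants follow the same element-wise pattern, shown here for addition. The names add_lists and add_number are hypothetical; this sketch uses zip, which silently truncates to the shorter list, and the library's handling of unequal lengths is not stated here.

```python
def add_lists(a, b):
    """Element-wise sum of two lists of the same length."""
    return [x + y for x, y in zip(a, b)]


def add_number(a, b):
    """Add the number b to every element of list a."""
    return [x + b for x in a]


pairwise = add_lists([1, 2, 3], [10, 20, 30])
shifted = add_number([1, 2, 3], 10)
```

The subtraction, division and multiplication helpers can be sketched the same way by swapping the operator.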
def pyclustering.utils.list_math_division  (  a,  
b  
) 
Division of two lists.
Each element from list 'a' is divided by element from list 'b' accordingly.
[in]  a  (list): List of elements that support mathematical division.
[in]  b  (list): List of elements that support mathematical division.
Definition at line 1288 of file __init__.py.
def pyclustering.utils.list_math_division_number  (  a,  
b  
) 
Division between list and number.
Each element from list 'a' is divided by number 'b'.
[in]  a  (list): List of elements that support mathematical division.
[in]  b  (double): Value that supports mathematical division.
Definition at line 1274 of file __init__.py.
Referenced by pyclustering.utils.variance_increase_distance().
def pyclustering.utils.list_math_multiplication  (  a,  
b  
) 
Multiplication of two lists.
Each element from list 'a' is multiplied by element from list 'b' accordingly.
[in]  a  (list): List of elements that support mathematical multiplication.
[in]  b  (list): List of elements that support mathematical multiplication.
Definition at line 1316 of file __init__.py.
Referenced by pyclustering.utils.square_sum().
def pyclustering.utils.list_math_multiplication_number  (  a,  
b  
) 
Multiplication between list and number.
Each element from list 'a' is multiplied by number 'b'.
[in]  a  (list): List of elements that support mathematical multiplication.
[in]  b  (double): Number that supports mathematical multiplication.
Definition at line 1302 of file __init__.py.
def pyclustering.utils.list_math_substraction_number  (  a,  
b  
) 
Calculates subtraction between list and number.
The number 'b' is subtracted from each element of list 'a'.
[in]  a  (list): List of elements that support mathematical subtraction.
[in]  b  (double): Value that supports mathematical subtraction.
Definition at line 1232 of file __init__.py.
def pyclustering.utils.list_math_subtraction  (  a,  
b  
) 
Calculates subtraction of two lists.
Each element of list 'b' is subtracted from the corresponding element of list 'a'.
[in]  a  (list): List of elements that support mathematical subtraction.
[in]  b  (list): List of elements that support mathematical subtraction.
Definition at line 1218 of file __init__.py.
def pyclustering.utils.manhattan_distance  (  a,  
b  
) 
Calculate Manhattan distance between vectors a and b.
[in]  a  (list): The first vector.
[in]  b  (list): The second vector.
Definition at line 308 of file __init__.py.
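For reference, the Manhattan distance is the sum of absolute coordinate differences. A minimal sketch, with an illustrative function name:

```python
def manhattan_distance_sketch(a, b):
    """Sum of absolute coordinate differences between vectors a and b."""
    return sum(abs(ai - bi) for ai, bi in zip(a, b))


blocks = manhattan_distance_sketch([1, 2], [4, -2])
```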
def pyclustering.utils.medoid  (  data,  
indexes = None , 

**  kwargs  
) 
Calculate medoid for input points.
[in]  data  (list): Set of points for which the medoid should be calculated.
[in]  indexes  (list): Indexes of input set of points that will be taken into account during medoid calculation.
[in]  **kwargs  Arbitrary keyword arguments (available arguments: 'metric', 'data_type'). 
Keyword Args:
Definition at line 213 of file __init__.py.
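A medoid is the point of a set whose total distance to all other points of the set is smallest. The sketch below assumes Euclidean distance and returns the index of the medoid; the name medoid_index_sketch, the fixed metric and the choice to return an index are assumptions for illustration, and the 'metric' and 'data_type' keyword arguments are not modelled.

```python
import math


def medoid_index_sketch(data, indexes=None):
    """Return the index of the point with the smallest total distance to the rest."""
    candidates = list(range(len(data))) if indexes is None else list(indexes)
    best_index, best_total = None, float("inf")
    for i in candidates:
        # Total Euclidean distance from point i to every candidate point.
        total = sum(
            math.sqrt(sum((x - y) ** 2 for x, y in zip(data[i], data[j])))
            for j in candidates
        )
        if total < best_total:
            best_index, best_total = i, total
    return best_index


points = [[0.0, 0.0], [1.0, 0.0], [10.0, 0.0]]
center = medoid_index_sketch(points)
```

Unlike a mean, the medoid is always one of the input points, which makes it usable with arbitrary distance metrics.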
def pyclustering.utils.norm_vector  (  vector  ) 
Calculates norm of an input vector that is known as a vector length.
[in]  vector  (list): The input vector whose length is calculated. 
Definition at line 538 of file __init__.py.
def pyclustering.utils.read_image  (  filename  ) 
Returns image as N-dimensional (depends on the input image) matrix, where one element of list describes pixel.
[in]  filename  (string): Path to image. 
Definition at line 69 of file __init__.py.
Referenced by pyclustering.utils.sampling.reservoir_x().
def pyclustering.utils.read_sample  (  filename  ) 
Returns data sample from simple text file.
This function should be used for text file with following format:
[in]  filename  (string): Path to file with data. 
Definition at line 30 of file __init__.py.
Referenced by pyclustering.utils.sampling.reservoir_x().
def pyclustering.utils.rgb2gray  (  image_rgb_array  ) 
Returns image as 1-dimensional (gray colored) matrix, where one element of list describes pixel.
Luma coding is used for transformation and that is calculated directly from gamma-compressed primary intensities as a weighted sum:
\[Y = 0.2989R + 0.587G + 0.114B\]
[in]  image_rgb_array  (list): Image represented by RGB list. 
Definition at line 84 of file __init__.py.
Referenced by pyclustering.utils.sampling.reservoir_x(), and pyclustering.utils.stretch_pattern().
def pyclustering.utils.set_ax_param  (  ax,  
x_title = None , 

y_title = None , 

x_lim = None , 

y_lim = None , 

x_labels = True , 

y_labels = True , 

grid = True 

) 
Sets parameters for matplotlib ax.
[in]  ax  (Axes): Axes for which parameters should be applied.
[in]  x_title  (string): Title for X.
[in]  y_title  (string): Title for Y.
[in]  x_lim  (double): X limit.
[in]  y_lim  (double): Y limit.
[in]  x_labels  (bool): If True then X labels are shown.
[in]  y_labels  (bool): If True then Y labels are shown.
[in]  grid  (bool): If True then grid is shown.
Definition at line 913 of file __init__.py.
Referenced by pyclustering.utils.draw_dynamics().
def pyclustering.utils.square_sum  (  list_vector  ) 
Calculates square sum of vector that is represented by list, each element can be represented by list - multidimensional elements.
[in]  list_vector  (list): Input vector. 
Definition at line 1196 of file __init__.py.
def pyclustering.utils.stretch_pattern  (  image_source  ) 
Returns stretched content as 1-dimensional (gray colored) matrix with size of input image.
[in]  image_source  (Image): PIL Image instance. 
Definition at line 113 of file __init__.py.
def pyclustering.utils.timedcall  (  executable_function,  
*  args,  
**  kwargs  
) 
Executes specified method or function with measuring of execution time.
[in]  executable_function  (pointer): Pointer to a function or method that should be called. 
[in]  *args  Arguments of the called function or method. 
[in]  **kwargs  Arbitrary keyword arguments of the called function or method. 
Definition at line 573 of file __init__.py.
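A timing wrapper of this kind can be sketched with the standard library as follows. The name timed_call_sketch and the (elapsed seconds, result) return shape are assumptions for illustration; the return format of the library function is not stated in this entry.

```python
import time


def timed_call_sketch(executable_function, *args, **kwargs):
    """Call the function and return (elapsed seconds, function result)."""
    start = time.perf_counter()
    result = executable_function(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return elapsed, result


elapsed, total = timed_call_sketch(sum, [1, 2, 3])
```

Positional and keyword arguments are forwarded unchanged to the wrapped callable.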
def pyclustering.utils.variance_increase_distance  (  cluster1,  
cluster2,  
data = None 

) 
Calculates variance increase distance between two clusters.
Clusters can be represented by list of coordinates (in this case data shouldn't be specified), or by list of indexes of points from the data (represented by list of points), in this case data should be specified.
[in]  cluster1  (list): The first cluster. 
[in]  cluster2  (list): The second cluster. 
[in]  data  (list): If specified then elements of clusters will be used as indexes, otherwise elements of clusters will be considered as points.
Definition at line 413 of file __init__.py.