pyclustering 0.10.1: pyclustering is a Python, C++ data mining library.
pyclustering.utils Namespace Reference

Utils that are used by modules of pyclustering.

## Namespaces

color
Colors used by pyclustering library for visualization.

graph
Graph representation (uses format GRPR).

metric
Module provides various distance metrics - abstraction of the notion of distance in a metric space.

sampling
Module provides various random sampling algorithms.

## Functions

Returns data sample from simple text file.

def calculate_distance_matrix (sample, metric=distance_metric(type_metric.EUCLIDEAN))
Calculates distance matrix for data sample (sequence of points) using specified metric (by default Euclidean distance).

Returns image as N-dimension (depends on the input image) matrix, where one element of list describes pixel.

def rgb2gray (image_rgb_array)
Returns image as 1-dimension (gray colored) matrix, where one element of list describes pixel.

def stretch_pattern (image_source)
Returns stretched content as 1-dimension (gray colored) matrix with size of input image.

def gray_pattern_borders (image)
Returns coordinates of gray image content on the input image.

def average_neighbor_distance (points, num_neigh)
Returns the average distance needed to establish links between the specified number of nearest neighbors.

def medoid (data, indexes=None, **kwargs)
Calculate medoid for input points.

def euclidean_distance (a, b)
Calculate Euclidean distance between vectors a and b.

def euclidean_distance_square (a, b)
Calculate squared Euclidean distance between vectors a and b.

def manhattan_distance (a, b)
Calculate Manhattan distance between vectors a and b.

def average_inter_cluster_distance (cluster1, cluster2, data=None)
Calculates average inter-cluster distance between two clusters.

def average_intra_cluster_distance (cluster1, cluster2, data=None)
Calculates average intra-cluster distance between two clusters.

def variance_increase_distance (cluster1, cluster2, data=None)
Calculates variance increase distance between two clusters.

def calculate_ellipse_description (covariance, scale=2.0)
Calculates description of ellipse using covariance matrix.

def data_corners (data, data_filter=None)
Finds maximum and minimum corner in each dimension of the specified data.

def norm_vector (vector)
Calculates norm of an input vector that is known as a vector length.

def heaviside (value)
Calculates Heaviside function that represents step function.

def timedcall (executable_function, *args, **kwargs)
Executes specified method or function with measuring of execution time.

def extract_number_oscillations (osc_dyn, index=0, amplitude_threshold=1.0)
Extracts number of oscillations of specified oscillator.

def allocate_sync_ensembles (dynamic, tolerance=0.1, threshold=1.0, ignore=None)
Allocate clusters in line with ensembles of synchronous oscillators where each synchronous ensemble corresponds to only one cluster.

def draw_clusters (data, clusters, noise=[], marker_descr='.', hide_axes=False, axes=None, display_result=True)
Displays clusters for data in 2D or 3D.

def draw_dynamics (t, dyn, x_title=None, y_title=None, x_lim=None, y_lim=None, x_labels=True, y_labels=True, separate=False, axes=None)
Draw dynamics of neurons (oscillators) in the network.

def set_ax_param (ax, x_title=None, y_title=None, x_lim=None, y_lim=None, x_labels=True, y_labels=True, grid=True)
Sets parameters for matplotlib ax.

def draw_dynamics_set (dynamics, xtitle=None, ytitle=None, xlim=None, ylim=None, xlabels=False, ylabels=False)
Draw lists of dynamics of neurons (oscillators) in the network.

def draw_image_color_segments (source, clusters, hide_axes=True)
Shows image segments using colored image.

def draw_image_mask_segments (source, clusters, hide_axes=True)
Shows image segments using black masks.

def find_left_element (sorted_data, right, comparator)
Returns the index of the leftmost element in sorted_data that has the same value as the element at the right border.

def linear_sum (list_vector)
Calculates linear sum of vector that is represented by list, each element can be represented by list - multidimensional elements.

def square_sum (list_vector)
Calculates square sum of vector that is represented by list, each element can be represented by list - multidimensional elements.

def list_math_subtraction (a, b)
Calculates subtraction of two lists.

def list_math_substraction_number (a, b)
Calculates subtraction between list and number.

def list_math_addition (a, b)
Each element from list 'a' is added to element from list 'b' accordingly.

def list_math_addition_number (a, b)
Addition between list and number.

def list_math_division_number (a, b)
Division between list and number.

def list_math_division (a, b)
Division of two lists.

def list_math_multiplication_number (a, b)
Multiplication between list and number.

def list_math_multiplication (a, b)
Multiplication of two lists.

## Variables

float pi = 3.1415926535
The number $\pi$ is a mathematical constant, the ratio of a circle's circumference to its diameter.

## Detailed Description

Utils that are used by modules of pyclustering.

Date
2014-2020

## ◆ allocate_sync_ensembles()

 def pyclustering.utils.allocate_sync_ensembles ( dynamic, tolerance = 0.1, threshold = 1.0, ignore = None )

Allocate clusters in line with ensembles of synchronous oscillators where each synchronous ensemble corresponds to only one cluster.

Parameters
 [in] dynamic (dynamic): Dynamic of each oscillator.
 [in] tolerance (double): Maximum error for allocation of synchronous ensemble oscillators.
 [in] threshold (double): Amplitude trigger when spike is taken into account.
 [in] ignore (set): Set of indexes that shouldn't be taken into account.
Returns
(list) Groups (lists) of indexes of synchronous oscillators, for example, [ [index_osc1, index_osc3], [index_osc2], [index_osc4, index_osc5] ].

Definition at line 631 of file __init__.py.

## ◆ average_inter_cluster_distance()

 def pyclustering.utils.average_inter_cluster_distance ( cluster1, cluster2, data = None )

Calculates average inter-cluster distance between two clusters.

Clusters can be represented by list of coordinates (in this case data shouldn't be specified), or by list of indexes of points from the data (represented by list of points), in this case data should be specified.

Parameters
 [in] cluster1 (list): The first cluster where each element can represent index from the data or object itself.
 [in] cluster2 (list): The second cluster where each element can represent index from the data or object itself.
 [in] data (list): If specified then elements of clusters will be used as indexes, otherwise elements of clusters will be considered as points.
Returns
(double) Average inter-cluster distance between two clusters.

Definition at line 331 of file __init__.py.
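
The two cluster representations described above can be illustrated with a small standalone sketch. It is not the library's implementation: the helper names `resolve_points` and `mean_cross_distance` are hypothetical, and the distance used here (plain mean of Euclidean distances over all cross-cluster pairs) is an assumption that may differ from the exact formula used by pyclustering.

```python
import math


def resolve_points(cluster, data=None):
    """Return the cluster as a list of points.

    If data is given, cluster elements are treated as indexes into data;
    otherwise they are treated as the points themselves.
    """
    if data is None:
        return list(cluster)
    return [data[index] for index in cluster]


def mean_cross_distance(cluster1, cluster2, data=None):
    """Mean Euclidean distance over all pairs (p, q), p in cluster1, q in cluster2.

    Assumption: this is one common definition of an average inter-cluster
    distance, used here only to show how 'data' changes the input meaning.
    """
    points1 = resolve_points(cluster1, data)
    points2 = resolve_points(cluster2, data)
    total = sum(math.dist(p, q) for p in points1 for q in points2)
    return total / (len(points1) * len(points2))


# Clusters given as coordinates (data is not specified).
by_points = mean_cross_distance([[0.0, 0.0]], [[3.0, 4.0], [6.0, 8.0]])

# The same clusters given as indexes into a shared data list.
data = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
by_indexes = mean_cross_distance([0], [1, 2], data)
```

Both calls describe the same clusters, so they produce the same value.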

## ◆ average_intra_cluster_distance()

 def pyclustering.utils.average_intra_cluster_distance ( cluster1, cluster2, data = None )

Calculates average intra-cluster distance between two clusters.

Clusters can be represented by list of coordinates (in this case data shouldn't be specified), or by list of indexes of points from the data (represented by list of points), in this case data should be specified.

Parameters
 [in] cluster1 (list): The first cluster.
 [in] cluster2 (list): The second cluster.
 [in] data (list): If specified then elements of clusters will be used as indexes, otherwise elements of clusters will be considered as points.
Returns
(double) Average intra-cluster distance between two clusters.

Definition at line 362 of file __init__.py.

## ◆ average_neighbor_distance()

 def pyclustering.utils.average_neighbor_distance ( points, num_neigh )

Returns the average distance needed to establish links between the specified number of nearest neighbors.

Parameters
 [in] points (list): Input data, list of points where each point is represented by a list.
 [in] num_neigh (uint): Number of neighbors that should be used for distance calculation.
Returns
(double) Average distance needed to establish links between 'num_neigh' neighbors in data set 'points'.

Definition at line 180 of file __init__.py.

## ◆ calculate_distance_matrix()

 def pyclustering.utils.calculate_distance_matrix ( sample, metric = distance_metric(type_metric.EUCLIDEAN) )

Calculates distance matrix for data sample (sequence of points) using specified metric (by default Euclidean distance).

Parameters
 [in] sample (array_like): Data points that are used for distance calculation.
 [in] metric (distance_metric): Metric that is used for distance calculation between two points.
Returns
(list) Distance matrix between data points.

Definition at line 54 of file __init__.py.

Referenced by pyclustering.utils.sampling.reservoir_x().
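
As an illustration of what a distance matrix for a data sample looks like, here is a minimal standalone sketch. The function name `distance_matrix_sketch` is hypothetical, and a plain callable (`math.dist` by default) stands in for the library's `distance_metric` object, which is an assumption made to keep the example self-contained.

```python
import math


def distance_matrix_sketch(sample, metric=math.dist):
    """Build a square matrix where element [i][j] is metric(sample[i], sample[j])."""
    return [[metric(p, q) for q in sample] for p in sample]


sample = [[0.0, 0.0], [3.0, 4.0], [0.0, 4.0]]
matrix = distance_matrix_sketch(sample)

# The matrix is symmetric and has zeros on the diagonal for any true metric.
for i in range(len(sample)):
    assert matrix[i][i] == 0.0
```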

## ◆ calculate_ellipse_description()

 def pyclustering.utils.calculate_ellipse_description ( covariance, scale = 2.0 )

Calculates description of ellipse using covariance matrix.

Parameters
 [in] covariance (numpy.array): Covariance matrix for which ellipse area should be calculated.
 [in] scale (float): Scale of the ellipse.
Returns
(float, float, float) Return ellipse description: angle, width, height.

Definition at line 482 of file __init__.py.

## ◆ data_corners()

 def pyclustering.utils.data_corners ( data, data_filter = None )

Finds maximum and minimum corner in each dimension of the specified data.

Parameters
 [in] data (list): List of points that should be analysed.
 [in] data_filter (list): List of indexes of the data that should be analysed, if it is 'None' then whole 'data' is analysed to obtain corners.
Returns
(list) Tuple of two points that corresponds to minimum and maximum corner (min_corner, max_corner).

Definition at line 506 of file __init__.py.
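
The corner search described above can be sketched in a few lines of plain Python. This is an illustrative re-implementation under the documented contract, not the library source; the name `data_corners_sketch` is hypothetical.

```python
def data_corners_sketch(data, data_filter=None):
    """Return (min_corner, max_corner) over the selected points.

    data_filter is a list of indexes into data; None selects every point.
    """
    indexes = range(len(data)) if data_filter is None else data_filter
    points = [data[index] for index in indexes]

    # zip(*points) groups coordinates by dimension.
    min_corner = [min(coordinates) for coordinates in zip(*points)]
    max_corner = [max(coordinates) for coordinates in zip(*points)]
    return min_corner, max_corner


data = [[1, 5], [3, 2], [0, 4]]
all_corners = data_corners_sketch(data)
filtered_corners = data_corners_sketch(data, data_filter=[0, 1])
```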

## ◆ draw_clusters()

 def pyclustering.utils.draw_clusters ( data, clusters, noise = [], marker_descr = '.', hide_axes = False, axes = None, display_result = True )

Displays clusters for data in 2D or 3D.

Parameters
 [in] data (list): Points that are described by coordinates.
 [in] clusters (list): Clusters that are represented by lists of indexes where each index corresponds to point in data.
 [in] noise (list): Points that are regarded as noise.
 [in] marker_descr (string): Marker for displaying points.
 [in] hide_axes (bool): If True, axes are not displayed.
 [in] axes (ax): Matplotlib axes where clusters should be drawn; if not specified (None) then a new plot will be created.
 [in] display_result (bool): If specified then matplotlib axes will be used for drawing and plot will not be shown.
Returns
(ax) Matplotlib axes where drawn clusters are presented.

Definition at line 727 of file __init__.py.

Referenced by pyclustering.utils.sampling.reservoir_x().

## ◆ draw_dynamics()

 def pyclustering.utils.draw_dynamics ( t, dyn, x_title = None, y_title = None, x_lim = None, y_lim = None, x_labels = True, y_labels = True, separate = False, axes = None )

Draw dynamics of neurons (oscillators) in the network.

The plot is shown only if matplotlib axes are not specified (None); otherwise showing the plot should be performed manually.

Parameters
 [in] t (list): Values of time (used by x axis).
 [in] dyn (list): Values of output of oscillators (used by y axis).
 [in] x_title (string): Title for X axis.
 [in] y_title (string): Title for Y axis.
 [in] x_lim (double): X limit.
 [in] y_lim (double): Y limit.
 [in] x_labels (bool): If True, shows X labels.
 [in] y_labels (bool): If True, shows Y labels.
 [in] separate (list): Consists of lists of oscillators where each such list consists of oscillator indexes that will be shown on a separate stage.
 [in] axes (ax): If specified then matplotlib axes will be used for drawing and plot will not be shown.
Returns
(ax) Axes of matplotlib.

Definition at line 829 of file __init__.py.

## ◆ draw_dynamics_set()

 def pyclustering.utils.draw_dynamics_set ( dynamics, xtitle = None, ytitle = None, xlim = None, ylim = None, xlabels = False, ylabels = False )

Draw lists of dynamics of neurons (oscillators) in the network.

Parameters
 [in] dynamics (list): List of network outputs that are represented by values of output of oscillators (used by y axis).
 [in] xtitle (string): Title for X axis.
 [in] ytitle (string): Title for Y axis.
 [in] xlim (double): X limit.
 [in] ylim (double): Y limit.
 [in] xlabels (bool): If True, shows X labels.
 [in] ylabels (bool): If True, shows Y labels.

Definition at line 957 of file __init__.py.

## ◆ draw_image_color_segments()

 def pyclustering.utils.draw_image_color_segments ( source, clusters, hide_axes = True )

Shows image segments using colored image.

Each color on the result image represents an allocated segment. The first image is the original and the other is the result of segmentation.

Parameters
 [in] source (string): Path to image.
 [in] clusters (list): List of clusters (allocated segments of image) where each cluster consists of indexes of pixels from the source image.
 [in] hide_axes (bool): If True then axes will not be displayed.

Definition at line 1002 of file __init__.py.

## ◆ draw_image_mask_segments()

 def pyclustering.utils.draw_image_mask_segments ( source, clusters, hide_axes = True )

Shows image segments using black masks.

Each black mask of an allocated segment is presented on a separate plot. The first image is the original and the others are black masks of segments.

Parameters
 [in] source (string): Path to image.
 [in] clusters (list): List of clusters (allocated segments of image) where each cluster consists of indexes of pixels from the source image.
 [in] hide_axes (bool): If True then axes will not be displayed.

Definition at line 1054 of file __init__.py.

Referenced by pyclustering.utils.sampling.reservoir_x().

## ◆ euclidean_distance()

 def pyclustering.utils.euclidean_distance ( a, b )

Calculate Euclidean distance between vector a and b.

The Euclidean distance between vectors (points) a and b is calculated by the following formula:

$dist(a, b) = \sqrt{ \sum_{i=0}^{N-1}(b_{i} - a_{i})^{2} }$

Where N is the length of each vector.

Parameters
 [in] a (list): The first vector.
 [in] b (list): The second vector.
Returns
(double) Euclidean distance between two vectors.
Note
This function is about 100 times faster than the standard function.

Definition at line 263 of file __init__.py.

Referenced by pyclustering.utils.average_neighbor_distance().
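
The formula above translates directly into code. The following is a minimal sketch that mirrors the formula term by term; the names `euclidean_distance_sketch` and `euclidean_distance_square_sketch` are hypothetical and this is not the library source.

```python
import math


def euclidean_distance_square_sketch(a, b):
    """Sum of squared coordinate differences between vectors a and b."""
    return sum((b_i - a_i) ** 2 for a_i, b_i in zip(a, b))


def euclidean_distance_sketch(a, b):
    """Square root of the squared Euclidean distance."""
    return math.sqrt(euclidean_distance_square_sketch(a, b))


distance = euclidean_distance_sketch([0.0, 0.0], [3.0, 4.0])
squared = euclidean_distance_square_sketch([0.0, 0.0], [3.0, 4.0])
```

When distances are only compared with each other, the squared form avoids the square root and preserves the ordering.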

## ◆ euclidean_distance_square()

 def pyclustering.utils.euclidean_distance_square ( a, b )

Calculate squared Euclidean distance between vectors a and b.

Parameters
 [in] a (list): The first vector.
 [in] b (list): The second vector.
Returns
(double) Squared Euclidean distance between two vectors.

Definition at line 287 of file __init__.py.

## ◆ extract_number_oscillations()

 def pyclustering.utils.extract_number_oscillations ( osc_dyn, index = 0, amplitude_threshold = 1.0 )

Extracts number of oscillations of specified oscillator.

Parameters
 [in] osc_dyn (list): Dynamic of oscillators.
 [in] index (uint): Index of oscillator in dynamic.
 [in] amplitude_threshold (double): Amplitude threshold when oscillation is taken into account, for example, when oscillator amplitude is greater than threshold then oscillation is incremented.
Returns
(uint) Number of oscillations of specified oscillator.

Definition at line 592 of file __init__.py.
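
A simplified way to picture threshold-based oscillation counting is to count rising edges: each time the selected oscillator's output moves from at or below the threshold to above it. The sketch below assumes that `osc_dyn` is a list of time steps, each holding one output value per oscillator; the name `count_oscillations_sketch` is hypothetical and the library's exact counting rule may differ.

```python
def count_oscillations_sketch(osc_dyn, index=0, amplitude_threshold=1.0):
    """Count how many times oscillator 'index' rises above the threshold."""
    count = 0
    above = False  # True while the output stays above the threshold
    for state in osc_dyn:
        value = state[index]
        if value > amplitude_threshold:
            if not above:
                count += 1  # a new rising edge starts one oscillation
            above = True
        else:
            above = False
    return count


# One oscillator, six time steps: two separate excursions above 1.0.
dynamic = [[0.0], [2.0], [0.0], [2.0], [2.0], [0.0]]
oscillations = count_oscillations_sketch(dynamic, index=0, amplitude_threshold=1.0)
```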

## ◆ find_left_element()

 def pyclustering.utils.find_left_element ( sorted_data, right, comparator )

Returns the index of the leftmost element in sorted_data that has the same value as the element at the right border.

The element at the right border is the search target. sorted_data must be a sorted collection. The algorithm is based on binary search and has O(log(n)) complexity.

Parameters
 [in] sorted_data: Input data in which the element is searched.
 [in] right: Index of the right element from which the search starts.
 [in] comparator: Comparison function object that returns True if the first argument is less than the second.
Returns
Index of the leftmost element in sorted_data that has the same value as the element at the right border.

Definition at line 1132 of file __init__.py.
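
The search can be sketched as a standard lower-bound binary search restricted to the range that ends at `right`. This is an illustrative version written from the description, with the hypothetical name `find_left_element_sketch`; it is not copied from the library.

```python
def find_left_element_sketch(sorted_data, right, comparator):
    """Return the smallest index whose element is not less than sorted_data[right]."""
    if len(sorted_data) == 0:
        raise ValueError("Input data is empty.")

    target = sorted_data[right]
    left = 0
    while left < right:
        middle = (left + right) // 2
        if comparator(sorted_data[middle], target):
            # Element is less than the target, so the answer lies to the right.
            left = middle + 1
        else:
            # Element equals the target, so the answer is here or to the left.
            right = middle
    return left


values = [1, 2, 2, 2, 3]
leftmost_two = find_left_element_sketch(values, 3, lambda x, y: x < y)
```

Each iteration halves the search range, which gives the O(log(n)) complexity mentioned above.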

## ◆ gray_pattern_borders()

 def pyclustering.utils.gray_pattern_borders ( image )

Returns coordinates of gray image content on the input image.

Parameters
 [in] image (Image): PIL Image instance that is processed.
Returns
(tuple) Returns coordinates of gray image content as (width_start, height_start, width_end, height_end).

Definition at line 138 of file __init__.py.

Referenced by pyclustering.utils.stretch_pattern().

## ◆ heaviside()

 def pyclustering.utils.heaviside ( value )

Calculates Heaviside function that represents step function.

If input value is greater than 0 then returns 1, otherwise returns 0.

Parameters
 [in] value (double): Argument of Heaviside function.
Returns
(double) Value of Heaviside function.

Definition at line 557 of file __init__.py.
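
The rule stated above fits in a single expression. A minimal sketch, with the hypothetical name `heaviside_sketch`:

```python
def heaviside_sketch(value):
    """Step function: 1.0 for values greater than 0, otherwise 0.0."""
    return 1.0 if value > 0.0 else 0.0


steps = [heaviside_sketch(v) for v in (-2.0, 0.0, 0.5)]
```

Note that with this rule an input of exactly 0 maps to 0.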

## ◆ linear_sum()

 def pyclustering.utils.linear_sum ( list_vector )

Calculates linear sum of vector that is represented by list, each element can be represented by list - multidimensional elements.

Parameters
 [in] list_vector (list): Input vector.
Returns
(list|double) Linear sum of vector that can be represented by list in case of multidimensional elements.

Definition at line 1169 of file __init__.py.
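
The difference between scalar and multidimensional elements is easiest to see in code. The sketch below is an assumption-based illustration with the hypothetical name `linear_sum_sketch`: scalars are summed into one number, while list elements are summed dimension by dimension.

```python
def linear_sum_sketch(list_vector):
    """Sum a list of numbers, or sum a list of equal-length lists per dimension."""
    if len(list_vector) == 0:
        return 0.0
    if isinstance(list_vector[0], (list, tuple)):
        # Multidimensional elements: one sum per coordinate.
        return [sum(column) for column in zip(*list_vector)]
    return sum(list_vector)


scalar_sum = linear_sum_sketch([1, 2, 3])
vector_sum = linear_sum_sketch([[1, 2], [3, 4]])
```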

## ◆ list_math_addition()

 def pyclustering.utils.list_math_addition ( a, b )

Each element from list 'a' is added to element from list 'b' accordingly.

Parameters
 [in] a (list): List of elements that supports mathematical addition.
 [in] b (list): List of elements that supports mathematical addition.
Returns
(list) Result of addition of two lists.

Definition at line 1246 of file __init__.py.

Referenced by pyclustering.utils.variance_increase_distance().

## ◆ list_math_addition_number()

 def pyclustering.utils.list_math_addition_number ( a, b )

Each element from list 'a' is added to number 'b'.

Parameters
 [in] a (list): List of elements that supports mathematical addition.
 [in] b (double): Value that supports mathematical addition.
Returns
(list) Result of addition between list and number.

Definition at line 1260 of file __init__.py.

## ◆ list_math_division()

 def pyclustering.utils.list_math_division ( a, b )

Division of two lists.

Each element from list 'a' is divided by element from list 'b' accordingly.

Parameters
 [in] a (list): List of elements that supports mathematical division.
 [in] b (list): List of elements that supports mathematical division.
Returns
(list) Result of division of two lists.

Definition at line 1288 of file __init__.py.

## ◆ list_math_division_number()

 def pyclustering.utils.list_math_division_number ( a, b )

Division between list and number.

Each element from list 'a' is divided by number 'b'.

Parameters
 [in] a (list): List of elements that supports mathematical division.
 [in] b (double): Value that supports mathematical division.
Returns
(list) Result of division between list and number.

Definition at line 1274 of file __init__.py.

Referenced by pyclustering.utils.variance_increase_distance().

## ◆ list_math_multiplication()

 def pyclustering.utils.list_math_multiplication ( a, b )

Multiplication of two lists.

Each element from list 'a' is multiplied by element from list 'b' accordingly.

Parameters
 [in] a (list): List of elements that supports mathematical multiplication.
 [in] b (list): List of elements that supports mathematical multiplication.
Returns
(list) Result of multiplication of elements in two lists.

Definition at line 1316 of file __init__.py.

Referenced by pyclustering.utils.square_sum().

## ◆ list_math_multiplication_number()

 def pyclustering.utils.list_math_multiplication_number ( a, b )

Multiplication between list and number.

Each element from list 'a' is multiplied by number 'b'.

Parameters
 [in] a (list): List of elements that supports mathematical multiplication.
 [in] b (double): Number that supports mathematical multiplication.
Returns
(list) Result of multiplication between list and number.

Definition at line 1302 of file __init__.py.

## ◆ list_math_substraction_number()

 def pyclustering.utils.list_math_substraction_number ( a, b )

Calculates subtraction between list and number.

Number 'b' is subtracted from each element of list 'a'.

Parameters
 [in] a (list): List of elements that supports mathematical subtraction.
 [in] b (double): Value that supports mathematical subtraction.
Returns
(list) Result of subtraction between list and number.

Definition at line 1232 of file __init__.py.

## ◆ list_math_subtraction()

 def pyclustering.utils.list_math_subtraction ( a, b )

Calculates subtraction of two lists.

Each element of list 'b' is subtracted from the corresponding element of list 'a'.

Parameters
 [in] a (list): List of elements that supports mathematical subtraction.
 [in] b (list): List of elements that supports mathematical subtraction.
Returns
(list) Result of subtraction of two lists.

Definition at line 1218 of file __init__.py.
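
The list_math_* helpers all follow the same two patterns: list with list (pairwise) and list with number (broadcast). The sketch below shows both patterns for subtraction; the function names are hypothetical and the code is an illustration of the documented behaviour, not the library source.

```python
def list_subtraction_sketch(a, b):
    """Pairwise: result[i] = a[i] - b[i]."""
    return [x - y for x, y in zip(a, b)]


def list_subtraction_number_sketch(a, b):
    """Broadcast: result[i] = a[i] - b for a single number b."""
    return [x - b for x in a]


pairwise = list_subtraction_sketch([5, 7, 9], [1, 2, 3])
broadcast = list_subtraction_number_sketch([5, 7, 9], 2)
```

The addition, division, and multiplication variants differ only in the operator applied to each pair.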

## ◆ manhattan_distance()

 def pyclustering.utils.manhattan_distance ( a, b )

Calculate Manhattan distance between vector a and b.

Parameters
 [in] a (list): The first vector.
 [in] b (list): The second vector.
Returns
(double) Manhattan distance between two vectors.

Definition at line 308 of file __init__.py.
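
The Manhattan distance is the sum of absolute coordinate differences. A minimal sketch, with the hypothetical name `manhattan_distance_sketch`:

```python
def manhattan_distance_sketch(a, b):
    """Sum of |a[i] - b[i]| over all dimensions."""
    return sum(abs(a_i - b_i) for a_i, b_i in zip(a, b))


distance = manhattan_distance_sketch([1, 2], [4, 6])
```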

## ◆ medoid()

 def pyclustering.utils.medoid ( data, indexes = None, ** kwargs )

Calculate medoid for input points.

Parameters
 [in] data (list): Set of points for which the medoid should be calculated.
 [in] indexes (list): Indexes of input set of points that will be taken into account during medoid calculation.
 [in] **kwargs: Arbitrary keyword arguments (available arguments: 'metric', 'data_type').

Keyword Args:

• metric (distance_metric): Metric that is used for distance calculation between two points.
• data_type (string): Data type of input sample 'data' (available values: 'points', 'distance_matrix').
Returns
(uint) Index of the point in the input set that corresponds to the medoid.

Definition at line 213 of file __init__.py.
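
A medoid is the point of the set whose total distance to all other considered points is smallest. The following sketch shows the idea for the 'points' data type with Euclidean distance; the name `medoid_sketch` is hypothetical, the keyword arguments are left out, and ties are resolved in favour of the first candidate, which is an assumption.

```python
import math


def medoid_sketch(data, indexes=None):
    """Return the index of the point with the smallest total distance to the others."""
    candidates = list(range(len(data))) if indexes is None else list(indexes)

    best_index = None
    best_total = math.inf
    for i in candidates:
        total = sum(math.dist(data[i], data[j]) for j in candidates)
        if total < best_total:
            best_index, best_total = i, total
    return best_index


points = [[0.0, 0.0], [1.0, 0.0], [10.0, 0.0]]
medoid_index = medoid_sketch(points)
```

Unlike a mean, the result is always an index of an existing point.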

## ◆ norm_vector()

 def pyclustering.utils.norm_vector ( vector )

Calculates norm of an input vector that is known as a vector length.

Parameters
 [in] vector (list): The input vector whose length is calculated.
Returns
(double) vector norm known as vector length.

Definition at line 538 of file __init__.py.
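
The vector norm (length) is the square root of the sum of squared components. A minimal sketch, with the hypothetical name `norm_vector_sketch`:

```python
import math


def norm_vector_sketch(vector):
    """Euclidean length of a vector given as a list of numbers."""
    return math.sqrt(sum(component * component for component in vector))


length = norm_vector_sketch([3.0, 4.0])
```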

Returns image as N-dimension (depends on the input image) matrix, where one element of list describes pixel.

Parameters
 [in] filename (string): Path to image.
Returns
(list) Pixels where each pixel described by list of RGB-values.

Definition at line 69 of file __init__.py.

Referenced by pyclustering.utils.sampling.reservoir_x().

Returns data sample from simple text file.

This function should be used for text file with following format:

point_1_coord_1 point_1_coord_2 ... point_1_coord_n
point_2_coord_1 point_2_coord_2 ... point_2_coord_n
... ...
Parameters
 [in] filename (string): Path to file with data.
Returns
(list) Points where each point represented by list of coordinates.

Definition at line 30 of file __init__.py.

Referenced by pyclustering.utils.sampling.reservoir_x().
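
The text format above (one point per line, coordinates separated by whitespace) can be parsed as in the sketch below. It works on a string rather than a file path to stay self-contained; the name `parse_sample_text` is hypothetical, and skipping blank lines is an assumption.

```python
def parse_sample_text(text):
    """Parse whitespace-separated coordinates, one point per non-empty line."""
    points = []
    for line in text.splitlines():
        if line.strip():
            points.append([float(value) for value in line.split()])
    return points


sample_text = "1.0 2.0\n3.5 4.5\n\n0.0 -1.0\n"
sample = parse_sample_text(sample_text)
```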

## ◆ rgb2gray()

 def pyclustering.utils.rgb2gray ( image_rgb_array )

Returns image as 1-dimension (gray colored) matrix, where one element of list describes pixel.

Luma coding is used for transformation and that is calculated directly from gamma-compressed primary intensities as a weighted sum:

$Y = 0.2989R + 0.587G + 0.114B$

Parameters
 [in] image_rgb_array (list): Image represented by RGB list.
Returns
(list) Image as gray colored matrix, where one element of list describes pixel.

gray_image = rgb2gray(colored_image);

Definition at line 84 of file __init__.py.
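
The weighted sum above can be applied pixel by pixel. The sketch below assumes that the image is a list of pixels and that each pixel starts with its R, G, and B values; the name `rgb2gray_sketch` is hypothetical and this is not the library source.

```python
def rgb2gray_sketch(image_rgb_array):
    """Convert each RGB pixel to one luma value: Y = 0.2989 R + 0.587 G + 0.114 B."""
    gray = []
    for pixel in image_rgb_array:
        red, green, blue = pixel[0], pixel[1], pixel[2]
        gray.append(0.2989 * red + 0.587 * green + 0.114 * blue)
    return gray


colored_image = [[0, 0, 0], [255, 0, 0], [0, 255, 0]]
gray_image = rgb2gray_sketch(colored_image)
```

Green carries the largest weight, so a pure green pixel ends up brighter than a pure red one.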

## ◆ set_ax_param()

 def pyclustering.utils.set_ax_param ( ax, x_title = None, y_title = None, x_lim = None, y_lim = None, x_labels = True, y_labels = True, grid = True )

Sets parameters for matplotlib ax.

Parameters
 [in] ax (Axes): Axes for which parameters should be applied.
 [in] x_title (string): Title for X axis.
 [in] y_title (string): Title for Y axis.
 [in] x_lim (double): X limit.
 [in] y_lim (double): Y limit.
 [in] x_labels (bool): If True, shows X labels.
 [in] y_labels (bool): If True, shows Y labels.
 [in] grid (bool): If True, shows grid.

Definition at line 913 of file __init__.py.

Referenced by pyclustering.utils.draw_dynamics().

## ◆ square_sum()

 def pyclustering.utils.square_sum ( list_vector )

Calculates square sum of vector that is represented by list, each element can be represented by list - multidimensional elements.

Parameters
 [in] list_vector (list): Input vector.
Returns
(double) Square sum of vector.

Definition at line 1196 of file __init__.py.

## ◆ stretch_pattern()

 def pyclustering.utils.stretch_pattern ( image_source )

Returns stretched content as 1-dimension (gray colored) matrix with size of input image.

Parameters
 [in] image_source (Image): PIL Image instance.
Returns
(list, Image) Stretched image as gray colored matrix and source image.

Definition at line 113 of file __init__.py.

## ◆ timedcall()

 def pyclustering.utils.timedcall ( executable_function, * args, ** kwargs )

Executes specified method or function with measuring of execution time.

Parameters
 [in] executable_function (pointer): Pointer to a function or method that should be called.
 [in] *args: Arguments of the called function or method.
 [in] **kwargs: Arbitrary keyword arguments of the called function or method.
Returns
(tuple) Execution time and result of execution of function or method (execution_time, result_execution).

Definition at line 573 of file __init__.py.
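
A timing wrapper with this contract can be sketched with the standard library. The name `timedcall_sketch` is hypothetical, and the choice of `time.perf_counter` as the clock is an assumption; the library may use a different timer.

```python
import time


def timedcall_sketch(executable_function, *args, **kwargs):
    """Call the function and return (execution_time_in_seconds, result)."""
    start = time.perf_counter()
    result = executable_function(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return elapsed, result


# Positional and keyword arguments are forwarded unchanged.
execution_time, ordered = timedcall_sketch(sorted, [3, 1, 2], reverse=True)
```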

## ◆ variance_increase_distance()

 def pyclustering.utils.variance_increase_distance ( cluster1, cluster2, data = None )

Calculates variance increase distance between two clusters.

Clusters can be represented by list of coordinates (in this case data shouldn't be specified), or by list of indexes of points from the data (represented by list of points), in this case data should be specified.

Parameters
 [in] cluster1 (list): The first cluster.
 [in] cluster2 (list): The second cluster.
 [in] data (list): If specified then elements of clusters will be used as indexes, otherwise elements of clusters will be considered as points.
Returns
(double) Average variance increase distance between two clusters.

Definition at line 413 of file __init__.py.