pyclustering  0.10.1
pyclustering is a Python, C++ data mining library.
pyclustering.utils.metric Namespace Reference

Module provides various distance metrics - abstraction of the notion of distance in a metric space.

Classes

class  distance_metric
 Performs distance calculation between two points using an encapsulated function, for example, Euclidean distance, Chebyshev distance, or a user-defined function.
 
class  type_metric
 Enumeration of supported metrics in the module for distance calculation between two points.
 

Functions

def euclidean_distance (point1, point2)
 Calculate Euclidean distance between two vectors.
 
def euclidean_distance_numpy (object1, object2)
 Calculate Euclidean distance between two objects using numpy.
 
def euclidean_distance_square (point1, point2)
 Calculate square Euclidean distance between two vectors.
 
def euclidean_distance_square_numpy (object1, object2)
 Calculate square Euclidean distance between two objects using numpy.
 
def manhattan_distance (point1, point2)
 Calculate Manhattan distance between two vectors.
 
def manhattan_distance_numpy (object1, object2)
 Calculate Manhattan distance between two objects using numpy.
 
def chebyshev_distance (point1, point2)
 Calculate Chebyshev distance (maximum metric) between two vectors.
 
def chebyshev_distance_numpy (object1, object2)
 Calculate Chebyshev distance between two objects using numpy.
 
def minkowski_distance (point1, point2, degree=2)
 Calculate Minkowski distance between two vectors.
 
def minkowski_distance_numpy (object1, object2, degree=2)
 Calculate Minkowski distance between two objects using numpy.
 
def canberra_distance (point1, point2)
 Calculate Canberra distance between two vectors.
 
def canberra_distance_numpy (object1, object2)
 Calculate Canberra distance between two objects using numpy.
 
def chi_square_distance (point1, point2)
 Calculate Chi square distance between two vectors.
 
def chi_square_distance_numpy (object1, object2)
 Calculate Chi square distance between two vectors using numpy.
 
def gower_distance (point1, point2, max_range)
 Calculate Gower distance between two vectors.
 
def gower_distance_numpy (point1, point2, max_range)
 Calculate Gower distance between two vectors using numpy.
 

Detailed Description

Module provides various distance metrics - abstraction of the notion of distance in a metric space.

Authors
Andrei Novikov (pyclustering@yandex.ru)
Date
2014-2020

Function Documentation

◆ canberra_distance()

def pyclustering.utils.metric.canberra_distance (   point1,
  point2 
)

Calculate Canberra distance between two vectors.

\[ dist(a, b) = \sum_{i=0}^{N}\frac{\left | a_{i} - b_{i} \right |}{\left | a_{i} \right | + \left | b_{i} \right |}; \]

Parameters
[in]point1(array_like): The first vector.
[in]point2(array_like): The second vector.
Returns
(float) Canberra distance between two objects.

Definition at line 501 of file metric.py.
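
The formula above can be illustrated with a minimal pure-Python sketch. The helper name canberra_sketch is hypothetical and this is not the library's implementation; skipping terms whose denominator is zero is an assumption made here to avoid division by zero.

```python
def canberra_sketch(point1, point2):
    """Sum of |a_i - b_i| / (|a_i| + |b_i|) over all coordinates."""
    total = 0.0
    for a, b in zip(point1, point2):
        denominator = abs(a) + abs(b)
        if denominator == 0.0:
            # Assumption: a 0/0 term contributes nothing to the sum.
            continue
        total += abs(a - b) / denominator
    return total


# Terms: 2/4 = 0.5, 0/4 = 0.0, and one skipped 0/0 term.
result = canberra_sketch([1.0, 2.0, 0.0], [3.0, 2.0, 0.0])
print(result)  # 0.5
```

Because each term is divided by the magnitudes of its own coordinates, differences near zero weigh more than equal differences between large values.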

◆ canberra_distance_numpy()

def pyclustering.utils.metric.canberra_distance_numpy (   object1,
  object2 
)

Calculate Canberra distance between two objects using numpy.

Parameters
[in]object1(array_like): The first vector.
[in]object2(array_like): The second vector.
Returns
(float) Canberra distance between two objects.

Definition at line 526 of file metric.py.

◆ chebyshev_distance()

def pyclustering.utils.metric.chebyshev_distance (   point1,
  point2 
)

Calculate Chebyshev distance (maximum metric) between two vectors.

Chebyshev distance is a metric defined on a vector space where the distance between two vectors is the greatest of their differences along any coordinate dimension.

\[ dist(a, b) = \max_{i}\left (\left | a_{i} - b_{i} \right |\right ); \]

Parameters
[in]point1(array_like): The first vector.
[in]point2(array_like): The second vector.
Returns
(double) Chebyshev distance between two vectors.
See also
euclidean_distance_square, euclidean_distance, minkowski_distance

Definition at line 417 of file metric.py.
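
As an illustration of the maximum-metric definition above, here is a minimal pure-Python sketch. The helper name chebyshev_sketch is hypothetical and the code is not taken from metric.py.

```python
def chebyshev_sketch(point1, point2):
    """Largest absolute coordinate-wise difference between two vectors."""
    return max(abs(a - b) for a, b in zip(point1, point2))


# Coordinate differences are 3, 4 and 0, so the maximum is 4.
result = chebyshev_sketch([1.0, 5.0, 2.0], [4.0, 1.0, 2.0])
print(result)  # 4.0
```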

◆ chebyshev_distance_numpy()

def pyclustering.utils.metric.chebyshev_distance_numpy (   object1,
  object2 
)

Calculate Chebyshev distance between two objects using numpy.

Parameters
[in]object1(array_like): The first array_like object.
[in]object2(array_like): The second array_like object.
Returns
(double) Chebyshev distance between two objects.

Definition at line 444 of file metric.py.

◆ chi_square_distance()

def pyclustering.utils.metric.chi_square_distance (   point1,
  point2 
)

Calculate Chi square distance between two vectors.

\[ dist(a, b) = \sum_{i=0}^{N}\frac{\left ( a_{i} - b_{i} \right )^{2}}{\left | a_{i} \right | + \left | b_{i} \right |}; \]

Parameters
[in]point1(array_like): The first vector.
[in]point2(array_like): The second vector.
Returns
(float) Chi square distance between two objects.

Definition at line 545 of file metric.py.
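
A minimal pure-Python sketch of the formula above follows. The helper name chi_square_sketch is hypothetical, and skipping terms with a zero denominator is an assumption made here rather than documented behavior.

```python
def chi_square_sketch(point1, point2):
    """Sum of (a_i - b_i)^2 / (|a_i| + |b_i|) over all coordinates."""
    total = 0.0
    for a, b in zip(point1, point2):
        denominator = abs(a) + abs(b)
        if denominator == 0.0:
            # Assumption: a 0/0 term contributes nothing to the sum.
            continue
        total += (a - b) ** 2 / denominator
    return total


# Terms: 4/4 = 1.0, 0/4 = 0.0, 4/4 = 1.0.
result = chi_square_sketch([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
print(result)  # 2.0
```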

◆ chi_square_distance_numpy()

def pyclustering.utils.metric.chi_square_distance_numpy (   object1,
  object2 
)

Calculate Chi square distance between two vectors using numpy.

Parameters
[in]object1(array_like): The first vector.
[in]object2(array_like): The second vector.
Returns
(float) Chi square distance between two objects.

Definition at line 568 of file metric.py.

◆ euclidean_distance()

def pyclustering.utils.metric.euclidean_distance (   point1,
  point2 
)

Calculate Euclidean distance between two vectors.

The Euclidean distance between vectors (points) a and b is calculated by the following formula:

\[ dist(a, b) = \sqrt{ \sum_{i=0}^{N}(a_{i} - b_{i})^{2} }; \]

where N is the length of each vector.

Parameters
[in]point1(array_like): The first vector.
[in]point2(array_like): The second vector.
Returns
(double) Euclidean distance between two vectors.
See also
euclidean_distance_square, manhattan_distance, chebyshev_distance

Definition at line 298 of file metric.py.
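
The relationship between this function and euclidean_distance_square can be sketched in pure Python as the square root of the squared distance. Both helper names below are hypothetical, and the code only illustrates the formula; it is not the library's source.

```python
import math


def euclidean_square_sketch(point1, point2):
    """Sum of squared coordinate-wise differences."""
    return sum((a - b) ** 2 for a, b in zip(point1, point2))


def euclidean_sketch(point1, point2):
    """Square root of the squared Euclidean distance."""
    return math.sqrt(euclidean_square_sketch(point1, point2))


# Classic 3-4-5 right triangle.
result = euclidean_sketch([0.0, 0.0], [3.0, 4.0])
print(result)  # 5.0
```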

◆ euclidean_distance_numpy()

def pyclustering.utils.metric.euclidean_distance_numpy (   object1,
  object2 
)

Calculate Euclidean distance between two objects using numpy.

Parameters
[in]object1(array_like): The first array_like object.
[in]object2(array_like): The second array_like object.
Returns
(double) Euclidean distance between two objects.

Definition at line 321 of file metric.py.

◆ euclidean_distance_square()

def pyclustering.utils.metric.euclidean_distance_square (   point1,
  point2 
)

Calculate square Euclidean distance between two vectors.

\[ dist(a, b) = \sum_{i=0}^{N}(a_{i} - b_{i})^{2}; \]

Parameters
[in]point1(array_like): The first vector.
[in]point2(array_like): The second vector.
Returns
(double) Square Euclidean distance between two vectors.
See also
euclidean_distance, manhattan_distance, chebyshev_distance

Definition at line 337 of file metric.py.

Referenced by pyclustering.utils.metric.euclidean_distance().
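
Squared distance preserves the ordering of Euclidean distances, so it is enough when distances are only compared, and it avoids the square root. The sketch below illustrates this with a nearest-point search; the helper name euclidean_square_sketch and the sample data are hypothetical.

```python
def euclidean_square_sketch(point1, point2):
    """Sum of squared coordinate-wise differences."""
    return sum((a - b) ** 2 for a, b in zip(point1, point2))


query = [0.0, 0.0]
candidates = [[3.0, 4.0], [1.0, 1.0], [6.0, 0.0]]

# Squared distances are 25, 2 and 36; the smallest identifies the nearest point.
nearest = min(candidates, key=lambda c: euclidean_square_sketch(query, c))
squared = euclidean_square_sketch(query, candidates[0])
print(nearest, squared)  # [1.0, 1.0] 25.0
```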

◆ euclidean_distance_square_numpy()

def pyclustering.utils.metric.euclidean_distance_square_numpy (   object1,
  object2 
)

Calculate square Euclidean distance between two objects using numpy.

Parameters
[in]object1(array_like): The first array_like object.
[in]object2(array_like): The second array_like object.
Returns
(double) Square Euclidean distance between two objects.

Definition at line 360 of file metric.py.

◆ gower_distance()

def pyclustering.utils.metric.gower_distance (   point1,
  point2,
  max_range 
)

Calculate Gower distance between two vectors.

Implementation is based on the paper [13]. Gower distance is calculated using the following formula:

\[ dist\left ( a, b \right )=\frac{1}{p}\sum_{i=0}^{p}\frac{\left | a_{i} - b_{i} \right |}{R_{i}}, \]

where \(R_{i}\) is the max range for the i-th dimension. \(R\) is defined by the following formula:

\[ R=max\left ( X \right )-min\left ( X \right ) \]

Parameters
[in]point1(array_like): The first vector.
[in]point2(array_like): The second vector.
[in]max_range(array_like): Max range in each data dimension.
Returns
(float) Gower distance between two objects.

Definition at line 587 of file metric.py.

Referenced by pyclustering.utils.metric.distance_metric.disable_numpy_usage().
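
The two formulas above can be combined in a minimal pure-Python sketch: first derive the per-dimension range R from a data set X, then average the range-normalized differences. The helper names max_range_sketch and gower_sketch and the sample data are hypothetical, and skipping dimensions with zero range is an assumption made here.

```python
def max_range_sketch(data):
    """Per-dimension range R = max(X) - min(X) of a data set."""
    columns = zip(*data)
    return [max(column) - min(column) for column in columns]


def gower_sketch(point1, point2, max_range):
    """Mean of |a_i - b_i| / R_i over all p dimensions."""
    total = 0.0
    for a, b, r in zip(point1, point2, max_range):
        if r == 0.0:
            # Assumption: a constant dimension contributes nothing.
            continue
        total += abs(a - b) / r
    return total / len(point1)


data = [[1.0, 10.0], [3.0, 20.0], [5.0, 30.0]]
ranges = max_range_sketch(data)                   # [4.0, 20.0]
result = gower_sketch(data[0], data[1], ranges)   # (2/4 + 10/20) / 2
print(ranges, result)  # [4.0, 20.0] 0.5
```

Dividing by the range puts every dimension on a comparable scale, so a dimension with large raw values does not dominate the result.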

◆ gower_distance_numpy()

def pyclustering.utils.metric.gower_distance_numpy (   point1,
  point2,
  max_range 
)

Calculate Gower distance between two vectors using numpy.

Parameters
[in]point1(array_like): The first vector.
[in]point2(array_like): The second vector.
[in]max_range(array_like): Max range in each data dimension.
Returns
(float) Gower distance between two objects.

Definition at line 618 of file metric.py.

Referenced by pyclustering.utils.metric.distance_metric.disable_numpy_usage().

◆ manhattan_distance()

def pyclustering.utils.metric.manhattan_distance (   point1,
  point2 
)

Calculate Manhattan distance between two vectors.

\[ dist(a, b) = \sum_{i=0}^{N}\left | a_{i} - b_{i} \right |; \]

Parameters
[in]point1(array_like): The first vector.
[in]point2(array_like): The second vector.
Returns
(double) Manhattan distance between two vectors.
See also
euclidean_distance_square, euclidean_distance, chebyshev_distance

Definition at line 376 of file metric.py.
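
A minimal pure-Python sketch of the formula above is shown here for illustration. The helper name manhattan_sketch is hypothetical and the code is not taken from metric.py.

```python
def manhattan_sketch(point1, point2):
    """Sum of absolute coordinate-wise differences."""
    return sum(abs(a - b) for a, b in zip(point1, point2))


# Coordinate differences are 3, 2 and 0.
result = manhattan_sketch([1.0, 2.0, 3.0], [4.0, 0.0, 3.0])
print(result)  # 5.0
```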

◆ manhattan_distance_numpy()

def pyclustering.utils.metric.manhattan_distance_numpy (   object1,
  object2 
)

Calculate Manhattan distance between two objects using numpy.

Parameters
[in]object1(array_like): The first array_like object.
[in]object2(array_like): The second array_like object.
Returns
(double) Manhattan distance between two objects.

Definition at line 401 of file metric.py.

◆ minkowski_distance()

def pyclustering.utils.metric.minkowski_distance (   point1,
  point2,
  degree = 2 
)

Calculate Minkowski distance between two vectors.

\[ dist(a, b) = \sqrt[p]{ \sum_{i=0}^{N}\left(a_{i} - b_{i}\right)^{p} }; \]

Parameters
[in]point1(array_like): The first vector.
[in]point2(array_like): The second vector.
[in]degree(numeric): Degree that is used for Minkowski distance.
Returns
(double) Minkowski distance between two vectors.
See also
euclidean_distance

Definition at line 460 of file metric.py.

Referenced by pyclustering.utils.metric.distance_metric.disable_numpy_usage().
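
The effect of the degree parameter can be illustrated with a minimal pure-Python sketch. The helper name minkowski_sketch is hypothetical. Note one assumption: the sketch takes the absolute value of each difference, as in the standard Minkowski definition, whereas the formula above is written without it; the two agree for even degrees.

```python
def minkowski_sketch(point1, point2, degree=2):
    """p-th root of the sum of |a_i - b_i|^p, where p is the degree."""
    # Assumption: absolute differences are used (standard definition).
    total = sum(abs(a - b) ** degree for a, b in zip(point1, point2))
    return total ** (1.0 / degree)


point1, point2 = [3.0, 4.0], [0.0, 0.0]
as_manhattan = minkowski_sketch(point1, point2, degree=1)  # 3 + 4
as_euclidean = minkowski_sketch(point1, point2, degree=2)  # sqrt(9 + 16)
print(as_manhattan, as_euclidean)  # 7.0 5.0
```

With degree 1 the result matches the Manhattan distance and with degree 2 the Euclidean distance, which is why the default degree of 2 behaves like euclidean_distance.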

◆ minkowski_distance_numpy()

def pyclustering.utils.metric.minkowski_distance_numpy (   object1,
  object2,
  degree = 2 
)

Calculate Minkowski distance between objects using numpy.

Parameters
[in]object1(array_like): The first array_like object.
[in]object2(array_like): The second array_like object.
[in]degree(numeric): Degree that is used for Minkowski distance.
Returns
(double) Minkowski distance between two objects.

Definition at line 484 of file metric.py.

Referenced by pyclustering.utils.metric.distance_metric.disable_numpy_usage().