pyclustering
0.10.1
pyclustering is a Python, C++ data mining library.
Module provides various distance metrics - abstraction of the notion of distance in a metric space.
Classes
class distance_metric
Distance metric performs distance calculation between two points using an encapsulated function, for example Euclidean distance, Chebyshev distance, or a user-defined function.
class type_metric
Enumeration of supported metrics in the module for distance calculation between two points.
Functions
def euclidean_distance(point1, point2)
Calculate Euclidean distance between two vectors.
def euclidean_distance_numpy(object1, object2)
Calculate Euclidean distance between two objects using numpy.
def euclidean_distance_square(point1, point2)
Calculate square Euclidean distance between two vectors.
def euclidean_distance_square_numpy(object1, object2)
Calculate square Euclidean distance between two objects using numpy.
def manhattan_distance(point1, point2)
Calculate Manhattan distance between two vectors.
def manhattan_distance_numpy(object1, object2)
Calculate Manhattan distance between two objects using numpy.
def chebyshev_distance(point1, point2)
Calculate Chebyshev distance (maximum metric) between two vectors.
def chebyshev_distance_numpy(object1, object2)
Calculate Chebyshev distance between two objects using numpy.
def minkowski_distance(point1, point2, degree=2)
Calculate Minkowski distance between two vectors.
def minkowski_distance_numpy(object1, object2, degree=2)
Calculate Minkowski distance between objects using numpy.
def canberra_distance(point1, point2)
Calculate Canberra distance between two vectors.
def canberra_distance_numpy(object1, object2)
Calculate Canberra distance between two objects using numpy.
def chi_square_distance(point1, point2)
Calculate Chi square distance between two vectors.
def chi_square_distance_numpy(object1, object2)
Calculate Chi square distance between two vectors using numpy.
def gower_distance(point1, point2, max_range)
Calculate Gower distance between two vectors.
def gower_distance_numpy(point1, point2, max_range)
Calculate Gower distance between two vectors using numpy.
Module provides various distance metrics - abstraction of the notion of distance in a metric space.
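The idea of a metric object that encapsulates a distance function, either one selected from an enumeration or one supplied by the user, can be sketched as follows. This is a minimal stand-in written with the standard library only; `MetricKindSketch` and `MetricSketch` are hypothetical names used for illustration and are not the library's `type_metric` and `distance_metric` classes.

```python
import math
from enum import IntEnum


class MetricKindSketch(IntEnum):
    """Hypothetical enumeration of metric kinds (illustration only)."""
    EUCLIDEAN = 0
    MANHATTAN = 1
    USER_DEFINED = 2


class MetricSketch:
    """Hypothetical wrapper that dispatches to an encapsulated distance function."""

    def __init__(self, kind, func=None):
        builtin = {
            MetricKindSketch.EUCLIDEAN: lambda p, q: math.sqrt(
                sum((a - b) ** 2 for a, b in zip(p, q))),
            MetricKindSketch.MANHATTAN: lambda p, q: sum(
                abs(a - b) for a, b in zip(p, q)),
        }
        if kind == MetricKindSketch.USER_DEFINED:
            if func is None:
                raise ValueError("a user-defined metric needs a function")
            self._calculator = func
        else:
            self._calculator = builtin[kind]

    def __call__(self, point1, point2):
        # Delegate the calculation to the encapsulated function.
        return self._calculator(point1, point2)


manhattan = MetricSketch(MetricKindSketch.MANHATTAN)
custom = MetricSketch(MetricKindSketch.USER_DEFINED,
                      func=lambda p, q: max(abs(a - b) for a, b in zip(p, q)))

manhattan_result = manhattan([1.0, 2.0], [4.0, 6.0])  # 3 + 4
custom_result = custom([1.0, 2.0], [4.0, 6.0])        # max(3, 4)
```

Calling code only ever invokes the metric object, so swapping one distance for another does not require changes elsewhere.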
def pyclustering.utils.metric.canberra_distance(point1, point2)
Calculate Canberra distance between two vectors.
\[ dist(a, b) = \sum_{i=0}^{N}\frac{\left | a_{i} - b_{i} \right |}{\left | a_{i} \right | + \left | b_{i} \right |}; \]
[in] | point1 | (array_like): The first vector. |
[in] | point2 | (array_like): The second vector. |
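The Canberra formula can be checked by hand with a small pure-Python sketch. The function name `canberra_sketch` is hypothetical, and skipping coordinates whose denominator is zero is an assumption of this sketch, since the formula itself leaves the 0/0 case undefined.

```python
def canberra_sketch(point1, point2):
    """Sum of |a_i - b_i| / (|a_i| + |b_i|) over all coordinates."""
    distance = 0.0
    for a, b in zip(point1, point2):
        denominator = abs(a) + abs(b)
        if denominator == 0.0:
            # Assumption: a 0/0 term contributes nothing to the sum.
            continue
        distance += abs(a - b) / denominator
    return distance


# Terms: |1 - 3| / (1 + 3) = 0.5, |2 - 2| / (2 + 2) = 0.0, third term skipped.
canberra_result = canberra_sketch([1.0, 2.0, 0.0], [3.0, 2.0, 0.0])
```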
def pyclustering.utils.metric.canberra_distance_numpy(object1, object2)
def pyclustering.utils.metric.chebyshev_distance(point1, point2)
Calculate Chebyshev distance (maximum metric) between two vectors.
Chebyshev distance is a metric defined on a vector space where the distance between two vectors is the greatest of their differences along any coordinate dimension.
\[ dist(a, b) = \max_{i}\left (\left | a_{i} - b_{i} \right |\right ); \]
[in] | point1 | (array_like): The first vector. |
[in] | point2 | (array_like): The second vector. |
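As a quick illustration of "the greatest of their differences along any coordinate dimension", here is a minimal pure-Python sketch. The name `chebyshev_sketch` is hypothetical and the code is not taken from the library.

```python
def chebyshev_sketch(point1, point2):
    """Largest absolute coordinate-wise difference between two vectors."""
    return max(abs(a - b) for a, b in zip(point1, point2))


# Coordinate-wise absolute differences are 3, 4 and 0, so the maximum is 4.
chebyshev_result = chebyshev_sketch([1.0, 5.0, 2.0], [4.0, 1.0, 2.0])
```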
def pyclustering.utils.metric.chebyshev_distance_numpy(object1, object2)
def pyclustering.utils.metric.chi_square_distance(point1, point2)
Calculate Chi square distance between two vectors.
\[ dist(a, b) = \sum_{i=0}^{N}\frac{\left ( a_{i} - b_{i} \right )^{2}}{\left | a_{i} \right | + \left | b_{i} \right |}; \]
[in] | point1 | (array_like): The first vector. |
[in] | point2 | (array_like): The second vector. |
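The Chi square formula differs from Canberra only in squaring the numerator, which the following pure-Python sketch makes explicit. The name `chi_square_sketch` is hypothetical, and ignoring coordinates with a zero denominator is an assumption of this sketch.

```python
def chi_square_sketch(point1, point2):
    """Sum of (a_i - b_i)^2 / (|a_i| + |b_i|) over all coordinates."""
    distance = 0.0
    for a, b in zip(point1, point2):
        denominator = abs(a) + abs(b)
        if denominator != 0.0:  # Assumption: 0/0 terms are ignored.
            distance += (a - b) ** 2 / denominator
    return distance


# Each coordinate contributes (1 - 3)^2 / (1 + 3) = 1.0, giving 2.0 in total.
chi_square_result = chi_square_sketch([1.0, 3.0], [3.0, 1.0])
```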
def pyclustering.utils.metric.chi_square_distance_numpy(object1, object2)
def pyclustering.utils.metric.euclidean_distance(point1, point2)
Calculate Euclidean distance between two vectors.
The Euclidean distance between vectors (points) a and b is calculated by the following formula:
\[ dist(a, b) = \sqrt{ \sum_{i=0}^{N}(a_{i} - b_{i})^{2} }; \]
where N is the length of each vector.
[in] | point1 | (array_like): The first vector. |
[in] | point2 | (array_like): The second vector. |
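A worked instance of the Euclidean formula, written as a pure-Python sketch with the hypothetical name `euclidean_sketch`, uses the 3-4-5 right triangle so the result is easy to verify by hand.

```python
import math


def euclidean_sketch(point1, point2):
    """Square root of the sum of squared coordinate-wise differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(point1, point2)))


# sqrt(3^2 + 4^2) = sqrt(25) = 5
euclidean_result = euclidean_sketch([0.0, 0.0], [3.0, 4.0])
```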
def pyclustering.utils.metric.euclidean_distance_numpy(object1, object2)
def pyclustering.utils.metric.euclidean_distance_square(point1, point2)
Calculate square Euclidean distance between two vectors.
\[ dist(a, b) = \sum_{i=0}^{N}(a_{i} - b_{i})^{2}; \]
[in] | point1 | (array_like): The first vector. |
[in] | point2 | (array_like): The second vector. |
Definition at line 337 of file metric.py.
Referenced by pyclustering.utils.metric.euclidean_distance().
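Because the square root is monotonic, the squared Euclidean distance ranks points in the same order as the Euclidean distance, so it is often enough when only comparisons are needed. The sketch below illustrates this with a nearest-point search; `euclidean_square_sketch` and the sample data are hypothetical, and the sketch does not claim that the library uses the function in this way.

```python
def euclidean_square_sketch(point1, point2):
    """Sum of squared coordinate-wise differences (no square root)."""
    return sum((a - b) ** 2 for a, b in zip(point1, point2))


query = [0.0, 0.0]
candidates = [[3.0, 4.0], [1.0, 1.0], [6.0, 8.0]]

# Squared distances are 25, 2 and 100, so [1.0, 1.0] is the nearest candidate.
nearest = min(candidates, key=lambda p: euclidean_square_sketch(query, p))
```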
def pyclustering.utils.metric.euclidean_distance_square_numpy(object1, object2)
def pyclustering.utils.metric.gower_distance(point1, point2, max_range)
Calculate Gower distance between two vectors.
Implementation is based on the paper [13]. Gower distance is calculated using the following formula:
\[ dist\left ( a, b \right )=\frac{1}{p}\sum_{i=0}^{p}\frac{\left | a_{i} - b_{i} \right |}{R_{i}}, \]
where \(R_{i}\) is the maximum range of the i-th dimension. \(R\) is defined by the following formula:
\[ R=max\left ( X \right )-min\left ( X \right ) \]
[in] | point1 | (array_like): The first vector. |
[in] | point2 | (array_like): The second vector. |
[in] | max_range | (array_like): Max range in each data dimension. |
Definition at line 587 of file metric.py.
Referenced by pyclustering.utils.metric.distance_metric.disable_numpy_usage().
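The two Gower formulas can be combined in a short pure-Python sketch: first derive the per-dimension range R from a data set X, then average the range-normalized differences. The names `max_range_sketch` and `gower_sketch` and the sample data are hypothetical, and skipping dimensions whose range is zero is an assumption of this sketch.

```python
def max_range_sketch(data):
    """Per-dimension range R = max(X) - min(X) over all points in data."""
    return [max(column) - min(column) for column in zip(*data)]


def gower_sketch(point1, point2, max_range):
    """Mean of |a_i - b_i| / R_i over the p dimensions."""
    dimensions = len(point1)
    distance = 0.0
    for a, b, r in zip(point1, point2, max_range):
        if r != 0.0:  # Assumption: constant dimensions contribute nothing.
            distance += abs(a - b) / r
    return distance / dimensions


data = [[0.0, 10.0], [2.0, 30.0], [4.0, 20.0]]
ranges = max_range_sketch(data)  # [4.0, 20.0]

# (|0 - 2| / 4 + |10 - 30| / 20) / 2 = (0.5 + 1.0) / 2 = 0.75
gower_result = gower_sketch(data[0], data[1], ranges)
```

Dividing by the range puts every dimension on a common 0 to 1 scale, so features measured in different units contribute comparably.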
def pyclustering.utils.metric.gower_distance_numpy(point1, point2, max_range)
Calculate Gower distance between two vectors using numpy.
[in] | point1 | (array_like): The first vector. |
[in] | point2 | (array_like): The second vector. |
[in] | max_range | (array_like): Max range in each data dimension. |
Definition at line 618 of file metric.py.
Referenced by pyclustering.utils.metric.distance_metric.disable_numpy_usage().
def pyclustering.utils.metric.manhattan_distance(point1, point2)
Calculate Manhattan distance between two vectors.
\[ dist(a, b) = \sum_{i=0}^{N}\left | a_{i} - b_{i} \right |; \]
[in] | point1 | (array_like): The first vector. |
[in] | point2 | (array_like): The second vector. |
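A small pure-Python sketch of the Manhattan formula, using the hypothetical name `manhattan_sketch`, shows the sum of absolute coordinate-wise differences on a three-dimensional example.

```python
def manhattan_sketch(point1, point2):
    """Sum of absolute coordinate-wise differences."""
    return sum(abs(a - b) for a, b in zip(point1, point2))


# |1 - 4| + |2 - 0| + |3 - 3| = 3 + 2 + 0 = 5
manhattan_result = manhattan_sketch([1.0, 2.0, 3.0], [4.0, 0.0, 3.0])
```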
def pyclustering.utils.metric.manhattan_distance_numpy(object1, object2)
def pyclustering.utils.metric.minkowski_distance(point1, point2, degree=2)
Calculate Minkowski distance between two vectors.
\[ dist(a, b) = \sqrt[p]{ \sum_{i=0}^{N}\left(a_{i} - b_{i}\right)^{p} }; \]
[in] | point1 | (array_like): The first vector. |
[in] | point2 | (array_like): The second vector. |
[in] | degree | (numeric): Degree that is used for Minkowski distance. |
Definition at line 460 of file metric.py.
Referenced by pyclustering.utils.metric.distance_metric.disable_numpy_usage().
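The role of the `degree` parameter can be illustrated with a pure-Python sketch: degree 1 reproduces the Manhattan distance and degree 2 the Euclidean distance. The name `minkowski_sketch` is hypothetical. Note one deliberate difference: this sketch takes absolute differences, as in the standard Minkowski definition, whereas the formula above is written without the absolute value; the two agree for even degrees.

```python
def minkowski_sketch(point1, point2, degree=2):
    """p-th root of the sum of |a_i - b_i|^p, with p given by degree."""
    # Assumption: absolute differences are used so odd degrees stay well defined.
    total = sum(abs(a - b) ** degree for a, b in zip(point1, point2))
    return total ** (1.0 / degree)


a, b = [0.0, 0.0], [3.0, 4.0]
degree_one = minkowski_sketch(a, b, degree=1)  # 3 + 4 = 7 (Manhattan)
degree_two = minkowski_sketch(a, b, degree=2)  # sqrt(9 + 16) = 5 (Euclidean)
```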
def pyclustering.utils.metric.minkowski_distance_numpy(object1, object2, degree=2)
Calculate Minkowski distance between objects using numpy.
[in] | object1 | (array_like): The first array_like object. |
[in] | object2 | (array_like): The second array_like object. |
[in] | degree | (numeric): Degree that is used for Minkowski distance. |
Definition at line 484 of file metric.py.
Referenced by pyclustering.utils.metric.distance_metric.disable_numpy_usage().