"""!

@brief This module provides various distance metrics, an abstraction of the notion of distance in a metric space.

@authors Andrei Novikov (pyclustering@yandex.ru)
@copyright GNU Public License

@cond GNU_PUBLIC_LICENSE
    PyClustering is free software: you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation, either version 3 of the License, or
    (at your option) any later version.

    PyClustering is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU General Public License for more details.

    You should have received a copy of the GNU General Public License
    along with this program.  If not, see <http://www.gnu.org/licenses/>.
@endcond

"""


import numpy

from enum import IntEnum


class type_metric(IntEnum):
    """!
    @brief Enumeration of the metrics that the module supports for calculating the distance between two points.

    """


class distance_metric:
    """!
    @brief Distance metric that calculates the distance between two points using an encapsulated function, for
            example, Euclidean distance, Chebyshev distance, or a user-defined function.

    Example of Euclidean distance metric:
    @code
        metric = distance_metric(type_metric.EUCLIDEAN)
        distance = metric([1.0, 2.5], [-1.2, 3.4])
    @endcode

    Example of Chebyshev distance metric:
    @code
        metric = distance_metric(type_metric.CHEBYSHEV)
        distance = metric([0.0, 0.0], [2.5, 6.0])
    @endcode

    In the following example an additional argument is specified: 'degree', which is specific to Minkowski
    distance (it is optional and equals '2' by default):
    @code
        metric = distance_metric(type_metric.MINKOWSKI, degree=4)
        distance = metric([4.0, 9.2, 1.0], [3.4, 2.5, 6.2])
    @endcode

    A user may define their own function for distance calculation:
    @code
        user_function = lambda point1, point2: point1[0] + point2[0] + 2
        metric = distance_metric(type_metric.USER_DEFINED, func=user_function)
        distance = metric([2.0, 3.0], [1.0, 3.0])
    @endcode

    """
    def __init__(self, type, **kwargs):
        """!
        @brief Creates a distance metric instance for calculating the distance between two points.

        @param[in] type (type_metric): Type of metric that is used for distance calculation.
        @param[in] **kwargs: Arbitrary keyword arguments (available arguments: 'numpy_usage', 'func', and the
                    additional arguments of specific metric types).

        <b>Keyword Args:</b><br>
            - func (callable): Callable object with two arguments (point #1 and point #2) or (object #1 and object #2) in case of numpy usage.
                                This argument is used only if the metric is 'type_metric.USER_DEFINED'.
            - degree (numeric): Only for 'type_metric.MINKOWSKI' - degree of the Minkowski equation.
            - numpy_usage (bool): If True then numpy is used for calculation (False by default).

        """

    def __call__(self, point1, point2):
        """!
        @brief Calculates distance between two points.

        @param[in] point1 (list): The first point.
        @param[in] point2 (list): The second point.

        @return (double) Distance between two points.

        """

    def get_type(self):
        """!
        @brief Return type of distance metric that is used.

        @return (type_metric) Type of distance metric.

        """

    def get_arguments(self):
        """!
        @brief Return additional arguments that are used by distance metric.

        @return (dict) Additional arguments.

        """

    def get_function(self):
        """!
        @brief Return the user-defined function that is used to calculate the distance metric.

        @return (callable): User-defined distance metric function.

        """

    def enable_numpy_usage(self):
        """!
        @brief Start using numpy for distance calculation.
        @details Useful for matrices, where numpy improves performance. Has no effect for the
                  type_metric.USER_DEFINED type.

        """
        if self.__type != type_metric.USER_DEFINED:
            ...

    def disable_numpy_usage(self):
        """!
        @brief Stop using numpy for distance calculation.
        @details Useful for a large number of small data portions, where the overhead of a numpy call exceeds
                  the cost of the calculation itself. Has no effect for the type_metric.USER_DEFINED type.

        """

    def __create_distance_calculator(self):
        """!
        @brief Creates distance metric calculator.

        @return (callable) Callable object of distance metric calculator.

        """

    def __create_distance_calculator_basic(self):
        """!
        @brief Creates distance metric calculator that does not use numpy.

        @return (callable) Callable object of distance metric calculator.

        """
        if self.__type == type_metric.EUCLIDEAN:
            return euclidean_distance

        elif self.__type == type_metric.EUCLIDEAN_SQUARE:
            return euclidean_distance_square

        elif self.__type == type_metric.MANHATTAN:
            return manhattan_distance

        elif self.__type == type_metric.CHEBYSHEV:
            return chebyshev_distance

        elif self.__type == type_metric.MINKOWSKI:
            ...

        elif self.__type == type_metric.USER_DEFINED:
            ...

        else:
            # Format the message explicitly; passing the value as a second argument leaves '%d' unexpanded.
            raise ValueError("Unknown type of metric: '%d'" % self.__type)

    def __create_distance_calculator_numpy(self):
        """!
        @brief Creates distance metric calculator that uses numpy.

        @return (callable) Callable object of distance metric calculator.

        """
        if self.__type == type_metric.EUCLIDEAN:
            return euclidean_distance_numpy

        elif self.__type == type_metric.EUCLIDEAN_SQUARE:
            return euclidean_distance_square_numpy

        elif self.__type == type_metric.MANHATTAN:
            return manhattan_distance_numpy

        elif self.__type == type_metric.CHEBYSHEV:
            return chebyshev_distance_numpy

        elif self.__type == type_metric.MINKOWSKI:
            ...

        elif self.__type == type_metric.USER_DEFINED:
            ...

        else:
            # Format the message explicitly; passing the value as a second argument leaves '%d' unexpanded.
            raise ValueError("Unknown type of metric: '%d'" % self.__type)

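
# The class documentation above describes a callable object that picks a distance function from an
# enumeration value or from a user-supplied callable. The following is only a minimal sketch of that
# pattern, not the library's implementation: MetricKind, SimpleMetric, chebyshev and manhattan are
# hypothetical names, and the enumeration values are illustrative.

```python
from enum import IntEnum


class MetricKind(IntEnum):
    # Hypothetical stand-in for type_metric; the values are illustrative only.
    CHEBYSHEV = 0
    MANHATTAN = 1
    USER_DEFINED = 2


def chebyshev(point1, point2):
    # Largest absolute coordinate difference.
    return max(abs(a - b) for a, b in zip(point1, point2))


def manhattan(point1, point2):
    # Sum of absolute coordinate differences.
    return sum(abs(a - b) for a, b in zip(point1, point2))


class SimpleMetric:
    # Hypothetical callable wrapper: the distance function is selected once, at construction.
    _TABLE = {MetricKind.CHEBYSHEV: chebyshev, MetricKind.MANHATTAN: manhattan}

    def __init__(self, kind, func=None):
        if kind == MetricKind.USER_DEFINED:
            if func is None:
                raise ValueError("USER_DEFINED requires 'func'")
            self._calculator = func
        elif kind in self._TABLE:
            self._calculator = self._TABLE[kind]
        else:
            raise ValueError("Unknown type of metric: '%d'" % kind)

    def __call__(self, point1, point2):
        # Delegate to the function chosen in the constructor.
        return self._calculator(point1, point2)


chebyshev_metric = SimpleMetric(MetricKind.CHEBYSHEV)
user_metric = SimpleMetric(MetricKind.USER_DEFINED, func=lambda p1, p2: p1[0] + p2[0] + 2)

print(chebyshev_metric([0.0, 0.0], [2.5, 6.0]))   # 6.0
print(user_metric([2.0, 3.0], [1.0, 3.0]))        # 5.0
```

# Resolving the function once in the constructor keeps each call to a single indirection, which is one
# plausible reason for creating the calculator up front rather than branching on every call.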


def euclidean_distance(point1, point2):
    """!
    @brief Calculate Euclidean distance between two vectors.
    @details The Euclidean distance between vectors (points) a and b is calculated by the following formula:

    \f[
    dist(a, b) = \sqrt{ \sum_{i=0}^{N-1}(a_{i} - b_{i})^{2} };
    \f]

    Where N is the length of each vector.

    @param[in] point1 (array_like): The first vector.
    @param[in] point2 (array_like): The second vector.

    @return (double) Euclidean distance between two vectors.

    @see euclidean_distance_square, manhattan_distance, chebyshev_distance

    """
    distance = euclidean_distance_square(point1, point2)
    return distance ** 0.5

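
# A worked instance of the Euclidean formula may help here. The sketch below uses the two points from
# the class-level example, [1.0, 2.5] and [-1.2, 3.4], and only the standard library; the variable
# names are illustrative.

```python
import math

point1 = [1.0, 2.5]
point2 = [-1.2, 3.4]

# Squared coordinate differences: (1.0 - (-1.2))^2 and (2.5 - 3.4)^2.
squares = [(a - b) ** 2 for a, b in zip(point1, point2)]

# The square root is taken once, over the whole sum.
distance = math.sqrt(sum(squares))

print(round(distance, 4))  # 2.377
```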


def euclidean_distance_numpy(object1, object2):
    """!
    @brief Calculate Euclidean distance between two objects using numpy.

    @param[in] object1 (array_like): The first array_like object.
    @param[in] object2 (array_like): The second array_like object.

    @return (double) Euclidean distance between two objects.

    """
    # Take the square root of the row-wise sum of squares (not the sum of per-element roots).
    return numpy.sqrt(numpy.sum(numpy.square(object1 - object2), axis=1)).T

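
# With numpy the order of the square root and the row-wise sum matters. The sketch below, which assumes
# two-dimensional inputs where each row is one point (the sample arrays are made up), contrasts the
# root of the sum with the sum of the roots.

```python
import numpy

# Two hypothetical 2-D inputs: each row is one point.
object1 = numpy.array([[0.0, 0.0], [1.0, 1.0]])
object2 = numpy.array([[3.0, 4.0], [1.0, 3.0]])

difference = object1 - object2

# Root of the row-wise sum of squares: the Euclidean distance of each row pair.
euclidean = numpy.sqrt(numpy.sum(numpy.square(difference), axis=1))

# Row-wise sum of per-element roots: this equals the Manhattan distance instead.
sum_of_roots = numpy.sum(numpy.sqrt(numpy.square(difference)), axis=1)

print(euclidean)     # [5. 2.]
print(sum_of_roots)  # [7. 2.]
```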


def euclidean_distance_square(point1, point2):
    """!
    @brief Calculate square Euclidean distance between two vectors.

    \f[
    dist(a, b) = \sum_{i=0}^{N-1}(a_{i} - b_{i})^{2};
    \f]

    @param[in] point1 (array_like): The first vector.
    @param[in] point2 (array_like): The second vector.

    @return (double) Square Euclidean distance between two vectors.

    @see euclidean_distance, manhattan_distance, chebyshev_distance

    """
    distance = 0.0
    for i in range(len(point1)):
        distance += (point1[i] - point2[i]) ** 2.0

    return distance

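
# One common reason to keep a separate squared variant: the square root is monotonic, so comparisons
# between distances give the same answer with or without it. A small sketch under that assumption
# (squared_euclidean, query and candidates are illustrative names and data):

```python
def squared_euclidean(point1, point2):
    # Sum of squared coordinate differences, without the square root.
    return sum((a - b) ** 2 for a, b in zip(point1, point2))


query = [0.0, 0.0]
candidates = [[3.0, 4.0], [1.0, 1.0], [0.0, 2.0]]

# Ranking by squared distance selects the same nearest candidate as ranking by
# the Euclidean distance, because the square root preserves ordering.
nearest_by_square = min(candidates, key=lambda c: squared_euclidean(query, c))
nearest_by_root = min(candidates, key=lambda c: squared_euclidean(query, c) ** 0.5)

print(nearest_by_square)  # [1.0, 1.0]
```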


def euclidean_distance_square_numpy(object1, object2):
    """!
    @brief Calculate square Euclidean distance between two objects using numpy.

    @param[in] object1 (array_like): The first array_like object.
    @param[in] object2 (array_like): The second array_like object.

    @return (double) Square Euclidean distance between two objects.

    """
    return numpy.sum(numpy.square(object1 - object2), axis=1).T



def manhattan_distance(point1, point2):
    """!
    @brief Calculate Manhattan distance between two vectors.

    \f[
    dist(a, b) = \sum_{i=0}^{N-1}\left | a_{i} - b_{i} \right |;
    \f]

    @param[in] point1 (array_like): The first vector.
    @param[in] point2 (array_like): The second vector.

    @return (double) Manhattan distance between two vectors.

    @see euclidean_distance_square, euclidean_distance, chebyshev_distance

    """
    distance = 0.0
    dimension = len(point1)

    for i in range(dimension):
        distance += abs(point1[i] - point2[i])

    return distance



def manhattan_distance_numpy(object1, object2):
    """!
    @brief Calculate Manhattan distance between two objects using numpy.

    @param[in] object1 (array_like): The first array_like object.
    @param[in] object2 (array_like): The second array_like object.

    @return (double) Manhattan distance between two objects.

    """
    return numpy.sum(numpy.absolute(object1 - object2), axis=1).T



def chebyshev_distance(point1, point2):
    """!
    @brief Calculate Chebyshev distance between two vectors.

    \f[
    dist(a, b) = \max_{i}\left (\left | a_{i} - b_{i} \right |\right );
    \f]

    @param[in] point1 (array_like): The first vector.
    @param[in] point2 (array_like): The second vector.

    @return (double) Chebyshev distance between two vectors.

    @see euclidean_distance_square, euclidean_distance, minkowski_distance

    """
    distance = 0.0
    dimension = len(point1)

    for i in range(dimension):
        distance = max(distance, abs(point1[i] - point2[i]))

    return distance



def chebyshev_distance_numpy(object1, object2):
    """!
    @brief Calculate Chebyshev distance between two objects using numpy.

    @param[in] object1 (array_like): The first array_like object.
    @param[in] object2 (array_like): The second array_like object.

    @return (double) Chebyshev distance between two objects.

    """
    return numpy.max(numpy.absolute(object1 - object2), axis=1).T



def minkowski_distance(point1, point2, degree=2):
    """!
    @brief Calculate Minkowski distance between two vectors.

    \f[
    dist(a, b) = \sqrt[p]{ \sum_{i=0}^{N-1}\left | a_{i} - b_{i} \right |^{p} };
    \f]

    @param[in] point1 (array_like): The first vector.
    @param[in] point2 (array_like): The second vector.
    @param[in] degree (numeric): Degree (p in the formula) that is used for Minkowski distance.

    @return (double) Minkowski distance between two vectors.

    @see euclidean_distance

    """
    distance = 0.0
    for i in range(len(point1)):
        # Use the absolute difference so that odd degrees cannot produce a negative sum.
        distance += abs(point1[i] - point2[i]) ** degree

    return distance ** (1.0 / degree)

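
# The degree parameter ties the Minkowski distance to the other metrics in this module: degree 1 gives
# the Manhattan distance, degree 2 the Euclidean distance, and large degrees approach the Chebyshev
# distance. The sketch below checks this on one pair of points; the helper minkowski is a compact
# illustrative rewrite that uses absolute differences, not the module's function.

```python
def minkowski(point1, point2, degree=2):
    # Absolute differences keep the sum non-negative for odd degrees as well.
    total = sum(abs(a - b) ** degree for a, b in zip(point1, point2))
    return total ** (1.0 / degree)


a = [0.0, 0.0]
b = [3.0, 4.0]

print(minkowski(a, b, degree=1))  # 7.0, the Manhattan distance
print(minkowski(a, b, degree=2))  # 5.0, the Euclidean distance

# For large degrees the value approaches the Chebyshev distance, max(|3|, |4|) = 4.
print(minkowski(a, b, degree=50))
```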


def minkowski_distance_numpy(object1, object2, degree=2):
    """!
    @brief Calculate Minkowski distance between two objects using numpy.

    @param[in] object1 (array_like): The first array_like object.
    @param[in] object2 (array_like): The second array_like object.
    @param[in] degree (numeric): Degree that is used for Minkowski distance.

    @return (double) Minkowski distance between two objects.

    """
    # Apply the root to the row-wise sum of powers (not to each element before summing).
    return numpy.power(numpy.sum(numpy.power(numpy.absolute(object1 - object2), degree), axis=1), 1.0 / degree).T