In many application fields, the choice of proximity measure directly affects the results of data mining methods, whatever the task: clustering, comparing, or structuring a set of objects. The user generally has to choose one proximity measure from many possible alternatives. Under a given notion of equivalence, such as the one based on pre-ordering, certain proximity measures are more or less equivalent, meaning that they should produce almost the same results. Knowing which measures are equivalent can help in choosing one. However, the O(n^4) complexity of the pre-ordering approach makes it intractable when the sample size n exceeds a few hundred. To cope with this limitation, we propose a new approach with lower complexity, O(n^2). It is based on topological equivalence and exploits the concept of local neighbors: two proximity measures are considered equivalent if they induce the same neighborhood structure on the objects. We illustrate our approach by considering 13 proximity measures used on datasets with continuous attributes.
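The idea of comparing two proximity measures through the neighborhood structure they induce can be illustrated with a minimal sketch. The abstract does not say which neighborhood graph is used, so this sketch assumes a k-nearest-neighbor graph. The function names (`knn_adjacency`, `topological_agreement`), the parameter `k`, and the sample points are hypothetical. Agreement is measured as the fraction of adjacency-matrix entries on which the two measures coincide.

```python
import math


def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def manhattan(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))


def knn_adjacency(points, measure, k):
    """Return an n x n 0/1 matrix; entry [i][j] is 1 if j is one of
    the k nearest neighbors of i under `measure`.

    Assumption: a k-NN graph stands in for the neighborhood structure.
    Ties are broken by index so the result is deterministic.
    """
    n = len(points)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: (measure(points[i], points[j]), j),
        )
        for j in others[:k]:
            adj[i][j] = 1
    return adj


def topological_agreement(points, measure_a, measure_b, k=2):
    """Fraction of adjacency entries on which the two measures agree.

    A value of 1.0 means both measures induce the same neighborhood
    graph on `points`. The entry-by-entry comparison takes n * n steps.
    """
    adj_a = knn_adjacency(points, measure_a, k)
    adj_b = knn_adjacency(points, measure_b, k)
    n = len(points)
    same = sum(
        adj_a[i][j] == adj_b[i][j] for i in range(n) for j in range(n)
    )
    return same / (n * n)


# Hypothetical sample with continuous attributes.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
score = topological_agreement(points, euclidean, manhattan, k=2)
```

In this sketch, the comparison of the two adjacency matrices is quadratic in n. Building each graph by sorting, as done here for brevity, costs slightly more than that, so this is an illustration of the concept and not a reproduction of the proposed method.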