Algorithms for clustering data
Although many coefficients have been proposed to quantify the resemblance between pairs of objects based on presence/absence data, no single measure shows optimal characteristics. In this work the Positive Matching Index (PMI) is proposed as a new measure of similarity between lists of attributes. PMI fulfills Tulloss' theoretical prerequisites for similarity coefficients, is easy to calculate, and has an intrinsic meaning that can be expressed in natural language. PMI is bounded between 0 and 1 and represents the mean proportion of positive matches relative to the size of the attribute lists, with that size ranging continuously from the smaller list to the larger one. PMI behaves correctly where alternative indices either fail or only approximate the desirable properties of a similarity index. Empirical examples drawn from biomedical research are provided to show that PMI outperforms standard indices such as the Jaccard and Dice coefficients.
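The verbal definition above can be made concrete with a short sketch. The Jaccard and Dice functions follow their standard definitions. The function pmi_sketch is only one reading of the abstract's description, not a formula quoted from the paper: it assumes that "mean proportion of positive matches" means the average of a/s, where a is the number of shared attributes and the list size s varies continuously from the smaller list's size to the larger one's. All function names are illustrative.

```python
import math


def match_counts(x, y):
    """Return (shared attributes, size of x, size of y) for two attribute lists."""
    x, y = set(x), set(y)
    return len(x & y), len(x), len(y)


def jaccard(x, y):
    """Standard Jaccard coefficient: shared attributes over the union."""
    a, nx, ny = match_counts(x, y)
    union = nx + ny - a
    return a / union if union else 0.0


def dice(x, y):
    """Standard Dice coefficient: twice the shared attributes over the summed sizes."""
    a, nx, ny = match_counts(x, y)
    total = nx + ny
    return 2 * a / total if total else 0.0


def pmi_sketch(x, y):
    """Assumed reading of PMI: mean of a/s for s ranging from the smaller
    list size to the larger one (an interpretation, not the paper's text)."""
    a, nx, ny = match_counts(x, y)
    lo, hi = min(nx, ny), max(nx, ny)
    if a == 0:
        return 0.0  # no positive matches (also covers empty lists)
    if lo == hi:
        return a / lo  # equal sizes: the mean collapses to a single ratio
    # Mean of a/s over [lo, hi] is a * ln(hi / lo) / (hi - lo).
    return a * math.log(hi / lo) / (hi - lo)


if __name__ == "__main__":
    x = ["fever", "cough", "rash", "fatigue"]
    y = ["rash", "fatigue", "nausea", "headache", "chills", "anemia"]
    print(jaccard(x, y), dice(x, y), pmi_sketch(x, y))
```

Under this reading the value stays within [0, 1], equals 1 for identical lists and 0 for disjoint ones, and always lies between the proportion of matches in the larger list and the proportion in the smaller list.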