Coupled nominal similarity in unsupervised learning

  • Authors:
  • Can Wang;Longbing Cao;Mingchun Wang;Jinjiu Li;Wei Wei;Yuming Ou

  • Affiliations:
  • University of Technology, Sydney, Sydney, Australia;University of Technology, Sydney, Sydney, Australia;Tianjin University of Technology and Education, Tianjin, China;University of Technology, Sydney, Sydney, Australia;University of Technology, Sydney, Sydney, Australia;University of Technology, Sydney, Sydney, Australia

  • Venue:
  • Proceedings of the 20th ACM international conference on Information and knowledge management
  • Year:
  • 2011

Abstract

Measuring the similarity between nominal objects is not straightforward, especially in unsupervised learning. This paper proposes coupled similarity metrics for nominal objects that consider not only the intra-coupled similarity within an attribute (i.e., the value frequency distribution) but also the inter-coupled similarity between attributes (i.e., the aggregation of feature dependencies). Four metrics are designed to calculate the inter-coupled similarity between two categorical values by considering their relationships with other attributes. Theoretical analysis shows that the four metrics are equally accurate and that the intersection-based metric is more efficient than the others, particularly for large-scale data. Substantial experiments on a wide range of UCI data sets verify the theoretical conclusions. In addition, clustering experiments based on the derived dissimilarity metrics show a significant performance improvement.
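The abstract names the two ingredients but gives no formulas, so the following Python is only a rough sketch of the idea under stated assumptions, not the paper's definitions. It assumes that intra-coupled similarity can be approximated from the occurrence counts of the two values, and that an intersection-based inter-coupled similarity can be approximated by summing, over the values of another attribute that co-occur with both, the smaller of the two conditional probabilities. The function names and the toy columns are hypothetical.

```python
from collections import Counter


def intra_similarity(column, x, y):
    """Frequency-based similarity of two values of one attribute.

    Assumption: the ratio below is an illustrative formula chosen for this
    sketch; the abstract does not state the actual definition.
    """
    counts = Counter(column)
    fx, fy = counts[x], counts[y]
    if fx == 0 or fy == 0:
        return 0.0
    return (fx * fy) / (fx + fy + fx * fy)


def conditional_distribution(column_j, column_k, value):
    """Estimate P(attribute k = w | attribute j = value) by counting rows."""
    co_counts = Counter(w for v, w in zip(column_j, column_k) if v == value)
    total = sum(co_counts.values())
    return {w: c / total for w, c in co_counts.items()} if total else {}


def inter_similarity(column_j, column_k, x, y):
    """Intersection-based similarity of values x and y of attribute j,
    judged by how they co-occur with the values of attribute k.

    Assumption: only values of attribute k seen with BOTH x and y contribute,
    each adding the smaller of the two conditional probabilities.
    """
    px = conditional_distribution(column_j, column_k, x)
    py = conditional_distribution(column_j, column_k, y)
    shared = px.keys() & py.keys()
    return float(sum(min(px[w], py[w]) for w in shared))


# Hypothetical toy data: two nominal attributes over five objects.
colour = ["red", "red", "blue", "blue", "green"]
shape = ["round", "square", "round", "round", "square"]

print(intra_similarity(colour, "red", "blue"))
print(inter_similarity(colour, shape, "red", "blue"))
print(inter_similarity(colour, shape, "blue", "green"))
```

Restricting the sum to shared values means the loop touches only the intersection of the two co-occurrence sets, which is one plausible reading of why an intersection-based metric would be cheaper on large data.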