MatchSim: a novel similarity measure based on maximum neighborhood matching

Authors:
Zhenjiang Lin;Michael R. Lyu;Irwin King
Affiliations:
The Chinese University of Hong Kong, Department of Computer Science and Engineering, Shatin, NT, Hong Kong;The Chinese University of Hong Kong, Department of Computer Science and Engineering, Shatin, NT, Hong Kong;The Chinese University of Hong Kong, Department of Computer Science and Engineering, Shatin, NT, Hong Kong
Venue:
Knowledge and Information Systems
Year:
2012

Citing 0
Cited 2

Learning colours from textures by sparse manifold embedding

Signal Processing
Local discriminative distance metrics ensemble learning

Pattern Recognition

Abstract

Measuring object similarity in a graph is a fundamental data-mining problem in various application domains, including Web linkage mining, social network analysis, information retrieval, and recommender systems. In this paper, we focus on the neighbor-based approach, which rests on the intuition that “similar objects have similar neighbors,” and propose a novel similarity measure called MatchSim. Our method recursively defines the similarity between two objects as the average similarity of the maximum-matched neighbor pairs between them. We show that MatchSim conforms to the basic intuition of similarity and can therefore overcome the counterintuitive contradiction in SimRank. Moreover, MatchSim can be viewed as an extension of the traditional neighbor-counting scheme that takes the similarities between neighbors into account, leading to higher flexibility. We present the MatchSim score computation process and prove its convergence. We also analyze its time and space complexity and suggest two acceleration techniques: (1) a simple pruning strategy and (2) an approximation algorithm for the maximum matching computation. Experimental results on real-world datasets show that, although our method is computationally less efficient, it outperforms classic methods in terms of accuracy.
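
The recursive definition in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several choices are assumptions made for illustration: each object's neighbors are given as a list of distinct objects, the "average" is taken by dividing the matching weight by the size of the larger neighborhood, the iteration starts from identity similarities and runs for a fixed number of rounds, and the maximum matching is found by brute force (practical only for tiny neighborhoods; a real implementation would use an efficient assignment algorithm or the approximation mentioned above). The names `max_matching_weight` and `matchsim` are hypothetical.

```python
from itertools import permutations


def max_matching_weight(left, right, sim):
    """Weight of a maximum-weight matching between two neighbor lists.

    Brute force over all assignments; an illustrative stand-in for an
    efficient matching algorithm, usable only on very small inputs.
    """
    if len(left) > len(right):
        left, right = right, left  # match the smaller side into the larger
    best = 0.0
    for perm in permutations(right, len(left)):
        total = sum(sim[(u, v)] for u, v in zip(left, perm))
        best = max(best, total)
    return best


def matchsim(neighbors, iterations=10):
    """Iteratively compute MatchSim-style scores.

    neighbors: dict mapping every object to a list of distinct neighbors,
               each of which must also be a key of the dict.
    Assumption: the score is the matching weight divided by the size of
    the larger neighborhood, and objects without neighbors score 0.
    """
    nodes = list(neighbors)
    # Start from identity: an object is fully similar only to itself.
    sim = {(a, b): 1.0 if a == b else 0.0 for a in nodes for b in nodes}
    for _ in range(iterations):
        new_sim = {}
        for a in nodes:
            for b in nodes:
                if a == b:
                    new_sim[(a, b)] = 1.0
                    continue
                na, nb = neighbors[a], neighbors[b]
                if not na or not nb:
                    new_sim[(a, b)] = 0.0
                    continue
                weight = max_matching_weight(na, nb, sim)
                new_sim[(a, b)] = weight / max(len(na), len(nb))
        sim = new_sim
    return sim


if __name__ == "__main__":
    # Toy graph: "a" and "b" share both neighbors, "e" shares only one.
    example = {"a": ["c", "d"], "b": ["c", "d"], "c": [], "d": [], "e": ["c"]}
    scores = matchsim(example)
    print(scores[("a", "b")], scores[("a", "e")])
```

In this toy graph, objects with identical neighborhoods receive the maximum score, while an object sharing only one of two neighbors receives half of it, which matches the neighbor-counting intuition the abstract describes.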