The problem of measuring similarity between web pages arises in many important Web applications, such as search engines and Web directories. In this paper, we propose a novel neighbor-based similarity measure called MatchSim, which uses only the neighborhood structure of web pages. Technically, MatchSim recursively defines the similarity between two web pages as the average similarity of the maximum matching between their neighbors. Our method extends traditional methods, which simply count the numbers of common and/or different neighbors. Because it is strictly consistent with the intuitions of similarity, it also overcomes a severe counterintuitive loophole in SimRank. We give the computational complexity of the MatchSim iteration. The accuracy of MatchSim is compared against that of other measures on two real datasets. The results show that our method performs best in most cases.
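
The recursive definition above can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions that the abstract does not state: the names `matchsim` and `max_matching_weight` are hypothetical, neighbors are taken to be in-link neighbors, the "average" divides the matching weight by the size of the larger neighbor set, each page has similarity 1 with itself, and the iteration count is fixed. The maximum matching is found by brute force, which is only practical for tiny neighbor sets; a real implementation would use an assignment algorithm instead.

```python
from itertools import permutations


def max_matching_weight(left, right, sim):
    """Weight of a maximum-weight matching between two neighbor lists.

    Brute force over all injective assignments of the smaller list into
    the larger one; suitable only for very small inputs.
    """
    if len(left) > len(right):
        left, right = right, left  # sim is symmetric, so swapping is safe
    if not left:
        return 0.0
    best = 0.0
    for perm in permutations(right, len(left)):
        total = sum(sim[(u, v)] for u, v in zip(left, perm))
        best = max(best, total)
    return best


def matchsim(in_neighbors, iterations=10):
    """Iterate the (assumed) MatchSim update on a small graph.

    in_neighbors maps every page to the list of its neighbors; every
    neighbor must also appear as a key.
    """
    nodes = sorted(in_neighbors)
    # Assumed starting point: a page is fully similar only to itself.
    sim = {(a, b): 1.0 if a == b else 0.0 for a in nodes for b in nodes}
    for _ in range(iterations):
        new_sim = {}
        for a in nodes:
            for b in nodes:
                if a == b:
                    new_sim[(a, b)] = 1.0
                    continue
                na, nb = in_neighbors[a], in_neighbors[b]
                denom = max(len(na), len(nb))  # assumed normalization
                weight = max_matching_weight(na, nb, sim)
                new_sim[(a, b)] = weight / denom if denom else 0.0
        sim = new_sim
    return sim


# Toy graph (made up for illustration): p and q share both neighbors,
# r shares only one of them.
example_graph = {"a": [], "b": [], "p": ["a", "b"], "q": ["a", "b"], "r": ["a"]}
example_scores = matchsim(example_graph)
```

In this toy graph, pages `p` and `q` have identical neighbor sets and receive similarity 1.0, while `p` and `r` receive 0.5 because only one of the two neighbors of `p` can be matched.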