Evaluating strategies for similarity search on the web
Proceedings of the 11th international conference on World Wide Web
Introduction to Information Retrieval
Introduction to Information Retrieval
Hi-index | 0.00 |
In this paper, we propose a novel approach for measuring similarity between web objects. Our similarity measure is defined based on the representation of a web object as a collection of tags. Precisely, we first construct a vector space in which multiple terms are mapped into a single dimension by using information available from Open Directory Project and Delicious.com. Then we position web objects in the vector space and apply the traditional cosine measure for similarity computation. We demonstrate that the proposed similarity computation method is able to overcome the limitation of traditional vector space approach while at the same time require less computational cost compares to LSI (Latent Semantic Indexing).