Improved algorithms for topic distillation in a hyperlinked environment
Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval
Automatic resource compilation by analyzing hyperlink structure and associated text
WWW7 Proceedings of the seventh international conference on World Wide Web 7
The anatomy of a large-scale hypertextual Web search engine
WWW7 Proceedings of the seventh international conference on World Wide Web 7
Authoritative sources in a hyperlinked environment
Proceedings of the ninth annual ACM-SIAM symposium on Discrete algorithms
Proceedings of the 10th international conference on World Wide Web
Finding authorities and hubs from link structures on the World Wide Web
Proceedings of the 10th international conference on World Wide Web
Improvements of HITS Algorithms for Spam Links
IEICE - Transactions on Information and Systems
Improvements of HITS algorithms for spam links
APWeb/WAIM'07 Proceedings of the joint 9th Asia-Pacific web and 8th international conference on web-age information management conference on Advances in data and web management
Mining communities on the web using a max-flow and a site-oriented framework
WISE'05 Proceedings of the 6th international conference on Web Information Systems Engineering
Hi-index | 0.00 |
Topic distillation is the process of finding representative pages relevant to a given query. Well-known topic distillation approaches such as the HITS algorithm have shown to be useful in identifying high quality pages. In this paper, we attempt to revisit the behaviour of HITS from a different point of view. Namely, a similarity-based analysis model is applied to observing the distillation procedure. By defining a generalized similarity, an algorithm is proposed, which can improve the quality of distillation using only hyperlinks. A topic exploration function is also integrated into the algorithm framework, which enables end-users to search less popular topics when multi-topics are involved in queries. The experimental results reveal two benefits from the new algorithm: the improvement of distillation quality without utilizing any content information of pages, and an additional ability to explore the topics emerging in the query results.