Kleinberg's HITS algorithm is a popular algorithm for ranking web pages. One of its known weaknesses is topic drift. Previous researchers have tried to solve this problem using anchor-related text. In our previous study, we proposed another type of anchor-related text, which is found by a deep analysis of the DOM structure of web pages; we call it DOM-based anchor-related text (DOM-text). In this paper, we investigate the effectiveness of DOM-text for improving the HITS algorithm: we examine how much improvement it yields, and we compare it with other kinds of anchor-related text. The experimental results show that, among the kinds of anchor-related text compared, DOM-text improves the HITS algorithm the most.
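For readers unfamiliar with HITS, the following is a minimal sketch of the basic hub/authority iteration, with an optional per-link weight as one plausible place where a relevance score derived from anchor-related text could enter. The function names `hits` and `normalize`, the `weights` parameter, and the idea of multiplying each link by a relevance weight are assumptions made for illustration; the abstract does not say how DOM-text is incorporated into HITS, so this should not be read as the authors' method.

```python
import math


def normalize(scores):
    """Scale a score dict to unit Euclidean length (left unchanged if all zero)."""
    norm = math.sqrt(sum(v * v for v in scores.values()))
    if norm == 0:
        return dict(scores)
    return {k: v / norm for k, v in scores.items()}


def hits(edges, weights=None, iterations=50):
    """Return (authority, hub) score dicts for a directed link graph.

    edges:   list of (source, target) page pairs.
    weights: optional dict mapping (source, target) to a non-negative
             relevance weight. Hypothetical hook: such a weight could come
             from comparing anchor-related text with the query topic.
             Links without an entry get weight 1.0 (plain HITS).
    """
    weights = weights or {}
    nodes = sorted({page for edge in edges for page in edge})
    auth = {n: 1.0 for n in nodes}
    hub = {n: 1.0 for n in nodes}

    for _ in range(iterations):
        # Authority update: sum the (weighted) hub scores of pages linking in.
        new_auth = {n: 0.0 for n in nodes}
        for src, dst in edges:
            new_auth[dst] += weights.get((src, dst), 1.0) * hub[src]

        # Hub update: sum the (weighted) authority scores of pages linked to.
        new_hub = {n: 0.0 for n in nodes}
        for src, dst in edges:
            new_hub[src] += weights.get((src, dst), 1.0) * new_auth[dst]

        auth = normalize(new_auth)
        hub = normalize(new_hub)

    return auth, hub


if __name__ == "__main__":
    # Toy graph: page "a" is linked from two hubs, "b" from one of them.
    links = [("h1", "a"), ("h2", "a"), ("h1", "b"), ("h3", "c")]
    authority, hub_scores = hits(links)
    for page, score in sorted(authority.items(), key=lambda kv: -kv[1]):
        print(f"{page}: {score:.3f}")
```

In this sketch, lowering the weight of a link whose surrounding text is off-topic reduces the authority it passes on, which is the general intuition behind using anchor-related text against topic drift.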