In this paper we consider the problem of using the block structure of Web pages to improve ranking results when searching for information on Web sites. Given the block structure of the pages as input, we propose a method for computing the importance of each block, expressed as a block weight, in a Web collection. Our experiments indicate that deploying the method can significantly improve the quality of search results. We compared the quality of search results obtained with our method against the quality obtained when no structural information is used. Compared to a ranking method that treats pages as monolithic units, our block-based ranking method improved the quality of search results in experiments with two sites with heterogeneous structures. Further, our method does not increase the cost of processing queries relative to systems that use no structural information.
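The abstract does not say how block weights are computed or how they enter the ranking function. As a minimal sketch only, assume that each page is given as a mapping from a block class (for example "main" or "menu") to its text, that block weights are supplied by hand, and that each term occurrence is scaled by the weight of its block before a simple log-damped term-frequency score is computed. The names `block_weighted_tf` and `score`, the weight values, and the scoring formula are all illustrative assumptions, not the authors' method.

```python
import math
from collections import Counter


def block_weighted_tf(page, block_weights):
    """Sum term occurrences, scaling each by the weight of its block.

    page: dict mapping a block class to the text found in that block
          (assumed input format).
    block_weights: dict mapping a block class to a non-negative weight;
          classes without an entry default to 1.0.
    """
    tf = Counter()
    for block_class, text in page.items():
        weight = block_weights.get(block_class, 1.0)
        for term in text.lower().split():
            tf[term] += weight
    return tf


def score(query, page, block_weights):
    """Log-damped sum of block-weighted term frequencies (illustrative)."""
    tf = block_weighted_tf(page, block_weights)
    return sum(math.log1p(tf[term]) for term in query.lower().split())


# Two toy pages: the query term appears in the main block of page_a
# but only in the navigation menu of page_b.
page_a = {"main": "block based ranking of web pages", "menu": "home news contact"}
page_b = {"main": "contact us by email", "menu": "home news ranking"}

# Hypothetical weights: menu text counts far less than main content.
weights = {"main": 1.0, "menu": 0.1}

score_a = score("ranking", page_a, weights)
score_b = score("ranking", page_b, weights)
flat_a = score("ranking", page_a, {})  # no structure: every block weighs 1.0
flat_b = score("ranking", page_b, {})
```

With uniform weights the two pages tie, since each contains the query term once; with the assumed weights, the page that mentions the term in its main block scores higher. Because the weights are folded into the term frequencies, they could in principle be applied at indexing time, which is one way a block-based ranker might avoid extra work per query; the abstract does not state that the authors achieve this in the same way.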