Techniques for improving web retrieval effectiveness

  • Authors:
  • Eui-Kyu Park;Dong-Yul Ra;Myung-Gil Jang

  • Affiliations:
  • Computer Science Department, Yonsei University, Wonju, Kangwon 220-710, Korea;Computer Science Department, Yonsei University, Wonju, Kangwon 220-710, Korea;Speech/Language Information Research Department, ETRI, Yuseong-gu, Daejeon 305-350, Korea

  • Venue:
  • Information Processing and Management: an International Journal
  • Year:
  • 2005

Abstract

This paper presents several schemes for improving retrieval effectiveness that can be used in the named page finding tasks of web information retrieval (Overview of the TREC-2002 web track. In: Proceedings of the Eleventh Text Retrieval Conference TREC-2002, NIST Special Publication #500-251, 2003). These methods were applied on top of a basic information retrieval model as additional mechanisms to improve the system. Using the titles of web pages was found to be effective. The anchor texts of incoming links were confirmed to be beneficial, as suggested in other work. Sentence-query similarity, a new type of information that we propose, was identified as the most useful information to exploit. Stratifying and re-ranking the retrieval list by the maximum number of index terms shared between a sentence and the query resulted in a significant improvement in performance. To demonstrate these findings, a large-scale web information retrieval system was developed and used for experimentation.
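
The stratify-and-re-rank step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how index terms are extracted, how sentences are delimited, or how ties are ordered, so everything below is an assumption made for illustration: the helper names (`tokenize`, `split_sentences`, `max_sentence_overlap`, `stratify_and_rerank`) are hypothetical, index terms are approximated by lowercased alphanumeric tokens, sentences are split on terminal punctuation, and documents within a stratum keep the order given by their base retrieval score.

```python
import re
from typing import List, Set, Tuple


def tokenize(text: str) -> Set[str]:
    # Assumption: index terms are approximated by lowercased alphanumeric tokens.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def split_sentences(text: str) -> List[str]:
    # Assumption: a naive split on ., ! and ? stands in for real sentence detection.
    return [s for s in re.split(r"[.!?]+", text) if s.strip()]


def max_sentence_overlap(query: str, document: str) -> int:
    # Largest number of query terms found together in any single sentence.
    query_terms = tokenize(query)
    return max(
        (len(query_terms & tokenize(s)) for s in split_sentences(document)),
        default=0,
    )


def stratify_and_rerank(
    query: str, ranked: List[Tuple[str, str, float]]
) -> List[str]:
    # ranked holds (doc_id, text, base_score) tuples from the base retrieval model.
    # Documents are grouped into strata by their maximum sentence-query overlap;
    # higher strata come first, and the base score orders documents in a stratum.
    keyed = [
        (max_sentence_overlap(query, text), score, doc_id)
        for doc_id, text, score in ranked
    ]
    keyed.sort(key=lambda item: (-item[0], -item[1]))
    return [doc_id for _, _, doc_id in keyed]


if __name__ == "__main__":
    results = [
        ("a", "Yonsei is a university. The department page lists courses.", 0.9),
        ("b", "Welcome to the Yonsei University computer science department.", 0.5),
        ("c", "Computer news. Science news.", 0.7),
    ]
    print(stratify_and_rerank("Yonsei computer science department", results))
```

In this toy input, document `b` has a single sentence containing all four query terms, so it moves ahead of `a` and `c` even though its base score is lower; `a` and `c` each reach an overlap of one and stay in base-score order.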