RSTIndex: Indexing and Retrieving Web Document Using Computational and Linguistic Techniques

  • Authors:
  • Farhi Marir; Kamel Houam

  • Venue:
  • IDEAL '02 Proceedings of the Third International Conference on Intelligent Data Engineering and Automated Learning
  • Year:
  • 2002

Abstract

The amount of information available on the Internet is growing at an incredible rate. However, the lack of efficient indexing remains a major barrier to effective information retrieval on the web. This paper presents a new technique for capturing the semantics of a document, to be used for indexing and retrieving relevant documents from the Internet. It performs conventional keyword-based indexing and adds thematic relationships between parts of the text using natural language understanding (NLU) and a linguistic theory called Rhetorical Structure Theory (RST).
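
To make the idea of combining keyword indexing with rhetorical structure concrete, here is a minimal sketch. It is not the RSTIndex method itself, which the abstract does not detail. It assumes each document has already been split into segments labelled with an RST role (nucleus for the central span, satellite for the supporting span), and it weights keywords by that role. The weights, the hand-labelled segments, and the names `ROLE_WEIGHT`, `build_index`, and `search` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical weights: in RST the nucleus carries the central idea of a
# relation and the satellite supports it, so nucleus terms count for more.
ROLE_WEIGHT = {"nucleus": 1.0, "satellite": 0.5}


def build_index(docs):
    """Build a role-weighted inverted index.

    docs maps doc_id -> list of (role, text) segments. The roles are
    hand-labelled here; no RST parsing is attempted in this sketch.
    """
    index = defaultdict(lambda: defaultdict(float))
    for doc_id, segments in docs.items():
        for role, text in segments:
            for token in text.lower().split():
                index[token][doc_id] += ROLE_WEIGHT[role]
    return index


def search(index, query):
    """Return doc ids ranked by summed term weight (ties broken by id)."""
    scores = defaultdict(float)
    for token in query.lower().split():
        for doc_id, weight in index.get(token, {}).items():
            scores[doc_id] += weight
    return sorted(scores, key=lambda d: (-scores[d], d))


# Toy corpus with assumed role labels.
docs = {
    "a": [("nucleus", "indexing improves retrieval"),
          ("satellite", "because the web is large")],
    "b": [("nucleus", "the web is large"),
          ("satellite", "so indexing matters")],
}
index = build_index(docs)
```

With this toy corpus, a query for `indexing` ranks document `a` first because the term sits in its nucleus, while a query for `web` ranks document `b` first. A plain keyword index would score both documents equally for either query.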