A Personal Web Information/Knowledge Retrieval System

  • Authors:
  • Hao Han; Takehiro Tokuda

  • Affiliations:
  • {han, tokuda}@tt.cs.titech.ac.jp, Department of Computer Science, Tokyo Institute of Technology, Meguro, Tokyo 152-8552, Japan

  • Venue:
  • Proceedings of the 2008 conference on Information Modelling and Knowledge Bases XIX
  • Year:
  • 2008

Abstract

The Web is the richest source of information and knowledge. Unfortunately, the current structure of Web pages makes it difficult for users to retrieve that information or knowledge systematically. In this paper, we propose a personal Web information/knowledge retrieval system that uses a tree-based approach to extract structured parts from Web pages. First, we obtain the layout pattern and the paths of the parts to be extracted from a typical Web page in the target sites. Then, we use the recorded layout pattern and paths to extract the structured parts from the remaining Web pages in the target sites. We show the usefulness of our approach with the results of extracting structured parts from notable Web pages.
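
The two-step idea in the abstract (record a path on one typical page, then reuse it on other pages) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify their algorithm, and the sketch covers only the path part, not the layout pattern. The function names `record_path` and `extract_by_path`, the representation of a path as (tag, sibling index) steps, and the sample pages are all assumptions made for illustration. The sketch also assumes well-formed markup so that Python's standard `xml.etree.ElementTree` can parse it.

```python
import xml.etree.ElementTree as ET


def record_path(root, target_text):
    """Return the steps from root to the first element whose text equals
    target_text, as a list of (tag, index among same-tag siblings).
    Returns None if no such element exists. (Hypothetical helper.)"""
    def walk(node, path):
        if (node.text or "").strip() == target_text:
            return path
        counts = {}  # how many children with each tag we have seen so far
        for child in node:
            idx = counts.get(child.tag, 0)
            counts[child.tag] = idx + 1
            found = walk(child, path + [(child.tag, idx)])
            if found is not None:
                return found
        return None

    return walk(root, [])


def extract_by_path(root, path):
    """Follow a recorded path in another page's tree and return the text
    of the element it reaches, or None if the page does not match.
    (Hypothetical helper.)"""
    node = root
    for tag, idx in path:
        same_tag = [child for child in node if child.tag == tag]
        if idx >= len(same_tag):
            return None
        node = same_tag[idx]
    return (node.text or "").strip()


# Made-up sample pages that share one layout (assumption for illustration).
typical = ET.fromstring(
    "<html><body><div><h1>Site</h1></div>"
    "<div><h2>First headline</h2><p>First body</p></div></body></html>"
)
other = ET.fromstring(
    "<html><body><div><h1>Site</h1></div>"
    "<div><h2>Second headline</h2><p>Second body</p></div></body></html>"
)

# Step 1: record the path of the part to extract on the typical page.
path = record_path(typical, "First headline")
print(path)

# Step 2: reuse the recorded path on another page of the same site.
print(extract_by_path(other, path))
```

Real Web pages are rarely well-formed XML, so a practical version would likely need a tolerant HTML parser and some check that a page actually follows the recorded layout before the path is applied.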