Integrating information visualization and retrieval for WWW information discovery

  • Authors:
  • Hayato Ohwada; Fumio Mizoguchi

  • Affiliations:
  • Tokyo University of Science, Noda, 278-8510 Chiba, Japan; Tokyo University of Science, Noda, 278-8510 Chiba, Japan

  • Venue:
  • Theoretical Computer Science
  • Year:
  • 2003

Quantified Score

Hi-index 5.23

Abstract

A key task in knowledge discovery is accessing desired information within the large amount of data stored on the WWW. At present, such information is accessed either by browsing or by keyword search. However, browsing is time-consuming because the user must visit individual pages one by one, and in keyword search it is difficult for users to supply suitable keywords during a knowledge discovery process. This paper outlines an approach for integrating information visualization and retrieval into WWW information discovery. In this approach, the link structure of a web site is displayed as a 3-D hyperbolic tree in which the height of a node (corresponding to a web page) indicates the user's "interest" in that page. Here, interest is calculated by a fitting function between a page and a user-supplied query (nested keywords). This measure can be used to filter out uninteresting pages, reducing the size of the displayed link structure. Furthermore, each web page is modeled as semi-structured data and can itself be displayed as a hyperbolic tree in which the result of query evaluation is visible. These functions are incorporated within our browser, allowing users to interactively discover desired pages in a large web site. We applied the proposed method to typical web sites to show that it improves accuracy and efficiency in WWW information discovery. Here, accuracy indicates how reliably the user reaches his/her desired documents, and efficiency indicates how quickly the user reaches them.
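
The filtering idea in the abstract (score each page against a nested-keyword query, then drop low-interest pages from the link structure) can be illustrated with a minimal sketch. The abstract does not give the paper's fitting function, so everything here is an assumption made for illustration: the `Page` structure, the `interest` score (the fraction of query keywords found in the page text, averaged over nested groups), and the `prune` helper are all hypothetical and are not the authors' method.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union

# A query is a keyword or a (possibly nested) list of sub-queries (assumed encoding).
Query = Union[str, list]


@dataclass
class Page:
    """A node in a site's link tree (hypothetical structure)."""
    url: str
    text: str
    children: List["Page"] = field(default_factory=list)


def interest(page: Page, query: Query) -> float:
    """Assumed fitting function, NOT the paper's: a keyword scores 1.0 if it
    occurs in the page text, and a nested group scores the mean of its parts."""
    if isinstance(query, str):
        return 1.0 if query.lower() in page.text.lower() else 0.0
    if not query:
        return 0.0
    return sum(interest(page, q) for q in query) / len(query)


def prune(page: Page, query: Query, threshold: float) -> Optional[Page]:
    """Return a copy of the tree that keeps pages scoring at least `threshold`,
    plus the ancestors needed to reach them; None if nothing survives."""
    kept = []
    for child in page.children:
        pruned_child = prune(child, query, threshold)
        if pruned_child is not None:
            kept.append(pruned_child)
    if kept or interest(page, query) >= threshold:
        return Page(page.url, page.text, kept)
    return None


# Toy site and query, invented for this example.
site = Page("/", "home", [
    Page("/a", "hyperbolic tree visualization"),
    Page("/b", "contact", [Page("/b/c", "tree search visualization")]),
    Page("/d", "news"),
])
query = ["visualization", ["hyperbolic", "tree"]]
pruned = prune(site, query, threshold=0.5)
print([child.url for child in pruned.children])
```

In this toy run, `/d` is dropped, while `/b` is retained only because it leads to a page of interest. In a display like the one the abstract describes, the `interest` value would presumably drive node height; this sketch covers only the scoring and filtering step.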