Graph-based text database for knowledge discovery

  • Authors:
  • Junji Tomita; Hidekazu Nakawatase; Megumi Ishii

  • Affiliations:
  • NTT Cyber Space Laboratories, Kanagawa, Japan; NTT Cyber Space Laboratories, Kanagawa, Japan; NTT Cyber Space Laboratories, Kanagawa, Japan

  • Venue:
  • Proceedings of the 13th international World Wide Web conference on Alternate track papers & posters
  • Year:
  • 2004

Abstract

While we expect to discover knowledge in the texts available on the Web, such discovery usually requires many complex analysis steps, most of which require different text handling operations, such as similar-text search or text clustering. Drawing an analogy from the relational data model, we propose a text representation model that simplifies these steps. The model represents texts in a formal manner, as subject graphs, and provides text handling operations whose inputs and outputs are identical in form, i.e., a set of subject graphs. We develop a graph-based text database based on the model, together with an interactive knowledge discovery system. Trials of the system show that it allows the user to interactively and intuitively discover knowledge in Web pages by combining text handling operations defined on subject graphs in various orders.
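
The closure property described in the abstract (every operation takes a set of subject graphs and returns a set of subject graphs) can be illustrated with a minimal Python sketch. The abstract does not say how a subject graph is built, so everything here is an assumption made for illustration: the representation (a `Counter` of term-pair edges weighted by co-occurrence), the helper `text_to_graph`, and the operations `similar`, `merge`, and `filter_edges` are hypothetical names and are not the authors' definitions.

```python
from collections import Counter
from itertools import combinations

# ASSUMPTION: a "subject graph" is modelled as a Counter that maps an
# undirected edge (a sorted pair of terms) to a co-occurrence weight.
# The paper's actual definition is not given in the abstract.

def text_to_graph(text):
    """Build a toy graph linking every pair of distinct terms in a text."""
    terms = sorted(set(text.lower().split()))
    return Counter(combinations(terms, 2))

def jaccard(a, b):
    """Edge-set overlap between two graphs, in [0, 1]."""
    union = set(a) | set(b)
    return len(set(a) & set(b)) / len(union) if union else 0.0

# Each operation below takes a list of graphs and returns a list of graphs,
# so operations can be chained in any order (hypothetical operations).

def similar(graphs, query, threshold):
    """Keep the graphs whose edge overlap with `query` meets the threshold."""
    return [g for g in graphs if jaccard(g, query) >= threshold]

def merge(graphs):
    """Combine all graphs into a single graph by summing edge weights."""
    return [sum(graphs, Counter())]

def filter_edges(graphs, min_weight):
    """Drop edges lighter than `min_weight` from every graph."""
    return [Counter({e: w for e, w in g.items() if w >= min_weight})
            for g in graphs]

texts = ["web text search", "web text clustering", "relational data model"]
graphs = [text_to_graph(t) for t in texts]

hits = similar(graphs, text_to_graph("web text"), 0.3)
result = filter_edges(merge(hits), 2)
print(result)
```

Because inputs and outputs share one form, the same three functions could be applied in a different order (for example, `merge` before `similar`) without any conversion step, which is the analogy to relational operators that the abstract draws.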