Graph-based text database for knowledge discovery

Authors:
Junji Tomita;Hidekazu Nakawatase;Megumi Ishii
Affiliations:
NTT Cyber Space Laboratories, Kanagawa, Japan (all authors)
Venue:
Proceedings of the 13th international World Wide Web conference on Alternate track papers & posters
Year:
2004

Abstract

While we expect to discover knowledge in the texts available on the Web, such discovery usually requires many complex analysis steps, most of which require different text handling operations such as similar text search or text clustering. Drawing an analogy from the relational data model, we propose a text representation model that simplifies these steps. The model represents texts in a formal manner as subject graphs, described herein, and provides text handling operations whose inputs and outputs are identical in form, i.e., a set of subject graphs. We develop a graph-based text database, which is based on the model, and an interactive knowledge discovery system. Trials of the system show that it allows the user to interactively and intuitively discover knowledge in Web pages by combining text handling operations defined on subject graphs in various orders.
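
The closure property described in the abstract (every operation takes a set of subject graphs and returns a set of subject graphs) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the abstract does not define the structure of a subject graph, so the `SubjectGraph` class here simply uses weighted terms as nodes and adjacent-word co-occurrence as edges, and the names `build_graph`, `similar_search`, and `select_by_term` as well as the cosine similarity over node weights are hypothetical.

```python
import math
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class SubjectGraph:
    """Hypothetical subject graph: weighted terms and weighted term pairs."""
    doc_id: str
    nodes: Dict[str, float]              # term -> weight (assumed: raw frequency)
    edges: Dict[Tuple[str, str], float]  # sorted term pair -> co-occurrence count


def build_graph(doc_id: str, text: str) -> SubjectGraph:
    """Build a toy graph: nodes are terms, edges link adjacent distinct terms."""
    tokens = text.lower().split()
    nodes = {term: float(count) for term, count in Counter(tokens).items()}
    edges: Dict[Tuple[str, str], float] = {}
    for a, b in zip(tokens, tokens[1:]):
        if a != b:
            key = (a, b) if a < b else (b, a)
            edges[key] = edges.get(key, 0.0) + 1.0
    return SubjectGraph(doc_id, nodes, edges)


def similarity(a: SubjectGraph, b: SubjectGraph) -> float:
    """Cosine similarity over node weights only (an assumed measure)."""
    dot = sum(w * b.nodes.get(t, 0.0) for t, w in a.nodes.items())
    norm_a = math.sqrt(sum(w * w for w in a.nodes.values()))
    norm_b = math.sqrt(sum(w * w for w in b.nodes.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def similar_search(graphs: List[SubjectGraph], query: SubjectGraph,
                   threshold: float) -> List[SubjectGraph]:
    """Graphs in, graphs out: keep those similar enough to the query."""
    return [g for g in graphs if similarity(g, query) >= threshold]


def select_by_term(graphs: List[SubjectGraph], term: str) -> List[SubjectGraph]:
    """Graphs in, graphs out: keep those that contain the given term."""
    return [g for g in graphs if term in g.nodes]


# Because inputs and outputs share one form, operations chain in any order.
docs = {
    "d1": "graph database stores text",
    "d2": "graph model represents text",
    "d3": "cooking pasta recipes",
}
graphs = [build_graph(doc_id, text) for doc_id, text in docs.items()]
query = build_graph("q", "graph text")

hits = similar_search(graphs, query, threshold=0.3)
result = select_by_term(hits, "database")
print([g.doc_id for g in result])
```

In this sketch the two operations could be applied in the opposite order and still compose, which mirrors how relational operators consume and produce relations. A clustering operation would fit the same pattern if each cluster were itself represented as a subject graph, though the abstract does not say how the authors handle that case.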