Mining Text Using Keyword Distributions

  • Authors:
  • Ronen Feldman;Ido Dagan;Haym Hirsh

  • Affiliations:
  • Department of Mathematics and Computer Science, Bar-Ilan University, Ramat-Gan, ISRAEL. E-mail: feldman@cs.biu.ac.il, dagan@cs.biu.ac.il;Department of Mathematics and Computer Science, Bar-Ilan University, Ramat-Gan, ISRAEL. E-mail: feldman@cs.biu.ac.il, dagan@cs.biu.ac.il;Department of Computer Science, Rutgers University, Piscataway, NJ USA 08855. E-mail: hirsh@cs.rutgers.edu

  • Venue:
  • Journal of Intelligent Information Systems
  • Year:
  • 1998

Abstract

Knowledge Discovery in Databases (KDD) focuses on the computerized exploration of large amounts of data and on the discovery of interesting patterns within them. While most work on KDD has been concerned with structured databases, there has been little work on handling the huge amount of information that is available only in unstructured textual form. This paper describes the KDT system for Knowledge Discovery in Text, in which documents are labeled by keywords, and knowledge discovery is performed by analyzing the co-occurrence frequencies of the various keywords labeling the documents. We show how this keyword-frequency approach supports a range of KDD operations, providing a suitable foundation for knowledge discovery and exploration for collections of unstructured text.
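
To make the keyword co-occurrence idea concrete, the following is a minimal Python sketch and not the KDT system itself. It assumes each document is represented only by its set of keyword labels. The function names `cooccurrence_counts` and `keyword_distribution`, and the toy collection `docs`, are hypothetical and chosen purely for illustration; the abstract does not specify how KDT computes or stores these frequencies.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(labeled_docs):
    """Count how many documents carry each unordered pair of keywords."""
    pair_counts = Counter()
    for keywords in labeled_docs:
        # Sort the de-duplicated labels so each pair has one canonical order.
        for pair in combinations(sorted(set(keywords)), 2):
            pair_counts[pair] += 1
    return pair_counts


def keyword_distribution(labeled_docs, condition):
    """Among documents labeled with `condition`, return the proportion
    that also carry each other keyword (an illustrative assumption,
    not necessarily the paper's exact definition)."""
    selected = [set(k) for k in labeled_docs if condition in k]
    if not selected:
        return {}
    counts = Counter(kw for doc in selected for kw in doc if kw != condition)
    return {kw: n / len(selected) for kw, n in counts.items()}


# Hypothetical toy collection: each document is just its keyword labels.
docs = [
    {"iran", "oil"},
    {"iran", "oil", "trade"},
    {"usa", "trade"},
    {"usa", "oil"},
]

pairs = cooccurrence_counts(docs)
dist = keyword_distribution(docs, "iran")
```

Here `pairs` maps each keyword pair to the number of documents labeled with both keywords, and `dist` gives, for documents labeled "iran", the share that also carry each other label. Comparing such distributions across conditions is one plausible way to surface the kind of patterns the abstract refers to.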