Textual Data Mining to Support Science and Technology Management

  • Authors:
  • Paul Losiewicz; Douglas W. Oard; Ronald N. Kostoff

  • Affiliations:
  • ACS Defense, Inc., Rome, NY. losiewiczp@rl.af.mil; College of Library and Information Services, University of Maryland, College Park, MD. oard@glue.umd.edu; Office of Naval Research, Arlington, VA. kostofr@onr.navy.mil

  • Venue:
  • Journal of Intelligent Information Systems
  • Year:
  • 2000

Abstract

This paper surveys applications of data mining techniques to large text collections, and illustrates how those techniques can be used to support the management of science and technology research. Specific issues that arise repeatedly in the conduct of research management are described, and a textual data mining architecture that extends a classic paradigm for knowledge discovery in databases is introduced. That architecture integrates information retrieval from text collections, information extraction to obtain data from individual texts, data warehousing for the extracted data, data mining to discover useful patterns in the data, and visualization of the resulting patterns. At the core of this architecture is a broad view of data mining—the process of discovering patterns in large collections of data—and that step is described in some detail. The final section of the paper illustrates how these ideas can be applied in practice, drawing upon examples from the recently completed first phase of the textual data mining program at the Office of Naval Research. The paper concludes by identifying some research directions that offer significant potential for improving the utility of textual data mining for research management applications.