Using Statistical Methods to Improve Knowledge-Based News Categorization

  • Authors: Paul S. Jacobs

  • Venue: IEEE Expert: Intelligent Systems and Their Applications

  • Year: 1993

Abstract

The article discusses NLDB, a knowledge-based system that automatically categorizes news stories for dissemination, retrieval, and browsing. The major knowledge-based component of NLDB is a lexicosemantic pattern matcher that identifies combinations of words and phrases, as well as more complex patterns. These more complex patterns involve word roots, grammatical categories, and semantic structures, such as verbs describing classes of events. It is shown that this linguistic analysis outperforms statistical methods. Because building lexicosemantic patterns can be a laborious process, a set of statistical methods is developed that automates pattern acquisition while preserving the benefits of a knowledge-based approach.
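
To make the idea of a lexicosemantic pattern more concrete, the following is a minimal illustrative sketch in Python. It is not NLDB's implementation, which the abstract does not describe in detail. The names `Token`, `Slot`, `matches`, and `categorize`, the category labels, and the semantic class labels are all hypothetical, and the tokens are assumed to arrive already annotated with a word root, a grammatical category, and an optional semantic class.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Token:
    """A word annotated with its root, grammatical category, and semantic class (hypothetical)."""
    text: str
    root: str
    pos: str
    sem: Optional[str] = None


@dataclass(frozen=True)
class Slot:
    """One pattern position; a field left as None places no constraint on the token."""
    root: Optional[str] = None
    pos: Optional[str] = None
    sem: Optional[str] = None

    def accepts(self, tok: Token) -> bool:
        return ((self.root is None or self.root == tok.root)
                and (self.pos is None or self.pos == tok.pos)
                and (self.sem is None or self.sem == tok.sem))


def matches(pattern: List[Slot], tokens: List[Token]) -> bool:
    """Return True if the pattern matches some contiguous run of tokens."""
    n = len(pattern)
    if n == 0:
        return False  # an empty pattern is treated as matching nothing
    return any(
        all(slot.accepts(tok) for slot, tok in zip(pattern, tokens[i:i + n]))
        for i in range(len(tokens) - n + 1)
    )


def categorize(tokens: List[Token], rules: Dict[str, List[Slot]]) -> List[str]:
    """Return every category whose pattern matches the story."""
    return [category for category, pattern in rules.items()
            if matches(pattern, tokens)]


# Hypothetical annotated story and hand-built rules, for illustration only.
story = [
    Token("Acme", "acme", "NOUN", "company"),
    Token("acquired", "acquire", "VERB", "takeover-event"),
    Token("Widgets", "widgets", "NOUN", "company"),
]
rules = {
    "mergers": [Slot(sem="company"),
                Slot(pos="VERB", sem="takeover-event"),
                Slot(sem="company")],
    "earnings": [Slot(root="profit"), Slot(pos="VERB")],
}
print(categorize(story, rules))
```

In this sketch the "mergers" rule relies on a semantic class of verbs rather than on any particular word, which is the kind of generalization the abstract attributes to semantic structures. Writing such rules by hand is the laborious step that the statistical pattern acquisition described in the abstract is meant to reduce.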