Extracting and evaluating general world knowledge from the Brown corpus

  • Authors: Lenhart Schubert; Matthew Tong

  • Affiliations: University of Rochester; University of Rochester

  • Venue: HLT-NAACL-TEXTMEANING '03 Proceedings of the HLT-NAACL 2003 workshop on Text meaning - Volume 9
  • Year: 2003

Abstract

We have been developing techniques for extracting general world knowledge from miscellaneous texts by a process of approximate interpretation and abstraction, focusing initially on the Brown corpus. We apply interpretive rules to clausal patterns and patterns of modification, and concurrently abstract general "possibilistic" propositions from the resulting formulas. Two examples are "A person may believe a proposition", and "Children may live with relatives". Our methods currently yield over 117,000 such propositions (of variable quality) for the Brown corpus (more than 2 per sentence). We report here on our efforts to evaluate these results with a judging scheme aimed at determining how many of these propositions pass muster as "reasonable general claims" about the world in the opinion of human judges. We find that nearly 60% of the extracted propositions are favorably judged according to our scheme by any given judge. The percentage unanimously judged to be reasonable claims by multiple judges is lower, but still sufficiently high to suggest that our techniques may be of some use in tackling the long-standing "knowledge acquisition bottleneck" in AI.
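
The abstract distinguishes two measures: the share of propositions accepted by any given judge, and the lower share accepted unanimously by all judges. The following is a minimal sketch of how these two rates relate, not the authors' actual judging scheme. The judgment matrix, the function names `per_judge_rates` and `unanimous_rate`, and the reduction of each judgment to a simple accept/reject are all assumptions made for illustration; the data are invented and do not reproduce the paper's results.

```python
from statistics import mean

# Hypothetical judgments (invented for illustration, not from the paper).
# Each row is one extracted proposition; each column is one judge.
# True means the judge accepted it as a "reasonable general claim".
judgments = [
    [True, True, True],
    [True, False, True],
    [False, False, False],
    [True, True, False],
    [True, True, True],
]


def per_judge_rates(rows):
    """Fraction of propositions accepted by each judge, one value per judge."""
    n_judges = len(rows[0])
    return [sum(row[j] for row in rows) / len(rows) for j in range(n_judges)]


def unanimous_rate(rows):
    """Fraction of propositions accepted by every judge."""
    return sum(all(row) for row in rows) / len(rows)


rates = per_judge_rates(judgments)
average_rate = mean(rates)
unanimous = unanimous_rate(judgments)

print("per-judge acceptance rates:", rates)
print("average per-judge rate:", round(average_rate, 3))
print("unanimous acceptance rate:", unanimous)
```

A proposition counts toward the unanimous rate only if every judge accepts it, so that rate can never exceed the lowest individual judge's rate. This is consistent with the abstract's observation that the unanimous percentage is lower than the per-judge figure.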