Grading knowledge: extracting degree information from texts

  • Authors:
  • Steffen Staab

  • Affiliations:
  • Universität Karlsruhe, Institute for Applied Informatics and Formal Description Methods, Karlsruhe, Germany

  • Venue:
  • Grading knowledge: extracting degree information from texts
  • Year:
  • 1999

Abstract

"Text Knowledge Extraction" maps natural language texts onto a formal representation of the facts contained in the texts. Common text knowledge extraction methods show a severe lack of methods for understanding natural language "degree expressions", like "expensive hard disk drive" and "good monitor", which describe gradable properties like price and quality, respectively. However, without an adequate understanding of such degree expressions it is often impossible to grasp the central meaning of a text. This book shows concise and comprehensive concepts for extracting degree information from natural language texts. It researches this task with regard to the three levels of (i) analysing natural language degree expressions, (ii) representing them in a terminologic framework, and (iii) inferencing on them by constraint propagation. On each of these three levels, the author shows that former approaches to the degree understanding problem were too simplistic, since they ignored by and large the role of the background knowledge involved. Thus, he gives a constructive verification of his central hypothesis, viz. that the proper extraction of grading knowledge relies heavily on background grading knowledge. This construction proceeds as follows. First, the author gives an overview of the ParseTalk information extraction system. Then, from the review of relevant linguistic literature, the author derives two distinct categories of natural language degree expressions and proposes knowledge-intensive algorithms to handle their analyses in the ParseTalk system. These methods are applied to two text domains, viz. a medical diagnosis domain and a repository of texts from information technology magazines. Moreover, for inferencing the author generalizes from well-known constraint propagation mechanisms. This generalization is especially apt for representing and reasoning with natural language degree expressions, but it is also interesting from the point of view where it originated, viz. the field of temporal reasoning. The conclusion of the book gives an integration of all three levels of understanding showing that their coupling leads to an even more advanced -- and more efficient -- performance of the proposed mechanisms.