An Annotated Corpus and a Grammar Model of Theorem Description

  • Authors:
  • Yusuke Baba; Masakazu Suzuki

  • Venue:
  • MKM '03: Proceedings of the Second International Conference on Mathematical Knowledge Management
  • Year:
  • 2003

Abstract

Document digitization is becoming increasingly common in many fields, and there is growing interest in enabling computers to understand the contents of digitized documents. Since the early 1990s, natural language processing research has developed around large annotated corpora such as the Penn Treebank. Applying the methods of corpus-based research, we built a syntactically annotated corpus of theorem descriptions from a book on set theory and extracted a grammar model of theorems from that corpus, as a first step toward computer understanding of mathematical documents.
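
As a rough illustration of what extracting a grammar model from a syntactically annotated corpus can look like, the sketch below reads Penn-Treebank-style bracketed trees, counts the production rules they contain, and turns the counts into relative-frequency rule probabilities. This is a generic treebank-grammar technique, not necessarily the procedure used in the paper. The two toy sentences, their part-of-speech labels, and the function names (parse_tree, productions, extract_grammar) are all assumptions made for this example.

```python
from collections import Counter

# Hypothetical toy corpus in bracketed notation; not taken from the paper.
TOY_CORPUS = [
    "(S (NP (DT every) (NN set)) (VP (VBZ has) (NP (DT a) (NN subset))))",
    "(S (NP (NN x)) (VP (VBZ is) (NP (DT a) (NN set))))",
]


def parse_tree(text):
    """Parse one bracketed string into nested (label, children) tuples."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        if tokens[pos] != "(":
            raise ValueError("expected '('")
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read())      # nested constituent
            else:
                children.append(tokens[pos])  # leaf word
                pos += 1
        pos += 1  # consume ')'
        return (label, children)

    tree = read()
    if pos != len(tokens):
        raise ValueError("trailing tokens after the tree")
    return tree


def productions(tree):
    """Yield (lhs, rhs) rules for every node of the tree, top-down."""
    label, children = tree
    rhs = tuple(c[0] if isinstance(c, tuple) else c for c in children)
    yield (label, rhs)
    for child in children:
        if isinstance(child, tuple):
            yield from productions(child)


def extract_grammar(bracketed_trees):
    """Map each rule to its relative frequency among rules with the same lhs."""
    counts = Counter()
    for text in bracketed_trees:
        counts.update(productions(parse_tree(text)))
    lhs_totals = Counter()
    for (lhs, _), n in counts.items():
        lhs_totals[lhs] += n
    return {rule: n / lhs_totals[rule[0]] for rule, n in counts.items()}


if __name__ == "__main__":
    grammar = extract_grammar(TOY_CORPUS)
    for (lhs, rhs), p in sorted(grammar.items()):
        print(f"{lhs} -> {' '.join(rhs)}  [{p:.2f}]")
```

Relative-frequency estimation is the simplest choice here; a real corpus would typically call for smoothing and for special handling of mathematical expressions, neither of which this sketch attempts.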