Querying Linguistic Treebanks with Monadic Second-Order Logic in Linear Time

  • Authors:
  • Stephan Kepser

  • Affiliations:
  • SFB 441, University of Tübingen, Germany

  • Venue:
  • Journal of Logic, Language and Information
  • Year:
  • 2004

Quantified Score

Hi-index 0.00

Visualization

Abstract

In recent years large amounts of electronic texts have become available. While the first of these corpora had only a low level of annotation, the more recent ones are annotated with refined syntactic information. To make these rich annotations accessible for linguists, the development of query systems has become an important goal. One of the main difficulties in this task consists in the choice of the right query language, a language which at the same time should be powerful enough to let users formulate the queries they want and which should be efficiently evaluable to keep query response times short. There is a widespread belief that such a query language does not exist. It is therefore the aim of this paper to show that there is indeed a powerful query language that can be efficiently evaluated. We propose the use of monadic second-order logic as a query language. We show that a query in this language can be evaluated in linear time in the size of a tree in the corpus. We also provide examples of complicated linguistic queries expressed in monadic second-order logic thereby demonstrating the high expressive power of the language.