Language documentation: the Nahuatl grammar

  • Authors:
  • Mike Maxwell; Jonathan D. Amith

  • Affiliations:
  • Linguistic Data Consortium; Gettysburg College

  • Venue:
  • CICLing'05: Proceedings of the 6th International Conference on Computational Linguistics and Intelligent Text Processing
  • Year:
  • 2005

Abstract

We describe an ongoing documentation project for Nahuatl, an indigenous language of Mexico. While we follow standard recommendations for documenting text corpora and the dictionary, those recommendations are not explicit concerning the grammar. Since Nahuatl is an agglutinating language, the morphological component of the grammar is highly complex. Accordingly, we consider it essential to provide not only static information about the language, such as a lexicon and parsed text, but also dynamic documentation in the form of a working morphological grammar. When compiled into a finite state transducer, this grammar provides parses for arbitrary inflected forms, including many not in the corpus, and generates partial or full inflectional paradigms. In keeping with the archival goals of language documentation, we argue that this grammar should be simultaneously human readable and computer processable, so that it will be re-implementable in future computational tools. The notion of literate computing provides the appropriate paradigm for these dual goals.
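
As a rough illustration of what "dynamic documentation" could mean in practice, the sketch below uses a single declarative grammar both to generate a paradigm and to parse surface forms, including forms that no corpus would need to list. It is a plain-Python stand-in, not a finite state transducer and not the authors' grammar. The stems, affixes, feature tags, and function names are all hypothetical placeholders and are not real Nahuatl morphemes.

```python
from itertools import product

# Hypothetical toy grammar. None of these strings are real Nahuatl
# morphemes; they are placeholders chosen only for illustration.
PREFIX_SLOT = {"1SG": "na", "2SG": "ta", "3SG": ""}   # subject prefixes
SUFFIX_SLOT = {"PRES": "", "PAST": "k", "FUT": "s"}   # tense suffixes
STEMS = {"run": "paka", "see": "mito"}                # lemma -> stem


def generate(lemma):
    """Return the full paradigm of a lemma as {analysis: surface form}."""
    stem = STEMS[lemma]
    paradigm = {}
    for (subj, prefix), (tense, suffix) in product(
        PREFIX_SLOT.items(), SUFFIX_SLOT.items()
    ):
        paradigm[f"{lemma}+{subj}+{tense}"] = prefix + stem + suffix
    return paradigm


def parse(surface):
    """Return every analysis whose generated form equals the surface string.

    Parsing is done here by generating and comparing, which is only
    practical for a toy grammar; a compiled transducer would instead run
    the same mapping in the reverse direction.
    """
    return sorted(
        analysis
        for lemma in STEMS
        for analysis, form in generate(lemma).items()
        if form == surface
    )


if __name__ == "__main__":
    for analysis, form in generate("run").items():
        print(f"{form:10s} {analysis}")
    print(parse("mitos"))
```

The point of the design is that the grammar tables are the only language-specific part: the same data drives both directions, which is the property that makes a working grammar re-implementable in other tools.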