The Computer Journal
Communications of the ACM - Special issue on internetworking
DocBook: The Definitive Guide with CD-ROM
We describe an ongoing documentation project for Nahuatl, an indigenous language of Mexico. While we follow standard recommendations for documenting text corpora and the dictionary, those recommendations are not explicit about the grammar. Because Nahuatl is an agglutinating language, the morphological component of its grammar is highly complex. Accordingly, we consider it essential to provide not only static information about the language, such as a lexicon and parsed text, but also dynamic documentation in the form of a working morphological grammar. When compiled into a finite state transducer, this grammar parses arbitrary inflected forms, including many not attested in the corpus, and generates partial or full inflectional paradigms. In keeping with the archival goals of language documentation, we argue that this grammar should be simultaneously human readable and computer processable, so that it can be re-implemented in future computational tools. The notion of literate computing provides the appropriate paradigm for these dual goals.
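To make the idea of a grammar that both parses and generates concrete, the following is a minimal sketch in Python. It is not the project's grammar and not a compiled finite state transducer: it is a plain lookup-based relation between surface forms and analyses. The tag names, the function names, and the two-stem lexicon are assumptions introduced here for illustration; the forms are simplified Classical Nahuatl-style subject prefixes and are not drawn from the project's corpus.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lexicon (illustrative only, not the project's data).
# Subject prefixes: tag -> surface string; 3SG is unmarked (empty prefix).
SUBJECT_PREFIXES: Dict[str, str] = {"1SG": "ni", "2SG": "ti", "3SG": ""}
# Verb stems: surface stem -> English gloss.
VERB_STEMS: Dict[str, str] = {"choca": "cry", "cochi": "sleep"}


def generate(stem: str, subject: str) -> str:
    """Map an analysis (stem + subject tag) to a surface form."""
    if stem not in VERB_STEMS:
        raise KeyError(f"unknown stem: {stem}")
    return SUBJECT_PREFIXES[subject] + stem


def analyze(surface: str) -> List[Tuple[str, str]]:
    """Map a surface form to every (subject tag, stem) analysis it admits."""
    analyses: List[Tuple[str, str]] = []
    for tag, prefix in SUBJECT_PREFIXES.items():
        if surface.startswith(prefix):
            stem = surface[len(prefix):]
            if stem in VERB_STEMS:
                analyses.append((tag, stem))
    return analyses


def paradigm(stem: str) -> Dict[str, str]:
    """Generate the full (toy) subject paradigm for one stem."""
    return {tag: generate(stem, tag) for tag in SUBJECT_PREFIXES}


if __name__ == "__main__":
    print(analyze("nichoca"))
    print(paradigm("cochi"))
```

The same lexicon drives both directions, which is the property a transducer offers: `analyze` handles any form the rules license, whether or not it was ever listed, and `paradigm` enumerates the forms of a stem. A real grammar would additionally need phonological rules and many more affix slots, which this sketch omits.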