ARIES: A lexical platform for engineering Spanish processing tools

  • Authors:
  • José M. Goñi; José C. González; Antonio Moreno

  • Affiliations:
  • E.T.S.I. Telecomunicación, Universidad Politécnica de Madrid, 28040 Madrid, Spain (e-mail: jmg@mat.upm.es, jcg@gsi.dit.upm.es); Dept. de Lingüística, Universidad Autónoma de Madrid, Cantoblanco, Madrid, Spain (e-mail: sandoval@lola.lllf.uam.es)

  • Venue:
  • Natural Language Engineering
  • Year:
  • 1997

Abstract

We present a lexical platform developed for the Spanish language. It is portable across computer systems and efficient in terms of both speed and lexical coverage. A model for the full treatment of Spanish inflectional morphology for verbs, nouns and adjectives is presented. This model permits word formation based solely on morpheme concatenation, driven by a feature-based unification grammar. The run-time lexicon is a collection of allomorphs for both stems and endings. Although it has not been tested on them, the model should also be suitable for other Romance and highly inflected languages. A formalism is also described for encoding a lemma-based lexical source, well suited for expressing linguistic generalizations: inheritance classes, lemma encoding, morpho-graphemic allomorphy rules and limited type-checking. From this source base, we can automatically generate an allomorph-indexed dictionary suited to efficient retrieval and processing. A set of software tools has been implemented around this formalism: aids for augmenting the lexical base, lexical compilers that build run-time dictionaries together with access libraries for them, feature manipulation libraries, unification and pseudo-unification modules, morphological processors, a parsing system, etc. The software interfaces among the different modules and tools are cleanly defined, so that tools can be integrated and combined flexibly. Directions for accessing our e-mail and web demonstration prototypes are also provided. Some figures are given, showing the lexical coverage of our platform compared to some popular spelling checkers.
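To make the idea of word formation by concatenation under unification more concrete, the following is a minimal sketch in Python. It is not the ARIES formalism or code: the feature names (`stress`, `person`, `number`), the toy lexicon entries and the functions `unify` and `analyze` are all assumptions made for illustration. A word is accepted only if it splits into a stem allomorph and an ending allomorph whose feature sets unify.

```python
from typing import Dict, List, Optional

Features = Dict[str, str]


def unify(a: Features, b: Features) -> Optional[Features]:
    """Merge two flat feature sets; return None if a shared feature conflicts."""
    merged = dict(a)
    for key, value in b.items():
        if merged.get(key, value) != value:
            return None
        merged[key] = value
    return merged


# Hypothetical allomorph-indexed run-time lexicon (illustrative entries only).
# "contar" has two stem allomorphs, selected by where the stress falls.
STEMS: Dict[str, List[Features]] = {
    "cant": [{"lemma": "cantar", "cat": "verb"}],
    "cuent": [{"lemma": "contar", "cat": "verb", "stress": "stem"}],
    "cont": [{"lemma": "contar", "cat": "verb", "stress": "ending"}],
}
ENDINGS: Dict[str, List[Features]] = {
    "o": [{"cat": "verb", "stress": "stem", "person": "1", "number": "sg"}],
    "amos": [{"cat": "verb", "stress": "ending", "person": "1", "number": "pl"}],
}


def analyze(word: str) -> List[Features]:
    """Try every stem/ending split and keep the pairs whose features unify."""
    analyses: List[Features] = []
    for cut in range(1, len(word)):
        stem, ending = word[:cut], word[cut:]
        for stem_feats in STEMS.get(stem, []):
            for ending_feats in ENDINGS.get(ending, []):
                result = unify(stem_feats, ending_feats)
                if result is not None:
                    analyses.append(result)
    return analyses


if __name__ == "__main__":
    for form in ("cuento", "contamos", "conto", "cuentamos"):
        print(form, analyze(form))
```

In this toy setup, `cuento` and `contamos` each receive one analysis, while `conto` and `cuentamos` are rejected because the `stress` features of stem and ending clash. No spelling rules are applied at analysis time; the allomorphy is stored in the lexicon, which is the design choice the abstract describes.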