XUXEN: A Spelling Checker/Corrector for Basque Based on Two-Level Morphology

  • Authors:
  • E. Agirre; I. Alegria; X. Arregi; X. Artola; A. Díaz de Ilarraza; M. Maritxalar; K. Sarasola; M. Urkia

  • Affiliations:
  • Informatika Fakultatea, Donostia (Basque Country - Spain), for the first seven authors; U.Z.E.I., Donostia (Basque Country), for M. Urkia

  • Venue:
  • ANLC '92 Proceedings of the third conference on Applied natural language processing
  • Year:
  • 1992

Abstract

This paper describes the application of the two-level morphology formalism to Basque and its use in building the XUXEN spelling checker/corrector. The application is intended to cover a large part of the language.

Because Basque is a highly inflected language, spelling checking and correction have been conceived as a by-product of a general-purpose morphological analyzer/generator. This analyzer serves as a basic tool for current and future work on the automatic processing of Basque.

An extension to continuation class specifications is proposed in order to deal with long-distance dependencies. It consists basically of two features added to the standard formalism, which allow the lexicon builder to make the interdependencies of morphemes explicit.

User lexicons can be interactively enriched with new entries, enabling the checker from then on to recognize all the possible inflected forms derived from them.

Because the language was standardized late, writers do not always know the standard form to use and make errors. These "typical errors" are treated specifically by describing them within the two-level lexicon system. In this sense, XUXEN is intended as a useful tool for the standardization of present-day written Basque.
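
The idea of a continuation-class lexicon, and of enriching a user lexicon so that all inflected forms of a new entry are recognized, can be illustrated with a minimal sketch. This is not XUXEN's lexicon or implementation: the class names (`Root`, `NounSuffix`, `Case`, `End`), the helper functions, and the handful of entries are hypothetical, chosen only for illustration. The sketch also assumes that surface and lexical forms coincide, so it omits the two-level rules and the proposed extension for long-distance dependencies.

```python
# Toy continuation-class lexicon. Every entry and class name here is an
# illustrative assumption, not taken from XUXEN.
# Each sublexicon maps to (morpheme, next continuation class) pairs.
LEXICONS = {
    "Root": [("etxe", "NounSuffix"), ("mendi", "NounSuffix")],
    "NounSuffix": [("", "End"), ("a", "Case"), ("ak", "End")],
    "Case": [("", "End"), ("n", "End"), ("ren", "End")],
}


def accepts(word, lexicon="Root", table=LEXICONS):
    """Return True if `word` can be segmented into morphemes by following
    continuation classes from `lexicon` until the special class "End"."""
    if lexicon == "End":
        return word == ""
    for morpheme, next_class in table[lexicon]:
        if word.startswith(morpheme) and accepts(
            word[len(morpheme):], next_class, table
        ):
            return True
    return False


def add_user_entry(stem, continuation, table=LEXICONS):
    """Add a new stem to the root sublexicon (hypothetical user-lexicon hook).
    The stem inherits every suffix reachable from `continuation`."""
    table["Root"].append((stem, continuation))


if __name__ == "__main__":
    print(accepts("etxean"))      # stem + "a" + "n"
    print(accepts("etxeka"))      # no valid segmentation
    add_user_entry("liburu", "NounSuffix")
    print(accepts("liburuaren"))  # new stem, existing suffix chain
```

Because a new stem is only linked to an existing continuation class, one added entry makes every suffix sequence reachable from that class acceptable, which is the effect the abstract describes for user lexicons.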