Using Multiattribute Prediction Suffix Graphs for Spanish Part-of-Speech Tagging

  • Authors:
  • José L. Triviño-Rodriguez; Rafael Morales Bueno

  • Venue:
  • IDA '01 Proceedings of the 4th International Conference on Advances in Intelligent Data Analysis
  • Year:
  • 2001

Abstract

This paper describes an implementation of a Spanish part-of-speech (POS) tagger. The implementation combines three basic approaches: a single-word tagger based on decision trees; a POS tagger based on a new learning model called the Multiattribute Prediction Suffix Graph; and a tag set represented as feature structures. Using decision trees for single-word tagging allows the tagger to work without a lexicon that merely enumerates possible tags. Moreover, it decreases the error rate because there are no unknown words. The feature-structure tag set is advantageous when the available training corpus is small and the tag set is large, which can be the case with morphologically rich languages such as Spanish. Finally, training the Multiattribute Prediction Suffix Graph model is more efficient than training traditional full-order Markov models, and the model achieves better accuracy.
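The point about feature-structure tags and small corpora can be illustrated with a minimal sketch. It is not the paper's implementation: the toy corpus, the attribute names (`cat`, `gender`, `number`), and the helpers `make_tag`, `atomic_tag_counts`, and `attribute_counts` are all assumptions made for illustration. The sketch only shows why counting per attribute gives denser statistics than counting whole tags when data is scarce.

```python
from collections import Counter


def make_tag(**features):
    """Build a hashable feature-structure tag (hypothetical representation)."""
    return tuple(sorted(features.items()))


# Toy corpus of (word, tag) pairs. The attributes and values are
# illustrative assumptions, not the tag set used in the paper.
corpus = [
    ("la", make_tag(cat="det", gender="fem", number="sg")),
    ("casa", make_tag(cat="noun", gender="fem", number="sg")),
    ("los", make_tag(cat="det", gender="masc", number="pl")),
    ("libros", make_tag(cat="noun", gender="masc", number="pl")),
]


def atomic_tag_counts(pairs):
    """Count each complete tag as one indivisible symbol."""
    return Counter(tag for _, tag in pairs)


def attribute_counts(pairs):
    """Count the values of each attribute separately."""
    counts = {}
    for _, tag in pairs:
        for attr, value in tag:
            counts.setdefault(attr, Counter())[value] += 1
    return counts


atomic = atomic_tag_counts(corpus)
per_attr = attribute_counts(corpus)

# Every complete tag occurs only once in this tiny corpus, while every
# attribute value occurs twice, so per-attribute estimates rest on more data.
print("distinct atomic tags:", len(atomic))
for attr, values in sorted(per_attr.items()):
    print(attr, dict(values))
```

In this toy setting the four complete tags are each observed once, whereas each attribute has only two values, each observed twice. The same effect grows with a large tag set, which is the situation the abstract describes for morphologically rich languages.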