A Simple Spanish Part of Speech Tagger for Detection and Correction of Accentuation Error

Authors:
Sofia N. Galicia-Haro;Igor A. Bolshakov;Alexander F. Gelbukh
Venue:
TSD '99 Proceedings of the Second International Workshop on Text, Speech and Dialogue
Year:
1999

Abstract

One of the most frequent kinds of typographic error specific to Spanish concerns accentuation: the omission of an obligatory stress mark or the insertion of a superfluous one. If such an error turns one word into another existing word, it cannot be detected by ordinary spell-checkers, since detecting it requires some analysis of the context. A simple procedure is proposed for this task. It relies on (1) some simple heuristics that determine the linear context and (2) a small list of word pairs that differ only in the accent mark. This idea is applied to the numerous nouns and adjectives, such as número, that turn into quasi-homonymous personal verb forms when they lose their stress marks.
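The two ingredients named in the abstract, a list of accent pairs and heuristics over the linear context, can be illustrated with a minimal Python sketch. The abstract does not state the actual heuristics or word list, so everything here is an assumption made for illustration: the pairs in `ACCENT_PAIRS`, the cue sets `NOUN_CUES` and `VERB_CUES`, the function name `check_accents`, and the rule of looking only at the immediately preceding word. It should not be read as the authors' procedure.

```python
import re

# Hypothetical pair list: accented noun/adjective -> unaccented verb form.
# These entries are illustrative assumptions, not the list used in the paper.
ACCENT_PAIRS = {
    "número": "numero",      # "number" (noun) vs. "I number"
    "público": "publico",    # "public" vs. "I publish"
    "término": "termino",    # "term" vs. "I finish"
    "práctica": "practica",  # "practice" (noun) vs. "he/she practises"
}
UNACCENTED_TO_ACCENTED = {verb: noun for noun, verb in ACCENT_PAIRS.items()}

# Assumed context cues: a preceding determiner suggests a noun or adjective,
# a preceding subject pronoun, negation, or clitic suggests a verb form.
NOUN_CUES = {"el", "un", "una", "este", "esta", "ese", "su", "mi", "del", "al", "cada"}
VERB_CUES = {"yo", "no", "me", "te", "se", "lo", "él", "ella"}


def check_accents(sentence):
    """Return (token_index, found_word, suggested_word) for suspicious words.

    Only the immediately preceding token is inspected, which is a deliberately
    crude stand-in for the "linear context" heuristics.
    """
    tokens = re.findall(r"\w+", sentence.lower())
    suggestions = []
    for i, token in enumerate(tokens):
        previous = tokens[i - 1] if i > 0 else None
        if token in UNACCENTED_TO_ACCENTED and previous in NOUN_CUES:
            # Verb form found where a noun or adjective is expected.
            suggestions.append((i, token, UNACCENTED_TO_ACCENTED[token]))
        elif token in ACCENT_PAIRS and previous in VERB_CUES:
            # Noun or adjective found where a verb form is expected.
            suggestions.append((i, token, ACCENT_PAIRS[token]))
    return suggestions


print(check_accents("El numero de casos es alto"))
print(check_accents("Yo número las páginas"))
```

In this sketch, a word is flagged only when it belongs to the pair list and the preceding word contradicts its part of speech; words outside the list, or with no cue before them, are left alone. A real tagger would need a larger pair list, a wider context window, and consistent Unicode normalization of the input.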