This paper presents a method for diacritics restoration based on learning mechanisms that operate at the letter level. To our knowledge this technique is new, and we compare it with the well-known diacritics restoration techniques that learn from words. Our method is particularly useful for languages that lack large electronic dictionaries and for which a means of generalizing beyond words is required. Accuracies of over 99% at the letter level are reported.
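
To make the letter-level idea concrete, the following is a minimal sketch and not the method evaluated in the paper. The abstract does not name the learner or the features, so the sketch assumes a simple stand-in: a count table keyed on a window of surrounding diacritic-free letters, with backoff to narrower windows. The names `strip_char`, `context`, `train`, `restore`, the padding symbol, and the window size `max_k` are all hypothetical.

```python
import unicodedata
from collections import Counter, defaultdict

PAD = "_"  # hypothetical padding symbol for positions beyond the text edges


def strip_char(ch):
    """Return the base letter of a single character, e.g. 'ă' -> 'a'."""
    return unicodedata.normalize("NFD", ch)[0]


def context(stripped, i, k):
    """Window of k diacritic-free letters on each side of position i."""
    padded = PAD * k + stripped + PAD * k
    return padded[i:i + 2 * k + 1]


def train(text, max_k=3):
    """Count which original letter occurs in each stripped-letter context.

    Assumption: a plain frequency table stands in for the learner;
    the abstract does not specify one.
    """
    text = unicodedata.normalize("NFC", text)
    stripped = "".join(strip_char(c) for c in text)
    model = defaultdict(Counter)
    for i, target in enumerate(text):
        for k in range(max_k + 1):
            model[(k, context(stripped, i, k))][target] += 1
    return model


def restore(model, stripped, max_k=3):
    """Restore diacritics letter by letter, backing off to narrower windows."""
    out = []
    for i, ch in enumerate(stripped):
        choice = ch  # keep the letter unchanged if no context was ever seen
        for k in range(max_k, -1, -1):
            counts = model.get((k, context(stripped, i, k)))
            if counts:
                choice = counts.most_common(1)[0][0]
                break
        out.append(choice)
    return "".join(out)


model = train("fată și țară")
print(restore(model, "fata si tara"))  # fată și țară
```

Because every decision is made per letter from its neighbouring letters, a sketch of this kind needs no word list or dictionary, which is the property the abstract highlights for resource-scarce languages.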