Diacritics Restoration: Learning from Letters versus Learning from Words

  • Authors: Rada Mihalcea

  • Affiliations: -

  • Venue: CICLing '02: Proceedings of the Third International Conference on Computational Linguistics and Intelligent Text Processing
  • Year: 2002

Abstract

This paper presents a method for diacritics restoration based on learning mechanisms that act at the letter level. To our knowledge, this technique is new, and we compare it with the well-known techniques for diacritics restoration that learn from words. Our method is particularly useful for languages that lack large electronic dictionaries and that require a means of generalizing beyond words. We report letter-level accuracies of over 99%.
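
To make the letter-level idea concrete, here is a minimal sketch in Python. It is an illustration only and not the learner used in the paper: it assumes a simple frequency lookup over windows of surrounding undiacritized letters, backing off to narrower windows when a context was never seen. The names strip_diacritics, context, train, restore, and the max_k parameter are all hypothetical.

```python
import unicodedata
from collections import Counter, defaultdict

PAD = "_"  # hypothetical padding symbol for positions beyond the text edges


def strip_diacritics(text):
    """Map each character to its base letter (one output char per input char)."""
    nfc = unicodedata.normalize("NFC", text)
    return "".join(unicodedata.normalize("NFD", ch)[0] for ch in nfc)


def context(plain, i, k):
    """Return the letter at position i together with k letters on each side."""
    padded = PAD * k + plain + PAD * k
    return padded[i : i + 2 * k + 1]


def train(text, max_k=3):
    """Count which original letter occurs for each undiacritized letter window."""
    original = unicodedata.normalize("NFC", text.lower())
    plain = strip_diacritics(original)
    model = defaultdict(Counter)
    for i, letter in enumerate(original):
        for k in range(max_k + 1):
            model[context(plain, i, k)][letter] += 1
    return model


def restore(plain, model, max_k=3):
    """For each letter, use the most frequent label of the widest known window."""
    out = []
    for i, ch in enumerate(plain):
        for k in range(max_k, -1, -1):  # back off from wide to narrow contexts
            seen = model.get(context(plain, i, k))
            if seen:
                ch = seen.most_common(1)[0][0]
                break
        out.append(ch)
    return "".join(out)


sentence = "fată în casă și țară"
model = train(sentence)
restored = restore(strip_diacritics(sentence), model)
```

Because every decision is made per letter from its neighbours, the sketch needs no dictionary and can label letters inside words it has never seen, which is the kind of generalization beyond words that the abstract refers to. On the tiny training sentence above, restoring its own undiacritized form recovers the original text; real use would require a much larger diacritized corpus.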