Morphologic Non-Word Error Detection

  • Authors: Stephane Bressan; Riky Irawan

  • Affiliations: National University of Singapore; National University of Singapore

  • Venue: DEXA '04 Proceedings of the Database and Expert Systems Applications, 15th International Workshop
  • Year: 2004

Abstract

Writing and sending emails and short messages (SMS) has become one of the most pervasive activities of daily life. Whether emails from our computers, short messages from our portable phones, or both from our portable digital assistants, there is no occasion that does not deserve a text message: "Dear Colleagues, please find attached to this email ... ", "C U at 9pm?", "Forget to turn return DVD", etc. Many have warned that the typos, misspellings, grammatical errors and other linguistic indelicacies commonly accepted in these messages herald the decadence of human languages and communication. One answer to these reservations and criticisms is to provide tools for the automatic detection and correction of such errors. In this paper, we address the problem of detecting non-words. We propose and evaluate two families of new methods, based on extended n-grams and on morphemes, respectively. We show that most of the methods we propose perform better than the state-of-the-art technique.
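
To make the task concrete, here is a minimal sketch of a plain character-trigram non-word check. It is only a generic baseline for illustration: it is not the extended n-gram or morpheme methods proposed in the paper, whose details the abstract does not give. The function names, the `^`/`$` boundary markers, and the tiny `LEXICON` are assumptions chosen for this example.

```python
def char_ngrams(word, n=3):
    """Return the character n-grams of a word, padded with boundary markers."""
    padded = "^" + word.lower() + "$"  # "^" and "$" are illustrative markers
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def build_ngram_set(lexicon, n=3):
    """Collect every character n-gram that occurs in the known words."""
    known = set()
    for word in lexicon:
        known.update(char_ngrams(word, n))
    return known


def is_suspect_non_word(word, known, n=3):
    """Flag a word if any of its n-grams never occurs in the lexicon."""
    return any(gram not in known for gram in char_ngrams(word, n))


# Toy lexicon, assumed for the example only.
LEXICON = ["forget", "return", "the", "dvd", "to", "turn"]
KNOWN = build_ngram_set(LEXICON)

print(is_suspect_non_word("return", KNOWN))  # all trigrams are known
print(is_suspect_non_word("retrun", KNOWN))  # contains the unseen trigram "etr"
```

A check of this kind needs no full dictionary lookup at detection time, but with a small lexicon it will also flag valid words whose n-grams it has never seen, and it will miss non-words assembled entirely from known n-grams.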