A Mixed Trigrams Approach for Context Sensitive Spell Checking

  • Authors:
  • Davide Fossati;Barbara Di Eugenio

  • Affiliations:
  • Department of Computer Science, University of Illinois at Chicago, Chicago, IL, USA;Department of Computer Science, University of Illinois at Chicago, Chicago, IL, USA

  • Venue:
  • CICLing '07 Proceedings of the 8th International Conference on Computational Linguistics and Intelligent Text Processing
  • Year:
  • 2007
  • Real-Word typo detection

    NLDB'09 Proceedings of the 14th international conference on Applications of Natural Language to Information Systems

Abstract

This paper addresses the problem of real-word spell checking, i.e., the detection and correction of typos that result in real words of the target language. It proposes a methodology based on a mixed trigrams language model, which has been implemented, trained, and tested with data from the Penn Treebank. The approach has been evaluated in terms of hit rate, false positive rate, and coverage. The experiments show promising hit rates for both detection and correction, although the false positive rate is still high.