Real-word spelling correction using Google Web IT 3-grams

  • Authors:
  • Aminul Islam;Diana Inkpen

  • Affiliations:
  • University of Ottawa, Ottawa, ON, Canada;University of Ottawa, Ottawa, ON, Canada

  • Venue:
  • EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 3 - Volume 3
  • Year:
  • 2009

Quantified Score

Hi-index 0.00

Visualization

Abstract

We present a method for detecting and correcting multiple real-word spelling errors using the Google Web IT 3-gram data set and a normalized and modified version of the Longest Common Subsequence (LCS) string matching algorithm. Our method is focused mainly on how to improve the detection recall (the fraction of errors correctly detected) and the correction recall (the fraction of errors correctly amended), while keeping the respective precisions (the fraction of detections or amendments that are correct) as high as possible. Evaluation results on a standard data set show that our method outperforms two other methods on the same task.