Real-word spelling correction using Google Web 1T n-gram data set

  • Authors: Aminul Islam; Diana Inkpen
  • Affiliations: University of Ottawa, Ottawa, ON, Canada (both authors)
  • Venue: Proceedings of the 18th ACM Conference on Information and Knowledge Management
  • Year: 2009

Abstract

We present a method for correcting real-word spelling errors using the Google Web 1T n-gram data set and a normalized and modified version of the Longest Common Subsequence (LCS) string matching algorithm. Our method is focused mainly on how to improve the correction recall (the fraction of errors corrected) while keeping the correction precision (the fraction of suggestions that are correct) as high as possible. Evaluation results on a standard data set show that our method performs very well.
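
The abstract names two ingredients that a short sketch can make concrete: a normalized LCS string similarity, and the recall and precision measures as defined above. The Python below is only an illustration under stated assumptions. The abstract does not give the normalization formula, so the sketch assumes one common choice (squared LCS length divided by the product of the two string lengths), and it does not attempt the paper's "modified" LCS variants or the n-gram lookup. All function names are hypothetical.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b (row-by-row DP)."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def normalized_lcs(a: str, b: str) -> float:
    """LCS similarity in [0, 1].

    Assumption: normalize as len(LCS)^2 / (len(a) * len(b)); the abstract
    does not state which normalization the paper uses.
    """
    if not a or not b:
        return 0.0
    n = lcs_length(a, b)
    return (n * n) / (len(a) * len(b))


def correction_recall(correct_suggestions: int, total_errors: int) -> float:
    """Fraction of errors that were corrected."""
    return correct_suggestions / total_errors if total_errors else 0.0


def correction_precision(correct_suggestions: int, total_suggestions: int) -> float:
    """Fraction of suggestions that were correct."""
    return correct_suggestions / total_suggestions if total_suggestions else 0.0
```

For example, `normalized_lcs("their", "there")` scores the pair by their shared subsequence "ther", so a candidate replacement that differs from the observed word by only a letter or two receives a high similarity. In a full system such a score would presumably be combined with n-gram frequencies to rank candidates, but that combination is not specified in the abstract.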