Real-word spelling correction with trigrams: a reconsideration of the Mays, Damerau, and Mercer model

  • Authors:
  • Amber Wilcox-O'Hearn; Graeme Hirst; Alexander Budanitsky

  • Affiliations:
  • Department of Computer Science, University of Toronto, Toronto, Ontario, Canada; Department of Computer Science, University of Toronto, Toronto, Ontario, Canada; Department of Computer Science, University of Toronto, Toronto, Ontario, Canada

  • Venue:
  • CICLing'08: Proceedings of the 9th International Conference on Computational Linguistics and Intelligent Text Processing
  • Year:
  • 2008

Abstract

The trigram-based noisy-channel model of real-word spelling-error correction that was presented by Mays, Damerau, and Mercer in 1991 has never been adequately evaluated or compared with other methods. We analyze the advantages and limitations of the method, and present a new evaluation that enables a meaningful comparison with the WordNet-based method of Hirst and Budanitsky. The trigram method is found to be superior, even on content words. We then show that optimizing over sentences gives better results than variants of the algorithm that optimize over fixed-length windows.
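To make the idea concrete, here is a minimal Python sketch of a trigram-based noisy-channel corrector in the spirit of the model described above. It is not the authors' implementation. Several things are assumptions made only for illustration: the toy corpus, the hand-written `VARIANTS` table of real-word confusables, add-one smoothing, the restriction to at most one substitution per sentence, and a channel model in which each word is typed as intended with probability `alpha`, with the remaining mass split evenly among its variants. All function names are hypothetical. Candidates are scored over the whole sentence rather than over a fixed-length window.

```python
import math
from collections import Counter

def train(sentences):
    """Count trigrams and their two-word contexts over padded sentences."""
    tri, ctx, vocab = Counter(), Counter(), set()
    for sent in sentences:
        toks = ["<s>", "<s>"] + list(sent) + ["</s>"]
        vocab.update(toks)
        for i in range(2, len(toks)):
            tri[tuple(toks[i - 2:i + 1])] += 1
            ctx[tuple(toks[i - 2:i])] += 1
    return tri, ctx, len(vocab)

def sentence_logprob(sent, tri, ctx, vocab_size):
    """Trigram log-probability with add-one smoothing (an assumption here)."""
    toks = ["<s>", "<s>"] + list(sent) + ["</s>"]
    lp = 0.0
    for i in range(2, len(toks)):
        num = tri[tuple(toks[i - 2:i + 1])] + 1
        den = ctx[tuple(toks[i - 2:i])] + vocab_size
        lp += math.log(num / den)
    return lp

def correct(sent, variants, tri, ctx, vocab_size, alpha=0.99):
    """Return the best-scoring sentence that differs from `sent` in at most one word.

    Score = log P(candidate) + log P(observed | candidate), where each
    unchanged word contributes log(alpha) and a substituted word v
    contributes log((1 - alpha) / number of variants of v).
    """
    sent = list(sent)
    n = len(sent)
    best = sent
    best_score = sentence_logprob(sent, tri, ctx, vocab_size) + n * math.log(alpha)
    for i, w in enumerate(sent):
        for v in variants.get(w, ()):
            cand = sent[:i] + [v] + sent[i + 1:]
            n_variants = len(variants.get(v, [w]))
            channel = (n - 1) * math.log(alpha) + math.log((1 - alpha) / n_variants)
            score = sentence_logprob(cand, tri, ctx, vocab_size) + channel
            if score > best_score:  # strict: ties keep the sentence as typed
                best, best_score = cand, score
    return best

# Toy data, purely illustrative.
CORPUS = [
    ["a", "piece", "of", "cake"],
    ["a", "piece", "of", "pie"],
    ["peace", "and", "quiet"],
] * 5
VARIANTS = {"peace": ["piece"], "piece": ["peace"]}

tri, ctx, vocab_size = train(CORPUS)
print(correct(["a", "peace", "of", "cake"], VARIANTS, tri, ctx, vocab_size))
```

The parameter `alpha` controls how readily the sketch overrides what was typed: a value closer to 1 demands stronger language-model evidence before a substitution is accepted, so on this toy corpus raising it from 0.99 to 0.999 leaves the input sentence unchanged.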