Categorizing Unknown Words: A Decision Tree-Based Misspelling Identifier

  • Authors:
  • Janine Toole

  • Affiliations:
  • -

  • Venue:
  • AI '99 Proceedings of the 12th Australian Joint Conference on Artificial Intelligence: Advanced Topics in Artificial Intelligence
  • Year:
  • 1999

Quantified Score

Hi-index 0.00

Visualization

Abstract

This paper introduces a robust, portable system for categorizing unknown words. It is based on a multi- component architecture where each component is responsible for identifying one class of unknown words. The focus of this paper is the component that identifies spelling errors. The misspelling identifier uses a decision tree architecture to combine multiple types of evidence about the unknown word. The misspelling identifier is evaluated using data from live closed captions - a gem-e replete with a wide variety of unknown words.