Automatic Word Spacing Using Probabilistic Models Based on Character n-grams

  • Authors: Do-Gil Lee, Hae-Chang Rim, Dongsuk Yook
  • Affiliation: Korea University (all authors)
  • Venue: IEEE Intelligent Systems
  • Year: 2007

Abstract

Automatic word spacing determines the correct boundaries between words in a sentence. Word spacing is important in Korean, where spacing errors are frequent. The authors propose several probabilistic word-spacing models that resolve problems with previous statistical approaches. These models treat automatic word spacing as a classification problem similar to part-of-speech tagging. By generalizing hidden Markov models, they can consider a broader context and estimate probabilities more accurately. The authors tested the models under a wide range of conditions, compared them with the state of the art, and performed a detailed error analysis.
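
To make the classification framing concrete, the following is a minimal sketch of word spacing as character tagging with a plain first-order hidden Markov model. It is not the authors' models: the abstract says those generalize HMMs to use a broader context, whereas this baseline conditions only on the previous tag and the current character. The names `to_tagged`, `apply_tags`, and `HmmSpacer`, the add-one smoothing, and the convention that tag 1 means "a space follows this character" are all assumptions made for illustration.

```python
import math
from collections import Counter


def to_tagged(sentence):
    """Split a spaced sentence into characters and tags (1 = a space follows)."""
    chars, tags = [], []
    for ch in sentence.strip():
        if ch == " ":
            if tags:
                tags[-1] = 1
        else:
            chars.append(ch)
            tags.append(0)
    return chars, tags


def apply_tags(chars, tags):
    """Rebuild a string, inserting a space after every character tagged 1."""
    out = []
    for ch, tag in zip(chars, tags):
        out.append(ch)
        if tag == 1:
            out.append(" ")
    return "".join(out).strip()


class HmmSpacer:
    """Hypothetical baseline: first-order HMM over spacing tags {0, 1}."""

    def __init__(self):
        self.trans = Counter()      # (previous tag, tag) counts
        self.emit = Counter()       # (tag, character) counts
        self.tag_count = Counter()  # tag counts
        self.vocab = set()

    def train(self, sentences):
        for sentence in sentences:
            chars, tags = to_tagged(sentence)
            prev = 1  # assumption: a sentence start behaves like a word boundary
            for ch, tag in zip(chars, tags):
                self.trans[(prev, tag)] += 1
                self.emit[(tag, ch)] += 1
                self.tag_count[tag] += 1
                self.vocab.add(ch)
                prev = tag

    def _log_trans(self, prev, tag):
        total = self.trans[(prev, 0)] + self.trans[(prev, 1)]
        return math.log((self.trans[(prev, tag)] + 1) / (total + 2))

    def _log_emit(self, tag, ch):
        # Add-one smoothing with one extra slot for unseen characters.
        denom = self.tag_count[tag] + len(self.vocab) + 1
        return math.log((self.emit[(tag, ch)] + 1) / denom)

    def space(self, text):
        """Return text re-spaced with the most probable tag sequence (Viterbi)."""
        chars = [c for c in text if c != " "]
        if not chars:
            return ""
        best = {t: self._log_trans(1, t) + self._log_emit(t, chars[0])
                for t in (0, 1)}
        back = []
        for ch in chars[1:]:
            step, ptr = {}, {}
            for t in (0, 1):
                prev = max((0, 1), key=lambda p: best[p] + self._log_trans(p, t))
                step[t] = best[prev] + self._log_trans(prev, t) + self._log_emit(t, ch)
                ptr[t] = prev
            best = step
            back.append(ptr)
        # Follow back-pointers from the best final tag.
        tag = max((0, 1), key=lambda t: best[t])
        tags = [tag]
        for ptr in reversed(back):
            tag = ptr[tag]
            tags.append(tag)
        tags.reverse()
        return apply_tags(chars, tags)
```

A usage sketch, assuming a small corpus of correctly spaced sentences is available:

```python
model = HmmSpacer()
model.train(["the cat sat", "the dog sat", "a cat ran"])
print(model.space("thedogran"))
```

A model this small will space poorly on real text. Conditioning on wider character n-gram context, as the abstract describes, is the kind of extension that would replace `_log_trans` and `_log_emit` here.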