Building probabilistic models for natural language
An Efficient, Probabilistically Sound Algorithm for Segmentation and Word Discovery
Machine Learning - Special issue on natural language learning
Foundations of statistical natural language processing
Foundations of Computational Linguistics: Man-Machine Communication in Natural Language
In this paper, we propose a novel method of building a language model for open-vocabulary Korean word recognition. Due to the complex morphology of Korean, it is inappropriate to use lexicons based on linguistic entities such as words and morphemes in open-vocabulary domains. Instead, we build the lexicon by collecting variable-length character sequences from raw texts using a dynamic Bayesian network model of the language. In simulated word recognition experiments, the proposed language model found the correct word in lattices of character candidates in 94.3% of cases, increasing the word recognition rate by 20.9%.
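The core idea of building a lexicon from variable-length character sequences in raw text can be sketched as follows. This is a simplified frequency-based stand-in, not the paper's dynamic Bayesian network model; the function name, length cap, and count threshold are illustrative assumptions.

```python
from collections import Counter

def collect_candidate_units(corpus, max_len=4, min_count=2):
    """Collect variable-length character sequences from raw text
    as candidate lexicon entries, keeping those seen often enough.

    Simplified illustration only: the paper selects units with a
    dynamic Bayesian network language model, whereas this sketch
    uses raw substring frequency with ad hoc thresholds.
    """
    counts = Counter()
    for line in corpus:
        text = line.replace(" ", "")  # treat input as an unsegmented character stream
        for i in range(len(text)):
            for n in range(1, max_len + 1):
                seq = text[i:i + n]
                if len(seq) == n:  # skip truncated sequences at the end of the line
                    counts[seq] += 1
    return {seq: c for seq, c in counts.items() if c >= min_count}

# Toy usage: recurring character sequences become lexicon candidates.
corpus = ["the cat sat", "the cat ran"]
lexicon = collect_candidate_units(corpus)
```

In an open-vocabulary setting, such data-derived units sidestep the need for a hand-built word or morpheme lexicon: any word can be recovered as a concatenation of the collected character sequences.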