Word prediction is the problem of guessing which words are likely to follow a given text segment, typically by displaying a list of the most probable words for that position. In this research, we designed and implemented three word predictors for Persian. Our baseline is a statistical system that uses language models. The first system uses word statistics; the second uses the main syntactic categories of a Persian POS-tagged corpus; and the third uses the main syntactic categories along with their morphological, syntactic and semantic subcategories. Using KeyStroke Saving (KSS) as the primary metric for evaluating system performance, the baseline word-based statistical system achieved 37% KSS, and the second system, which combined the main syntactic categories with word statistics, achieved 38.95% KSS. Our last system, which used all of the available information about the words, achieved the best result with 42.45% KSS.
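The abstract does not spell out how KSS is computed. A common formulation, assumed here, is the percentage of keystrokes a user avoids compared with typing every character: KSS = 100 × (total characters − keys pressed) / total characters. The following is a minimal sketch of that idea using a toy word-frequency predictor on English words. The function names, the list size `n`, and the one-key selection cost are illustrative assumptions, not the authors' evaluation setup.

```python
from collections import Counter


def top_predictions(counts, prefix, n=5):
    """Return up to n most frequent known words that start with prefix."""
    candidates = [(w, c) for w, c in counts.items() if w.startswith(prefix)]
    # Most frequent first; ties broken alphabetically for determinism.
    candidates.sort(key=lambda wc: (-wc[1], wc[0]))
    return [w for w, _ in candidates[:n]]


def keys_for_word(counts, word, n=5):
    """Keystrokes needed to enter one word (assumed cost model).

    The user types letters one at a time; as soon as the word appears in
    the prediction list, one extra key selects it. If it never appears,
    every letter is typed.
    """
    for typed in range(len(word)):
        if word in top_predictions(counts, word[:typed], n):
            return typed + 1  # letters typed so far + one selection key
    return len(word)


def keystroke_saving(counts, words, n=5):
    """KSS as a percentage over a list of words (spaces are ignored)."""
    total = sum(len(w) for w in words)
    if total == 0:
        raise ValueError("words must contain at least one character")
    pressed = sum(keys_for_word(counts, w, n) for w in words)
    return 100.0 * (total - pressed) / total


# Toy training text, not the Persian corpus used in the paper.
counts = Counter("the cat sat on the mat the cat".split())
print(keystroke_saving(counts, ["the", "cat"], n=1))  # 50.0
```

With a one-item prediction list, "the" is offered before any letter is typed (1 key instead of 3) and "cat" after typing "c" (2 keys instead of 3), so 3 of 6 keystrokes are saved. A predictor that also used syntactic categories, as in the second and third systems, would change only how `top_predictions` ranks candidates.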