This paper describes an example-based correction component for Japanese word segmentation and part of speech labelling (AMED), and a way of combining it with a pre-existing rule-based Japanese morphological analyzer and a probabilistic part of speech tagger.

Statistical algorithms rely on the frequency of phenomena or events in corpora; however, low-frequency events are often inadequately represented. Here we report on an example-based technique for finding word segments and their parts of speech in Japanese text. Rather than using hand-crafted rules, the algorithm employs example data, drawing generalizations during training.
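To make the idea of example-based correction concrete, the following is a minimal sketch, not the AMED algorithm itself, whose details the abstract does not give. It assumes training data in the form of pairs of analyzer output and hand-corrected output, each a list of (word, tag) tuples. The function names `learn_corrections` and `apply_corrections`, the tag labels, and the toy sentence are all hypothetical. Unlike the technique described above, this sketch only memorizes the corrections it has seen and does not generalize from them.

```python
from collections import Counter, defaultdict
from difflib import SequenceMatcher

Token = tuple[str, str]  # (surface form, part-of-speech tag); tag names are illustrative


def learn_corrections(examples):
    """Collect rewrite patterns from (analyzer_output, corrected_output) pairs.

    Each differing span found by aligning the two token sequences becomes a
    candidate rule; the most frequent correction for each span is kept.
    """
    counts = defaultdict(Counter)
    for predicted, gold in examples:
        matcher = SequenceMatcher(a=predicted, b=gold, autojunk=False)
        for op, i1, i2, j1, j2 in matcher.get_opcodes():
            if op == "replace":
                counts[tuple(predicted[i1:i2])][tuple(gold[j1:j2])] += 1
    return {src: c.most_common(1)[0][0] for src, c in counts.items()}


def apply_corrections(tokens, rules):
    """Rewrite analyzer output left to right, preferring the longest matching rule."""
    max_len = max((len(src) for src in rules), default=0)
    out, i = [], 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            window = tuple(tokens[i:i + n])
            if window in rules:
                out.extend(rules[window])
                i += n
                break
        else:
            # No learned pattern starts here: keep the analyzer's token.
            out.append(tokens[i])
            i += 1
    return out


# Toy training pair (made up for illustration): the analyzer splits one
# compound into three words, and the corrected version has two.
predicted = [("外国", "N"), ("人参", "N"), ("政権", "N"), ("を", "P")]
gold = [("外国人", "N"), ("参政権", "N"), ("を", "P")]

rules = learn_corrections([(predicted, gold)])
new_sentence = [("外国", "N"), ("人参", "N"), ("政権", "N"), ("は", "P")]
print(apply_corrections(new_sentence, rules))
```

Aligning the two token sequences with `difflib.SequenceMatcher` lets a single rule cover both a segmentation change and a tag change, because the rule maps a span of analyzer tokens to a span of corrected tokens of possibly different length. A component of this kind would sit after the morphological analyzer and tagger, touching only the spans it has learned to rewrite.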