This paper presents an n-gram based approach to Chinese abbreviation expansion. We distinguish reduced abbreviations from non-reduced abbreviations, which are created by elimination or generalization. For a reduced abbreviation, a mapping table is compiled that maps each of its short-words to a set of long-words, and a bigram-based Viterbi algorithm is then applied to decode the most appropriate combination of long-words as its full form. For a non-reduced abbreviation, a dictionary of non-reduced abbreviation/full-form pairs is used to generate expansion candidates, and a disambiguation technique based on bigram word segmentation then selects a proper expansion. Evaluated on an abbreviation-expanded corpus built from the PKU corpus, the proposed system achieved an average recall of 82.9% and an average precision of 85.5% across different types of abbreviations in Chinese news text.
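
The decoding step for reduced abbreviations can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function name `viterbi_expand`, the `floor` score for unseen bigrams, and the toy mapping table and log-probabilities are all assumptions made for illustration. The sketch assumes the abbreviation has already been split into short-words and that bigram scores are available as log-probabilities.

```python
from typing import Dict, List, Tuple

START = "<s>"  # hypothetical sentence-start symbol


def viterbi_expand(
    short_words: List[str],
    mapping: Dict[str, List[str]],
    bigram_logprob: Dict[Tuple[str, str], float],
    floor: float = -20.0,  # assumed score for unseen bigrams
) -> List[str]:
    """Pick the long-word sequence with the highest bigram log-probability."""
    # best[w] = (score of the best path ending in w, that path)
    best: Dict[str, Tuple[float, List[str]]] = {START: (0.0, [])}
    for short in short_words:
        # Fall back to the short-word itself if the table has no entry.
        candidates = mapping.get(short, [short])
        new_best: Dict[str, Tuple[float, List[str]]] = {}
        for cand in candidates:
            # Choose the best predecessor for this candidate long-word.
            score, path = max(
                (
                    (prev_score + bigram_logprob.get((prev, cand), floor), prev_path)
                    for prev, (prev_score, prev_path) in best.items()
                ),
                key=lambda item: item[0],
            )
            new_best[cand] = (score, path + [cand])
        best = new_best
    # Return the path with the highest final score.
    return max(best.values(), key=lambda item: item[0])[1]


# Toy data, invented for illustration only.
toy_mapping = {"北": ["北京", "北方"], "大": ["大学", "大会"]}
toy_bigrams = {
    (START, "北京"): -1.0,
    (START, "北方"): -2.0,
    ("北京", "大学"): -0.5,
    ("北方", "大学"): -3.0,
    ("北京", "大会"): -4.0,
}

expansion = viterbi_expand(["北", "大"], toy_mapping, toy_bigrams)
print("".join(expansion))
```

With these toy scores the decoder prefers 北京 followed by 大学. A real system would estimate the bigram model from a segmented corpus and would typically use smoothing rather than a fixed floor.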