Unknown words such as proper nouns, abbreviations, and acronyms are a major obstacle in text processing. Abbreviations, in particular, are difficult to read and process because they are often domain specific. In this paper, we propose a method for automatically expanding abbreviations using context and character information. Previous studies searched dictionaries for abbreviation expansion candidates (candidate words for the original form of an abbreviation). Instead of a dictionary, we use a corpus from the same field that contains few abbreviations. We score the adequacy of each expansion candidate by the similarity between the context of the target abbreviation and the context of the candidate. The similarity is calculated using a vector space model in which the vector elements are the words surrounding the target abbreviation and those surrounding its expansion candidate. Experiments using approximately 10,000 documents in the field of aviation showed that the accuracy of the proposed method is 10% higher than that of previously developed methods.
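The context-similarity idea can be illustrated with a minimal sketch. It is not the implementation evaluated in the paper: the window size, raw word counts as vector weights, cosine similarity as the measure, the ordered-subsequence test standing in for "character information", and all function names and sample sentences are assumptions made for illustration only.

```python
import math
import re
from collections import Counter


def context_vector(documents, term, window=3):
    """Count the words found within `window` tokens of each occurrence of `term`."""
    counts = Counter()
    for doc in documents:
        tokens = re.findall(r"[a-z0-9]+", doc.lower())
        for i, tok in enumerate(tokens):
            if tok == term:
                left = tokens[max(0, i - window):i]
                right = tokens[i + 1:i + 1 + window]
                counts.update(left + right)
    return counts


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[w] * v[w] for w in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def is_subsequence(abbr, word):
    """Assumed character filter: the abbreviation's letters appear in order in the word."""
    remaining = iter(word)
    return all(ch in remaining for ch in abbr)


def rank_candidates(abbr, abbr_docs, corpus_docs, candidates, window=3):
    """Rank candidates that pass the character filter by context similarity."""
    abbr_vec = context_vector(abbr_docs, abbr, window)
    scored = []
    for cand in candidates:
        if not is_subsequence(abbr, cand):
            continue
        cand_vec = context_vector(corpus_docs, cand, window)
        scored.append((cosine(abbr_vec, cand_vec), cand))
    return sorted(scored, reverse=True)


# Hypothetical sample data, not taken from the paper's aviation corpus.
abbr_docs = ["replaced eng oil filter during inspection"]
corpus_docs = [
    "replaced engine oil filter after flight",
    "the engineer signed the inspection report",
]
ranked = rank_candidates("eng", abbr_docs, corpus_docs, ["engine", "engineer", "wing"])
print(ranked)
```

In this toy data, "engine" shares three of its four context words with "eng" and is ranked first, "engineer" passes the character filter but shares no context words, and "wing" is discarded by the character filter before any context comparison.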