Building an abbreviation dictionary using a term recognition approach

Authors:
Naoaki Okazaki;Sophia Ananiadou
Affiliations:
Graduate School of Information Science and Technology, The University of Tokyo 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8651, Japan;School of Computer Science, The University of Manchester Oxford Road, Manchester, M13 9PL, UK
Venue:
Bioinformatics
Year:
2006

Citing 0
Cited 9

Kleio: a knowledge-enriched information retrieval system for biology
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval

Abbreviation Disambiguation: Experiments with Various Variants of the One Sense per Discourse Hypothesis
NLDB '08 Proceedings of the 13th international conference on Natural Language and Information Systems: Applications of Natural Language to Information Systems

A discriminative alignment model for abbreviation recognition
COLING '08 Proceedings of the 22nd International Conference on Computational Linguistics - Volume 1

Semi-supervised lexicon mining from parenthetical expressions in monolingual web pages
NAACL '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics

Abbreviation generation for Japanese multi-word expressions
MWE '09 Proceedings of the Workshop on Multiword Expressions: Identification, Interpretation, Disambiguation and Applications

Personalized query expansion in the QIC system
Proceedings of the 12th ACM/IEEE-CS joint conference on Digital Libraries

SubSift web services and workflows for profiling and comparing scientists and their published works
Future Generation Computer Systems

Methodological Review: Biomedical text mining and its applications in cancer research
Journal of Biomedical Informatics

Learning Abbreviations from Chinese and English Terms by Modeling Non-Local Information
ACM Transactions on Asian Language Information Processing (TALIP)

Quantified Score

Hi-index	3.84

Abstract

Motivation: Acronyms result from a highly productive type of term variation and trigger the need for an acronym dictionary to establish associations between acronyms and their expanded forms.

Results: We propose a novel method for recognizing acronym definitions in a text collection. Assuming a word sequence co-occurring frequently with a parenthetical expression to be a potential expanded form, our method identifies acronym definitions in a similar manner to the statistical term recognition task. Applied to the whole MEDLINE (7 811 582 abstracts), the implemented system extracted 886 755 acronym candidates and recognized 300 954 expanded forms in reasonable time. Our method outperformed baseline systems, achieving 99% precision and 82-95% recall on our evaluation corpus that roughly emulates the whole MEDLINE.

Availability and Supplementary information: The implementations and supplementary information are available at our web site: http://www.chokkan.org/research/acromine/

Contact: okazaki@mi.ci.i.u-tokyo.ac.jp
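
The core idea in the abstract, that a word sequence co-occurring frequently with a parenthetical expression is a potential expanded form, can be illustrated with a small frequency count. The Python sketch below is only an illustration under simplifying assumptions and is not the authors' implementation: the function names, the regular expression for parenthetical expressions, the `max_words` window, and the tie-breaking rule (prefer the longest among the most frequent sequences) are all hypothetical choices made for this example, and the abstract does not specify the actual scoring used.

```python
import re
from collections import Counter, defaultdict

# Assumption: an acronym candidate is a short alphanumeric token in parentheses.
PAREN = re.compile(r"\(([A-Za-z][A-Za-z0-9-]{1,9})\)")
WORD = re.compile(r"[A-Za-z0-9-]+")


def candidate_expansions(texts, max_words=5):
    """Count word sequences that immediately precede each parenthetical token."""
    counts = defaultdict(Counter)
    for text in texts:
        for match in PAREN.finditer(text):
            acronym = match.group(1)
            # Lower-cased words to the left of the opening parenthesis.
            left = WORD.findall(text[:match.start()].lower())
            # Every suffix of the left context (1..max_words words) is a candidate.
            for n in range(1, min(max_words, len(left)) + 1):
                counts[acronym][" ".join(left[-n:])] += 1
    return counts


def best_expansion(counter, min_freq=2):
    """Pick the most frequent candidate; break ties by preferring more words."""
    frequent = [(seq, freq) for seq, freq in counter.items() if freq >= min_freq]
    if not frequent:
        return None
    return max(frequent, key=lambda item: (item[1], len(item[0].split())))[0]
```

Given several sentences that each contain "hidden Markov model (HMM)" with different preceding words, `best_expansion(candidate_expansions(texts)["HMM"])` selects "hidden markov model", because the longer sequences that include the varying preceding words occur only once each.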