Motivation: Acronyms result from a highly productive type of term variation and trigger the need for an acronym dictionary that establishes associations between acronyms and their expanded forms.

Results: We propose a novel method for recognizing acronym definitions in a text collection. Assuming that a word sequence co-occurring frequently with a parenthetical expression is a potential expanded form, our method identifies acronym definitions in a manner similar to the statistical term recognition task. Applied to the whole of MEDLINE (7 811 582 abstracts), the implemented system extracted 886 755 acronym candidates and recognized 300 954 expanded forms in reasonable time. Our method outperformed baseline systems, achieving 99% precision and 82-95% recall on our evaluation corpus, which roughly emulates the whole of MEDLINE.

Availability and Supplementary information: The implementations and supplementary information are available at our web site: http://www.chokkan.org/research/acromine/

Contact: okazaki@mi.ci.i.u-tokyo.ac.jp
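
The co-occurrence idea described under Results can be illustrated with a minimal Python sketch. It is not the authors' implementation: the regular expression, the function names `collect_candidates` and `best_candidate`, the `max_words` window, and the "longest sequence seen at least `min_count` times" selection rule are all assumptions made for illustration, and the abstract does not specify the actual scoring used.

```python
import re
from collections import Counter, defaultdict

# Hypothetical pattern for a parenthetical short form: 2 to 10 characters,
# starting with a letter. The real system's criteria are not given in the abstract.
PAREN = re.compile(r"\(([A-Za-z][A-Za-z0-9-]{1,9})\)")


def collect_candidates(texts, max_words=4):
    """Map each parenthetical short form to a Counter of the word
    sequences (1..max_words words) that immediately precede it."""
    counts = defaultdict(Counter)
    for text in texts:
        for match in PAREN.finditer(text):
            short = match.group(1)
            # Words appearing before the opening parenthesis, lowercased.
            words = re.findall(r"[A-Za-z0-9-]+", text[:match.start()].lower())
            for n in range(1, min(max_words, len(words)) + 1):
                counts[short][" ".join(words[-n:])] += 1
    return counts


def best_candidate(counter, min_count=2):
    """Illustrative selection rule only: the longest word sequence that
    co-occurs with the short form at least min_count times."""
    frequent = [seq for seq, k in counter.items() if k >= min_count]
    return max(frequent, key=lambda seq: len(seq.split()), default=None)


if __name__ == "__main__":
    texts = [
        "We studied the hidden Markov model (HMM) in depth.",
        "A hidden Markov model (HMM) was trained.",
        "The hidden Markov model (HMM) performed well.",
    ]
    counts = collect_candidates(texts)
    print(best_candidate(counts["HMM"], min_count=3))
```

A plain frequency threshold like this one is sensitive to the choice of `min_count`, since longer sequences that include a leading article can also recur; a statistical term recognition approach, as the abstract mentions, would weigh nested candidates against each other rather than rely on a fixed cutoff.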