Acronyms are a very dynamic area of the lexicon of many languages. A hybrid, modular methodology for the acquisition of acronyms is presented, which uses an existing acronym-expansion matching component and applies machine learning in two separate phases to identify long-distance acronym definition patterns. The resulting system, which uses Support Vector Machines (SVMs), is trained on 600 news stories from the Wall Street Journal component of the Penn Treebank corpus using a number of lexical, syntactic, and acronym-expansion matching features. Statistical co-occurrence information for acronym-expansion pairs is extracted from search engine "hit counts". The system achieves an F-score (β = 1) of 92.38% on 400 news stories from the same source and has good asymptotic efficiency, making it suitable for the automatic extraction of acronyms even from noisy sources, such as newspaper text.
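For readers unfamiliar with the reported metric, the F-score with β = 1 is the harmonic mean of precision and recall. The following is a minimal sketch of the standard definition, not the evaluation code used for the system; the function names and the example counts are hypothetical and chosen only for illustration.

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Precision and recall from true-positive, false-positive and
    false-negative counts (e.g. extracted acronym-expansion pairs)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def f_beta(precision: float, recall: float, beta: float = 1.0) -> float:
    """Standard F-measure: (1 + beta^2) * P * R / (beta^2 * P + R).

    With beta = 1 this reduces to the harmonic mean of P and R.
    """
    b2 = beta * beta
    denom = b2 * precision + recall
    if denom == 0:
        return 0.0
    return (1 + b2) * precision * recall / denom


if __name__ == "__main__":
    # Hypothetical counts, NOT taken from the paper's evaluation.
    p, r = precision_recall(tp=90, fp=10, fn=10)
    print(f"precision={p:.4f} recall={r:.4f} F1={f_beta(p, r):.4f}")
```

Because β = 1 weights precision and recall equally, a single F-score does not reveal how the two trade off; both values are needed to tell whether errors are mostly missed definitions or spurious ones.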