Automatic acquisition of long-distance acronym definitions

  • Authors: Manuel Zahariev
  • Affiliations: Logic and Functional Programming Group, School of Computing Sciences, Simon Fraser University, Burnaby, B.C., Canada
  • Venue: Design and application of hybrid intelligent systems
  • Year: 2003

Abstract

Acronyms are a very dynamic area of the lexicon of many languages. A hybrid, modular methodology for the acquisition of acronyms is presented, which uses an existing acronym-expansion matching component and machine learning in two separate phases for the identification of long-distance acronym definition patterns. The resulting system, using Support Vector Machines (SVM), is trained on 600 news stories from the Wall Street Journal component of the Penn Treebank corpus using a number of lexical, syntactic, and acronym-expansion matching features. Statistical co-occurrence information for acronym-expansion pairs is extracted from search engine "hit counts". The system achieves an F-measure (β = 1) of 92.38% on 400 news stories from the same source and has good asymptotic efficiency, making it adequate for the automatic extraction of acronyms even from noisy sources, such as newspaper text.
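
For readers unfamiliar with the evaluation measure quoted above, the F-measure with β = 1 is the harmonic mean of precision and recall. The following is a minimal sketch of the standard formula, not the author's evaluation code; the function name `f_beta` and the example values are illustrative assumptions, and the abstract does not report the precision and recall behind the 92.38% figure.

```python
def f_beta(precision: float, recall: float, beta: float = 1.0) -> float:
    """Standard F-beta measure: weighted harmonic mean of precision and recall.

    With beta = 1 both are weighted equally (the usual F1 score).
    Hypothetical helper for illustration; not taken from the paper.
    """
    if not (0.0 <= precision <= 1.0 and 0.0 <= recall <= 1.0):
        raise ValueError("precision and recall must lie in [0, 1]")
    if beta <= 0:
        raise ValueError("beta must be positive")
    b2 = beta * beta
    denominator = b2 * precision + recall
    if denominator == 0.0:
        # Both precision and recall are zero: define the score as 0.
        return 0.0
    return (1.0 + b2) * precision * recall / denominator


if __name__ == "__main__":
    # Illustrative values only; they are not results from the paper.
    print(f_beta(0.5, 0.5))  # equal precision and recall give the same value back: 0.5
```

Because the harmonic mean is dominated by the smaller of its two arguments, a high F-measure requires both precision and recall to be high.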