Automatic extraction of acronym definitions from the Web

  • Authors:
  • David Sánchez; David Isern

  • Affiliations:
  • Department of Computer Science and Mathematics, Intelligent Technologies for Advanced Knowledge Acquisition (ITAKA) Research Group, University Rovira i Virgili, Tarragona, Spain; Department of Computer Science and Mathematics, Intelligent Technologies for Advanced Knowledge Acquisition (ITAKA) Research Group, University Rovira i Virgili, Tarragona, Spain

  • Venue:
  • Applied Intelligence
  • Year:
  • 2011

Abstract

Acronyms are widely used to abbreviate and stress important concepts. Discovering the definitions associated with an acronym is important for supporting language processing and knowledge-related tasks such as information retrieval, ontology mapping or question answering. Acronyms represent a very dynamic and unbounded topic that is constantly evolving. Manual attempts to compose a global-scale dictionary of acronym-definition pairs require an overwhelming amount of work and yield limited results. To address these shortcomings, this paper presents an automatic and unsupervised methodology to generate acronyms and extract their potential definitions from the Web. The method has been designed to minimise the set of constraints, offering a domain-independent and (partially) language-independent solution, and to exploit the Web in order to create large and general acronym-definition sets. Results have been manually evaluated against the largest manually built acronym repository: Acronym Finder. The evaluation shows that the proposed approach is able to improve on the coverage of manual attempts while maintaining high precision.
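
As a rough illustration of what pairing an acronym with a candidate definition can involve, the following Python sketch applies a simple initial-letter heuristic to a text snippet. It is not the methodology of the paper, which the abstract does not detail; the function name `candidate_definitions`, the heuristic itself and the sample sentence are assumptions made purely for illustration.

```python
import re


def candidate_definitions(acronym: str, text: str) -> list[str]:
    """Return word sequences in `text` whose initials spell `acronym`.

    Hypothetical helper: a naive initial-letter heuristic, not the
    method described in the paper.
    """
    letters = acronym.lower()
    n = len(letters)
    # Split the snippet into alphabetic words (Unicode letters only).
    words = re.findall(r"[^\W\d_]+", text)
    found: list[str] = []
    # Slide a window of len(acronym) words over the snippet.
    for i in range(len(words) - n + 1):
        window = words[i:i + n]
        # Keep the window if each word starts with the matching letter.
        if all(w[0].lower() == c for w, c in zip(window, letters)):
            phrase = " ".join(window)
            if phrase not in found:
                found.append(phrase)
    return found


# Sample snippet (invented for this example).
snippet = "Question answering (QA) relies on information retrieval (IR)."
print(candidate_definitions("QA", snippet))
print(candidate_definitions("IR", snippet))
```

A heuristic this simple misses definitions that contain stop words or use several letters of one word, and it says nothing about how snippets would be retrieved from the Web or how candidates would be ranked; it only shows the kind of acronym-definition pairing the abstract refers to.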