A discriminative alignment model for abbreviation recognition

  • Authors:
  • Naoaki Okazaki;Sophia Ananiadou;Jun'ichi Tsujii

  • Affiliations:
  • University of Tokyo, Bunkyo-ku, Tokyo, Japan;University of Manchester, Manchester, UK;University of Tokyo, Bunkyo-ku, Tokyo, Japan and University of Manchester, Manchester, UK

  • Venue:
  • COLING '08 Proceedings of the 22nd International Conference on Computational Linguistics - Volume 1
  • Year:
  • 2008

Quantified Score

Hi-index 0.00

Visualization

Abstract

This paper presents a discriminative alignment model for extracting abbreviations and their full forms appearing in actual text. The task of abbreviation recognition is formalized as a sequential alignment problem, which finds the optimal alignment (origins of abbreviation letters) between two strings (abbreviation and full form). We design a large amount of finegrained features that directly express the events where letters produce or do not produce abbreviations. We obtain the optimal combination of features on an aligned abbreviation corpus by using the maximum entropy framework. The experimental results show the usefulness of the alignment model and corpus for improving abbreviation recognition.