Investigating GIS and smoothing for maximum entropy taggers

  • Authors:
  • James R. Curran;Stephen Clark

  • Affiliations:
  • University of Edinburgh, Edinburgh;University of Edinburgh, Edinburgh

  • Venue:
  • EACL '03 Proceedings of the tenth conference on European chapter of the Association for Computational Linguistics - Volume 1
  • Year:
  • 2003

Quantified Score

Hi-index 0.00

Visualization

Abstract

This paper investigates two elements of Maximum Entropy tagging: the use of a correction feature in the Generalised Iterative Scaling (GIS) estimation algorithm, and techniques for model smoothing. We show analytically and empirically that the correction feature, assumed to be required for the correctness of GIS, is unnecessary. We also explore the use of a Gaussian prior and a simple cutoff for smoothing. The experiments are performed with two tagsets: the standard Penn Treebank POS tagset and the larger set of lexical types from Combinatory Categorial Grammar.