A plethora of methods for learning English countability

  • Authors: Timothy Baldwin; Francis Bond

  • Affiliations: Stanford University, Stanford, CA; Nippon Telegraph and Telephone Corporation, Kyoto, Japan

  • Venue: EMNLP '03 Proceedings of the 2003 conference on Empirical methods in natural language processing
  • Year: 2003

Abstract

This paper compares a range of methods for classifying words based on linguistic diagnostics, focusing on the task of learning countabilities for English nouns. We propose two basic approaches to feature representation: distribution-based representation, which simply looks at the distribution of features in the corpus data, and agreement-based representation, which analyses the level of token-wise agreement between multiple preprocessor systems. We additionally compare a single multiclass classifier architecture with a suite of binary classifiers, and combine analyses from multiple preprocessors. Finally, we present and evaluate a feature selection method.
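
To make the contrast between the two feature representations concrete, here is a minimal Python sketch. It is an illustration only and is not taken from the paper: the preprocessor names (`tagger`, `chunker`, `parser`), the feature values, the toy token data, and the pairwise agreement ratio are all assumptions chosen for the example. The distribution-based function summarises how often each feature value occurs for a noun according to one preprocessor; the agreement-based function measures on what fraction of tokens each pair of preprocessors extracted the same value.

```python
from collections import Counter

# Hypothetical toy data: one dict per token occurrence of a single noun,
# mapping each (made-up) preprocessor to the feature value it extracted.
tokens = [
    {"tagger": "det:a",    "chunker": "det:a",    "parser": "det:a"},
    {"tagger": "det:much", "chunker": "det:much", "parser": "det:a"},
    {"tagger": "det:a",    "chunker": "det:a",    "parser": "det:a"},
    {"tagger": "plural",   "chunker": "det:a",    "parser": "plural"},
]


def distribution_features(tokens, system):
    """Relative frequency of each feature value as seen by one preprocessor."""
    counts = Counter(tok[system] for tok in tokens)
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()}


def agreement_features(tokens):
    """Fraction of tokens on which each pair of preprocessors agrees."""
    systems = sorted(tokens[0])
    feats = {}
    for i, a in enumerate(systems):
        for b in systems[i + 1:]:
            agree = sum(tok[a] == tok[b] for tok in tokens)
            feats[(a, b)] = agree / len(tokens)
    return feats


dist = distribution_features(tokens, "tagger")
agr = agreement_features(tokens)
print(dist)
print(agr)
```

Either dictionary could then be flattened into a feature vector for a classifier; how the paper actually encodes and weights these features is not specified in the abstract.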