A quantitative evaluation of linguistic tests for the automatic prediction of semantic markedness

  • Authors:
  • Vasileios Hatzivassiloglou; Kathleen McKeown

  • Affiliations:
  • Columbia University, New York, N.Y.; Columbia University, New York, N.Y.

  • Venue:
  • ACL '95: Proceedings of the 33rd Annual Meeting of the Association for Computational Linguistics
  • Year:
  • 1995

Abstract

We present a corpus-based study of methods that have been proposed in the linguistics literature for selecting the semantically unmarked term out of a pair of antonymous adjectives. Solutions to this problem are applicable to the more general task of selecting the positive term from the pair. Using automatically collected data, we quantify the accuracy and applicability of each method and perform a statistical analysis of the significance of the results. We show that some simple methods are indeed good indicators for the answer to the problem, while other proposed methods perform no better than would be attributable to chance. In addition, one of the simplest methods, text frequency, dominates all others. We also apply two generic statistical learning methods for combining the indications of the individual methods, and compare their performance to that of the simple methods. The most sophisticated learning method offers a small, but statistically significant, improvement over the original tests.
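
To make the evaluation described in the abstract concrete, here is a minimal Python sketch of how a text-frequency test could be scored. It is an illustration under stated assumptions, not the authors' implementation. It assumes that applicability is the share of pairs for which the test reaches a decision and that accuracy is the share of correct decisions among those. It uses a one-sided binomial test against chance (p = 0.5) as one plausible significance check; the abstract does not say which test the authors used. The function names and the word counts are hypothetical and invented purely to exercise the code.

```python
from math import comb


def pick_unmarked_by_frequency(pair, counts):
    """Guess the unmarked term as the more frequent adjective; None on a tie."""
    a, b = pair
    count_a, count_b = counts.get(a, 0), counts.get(b, 0)
    if count_a == count_b:
        return None  # the test cannot decide, so it is not applicable here
    return a if count_a > count_b else b


def binomial_tail(successes, trials, p=0.5):
    """P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


def evaluate(gold_pairs, counts):
    """Score the frequency test on gold (unmarked, marked) pairs.

    Returns (applicability, accuracy, p_value), where p_value is the
    one-sided probability of doing at least this well by chance.
    """
    decided = correct = 0
    for unmarked, marked in gold_pairs:
        guess = pick_unmarked_by_frequency((unmarked, marked), counts)
        if guess is None:
            continue
        decided += 1
        correct += guess == unmarked
    applicability = decided / len(gold_pairs) if gold_pairs else 0.0
    accuracy = correct / decided if decided else 0.0
    p_value = binomial_tail(correct, decided) if decided else 1.0
    return applicability, accuracy, p_value


# Hypothetical counts, NOT taken from the paper's corpus.
toy_counts = {
    "tall": 50, "short": 30,
    "deep": 20, "shallow": 5,
    "old": 40, "young": 45,
    "wide": 10, "narrow": 10,
}
# Gold pairs listed as (unmarked, marked).
toy_pairs = [("tall", "short"), ("deep", "shallow"),
             ("old", "young"), ("wide", "narrow")]

applicability, accuracy, p_value = evaluate(toy_pairs, toy_counts)
print(applicability, accuracy, p_value)
```

With so few toy pairs the p-value is far from significant, which shows why the comparison against chance matters: a method can look accurate on a handful of pairs without that accuracy being distinguishable from guessing.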