Evaluating smoothing algorithms against plausibility judgements

Authors:
Maria Lapata;Frank Keller;Scott McDonald
Affiliations:
Saarland University, Saarbrücken, Germany;Saarland University, Saarbrücken, Germany;University of Edinburgh, Edinburgh, UK
Venue:
ACL '01 Proceedings of the 39th Annual Meeting on Association for Computational Linguistics
Year:
2001

Citing 8
Cited 6

Computer systems that learn: classification and prediction methods from statistics, neural nets, machine learning, and expert systems

Computer systems that learn: classification and prediction methods from statistics, neural nets, machine learning, and expert systems
Class-based n-gram models of natural language

Computational Linguistics
Similarity-Based Models of Word Cooccurrence Probabilities

Machine Learning - Special issue on natural language learning
Determinants of adjective-noun plausibility

EACL '99 Proceedings of the ninth conference on European chapter of the Association for Computational Linguistics
Distributional clustering of English words

ACL '93 Proceedings of the 31st annual meeting on Association for Computational Linguistics
Generalizing automatically generated selectional patterns

COLING '94 Proceedings of the 15th conference on Computational linguistics - Volume 2
Measures of distributional similarity

ACL '99 Proceedings of the 37th annual meeting of the Association for Computational Linguistics on Computational Linguistics
Class-based probability estimation using a semantic hierarchy

NAACL '01 Proceedings of the second meeting of the North American Chapter of the Association for Computational Linguistics on Language technologies

A probabilistic account of logical metonymy

Computational Linguistics
Using the web to obtain frequencies for unseen bigrams

Computational Linguistics - Special issue on web as corpus
A comparison of parsing technologies for the biomedical domain

Natural Language Engineering
Evaluating and combining approaches to selectional preference acquisition

EACL '03 Proceedings of the tenth conference on European chapter of the Association for Computational Linguistics - Volume 1
Using the web to overcome data sparseness

EMNLP '02 Proceedings of the ACL-02 conference on Empirical methods in natural language processing - Volume 10
Detection of incorrect case assignments in paraphrase generation

IJCNLP'04 Proceedings of the First international joint conference on Natural Language Processing

Quantified Score

Hi-index	0.00

Visualization

Abstract

Previous research has shown that the plausibility of an adjective-noun combination is correlated with its corpus co-occurrence frequency. In this paper, we estimate the co-occurrence frequencies of adjective-noun pairs that fail to occur in a 100 million word corpus using smoothing techniques and compare them to human plausibility ratings. Both class-based smoothing and distance-weighted averaging yield frequency estimates that are significant predictors of rated plausibility, which provides independent evidence for the validity of these smoothing techniques.