In this paper, we describe a framework for developing probabilistic classifiers in natural language processing. Our focus is on formulating models that capture the most important interdependencies among features, to avoid overfitting the data while also characterizing the data well. The class of probability models and the associated inference techniques described here were developed in mathematical statistics, and are widely used in artificial intelligence and applied statistics. Our goal is to make this model selection framework accessible to researchers in NLP, and provide pointers to available software and important references. In addition, we describe how the quality of the three determinants of classifier performance (the features, the form of the model, and the parameter estimates) can be separately evaluated. We also demonstrate the classification performance of these models in a large-scale experiment involving the disambiguation of 34 words taken from the HECTOR word sense corpus (Hanks 1996). In 10-fold cross-validations, the model search procedure performs significantly better than naive Bayes on 6 of the words without being significantly worse on any of them.
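To make the evaluation procedure concrete, the sketch below shows a 10-fold cross-validation of a simple naive Bayes sense classifier over categorical context features. This is an illustrative toy, not the paper's HECTOR setup or its model search procedure: the feature names, smoothing scheme, and synthetic "finance"/"river" data are all invented for the example.

```python
# Toy naive Bayes word-sense classifier with 10-fold cross-validation.
# Stdlib only; data and feature values are synthetic illustrations.
import math
import random
from collections import Counter, defaultdict

def train_nb(examples):
    """examples: list of (feature_tuple, sense) pairs."""
    sense_counts = Counter(sense for _, sense in examples)
    feat_counts = defaultdict(Counter)  # (position, sense) -> value counts
    for feats, sense in examples:
        for i, value in enumerate(feats):
            feat_counts[(i, sense)][value] += 1
    return sense_counts, feat_counts, len(examples)

def predict_nb(model, feats):
    """Pick the sense maximizing the naive Bayes log posterior."""
    sense_counts, feat_counts, n = model
    best, best_score = None, float("-inf")
    for sense, count in sense_counts.items():
        score = math.log(count / n)  # log prior
        for i, value in enumerate(feats):
            counts = feat_counts[(i, sense)]
            # Add-one smoothing over the observed value set.
            score += math.log((counts[value] + 1) / (count + len(counts) + 1))
        if score > best_score:
            best, best_score = sense, score
    return best

def cross_validate(examples, folds=10, seed=0):
    """Mean accuracy over `folds` disjoint train/test splits."""
    data = examples[:]
    random.Random(seed).shuffle(data)
    accuracies = []
    for k in range(folds):
        test = data[k::folds]
        train = [ex for i, ex in enumerate(data) if i % folds != k]
        model = train_nb(train)
        correct = sum(predict_nb(model, feats) == sense for feats, sense in test)
        accuracies.append(correct / len(test))
    return sum(accuracies) / folds

# Synthetic data: two context features weakly predictive of the sense.
rng = random.Random(1)
data = []
for _ in range(200):
    sense = rng.choice(["finance", "river"])
    f1 = ("money" if rng.random() < 0.9 else "water") if sense == "finance" \
        else ("water" if rng.random() < 0.9 else "money")
    f2 = ("loan" if rng.random() < 0.8 else "boat") if sense == "finance" \
        else ("boat" if rng.random() < 0.8 else "loan")
    data.append(((f1, f2), sense))

acc = cross_validate(data)
print(f"10-fold mean accuracy: {acc:.2f}")
```

Note that naive Bayes treats the two context features as conditionally independent given the sense; the model selection framework described in the paper exists precisely to relax that assumption where the data support richer feature interdependencies.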