A fundamental problem in statistical parsing is the choice of criteria and algorithms used to estimate the parameters in a model. The predominant approach in computational linguistics has been to use a parametric model with some variant of maximum-likelihood estimation. The assumptions under which maximum-likelihood estimation is justified are arguably quite strong. This chapter discusses the statistical theory underlying various parameter-estimation methods, and gives algorithms which depend on alternatives to (smoothed) maximum-likelihood estimation. We first give an overview of results from statistical learning theory. We then show how important concepts from the classification literature (specifically, generalization results based on margins on training data) can be derived for parsing models. Finally, we describe parameter-estimation algorithms which are motivated by these generalization bounds.