Learning as search optimization: approximate large margin methods for structured prediction

Authors:
Hal Daumé III; Daniel Marcu
Affiliations:
Information Sciences Institute, Marina del Rey, CA; Information Sciences Institute, Marina del Rey, CA
Venue:
ICML '05 Proceedings of the 22nd international conference on Machine learning
Year:
2005

Abstract

Mappings to structured output spaces (strings, trees, partitions, etc.) are typically learned using extensions of classification algorithms to simple graphical structures (e.g., linear chains) in which search and parameter estimation can be performed exactly. Unfortunately, in many complex problems, exact search or parameter estimation is rarely tractable. Instead of learning exact models and searching via heuristic means, we embrace this difficulty and treat the structured output problem in terms of approximate search. We present a framework for learning as search optimization, and two parameter updates with convergence theorems and bounds. Empirical evidence shows that our integrated approach to learning and decoding can outperform exact models at smaller computational cost.
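The idea of learning inside an approximate search can be illustrated with a small sketch. The Python below is not the authors' implementation: it assumes a sequence-labeling task, a left-to-right beam search, and a simple perceptron-style update applied whenever the correct partial labeling falls off the beam (or is not ranked first at the final step). The feature map and the names `features`, `score`, `laso_train`, and `decode` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def features(x, prefix):
    """Hypothetical feature map: emission and transition counts of a partial labeling."""
    f = defaultdict(float)
    prev = "<s>"
    for word, tag in zip(x, prefix):
        f[("emit", word, tag)] += 1.0
        f[("trans", prev, tag)] += 1.0
        prev = tag
    return f

def score(w, f):
    """Linear score of a feature vector under weights w."""
    return sum(w.get(k, 0.0) * v for k, v in f.items())

def expand(w, x, beam, tags, beam_size):
    """Extend every beam entry by one tag and keep the top-scoring beam_size prefixes."""
    candidates = [p + (tag,) for p in beam for tag in tags]
    candidates.sort(key=lambda p: score(w, features(x, p)), reverse=True)
    return candidates[:beam_size]

def laso_train(data, tags, beam_size=2, epochs=5):
    """Train weights so that beam search itself keeps the correct prefix alive."""
    w = defaultdict(float)
    for _ in range(epochs):
        for x, y in data:
            beam = [()]
            for t in range(len(x)):
                beam = expand(w, x, beam, tags, beam_size)
                gold = tuple(y[: t + 1])
                is_last = t == len(x) - 1
                # Search error: gold prefix pruned, or not ranked first at the end.
                if gold not in beam or (is_last and beam[0] != gold):
                    # Perceptron-style step: toward the gold prefix,
                    # away from the average of the current beam.
                    for k, v in features(x, gold).items():
                        w[k] += v
                    for p in beam:
                        for k, v in features(x, p).items():
                            w[k] -= v / len(beam)
                    # Restart the search from the correct prefix.
                    beam = [gold]
    return w

def decode(w, x, tags, beam_size=2):
    """Label x with the same beam search that was used during training."""
    beam = [()]
    for _ in range(len(x)):
        beam = expand(w, x, beam, tags, beam_size)
    return list(beam[0])

# Toy usage with made-up data.
train_data = [
    (["the", "dog", "runs"], ["D", "N", "V"]),
    (["a", "cat", "sleeps"], ["D", "N", "V"]),
]
tag_set = ["D", "N", "V"]
weights = laso_train(train_data, tag_set)
print(decode(weights, ["the", "dog", "runs"], tag_set))
```

The point of the design is that training and decoding share one search procedure (`expand`), so the weights are tuned for the approximate search that is actually run at test time rather than for an exact model that is later searched heuristically.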