We formalize structured prediction (SP) as a reinforcement learning (RL) task. We first define the Structured Prediction Markov Decision Process (SP-MDP), an instantiation of Markov decision processes for structured prediction, and show that learning an optimal policy for this SP-MDP is equivalent to minimizing the empirical loss. This link between the supervised learning formulation of structured prediction and reinforcement learning allows us to use approximate RL methods to learn the policy. The proposed model makes weak assumptions about both the nature of the structured prediction problem and the supervision process: it assumes nothing about the decomposition of the loss function, the data encoding, or the availability of optimal policies for training. It can therefore handle a wide range of structured prediction problems. Moreover, it scales well and can be applied to complex, large-scale real-world problems. We describe two series of experiments. The first analyzes RL on classical sequence prediction benchmarks and compares our approach with state-of-the-art SP algorithms. The second introduces a tree transformation problem, a complex instance of the general labeled tree mapping problem on which most previous models fail. We show that RL exploration is effective and leads to successful results on this challenging task, a clear confirmation that RL can be used for large, complex structured prediction problems.
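To make the SP-MDP construction concrete, the following is a minimal sketch of such an MDP for sequence labeling: the state holds the input and a partial output, each action appends one label, and the final reward is the negative loss, so maximizing expected return minimizes expected loss. The label set, the choice of Hamming loss, and all identifiers below are illustrative assumptions, not the paper's exact formulation.

    # Minimal SP-MDP sketch for sequence labeling (illustrative, assumed
    # label set and Hamming loss; not the authors' exact formulation).
    import random
    from dataclasses import dataclass, field

    LABELS = ["B", "I", "O"]  # assumed label set for illustration

    @dataclass
    class SPState:
        x: list                                        # input sequence
        y_partial: list = field(default_factory=list)  # labels chosen so far

        def is_final(self):
            return len(self.y_partial) == len(self.x)

    def actions(state):
        # Available actions: append any label to the partial output.
        return [] if state.is_final() else LABELS

    def step(state, action):
        # Deterministic transition: extend the partial output by one label.
        return SPState(state.x, state.y_partial + [action])

    def reward(state, y_true):
        # Final reward = negative Hamming loss; 0 at non-final states,
        # so an episode's return equals -loss(y_predicted, y_true).
        if not state.is_final():
            return 0.0
        return -sum(a != b for a, b in zip(state.y_partial, y_true))

    # Example: roll out a uniformly random policy on a toy input.
    state = SPState(x=["the", "cat", "sat"])
    while not state.is_final():
        state = step(state, random.choice(actions(state)))
    print(state.y_partial, reward(state, ["O", "B", "O"]))

Under this construction, any RL method that maximizes expected return over episodes is implicitly minimizing the expected task loss, which is the equivalence the abstract refers to; a learned policy (for example, a greedy policy over action scores) would replace the random action choice above.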