Supervised learning is concerned with learning a function from a set of input–output example pairs. A recent trend in this field is to consider structured outputs such as sequences, trees, or graphs. When predicting such structured data, learning models must select solutions within very large discrete spaces. The combinatorial nature of this problem has recently led to learning models that integrate a search component. In this paper, we show that Structured Prediction (SP) can be seen as a sequential decision problem. We introduce SP-MDP, a Markov Decision Process formulation of Structured Prediction, and show that learning the optimal policy in SP-MDP is equivalent to solving the SP problem. This allows us to apply classical Reinforcement Learning (RL) algorithms to SP. We present experiments on two tasks. The first, sequence labeling, has been studied extensively and allows us to compare the RL approach with traditional SP methods. The second, tree transformation, is a challenging SP task with numerous large-scale real-world applications. We report successful results with general RL algorithms on this task, on which traditional SP models fail.
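To make the SP-as-MDP idea concrete, the following is a minimal, hypothetical sketch of how sequence labeling could be cast as a sequential decision problem: a state holds the input plus the partial output built so far, each action appends one label, transitions are deterministic, and the terminal reward is the negated task loss (here, Hamming loss against the gold labeling). All names (`SPState`, `rollout`, the BIO label set) are illustrative assumptions, not the paper's actual formulation or code.

```python
# Illustrative sketch (not the authors' code): sequence labeling as an MDP.
from dataclasses import dataclass

LABELS = ["B", "I", "O"]  # assumed example label set

@dataclass(frozen=True)
class SPState:
    tokens: tuple   # input sequence x
    partial: tuple  # labels predicted so far

    def is_terminal(self):
        # Episode ends once every token has received a label.
        return len(self.partial) == len(self.tokens)

def step(state, label):
    """Deterministic transition: append one label to the partial output."""
    return SPState(state.tokens, state.partial + (label,))

def reward(state, gold):
    """Terminal reward = -Hamming loss w.r.t. the gold labeling; 0 elsewhere."""
    if not state.is_terminal():
        return 0.0
    return -float(sum(p != g for p, g in zip(state.partial, gold)))

def rollout(tokens, policy):
    """Run one episode: the policy chooses an action in every state."""
    state = SPState(tuple(tokens), ())
    while not state.is_terminal():
        state = step(state, policy(state))
    return state.partial

# A trivial baseline policy that always predicts "O". An RL algorithm
# would instead learn a policy maximizing the expected terminal reward,
# which by construction minimizes the structured loss.
always_O = lambda state: "O"
print(rollout(["the", "cat", "sat"], always_O))  # ('O', 'O', 'O')
```

Under this construction, maximizing expected return and minimizing the structured loss coincide, which is the sense in which solving the MDP solves the SP problem.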