A simple automatic derivative evaluation program
Communications of the ACM
Automatic differentiation of algorithms: from simulation to optimization
Automatic differentiation of algorithms: from simulation to optimization
Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data
ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning
Building a large annotated corpus of English: the penn treebank
Computational Linguistics - Special issue on using large corpora: II
ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
The Journal of Machine Learning Research
EMNLP '06 Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing
Learning with probabilistic features for improved pipeline models
EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 1 - Volume 1
Hi-index | 0.00 |
In this paper, we describe a novel approach to cascaded learning and inference on sequences. We propose a weakly joint learning model on cascaded inference on sequences, called multilayer sequence labeling. In this model, inference on sequences is modeled as cascaded decision. However, the decision on a sequence labeling sequel to other decisions utilizes the features on the preceding results as marginalized by the probabilistic models on them. It is not novel itself, but our idea central to this paper is that the probabilistic models on succeeding labeling are viewed as indirectly depending on the probabilistic models on preceding analyses. We also propose two types of efficient dynamic programming which are required in the gradient-based optimization of an objective function. One of the dynamic programming algorithms resembles back propagation algorithm for multilayer feed-forward neural networks. The other is a generalized version of the forward-backward algorithm. We also report experiments of cascaded part-of-speech tagging and chunking of English sentences and show effectiveness of the proposed method.