As a powerful sequence labeling model, conditional random fields (CRFs) have been applied successfully to many natural language processing (NLP) tasks. However, the high computational complexity of CRF training permits only a very small tag (or label) set, because training becomes intractable as the tag set grows. This paper proposes an improved decomposed training and joint decoding algorithm for CRF learning. Instead of training a single CRF model over all tags, it trains a binary sub-CRF independently for each tag. An optimal tag sequence is then produced by a joint decoding algorithm that operates on the probabilistic outputs of all the sub-CRFs involved. To test its effectiveness, we apply this approach to Chinese word segmentation (CWS), formulated as a sequence labeling problem. Our evaluation shows that it can reduce the computational cost of this task by 40-50% without any significant performance loss on various large-scale data sets.
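
The joint decoding step can be illustrated with a minimal sketch. The abstract does not spell out the decoding algorithm, so everything below is an assumption made for illustration: the four-tag B/M/E/S scheme for CWS, the hypothetical input format `sub_probs` (one dictionary per character, mapping each tag to the probability that tag's binary sub-model assigns to "this tag applies here"), and the use of a Viterbi search restricted to legal tag bigrams. It shows only how independent per-tag probabilities might be combined into one consistent tag sequence, not the paper's actual method.

```python
import math

# Assumed tag set for Chinese word segmentation:
# B = word begin, M = word middle, E = word end, S = single-character word.
TAGS = ("B", "M", "E", "S")

# Legal tag bigrams under the assumed BMES scheme.
ALLOWED = {
    "B": ("M", "E"),
    "M": ("M", "E"),
    "E": ("B", "S"),
    "S": ("B", "S"),
}
START_TAGS = ("B", "S")  # a sentence must open a word
END_TAGS = ("E", "S")    # a sentence must close a word


def joint_decode(sub_probs):
    """Return the highest-scoring legal tag sequence.

    sub_probs[t][tag] is the (hypothetical) probability from the binary
    sub-model for `tag` that position t carries that tag.
    """
    n = len(sub_probs)
    if n == 0:
        return []

    neg_inf = float("-inf")

    def log(p):
        # Clamp to avoid log(0) when a sub-model outputs exactly zero.
        return math.log(max(p, 1e-12))

    score = [{tag: neg_inf for tag in TAGS} for _ in range(n)]
    back = [{tag: None for tag in TAGS} for _ in range(n)]

    for tag in START_TAGS:
        score[0][tag] = log(sub_probs[0][tag])

    # Viterbi recursion restricted to legal transitions.
    for t in range(1, n):
        for prev in TAGS:
            if score[t - 1][prev] == neg_inf:
                continue
            for tag in ALLOWED[prev]:
                cand = score[t - 1][prev] + log(sub_probs[t][tag])
                if cand > score[t][tag]:
                    score[t][tag] = cand
                    back[t][tag] = prev

    # Pick the best legal final tag, then follow the back-pointers.
    last = max(END_TAGS, key=lambda tag: score[n - 1][tag])
    path = [last]
    for t in range(n - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


if __name__ == "__main__":
    # Made-up probabilities for a three-character sentence.
    example = [
        {"B": 0.9, "M": 0.05, "E": 0.1, "S": 0.2},
        {"B": 0.1, "M": 0.1, "E": 0.8, "S": 0.3},
        {"B": 0.3, "M": 0.05, "E": 0.2, "S": 0.9},
    ]
    print(joint_decode(example))
```

Because each sub-model is trained independently, taking the most probable tag at each position separately can yield an illegal sequence (for example, a sentence that starts with M). Searching over whole sequences under the transition constraints avoids this, which is one plausible reason to decode jointly.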