In this paper, we present a discriminative word-character hybrid model for joint Chinese word segmentation and POS tagging. The hybrid model performs well because it handles both known and unknown words. We describe strategies that balance learning the characteristics of known words against those of unknown words, and we propose an error-driven policy that achieves this balance by acquiring examples of unknown words from particular errors in a training corpus. We describe an efficient framework for training the model based on the Margin Infused Relaxed Algorithm (MIRA), evaluate our approach on the Penn Chinese Treebank, and show that it outperforms the state-of-the-art approaches reported in the literature.
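As a rough illustration of the kind of margin-based update MIRA performs, the sketch below applies a single-best update to sparse feature weights. It is a simplification made for illustration only: the abstract does not specify the update variant, loss function, or feature set, so the function name `mira_update`, the clipping constant `C`, and the toy feature names are all assumptions rather than the authors' implementation.

```python
from collections import Counter


def mira_update(weights, gold_feats, pred_feats, loss, C=1.0):
    """Apply one single-best MIRA-style update in place and return the step size.

    weights    -- dict mapping feature name to weight (hypothetical representation)
    gold_feats -- feature counts of the correct analysis
    pred_feats -- feature counts of the model's current best analysis
    loss       -- task loss of the prediction (e.g. number of wrong tokens)
    C          -- assumed upper bound on the step size
    """
    # Feature difference between the gold and the predicted analysis.
    diff = Counter(gold_feats)
    diff.subtract(pred_feats)

    norm_sq = sum(v * v for v in diff.values())
    if norm_sq == 0:
        # Identical feature vectors: nothing to update.
        return 0.0

    # Current score margin of gold over prediction.
    margin = sum(weights.get(f, 0.0) * v for f, v in diff.items())

    # Smallest step that makes the margin at least the loss, clipped to [0, C].
    tau = max(0.0, min(C, (loss - margin) / norm_sq))

    for f, v in diff.items():
        if v:
            weights[f] = weights.get(f, 0.0) + tau * v
    return tau


# Toy usage with made-up feature names.
weights = {}
tau = mira_update(weights, {"word=AB/NN": 1}, {"char=A/B-NN": 1}, loss=1.0)
```

After one update on this toy example, the gold analysis outscores the prediction by exactly the loss, so repeating the same update leaves the weights unchanged.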