A stochastic finite-state word-segmentation algorithm for Chinese
Computational Linguistics
Machine Learning - Special issue on inductive transfer
Recent Advances in Parsing Technology
Recent Advances in Parsing Technology
A Learning Approach to Shallow Parsing
A Learning Approach to Shallow Parsing
Building a large annotated corpus of English: the penn treebank
Computational Linguistics - Special issue on using large corpora: II
TnT: a statistical part-of-speech tagger
ANLC '00 Proceedings of the sixth conference on Applied natural language processing
Noun phrase recognition by system combination
NAACL 2000 Proceedings of the 1st North American chapter of the Association for Computational Linguistics conference
Improving Chinese tokenization with linguistic filters on statistical lexical acquisition
ANLC '94 Proceedings of the fourth conference on Applied natural language processing
A trainable rule-based algorithm for word segmentation
ACL '98 Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the Association for Computational Linguistics
Tagging inflective languages: prediction of morphological categories for a rich, structured tagset
COLING '98 Proceedings of the 17th international conference on Computational linguistics - Volume 1
Transformation-based learning in the fast lane
NAACL '01 Proceedings of the second meeting of the North American Chapter of the Association for Computational Linguistics on Language technologies
A unified statistical model for the identification of English baseNP
ACL '00 Proceedings of the 38th Annual Meeting on Association for Computational Linguistics
Introduction to the CoNLL-2000 shared task: chunking
ConLL '00 Proceedings of the 2nd workshop on Learning language in logic and the 4th conference on Computational natural language learning - Volume 7
Coaxing confidences from an old friend: probabilistic classifications from transformation rule lists
EMNLP '00 Proceedings of the 2000 Joint SIGDAT conference on Empirical methods in natural language processing and very large corpora: held in conjunction with the 38th Annual Meeting of the Association for Computational Linguistics - Volume 13
Improving Feature Selection for Maximum Entropy-Based Word Sense Disambiguation
PorTAL '02 Proceedings of the Third International Conference on Advances in Natural Language Processing
A maximum-entropy chinese parser augmented by transformation-based learning
ACM Transactions on Asian Language Information Processing (TALIP)
A statistical method for short answer extraction
ODQA '01 Proceedings of the workshop on Open-domain question answering - Volume 12
Factorizing complex models: a case study in mention detection
ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
TBL-improved non-deterministic segmentation and POS tagging for a Chinese parser
EACL '09 Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics
Jointly labeling multiple sequences: a factorial HMM approach
ACLstudent '05 Proceedings of the ACL Student Research Workshop
Hi-index | 0.00 |
This paper presents a novel method that allows a machine learning algorithm following the transformation-based learning paradigm (Brill, 1995) to be applied to multiple classification tasks by training jointly and simultaneously on all fields. The motivation for constructing such a system stems from the observation that many tasks in natural language processing are naturally composed of multiple subtasks which need to be resolved simultaneously; also tasks usually learned in isolation can possibly benefit from being learned in a joint framework, as the signals for the extra tasks usually constitute inductive bias.The proposed algorithm is evaluated in two experiments: in one, the system is used to jointly predict the part-of-speech and text chunks/baseNP chunks of an English corpus; and in the second it is used to learn the joint prediction of word segment boundaries and part-of-speech tagging for Chinese. The results show that the simultaneous learning of multiple tasks does achieve an improvement in each task upon training the same tasks sequentially. The part-of-speech tagging result of 96.63% is state-of-the-art for individual systems on the particular train/test split.