A general and multi-lingual phrase chunking model based on masking method

Authors:
Yu-Chieh Wu;Chia-Hui Chang;Yue-Shi Lee
Affiliations:
Department of Computer Science and Information Engineering, National Central University, Jhongli City, Taoyuan County, Taiwan, R.O.C.;Department of Computer Science and Information Engineering, National Central University, Jhongli City, Taoyuan County, Taiwan, R.O.C.;Department of Computer Science and Information Engineering, Ming Chuan University, Taoyuan, Taiwan, R.O.C.
Venue:
CICLing'06 Proceedings of the 7th international conference on Computational Linguistics and Intelligent Text Processing
Year:
2006

Abstract

Several phrase chunkers have been proposed in recent years. Some state-of-the-art chunkers achieve better performance by integrating external resources, such as parsers and additional training data, or by combining multiple learners. However, in many languages and domains such external resources are not readily available, and combining multiple learners increases the cost of both training and testing. In this paper, we propose a mask method to improve chunking accuracy. Experimental results show that our chunker outperforms the deep parsers and chunkers we compare against. On the CoNLL-2000 data set, our system achieves an F rate of 94.12. On the base-chunking task, it reaches an F rate of 92.95. When ported to Chinese, it achieves an F rate of 92.36 on the base-chunking task. Our chunker is also efficient: chunking a 50K-word document takes about 50 seconds.
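
The F rates reported in the abstract are chunk-level scores. The abstract does not spell out the formula, so the following is only a minimal sketch, assuming CoNLL-2000-style B-/I-/O tags and the common definition F = 2PR / (P + R), where a predicted chunk counts as correct only if both its boundaries and its type match the gold chunk. The function names `extract_chunks` and `f_measure` are illustrative and do not come from the paper.

```python
def extract_chunks(tags):
    """Return the set of (start, end, type) spans in a BIO tag sequence.

    `end` is exclusive. Assumes tags look like "B-NP", "I-NP", or "O".
    """
    chunks = set()
    start, ctype = None, None
    # A trailing "O" sentinel closes any chunk still open at the end.
    for i, tag in enumerate(list(tags) + ["O"]):
        if tag == "O":
            prefix, label = "O", None
        else:
            prefix, _, label = tag.partition("-")
        # Close the open chunk on "B-", on "O", or when the type changes.
        if start is not None and (prefix in ("B", "O") or label != ctype):
            chunks.add((start, i, ctype))
            start, ctype = None, None
        # Open a chunk on "B-", or on an "I-" that has no open chunk.
        if prefix in ("B", "I") and start is None:
            start, ctype = i, label
    return chunks


def f_measure(gold_tags, pred_tags):
    """Chunk-level F score (scaled to 0-100) for one tag sequence."""
    gold = extract_chunks(gold_tags)
    pred = extract_chunks(pred_tags)
    correct = len(gold & pred)  # exact match on boundaries and type
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 100 * 2 * precision * recall / (precision + recall)


# Hypothetical example: the predicted tags split the final NP in two.
gold = ["B-NP", "I-NP", "B-VP", "B-NP", "I-NP"]
pred = ["B-NP", "I-NP", "B-VP", "B-NP", "B-NP"]
score = f_measure(gold, pred)  # precision 2/4, recall 2/3
```

This sketch scores a single sequence; a corpus-level figure such as those in the abstract would accumulate the correct, predicted, and gold chunk counts over all sentences before computing precision and recall.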