A robust multilingual portable phrase chunking system

Authors:
Yue-Shi Lee;Yu-Chieh Wu
Affiliations:
Department of Computer Science and Information Engineering, Ming Chuan University, 5 De-Ming Rd., Gwei Shan District, Taoyuan 333, Taiwan, ROC;Department of Computer Science and Information Engineering, National Central University, 300 Jong-Da Rd., Jhongli City, Taoyuan 320, Taiwan, ROC
Venue:
Expert Systems with Applications: An International Journal
Year:
2007

Abstract

Automatic text chunking aims to recognize grammatical phrase structures in natural language text. It supplies syntactic information for downstream analysis and is an important technology in text mining (TM) and natural language processing (NLP). Existing chunking systems rely on external knowledge, e.g., grammar parsers, or integrate multiple learners to achieve higher performance. However, such external knowledge is rarely available in many domains and languages. Moreover, employing multiple learners not only complicates the system architecture but also increases training and testing time. In this paper, we present a novel phrase chunking model based on the proposed mask method, which uses neither external knowledge nor multiple learners. The mask method automatically derives additional training examples from the original training data, which significantly improves system performance. We evaluated our method on different chunking tasks and languages and compared it with previous studies. The experimental results show that our method achieves state-of-the-art performance in chunking tasks. In two English chunking tasks, i.e., shallow parsing and base-chunking, our method achieves F(β=1) rates of 94.22 and 93.23, respectively. When ported to Chinese, the F(β=1) rate is 92.30. Our chunker is also efficient: chunking 50K words takes less than 10 s.
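The F(β=1) rates quoted above are chunk-level scores. The abstract does not say how they were computed, so the following is only a minimal sketch. It assumes IOB2-style tags (B-X, I-X, O) and the usual convention that a predicted chunk counts as correct only when both its type and its boundaries match the gold chunk exactly. The function names extract_chunks and f_beta are hypothetical and are not taken from the paper.

```python
from typing import List, Set, Tuple

# A chunk is (type, start index, end index exclusive). Hypothetical helper type.
Chunk = Tuple[str, int, int]


def extract_chunks(tags: List[str]) -> Set[Chunk]:
    """Collect chunk spans from an IOB2 tag sequence (assumed tag scheme)."""
    chunks: Set[Chunk] = set()
    start, ctype = None, None
    # A trailing "O" sentinel closes any chunk still open at the end.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and ctype == tag[2:]
        if continues:
            continue
        if ctype is not None:
            chunks.add((ctype, start, i))
        start, ctype = None, None
        # A B- tag, or an I- tag that does not continue a chunk, opens a new one.
        if tag.startswith("B-") or tag.startswith("I-"):
            start, ctype = i, tag[2:]
    return chunks


def f_beta(gold: List[str], pred: List[str], beta: float = 1.0) -> float:
    """Chunk-level F score; beta=1 weights precision and recall equally."""
    gold_chunks = extract_chunks(gold)
    pred_chunks = extract_chunks(pred)
    correct = len(gold_chunks & pred_chunks)
    precision = correct / len(pred_chunks) if pred_chunks else 0.0
    recall = correct / len(gold_chunks) if gold_chunks else 0.0
    if precision + recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


if __name__ == "__main__":
    gold = ["B-NP", "I-NP", "B-VP", "B-NP", "O"]
    pred = ["B-NP", "I-NP", "B-VP", "O", "O"]
    # Multiply by 100 to compare with percentage-style figures.
    print(round(100 * f_beta(gold, pred), 2))
```

In this toy example the prediction recovers two of the three gold chunks and proposes no wrong ones, so precision is 1, recall is 2/3, and the F(β=1) score is 0.8, i.e. 80 on the percentage scale used in the abstract.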