Efficient text chunking using linear kernel with masked method

Authors:
Yu-Chieh Wu;Chia-Hui Chang
Affiliations:
Department of Computer Science, National Central University, 320, Jhongli City, Taoyuan, Taiwan;Department of Computer Science, National Central University, 320, Jhongli City, Taoyuan, Taiwan
Venue:
Knowledge-Based Systems
Year:
2007

Abstract

In this paper, we propose an efficient and accurate text chunking system that uses a linear-kernel SVM and a new technique called the masked method. Previous research has indicated that system combination or external parsers can enhance chunking performance. However, the cost of constructing multiple classifiers is higher than that of developing a single processor. Moreover, the use of external resources complicates the original tagging process. To remedy these problems, we employ richer features and propose a mask-based method to address the unknown word problem and enhance system performance. In this way, no external resources or complex heuristics are required for the chunking system. The experiments show that, when trained on the CoNLL-2000 chunking dataset, our system achieves an F(β) rate of 94.12 with a linear kernel. Furthermore, our chunker is quite efficient since it adopts a linear-kernel SVM. The turn-around tagging time on the CoNLL-2000 test data is less than 50 s, which is about 115 times faster than a polynomial-kernel SVM.
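
The speed advantage claimed for the linear kernel can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy support vectors, and the coefficient values are all assumptions made for illustration. A general kernel SVM evaluates the kernel once per support vector at tagging time, whereas with a linear kernel the support vectors can be folded into a single weight vector in advance, so each decision costs one dot product.

```python
def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def kernel_decision(support_vectors, coeffs, bias, x, kernel):
    """General SVM decision value: one kernel evaluation per support vector."""
    return sum(c * kernel(sv, x) for sv, c in zip(support_vectors, coeffs)) + bias


def collapse_linear(support_vectors, coeffs):
    """For a linear kernel, fold all support vectors into one weight vector."""
    dim = len(support_vectors[0])
    return [sum(c * sv[j] for sv, c in zip(support_vectors, coeffs))
            for j in range(dim)]


def linear_decision(w, bias, x):
    """Decision value with a precomputed weight vector: a single dot product."""
    return dot(w, x) + bias


# Toy values, assumed for illustration only (coeffs stand for alpha_i * y_i).
support_vectors = [[1, 0, 1], [0, 1, 1], [1, 1, 0]]
coeffs = [0.5, -0.25, 0.75]
bias = -0.1
x = [1, 0, 1]

w = collapse_linear(support_vectors, coeffs)
slow = kernel_decision(support_vectors, coeffs, bias, x, dot)
fast = linear_decision(w, bias, x)
```

Both paths give the same decision value, but the second does not grow with the number of support vectors. A polynomial kernel admits no such collapse in the original feature space, which is consistent with the tagging-time gap reported in the abstract.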