Text chunking by combining hand-crafted rules and memory-based learning

Authors:
Seong-Bae Park; Byoung-Tak Zhang
Affiliations:
Seoul National University, Seoul, Korea; Seoul National University, Seoul, Korea
Venue:
ACL '03 Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 1
Year:
2003

Abstract

This paper proposes a hybrid of hand-crafted rules and a machine learning method for chunking Korean. In partially free word-order languages such as Korean and Japanese, a small number of rules dominate chunking performance because these languages have well-developed postpositions and endings. The proposed method is therefore based primarily on the rules, and the residual errors are then corrected by a memory-based machine learning method. Because memory-based learning handles exceptions in natural language processing efficiently, it is well suited to checking whether the rules' estimates are exceptional cases and revising them. In the evaluation, the method improves F-score over the rules alone and over various machine learning methods alone.
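
The rule-first, memory-corrected idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the single toy rule, the three-slot feature tuple (previous, current, and next part of speech), the chunk tags, the overlap similarity, and the `k` and `threshold` parameters are all assumptions chosen for illustration. The sketch only shows the control flow: the rule gives an estimate, and a memory of cases where the rule failed in training may override it.

```python
from collections import Counter


def rule_tag(features):
    """Hypothetical hand-crafted rule (a stand-in, not a rule from the paper).

    features = (previous POS, current POS, next POS).
    A postposition or ending continues the current chunk; anything else
    starts a new one.
    """
    pos = features[1]
    if pos in {"postposition", "ending"}:
        return "I-NP"
    return "B-NP"


def overlap(a, b):
    """Overlap similarity: number of feature positions with equal values."""
    return sum(x == y for x, y in zip(a, b))


def collect_exceptions(training_data):
    """Keep only the training cases on which the rule's estimate was wrong."""
    return [(f, gold) for f, gold in training_data if rule_tag(f) != gold]


def hybrid_tag(features, memory, k=1, threshold=3):
    """Return the rule's estimate unless a stored exception is close enough.

    k and threshold are assumed values; with three features, threshold=3
    means only an exact feature match overrides the rule.
    """
    estimate = rule_tag(features)
    ranked = sorted(memory, key=lambda case: overlap(features, case[0]),
                    reverse=True)
    neighbours = [c for c in ranked[:k] if overlap(features, c[0]) >= threshold]
    if not neighbours:
        return estimate  # no similar exception: trust the rule
    votes = Counter(tag for _, tag in neighbours)
    return votes.most_common(1)[0][0]  # revise the estimate


# Toy training data (invented for this sketch).
training = [
    (("noun", "postposition", "verb"), "I-NP"),   # rule is right
    (("verb", "noun", "postposition"), "B-NP"),   # rule is right
    (("noun", "noun", "postposition"), "I-NP"),   # rule is wrong: an exception
]
memory = collect_exceptions(training)

print(hybrid_tag(("noun", "noun", "postposition"), memory))
print(hybrid_tag(("verb", "noun", "postposition"), memory))
```

Storing only the cases the rule gets wrong keeps the memory small and lets the rule decide everything that does not closely resemble a known exception; whether this matches the paper's actual memory construction is not stated in the abstract.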