Korean named entity recognition using HMM and CoTraining model

Authors:
Euisok Chung;Yi-Gyu Hwang;Myung-Gil Jang
Affiliations:
Electronics and Telecommunications Research Institute, Kajong-Dong, Yusong-Gu, Daejon, Korea;Electronics and Telecommunications Research Institute, Kajong-Dong, Yusong-Gu, Daejon, Korea;Electronics and Telecommunications Research Institute, Kajong-Dong, Yusong-Gu, Daejon, Korea
Venue:
AsianIR '03 Proceedings of the sixth international workshop on Information retrieval with Asian languages - Volume 11
Year:
2003

Abstract

Named entity recognition is important in sophisticated information service systems such as Question Answering and Text Mining, since most answer types and text mining units depend on the named entity type. We therefore focus on a named entity recognition model for Korean. Korean named entity recognition is difficult because the words of a named entity carry no distinguishing surface features, such as the capitalization found in English. It depends heavily on large amounts of hand-labeled data and on a named entity dictionary, even though both are tedious and expensive to create. In this paper, we devise an HMM-based named entity recognizer that considers various context models. Furthermore, we consider a weakly supervised learning technique, CoTraining, to combine labeled and unlabeled data.
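
The abstract does not say which feature views or classifiers the authors use for CoTraining, so the following is only a minimal sketch of the general co-training idea (two classifiers, each reading a different view of the same example, jointly growing the labeled set from unlabeled data). Everything in it is an assumption made for illustration: the `ViewClassifier` class, the `co_train` function, the choice of views (the candidate's own tokens versus its context words), the confidence threshold, and the romanized toy data. It does not include the HMM recognizer.

```python
from collections import Counter, defaultdict


class ViewClassifier:
    """Tiny count-based classifier over one feature view (hypothetical, illustrative)."""

    def __init__(self, view):
        self.view = view  # index of the view this classifier reads
        self.counts = defaultdict(Counter)

    def fit(self, examples):
        # examples: iterable of ((view0_features, view1_features), label)
        self.counts = defaultdict(Counter)
        for views, label in examples:
            for feat in views[self.view]:
                self.counts[feat][label] += 1

    def predict(self, views):
        """Return (label, confidence), or (None, 0.0) if no feature was seen in training."""
        votes = Counter()
        for feat in views[self.view]:
            votes.update(self.counts.get(feat, {}))
        total = sum(votes.values())
        if total == 0:
            return None, 0.0
        label, n = votes.most_common(1)[0]
        return label, n / total


def co_train(labeled, unlabeled, rounds=5, per_round=2, threshold=0.9):
    """Grow the labeled set by letting each view's classifier label confident examples."""
    labeled = list(labeled)
    pool = list(unlabeled)
    clf_a, clf_b = ViewClassifier(0), ViewClassifier(1)
    for _ in range(rounds):
        clf_a.fit(labeled)
        clf_b.fit(labeled)
        picked = []
        for clf in (clf_a, clf_b):
            scored = []
            for i, views in enumerate(pool):
                label, conf = clf.predict(views)
                if label is not None and conf >= threshold:
                    scored.append((conf, i, label))
            # Most confident first; ties broken by position in the pool.
            scored.sort(key=lambda t: (-t[0], t[1]))
            picked.extend(scored[:per_round])
        if not picked:
            break  # neither classifier is confident about anything left
        chosen = {}
        for _conf, i, label in picked:
            chosen.setdefault(i, label)  # first classifier to claim an example wins
        labeled.extend((pool[i], label) for i, label in chosen.items())
        pool = [v for i, v in enumerate(pool) if i not in chosen]
    clf_a.fit(labeled)
    clf_b.fit(labeled)
    return clf_a, clf_b, labeled


# Toy data (made up): view 0 = candidate tokens, view 1 = context words.
labeled = [
    ((["seoul"], ["si", "in"]), "LOCATION"),
    ((["kim"], ["ssi", "said"]), "PERSON"),
]
unlabeled = [
    (["busan"], ["si", "in"]),
    (["lee"], ["ssi", "said"]),
    (["busan"], ["visited"]),
]
clf_a, clf_b, grown = co_train(labeled, unlabeled)
```

In this toy run the context-view classifier labels the first two unlabeled examples from their context words, which in turn lets the token-view classifier label the third example from the token alone. That hand-off between views is what lets a small hand-labeled seed set stretch further.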