Fusion of multiple features for Chinese named entity recognition based on CRF model

Authors:
Yuejie Zhang;Zhiting Xu;Tao Zhang
Affiliations:
Department of Computer Science & Engineering, Shanghai Key Laboratory of Intelligent Information Processing, Fudan University, Shanghai, P.R. China;Department of Computer Science & Engineering, Shanghai Key Laboratory of Intelligent Information Processing, Fudan University, Shanghai, P.R. China;School of Information Management & Engineering, Shanghai University of Finance & Economics, Shanghai, P.R. China
Venue:
AIRS'08 Proceedings of the 4th Asia information retrieval conference on Information retrieval technology
Year:
2008

Abstract

This paper shows that a Conditional Random Field (CRF) combined with multiple features can perform robust and accurate Chinese Named Entity Recognition. We describe the feature templates, both local and global, that are used to extract multiple features with the help of human knowledge. We also show that human knowledge can reasonably smooth the model, and thus may reduce the amount of training data the CRF needs. Experimental results on the People's Daily corpus indicate that our model is an effective way to combine a statistical model with human knowledge. Experiments on another data set confirm this conclusion, showing that our features behave consistently across different test data.
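
The abstract does not list the actual templates, so the following is only a minimal sketch of what character-level local and global feature extraction for a CRF tagger might look like. Everything in it is an assumption made for illustration: the function names, the window size, the tiny surname lexicon standing in for "human knowledge", and the choice of "character repeats elsewhere in the document" as a global feature. None of these are taken from the paper.

```python
from collections import Counter

# Hypothetical mini-lexicon standing in for "human knowledge"; not from the paper.
SURNAMES = {"张", "王", "李"}


def local_features(chars, i, window=1):
    """Features drawn from a fixed window of characters around position i."""
    feats = {}
    for offset in range(-window, window + 1):
        j = i + offset
        # Positions outside the sentence are mapped to a padding symbol.
        ch = chars[j] if 0 <= j < len(chars) else "<PAD>"
        feats[f"char[{offset}]={ch}"] = 1
    # Knowledge-based feature: is the current character a known surname?
    feats["is_surname"] = int(chars[i] in SURNAMES)
    return feats


def global_features(chars, i, doc_counts):
    """Features that depend on the whole document rather than the local window."""
    # Assumed example: does the current character occur more than once in the document?
    return {"repeated_in_doc": int(doc_counts[chars[i]] > 1)}


def extract_features(sentence, document):
    """Return one feature dict per character of `sentence`."""
    chars = list(sentence)
    doc_counts = Counter(document)  # character frequencies over the document
    rows = []
    for i in range(len(chars)):
        feats = local_features(chars, i)
        feats.update(global_features(chars, i, doc_counts))
        rows.append(feats)
    return rows


if __name__ == "__main__":
    for row in extract_features("张三说", "张三说。张三走了。"):
        print(row)
```

Per-character feature dictionaries of this shape are the kind of input that common CRF toolkits accept alongside a label sequence; the training step itself is omitted here because the abstract gives no details about it.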