Chinese named entity recognition with a sequence labeling approach: based on characters, or based on words?

Authors:
Zhangxun Liu;Conghui Zhu;Tiejun Zhao
Affiliations:
MOE-MS Key Laboratory of NLP and speech, Harbin Institute of Technology, Harbin, China;MOE-MS Key Laboratory of NLP and speech, Harbin Institute of Technology, Harbin, China;MOE-MS Key Laboratory of NLP and speech, Harbin Institute of Technology, Harbin, China
Venue:
ICIC'10 Proceedings of the Advanced intelligent computing theories and applications, and 6th international conference on Intelligent computing
Year:
2010

Abstract

Named Entity Recognition (NER), an important problem in Natural Language Processing, is the basis for other applications such as Data Mining and Relation Extraction. Using a sequence labeling approach, this paper asks which kind of token should serve as the labeling granularity in the NER task: characters or words. We use not only local context features within a sentence, but also global knowledge features extracted from other occurrences of each word in the whole corpus. The results show that, without the global features, person names and location names are recognized well with character-based labeling, whereas organization names are better suited to word-based labeling. When global features are added, the performance of the word-based approach improves significantly.
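
To make the character-versus-word question concrete, the sketch below labels the same sentence at both granularities. It is only an illustration: the BIO tag scheme, the example sentence, its word segmentation, and the helper name `bio_labels` are assumptions, not details taken from the paper.

```python
from typing import List, Tuple

# (start_char, end_char_exclusive, entity_type); offsets index into the raw sentence.
Span = Tuple[int, int, str]


def bio_labels(tokens: List[str], spans: List[Span]) -> List[str]:
    """Assign one BIO label per token, given character-offset entity spans.

    Hypothetical helper: a token is labeled only if it lies entirely inside
    a span; a token that straddles a span boundary falls back to "O".
    """
    labels = []
    offset = 0  # character offset of the current token in the sentence
    for tok in tokens:
        start, end = offset, offset + len(tok)
        label = "O"
        for s, e, etype in spans:
            if start >= s and end <= e:
                label = ("B-" if start == s else "I-") + etype
                break
        labels.append(label)
        offset = end
    return labels


# Assumed example: "Harbin Institute of Technology is located in Harbin".
sentence = "哈尔滨工业大学位于哈尔滨"
spans = [(0, 7, "ORG"), (9, 12, "LOC")]

# Character granularity: every character is a token.
char_tokens = list(sentence)
# Word granularity: an assumed output of a word segmenter.
word_tokens = ["哈尔滨", "工业", "大学", "位于", "哈尔滨"]

for tok, lab in zip(char_tokens, bio_labels(char_tokens, spans)):
    print(tok, lab)
print()
for tok, lab in zip(word_tokens, bio_labels(word_tokens, spans)):
    print(tok, lab)
```

With characters as tokens, the organization name spans seven labels; with words as tokens, it spans three. The word-based sequence is shorter, but it depends on the segmenter: if a word boundary does not line up with an entity boundary, this sketch cannot label that entity correctly.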