CHINERS: a Chinese named entity recognition system for the sports domain

Authors:
Tianfang Yao; Wei Ding; Gregor Erbach
Affiliations:
Saarland University, Germany; Saarland University, Germany; Saarland University, Germany
Venue:
SIGHAN '03 Proceedings of the second SIGHAN workshop on Chinese language processing - Volume 17
Year:
2003

Abstract

In investigating Chinese named entity (NE) recognition, we face two principal challenges. The first is how to ensure the quality of word segmentation and part-of-speech (POS) tagging, because errors in these steps adversely affect the performance of NE recognition. The second is how to recognize NEs flexibly, reliably, and accurately. To address these challenges, we propose a system architecture divided into two phases. The first phase reduces, as far as possible, the word segmentation and POS tagging errors that are passed on to the second phase; for this purpose, we use machine learning techniques to repair such errors. In the second phase, we design Finite State Cascades (FSC), which can be constructed automatically from the recognition rule sets, as a shallow parser for recognizing NEs. The advantages of this design are that the FSC is reliable, accurate, and easy to maintain. Additionally, to recognize special NEs, we devise corresponding strategies that improve the correctness of recognition. Experimental evaluation of the system shows that the total average recall and precision for six types of NEs are 83% and 85%, respectively, which suggests that the system architecture is reasonable and effective.
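
The two-phase architecture described in the abstract can be illustrated with a minimal sketch. It is not the CHINERS implementation: the tag names (`NS`, `N`, `P`, `Q`), the single repair rule, the cascade rules, and the labels `TEAM` and `MATCHUP` are all hypothetical, chosen only to show how a repair step can feed a cascade in which each level rewrites the output of the previous one.

```python
from typing import List, Tuple

Token = Tuple[str, str]  # (word, tag)

# Phase 1 (hypothetical rule): (word, wrong_tag, previous_tag, corrected_tag).
# A real system would learn such rules from annotated data.
REPAIR_RULES = [("队", "Q", "NS", "N")]

def repair(tokens: List[Token]) -> List[Token]:
    """Retag a word when it carries a known wrong tag in a known left context."""
    out = list(tokens)
    for i in range(1, len(out)):
        word, tag = out[i]
        for r_word, wrong, prev_tag, fixed in REPAIR_RULES:
            if word == r_word and tag == wrong and out[i - 1][1] == prev_tag:
                out[i] = (word, fixed)
    return out

# Phase 2 (hypothetical rules): each cascade level is a list of
# (tag pattern, output label); later levels see the labels of earlier ones.
CASCADE = [
    [(("NS", "N"), "TEAM")],            # level 1: location + noun -> TEAM
    [(("TEAM", "P", "TEAM"), "MATCHUP")],  # level 2: TEAM + P + TEAM -> MATCHUP
]

def apply_level(tokens: List[Token], rules) -> List[Token]:
    """Scan left to right, merging each matched tag sequence into one token."""
    out: List[Token] = []
    i = 0
    while i < len(tokens):
        for pattern, label in rules:
            window = tokens[i:i + len(pattern)]
            if tuple(tag for _, tag in window) == pattern:
                out.append(("".join(word for word, _ in window), label))
                i += len(pattern)
                break
        else:
            # No rule matched at this position: copy the token unchanged.
            out.append(tokens[i])
            i += 1
    return out

def recognize(tokens: List[Token]) -> List[Token]:
    """Run the repair phase, then every cascade level in order."""
    tokens = repair(tokens)
    for rules in CASCADE:
        tokens = apply_level(tokens, rules)
    return tokens

# Example input with one deliberate tagging error on the first "队".
sample = [("上海", "NS"), ("队", "Q"), ("对", "P"), ("北京", "NS"), ("队", "N")]
result = recognize(sample)
```

Because the cascade is driven entirely by the rule tables, changing the recognizer means editing `CASCADE` rather than the matching code, which is one way to read the abstract's claim that the FSC is easy to maintain. Without the repair step, the mis-tagged first token would prevent the level-1 rule from matching, so the level-2 rule would never fire either.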