An Algorithm that Learns What's in a Name
Machine Learning - Special issue on natural language learning
Text Categorization with Support Vector Machines: Learning with Many Relevant Features
ECML '98 Proceedings of the 10th European Conference on Machine Learning
Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data
ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning
A machine learning approach to coreference resolution of noun phrases
Computational Linguistics - Special issue on computational anaphora resolution
Design of the MUC-6 evaluation
MUC6 '95 Proceedings of the 6th conference on Message understanding
Shallow parsing with conditional random fields
NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1
Named entity recognition with character-level models
CONLL '03 Proceedings of the seventh conference on Natural language learning at HLT-NAACL 2003 - Volume 4
A mention-synchronous coreference resolution algorithm based on the Bell tree
ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics
Chinese named entity recognition based on multiple features
HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
Chinese named entity recognition based on multilevel linguistic features
IJCNLP'04 Proceedings of the First international joint conference on Natural Language Processing
The use of SVM for Chinese new word identification
IJCNLP'04 Proceedings of the First international joint conference on Natural Language Processing
Detecting, categorizing and clustering entity mentions in Chinese text
SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
Information Sciences: an International Journal
This paper presents a Chinese entity detection and tracking system that takes advantage of character-based models and machine learning approaches. An entity here is defined as the chain linking all of its mentions in a text, together with the associated attributes. Entity mentions of different types normally exhibit quite different linguistic patterns. Six separate Conditional Random Fields (CRF) models that incorporate character N-gram and word knowledge features are built to detect the extent and the head of three types of mentions, namely named, nominal and pronominal mentions. For each mention type, attributes are identified by Support Vector Machine (SVM) classifiers that take mention heads and their context as classification features. Mentions are then merged into a unified entity representation by examining their attributes and connections in a rule-based coreference resolution process. The system is evaluated on the ACE 2005 corpus and achieves competitive results.
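To make the character-based detection step more concrete, the following is a minimal sketch of the kind of input and output a character-level sequence labeler might use. It is not the system's actual feature set: the feature templates, the context window, the BIO tagging scheme, the tag names `NAM` and `PRO`, and the function names `char_ngram_features` and `bio_to_spans` are all assumptions made for illustration. The CRF model itself and the word knowledge features are omitted.

```python
from typing import Dict, List, Optional, Tuple


def char_ngram_features(chars: List[str], i: int, window: int = 2) -> Dict[str, str]:
    """Character unigram and bigram features around position i.

    Hypothetical template: the abstract only says that character N-gram
    features are used, not which ones or over what window.
    """
    feats: Dict[str, str] = {}
    n = len(chars)
    # Unigrams at each offset, padded with boundary symbols outside the sentence.
    for off in range(-window, window + 1):
        j = i + off
        if 0 <= j < n:
            feats[f"c[{off}]"] = chars[j]
        else:
            feats[f"c[{off}]"] = "<BOS>" if j < 0 else "<EOS>"
    # Bigrams of adjacent characters that lie fully inside the sentence.
    for off in range(-window, window):
        j = i + off
        if 0 <= j and j + 1 < n:
            feats[f"c[{off}]c[{off + 1}]"] = chars[j] + chars[j + 1]
    return feats


def bio_to_spans(tags: List[str]) -> List[Tuple[int, int, str]]:
    """Decode per-character BIO tags into (start, end_exclusive, label) spans."""
    spans: List[Tuple[int, int, str]] = []
    start: Optional[int] = None
    label: Optional[str] = None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing span
        continues = tag.startswith("I-") and label == tag[2:]
        if not continues and start is not None:
            spans.append((start, i, label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


if __name__ == "__main__":
    sentence = list("他在北京大学")
    print(char_ngram_features(sentence, 2, window=1))
    # Assumed tagger output: a pronominal mention, then a named mention.
    print(bio_to_spans(["B-PRO", "O", "B-NAM", "I-NAM", "I-NAM", "I-NAM"]))
```

Under these assumptions, each of the six CRF models would consume one feature dictionary per character and emit one tag per character, and the decoded spans would then be passed on to attribute classification and coreference resolution.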