A named entity extraction using word information repeatedly collected from unlabeled data

Authors:
Tomoya Iwakura
Affiliations:
Fujitsu Laboratories Ltd., Kawasaki, Japan
Venue:
CICLing'10 Proceedings of the 11th international conference on Computational Linguistics and Intelligent Text Processing
Year:
2010

Citing 7
Cited 0

Representing text chunks
EACL '99 Proceedings of the ninth conference on European chapter of the Association for Computational Linguistics

Japanese Named Entity extraction with redundant morphological analysis
NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1

Named entity extraction based on a maximum entropy model and transformation rules
ACL '00 Proceedings of the 38th Annual Meeting on Association for Computational Linguistics

Combining outputs of multiple Japanese named entity chunkers by stacking
EMNLP '02 Proceedings of the ACL-02 conference on Empirical methods in natural language processing - Volume 10

A high-performance semi-supervised learning method for text chunking
ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics

A fast boosting-based learner for feature-rich tagging and chunking
CoNLL '08 Proceedings of the Twelfth Conference on Computational Natural Language Learning

Phrase clustering for discriminative learning
ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 2 - Volume 2

Abstract

This paper proposes a method for Named Entity (NE) extraction that uses NE-related labels of words repeatedly collected from unlabeled data. The NE-related labels of a word include, among others, the candidate NE classes of the word itself and the NE classes of the words that co-occur with it. To collect these labels, we first extract NEs from unlabeled data with an NE extractor and then gather the NE-related labels of each word from the extraction results. We then create a new NE extractor that uses the NE-related labels of each word as additional features, and this new extractor is in turn used to collect a new set of NE-related labels. Experimental results on the IREX data set for Japanese NE extraction show that our method improves accuracy.
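
The loop described in the abstract (extract NEs from unlabeled data, collect NE-related labels per word, retrain with those labels as features, repeat) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the count-based toy tagger stands in for the real NE extractor, the feature names, the `self:` and `near:` label prefixes, the one-word co-occurrence window, the number of rounds, and the English toy sentences are all hypothetical.

```python
from collections import Counter, defaultdict

def features(sent, i, word_labels):
    """Base features of token i plus the NE-related labels collected so far."""
    word = sent[i]
    feats = {f"w={word}", f"cap={word[:1].isupper()}"}
    if i > 0:
        feats.add(f"prev={sent[i - 1]}")
    # NE-related labels of the word become extra features (none in round 0).
    feats.update(f"nelabel={lab}" for lab in word_labels.get(word, ()))
    return feats

def train(labeled, word_labels):
    """Toy extractor (assumption): count feature/tag co-occurrences."""
    counts = defaultdict(Counter)
    for sent, tags in labeled:
        for i, tag in enumerate(tags):
            for f in features(sent, i, word_labels):
                counts[f][tag] += 1
    return counts

def extract(model, sent, word_labels):
    """Tag each token with the highest-scoring class; ties prefer "O"."""
    tags = []
    for i in range(len(sent)):
        score = Counter()
        for f in features(sent, i, word_labels):
            score.update(model.get(f, {}))
        if score:
            tags.append(max(sorted(score), key=lambda t: (score[t], t == "O")))
        else:
            tags.append("O")
    return tags

def collect_word_labels(model, unlabeled, word_labels):
    """Run the extractor on unlabeled text and gather NE-related labels."""
    collected = defaultdict(set)
    for sent in unlabeled:
        tags = extract(model, sent, word_labels)
        for i, (word, tag) in enumerate(zip(sent, tags)):
            if tag != "O":
                collected[word].add(f"self:{tag}")  # candidate NE class of the word
            for j in (i - 1, i + 1):  # NE classes of adjacent (co-occurring) words
                if 0 <= j < len(sent) and tags[j] != "O":
                    collected[word].add(f"near:{tags[j]}")
    return dict(collected)

def iterate(labeled, unlabeled, rounds=2):
    """Alternate between collecting labels and retraining the extractor."""
    word_labels = {}
    model = train(labeled, word_labels)
    for _ in range(rounds):
        word_labels = collect_word_labels(model, unlabeled, word_labels)
        model = train(labeled, word_labels)
    return model, word_labels

# Hypothetical toy data, for illustration only.
LABELED = [
    (["Tomoya", "works", "at", "Fujitsu"], ["PERSON", "O", "O", "ORG"]),
    (["He", "lives", "in", "Kawasaki"], ["O", "O", "O", "LOC"]),
]
UNLABELED = [["She", "works", "at", "Sony"], ["They", "live", "in", "Osaka"]]

model, word_labels = iterate(LABELED, UNLABELED)
```

After the call, `word_labels` maps each word seen in the unlabeled text to the set of NE-related labels gathered for it, and `model` is the extractor retrained with those labels as features. A real system would replace the counting tagger with a proper sequence learner and would likely need a stopping criterion rather than a fixed number of rounds.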