This paper proposes a novel method for extracting named entities that include unfamiliar words, that is, words that do not occur or occur only a few times in the training corpus, by using a large unannotated corpus. The proposed method consists of two steps. The first step assigns to each unfamiliar word the most similar familiar word, based on context vectors calculated from the large unannotated corpus. The second step applies traditional machine learning approaches. Experiments on extracting Japanese named entities from the IREX corpus and the NHK corpus show the effectiveness of the proposed method.
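The first step can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the context vectors are plain co-occurrence counts within a fixed window, similarity is cosine similarity, and a word counts as "familiar" when it occurs at least `min_count` times in the training corpus. The function names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def context_vectors(sentences, window=2):
    """Count the words seen within `window` positions of each word.

    Assumption: a simple co-occurrence count stands in for the
    context vectors mentioned in the abstract.
    """
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * v[key] for key, count in u.items() if key in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def map_unfamiliar_words(train_sentences, unannotated_sentences,
                         min_count=2, window=2):
    """Map each unfamiliar word to its most similar familiar word.

    Assumption: "familiar" means at least `min_count` occurrences in
    the training corpus; the threshold is hypothetical.
    """
    freq = Counter(w for sent in train_sentences for w in sent)
    vectors = context_vectors(unannotated_sentences, window)
    # Sorted so that ties are broken deterministically.
    familiar = sorted(w for w, c in freq.items()
                      if c >= min_count and w in vectors)
    mapping = {}
    for word in vectors:
        if freq[word] >= min_count or not familiar:
            continue
        best = max(familiar, key=lambda f: cosine(vectors[word], vectors[f]))
        if cosine(vectors[word], vectors[best]) > 0.0:
            mapping[word] = best
    return mapping


# Toy data: "Osaka" never occurs in the training corpus but shares
# its contexts with "Tokyo" in the unannotated corpus.
train = [["Tokyo", "is", "big"], ["Tokyo", "is", "old"]]
unannotated = [["Tokyo", "is", "big"], ["Osaka", "is", "big"],
               ["Tokyo", "is", "old"], ["Osaka", "is", "old"]]
mapping = map_unfamiliar_words(train, unannotated)
```

In a pipeline of this shape, the resulting mapping would supply the substituted word as input to whatever named entity tagger is trained in the second step; that tagger is outside the scope of this sketch.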