Investigating GIS and smoothing for maximum entropy taggers
EACL '03 Proceedings of the tenth conference on European chapter of the Association for Computational Linguistics - Volume 1
Ranking algorithms for named-entity extraction: boosting and the voted perceptron
ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Introduction to the CoNLL-2002 shared task: language-independent named entity recognition
COLING-02 proceedings of the 6th conference on Natural language learning - Volume 20
Introduction to the CoNLL-2003 shared task: language-independent named entity recognition
CONLL '03 Proceedings of the seventh conference on Natural language learning at HLT-NAACL 2003 - Volume 4
Language independent NER using a maximum entropy tagger
CONLL '03 Proceedings of the seventh conference on Natural language learning at HLT-NAACL 2003 - Volume 4
Unsupervised Multilingual Sentence Boundary Detection
Computational Linguistics
Information extraction from Wikipedia: moving down the long tail
Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
Learning to Tag and Tagging to Learn: A Case Study on Wikipedia
IEEE Intelligent Systems
Comparison between tagged corpora for the named entity task
CompareCorpora '00 Proceedings of the Workshop on Comparing Corpora
Analysing Wikipedia and gold-standard corpora for NER training
EACL '09 Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics
Automatic gazetteer generation from wikipedia
NLP4DL'09/AT4DL'09 Proceedings of the 2009 international conference on Advanced language technologies for digital libraries
Recall-oriented learning of named entities in Arabic Wikipedia
EACL '12 Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics
Population of a knowledge base for news metadata from unstructured text and web data
AKBC-WEKEX '12 Proceedings of the Joint Workshop on Automatic Knowledge Base Construction and Web-scale Knowledge Extraction
Learning multilingual named entity recognition from Wikipedia
Artificial Intelligence
Hi-index | 0.00 |
Named entity recognition (NER) is used in many domains beyond the newswire text that comprises current gold-standard corpora. Recent work has used Wikipedia's link structure to automatically generate near gold-standard annotations. Until now, these resources have only been evaluated on newswire corpora or themselves. We present the first NER evaluation on a Wikipedia gold standard (WG) corpus. Our analysis of cross-corpus performance on WG shows that Wikipedia text may be a harder NER domain than newswire. We find that an automatic annotation of Wikipedia has high agreement with WG and, when used as training data, outperforms newswire models by up to 7.7%.