We propose a resource-based Named Entity Classification (NEC) system that combines named entity extraction with simple language-independent heuristics. Large lists (gazetteers) of named entities are extracted automatically from semi-structured information in Wikipedia, namely infoboxes and category trees. Language-independent heuristics are then used to disambiguate and classify entities that have already been identified (or recognized) in text. We compare the performance of our resource-based system with that of a supervised NEC module implemented for the FreeLing suite, which was the winning system in the CoNLL-2002 competition. Experiments were performed on Portuguese text corpora covering several domains and genres.
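To make the idea of gazetteer-based classification concrete, the following is a minimal sketch and not the system described above. The abstract does not specify the heuristics, so the two used here (exact lookup, then a token-overlap vote) are assumptions chosen for illustration. The gazetteer entries, the class labels PER/LOC/ORG, and the function name `classify_entity` are likewise hypothetical.

```python
from collections import Counter

# Toy gazetteers. In the described system these lists are built automatically
# from Wikipedia infoboxes and category trees; the entries and the label set
# below are illustrative assumptions only.
GAZETTEERS = {
    "PER": {"fernando pessoa", "josé saramago"},
    "LOC": {"lisboa", "porto", "rio de janeiro"},
    "ORG": {"universidade de lisboa", "banco de portugal"},
}


def classify_entity(mention, gazetteers=GAZETTEERS):
    """Return a class label for an already-recognized mention, or None."""
    key = mention.strip().lower()

    # Assumed heuristic 1: an exact match in exactly one gazetteer decides.
    exact = [label for label, names in gazetteers.items() if key in names]
    if len(exact) == 1:
        return exact[0]

    # Assumed heuristic 2: vote by token overlap with gazetteer entries.
    # If the mention matched several gazetteers exactly, only those compete.
    tokens = set(key.split())
    votes = Counter()
    for label, names in gazetteers.items():
        if exact and label not in exact:
            continue
        for name in names:
            votes[label] += len(tokens & set(name.split()))

    ranked = votes.most_common(2)
    if not ranked or ranked[0][1] == 0:
        return None  # no evidence in any gazetteer
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # tie: leave the mention unclassified
    return ranked[0][0]


if __name__ == "__main__":
    for mention in ["Lisboa", "Saramago", "Microsoft"]:
        print(mention, classify_entity(mention))
```

Both heuristics rely only on string matching, so nothing in the sketch is specific to Portuguese. A real system would also need to handle function words such as "de", which inflate overlap counts in this toy version.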