In this paper we describe REPENTINO, a publicly available gazetteer intended to support the development of named entity recognition systems for Portuguese. REPENTINO aims to mitigate the problems developers face because lexical-semantic resources of this type are scarce for Portuguese. The data stored in REPENTINO was mostly extracted from corpora and from the web using simple semi-automated methods. REPENTINO currently stores nearly 450k instances of named entities, divided into more than 100 categories and subcategories that cover a much wider set of domains than those usually included in traditional gazetteers. We present figures on the current content of the gazetteer and describe future work on evaluating the resource and enriching it with additional information.