A golden resource for named entity recognition in Portuguese

Authors:
Diana Santos;Nuno Cardoso
Affiliations:
Linguateca: Node of Oslo at SINTEF ICT;Linguateca: Node of XLDB at University of Lisbon
Venue:
PROPOR'06 Proceedings of the 7th international conference on Computational Processing of the Portuguese Language
Year:
2006

Abstract

This paper presents a collection of texts manually annotated with named entities in context, which was used for HAREM, the first evaluation contest for named entity recognizers for Portuguese. We discuss the options taken and the originality of our approach compared with previous evaluation initiatives in the area. We document the choice of categories, their quantitative weight in the overall collection and how we deal with vagueness and underspecification.