Semantic tagging of unknown proper nouns

Authors:
Alessandro Cucchiarelli; Danilo Luzi; Paola Velardi
Affiliations:
Università di Ancona, Istituto di Informatica, Via Brecce Bianche, 160131 Ancona, Italy, e-mail: alex@inform.unian.it; Università di Ancona, Istituto di Informatica, Via Brecce Bianche, 160131 Ancona, Italy, e-mail: alex@inform.unian.it; Università di Roma ‘La Sapienza’, Dipartimento di Scienze dell'Informazione, Via Salaria 113, I00198 Roma, Italy, e-mail: velardi@dsi.uniromal.it
Venue:
Natural Language Engineering
Year:
1999

Abstract

In this paper, we describe a context-based method to semantically tag unknown proper nouns (U-PNs) in corpora. Like many others, our system relies on a gazetteer and a set of context-dependent heuristics to classify proper nouns. However, proper nouns are an open-ended class: when parsing new fragments of a corpus, even in the same language domain, we can expect that several proper nouns cannot be semantically tagged with these resources alone. The algorithm that we propose assigns an entity type to an unknown PN based on the analysis of syntactically and semantically similar contexts already seen in the application corpus. The performance of the algorithm is evaluated not only in terms of precision, following the tradition of the MUC conferences, but also in terms of information gain, an information-theoretic measure that takes into account the complexity of the classification task.
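
The general idea of tagging an unknown proper noun from similar, already-seen contexts can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify the similarity measure or the context representation, so the sketch assumes a bag-of-words context, Jaccard overlap as the similarity, and similarity-weighted voting. The names `KNOWN_CONTEXTS`, `jaccard`, `tag_unknown`, and the `min_sim` threshold are all hypothetical, and the data is a toy example.

```python
from collections import Counter

# Hypothetical toy data: context words observed around proper nouns that
# were already tagged (e.g. via a gazetteer), paired with their entity type.
KNOWN_CONTEXTS = [
    (frozenset({"mr", "said", "yesterday"}), "PERSON"),
    (frozenset({"chairman", "said", "resigned"}), "PERSON"),
    (frozenset({"shares", "of", "rose"}), "ORGANIZATION"),
    (frozenset({"acquired", "by", "shares"}), "ORGANIZATION"),
    (frozenset({"plant", "in", "near"}), "LOCATION"),
]


def jaccard(a, b):
    """Jaccard overlap of two word sets (assumed similarity measure)."""
    union = set(a) | set(b)
    return len(set(a) & set(b)) / len(union) if union else 0.0


def tag_unknown(context, known, min_sim=0.2):
    """Return the entity type with the highest similarity-weighted vote.

    Returns None when no known context reaches the `min_sim` threshold,
    i.e. the unknown proper noun is left untagged.
    """
    votes = Counter()
    for known_context, entity_type in known:
        sim = jaccard(context, known_context)
        if sim >= min_sim:
            votes[entity_type] += sim
    if not votes:
        return None
    return max(votes, key=votes.get)


# Context words around an unknown proper noun in a new corpus fragment.
print(tag_unknown({"shares", "of", "fell"}, KNOWN_CONTEXTS))  # ORGANIZATION
```

Returning `None` below the threshold reflects the observation in the abstract that some proper nouns may remain untagged; a real system would use richer syntactic and semantic context features than word overlap.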