Unsupervised name ambiguity resolution using a generative model

Authors:
Zornitsa Kozareva;Sujith Ravi
Affiliations:
USC Information Sciences Institute, Marina del Rey, CA;USC Information Sciences Institute, Marina del Rey, CA
Venue:
EMNLP '11 Proceedings of the First Workshop on Unsupervised Learning in NLP
Year:
2011

Citing 18
Cited 1

Latent dirichlet allocation

The Journal of Machine Learning Research
Entity-based cross-document coreferencing using the Vector Space Model

COLING '98 Proceedings of the 17th international conference on Computational linguistics - Volume 1
New Techniques for Disambiguation in Natural Language and Their Application to Biological Text

The Journal of Machine Learning Research
Person resolution in person search results: WebHawk

Proceedings of the 14th ACM international conference on Information and knowledge management
Unsupervised personal name disambiguation

CONLL '03 Proceedings of the seventh conference on Natural language learning at HLT-NAACL 2003 - Volume 4
Bayesian query-focused summarization

ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Efficient topic-based unsupervised name disambiguation

Proceedings of the 7th ACM/IEEE-CS joint conference on Digital libraries
Named Entity Disambiguation: A Hybrid Statistical and Rule-Based Incremental Approach

ASWC '08 Proceedings of the 3rd Asian Semantic Web Conference on The Semantic Web
Unsupervised Discrimination of Person Names in Web Contexts

CICLing '07 Proceedings of the 8th International Conference on Computational Linguistics and Intelligent Text Processing
The SemEval-2007 WePS evaluation: establishing a benchmark for the web people search task

SemEval '07 Proceedings of the 4th International Workshop on Semantic Evaluations
CU-COMSEM: exploring rich features for unsupervised web personal name disambiguation

SemEval '07 Proceedings of the 4th International Workshop on Semantic Evaluations
IRST-BP: web people search using name entities

SemEval '07 Proceedings of the 4th International Workshop on Semantic Evaluations
Latent variable models of concept-attribute attachment

ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 2 - Volume 2
The role of named entities in web people search

EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 2 - Volume 2
Coreference resolution in a modular, entity-centered model

HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
A latent dirichlet allocation method for selectional preferences

ACL '10 Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
An unsupervised language independent method of name discrimination using second order co-occurrence features

CICLing'06 Proceedings of the 7th international conference on Computational Linguistics and Intelligent Text Processing
Name discrimination by clustering similar contexts

CICLing'05 Proceedings of the 6th international conference on Computational Linguistics and Intelligent Text Processing

Analysis and refinement of cross-lingual entity linking

CLEF'12 Proceedings of the Third international conference on Information Access Evaluation: multilinguality, multimodality, and visual analytics

Quantified Score

Hi-index	0.00

Visualization

Abstract

Resolving ambiguity associated with names found on the Web, Wikipedia or medical texts is a very challenging task, which has been of great interest to the research community. We propose a novel approach to disambiguating names using Latent Dirichlet Allocation, where the learned topics represent the underlying senses of the ambiguous name. We conduct a detailed evaluation on multiple data sets containing ambiguous person, location and organization names and for multiple languages such as English, Spanish, Romanian and Bulgarian. We conduct comparative studies with existing approaches and show a substantial improvement of 15 to 35% in task accuracy.