Unsupervised name ambiguity resolution using a generative model

  • Authors: Zornitsa Kozareva; Sujith Ravi

  • Affiliations: USC Information Sciences Institute, Marina del Rey, CA (both authors)

  • Venue: EMNLP '11 Proceedings of the First Workshop on Unsupervised Learning in NLP
  • Year: 2011

Abstract

Resolving the ambiguity of names found on the Web, in Wikipedia, or in medical texts is a challenging task that has attracted great interest from the research community. We propose a novel approach to name disambiguation based on Latent Dirichlet Allocation, in which the learned topics represent the underlying senses of the ambiguous name. We conduct a detailed evaluation on multiple data sets containing ambiguous person, location, and organization names in several languages, including English, Spanish, Romanian, and Bulgarian. Comparative studies with existing approaches show a substantial improvement of 15 to 35% in task accuracy.
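
To make the idea of "topics as senses" concrete, the following is a minimal sketch of Latent Dirichlet Allocation fitted with collapsed Gibbs sampling, where each context mentioning an ambiguous name is labeled with its dominant topic. It is not the authors' implementation: the abstract does not specify the model details, so the function name `lda_senses`, the hyperparameters `alpha` and `beta`, the iteration count, and the toy contexts for the name "Jordan" are all assumptions chosen for illustration.

```python
import random


def lda_senses(contexts, num_senses, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling; return the dominant topic per context.

    contexts: list of token lists, one per mention of the ambiguous name.
    alpha, beta: symmetric Dirichlet priors (illustrative values, an assumption).
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in contexts for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    docs = [[word_id[w] for w in doc] for doc in contexts]
    vocab_size = len(vocab)

    # Count tables used by the sampler.
    doc_topic = [[0] * num_senses for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_senses)]
    topic_total = [0] * num_senses

    # Random initial topic assignment for every token.
    assign = []
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_senses)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(zs)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Conditional probability of each topic, up to a constant.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_senses)
                ]
                z = rng.choices(range(num_senses), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    # The sense of a context is the topic that owns most of its tokens.
    return [max(range(num_senses), key=lambda k: row[k]) for row in doc_topic]


# Hypothetical toy contexts for the ambiguous name "Jordan".
contexts = [
    ["basketball", "bulls", "nba", "dunk", "championship"],
    ["nba", "bulls", "playoffs", "basketball", "scoring"],
    ["river", "amman", "kingdom", "border", "desert"],
    ["amman", "kingdom", "desert", "river", "capital"],
]
labels = lda_senses(contexts, num_senses=2)
print(labels)  # one sense index per context
```

The topic indices are arbitrary cluster labels, so an evaluation would need to map them to gold senses before computing accuracy. The number of senses is fixed in advance here, which is a simplification of this sketch rather than a claim about the paper's setup.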