JHU1: an unsupervised approach to person name disambiguation using web snippets

  • Authors:
  • Delip Rao;Nikesh Garera;David Yarowsky

  • Affiliations:
  • Johns Hopkins University, Baltimore, MD;Johns Hopkins University, Baltimore, MD;Johns Hopkins University, Baltimore, MD

  • Venue:
  • SemEval '07 Proceedings of the 4th International Workshop on Semantic Evaluations
  • Year:
  • 2007

Quantified Score

Hi-index 0.00

Visualization

Abstract

This paper presents an approach to person name disambiguation using K-means clustering on rich-feature-enhanced document vectors, augmented with additional web-extracted snippets surrounding the polysemous names to facilitate term bridging. This yields a significant F-measure improvement on the shared task training data set. The paper also illustrates the significant divergence between the properties of the training and test data in this shared task, substantially skewing results. Our system optimized on F0.2 rather than F0.5 would have achieved top performance in the shared task.