Entity resolution on uncertain relations

  • Authors:
  • Huabin Feng;Hongzhi Wang;Jianzhong Li;Hong Gao

  • Affiliations:
  • Harbin Institute of Technology, China;Harbin Institute of Technology, China;Harbin Institute of Technology, China;Harbin Institute of Technology, China

  • Venue:
  • WAIM'13 Proceedings of the 14th international conference on Web-Age Information Management
  • Year:
  • 2013

Abstract

Entity resolution plays a pivotal role in many application areas. Because uncertainty exists in many applications, such as information extraction and online product categories, entity resolution should be applied to uncertain data. Uncertainty makes it impossible to apply traditional techniques directly. In this paper, we propose techniques to perform entity resolution on uncertain data. First, we propose a new probabilistic similarity metric for uncertain tuples. Second, based on this metric, we propose novel pruning techniques to efficiently join pairs of uncertain tuples without enumerating all possible worlds. Finally, we propose a density-based clustering algorithm to combine the results of the pairwise similarity join. With extensive experimental evaluation on synthetic and real-world data sets, we demonstrate the benefits and features of our approaches.
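
As a rough illustration of what a probabilistic similarity between uncertain tuples can look like, the sketch below computes the probability that two uncertain tuples match. It is not the metric proposed in the paper, which the abstract does not specify. It assumes that each uncertain tuple is a list of mutually exclusive (value, probability) alternatives, that the two tuples are independent, and that token-level Jaccard similarity with a fixed threshold decides a match. The names `jaccard`, `match_probability`, and `threshold` are hypothetical.

```python
from itertools import product


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity over lowercase whitespace-separated tokens."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def match_probability(t1, t2, threshold=0.5):
    """Probability that two uncertain tuples are similar.

    Assumption (not from the paper): t1 and t2 are lists of
    (value, probability) alternatives, mutually exclusive within a
    tuple and independent across tuples. The result is the total
    probability of the alternative pairs whose Jaccard similarity
    reaches the threshold.
    """
    return sum(
        p * q
        for (a, p), (b, q) in product(t1, t2)
        if jaccard(a, b) >= threshold
    )


# Two illustrative uncertain product records (made-up data).
t1 = [("apple iphone 5", 0.7), ("apple ipod", 0.3)]
t2 = [("iphone 5", 0.6), ("ipod touch", 0.4)]
print(match_probability(t1, t2))
```

This version enumerates every pair of alternatives, which is the kind of cost that pruning is meant to avoid on larger inputs. The enumeration only serves to make the definition concrete.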