Towards alias detection without string similarity: an active learning based approach

  • Authors:
  • Lili Jiang;Jianyong Wang;Ping Luo;Ning An;Min Wang

  • Affiliations:
  • Lanzhou University, Lanzhou, China;Tsinghua University, Beijing, China;HP Labs China, Beijing, China;School of Computer and Information, Hefei University of Technology, Hefei, China;HP Labs China, Beijing, China

  • Venue:
  • SIGIR '12 Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval
  • Year:
  • 2012

Abstract

Entity aliases are common, and detecting them accurately plays a vital role in various applications. In this paper, we use an active-learning-based method to detect aliases that lack string similarity. To minimize the cost of pairwise comparison, a subset-based method restricts alias selection to a small-scale entity set. Within each generated entity set, an active-learning-based logistic regression classifier predicts whether a candidate is an alias of a given entity. Experimental results on three datasets demonstrate that the proposed approach can effectively detect this kind of entity alias.
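As an illustration of the classifier step described in the abstract, the following is a minimal sketch of an active learning loop around a logistic regression classifier. It is not the authors' implementation. The abstract does not specify the features, the query strategy, or the training procedure, so everything here is an assumption: the two numeric features per (entity, candidate) pair are hypothetical non-string signals, uncertainty sampling is used as one common query strategy, the labeling oracle is a made-up rule standing in for a human annotator, and all function names are invented for this example. The pool stands in for the candidates inside one small entity subset.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(w, x):
    """P(candidate is an alias | features x); the last entry of w is the bias."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + w[-1])


def train_logreg(X, y, epochs=500, lr=0.5):
    """Fit logistic regression weights by batch gradient descent."""
    dim = len(X[0])
    w = [0.0] * (dim + 1)
    for _ in range(epochs):
        grad = [0.0] * (dim + 1)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            for j in range(dim):
                grad[j] += err * xi[j]
            grad[dim] += err  # bias term
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w


def active_learn(pool, oracle, seed_idx, budget):
    """Assumed strategy: uncertainty sampling over one entity subset.

    pool     -- feature vectors for (entity, candidate) pairs
    oracle   -- callable returning 1 (alias) or 0 (not alias)
    seed_idx -- indices labeled up front
    budget   -- number of additional labels to request
    """
    labeled = {i: oracle(pool[i]) for i in seed_idx}
    for _ in range(budget):
        unlabeled = [i for i in range(len(pool)) if i not in labeled]
        if not unlabeled:
            break
        w = train_logreg([pool[i] for i in labeled],
                         [labeled[i] for i in labeled])
        # Query the pair whose predicted probability is closest to 0.5.
        q = min(unlabeled, key=lambda i: abs(predict_proba(w, pool[i]) - 0.5))
        labeled[q] = oracle(pool[q])
    w = train_logreg([pool[i] for i in labeled],
                     [labeled[i] for i in labeled])
    return w, labeled


def make_toy_pool(n=200, seed=0):
    """Synthetic pool: two hand-picked seed pairs plus n random pairs."""
    rng = random.Random(seed)
    pool = [(0.9, 0.9), (0.1, 0.1)]  # one clear alias pair, one clear non-alias
    pool += [(rng.random(), rng.random()) for _ in range(n)]
    return pool


def toy_oracle(x):
    """Made-up labeling rule standing in for a human annotator."""
    return 1 if x[0] + x[1] > 1.0 else 0


if __name__ == "__main__":
    pool = make_toy_pool()
    w, labeled = active_learn(pool, toy_oracle, seed_idx=[0, 1], budget=10)
    print("labels requested:", len(labeled))
    print("P(alias) for (0.8, 0.7):", round(predict_proba(w, (0.8, 0.7)), 3))
```

In this sketch the classifier is retrained from scratch after every new label, which is affordable only because the subset is small. That is the same reason the abstract gives for restricting alias selection to a small-scale entity set.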