Author name disambiguation for citations on the deep web

  • Authors:
  • Rui Zhang; Derong Shen; Yue Kou; Tiezheng Nie

  • Affiliations:
  • College of Information Science and Engineering, Northeastern University, Shenyang, China (all four authors)

  • Venue:
  • WAIM'10: Proceedings of the 2010 International Conference on Web-Age Information Management
  • Year:
  • 2010

Abstract

Name ambiguity is a critical problem in many applications, particularly in online bibliographic digital libraries. Although several clustering-based methods have been proposed, the problem remains a major challenge for both data integration and data cleaning. In this paper, we present a complementary study of author name disambiguation from another point of view. We focus on common names, especially non-canonical ones. We propose an approach that automatically retrieves authors' personal information from the Deep Web and computes the similarity of every pair of citations according to the following features: co-author names, author's affiliation, e-mail address, and title. We then employ the Affinity Propagation clustering algorithm to attribute similar citations to the proper authors. We conducted experiments on five data sources: DBLP, CiteSeer, IEEE, ACM and Springer LINK. Experimental results show that the proposed approach yields significant improvements.
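The pairwise similarity step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract names the four features but not how each is scored or combined, so the `Citation` record, the Jaccard and exact-match measures, and the `WEIGHTS` values below are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Citation:
    """Hypothetical record holding the four features named in the abstract."""
    title: str
    coauthors: frozenset = frozenset()
    affiliation: str = ""
    email: str = ""


def tokens(text):
    """Lowercased whitespace tokens of a string, as a set."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard overlap of two collections; 0.0 if either is empty."""
    a, b = set(a), set(b)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


# Assumed weights (they sum to 1); the paper's actual weighting is not given here.
WEIGHTS = {"coauthor": 0.4, "affiliation": 0.2, "email": 0.3, "title": 0.1}


def similarity(x, y):
    """Weighted sum of per-feature similarities between two citations."""
    scores = {
        "coauthor": jaccard(x.coauthors, y.coauthors),
        "affiliation": jaccard(tokens(x.affiliation), tokens(y.affiliation)),
        # E-mail counts only on an exact, case-insensitive match of non-empty values.
        "email": 1.0 if x.email and x.email.lower() == y.email.lower() else 0.0,
        "title": jaccard(tokens(x.title), tokens(y.title)),
    }
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def similarity_matrix(citations):
    """Full n x n matrix of pairwise similarities."""
    return [[similarity(a, b) for b in citations] for a in citations]
```

A matrix built this way could then be passed to an Affinity Propagation implementation that accepts precomputed similarities, for example scikit-learn's `AffinityPropagation(affinity="precomputed")`; each resulting cluster would be read as one real-world author.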