Author name disambiguation for citations on the deep web

Authors:
Rui Zhang;Derong Shen;Yue Kou;Tiezheng Nie
Affiliations:
College of Information Science and Engineering, Northeastern University, Shenyang, China;College of Information Science and Engineering, Northeastern University, Shenyang, China;College of Information Science and Engineering, Northeastern University, Shenyang, China;College of Information Science and Engineering, Northeastern University, Shenyang, China
Venue:
WAIM'10 Proceedings of the 2010 international conference on Web-age information management
Year:
2010

Citing 11

Two supervised learning approaches for name disambiguation in author citations
Proceedings of the 4th ACM/IEEE-CS joint conference on Digital libraries

Name disambiguation in author citations using a K-way spectral clustering method
Proceedings of the 5th ACM/IEEE-CS joint conference on Digital libraries

Generative models for name disambiguation
Proceedings of the 16th international conference on World Wide Web

Adaptive graphical approach to entity resolution
Proceedings of the 7th ACM/IEEE-CS joint conference on Digital libraries

A constraint-based probabilistic framework for name disambiguation
Proceedings of the sixteenth ACM conference on Conference on information and knowledge management

Citation data clustering for author name disambiguation
Proceedings of the 2nd international conference on Scalable information systems

A unified framework for name disambiguation
Proceedings of the 17th international conference on World Wide Web

Name Disambiguation Using Atomic Clusters
WAIM '08 Proceedings of the 2008 The Ninth International Conference on Web-Age Information Management

GHOST: an effective graph-based framework for name distinction
Proceedings of the 17th ACM conference on Information and knowledge management

A Term-Based Driven Clustering Approach for Name Disambiguation
APWeb/WAIM '09 Proceedings of the Joint International Conferences on Advances in Data and Web Management

A Latent Topic Model for Complete Entity Resolution
ICDE '09 Proceedings of the 2009 IEEE International Conference on Data Engineering

Cited 1

A research agenda for data curation cyberinfrastructure
Proceedings of the 11th annual international ACM/IEEE joint conference on Digital libraries

Abstract

Name ambiguity is a critical problem in many applications, particularly in online bibliographic digital libraries. Although several clustering-based methods have been proposed, the problem remains a major challenge for both data integration and data cleaning. In this paper, we present a complementary study of author name disambiguation from another point of view. We focus on common names, especially non-canonical ones. We propose an approach that automatically accesses authors' personal information over the Deep Web and computes the similarity of every pair of citations from the following features: co-author names, author's affiliation, e-mail address, and title. We then employ the Affinity Propagation clustering algorithm to attribute similar citations to the proper authors. We conducted experiments on five data sources: DBLP, CiteSeer, IEEE, ACM, and Springer LINK. Experimental results show that significant improvements can be obtained by using the proposed approach.
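
The pairwise similarity step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the four features are compared or combined, so the `Citation` record, the Jaccard and exact-match measures, and the `WEIGHTS` values below are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Citation:
    """Hypothetical citation record holding the four features from the abstract."""
    title: str
    coauthors: set = field(default_factory=set)
    affiliation: str = ""
    email: str = ""


def jaccard(a, b):
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def exact(a, b):
    """1.0 when both values are present and equal ignoring case, else 0.0."""
    return 1.0 if a and b and a.strip().lower() == b.strip().lower() else 0.0


# Assumed weights, not taken from the paper; they sum to 1.
WEIGHTS = {"coauthors": 0.4, "affiliation": 0.2, "email": 0.3, "title": 0.1}


def similarity(x, y):
    """Weighted sum of per-feature similarities between two citations."""
    scores = {
        "coauthors": jaccard(x.coauthors, y.coauthors),
        "affiliation": exact(x.affiliation, y.affiliation),
        "email": exact(x.email, y.email),
        "title": jaccard(set(x.title.lower().split()),
                         set(y.title.lower().split())),
    }
    return sum(WEIGHTS[k] * scores[k] for k in WEIGHTS)


def similarity_matrix(citations):
    """Symmetric matrix of pairwise similarities, one row per citation."""
    return [[similarity(x, y) for y in citations] for x in citations]
```

A matrix built this way could then be passed to an Affinity Propagation implementation that accepts precomputed similarities, for example scikit-learn's `AffinityPropagation(affinity="precomputed")`, to group citations by likely author.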