Combining machine learning and human judgment in author disambiguation

Authors:
Yanan Qian;Yunhua Hu;Jianling Cui;Qinghua Zheng;Zaiqing Nie
Affiliations:
MOE KLINNS Lab and SKLMS Lab, Xi'an Jiaotong University, Xi'an, China;Microsoft Research Asia, Beijing, China;College of Software, Nankai University, Tianjin, China;MOE KLINNS Lab and SKLMS Lab, Xi'an Jiaotong University, Xi'an, China;Microsoft Research Asia, Beijing, China
Venue:
Proceedings of the 20th ACM international conference on Information and knowledge management
Year:
2011

Abstract

Author disambiguation in digital libraries becomes increasingly difficult as the number of publications, and consequently the number of ambiguous author names, keeps growing. Fully automatic author disambiguation cannot give satisfactory results because in many cases the available signals are insufficient. Furthermore, having humans correct the output of automatic algorithms is also unsuitable, because the automatically disambiguated results are often mixed and hard for humans to understand. In this paper, we propose a Labeling Oriented Author Disambiguation approach, called LOAD, that combines machine learning and human judgment in author disambiguation. LOAD uses a framework that consists of high-precision clustering, high-recall clustering, and selection and ranking of the top dissimilar clusters. In the framework, supervised learning algorithms are used to train the similarity functions between publications, and a clustering algorithm is then applied to generate clusters. To validate the effectiveness and efficiency of LOAD, comprehensive experiments are conducted. Compared with conventional author disambiguation algorithms, LOAD yields much more accurate results to assist human labeling. Further experiments show that LOAD can dramatically reduce labeling time.
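
The two clustering passes named in the abstract can be illustrated with a minimal sketch. This is not the LOAD implementation. The abstract does not specify the learned similarity function, the clustering algorithm, or the thresholds, so the sketch makes the following assumptions: a Jaccard overlap of coauthor names stands in for the supervised similarity function, single-link grouping via union-find stands in for the clustering algorithm, and the thresholds 0.8 and 0.1 are arbitrary. The selection and ranking of top dissimilar clusters is omitted because the abstract gives no detail on it. All names (`jaccard`, `cluster`, `pubs`) are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Stand-in similarity (assumption): overlap of two coauthor lists."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster(pubs, similarity, threshold):
    """Single-link clustering: merge any pair whose similarity >= threshold."""
    parent = list(range(len(pubs)))

    def find(i):
        # Follow parent pointers to the root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(pubs)), 2):
        if similarity(pubs[i], pubs[j]) >= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(pubs)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Toy publications sharing one ambiguous author name (made-up data).
pubs = [
    {"coauthors": ["A", "B", "C"]},
    {"coauthors": ["A", "B", "C"]},
    {"coauthors": ["C", "D", "E", "F"]},
    {"coauthors": ["X", "Y"]},
]


def sim(p, q):
    return jaccard(p["coauthors"], q["coauthors"])


# A strict threshold favors precision: only near-identical records merge.
high_precision = cluster(pubs, sim, threshold=0.8)
# A loose threshold favors recall: weakly related records merge as well.
high_recall = cluster(pubs, sim, threshold=0.1)

print(high_precision)  # [[0, 1], [2], [3]]
print(high_recall)     # [[0, 1, 2], [3]]
```

In this toy run, the high-recall pass groups publication 2 with the cluster {0, 1}, while the high-precision pass keeps it separate. A pair of this kind, weakly linked but not confidently merged, is the sort of case a human labeler could be asked to judge.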