A collaboration of leading research centers in High Energy Physics (HEP) has built INSPIRE, a novel information infrastructure that comprises the entire corpus of about one million documents produced within the discipline, including a rich set of metadata, citation information, and half a million full-text documents. This corpus offers a unique opportunity for author disambiguation strategies. The presented approach features extended metadata comparison metrics and a three-step unsupervised graph clustering technique. The algorithm helped identify 200,000 individuals from 6,500,000 author signatures. Preliminary tests, based on the knowledge of external experts and a pilot of a crowd-sourcing system, show a success rate of more than 96% within the selected test cases. The resulting author clusters serve as recommendations that INSPIRE users can use to further clean the publication lists in a crowd-sourced approach.