Construction of a large-scale test set for author disambiguation

  • Authors:
  • In-Su Kang; Pyung Kim; Seungwoo Lee; Hanmin Jung; Beom-Jong You

  • Affiliations:
  • Department of Computer Science and Engineering, Kyungsung University, 314-79, Daeyeon-dong, Nam-gu, Busan 608-736, South Korea; Korea Institute of Science and Technology Information (KISTI), 335 Gwahang-no, Yuseong-gu, Daejeon 305-806, South Korea; Korea Institute of Science and Technology Information (KISTI), 335 Gwahang-no, Yuseong-gu, Daejeon 305-806, South Korea; Korea Institute of Science and Technology Information (KISTI), 335 Gwahang-no, Yuseong-gu, Daejeon 305-806, South Korea; Korea Institute of Science and Technology Information (KISTI), 335 Gwahang-no, Yuseong-gu, Daejeon 305-806, South Korea

  • Venue:
  • Information Processing and Management: an International Journal
  • Year:
  • 2011

Abstract

Author disambiguation resolves same-name author occurrences in bibliographic data into distinct namesakes. This enables author-centered search and high-quality social network analysis. To promote research on author disambiguation, KISTI has constructed a new large-scale test set for the field. This article describes the semi-manual procedure used to create the test set and its characteristics, especially in terms of author ambiguity and name diversity. In addition, it reports the baseline performance of author clustering on the test set.