Requirements for a cocitation similarity measure, with special reference to Pearson's correlation coefficient

  • Authors: Per Ahlgren; Bo Jarneving; Ronald Rousseau

  • Affiliations: Swedish School of Library and Information Science, 501 90 Borås, Sweden; Swedish School of Library and Information Science, 501 90 Borås, Sweden; KHBO, Department of Industrial Sciences and Technology, B-8400 Oostende, Belgium

  • Venue: Journal of the American Society for Information Science and Technology
  • Year: 2003

Abstract

Author cocitation analysis (ACA), a special type of cocitation analysis, was introduced by White and Griffith in 1981. This technique is used to analyze the intellectual structure of a given scientific field. In 1990, McCain published a technical overview that has been largely adopted as a standard. In that overview, McCain notes that Pearson's correlation coefficient (Pearson's r) is often used as a similarity measure in ACA and presents some advantages of its use. The present article criticizes the use of Pearson's r in ACA and sets forth two natural requirements that a similarity measure applied in ACA should satisfy. It is shown that Pearson's r does not satisfy these requirements: real and hypothetical data are used to construct counterexamples for both requirements. It is concluded that Pearson's r is probably not an optimal choice of a similarity measure in ACA. Still, further empirical research is needed to show whether, and if so to what extent, the use of similarity measures in ACA that fulfill these requirements would lead to objectively better results in full-scale studies. Further, problems related to incomplete cocitation matrices are discussed.
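
The abstract does not spell out the two requirements, so the following is only an illustrative sketch of how Pearson's r behaves as a similarity measure on cocitation counts, not a reproduction of the article's counterexamples. The count vectors `author_a` and `author_b` and the helper `pearson_r` are hypothetical, invented here for illustration. Each vector holds one author's cocitation counts with a set of other authors. The sketch shows that appending authors who are cocited with neither of the two (zero counts in both vectors) can change the value of r, and even its sign.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length count vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical cocitation counts of two authors with four other authors.
author_a = [10, 8, 1, 0]
author_b = [2, 1, 9, 7]

# The same two authors after six more authors join the study,
# none of whom is cocited with either author (assumed scenario).
padding = [0] * 6
author_a_padded = author_a + padding
author_b_padded = author_b + padding

r_before = pearson_r(author_a, author_b)
r_after = pearson_r(author_a_padded, author_b_padded)

print(f"r before adding zeros: {r_before:.3f}")
print(f"r after adding zeros:  {r_after:.3f}")
```

With these made-up counts, r is negative for the four-author profile and positive once the zero entries are appended, although the cocitation relationship between the two authors themselves is unchanged.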