Modeling the author bias between two on-line computer science citation databases

  • Authors:
  • Vaclav Petricek;Ingemar J. Cox;Hui Han;Isaac G. Councill;C. Lee Giles

  • Affiliations:
  • University College London, London, UK;University College London, London, UK;Yahoo! Inc., Sunnyvale, CA;Pennsylvania State University, University Park, PA;Pennsylvania State University, University Park, PA

  • Venue:
  • WWW '05 Special interest tracks and posters of the 14th international conference on World Wide Web
  • Year:
  • 2005

Abstract

We examine the differences and similarities between two on-line computer science citation databases, DBLP and CiteSeer. The database entries in DBLP are inserted manually, while the CiteSeer entries are obtained autonomously. We show that the CiteSeer database contains considerably fewer single-author papers. This bias can be modeled by an exponential process with an intuitive explanation. The model permits us to predict that the DBLP database covers approximately 30% of the entire literature of Computer Science.
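
The abstract does not spell out the form of the exponential process. As a rough illustration only, the sketch below assumes that a paper with n authors is indexed with probability 1 - exp(-beta * n), so that each additional author raises the chance of inclusion. This functional form, the parameter name beta, and the function names are assumptions made for the example, not the authors' stated model, and no values here are taken from the paper.

```python
import math
from typing import Dict, Mapping


def inclusion_probability(n_authors: int, beta: float) -> float:
    """Assumed model: chance that a paper with n_authors is indexed."""
    if n_authors < 1:
        raise ValueError("a paper needs at least one author")
    return 1.0 - math.exp(-beta * n_authors)


def observed_share(true_share: Mapping[int, float], beta: float) -> Dict[int, float]:
    """Share of papers per author count as seen inside the database.

    true_share maps author count -> share of the whole literature
    (hypothetical input; the paper's data is not reproduced here).
    """
    weights = {n: s * inclusion_probability(n, beta) for n, s in true_share.items()}
    total = sum(weights.values())
    return {n: w / total for n, w in weights.items()}


def expected_coverage(true_share: Mapping[int, float], beta: float) -> float:
    """Fraction of the whole literature that ends up in the database."""
    total = sum(true_share.values())
    covered = sum(s * inclusion_probability(n, beta) for n, s in true_share.items())
    return covered / total
```

Under this assumed form, single-author papers are under-represented in the database relative to their true share, which matches the direction of the bias the abstract describes. Fitting beta to observed author-count distributions would, in the same spirit, yield a coverage estimate via expected_coverage.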