Scienceography: the study of how science is written

  • Authors: Graham Cormode; S. Muthukrishnan; Jinyun Yan

  • Affiliations: AT&T Labs Research; Rutgers University; Rutgers University

  • Venue: FUN'12: Proceedings of the 6th International Conference on Fun with Algorithms
  • Year: 2012

Abstract

Scientific literature has itself been the subject of much scientific study, for a variety of reasons: understanding how results are communicated and how ideas spread, and assessing the influence of areas or individuals. However, most prior work has focused on extracting and analyzing citation and stylistic patterns. In this work, we introduce the notion of 'scienceography', which focuses on the writing of science. We provide a first large-scale study using data derived from the arXiv e-print repository. Crucially, our data includes the "source code" of scientific papers (the LaTeX source), which enables us to study features not present in the "final product", such as the tools used and private comments between authors. Our study identifies broad patterns and trends in two example areas, computer science and mathematics, and highlights key differences in the way that science is written in these fields. Finally, we outline future directions to extend the new topic of scienceography.