File size distribution on UNIX systems: then and now

  • Authors: Andrew S. Tanenbaum; Jorrit N. Herder; Herbert Bos

  • Affiliations: Vrije Universiteit, Amsterdam, The Netherlands (all three authors)

  • Venue: ACM SIGOPS Operating Systems Review
  • Year: 2006

Abstract

Knowledge of the file size distribution is needed to optimize file system design. In particular, if all the files are small, the disk block size should also be small, so that partially filled blocks do not waste too large a fraction of the disk. On the other hand, if files are generally large, a large block size is preferable because it leads to more efficient transfers. Only by knowing the file size distribution can reasonable choices be made. In 1984, we published the file size distribution for a university computer science department. We have now made the same measurements 20 years later to see how file sizes have changed. In short, the median file size has more than doubled (from 1080 bytes to 2475 bytes), but large files still dominate the storage requirements.
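
The block-size trade-off described in the abstract can be illustrated with a small sketch. This is not the authors' measurement tool; it is a minimal Python example that assumes every file occupies a whole number of fixed-size blocks and ignores inodes, indirect blocks, and fragment or tail-packing optimizations. The function names `file_sizes` and `wasted_fraction` are hypothetical and chosen only for illustration.

```python
import os


def file_sizes(root):
    """Yield the size in bytes of each regular file under root (symlinks skipped)."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.path.isfile(path) and not os.path.islink(path):
                    yield os.path.getsize(path)
            except OSError:
                # The file vanished or is unreadable; skip it.
                continue


def wasted_fraction(sizes, block_size):
    """Fraction of allocated space lost to partially filled last blocks.

    Simplifying assumption: each file is stored in ceil(size / block_size)
    whole blocks, with no other per-file overhead.
    """
    sizes = list(sizes)
    # -(-s // b) is integer ceiling division.
    allocated = sum(-(-s // block_size) * block_size for s in sizes)
    used = sum(sizes)
    return 0.0 if allocated == 0 else (allocated - used) / allocated


if __name__ == "__main__":
    sizes = list(file_sizes("."))
    for block_size in (512, 1024, 2048, 4096, 8192):
        print(block_size, round(wasted_fraction(sizes, block_size), 3))
```

Running the script in a directory tree prints the estimated wasted fraction for several candidate block sizes. Under these assumptions, a tree dominated by small files shows the waste growing quickly as the block size increases, which is the effect the abstract describes.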