An analysis of compare-by-hash

  • Authors: Val Henson
  • Affiliations: Sun Microsystems
  • Venue: HOTOS'03 Proceedings of the 9th conference on Hot Topics in Operating Systems - Volume 9
  • Year: 2003

Abstract

Recent research has produced a new and perhaps dangerous technique for uniquely identifying blocks that I will call compare-by-hash. Using this technique, we decide whether two blocks are identical to each other by comparing their hash values, using a collision-resistant hash such as SHA-1[5]. If the hash values match, we assume the blocks are identical without further ado. Users of compare-by-hash argue that this assumption is warranted because the chance of a hash collision between any two randomly generated blocks is estimated to be many orders of magnitude smaller than the chance of many kinds of hardware errors. Further analysis shows that this approach is not as risk-free as it seems at first glance.
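
To make the technique described in the abstract concrete, here is a minimal sketch in Python. It is not taken from the paper: the function names (block_digest, same_by_hash, collision_bound) are hypothetical, and the collision estimate assumes hash outputs behave like independent, uniformly random 160-bit values, which is the assumption that users of compare-by-hash rely on and that the paper calls into question.

```python
import hashlib


def block_digest(block: bytes) -> bytes:
    """Return the 20-byte SHA-1 digest of a block."""
    return hashlib.sha1(block).digest()


def same_by_hash(a: bytes, b: bytes) -> bool:
    """Compare-by-hash: treat two blocks as identical if their digests match.

    The block contents themselves are never compared, so a hash collision
    would be silently reported as equality.
    """
    return block_digest(a) == block_digest(b)


def collision_bound(num_blocks: int, hash_bits: int = 160) -> float:
    """Birthday-style upper bound on the chance that any two of num_blocks
    distinct blocks share a digest.

    Assumption (not from the source): digests are uniformly random and
    independent. Real inputs are not random, so this is only the optimistic
    estimate, not a guarantee.
    """
    pairs = num_blocks * (num_blocks - 1) / 2
    return pairs / 2.0 ** hash_bits


if __name__ == "__main__":
    print(same_by_hash(b"block A", b"block A"))
    print(same_by_hash(b"block A", b"block B"))
    print(collision_bound(2 ** 40))
```

Under these assumptions the bound stays extremely small even for 2^40 blocks, which is the kind of figure typically set against hardware error rates. The sketch says nothing about inputs chosen deliberately to collide.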