Tight Bounds for Hashing Block Sources

  • Authors:
  • Kai-Min Chung; Salil Vadhan

  • Affiliations:
  • School of Engineering & Applied Sciences, Harvard University, Cambridge, MA; School of Engineering & Applied Sciences, Harvard University, Cambridge, MA

  • Venue:
  • APPROX '08 / RANDOM '08 Proceedings of the 11th international workshop, APPROX 2008, and 12th international workshop, RANDOM 2008 on Approximation, Randomization and Combinatorial Optimization: Algorithms and Techniques
  • Year:
  • 2008

Abstract

It is known that if a 2-universal hash function H is applied to elements of a block source (X_1, ..., X_T), where each item X_i has enough min-entropy conditioned on the previous items, then the output distribution (H, H(X_1), ..., H(X_T)) will be "close" to the uniform distribution. We provide improved bounds on how much min-entropy per item is required for this to hold, both when we ask that the output be close to uniform in statistical distance and when we only ask that it be statistically close to a distribution with small collision probability. In both cases, we reduce the dependence of the min-entropy on the number T of items from 2 log T in previous work to log T, which we show to be optimal. This leads to corresponding improvements to the recent results of Mitzenmacher and Vadhan (SODA '08) on the analysis of hashing-based algorithms and data structures when the data items come from a block source.
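
To make the setting concrete, the following is a minimal sketch of applying a single randomly chosen hash function to a sequence of items, as in the output (H, H(X_1), ..., H(X_T)) described above. It is not taken from the paper. It assumes the standard Carter-Wegman family h(x) = ((a*x + b) mod p) mod m as the 2-universal family, and the names `P`, `M`, `make_hash`, `hash_sequence`, and `collision_probability` are hypothetical choices made for illustration. The sketch only checks the defining collision bound of the family; it does not model min-entropy or reproduce the paper's bounds.

```python
import random
from fractions import Fraction

# Assumed toy parameters (not from the paper).
P = 17  # a small prime; items are integers in [0, P)
M = 4   # number of hash buckets


def make_hash(a, b, p=P, m=M):
    """Return the function h(x) = ((a*x + b) mod p) mod m."""
    def h(x):
        return ((a * x + b) % p) % m
    return h


def hash_sequence(items, rng):
    """Draw ONE hash function and apply it to every item.

    Returns the description of H together with the hashed items,
    mirroring the tuple (H, H(X_1), ..., H(X_T)).
    """
    a = rng.randrange(1, P)  # a must be nonzero
    b = rng.randrange(0, P)
    h = make_hash(a, b)
    return (a, b), [h(x) for x in items]


def collision_probability(x, y):
    """Exact probability, over all (a, b), that h(x) == h(y)."""
    hits = 0
    total = 0
    for a in range(1, P):
        for b in range(P):
            h = make_hash(a, b)
            total += 1
            if h(x) == h(y):
                hits += 1
    return Fraction(hits, total)


if __name__ == "__main__":
    key, outputs = hash_sequence([1, 2, 3, 5, 8], random.Random(0))
    print("H =", key, "outputs =", outputs)
    # For distinct x and y, the family guarantees probability <= 1/M.
    print(collision_probability(3, 5) <= Fraction(1, M))
```

The same function is reused across all items, which is why the analysis in the abstract treats H as part of the output distribution rather than drawing a fresh function per item.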