Testing k-wise and almost k-wise independence

  • Authors:
  • Noga Alon;Alexandr Andoni;Tali Kaufman;Kevin Matulef;Ronitt Rubinfeld;Ning Xie

  • Affiliations:
  • Tel Aviv University, Tel Aviv , Israel;MIT, Cambridge, MA;MIT, Cambridge, MA;MIT, Cambridge, MA;MIT, Cambridge, MA;State Univ. of New York at Buffalo, Amherst, NY

  • Venue:
  • Proceedings of the thirty-ninth annual ACM symposium on Theory of computing
  • Year:
  • 2007

Quantified Score

Hi-index 0.00

Visualization

Abstract

In this work, we consider the problems of testing whether adistribution over (0,1n) is k-wise (resp. (ε,k)-wise) independentusing samples drawn from that distribution. For the problem of distinguishing k-wise independent distributions from those that are δ-far from k-wise independence in statistical distance, we upper bound the number ofrequired samples by Õ(nk/δ2) and lower bound it by Ω(nk-1/2/δ) (these bounds hold for constantk, and essentially the same bounds hold for general k). Toachieve these bounds, we use Fourier analysis to relate adistribution's distance from k-wise independence to its biases, a measure of the parity imbalance it induces on a setof variables. The relationships we derive are tighter than previouslyknown, and may be of independent interest. To distinguish (ε,k)-wise independent distributions from thosethat are δ-far from (ε,k)-wise independence in statistical distance, we upper bound thenumber of required samples by O(k log n / δ2ε2) and lower bound it by Ω(√ k log n / 2k(ε+δ)√ log 1/2k(ε+δ)). Although these bounds are anexponential improvement (in terms of n and k) over thecorresponding bounds for testing k-wise independence, we give evidence thatthe time complexity of testing (ε,k)-wise independence isunlikely to be poly(n,1/ε,1/δ) for k=Θ(log n),since this would disprove a plausible conjecture concerning the hardness offinding hidden cliques in random graphs. Under the conjecture, ourresult implies that for, say, k = log n and ε = 1 / n0.99,there is a set of (ε,k)-wise independent distributions, and a set of distributions at distance δ=1/n0.51 from (ε,k)-wiseindependence, which are indistinguishable by polynomial time algorithms.