Learning DNF by Approximating Inclusion-Exclusion Formulae

  • Authors: Jun Tarui, Tatsuie Tsukiji

  • Venue: COCO '99: Proceedings of the Fourteenth Annual IEEE Conference on Computational Complexity
  • Year: 1999

Abstract

Distribution-free learnability and unlearnability of polynomial-size Disjunctive Normal Form (DNF) formulae are discussed. Recently, Bshouty showed that DNF is distribution-free learnable in $2^{O(\sqrt n (\log n)^2)}$ time. In this paper we show that Bshouty's learning time is attained by naively searching for weak hypotheses among short symmetric functions and appealing to the widely-known boosting techniques investigated by Schapire and also by Freund. To obtain this learnability result, we show that a given polynomial-size DNF formula can be approximated by a conjunction of length $O(\sqrt n \log n)$ with accuracy at least $2^{-O(\sqrt n \log^2 n)}$. We also obtain a similar lower bound for learning DNF under a certain joint distribution. More precisely, for any $0 \le \varepsilon \le 1/2$ and any Boolean conjunction $f$ of length $\Theta(n)$, we construct a joint distribution over $(x,y) \in \{0,1\}^n \times \{0,1\}$ such that $f(x) = y$ holds with probability at least $1-\varepsilon$, but $h(x) = y$ holds with probability exactly $1/2$ for any Boolean function $h : \{0,1\}^n \to \{0,1\}$ that depends on at most $O(\sqrt{n\varepsilon})$ variables. Therefore, under such a joint distribution, any naive search must enumerate at least $2^{\Omega(\sqrt{n\varepsilon} \log n)}$ symmetric functions to obtain a hypothesis better than guessing at random.
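
The sketch below is a minimal, hypothetical Python illustration of the learning scheme the abstract outlines: exhaustively ("naively") searching short conjunctions as weak hypotheses and combining them with AdaBoost-style boosting in the spirit of Schapire and Freund. It is not the authors' algorithm or code; the toy parameters (n, max_len, rounds), the sample generator, and all helper names are assumptions made only for this example.

```python
# Hypothetical sketch: weak learning by naive search over short conjunctions,
# combined by AdaBoost-style boosting. Toy sizes only; not from the paper.
from itertools import combinations, product
import math
import random

def conj(literals, x):
    # literals: tuple of (index, required_bit); the empty tuple is the constant-1 conjunction
    return 1 if all(x[i] == b for i, b in literals) else 0

def short_conjunctions(n, max_len):
    # Enumerate every conjunction of at most max_len literals over n variables.
    yield ()
    for k in range(1, max_len + 1):
        for idxs in combinations(range(n), k):
            for bits in product((0, 1), repeat=k):
                yield tuple(zip(idxs, bits))

def best_weak(samples, w, n, max_len):
    # Naive search: pick the conjunction with the largest weighted advantage over 1/2.
    best = max(short_conjunctions(n, max_len),
               key=lambda c: abs(sum(wi for (x, y), wi in zip(samples, w)
                                     if conj(c, x) == y) - 0.5))
    agree = sum(wi for (x, y), wi in zip(samples, w) if conj(best, x) == y)
    flip = agree < 0.5                      # use the negated conjunction if that is better
    return best, flip, abs(agree - 0.5)

def boost(samples, n, max_len=2, rounds=5):
    # AdaBoost over the class of short conjunctions; returns a weighted-majority hypothesis.
    m = len(samples)
    w = [1.0 / m] * m
    committee = []
    for _ in range(rounds):
        c, flip, adv = best_weak(samples, w, n, max_len)
        if adv == 0:
            break
        eps = 0.5 - adv
        alpha = 0.5 * math.log((1 - eps) / max(eps, 1e-12))
        committee.append((c, flip, alpha))
        pred = lambda x, c=c, flip=flip: (1 - conj(c, x)) if flip else conj(c, x)
        # Reweight: increase weight on misclassified examples, then renormalize.
        w = [wi * math.exp(-alpha if pred(x) == y else alpha)
             for (x, y), wi in zip(samples, w)]
        s = sum(w)
        w = [wi / s for wi in w]
    def hypothesis(x):
        vote = sum(a * (1 if (((1 - conj(c, x)) if f else conj(c, x)) == 1) else -1)
                   for c, f, a in committee)
        return 1 if vote >= 0 else 0
    return hypothesis

# Toy usage: fit a small DNF over n = 6 variables from 200 random examples.
if __name__ == "__main__":
    random.seed(0)
    n = 6
    target = lambda x: 1 if (x[0] and x[1]) or (x[2] and not x[3]) else 0
    xs = [tuple(random.randint(0, 1) for _ in range(n)) for _ in range(200)]
    samples = [(x, target(x)) for x in xs]
    h = boost(samples, n)
    err = sum(h(x) != y for x, y in samples) / len(samples)
    print("training error:", err)
```

In the paper's regime, the weak hypotheses would be conjunctions (or symmetric functions) of length $O(\sqrt n \log n)$ rather than the tiny max_len used here, so the naive search itself dominates the running time and yields the stated $2^{O(\sqrt n (\log n)^2)}$ bound.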