We study the problem of PAC learning the class of DNF formulas with a type of natural pairwise query specific to the DNF representation. Specifically, given a pair of positive examples from a polynomial-sized sample, we consider boolean queries that ask whether the two examples satisfy at least one term in common in the target DNF, and numerical queries that ask how many terms the two examples satisfy in common. We provide both positive and negative results for learning with these queries under both uniform and general distributions.

For boolean queries, we show that learning an arbitrary DNF target under an arbitrary distribution is no easier than in the traditional PAC model. On the positive side, we show that under the uniform distribution we can properly learn any DNF formula with O(log(n)) relevant variables, any DNF formula in which each variable appears in at most O(log(n)) terms, and any DNF formula having at most 2^(O(√(log n))) terms. Under general distributions, we show that 2-term DNFs are efficiently properly learnable, as are disjoint DNFs.

For numerical queries, we show that arbitrary DNF formulas are learnable under the uniform distribution; in the process, we give an algorithm for learning a sum of monotone terms from labeled data alone. Numerical-valued queries also allow us to properly learn, under arbitrary distributions, any DNF with O(log(n)) relevant variables, any DNF with O(log(n)) terms, and any DNF in which each example satisfies at most O(1) terms.

Other possible generalizations of the query include allowing the algorithm to ask the query for an arbitrary number of examples from the sample at once (rather than just two), or allowing the algorithm to ask the query for examples of its own construction; we show that both of these generalizations allow for efficient proper learnability of arbitrary DNF formulas under arbitrary distributions.
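To make the query model concrete, here is a minimal sketch (our own illustration, not code from the paper) that simulates the two pairwise query types against a known target DNF. A term is represented as a set of signed integers: a positive i requires x_i = 1, a negative i requires x_i = 0. The boolean query asks whether two positive examples satisfy some term in common; the numerical query returns how many terms they satisfy in common.

```python
def satisfies_term(example, term):
    """True iff the example (a tuple of 0/1 values) satisfies every
    literal in the term; literal i > 0 means x_i = 1, i < 0 means x_i = 0."""
    return all(example[abs(lit) - 1] == (1 if lit > 0 else 0) for lit in term)

def shared_term_count(x, y, dnf):
    """Numerical pairwise query: number of terms of the target DNF
    that x and y both satisfy."""
    return sum(1 for t in dnf if satisfies_term(x, t) and satisfies_term(y, t))

def shares_a_term(x, y, dnf):
    """Boolean pairwise query: do x and y satisfy at least one term in common?"""
    return shared_term_count(x, y, dnf) > 0

# Hypothetical target: (x1 AND x2) OR (x2 AND NOT x3)
dnf = [{1, 2}, {2, -3}]
a = (1, 1, 1)  # positive example satisfying only the first term
b = (0, 1, 0)  # positive example satisfying only the second term
c = (1, 1, 0)  # positive example satisfying both terms

print(shares_a_term(a, b, dnf))      # False: no term in common
print(shared_term_count(a, c, dnf))  # 1: they share the first term
print(shared_term_count(c, c, dnf))  # 2: c satisfies both terms
```

Note that in the paper's model the learner only issues such queries on pairs of positive examples drawn from its sample; the target DNF itself stays hidden, so the simulation above plays the role of the oracle answering the queries.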