Learnability of DNF with representation-specific queries

  • Authors:
  • Liu Yang; Avrim Blum; Jaime Carbonell

  • Affiliations:
  • Carnegie Mellon University, Pittsburgh, USA (all authors)

  • Venue:
  • Proceedings of the 4th conference on Innovations in Theoretical Computer Science
  • Year:
  • 2013

Abstract

We study the problem of PAC learning the class of DNF formulas with a type of natural pairwise query specific to the DNF representation. Specifically, given a pair of positive examples from a polynomial-sized sample, we consider boolean queries that ask whether the two examples satisfy at least one term in common in the target DNF, and numerical queries that ask how many terms the two examples satisfy in common. We provide both positive and negative results for learning with these queries under both uniform and general distributions. For boolean queries, we show that learning an arbitrary DNF target under an arbitrary distribution is no easier than in the traditional PAC model. On the positive side, we show that under the uniform distribution we can properly learn any DNF formula with O(log(n)) relevant variables, any DNF formula in which each variable appears in at most O(log(n)) terms, and any DNF formula having at most 2^{O(√log(n))} terms. Under general distributions, we show that 2-term DNFs and disjoint DNFs are efficiently properly learnable. For numerical queries, we show that arbitrary DNF formulas are learnable under the uniform distribution; in the process, we give an algorithm for learning a sum of monotone terms from labeled data alone. Numerical queries also allow us to properly learn, under arbitrary distributions, any DNF with O(log(n)) relevant variables, any DNF having O(log(n)) terms, and any DNF in which each example can satisfy at most O(1) terms. Other possible generalizations of the query include allowing the algorithm to ask the query about an arbitrary number of examples from the sample at once (rather than just two), or allowing the algorithm to ask the query about examples of its own construction; we show that both of these generalizations allow efficient proper learning of arbitrary DNF functions under arbitrary distributions.
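To make the query model concrete, here is a small illustrative sketch (not from the paper) of the two pairwise queries on a hypothetical target DNF. Terms are represented as sets of signed literals; the target formula, variable count, and example assignments below are invented for illustration only.

```python
# Toy model of the representation-specific queries described above.
# A term is a set of literals over 1-indexed variables: literal i
# requires x[i] = 1, literal -i requires x[i] = 0.

def satisfies(term, x):
    """Does assignment x (tuple of 0/1 values) satisfy every literal in term?"""
    return all(x[abs(lit) - 1] == (1 if lit > 0 else 0) for lit in term)

def numerical_query(dnf, x1, x2):
    """Numerical query: how many terms of the target do x1 and x2 satisfy in common?"""
    return sum(1 for term in dnf if satisfies(term, x1) and satisfies(term, x2))

def boolean_query(dnf, x1, x2):
    """Boolean query: do x1 and x2 satisfy at least one term in common?"""
    return numerical_query(dnf, x1, x2) > 0

# Hypothetical 3-term target over 4 variables: (x1 AND x2) OR x3 OR (NOT x1 AND x4)
target = [{1, 2}, {3}, {-1, 4}]
a = (1, 1, 1, 0)  # satisfies terms {1,2} and {3}
b = (0, 0, 1, 1)  # satisfies terms {3} and {-1,4}
c = (1, 1, 0, 0)  # satisfies term {1,2} only
```

Here the boolean query on (a, b) answers "yes" (they share the term x3), the numerical query answers 1, and the boolean query on (b, c) answers "no". The learner, of course, never sees the target's terms; it only receives these query answers.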