Communications of the ACM
Learning from good and bad data
A hard-core predicate for all one-way functions
STOC '89 Proceedings of the twenty-first annual ACM symposium on Theory of computing
The Strength of Weak Learnability
Machine Learning
Boosting a weak learning algorithm by majority
COLT '90 Proceedings of the third annual workshop on Computational learning theory
Types of noise in data for concept learning
COLT '88 Proceedings of the first annual workshop on Computational learning theory
Learning k-DNF with noise in the attributes
COLT '88 Proceedings of the first annual workshop on Computational learning theory
When won't membership queries help?
STOC '91 Proceedings of the twenty-third annual ACM symposium on Theory of computing
Learning decision trees using the Fourier spectrum
STOC '91 Proceedings of the twenty-third annual ACM symposium on Theory of computing
Toward efficient agnostic learning
COLT '92 Proceedings of the fifth annual workshop on Computational learning theory
Exact identification of read-once formulas using fixed points of amplification functions
SIAM Journal on Computing
Efficient noise-tolerant learning from statistical queries
STOC '93 Proceedings of the twenty-fifth annual ACM symposium on Theory of computing
Weakly learning DNF and characterizing statistical query learning using Fourier analysis
STOC '94 Proceedings of the twenty-sixth annual ACM symposium on Theory of computing
An efficient membership-query algorithm for learning DNF with respect to the uniform distribution
Journal of Computer and System Sciences
Uniform-distribution attribute noise learnability
COLT '99 Proceedings of the twelfth annual conference on Computational learning theory
Noise-tolerant learning, the parity problem, and the statistical query model
STOC '00 Proceedings of the thirty-second annual ACM symposium on Theory of computing
Machine Learning
Learning by extended statistical queries and its relation to PAC learning
EuroCOLT '95 Proceedings of the Second European Conference on Computational Learning Theory
FOCS '99 Proceedings of the 40th Annual Symposium on Foundations of Computer Science
Learning with Queries Corrupted by Classification Noise
ISTCS '97 Proceedings of the Fifth Israel Symposium on the Theory of Computing Systems (ISTCS '97)
A General Dimension for Approximately Learning Boolean Functions
ALT '02 Proceedings of the 13th International Conference on Algorithmic Learning Theory
Proceedings of the thirty-fifth annual ACM symposium on Theory of computing
Learning functions of k relevant variables
Journal of Computer and System Sciences - Special issue: STOC 2003
Learning DNF from random walks
Journal of Computer and System Sciences - Special issue: Learning theory 2003
A general dimension for query learning
Journal of Computer and System Sciences
Learning with errors in answers to membership queries
Journal of Computer and System Sciences
Evolvability from learning algorithms
STOC '08 Proceedings of the fortieth annual ACM symposium on Theory of computing
Application of a generalization of Russo's formula to learning from multiple random oracles
Combinatorics, Probability and Computing
Exploiting Product Distributions to Identify Relevant Variables of Correlation Immune Functions
The Journal of Machine Learning Research
Characterizing statistical query learning: simplified notions and proofs
ALT'09 Proceedings of the 20th international conference on Algorithmic learning theory
SIAM Journal on Computing
Learning DNF by statistical and proper distance queries under the uniform distribution
ALT'05 Proceedings of the 16th international conference on Algorithmic Learning Theory
On attribute efficient and non-adaptive learning of parities and DNF expressions
COLT'05 Proceedings of the 18th annual conference on Learning Theory
A complete characterization of statistical query learning with applications to evolvability
Journal of Computer and System Sciences
The Kushilevitz-Mansour (KM) algorithm finds all the "large" Fourier coefficients of a Boolean function. It is the main tool for learning decision trees and DNF expressions in the PAC model with respect to the uniform distribution. The algorithm requires access to a membership query (MQ) oracle; such access is often unavailable in learning applications, and then the KM algorithm cannot be used. We significantly weaken this requirement by producing an analogue of the KM algorithm that uses extended statistical queries (SQs): statistical queries in which the expectation is taken with respect to a distribution specified by the learning algorithm. We restrict the set of distributions that a learning algorithm may use for its statistical queries to product distributions in which each bit is 1 with probability ρ, 1/2, or 1−ρ for a constant ρ with 0 < ρ ≤ 1/2 (we denote the resulting model by SQ-Dρ). Our analogue, which we call the Bounded Sieve (BS), finds all the "large" Fourier coefficients of degree lower than c·log(n). We use BS to learn decision trees, and by adapting Freund's boosting technique we give an algorithm that learns DNF in SQ-Dρ. An important property of the model is that its algorithms can be simulated by MQs with persistent noise. With some modifications, BS can also be simulated by MQs with product attribute noise (i.e., for a query x, the oracle changes every bit of x with some constant probability and returns the value of the target function at the resulting point) and classification noise. This implies learnability of decision trees and weak learnability of DNF in the presence of this non-trivial noise. In the second part of this paper we develop a characterization of learnability with these extended statistical queries. We show that our characterization, when applied to SQ-Dρ, is tight in terms of learning parity functions. We extend the result of Blum et al. by proving that there is a class learnable in the PAC model with random classification noise but not learnable in SQ-Dρ.
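As a small illustration of the statistical-query view of Fourier analysis used above, the sketch below estimates a single Fourier coefficient f̂(S) = E_x[f(x)·χ_S(x)] of a Boolean function from uniform random examples alone; this expectation is precisely a statistical query, so no membership queries are needed. It also sketches a membership oracle with product attribute noise, which flips each queried bit independently before evaluating the target. This is a minimal sketch under simplifying assumptions, not the paper's Bounded Sieve; all names and parameters (`estimate_fourier_coefficient`, `noisy_mq`, `samples`) are illustrative.

```python
import random


def chi(S, x):
    """Parity character chi_S(x) = (-1)^(sum of x_i for i in S), with x in {0,1}^n."""
    return -1 if sum(x[i] for i in S) % 2 else 1


def estimate_fourier_coefficient(f, S, n, samples=20000, rng=None):
    """Estimate f̂(S) = E_{x ~ uniform}[f(x) * chi_S(x)] by sampling.

    The expectation is a single statistical query, so the estimate uses
    only uniform random examples, not membership queries.
    """
    rng = rng or random.Random(0)
    total = 0
    for _ in range(samples):
        x = [rng.randint(0, 1) for _ in range(n)]
        total += f(x) * chi(S, x)
    return total / samples


def noisy_mq(f, x, p, rng):
    """Membership query with product attribute noise: each bit of the
    query x is flipped independently with probability p before f is
    evaluated at the perturbed point."""
    y = [b ^ (1 if rng.random() < p else 0) for b in x]
    return f(y)


# Example: f is the parity of bits {0, 2}, so f̂({0, 2}) = 1 and all
# other Fourier coefficients are 0.
f = lambda x: chi({0, 2}, x)
est = estimate_fourier_coefficient(f, {0, 2}, n=4)
```

Here `est` recovers the coefficient exactly (every sampled product is 1), while the same estimator applied to any other index set concentrates near 0 at rate ~1/√samples, which is why only "large" coefficients are detectable from polynomially many queries.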