A Markov chain Monte Carlo (MCMC) method has previously been introduced to estimate the weighted sums computed by multiplicative weight update algorithms (such as Winnow) when the number of inputs is exponentially large. However, the original algorithm still required extensive simulation of the Markov chain to obtain accurate estimates of the weighted sums. We propose an optimized version of the original algorithm that produces exactly the same classifications while often requiring fewer Markov chain simulations. We also apply three other sampling techniques and empirically compare them with the original Metropolis sampler to determine how efficiently each draws good samples, measured both by the accuracy of the weighted sum estimates and by Winnow's prediction accuracy. We found that two of the other samplers (Gibbs and Metropolized Gibbs) were slightly better than Metropolis in their estimates of the weighted sums. With respect to prediction error, there was little difference between any pair of MCMC techniques we tested. Further, on the data sets we tested, none of the approximate versions of Winnow was at a disadvantage relative to brute-force Winnow (in which the weighted sums are computed exactly), so generalization accuracy is not compromised by our approximation. This held even when very small sample sizes and mixing times were used.
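To make the estimation concrete, here is a minimal sketch (ours, not the paper's implementation; the single-bit-flip proposal, the toy weight function, and all names such as estimate_normalized_sum are assumptions for illustration) of how a Metropolis or Gibbs chain over the 2^n candidate terms can estimate a normalized weighted sum without enumerating the terms:

```python
import math
import random

def metropolis_step(z, log_weight):
    """One Metropolis update: propose flipping one uniformly chosen bit of z
    (a symmetric proposal), accept with probability min(1, w(z')/w(z))."""
    i = random.randrange(len(z))
    z_new = z[:]
    z_new[i] ^= 1
    log_ratio = log_weight(z_new) - log_weight(z)
    if random.random() < math.exp(min(0.0, log_ratio)):
        return z_new
    return z

def gibbs_step(z, log_weight):
    """One Gibbs update: resample one uniformly chosen bit from its exact
    conditional distribution given all of the other bits."""
    i = random.randrange(len(z))
    z0, z1 = z[:], z[:]
    z0[i], z1[i] = 0, 1
    # P(bit i = 1 | rest) = w(z1) / (w(z0) + w(z1))
    p1 = 1.0 / (1.0 + math.exp(log_weight(z0) - log_weight(z1)))
    z[i] = 1 if random.random() < p1 else 0
    return z

def estimate_normalized_sum(n, log_weight, indicator, step,
                            num_samples=2000, burn_in=1000):
    """Estimate  sum_z w(z)*indicator(z) / sum_z w(z)  over all 2^n
    bit-vectors z by averaging indicator(z) along a chain whose
    stationary distribution is pi(z) proportional to w(z)."""
    z = [random.randint(0, 1) for _ in range(n)]  # arbitrary start state
    total = 0.0
    for t in range(burn_in + num_samples):
        z = step(z, log_weight)
        if t >= burn_in:
            total += indicator(z)
    return total / num_samples

# Toy stand-ins (hypothetical): the weight grows with the number of 1-bits
# in z, and a term z "fires" on x when z's 1-bits are a subset of x's.
n = 30  # 2^30 terms -- far too many to enumerate directly
x = [random.randint(0, 1) for _ in range(n)]
log_w = lambda z: math.log(2.0) * sum(z)  # w(z) = 2^{#ones(z)}
fires = lambda z: float(all(x[i] for i in range(n) if z[i]))
print(estimate_normalized_sum(n, log_w, fires, metropolis_step))
print(estimate_normalized_sum(n, log_w, fires, gibbs_step))
```

The average of the indicator along the chain estimates the weight-normalized sum; turning this into Winnow's unnormalized weighted sum, and hence a thresholded prediction, additionally requires tracking the total weight of all terms, which the full algorithm must handle and this sketch omits. The Metropolized Gibbs variant compared in the paper is likewise omitted here.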