Universal ε-approximators for integrals

  • Authors:
  • Michael Langberg; Leonard J. Schulman

  • Affiliations:
  • Open University of Israel, Israel; California Institute of Technology, Pasadena, CA

  • Venue:
  • SODA '10 Proceedings of the twenty-first annual ACM-SIAM symposium on Discrete Algorithms
  • Year:
  • 2010


Abstract

Let X be a space and F a family of {0, 1}-valued functions on X. Vapnik and Chervonenkis showed that if F is "simple" (finite VC dimension), then for every probability measure μ on X and ε > 0 there is a finite set S such that for all f ∈ F, Σ_{x∈S} f(x)/|S| = ∫ f(x)dμ(x) ± ε. Think of S as a "universal ε-approximator" for integration in F. S can actually be obtained w.h.p. just by sampling a few points from μ. This is a mainstay of computational learning theory. It was later extended by other authors to families of bounded (e.g., [0, 1]-valued) real functions. In this work we establish similar "universal ε-approximators" for families of unbounded nonnegative real functions; in particular, for the families over which one optimizes when performing data classification. (In this case the ε-approximation should be multiplicative.) Specifically, let F be the family of "k-median functions" (or k-means, etc.) on R^d with an arbitrary norm ϱ. That is, any set u_1, ..., u_k ∈ R^d determines an f by f(x) = (min_i ϱ(x − u_i))^α. (Here α ≥ 0.) Then for every measure μ on R^d there exists a set S of cardinality poly(k, d, 1/ε) and a measure ν supported on S such that for every f ∈ F, Σ_{x∈S} f(x)ν(x) ∈ (1 ± ε) · ∫ f(x)dμ(x).
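The abstract's guarantee can be made concrete with a minimal sketch: the code below defines the k-median cost f(x) = (min_i ϱ(x − u_i))^α for the Euclidean norm (an assumption; the paper allows an arbitrary norm ϱ) and empirically checks the simpler additive, VC-style guarantee for a uniformly drawn sample S from a discrete measure μ. This illustrates only the sampling idea, not the paper's weighted coreset construction, which achieves the stronger multiplicative (1 ± ε) bound via a measure ν supported on S.

```python
import math
import random

def k_cost(x, centers, alpha=1.0):
    """f(x) = (min_i ϱ(x - u_i))^α, with ϱ taken as the Euclidean norm."""
    return min(math.dist(x, u) for u in centers) ** alpha

# A discrete measure μ: uniform over 10,000 random points in R^2.
random.seed(0)
points = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(10000)]
centers = [(0.0, 0.0), (3.0, 3.0)]  # the u_i determining one f in F (k = 2)

# ∫ f(x) dμ(x) for this discrete μ is just the average cost over all points.
true_mean = sum(k_cost(p, centers) for p in points) / len(points)

# Uniform sample S from μ; its sample average approximates the integral
# additively (Σ_{x∈S} f(x)/|S| = ∫ f dμ ± ε, w.h.p.).
S = random.sample(points, 500)
sample_mean = sum(k_cost(p, centers) for p in S) / len(S)
```

Note that uniform sampling fails to give a multiplicative guarantee for unbounded functions such as these: a few far-away points can dominate ∫ f dμ while being missed by the sample, which is why the paper's (S, ν) pair must be constructed non-uniformly.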