PAC learnability under non-atomic measures: A problem by Vidyasagar

  • Authors: Vladimir Pestov
  • Affiliations: -
  • Venue: Theoretical Computer Science
  • Year: 2013

Abstract

In response to a 1997 problem of M. Vidyasagar, we state a criterion for PAC learnability of a concept class C under the family of all non-atomic (diffuse) measures on the domain Ω. The uniform Glivenko-Cantelli property with respect to non-atomic measures is no longer a necessary condition, and consistent learnability cannot in general be expected. Our criterion is stated in terms of a combinatorial parameter VC(C mod ω₁), which we call the VC dimension of C modulo countable sets. The new parameter is obtained by "thickening up" single points in the definition of VC dimension to uncountable "clusters". Equivalently, VC(C mod ω₁) ≤ d if and only if every countable subclass of C has VC dimension ≤ d outside a countable subset of Ω. The new parameter can also be expressed as the classical VC dimension of C calculated on a suitable subset of a compactification of Ω. We do not make any measurability assumptions on C, assuming instead the validity of Martin's Axiom (MA). Similar results are obtained for function learning in terms of the fat-shattering dimension modulo countable sets, but, just like in the classical distribution-free case, the finiteness of this parameter is sufficient but not necessary for PAC learnability under non-atomic measures.
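
The phrase "VC dimension outside a subset of the domain" can be made concrete with a finite toy sketch. The code below is only an illustration by analogy, not the paper's construction: the actual parameter concerns uncountable domains and countable exceptional sets, which cannot be enumerated by brute force. The function names (`is_shattered`, `vc_dimension`, `vc_dimension_outside`) and the example class are hypothetical, and a non-empty concept class is assumed.

```python
from itertools import chain, combinations


def is_shattered(points, concepts):
    """Return True if the concepts cut out all 2^|points| subsets of points."""
    points = tuple(points)
    traces = {tuple(p in c for p in points) for c in concepts}
    return len(traces) == 2 ** len(points)


def vc_dimension(domain, concepts):
    """Classical VC dimension on a finite domain, by brute force.

    Assumes a non-empty concept class. Stops at the first size k with no
    shattered k-subset, since subsets of shattered sets are shattered.
    """
    domain = list(domain)
    best = 0
    for k in range(1, len(domain) + 1):
        if any(is_shattered(s, concepts) for s in combinations(domain, k)):
            best = k
        else:
            break
    return best


def vc_dimension_outside(domain, concepts, excluded):
    """VC dimension computed after discarding an exceptional set of points.

    Finite stand-in (an assumption of this sketch) for "outside a countable
    subset of the domain".
    """
    return vc_dimension([x for x in domain if x not in excluded], concepts)


# Hypothetical example class on {0,...,5}: an arbitrary subset of {0,1,2}
# joined with an initial segment of (3, 4, 5).
domain = range(6)
free_part = chain.from_iterable(combinations((0, 1, 2), k) for k in range(4))
segments = [(), (3,), (3, 4), (3, 4, 5)]
concepts = [frozenset(s) | frozenset(t) for s in free_part for t in segments]

print(vc_dimension(domain, concepts))
print(vc_dimension_outside(domain, concepts, excluded={0, 1, 2}))
```

In this toy class the three "bad" points 0, 1, 2 account for most of the shattering, so removing them lowers the computed dimension. This loosely mirrors how a small exceptional set can inflate the classical VC dimension without affecting the value modulo that set.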