Machine learning with data dependent hypothesis classes

  • Authors:
  • Adam Cannon, J. Mark Ettinger, Don Hush, Clint Scovel

  • Affiliations:
  • Department of Computer Science, Columbia University, New York, NY; Nonproliferation and International Security Group, NIS-8, Los Alamos National Laboratory, Los Alamos, NM; Modeling, Algorithms, and Informatics Group, CCS-3, Los Alamos National Laboratory, Los Alamos, NM; Modeling, Algorithms, and Informatics Group, CCS-3, Los Alamos National Laboratory, Los Alamos, NM

  • Venue:
  • The Journal of Machine Learning Research
  • Year:
  • 2002

Abstract

We extend the VC theory of statistical learning to data dependent spaces of classifiers. This theory can be viewed as a decomposition of classifier design into two components: the first is a restriction to a data dependent hypothesis class, and the second is empirical risk minimization within that class. We define a measure of complexity for data dependent hypothesis classes and provide data dependent versions of bounds on error deviance and estimation error. We also provide a structural risk minimization procedure over data dependent hierarchies and prove its consistency. We use this theory to provide a framework for studying the trade-offs between performance and computational complexity in classifier design. As a consequence we obtain a new family of classifiers with dimension independent performance bounds and efficient learning procedures.
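The two-component decomposition described above can be illustrated with a deliberately simple toy sketch (not the classifiers constructed in the paper): the hypothesis class itself is built from the sample, here one-dimensional threshold classifiers whose candidate thresholds are the midpoints between consecutive training points, and empirical risk minimization is then performed within that data dependent class. All function and variable names below are illustrative choices, not from the paper.

```python
import numpy as np

def data_dependent_erm(X, y):
    """Toy two-component classifier design:
    1) restrict to a data dependent hypothesis class
       (thresholds at midpoints between consecutive training points),
    2) empirical risk minimization within that class.
    X: 1-D feature array; y: labels in {0, 1}.
    Returns (classifier, empirical risk of the chosen classifier).
    """
    xs = np.sort(np.unique(X))
    # The hypothesis class depends on the data: its thresholds
    # are determined entirely by the training sample.
    thresholds = (xs[:-1] + xs[1:]) / 2.0
    best = None
    for t in thresholds:
        for sign in (1, -1):  # which side of the threshold predicts 1
            pred = (sign * (X - t) > 0).astype(int)
            risk = np.mean(pred != y)  # empirical risk on the sample
            if best is None or risk < best[0]:
                best = (risk, t, sign)
    risk, t, sign = best
    return (lambda x: int(sign * (x - t) > 0)), risk

# Usage on linearly separable toy data:
X = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([0, 0, 1, 1])
clf, risk = data_dependent_erm(X, y)  # risk is 0.0 here
```

The point of the sketch is only the structure: the class of candidate classifiers is fixed by the data first, and only then is empirical risk minimized over it, mirroring the decomposition the abstract describes.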