Learning figures with the Hausdorff metric by fractals--towards computable binary classification

  • Authors:
  • Mahito Sugiyama;Eiju Hirowatari;Hideki Tsuiki;Akihiro Yamamoto

  • Affiliations:
  • The Institute of Scientific and Industrial Research (ISIR), Osaka University, Osaka, Japan 567-0047;Center for Fundamental Education, The University of Kitakyushu, Kitakyushu, Japan;Graduate School of Human and Environmental Studies, Kyoto University, Kyoto, Japan;Graduate School of Informatics, Kyoto University, Kyoto, Japan

  • Venue:
  • Machine Learning
  • Year:
  • 2013

Quantified Score

Hi-index 0.00

Visualization

Abstract

We present learning of figures, nonempty compact sets in Euclidean space, based on Gold's learning model aiming at a computable foundation for binary classification of multivariate data. Encoding real vectors with no numerical error requires infinite sequences, resulting in a gap between each real vector and its discretized representation used for the actual machine learning process. Our motivation is to provide an analysis of machine learning problems that explicitly tackles this aspect which has been glossed over in the literature on binary classification as well as in other machine learning tasks such as regression and clustering. In this paper, we amalgamate two processes: discretization and binary classification. Each learning target, the set of real vectors classified as positive, is treated as a figure. A learning machine receives discretized vectors as input data and outputs a sequence of discrete representations of the target figure in the form of self-similar sets, known as fractals. The generalization error of each output is measured by the Hausdorff metric. Using this learning framework, we reveal a hierarchy of learnable classes under various learning criteria in the track of traditional analysis based on Gold's learning model, and show a mathematical connection between machine learning and fractal geometry by measuring the complexity of learning using the Hausdorff dimension and the VC dimension. Moreover, we analyze computability aspects of learning of figures using the framework of Type-2 Theory of Effectivity (TTE).