On a learnability question associated to neural networks with continuous activations (extended abstract)

  • Authors:
  • Bhaskar DasGupta (Department of Computer Science, University of Minnesota, Minneapolis, MN)
  • Hava T. Siegelmann (Department of Computer Science, Bar-Ilan University, Ramat-Gan 52900, Israel)
  • Eduardo Sontag (Department of Mathematics, Rutgers University, New Brunswick, NJ)

  • Venue:
  • COLT '94: Proceedings of the seventh annual conference on Computational learning theory
  • Year:
  • 1994

Abstract

This paper deals with the learnability of concept classes defined by neural networks, showing the hardness of PAC-learning (in the complexity-theoretic, not merely information-theoretic, sense) for networks with a particular class of activation function. The obstruction lies not with the VC dimension, which is known to grow slowly; instead, the result follows from the fact that the loading problem is NP-complete. (The complexity scales badly with input dimension; the loading problem is solvable in polynomial time if the input dimension is held constant.) Similar and well-known theorems had already been proved by Megiddo and by Blum and Rivest for binary-threshold networks. It turns out that the general problem for continuous sigmoidal-type functions, as used in practical applications involving steepest descent, is not NP-hard; there are “sigmoidals” for which the problem is in fact trivial, so it is an open question to determine which properties of the activation function cause difficulties. Ours is the first result on the hardness of loading networks that do not consist of binary neurons; we employ a piecewise-linear activation function that has been used in the neural network literature. Our theoretical results lend further justification to the use of incremental (architecture-changing) techniques for training networks.
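
As a concrete illustration (not taken from the paper itself), the sketch below assumes the piecewise-linear activation in question is the standard saturating map sigma(x) = max(0, min(1, x)) and shows the kind of fixed small architecture whose loading problem the abstract refers to, i.e. deciding whether weights exist that fit a given labelled sample exactly. The two-hidden-unit shape and the NumPy names are illustrative assumptions, not the paper's construction:

import numpy as np

# Assumed activation: the saturating piecewise-linear ("semilinear") map
# sigma(x) = 0 for x <= 0, x for 0 < x < 1, and 1 for x >= 1.
def pl_activation(x):
    return np.clip(x, 0.0, 1.0)

# Forward pass of a toy fixed architecture (not necessarily the paper's):
# two piecewise-linear hidden units feeding one linear output node.
# The loading problem asks whether weights (W1, b1, w2, b2) exist that
# reproduce given labels exactly; per the abstract, this decision problem
# is NP-complete when the input dimension varies, but polynomial-time
# when the input dimension is held constant.
def forward(x, W1, b1, w2, b2):
    hidden = pl_activation(W1 @ x + b1)
    return float(w2 @ hidden + b2)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W1, b1 = rng.normal(size=(2, 3)), rng.normal(size=2)  # 2 hidden units, 3 inputs
    w2, b2 = rng.normal(size=2), 0.0
    x = rng.normal(size=3)
    print(forward(x, W1, b1, w2, b2))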