On exact specification by examples

  • Authors:
  • Martin Anthony;Graham Brightwell;Dave Cohen;John Shawe-Taylor

  • Affiliations:
  • Dept. of Statistical and Mathematical Sciences, London School of Economics, University of London, Houghton Street, London WC2A 2AE, U.K.;Dept. of Statistical and Mathematical Sciences, London School of Economics, University of London, Houghton Street, London WC2A 2AE, U.K.;Computer Science Dept., Royal Holloway and Bedford New College, University of London, Egham Hill, Egham, Surrey TW20 0EX, U.K.;Computer Science Dept., Royal Holloway and Bedford New College, University of London, Egham Hill, Egham, Surrey TW20 0EX, U.K.

  • Venue:
  • COLT '92 Proceedings of the fifth annual workshop on Computational learning theory
  • Year:
  • 1992

Abstract

Some recent work [7, 14, 15] in computational learning theory has discussed learning in situations where the teacher is helpful and can present carefully chosen sequences of labelled examples to the learner. We say a function t in a set H of functions (a hypothesis space) defined on a set X is specified by S ⊆ X if the only function in H which agrees with t on S is t itself. The specification number σ(t) of t is the least cardinality of such an S. For a general hypothesis space, we show that the specification number of any hypothesis is at least equal to a parameter from [14] known as the testing dimension of H. We investigate in some detail the specification numbers of hypotheses in the set H_n of linearly separable Boolean functions: we present general methods for finding upper bounds on σ(t) and we characterise those t which have largest σ(t). We obtain a general lower bound on the number of examples required and we show that for all nested hypotheses, this lower bound is attained. We prove that for any t ∈ H_n, there is exactly one set of examples of minimal cardinality (i.e., of cardinality σ(t)) which specifies t. We then discuss those t ∈ H_n which have limited dependence, in the sense that some of the variables are redundant (i.e., there are irrelevant attributes), giving tight upper and lower bounds on σ(t) for such hypotheses. In the final section of the paper, we address the complexity of computing specification numbers and related parameters.
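
The definition of a specifying set can be illustrated on a very small case. The following is a minimal brute-force sketch, not the paper's method: it enumerates the linearly separable Boolean functions on two variables and searches all subsets of X for the smallest one that specifies a given t. The names `threshold_functions`, `specifies` and `specification_number`, and the hand-picked weight and threshold ranges (which happen to suffice for n = 2), are assumptions made for this illustration only.

```python
from itertools import combinations, product

# X = {0,1}^2, in the fixed order (0,0), (0,1), (1,0), (1,1).
X = list(product((0, 1), repeat=2))

def threshold_functions():
    """Linearly separable functions on X, each stored as a tuple of labels.

    Assumption: weights in {-1, 0, 1} and half-integer thresholds are
    enough to realise every such function when n = 2.
    """
    H = set()
    for w in product((-1, 0, 1), repeat=2):
        for theta in (-1.5, -0.5, 0.5, 1.5, 2.5):
            H.add(tuple(int(w[0] * x[0] + w[1] * x[1] >= theta) for x in X))
    return sorted(H)

def specifies(S, t, H):
    """True if t is the only function in H that agrees with t on all of S."""
    return all(h == t or any(h[i] != t[i] for i in S) for h in H)

def specification_number(t, H):
    """Return (sigma, S): the least |S| specifying t in H, and one such S.

    S is a tuple of indices into X. Exhaustive search, so only usable
    for tiny hypothesis spaces.
    """
    for k in range(len(t) + 1):
        for S in combinations(range(len(t)), k):
            if specifies(S, t, H):
                return k, S
    raise ValueError("t cannot be specified: H contains a duplicate of t")

H2 = threshold_functions()
AND = tuple(x[0] & x[1] for x in X)      # (0, 0, 0, 1)
sigma, S = specification_number(AND, H2)
print(sigma, [X[i] for i in S])          # 3 [(0, 1), (1, 0), (1, 1)]
```

For AND, the three examples (0,1), (1,0) and (1,1) suffice because the only other function consistent with them takes value 1 at (0,0), and that function is not linearly separable. The search is exponential in |X|, so it is only meant to make the definition concrete.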