Massive online teaching to bounded learners

  • Authors:
  • Brendan Juba; Ryan Williams

  • Affiliations:
  • Harvard University, Cambridge, MA, USA; Stanford University, Stanford, CA, USA

  • Venue:
  • Proceedings of the 4th conference on Innovations in Theoretical Computer Science
  • Year:
  • 2013

Abstract

We consider a model of teaching in which the learners are consistent and have bounded state, but are otherwise arbitrary. The teacher is non-interactive and "massively open": the teacher broadcasts a sequence of examples of an arbitrary target concept, intended for every possible on-line learning algorithm to learn from. We focus on the problem of designing interesting teachers: efficient sequences of examples that lead all capable and consistent learners to learn concepts, regardless of the underlying algorithm used by the learner. We use two measures of teaching efficiency: the number of mistakes made by the worst-case learner, and the maximum length of the example sequence needed for the worst-case learner. Our results are summarized as follows:

  • Given a uniform random sequence of examples of an n-bit concept function, learners (capable of consistently learning the concept) with s(n) bits of state are guaranteed to make only O(n ⋅ s(n)) mistakes and exactly learn the concept, with high probability. This theorem has interesting corollaries; for instance, every concept c has a sequence of examples that can teach c to all capable consistent on-line learners implementable with s(n)-size circuits, such that every learner makes only Õ(s(n)^2) mistakes. That is, all resource-bounded algorithms capable of consistently learning a concept can be simultaneously taught that concept with few mistakes, on a single example sequence.

  • We also show how to efficiently generate such a sequence of examples on-line: using Nisan's pseudorandom generator, each example in the sequence can be generated with polynomial-time overhead per example, from an O(n ⋅ s(n))-bit initial seed.

  • To justify our use of randomness, we prove that any non-trivial derandomization of our sequences would imply new circuit lower bounds. For instance, if there is a deterministic 2^(n^O(1))-time algorithm that generates a sequence of examples such that all consistent and capable polynomial-size circuit learners learn the all-zeroes concept with fewer than 2^n mistakes, then EXP ⊄ P/poly.

  • We present examples illustrating that the key differences in our model -- our focus on mistakes rather than the total number of examples, and our use of a state bound -- must be considered together to obtain our results.

  • We show that for every consistent s(n)-state-bounded learner A, and every n-bit concept that A is capable of learning, there is a custom "tutoring" sequence of only O(n ⋅ s(n)) examples that teaches A the concept. That is, in principle, there are no slow learners, only bad teachers: if a state-bounded learner is capable of learning a concept at all, then it can always be taught that concept quickly via some short sequence of examples.
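
As a concrete illustration of the broadcast setting described in the abstract, here is a minimal Python sketch (not code from the paper): a teacher broadcasts uniformly random labeled examples of a single target concept, and two different consistent, state-bounded learners update on the same stream while we count each learner's mistakes. The class names, 10-bit domain, and monotone-conjunction target are hypothetical choices made for this demo.

```python
# Illustrative sketch only: toy simulation of non-interactive "massive" teaching.
# The learners and target concept below are demo choices, not the paper's constructions.
import random

N = 10                      # number of input bits per example
TARGET = {1, 4, 7}          # target concept: the monotone conjunction x1 AND x4 AND x7


def label(x):
    """Target concept: 1 iff every relevant bit of x is set."""
    return int(all(x[i] for i in TARGET))


class ConjunctionLearner:
    """A consistent learner whose state is one bit per variable (O(n) bits of state).

    It keeps the set of variables it still believes relevant and predicts with the
    conjunction of that set (the classic elimination algorithm).
    """

    def __init__(self, n):
        self.n = n
        self.relevant = set(range(n))

    def predict(self, x):
        return int(all(x[i] for i in self.relevant))

    def update(self, x, y):
        # Remain consistent: a positive example rules out variables that are 0 in it.
        if y == 1:
            self.relevant = {i for i in self.relevant if x[i] == 1}


class PessimisticLearner(ConjunctionLearner):
    """Another consistent learner: predicts 0 until it has seen a positive example."""

    def __init__(self, n):
        super().__init__(n)
        self.saw_positive = False   # one extra bit of state

    def predict(self, x):
        return super().predict(x) if self.saw_positive else 0

    def update(self, x, y):
        if y == 1:
            self.saw_positive = True
        super().update(x, y)


def broadcast(learners, rounds=2000, seed=0):
    """Broadcast uniformly random labeled examples; count each learner's mistakes."""
    rng = random.Random(seed)
    mistakes = [0] * len(learners)
    for _ in range(rounds):
        x = [rng.randint(0, 1) for _ in range(N)]   # one random example for everyone
        y = label(x)
        for i, learner in enumerate(learners):
            if learner.predict(x) != y:             # mistake-bound bookkeeping
                mistakes[i] += 1
            learner.update(x, y)
    return mistakes


if __name__ == "__main__":
    learners = [ConjunctionLearner(N), PessimisticLearner(N)]
    print("mistakes per learner:", broadcast(learners))
    print("hypotheses learned:  ", [sorted(l.relevant) for l in learners])
```

In the paper's derandomized version of this setting, the truly random broadcast is replaced by the output of Nisan's pseudorandom generator against space-bounded computation, so each example can be produced on-line from a short O(n ⋅ s(n))-bit seed; the seeded Python RNG above is only a stand-in for that construction.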