Valiant and others have studied the problem of learning various classes of Boolean functions from examples. Here we discuss on-line learning of these functions. In on-line learning, the learner responds to each example according to a current hypothesis, then updates the hypothesis, if necessary, based on the example's correct classification. One natural measure of the quality of learning in the on-line setting is the number of mistakes the learner makes. For suitable classes of functions, on-line learning algorithms are available that make a bounded number of mistakes, with the bound independent of the number of examples seen by the learner. We present one such algorithm, which learns disjunctive Boolean functions, along with variants of the algorithm for learning other classes of Boolean functions. The algorithm can be expressed as a linear-threshold algorithm. A primary advantage of this algorithm is that the number of mistakes it makes is relatively little affected by the presence of large numbers of irrelevant attributes in the examples; we show that the number of mistakes grows only logarithmically with the number of irrelevant attributes. At the same time, the algorithm is efficient in both time and space.
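The following is a minimal sketch of the kind of multiplicative-update linear-threshold learner the abstract describes, for monotone disjunctions over n Boolean attributes. The specific parameter choices (promotion factor `alpha = 2`, threshold `theta = n/2`, demotion to zero on a false positive) are assumptions for illustration, not details drawn from the paper itself.

```python
def winnow_style_learner(examples, n, alpha=2.0):
    """On-line learner for monotone disjunctions over n Boolean attributes.

    Keeps one positive weight per attribute and predicts via a linear
    threshold; on a mistake it updates weights multiplicatively.
    Returns the final weight vector and the number of mistakes made.
    """
    theta = n / 2.0          # assumed threshold choice
    w = [1.0] * n            # all attributes start with equal weight
    mistakes = 0
    for x, label in examples:            # x: tuple of 0/1, label: bool
        prediction = sum(w[i] for i in range(n) if x[i]) >= theta
        if prediction != label:
            mistakes += 1
            if label:                    # false negative: promote active weights
                for i in range(n):
                    if x[i]:
                        w[i] *= alpha
            else:                        # false positive: demote active weights
                for i in range(n):
                    if x[i]:
                        w[i] = 0.0
    return w, mistakes
```

Because only the weights of attributes active in a mistaken example change, and promotions are multiplicative, the weight of a relevant attribute reaches the threshold after O(log n) promotions; this is the mechanism behind the logarithmic dependence on the number of irrelevant attributes mentioned above.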