In this work we consider the task of relaxing the i.i.d. assumption in pattern recognition (or classification), aiming to make existing learning algorithms applicable to a wider range of tasks. Pattern recognition is the task of guessing the discrete label of an object on the basis of a set of given examples (pairs of objects and labels). We consider the case of deterministically defined labels. Traditionally, this task is studied under the assumption that the examples are independent and identically distributed. It turns out, however, that many results of pattern recognition theory carry over to a weaker assumption: the objects are conditionally independent and identically distributed given the labels, while the only assumption on the distribution of the labels is that the rate of occurrence of each label is bounded below by some positive threshold. We find a broad class of learning algorithms for which the estimates of the probability of classification error obtained under the classical i.i.d. assumption generalize to similar estimates for the case of conditionally i.i.d. examples.
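To make the weakened assumption concrete, the following is a minimal sketch of how it could be formalized. The notation ($\eta$, $P_y$, $\delta$) is illustrative and not taken from the abstract:

% Sketch of the conditionally i.i.d. setting (illustrative notation).
% Labels are deterministic: each object x carries the fixed label \eta(x).
Let $\eta : \mathcal{X} \to \mathcal{Y}$ assign a deterministic label to each
object, with $\mathcal{Y}$ finite. The label sequence $(Y_1, Y_2, \dots)$ may be
generated by an arbitrary process; given the labels, each object $X_i$ is drawn
independently from a distribution $P_{Y_i}$ concentrated on
$\{x : \eta(x) = Y_i\}$:
\[
  \Pr\bigl(X_1 \in A_1, \dots, X_n \in A_n \,\big|\, Y_1, \dots, Y_n\bigr)
    = \prod_{i=1}^{n} P_{Y_i}(A_i) .
\]
% The sole restriction on the label process: each label keeps occurring at a
% rate bounded below by a positive threshold.
\[
  \liminf_{n \to \infty} \frac{1}{n}\,\bigl|\{\, i \le n : Y_i = y \,\}\bigr|
    \ge \delta > 0
  \qquad \text{for every } y \in \mathcal{Y} .
\]

Under this sketch, the classical i.i.d. setting is recovered when the labels themselves are drawn i.i.d.; the relaxation is that arbitrary dependence among the labels is allowed, so long as no label effectively disappears from the sequence.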