COLT '90 Proceedings of the third annual workshop on Computational learning theory
The weighted majority algorithm
Information and Computation
On-line learning of linear functions
Computational Complexity
Predicting a binary sequence almost as well as the optimal biased coin
COLT '96 Proceedings of the ninth annual conference on Computational learning theory
Exponentiated gradient versus gradient descent for linear predictors
Information and Computation
Journal of the ACM (JACM)
General convergence results for linear discriminant updates
COLT '97 Proceedings of the tenth annual conference on Computational learning theory
COLT' 98 Proceedings of the eleventh annual conference on Computational learning theory
Relative Loss Bounds for Multidimensional Regression Problems
Machine Learning
Stochastic Complexity in Statistical Inquiry
Learning algorithms for tracking changing concepts and an investigation into the error surfaces of single artificial neurons
Approximate solutions to Markov decision processes
Relative loss bounds for on-line density estimation with the exponential family of distributions
UAI'99 Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence
Fisher information and stochastic complexity
IEEE Transactions on Information Theory
Universal portfolios with side information
IEEE Transactions on Information Theory
Minimax redundancy for the class of memoryless sources
IEEE Transactions on Information Theory
A decision-theoretic extension of stochastic complexity and its applications to learning
IEEE Transactions on Information Theory
Worst-case quadratic loss bounds for prediction using linear functions and gradient descent
IEEE Transactions on Neural Networks
Relative loss bounds for single neurons
IEEE Transactions on Neural Networks
Relative Loss Bounds for Temporal-Difference Learning
Machine Learning
A Second-Order Perceptron Algorithm
COLT '02 Proceedings of the 15th Annual Conference on Computational Learning Theory
Tracking the best linear predictor
The Journal of Machine Learning Research
The Robustness of the p-Norm Algorithms
Machine Learning
Privacy-preserving Distributed Clustering using Generative Models
ICDM '03 Proceedings of the Third IEEE International Conference on Data Mining
An information theoretic analysis of maximum likelihood mixture estimation for exponential families
ICML '04 Proceedings of the twenty-first international conference on Machine learning
A privacy-sensitive approach to distributed clustering
Pattern Recognition Letters - Special issue: Advances in pattern recognition
Clustering with Bregman Divergences
The Journal of Machine Learning Research
Step Size Adaptation in Reproducing Kernel Hilbert Space
The Journal of Machine Learning Research
Worst-Case Analysis of Selective Sampling for Linear Classification
The Journal of Machine Learning Research
A primal-dual perspective of online learning algorithms
Machine Learning
Relational learning via collective matrix factorization
Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
Leading strategies in competitive on-line prediction
Theoretical Computer Science
Mixed Bregman Clustering with Approximation Guarantees
ECML PKDD '08 Proceedings of the European conference on Machine Learning and Knowledge Discovery in Databases - Part II
A Unified View of Matrix Factorization Models
ECML PKDD '08 Proceedings of the European conference on Machine Learning and Knowledge Discovery in Databases - Part II
Aggregating Algorithm for a Space of Analytic Functions
ALT '08 Proceedings of the 19th international conference on Algorithmic Learning Theory
Learning rates of gradient descent algorithm for classification
Journal of Computational and Applied Mathematics
Bregman Divergences and the Self Organising Map
IDEAL '08 Proceedings of the 9th International Conference on Intelligent Data Engineering and Automated Learning
Clustering with Lower Bound on Similarity
PAKDD '09 Proceedings of the 13th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining
Adaptive fuzzy filtering in a deterministic setting
IEEE Transactions on Fuzzy Systems
Sequential probability assignment via online convex programming using exponential families
ISIT'09 Proceedings of the 2009 IEEE international conference on Symposium on Information Theory - Volume 2
Learning Permutations with Exponential Weights
The Journal of Machine Learning Research
Kullback-Leibler divergence based curve matching method
SSVM'07 Proceedings of the 1st international conference on Scale space and variational methods in computer vision
On-line estimation with the multivariate Gaussian distribution
COLT'07 Proceedings of the 20th annual conference on Learning theory
An identity for kernel ridge regression
ALT'10 Proceedings of the 21st international conference on Algorithmic learning theory
Independent component analysis using Bregman divergences
IEA/AIE'10 Proceedings of the 23rd international conference on Industrial engineering and other applications of applied intelligent systems - Volume Part II
Ensembles and multiple classifiers: a game-theoretic view
MCS'11 Proceedings of the 10th international conference on Multiple classifier systems
Adaptive and optimal online linear regression on ℓ1-balls
ALT'11 Proceedings of the 22nd international conference on Algorithmic learning theory
Leading strategies in competitive on-line prediction
ALT'06 Proceedings of the 17th international conference on Algorithmic Learning Theory
Online learning meets optimization in the dual
COLT'06 Proceedings of the 19th annual conference on Learning Theory
Loss bounds for online category ranking
COLT'05 Proceedings of the 18th annual conference on Learning Theory
Asymptotic log-loss of prequential maximum likelihood codes
COLT'05 Proceedings of the 18th annual conference on Learning Theory
Online Learning and Online Convex Optimization
Foundations and Trends® in Machine Learning
Weighted last-step min-max algorithm with improved sub-logarithmic regret
ALT'12 Proceedings of the 23rd international conference on Algorithmic Learning Theory
Kernelization of matrix updates, when and how?
ALT'12 Proceedings of the 23rd international conference on Algorithmic Learning Theory
An identity for kernel ridge regression
Theoretical Computer Science
Efficient Market Making via Convex Optimization, and a Connection to Online Learning
ACM Transactions on Economics and Computation - Special Issue on Algorithmic Game Theory
Sparsity regret bounds for individual sequences in online linear regression
The Journal of Machine Learning Research
Selective sampling and active learning from single and multiple teachers
The Journal of Machine Learning Research
Dimensionality reduction with generalized linear models
IJCAI'13 Proceedings of the Twenty-Third international joint conference on Artificial Intelligence
Adaptive and optimal online linear regression on ℓ1-balls
Theoretical Computer Science
We consider on-line density estimation with a parameterized density from the exponential family. The on-line algorithm receives one example at a time and maintains a parameter that is essentially an average of the past examples. After receiving each example, the algorithm incurs a loss: the negative log-likelihood of that example under the algorithm's current parameter. An off-line algorithm, by contrast, may choose the single best parameter in hindsight, based on all the examples. We prove bounds on the additional total loss of the on-line algorithm over the total loss of the best off-line parameter. These relative loss bounds hold for an arbitrary sequence of examples, and the goal is to design algorithms with the best possible relative loss bounds. We derive and analyze each algorithm using a Bregman divergence; for the exponential family, these divergences are relative entropies between two distributions in the family. We also apply our methods to prove relative loss bounds for linear regression.
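As a minimal sketch of this setting, consider the simplest exponential-family member, the Bernoulli (biased coin): the on-line learner predicts with a smoothed average of the past outcomes, pays the negative log-likelihood of each observed bit, and is compared against the best fixed probability chosen in hindsight. The smoothing constants `a` and `b` below are assumptions for illustration (a Krichevsky-Trofimov-style prior), not a prescription from the paper itself.

```python
import math

def online_vs_offline_log_loss(xs, a=0.5, b=0.5):
    """Compare on-line and best off-line log loss for a Bernoulli model.

    xs   : sequence of bits (0/1)
    a, b : hypothetical smoothing constants; a = b = 0.5 gives a
           Krichevsky-Trofimov-style estimate (an assumption here).
    """
    def nll(p, x):  # negative log-likelihood of one example
        return -math.log(p if x == 1 else 1.0 - p)

    online_loss, heads = 0.0, 0
    for t, x in enumerate(xs):
        # current parameter: a smoothed average of the past examples
        p = (heads + a) / (t + a + b)
        online_loss += nll(p, x)  # loss is incurred before updating
        heads += x

    # best off-line parameter: the empirical frequency in hindsight
    q = heads / len(xs)
    offline_loss = sum(nll(q, x) for x in xs)
    return online_loss, offline_loss

on, off = online_vs_offline_log_loss([1, 0, 1, 1, 0, 1, 1, 1])
regret = on - off  # additional total loss of the on-line algorithm
```

For this sequence the relative loss (regret) comes out to roughly 1.3 nats, consistent with the logarithmic-in-T bounds that analyses of this kind establish.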