The computational and communication costs of training support vector machines (SVMs) on large-scale data sets, in contexts such as distributed networking systems, are often prohibitively high. Practitioners therefore apply the learning algorithm to approximate versions of the kernel matrix obtained through some degree of data reduction. In this paper, we study the tradeoff between data reduction and the resulting loss in classification performance. We introduce and analyze a consistent estimator of the SVM's achieved classification error, and derive approximate upper bounds on the perturbation of this estimator under data reduction. The bound is shown to be empirically tight across a wide range of domains, making it practical to determine how much data reduction can be tolerated for a given permissible loss in classification performance.
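A minimal sketch of the tradeoff the abstract describes, under stated assumptions: a Nyström low-rank approximation stands in for the paper's data-reduction step, and plain held-out test error stands in for its classification-error estimator. The synthetic data set, kernel parameters, and rank values are illustrative choices, not the paper's construction.

```python
# Sketch: train an SVM on progressively cruder low-rank (Nystroem)
# approximations of an RBF kernel and observe how the held-out error
# estimate is perturbed relative to the exact-kernel SVM.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.kernel_approximation import Nystroem
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC, LinearSVC

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

# Reference point: SVM trained on the exact RBF kernel matrix.
full = SVC(kernel="rbf", gamma=0.1, C=1.0).fit(X_tr, y_tr)
err_full = 1.0 - full.score(X_te, y_te)
print(f"full kernel       error = {err_full:.3f}")

# Data reduction: a rank-m Nystroem feature map replaces the exact kernel;
# a linear SVM in the approximate feature space plays the reduced problem.
for m in (10, 50, 200):
    feat = Nystroem(kernel="rbf", gamma=0.1, n_components=m, random_state=0)
    Z_tr, Z_te = feat.fit_transform(X_tr), feat.transform(X_te)
    approx = LinearSVC(C=1.0, max_iter=10000).fit(Z_tr, y_tr)
    err = 1.0 - approx.score(Z_te, y_te)
    print(f"rank-{m:<4d} approx. error = {err:.3f} (perturbation {err - err_full:+.3f})")
```

Sweeping the rank m and recording the error perturbation, as above, is one way a practitioner could empirically pick the coarsest reduction whose perturbation stays within a permissible loss.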