In this paper, we address the problem of combining linear support vector machines (SVMs) for the classification of large-scale nonlinear datasets. The motivation is to exploit both the efficiency of linear SVMs (LSVMs) in learning and prediction and the classification power of nonlinear SVMs. To this end, we develop an LSVM mixture model that follows a divide-and-conquer strategy: it partitions the feature space into subregions of linearly separable data points and learns an LSVM for each region. We do this implicitly by deriving a generative model over the joint data and label distributions. Consequently, we can impose priors on the mixing coefficients and perform implicit model selection in a top-down manner during parameter estimation, which guarantees the sparsity of the learned model. Experimental results show that the proposed method matches the prediction-time efficiency of LSVMs while achieving classification performance comparable to that of nonlinear SVMs.
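The abstract describes the method only at a high level, so the following Python sketch illustrates the divide-and-conquer idea under strong simplifying assumptions rather than the authors' actual algorithm: a hard k-means partition stands in for the learned generative gating, the number of experts is fixed instead of being selected via priors on the mixing coefficients, and class labels are assumed to be integers. The class name `MixtureOfLinearSVMs` and the parameter `n_experts` are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import LinearSVC


class MixtureOfLinearSVMs:
    """Illustrative hard-partition mixture of linear SVMs (not the paper's model)."""

    def __init__(self, n_experts=4, random_state=0):
        # Assumption: k-means replaces the paper's generative gating model.
        self.gate = KMeans(n_clusters=n_experts, n_init=10,
                           random_state=random_state)
        self.experts = {}

    def fit(self, X, y):
        # Partition the feature space, then train one LSVM per subregion.
        regions = self.gate.fit_predict(X)
        for k in np.unique(regions):
            mask = regions == k
            if np.unique(y[mask]).size < 2:
                # Degenerate single-class region: store its constant label.
                self.experts[k] = int(y[mask][0])
            else:
                self.experts[k] = LinearSVC().fit(X[mask], y[mask])
        return self

    def predict(self, X):
        # Route each point to its region's expert; prediction stays linear-time.
        regions = self.gate.predict(X)
        y_pred = np.empty(len(X), dtype=int)
        for k, expert in self.experts.items():
            mask = regions == k
            if not mask.any():
                continue
            y_pred[mask] = (expert.predict(X[mask])
                            if isinstance(expert, LinearSVC) else expert)
        return y_pred
```

As a quick sanity check, such a mixture can separate a dataset that no single linear SVM can, e.g.:

```python
from sklearn.datasets import make_moons

X, y = make_moons(n_samples=2000, noise=0.2, random_state=0)
model = MixtureOfLinearSVMs(n_experts=4).fit(X, y)
accuracy = (model.predict(X) == y).mean()
```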