We propose to study the links between three important classification algorithms: Perceptrons, Multi-Layer Perceptrons (MLPs) and Support Vector Machines (SVMs). We first study ways to control the capacity of Perceptrons (mainly regularization parameters and early stopping), using the margin idea introduced with SVMs. After showing that, under simple conditions, a Perceptron is equivalent to an SVM, we show that training an SVM (and thus a Perceptron) with stochastic gradient descent can be computationally expensive, mainly because of the margin-maximization term in the cost function. We then show that if this term is removed, the margin can still be controlled by the learning rate or by early stopping. We afterward extend these ideas to MLPs and show that, under some assumptions, an MLP behaves like a mixture of SVMs, maximizing the margin in the hidden-layer space. Finally, building on these findings, we present a very simple MLP that generalizes better and trains faster than the other models.
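The abstract's central observation — that SGD on the hinge loss without an explicit margin-maximization (L2) term still controls the margin through the learning rate and early stopping — can be illustrated with a minimal sketch. The function name, hyperparameters, and toy data below are illustrative assumptions, not the paper's experimental setup:

```python
import numpy as np

def train_linear_sgd(X, y, lr=0.1, epochs=10, lam=0.0, seed=None):
    """SGD on the hinge loss max(0, 1 - y * w.x), with an optional
    L2 margin-maximization term (lam/2)*||w||^2 applied as weight decay.
    With lam=0 this is a margin Perceptron: the margin is then controlled
    implicitly, since a small learning rate and early stopping (few epochs)
    keep ||w|| small, which corresponds to a larger geometric margin."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        for i in rng.permutation(n):
            if y[i] * (X[i] @ w) < 1.0:   # example inside the margin
                w += lr * y[i] * X[i]     # hinge-loss gradient step
            if lam > 0.0:
                w *= 1.0 - lr * lam       # explicit margin (L2) term
    return w

# Toy linearly separable data (hypothetical, for illustration only)
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(2.0, 1.0, (50, 2)),
               rng.normal(-2.0, 1.0, (50, 2))])
y = np.hstack([np.ones(50), -np.ones(50)])

# No explicit margin term: the margin is shaped by lr and the epoch budget.
w = train_linear_sgd(X, y, lr=0.05, epochs=5, lam=0.0, seed=1)
acc = np.mean(np.sign(X @ w) == y)
```

Setting `lam > 0` recovers the SVM-style objective with its explicit margin term; the comparison between the two regimes is what the paper's analysis of Perceptron capacity control is about.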