Sparseness is a useful regularizer for learning in a wide range of applications, in particular for neural networks. This paper proposes a model targeted at classification tasks in which both sparse activity and sparse connectivity are used to enhance classification capabilities. The tool for achieving this is a sparseness-enforcing projection operator which, for any given vector, finds the closest vector with a pre-defined sparseness. In the theoretical part of the paper, a comprehensive theory of this projection is developed. In particular, the projection is shown to be differentiable almost everywhere, so it can be implemented as a smooth neuronal transfer function and the entire model can be tuned end-to-end using gradient-based methods. Experiments on the MNIST database of handwritten digits show that classification performance can be boosted by sparse activity or by sparse connectivity alone; with a combination of both, performance is significantly better than that of classical non-sparse approaches.
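The abstract does not spell out which sparseness measure the projection targets; a natural candidate in this line of work is the normalized L1/L2 ratio of Hoyer (2004), which is 0 for a uniform vector and 1 for a vector with a single non-zero entry. As a minimal sketch under that assumption, the NumPy code below implements Hoyer-style alternating projection for non-negative vectors: given an input vector and a target sparseness, it fixes the Euclidean norm, derives the implied L1 norm, and alternates between norm and non-negativity constraints. The function names (hoyer_sparseness, project_sparse) are illustrative only, and the operator developed in the paper may differ, for example in how signed entries are handled.

```python
import numpy as np

def hoyer_sparseness(x):
    """Hoyer's L1/L2 sparseness: 0 for a uniform vector, 1 for a 1-sparse one."""
    n = x.size
    return (np.sqrt(n) - np.linalg.norm(x, 1) / np.linalg.norm(x, 2)) / (np.sqrt(n) - 1.0)

def project_sparse(x, target_sparseness):
    """Project a non-negative vector x onto the set of non-negative vectors
    with the given Hoyer sparseness, keeping the Euclidean norm of x fixed.
    Assumes the target sparseness is attainable for this dimensionality."""
    n = x.size
    l2 = np.linalg.norm(x, 2)
    # Target L1 norm implied by the sparseness definition for this L2 norm.
    l1 = l2 * (np.sqrt(n) - target_sparseness * (np.sqrt(n) - 1.0))
    # Start on the hyperplane {v : sum(v) = l1}.
    v = x + (l1 - x.sum()) / n
    zeroed = np.zeros(n, dtype=bool)
    while True:
        # Midpoint of the simplex face spanned by the non-zeroed coordinates.
        m = np.where(zeroed, 0.0, l1 / (n - zeroed.sum()))
        w = v - m
        # Solve ||m + alpha * w||_2 = l2 for the non-negative root alpha;
        # this restores the L2 norm while preserving sum(v) = l1.
        a = w @ w
        b = 2.0 * (m @ w)
        c = m @ m - l2 ** 2
        alpha = (-b + np.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)
        v = m + alpha * w
        if np.all(v >= 0.0):
            return v
        # Clip negative entries to zero, then redistribute the L1 deficit
        # evenly over the remaining coordinates and iterate.
        zeroed |= v < 0.0
        v[zeroed] = 0.0
        v[~zeroed] += (l1 - v.sum()) / (n - zeroed.sum())

# Toy demonstration: sparsify a random non-negative vector.
rng = np.random.default_rng(0)
x = rng.random(16)
v = project_sparse(x, 0.8)
print(hoyer_sparseness(x), hoyer_sparseness(v))  # the latter should be close to 0.8
```

Each step of this iteration is piecewise smooth in the input, so a projection of this kind is differentiable almost everywhere, which is what allows it to act as a neuronal transfer function that gradient-based, end-to-end training can propagate through.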