One of the crucial tasks in many inference problems is the extraction of an underlying sparse graphical model from a given number of high-dimensional measurements. In machine learning, this is frequently achieved using, as a penalty term, the Lp norm of the model parameters, with p ≤ 1 for efficient dilution. Here we propose a statistical-mechanics analysis of the problem in the setting of perceptron memorization and generalization. Using a replica approach, we evaluate the relative performance of naive dilution (learning without dilution, followed by applying a threshold to the model parameters), L1 dilution (frequently used in convex optimization), and L0 dilution (optimal but computationally hard to implement). Whereas both Lp-diluted approaches clearly outperform the naive approach, we find a small region where L0 works almost perfectly and strongly outperforms the simpler-to-implement L1 dilution. In the second part we propose an efficient message-passing strategy for the simpler case of discrete classification vectors, where the L0 norm coincides with the L1 norm. Some examples are discussed.
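To make the distinction between the two practical schemes concrete, the sketch below (not from the paper; all names such as fit_l1, fit_plain, lam, and tau, and every numerical setting, are illustrative assumptions) contrasts naive dilution (train densely, then zero small couplings) with L1 dilution via proximal gradient descent (ISTA) on a synthetic teacher-student perceptron with a sparse teacher vector.

    # Minimal sketch: naive dilution vs. L1 dilution on a sparse
    # teacher-student perceptron. Illustrative only; hyperparameters
    # (lam, tau, lr, iters) are ad-hoc assumptions.
    import numpy as np

    rng = np.random.default_rng(0)
    N, K, P = 100, 10, 400          # input dim, true nonzeros, patterns

    # Sparse teacher vector J0 and labels y = sign(X . J0)
    J0 = np.zeros(N)
    J0[rng.choice(N, K, replace=False)] = rng.normal(size=K)
    X = rng.normal(size=(P, N))
    y = np.sign(X @ J0)

    def logistic_grad(w, X, y):
        # gradient of the mean logistic loss log(1 + exp(-y * x.w))
        m = np.clip(y * (X @ w), -50, 50)   # clip to avoid overflow
        return -(y / (1.0 + np.exp(m))) @ X / len(y)

    def soft_threshold(w, t):
        # proximal operator of t * ||w||_1
        return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)

    def fit_plain(X, y, lr=0.5, iters=3000):
        # plain gradient descent, no dilution during learning
        w = np.zeros(X.shape[1])
        for _ in range(iters):
            w -= lr * logistic_grad(w, X, y)
        return w

    def fit_l1(X, y, lam, lr=0.5, iters=3000):
        # ISTA: gradient step on the loss, then the L1 prox
        w = np.zeros(X.shape[1])
        for _ in range(iters):
            w = soft_threshold(w - lr * logistic_grad(w, X, y), lr * lam)
        return w

    # Naive dilution: learn densely, then zero out small couplings
    w_naive = fit_plain(X, y)
    tau = 0.2 * np.max(np.abs(w_naive))     # ad-hoc threshold
    w_naive[np.abs(w_naive) < tau] = 0.0

    # L1 dilution: sparsity enforced during learning
    w_l1 = fit_l1(X, y, lam=0.05)

    true_support = set(np.flatnonzero(J0))
    for name, w in [("naive", w_naive), ("L1", w_l1)]:
        sup = set(np.flatnonzero(w))
        print(name, "nonzeros:", len(sup),
              "true positives:", len(sup & true_support))

The qualitative difference is that the soft-threshold step holds couplings exactly at zero throughout learning, whereas naive dilution only prunes after the dense solution has already spread weight across irrelevant couplings.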