In this paper we present a geometrical framework for the analysis of Estimation of Distribution Algorithms (EDAs) based on the exponential family. From a theoretical point of view, an EDA can be modeled as a sequence of densities in a statistical model that converges towards distributions with reduced support. Under this framework, at each iteration the empirical mean of the fitness function decreases in probability, until convergence of the population. This is the setting of stochastic relaxation, i.e., the idea of searching for the minima of a function by minimizing its expected value over a set of probability densities. Our main interest is in the study of the gradient of the expected value of the function to be minimized, and in particular in how its landscape changes with the fitness function and the statistical model used in the relaxation. After introducing some properties of the exponential family, such as the description of its topological closure and of its tangent space, we provide a characterization of the stationary points of the relaxed problem, together with a study of minimizing sequences with reduced support. The analysis developed in the paper aims to provide a theoretical understanding of the behavior of EDAs, and in particular of their ability to converge to the global minimum of the fitness function. Besides providing a formal framework for the analysis of EDAs, the theoretical results lead to the definition of a new class of algorithms for binary function optimization based on Stochastic Natural Gradient Descent (SNGD), in which the estimation of the parameters of the distribution is replaced by a direct update of the model parameters based on an estimate of the natural gradient of the expected value of the fitness function.
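As a concrete illustration of the SNGD scheme described in the abstract, the sketch below minimizes the expected fitness E_theta[f] over an independent-bit exponential family p_theta(x) = exp(theta . x - psi(theta)) on {0,1}^n, for which the vanilla gradient is dE[f]/dtheta_i = Cov(f, x_i) and the Fisher information matrix is diagonal with entries p_i(1 - p_i). This is a minimal sketch under those assumptions, not the paper's exact algorithm: the product model, the function name sngd_minimize, and all hyperparameters are illustrative choices.

    import numpy as np

    def sngd_minimize(f, n_bits, pop_size=200, lr=0.5, n_iters=100, seed=0):
        """Minimize f: {0,1}^n -> R by stochastic natural gradient descent
        on E_theta[f], using an independent-bit exponential family
        (illustrative sketch; not the paper's exact algorithm)."""
        rng = np.random.default_rng(seed)
        theta = np.zeros(n_bits)                 # natural parameters; p_i = 0.5 at start
        best_x, best_f = None, np.inf
        for _ in range(n_iters):
            p = 1.0 / (1.0 + np.exp(-theta))     # marginals p_i = sigmoid(theta_i)
            X = (rng.random((pop_size, n_bits)) < p).astype(float)   # sample population
            fx = np.array([f(x) for x in X])
            i = int(fx.argmin())
            if fx[i] < best_f:
                best_f, best_x = fx[i], X[i].copy()
            # Monte Carlo estimate of the vanilla gradient: grad_i = Cov(f, x_i)
            grad = ((fx - fx.mean())[:, None] * (X - X.mean(axis=0))).mean(axis=0)
            # Diagonal Fisher information of the independent model: I_ii = p_i (1 - p_i)
            fisher = np.clip(p * (1.0 - p), 1e-6, None)
            theta -= lr * grad / fisher          # natural gradient step, minimizing E[f]
            theta = np.clip(theta, -10.0, 10.0)  # keep parameters away from the boundary
        return best_x, best_f

    # Example: negated OneMax; SNGD should drive every marginal towards 1
    x_best, f_best = sngd_minimize(lambda x: -x.sum(), n_bits=20)

The clipping of theta reflects the phenomenon discussed in the abstract: minimizing sequences may converge to distributions with reduced support, i.e., to the boundary of the model, where the Fisher information degenerates.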