A Bregman extension of quasi-Newton updates I: an information geometrical framework

  • Authors:
  • Takafumi Kanamori; Atsumi Ohara

  • Affiliations:
  • Department of Computer Science and Mathematical Informatics, Nagoya University, Furocho, Chikusaku, Nagoya, 464-8603, Japan; Department of Electrical and Electronics Engineering, University of Fukui, 3-9-1 Bunkyo, Fukui, 910-8507, Japan

  • Venue:
  • Optimization Methods & Software
  • Year:
  • 2013

Abstract

We study quasi-Newton methods from the viewpoint of information geometry. Fletcher studied a variational problem that derives the approximate Hessian update formulae of quasi-Newton methods. We point out that this variational problem is identical to minimizing the Kullback–Leibler (KL) divergence, a discrepancy measure between two probability distributions. The KL divergence induces a differential geometrical structure on the set of positive-definite matrices, and this geometric view aids an intuitive understanding of the Hessian update in quasi-Newton methods. We then introduce the Bregman divergence as an extension of the KL divergence. Like the KL divergence, the Bregman divergence induces an information geometrical structure on the set of positive-definite matrices. We derive extended quasi-Newton update formulae from the variational problem based on the Bregman divergence. From the geometrical viewpoint, we study the invariance properties of Hessian update formulae. We also propose an extension of the sparse quasi-Newton update. In particular, we point out that the sparse quasi-Newton method is closely related to statistical algorithms such as the em-algorithm and boosting. We show that information geometry is a useful tool not only for better understanding numerical algorithms but also for designing new update formulae in quasi-Newton methods.
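Two standard facts behind the abstract can be made concrete. First, the Bregman divergence generated by the potential φ(X) = −log det X on positive-definite matrices is D(X, Y) = tr(XY⁻¹) − log det(XY⁻¹) − n, which coincides with the KL divergence between zero-mean Gaussian distributions. Second, the classical BFGS Hessian update satisfies the secant equation B₊s = y that constrains the variational problem. The sketch below (plain Python, 2×2 matrices; the helper names are ours, not the paper's notation) is a minimal illustration of both facts, not an implementation of the paper's extended updates:

```python
import math

def matmul(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def det2(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def inv2(A):
    d = det2(A)
    return [[A[1][1] / d, -A[0][1] / d],
            [-A[1][0] / d, A[0][0] / d]]

def trace(A):
    return A[0][0] + A[1][1]

def bregman_logdet(X, Y):
    """Bregman divergence of phi(X) = -log det X on 2x2 SPD matrices:
    D(X, Y) = tr(X Y^{-1}) - log det(X Y^{-1}) - n.
    It is zero iff X == Y and positive otherwise."""
    n = 2
    M = matmul(X, inv2(Y))
    return trace(M) - math.log(det2(M)) - n

def bfgs_update(B, s, y):
    """Classical BFGS update:
    B+ = B - (B s s^T B)/(s^T B s) + (y y^T)/(y^T s).
    The result satisfies the secant equation B+ s = y."""
    Bs = [sum(B[i][k] * s[k] for k in range(2)) for i in range(2)]
    sBs = sum(s[i] * Bs[i] for i in range(2))
    ys = sum(y[i] * s[i] for i in range(2))
    return [[B[i][j] - Bs[i] * Bs[j] / sBs + y[i] * y[j] / ys
             for j in range(2)] for i in range(2)]

# The divergence vanishes at X = Y and is positive elsewhere.
I = [[1.0, 0.0], [0.0, 1.0]]
X = [[2.0, 0.0], [0.0, 1.0]]
print(bregman_logdet(I, I))   # ~0.0
print(bregman_logdet(X, I) > 0)

# The BFGS update obeys the secant equation B+ s = y.
B = [[2.0, 0.0], [0.0, 3.0]]
s, y = [1.0, 1.0], [3.0, 1.0]
Bp = bfgs_update(B, s, y)
print([sum(Bp[i][k] * s[k] for k in range(2)) for i in range(2)])  # == y
```

The paper's extended formulae replace the log-det potential above with a general Bregman potential; the secant constraint plays the same role in the variational problem.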