Data-Efficient Information-Theoretic Test Selection

  • Authors:
  • Marianne Mueller, Rómer Rosales, Harald Steck, Sriram Krishnan, Bharat Rao, Stefan Kramer

  • Affiliations:
  • Institut für Informatik, Technische Universität München, 85748 Garching, Germany (Marianne Mueller, Stefan Kramer); IKM CAD and Knowledge Solutions, Siemens Healthcare, Malvern, USA 19335 (Rómer Rosales, Harald Steck, Sriram Krishnan, Bharat Rao)

  • Venue:
  • AIME '09: Proceedings of the 12th Conference on Artificial Intelligence in Medicine
  • Year:
  • 2009


Abstract

We use the concept of conditional mutual information (MI) to approach variable-selection problems in medical diagnosis. Computing MI requires estimates of joint distributions over collections of variables. However, accurately estimating joint distributions conditioned on a large set of variables is, in general, expensive in terms of both data and computation. One must therefore seek alternative ways to calculate the relevant quantities while still using all the available observations. We describe and compare a basic approach that averages MI estimates conditioned on individual observations with an approach that conditions on all observations at once by making conditional independence assumptions. This yields a data-efficient variant of information maximization for test selection. We present experimental results on public heart disease data and on data from a controlled study in breast cancer diagnosis.
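To illustrate the general idea behind MI-based test selection, the sketch below greedily picks the test with the largest empirical conditional mutual information with the diagnosis, given the tests already selected. The function names (`conditional_mi`, `greedy_test_selection`) and the toy data are assumptions introduced for illustration; the paper's data-efficient estimators (averaging over individual observations, or conditioning on all observations under conditional independence assumptions) are not reproduced here.

```python
from collections import Counter
from math import log2

def conditional_mi(x, y, z):
    """Empirical (plug-in) conditional mutual information I(X; Y | Z) in bits.

    x, y: sequences of discrete values (one entry per patient).
    z: sequence of tuples holding the outcomes of the already-selected tests
       (empty tuples when no test has been selected yet).
    """
    n = len(x)
    c_xyz = Counter(zip(x, y, z))
    c_xz = Counter(zip(x, z))
    c_yz = Counter(zip(y, z))
    c_z = Counter(z)
    mi = 0.0
    for (xi, yi, zi), cnt in c_xyz.items():
        # p(x,y,z) * log2[ p(x,y,z) p(z) / (p(x,z) p(y,z)) ], with the 1/n factors cancelling.
        mi += (cnt / n) * log2((cnt * c_z[zi]) / (c_xz[(xi, zi)] * c_yz[(yi, zi)]))
    return mi

def greedy_test_selection(tests, diagnosis, k):
    """Greedily select k tests, each maximizing I(diagnosis; test | selected tests).

    tests: dict mapping test name -> list of discrete outcomes per patient.
    diagnosis: list of discrete diagnosis labels per patient.
    """
    selected = []
    for _ in range(k):
        # Joint configuration of the already-selected tests for each patient.
        z = [tuple(tests[s][i] for s in selected) for i in range(len(diagnosis))]
        best = max(
            (t for t in tests if t not in selected),
            key=lambda t: conditional_mi(tests[t], diagnosis, z),
        )
        selected.append(best)
    return selected

if __name__ == "__main__":
    # Hypothetical toy data: three binary tests and a binary diagnosis for six patients.
    tests = {
        "t1": [0, 0, 1, 1, 1, 0],
        "t2": [0, 1, 0, 1, 1, 1],
        "t3": [1, 1, 0, 0, 1, 0],
    }
    diagnosis = [0, 0, 1, 1, 1, 0]
    print(greedy_test_selection(tests, diagnosis, k=2))
```

Note that this plug-in estimator conditions on the full joint configuration of the selected tests, which is exactly the data-inefficiency the paper sets out to mitigate: with many selected tests, most configurations are observed too rarely for reliable probability estimates.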