Data-Efficient Information-Theoretic Test Selection

  • Authors:
  • Marianne Mueller, Rómer Rosales, Harald Steck, Sriram Krishnan, Bharat Rao, Stefan Kramer

  • Affiliations:
  • Institut für Informatik, Technische Universität München, 85748 Garching, Germany (Marianne Mueller, Stefan Kramer); IKM CAD and Knowledge Solutions, Siemens Healthcare, Malvern, USA 19335 (Rómer Rosales, Harald Steck, Sriram Krishnan, Bharat Rao)

  • Venue:
  • AIME '09: Proceedings of the 12th Conference on Artificial Intelligence in Medicine
  • Year:
  • 2009


Abstract

We use the concept of conditional mutual information (MI) to approach variable-selection problems in medical diagnosis. Computing MI requires estimates of joint distributions over collections of variables. However, accurately estimating joint distributions conditioned on a large set of variables is, in general, expensive in terms of both data and computation. One must therefore seek alternative ways to calculate the relevant quantities while still using all the available observations. We describe and compare a basic approach that averages MI estimates conditioned on individual observations with an approach that conditions on all observations at once by making conditional independence assumptions. This yields a data-efficient variant of information maximization for test selection. We present experimental results on public heart disease data and on data from a controlled study in breast cancer diagnosis.
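To illustrate the general idea behind MI-based test selection, the sketch below greedily picks the test with the largest empirical conditional mutual information with the diagnosis, given the tests already selected. The function names (`conditional_mi`, `greedy_test_selection`) and the toy data are assumptions introduced for illustration; the paper's data-efficient estimators (averaging over individual observations, or conditioning on all observations under conditional independence assumptions) are not reproduced here.

```python
from collections import Counter
from math import log2

def conditional_mi(x, y, z):
    """Empirical (plug-in) conditional mutual information I(X; Y | Z) in bits.

    x, y: sequences of discrete values (one entry per patient).
    z: sequence of tuples holding the outcomes of the already-selected tests
       (empty tuples when no test has been selected yet).
    """
    n = len(x)
    c_xyz = Counter(zip(x, y, z))
    c_xz = Counter(zip(x, z))
    c_yz = Counter(zip(y, z))
    c_z = Counter(z)
    mi = 0.0
    for (xi, yi, zi), cnt in c_xyz.items():
        # p(x,y,z) * log2[ p(x,y,z) p(z) / (p(x,z) p(y,z)) ], with the 1/n factors cancelling.
        mi += (cnt / n) * log2((cnt * c_z[zi]) / (c_xz[(xi, zi)] * c_yz[(yi, zi)]))
    return mi

def greedy_test_selection(tests, diagnosis, k):
    """Greedily select k tests, each maximizing I(diagnosis; test | selected tests).

    tests: dict mapping test name -> list of discrete outcomes per patient.
    diagnosis: list of discrete diagnosis labels per patient.
    """
    selected = []
    for _ in range(k):
        # Joint configuration of the already-selected tests for each patient.
        z = [tuple(tests[s][i] for s in selected) for i in range(len(diagnosis))]
        best = max(
            (t for t in tests if t not in selected),
            key=lambda t: conditional_mi(tests[t], diagnosis, z),
        )
        selected.append(best)
    return selected

if __name__ == "__main__":
    # Hypothetical toy data: three binary tests and a binary diagnosis for six patients.
    tests = {
        "t1": [0, 0, 1, 1, 1, 0],
        "t2": [0, 1, 0, 1, 1, 1],
        "t3": [1, 1, 0, 0, 1, 0],
    }
    diagnosis = [0, 0, 1, 1, 1, 0]
    print(greedy_test_selection(tests, diagnosis, k=2))
```

Note that this plug-in estimator conditions on the full joint configuration of the selected tests, which is exactly the data-inefficiency the paper sets out to mitigate: with many selected tests, most configurations are observed too rarely for reliable probability estimates.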