Using literature and data to learn Bayesian networks as clinical models of ovarian tumors

  • Authors:
  • Peter Antal;Geert Fannes;Dirk Timmerman;Yves Moreau;Bart De Moor

  • Affiliations:
  • Department of Electrical Engineering, ESAT/SCD, Katholieke Universiteit Leuven, Kasteelpark Arenberg 10, B-3001 Leuven, Belgium;Department of Electrical Engineering, ESAT/SCD, Katholieke Universiteit Leuven, Kasteelpark Arenberg 10, B-3001 Leuven, Belgium;Department of Obstetrics and Gynecology, University Hospitals Leuven, Herestraat 49, B-3000 Leuven, Belgium;Department of Electrical Engineering, ESAT/SCD, Katholieke Universiteit Leuven, Kasteelpark Arenberg 10, B-3001 Leuven, Belgium;Department of Electrical Engineering, ESAT/SCD, Katholieke Universiteit Leuven, Kasteelpark Arenberg 10, B-3001 Leuven, Belgium

  • Venue:
  • Artificial Intelligence in Medicine
  • Year:
  • 2004

Abstract

Thanks to its increasing availability, electronic literature has become a potential source of information for developing complex Bayesian networks (BNs) when human expertise is missing or when data are scarce or noisy. This opportunity raises the question of how to integrate information from free-text resources with statistical data when learning Bayesian networks. First, we report on the collection of prior information resources in the ovarian cancer domain, which includes "kernel" annotations of the domain variables. We introduce methods that use the annotations and the literature to derive informative pairwise dependency measures: one based on the statistical co-occurrence of the variable names, one based on the similarity of the "kernel" descriptions of the variables, and one that combines the two. We perform a wide-scale evaluation of these text-based dependency scores against an expert reference and against data-based scores (mutual information (MI) and a Bayesian score). Next, we transform the text-based dependency measures into informative text-based priors for Bayesian network structures. Finally, we report the benefit of such informative text-based priors on the performance of a Bayesian network for classifying ovarian tumors from clinical data.