Choosing where to look next in a mutation sequence space

  • Authors:
  • Samuel A. Danziger; Jue Zeng; Ying Wang; Rainer K. Brachmann; Richard H. Lathrop

  • Venue:
  • Bioinformatics
  • Year:
  • 2007

Abstract

Motivation: Many biomedical projects would benefit from reducing the time and expense of in vitro experimentation by using computer models for in silico predictions. These models may help determine which expensive biological data are most useful to acquire next. Active Learning techniques, which choose the most informative data to acquire, enable biologists and computer scientists to optimize experimental data choices for rapid discovery of biological function. To explore the design choices that affect this behavior, five novel and five existing Active Learning techniques, together with three control methods, were tested on 57 previously unknown p53 cancer rescue mutants for their ability to build classifiers that predict protein function. The best of these techniques, Maximum Curiosity, improved accuracy from a baseline of 56% to 77%. This article shows that Active Learning is a useful tool for biomedical research, and provides a case study of interest to others facing similar discovery challenges.