Developing classification techniques from biological databases using simulated annealing

  • Authors:
  • B. de la Iglesia;J. J. Wesselink;V. J. Rayward-Smith;J. Dicks;I. N. Roberts;V. Robert;T. Boekhout

  • Affiliations:
  • School of Information Systems, University of East Anglia, Norwich, England;School of Information Systems, University of East Anglia, Norwich, England;School of Information Systems, University of East Anglia, Norwich, England;John Innes Centre, Norwich Research Park, Colney, Norwich, England;Institute of Food Research, Norwich Research Park, Colney, Norwich, England;Centraalbureau voor Schimmelcultures, Utrecht, The Netherlands;Centraalbureau voor Schimmelcultures, Utrecht, The Netherlands

  • Venue:
  • Metaheuristics
  • Year:
  • 2004

Abstract

This paper describes new approaches to the classification and identification of biological data. The work may be extensible to other domains, such as medical diagnosis or fault diagnosis. Organisms are often classified according to the values of tests that measure some characteristic of the organism. When selecting a suitable test set, it is important to choose one of minimum cost. Equally, when classification models are constructed for the later identification of unnamed individuals, it is important to produce models that are optimal in terms of identification performance and cost. In this paper, we first describe the problem of selecting an economic test set for classification. We develop a criterion for the differentiation of organisms which may encompass fuzzy differentiability. Then, we describe the problem of using batches of tests sequentially to identify unknown organisms, and we explore the problem of constructing the best sequence of batches of tests in terms of cost and identification performance. We discuss how metaheuristic algorithms may be used to solve these problems. We also present an application of the above to the problem of yeast classification and identification.
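As an illustration of the first problem in the abstract, the following is a minimal sketch of simulated annealing applied to minimum-cost test set selection. It is not the authors' formulation. It assumes crisp binary test outcomes (the paper's criterion may also cover fuzzy differentiability), a penalty-based objective, single-test flip moves, and a geometric cooling schedule. The function names, the parameters, and the toy data are all hypothetical.

```python
import math
import random
from itertools import combinations


def undifferentiated_pairs(results, selected):
    """Count organism pairs that no selected test tells apart (crisp, binary assumption)."""
    return sum(
        1
        for a, b in combinations(results, 2)
        if all(a[t] == b[t] for t in selected)
    )


def objective(results, costs, selected, penalty):
    """Total test cost plus a penalty for every pair left undifferentiated."""
    cost = sum(costs[t] for t in selected)
    return cost + penalty * undifferentiated_pairs(results, selected)


def anneal_test_set(results, costs, iterations=5000, t_start=5.0, cooling=0.999, seed=0):
    """Search for a cheap test subset that still separates every pair of organisms.

    Assumes the full test set separates all organisms, so the starting
    solution is feasible and the best solution found stays feasible.
    """
    rng = random.Random(seed)
    n_tests = len(costs)
    # With this penalty, any infeasible subset scores worse than using every test.
    penalty = sum(costs) + 1
    current = set(range(n_tests))
    current_score = objective(results, costs, current, penalty)
    best, best_score = set(current), current_score
    temperature = t_start
    for _ in range(iterations):
        # Neighbourhood move: add or remove one randomly chosen test.
        candidate = current ^ {rng.randrange(n_tests)}
        score = objective(results, costs, candidate, penalty)
        delta = score - current_score
        # Accept improvements always, worsening moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_score = candidate, score
            if current_score < best_score:
                best, best_score = set(current), current_score
        temperature *= cooling  # geometric cooling (an assumed schedule)
    return best, best_score


# Toy data (hypothetical): four organisms, four binary tests, one cost per test.
toy_results = [(0, 0, 1, 0), (0, 1, 1, 1), (1, 0, 0, 0), (1, 1, 0, 1)]
toy_costs = [3, 2, 3, 5]
chosen_tests, chosen_cost = anneal_test_set(toy_results, toy_costs)
print(sorted(chosen_tests), chosen_cost)
```

The penalty weight is set just above the cost of the full test set, so a subset that fails to separate some pair of organisms can never be recorded as the best solution. A fuzzy criterion would replace the equality check in `undifferentiated_pairs` with a graded measure of difference.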