Cancer informatics by prototype networks in mass spectrometry

  • Authors:
  • Frank-Michael Schleif;Thomas Villmann;Markus Kostrzewa;Barbara Hammer;Alexander Gammerman

  • Affiliations:
  • University Leipzig, Department of Medicine, Computational Intelligence Group, Semmelweisstrasse 10, 04103 Leipzig, Germany;University Leipzig, Department of Medicine, Computational Intelligence Group, Semmelweisstrasse 10, 04103 Leipzig, Germany;Bruker Daltonik GmbH, Department of Bioanalytics, Research & Development, Permoserstrasse 15, 04318 Leipzig, Germany;Technical University of Clausthal, Department of Computer Science, Computational Intelligence Group, Julius-Albert-Street 4, 38678 Clausthal-Zellerfeld, Germany;The Computer Learning Research Center, Royal Holloway, University of London, Egham, Surrey TW20 0EX, United Kingdom

  • Venue:
  • Artificial Intelligence in Medicine
  • Year:
  • 2009

Abstract

Objective: Mass spectrometry has become a standard technique for analyzing clinical samples in cancer research. The resulting spectrometric measurements reveal a great deal of information about the clinical sample at the peptide and protein level. The spectra are high-dimensional and, because the number of samples is small, sparse coverage of the population is very common. In clinical research, the calculation and evaluation of classification models is important. In classical statistics this is achieved by hypothesis testing with respect to a chosen level of confidence. In clinical proteomics, the application of statistical tests is limited by the small number of samples and the high dimensionality of the data. Typically, soft methods from the field of machine learning are used to generate such models. However, these methods provide little or no additional information about the reliability of the model decision. In this contribution, the spectral data are processed as functional data and conformal classifier models are generated. The obtained models allow the detection of potential biomarker candidates and provide confidence measures for the classification decision.
Methods: First, wavelet-based techniques for the efficient processing and encoding of mass spectrometric measurements from clinical samples are presented. A prototype-based classifier is then extended by a functional metric and combined with the concept of conformal prediction to classify the clinical proteomic spectra and to evaluate the results.
Results: Clinical proteomic data from a colorectal cancer study and a lung cancer study are used to test the performance of the proposed algorithm. The prototype classifiers are evaluated with respect to prediction accuracy and the confidence of the classification decisions. The adapted metric parameters are analyzed and interpreted to find potential biomarker candidates.
Conclusions: The proposed algorithm can be used to analyze functional data as obtained from clinical mass spectrometry, to find discriminating mass positions and to judge the confidence of the obtained classifications, providing robust and interpretable classification models.
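
Illustration: The combination of a prototype-based classifier with conformal prediction named in the Methods can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: it uses a plain Euclidean distance in place of the functional metric, fixed toy prototypes in place of trained ones, a split (inductive) calibration set, and a nonconformity score defined as the ratio of the distance to the nearest same-class prototype over the distance to the nearest other-class prototype. All function names and data values are hypothetical.

```python
import math

def euclidean(a, b):
    # Assumption: plain Euclidean distance stands in for the functional metric.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nonconformity(x, label, prototypes):
    # Small when x lies close to a prototype of `label` and far from all others.
    d_same = min(euclidean(x, w) for w, c in prototypes if c == label)
    d_other = min(euclidean(x, w) for w, c in prototypes if c != label)
    return d_same / (d_other + 1e-12)

def p_values(x, prototypes, calibration_scores, labels):
    # Conformal p-value per candidate label: the fraction of scores
    # (calibration scores plus the candidate itself) at least as large.
    n = len(calibration_scores)
    result = {}
    for label in labels:
        a = nonconformity(x, label, prototypes)
        count = sum(1 for s in calibration_scores if s >= a)
        result[label] = (count + 1) / (n + 1)
    return result

def predict_with_confidence(x, prototypes, calibration, labels):
    scores = [nonconformity(xi, yi, prototypes) for xi, yi in calibration]
    p = p_values(x, prototypes, scores, labels)
    ranked = sorted(labels, key=lambda c: p[c], reverse=True)
    best, second = ranked[0], ranked[1]
    # Confidence: 1 minus the second-largest p-value.
    # Credibility: the largest p-value.
    return best, 1.0 - p[second], p[best]

# Hypothetical toy data: two encoded spectral features per sample.
labels = ["control", "cancer"]
prototypes = [((0.0, 0.0), "control"), ((4.0, 4.0), "cancer")]
calibration = [
    ((0.5, 0.2), "control"),
    ((0.1, 0.9), "control"),
    ((3.8, 4.1), "cancer"),
    ((4.4, 3.5), "cancer"),
]

label, confidence, credibility = predict_with_confidence(
    (0.3, 0.3), prototypes, calibration, labels
)
print(label, confidence, credibility)
```

With only four calibration samples the p-values are very coarse (multiples of 1/5), which mirrors the small-sample limitation the abstract describes; a real study would need a larger calibration set for finer confidence estimates.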