Semisupervised profiling of gene expressions and clinical data

  • Authors:
  • Silvano Paoli; Giuseppe Jurman; Davide Albanese; Stefano Merler; Cesare Furlanello

  • Affiliations:
  • ITC-irst, Trento, Italy; ITC-irst, Trento, Italy; ITC-irst, Trento, Italy; ITC-irst, Trento, Italy; ITC-irst, Trento, Italy

  • Venue:
  • WILF'05 Proceedings of the 6th international conference on Fuzzy Logic and Applications
  • Year:
  • 2005

Abstract

We present an application of BioDCV, a computational environment for semisupervised profiling with Support Vector Machines, aimed at detecting outliers and deriving informative subtypes of patients with respect to pathological features. First, a sample-tracking curve is extracted for each sample as a by-product of the profiling process. The curves are then clustered according to a distance derived from Dynamic Time Warping. The procedure allows identification of noisy cases, whose removal is shown to improve predictive accuracy and the stability of derived gene profiles. After removal of outliers, the semisupervised process is repeated and subgroups of patients are specified. The procedure is demonstrated through the analysis of a liver cancer dataset of 213 samples described by 1,993 genes and by pathological features.
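
The clustering step relies on a distance derived from Dynamic Time Warping (DTW). As a rough illustration only, the sketch below computes the textbook DTW distance between numeric curves and builds a pairwise distance matrix. It is not the BioDCV implementation: the abstract does not give the exact distance used, so the absolute-difference local cost, the function names `dtw_distance` and `pairwise_dtw`, and the toy curves are all assumptions made for this example.

```python
from math import inf


def dtw_distance(a, b):
    """Textbook DTW between two numeric sequences.

    Assumption: the local cost is the absolute difference; the paper's
    DTW-derived distance may differ.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor cells.
            cost[i][j] = d + min(cost[i - 1][j],
                                 cost[i][j - 1],
                                 cost[i - 1][j - 1])
    return cost[n][m]


def pairwise_dtw(curves):
    """Symmetric matrix of DTW distances between all pairs of curves."""
    k = len(curves)
    dist = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            dist[i][j] = dist[j][i] = dtw_distance(curves[i], curves[j])
    return dist


# Hypothetical sample-tracking curves (made-up values, not from the paper).
toy_curves = [
    [0.0, 0.1, 0.1, 0.2],
    [0.0, 0.1, 0.2],
    [0.9, 0.8, 0.9, 1.0],
]
toy_matrix = pairwise_dtw(toy_curves)
```

A matrix like `toy_matrix` could then be passed to any clustering method that accepts precomputed distances, such as hierarchical clustering; curves that stay far from every cluster would be candidates for the noisy cases mentioned above.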