Semisupervised profiling of gene expressions and clinical data

Authors:
Silvano Paoli;Giuseppe Jurman;Davide Albanese;Stefano Merler;Cesare Furlanello
Affiliations:
ITC-irst, Trento, Italy;ITC-irst, Trento, Italy;ITC-irst, Trento, Italy;ITC-irst, Trento, Italy;ITC-irst, Trento, Italy
Venue:
WILF'05 Proceedings of the 6th international conference on Fuzzy Logic and Applications
Year:
2005

Abstract

We present an application of BioDCV, a computational environment for semisupervised profiling with Support Vector Machines, aimed at detecting outliers and deriving informative subtypes of patients with respect to pathological features. First, a sample-tracking curve is extracted for each sample as a by-product of the profiling process. The curves are then clustered according to a distance derived from Dynamic Time Warping. The procedure allows identification of noisy cases, whose removal is shown to improve both predictive accuracy and the stability of the derived gene profiles. After removal of outliers, the semisupervised process is repeated and subgroups of patients are specified. The procedure is demonstrated through the analysis of a liver cancer dataset of 213 samples described by 1,993 genes and by pathological features.
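
The clustering step can be illustrated with a minimal sketch. It is not BioDCV code: the abstract does not specify the local cost, the clustering algorithm, or how a sample-tracking curve is computed. The sketch assumes an absolute-difference local cost for Dynamic Time Warping and average-linkage agglomerative clustering. The function names `dtw_distance` and `average_linkage_clusters` and the toy curves are hypothetical.

```python
from itertools import combinations


def dtw_distance(a, b):
    """Textbook DTW between two numeric sequences (assumed local cost: |x - y|)."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a[i-1] matched again
                                 cost[i][j - 1],      # b[j-1] matched again
                                 cost[i - 1][j - 1])  # both sequences advance
    return cost[n][m]


def average_linkage_clusters(curves, k):
    """Merge the two closest clusters (average DTW distance) until k remain."""
    dist = {(i, j): dtw_distance(curves[i], curves[j])
            for i, j in combinations(range(len(curves)), 2)}
    clusters = [[i] for i in range(len(curves))]

    def linkage(c1, c2):
        total = sum(dist[(min(p, q), max(p, q))] for p in c1 for q in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > k:
        # combinations yields index pairs with x < y, so deleting y is safe
        x, y = min(combinations(range(len(clusters)), 2),
                   key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]))
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return [sorted(c) for c in clusters]


# Made-up curves for four samples; lengths may differ, which DTW tolerates.
curves = [
    [0.0, 0.1, 0.1, 0.0],
    [0.0, 0.0, 0.1, 0.1, 0.0],
    [0.9, 1.0, 0.9, 1.0],  # behaves unlike the others: a candidate outlier
    [0.1, 0.0, 0.1, 0.0],
]
print(average_linkage_clusters(curves, k=2))  # [[0, 1, 3], [2]]
```

In this toy run the third curve ends up alone in its cluster, which is the kind of pattern an outlier-detection step could flag. Which clusters count as noisy cases in the actual procedure is not stated in the abstract.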