On the power of topological kernel in microarray-based detection of cancer

Authors:
Vilen Jumutc;Pawel Zayakin
Affiliations:
Riga Technical University, Riga, Latvia;Latvian BioMedical Research & Study Center, Riga, Latvia
Venue:
IDEAL'10 Proceedings of the 11th international conference on Intelligent data engineering and automated learning
Year:
2010

Abstract

In this paper we propose a new topological kernel for the microarray-based detection of cancer. For many decades microarrays have been a convenient approach to detecting and observing tumor-derived proteins and the genes involved. Despite its biomedical success, microarray-based diagnostics is still not in common use in practical biomedicine, owing to the lack of robust classification methods that can diagnose unseen serum samples correctly and without sensitivity to the underlying distribution. This unfavorable property of microarray datasets stems from the probabilistically infeasible difference between cancer-specific and healthy samples, where only a very small number of (anti)genes have prominent tumor-driven expression values. Kernel methods such as SVM, a state-of-the-art general-purpose classification and regression toolbox, partially address this problem. Nevertheless, poorly performed normalization or preprocessing steps can easily bias the similarity measures encoded by the SVM kernel, preventing proper generalization to unseen data. The proposed topological kernel addresses this issue by incorporating indirect topological similarities between samples and taking into account the ranking of every attribute within each sample. Experimental evaluations on different microarray datasets verify that the proposed kernel improves performance on poorly conditioned and even very small datasets, resulting in statistically significant P-values. Finally, we demonstrate that the proposed kernel works even better without cross-sample normalization and rescaling of the input space.
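
The abstract does not define the topological kernel, so the following is only a minimal sketch of one ingredient it mentions: comparing samples through the ranking of attributes within each sample. It is not the authors' kernel and it omits the indirect topological similarities between samples. The function names `within_sample_ranks`, `rank_kernel`, and `gram_matrix` are hypothetical, and the similarity used here is an assumed stand-in (a Spearman-style correlation of within-sample ranks).

```python
from typing import List, Sequence


def within_sample_ranks(sample: Sequence[float]) -> List[float]:
    """Return 1-based ranks of the attributes inside one sample (ties get the average rank)."""
    order = sorted(range(len(sample)), key=lambda i: sample[i])
    ranks = [0.0] * len(sample)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value is tied with the current one.
        while end + 1 < len(order) and sample[order[end + 1]] == sample[order[pos]]:
            end += 1
        avg = (pos + end) / 2.0 + 1.0  # average 1-based rank of the tied run
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks


def rank_kernel(x: Sequence[float], y: Sequence[float]) -> float:
    """Assumed stand-in similarity: Pearson correlation of within-sample ranks."""
    rx, ry = within_sample_ranks(x), within_sample_ranks(y)
    n = len(rx)
    mean = (n + 1) / 2.0  # mean of ranks 1..n, unchanged by average-rank ties
    cov = sum((a - mean) * (b - mean) for a, b in zip(rx, ry))
    var_x = sum((a - mean) ** 2 for a in rx)
    var_y = sum((b - mean) ** 2 for b in ry)
    if var_x == 0.0 or var_y == 0.0:
        return 0.0  # a constant sample carries no ranking information
    return cov / (var_x * var_y) ** 0.5


def gram_matrix(samples: Sequence[Sequence[float]]) -> List[List[float]]:
    """Pairwise similarities, e.g. for an SVM that accepts a precomputed kernel."""
    return [[rank_kernel(a, b) for b in samples] for a in samples]


if __name__ == "__main__":
    sample = [0.1, 5.0, 2.0, 9.0]
    rescaled = [10.0 * v + 3.0 for v in sample]  # same sample on a different scale
    print(rank_kernel(sample, rescaled))
```

Because only the within-sample ordering enters the computation, any strictly increasing per-sample rescaling leaves the similarity unchanged, which illustrates why a rank-based measure can do without cross-sample normalization.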