On the combination of dissimilarities for gene expression data analysis

  • Authors:
  • Ángela Blanco; Manuel Martín-Merino; Javier De Las Rivas

  • Affiliations:
  • Universidad Pontificia de Salamanca, Salamanca, Spain; Universidad Pontificia de Salamanca, Salamanca, Spain; Cancer Research Center, CIC-IBMCC, CSIC/USAL, Salamanca, Spain

  • Venue:
  • ICANN'07 Proceedings of the 17th international conference on Artificial neural networks
  • Year:
  • 2007

Abstract

DNA microarray technology makes it possible to monitor the expression levels of thousands of genes simultaneously, and it has become an important tool for identifying different types of cancer. Several machine learning techniques, such as Support Vector Machines (SVMs), have been proposed for this purpose. However, common SVM algorithms rely on Euclidean distances, which do not accurately reflect the proximities among sample profiles. SVMs have been extended to work with non-Euclidean dissimilarities, but no single dissimilarity can be considered superior to the others, because each one reflects different features of the data. In this paper, we propose combining several SVMs, each based on a different dissimilarity, to improve on the performance of classifiers based on a single measure. The experimental results suggest that our method yields fewer misclassification errors than classifiers based on a single dissimilarity and than Bagging, a widely used combination strategy.
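
The abstract does not say which dissimilarities are used or how the individual classifiers are combined, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes three common dissimilarities (Euclidean, Manhattan, and one minus the Pearson correlation) and a simple majority vote. To keep the example dependency-free, a 1-nearest-neighbor rule stands in for each SVM. All function names are hypothetical.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute differences between two expression profiles."""
    return sum(abs(x - y) for x, y in zip(a, b))


def correlation_dissimilarity(a, b):
    """One minus the Pearson correlation: small when two profiles
    rise and fall together, regardless of their overall scale."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 1.0  # correlation is undefined for a flat profile
    return 1.0 - cov / math.sqrt(var_a * var_b)


def nearest_neighbor_label(sample, train_x, train_y, dissimilarity):
    """Base classifier (stand-in for an SVM): label of the training
    profile closest to `sample` under the given dissimilarity."""
    best = min(range(len(train_x)),
               key=lambda i: dissimilarity(sample, train_x[i]))
    return train_y[best]


def combined_predict(sample, train_x, train_y, dissimilarities):
    """Majority vote over one base classifier per dissimilarity."""
    votes = [nearest_neighbor_label(sample, train_x, train_y, d)
             for d in dissimilarities]
    return Counter(votes).most_common(1)[0][0]
```

With an odd number of dissimilarities and two classes, the vote cannot tie. Each base classifier sees the same training samples but a different notion of proximity, which is what distinguishes this scheme from Bagging, where the base classifiers differ in their resampled training sets.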