Kernel Alignment k-NN for Human Cancer Classification Using the Gene Expression Profiles

Authors:
Manuel Martín-Merino;Javier De Las Rivas
Affiliations:
Universidad Pontificia de Salamanca, Salamanca, Spain 37002;Cancer Research Center (CIC-IBMCC, CSIC/USAL), Salamanca, Spain
Venue:
ICANN '09 Proceedings of the 19th International Conference on Artificial Neural Networks: Part II
Year:
2009




Abstract

The k-nearest neighbor (k-NN) classifier has been applied to the identification of cancer samples from gene expression profiles with encouraging results. However, the performance of k-NN depends strongly on the distance used to evaluate the proximities between samples, and choosing a good dissimilarity is a difficult, problem-dependent task. In this paper, we learn a linear combination of dissimilarities using a regularized version of the kernel alignment algorithm. The error function can be optimized by semidefinite programming and incorporates a term that penalizes the complexity of the family of distances, which helps to avoid overfitting. The proposed method has been applied to the challenging problem of cancer identification from gene expression profiles. Kernel alignment k-NN outperforms other metric learning strategies and improves on the classical k-NN based on a single dissimilarity.
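The idea of weighting several dissimilarities by how well they align with the class labels can be illustrated with a minimal sketch. This is not the authors' algorithm: the regularized semidefinite program is replaced here by a simple heuristic that sets each weight proportional to the (clipped) empirical alignment between a kernel derived from the dissimilarity and the ideal kernel yy^T. The choice of the three dissimilarities, the double-centering step, the mean-based rescaling, and all function names (`fit_weights`, `combined`, `knn_predict`) are assumptions made for illustration only; labels are assumed to be in {-1, +1}.

```python
import numpy as np

def euclidean(A, B):
    return np.sqrt(((A[:, None, :] - B[None, :, :]) ** 2).sum(-1))

def manhattan(A, B):
    return np.abs(A[:, None, :] - B[None, :, :]).sum(-1)

def correlation(A, B):
    # 1 - Pearson correlation between rows (expression profiles).
    Ac = A - A.mean(1, keepdims=True)
    Bc = B - B.mean(1, keepdims=True)
    Ac /= np.linalg.norm(Ac, axis=1, keepdims=True)
    Bc /= np.linalg.norm(Bc, axis=1, keepdims=True)
    return 1.0 - Ac @ Bc.T

# Illustrative choice of base dissimilarities (an assumption, not from the paper).
DISSIMS = (euclidean, manhattan, correlation)

def to_kernel(D):
    # Double-center the squared dissimilarities to obtain a similarity matrix.
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    return -0.5 * J @ (D ** 2) @ J

def alignment(K1, K2):
    # Empirical kernel alignment: <K1, K2>_F / (||K1||_F * ||K2||_F).
    return np.sum(K1 * K2) / (np.linalg.norm(K1) * np.linalg.norm(K2))

def fit_weights(X, y, dissims=DISSIMS):
    # Heuristic stand-in for the SDP: weight_i proportional to max(alignment_i, 0).
    ideal = np.outer(y, y).astype(float)
    scales, scores = [], []
    for d in dissims:
        D = d(X, X)
        s = D.mean()  # rescale so the dissimilarities are comparable
        scales.append(s)
        scores.append(max(alignment(to_kernel(D / s), ideal), 0.0))
    scores = np.array(scores)
    if scores.sum() == 0:
        w = np.full(len(dissims), 1.0 / len(dissims))
    else:
        w = scores / scores.sum()
    return w, np.array(scales)

def combined(A, B, w, scales, dissims=DISSIMS):
    # Weighted linear combination of the rescaled dissimilarities.
    return sum(wi * d(A, B) / s for wi, d, s in zip(w, dissims, scales))

def knn_predict(D_test_train, y_train, k=3):
    # Majority vote among the k nearest training samples (use odd k).
    nn = np.argsort(D_test_train, axis=1)[:, :k]
    votes = y_train[nn].sum(axis=1)
    return np.where(votes >= 0, 1, -1)
```

A typical use would be `w, scales = fit_weights(X_train, y_train)` followed by `knn_predict(combined(X_test, X_train, w, scales), y_train, k=3)`, where rows of `X_train` and `X_test` are expression profiles. Because the weights are normalized to sum to one, the heuristic has no explicit complexity penalty; the paper's regularization term has no counterpart in this sketch.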