Unsupervised feature selection for biomarker identification in chromatography and gene expression data

  • Authors:
  • Marc Strickert;Nese Sreenivasulu;Silke Peterek;Winfriede Weschke;Hans-Peter Mock;Udo Seiffert

  • Affiliations:
  • Leibniz Institute of Plant Genetics and Crop Plant Research Gatersleben, Pattern Recognition Group;Leibniz Institute of Plant Genetics and Crop Plant Research Gatersleben, Gene Expression Group;Leibniz Institute of Plant Genetics and Crop Plant Research Gatersleben, Applied Biochemistry;Leibniz Institute of Plant Genetics and Crop Plant Research Gatersleben, Gene Expression Group;Leibniz Institute of Plant Genetics and Crop Plant Research Gatersleben, Applied Biochemistry;Leibniz Institute of Plant Genetics and Crop Plant Research Gatersleben, Pattern Recognition Group

  • Venue:
  • ANNPR'06: Proceedings of the Second International Conference on Artificial Neural Networks in Pattern Recognition
  • Year:
  • 2006

Abstract

A novel approach to feature selection from unlabeled vector data is presented. It is based on the reconstruction of original data relationships in an auxiliary space with either weighted or omitted features. Feature weighting, on the one hand, is related to the return forces of factors in a parametric data similarity measure in response to disturbance of their optimum values. Feature omission, on the other hand, induces a measurable loss of reconstruction quality and is realized in an iterative greedy way. The proposed framework allows custom data similarity measures to be applied. Here, adaptive Euclidean distance and adaptive Pearson correlation are considered, the former serving as a standard reference, the latter being useful for intensity data. Results of the different strategies are given for chromatography and gene expression data.
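
The greedy feature-omission idea can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that "reconstruction quality" is measured as the Pearson correlation between the pairwise Euclidean distances in the full feature space and those in the reduced space, and it uses plain backward elimination. The function names (`pearson`, `pairwise_distances`, `greedy_omission`) and the toy data are hypothetical and chosen only for illustration.

```python
import math
from itertools import combinations


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def pairwise_distances(rows, features):
    """Euclidean distances between all pairs of rows, restricted to `features`."""
    return [
        math.sqrt(sum((r[f] - s[f]) ** 2 for f in features))
        for r, s in combinations(rows, 2)
    ]


def greedy_omission(rows):
    """Return feature indices in order of removal (least important first).

    Assumption: quality of a feature subset is the correlation between its
    pairwise distances and the distances computed from all features.
    """
    all_features = list(range(len(rows[0])))
    reference = pairwise_distances(rows, all_features)
    remaining = list(all_features)
    removal_order = []
    while len(remaining) > 1:
        # Try omitting each remaining feature and keep the omission that
        # preserves the original pairwise relationships best.
        best_feature, best_quality = None, -math.inf
        for f in remaining:
            subset = [g for g in remaining if g != f]
            quality = pearson(reference, pairwise_distances(rows, subset))
            if quality > best_quality:
                best_feature, best_quality = f, quality
        remaining.remove(best_feature)
        removal_order.append(best_feature)
    removal_order.append(remaining[0])
    return removal_order


# Hypothetical toy data: feature 0 dominates the distances,
# feature 1 contributes a little, feature 2 is almost constant.
data = [
    [0.0, 1.0, 0.50],
    [10.0, 0.0, 0.51],
    [20.0, 1.0, 0.50],
    [30.0, 0.0, 0.49],
]
print(greedy_omission(data))
```

On this toy data the nearly constant third feature is removed first and the dominant first feature is kept until last, so the reversed removal order can be read as a feature ranking. Swapping `pairwise_distances` for a correlation-based dissimilarity would be the natural way to mimic the custom similarity measures mentioned in the abstract.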