Relationship preserving feature selection for unlabelled clinical trials time-series

  • Authors:
  • Fatih Altiparmak; Michael Gibas; Hakan Ferhatosmanoglu

  • Affiliations:
  • ASELSAN A.S. Radar, EW, Turkey; The Ohio State University, Columbus, OH; The Ohio State University, Columbus, OH

  • Venue:
  • Proceedings of the First ACM International Conference on Bioinformatics and Computational Biology
  • Year:
  • 2010

Abstract

Feature selection has been widely studied in supervised data mining applications, where the typical goal is to create clusters through the selection of a reduced attribute set that maximizes classification accuracy. Such a goal may not be appropriate for preserving the inter-attribute relationships of unlabelled time-series, as is the case for clinical trials data. In this paper, we select features based on the time-series relationships of attributes, which we quantify by measuring their inter-attribute movement. We present performance measures and methods for feature selection over unlabelled time-series with the aim of preserving inter-attribute relationships. The performance metrics estimate how well a given feature set preserves representation quality by comparing nearest neighbors before and after feature selection. We provide techniques to combine and compare data from non-standard, variable-length time-series sources, and a mechanism to inject expert opinion into the feature selection process. The methodologies and comparative results are presented in the context of a real pharmaceutical database application.
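The idea of comparing nearest neighbors before and after feature selection can be illustrated with a minimal sketch. It is not the paper's metric: the Euclidean distance, the fixed neighborhood size `k`, the overlap ratio, and all function names (`knn_indices`, `project`, `knn_preservation`) are assumptions made here for illustration, and the toy data is invented.

```python
import math


def knn_indices(points, k):
    """For each point, return the set of indices of its k nearest neighbours.

    Assumption: plain Euclidean distance; ties are broken by lower index.
    """
    neighbours = []
    for i, p in enumerate(points):
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        neighbours.append({j for _, j in dists[:k]})
    return neighbours


def project(points, selected):
    """Keep only the feature columns listed in `selected`."""
    return [[p[f] for f in selected] for p in points]


def knn_preservation(points, selected, k):
    """Average fraction of each point's k nearest neighbours that survive
    when only the `selected` features are kept (1.0 = fully preserved)."""
    before = knn_indices(points, k)
    after = knn_indices(project(points, selected), k)
    shared = sum(len(b & a) for b, a in zip(before, after))
    return shared / (k * len(points))


# Invented toy data: four objects described by three features.
points = [
    [0.0, 0.0, 5.0],
    [1.0, 0.1, 1.0],
    [2.5, 0.0, 9.0],
    [10.0, 0.2, 2.0],
]

print(knn_preservation(points, [0, 1, 2], k=1))  # all features kept
print(knn_preservation(points, [0], k=1))        # only the first feature kept
```

A score near 1.0 suggests the reduced feature set keeps the same neighborhoods as the full set; lower scores indicate that relationships between objects were distorted by the selection. Handling variable-length series or expert weighting, as the abstract mentions, would require a different distance function than the one assumed here.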