Relationship preserving feature selection for unlabelled clinical trials time-series

Authors:
Fatih Altiparmak;Michael Gibas;Hakan Ferhatosmanoglu
Affiliations:
ASELSAN A.S. Radar, EW, Turkey;The Ohio State University, Columbus, OH;The Ohio State University, Columbus, OH
Venue:
Proceedings of the First ACM International Conference on Bioinformatics and Computational Biology
Year:
2010

Citing 7
Cited 1

Wrappers for feature subset selection
Artificial Intelligence - Special issue on relevance

Making every bit count: fast nonlinear axis scaling
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining

Minimum Redundancy Feature Selection from Microarray Gene Expression Data
CSB '03 Proceedings of the IEEE Computer Society Conference on Bioinformatics

Feature Selection for Unsupervised Learning
The Journal of Machine Learning Research

Feature Subset Selection and Feature Ranking for Multivariate Time Series
IEEE Transactions on Knowledge and Data Engineering

A Multi-metric Similarity Based Analysis of Microarray Data
BIBM '07 Proceedings of the 2007 IEEE International Conference on Bioinformatics and Biomedicine

Information mining over heterogeneous and high-dimensional time-series data in clinical trials databases
IEEE Transactions on Information Technology in Biomedicine

Win percentage: a novel measure for assessing the suitability of machine classifiers for biological problems
Proceedings of the 2nd ACM Conference on Bioinformatics, Computational Biology and Biomedicine

Abstract

Feature selection has been widely studied in supervised data mining applications, where the typical goal is to create clusters through the selection of a reduced attribute set that maximizes classification accuracy. Such a goal may not be appropriate for preserving the inter-attribute relationships of unlabelled time-series, as is the case with clinical trials data. In this paper, we select features based on the time-series relationships of attributes, which we measure through their inter-attribute movement. We present performance measures and methods for feature selection over unlabelled time-series with the aim of preserving inter-attribute relationships. The performance metrics estimate the effectiveness of a given feature set with respect to representation quality by comparing the nearest neighbors before and after feature selection. We provide techniques to combine and compare data from non-standard, variable-length time-series sources, and a mechanism to inject expert opinion into the feature selection process. The methodologies and comparative results are presented in the context of a real pharmaceutical database application.
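The nearest-neighbor idea in the abstract can be illustrated with a minimal sketch. It is not the paper's metric: the function names `knn_indices` and `neighbor_preservation`, the use of Euclidean distance, the fixed neighborhood size `k`, and the overlap-fraction score are all assumptions made here for illustration. The sketch treats each row of a matrix as one object and each column as one candidate feature, then reports how much of each object's neighborhood survives when only the selected columns are kept.

```python
import numpy as np


def knn_indices(X, k):
    """Return, for each row of X, the indices of its k nearest other rows.

    Euclidean distance is an assumption of this sketch, not the paper's choice.
    """
    X = np.asarray(X, dtype=float)
    diff = X[:, None, :] - X[None, :, :]      # pairwise differences via broadcasting
    dist = np.sqrt((diff ** 2).sum(axis=-1))  # full pairwise distance matrix
    np.fill_diagonal(dist, np.inf)            # a row is never its own neighbor
    return np.argsort(dist, axis=1, kind="stable")[:, :k]


def neighbor_preservation(X, selected, k=3):
    """Mean fraction of each row's k nearest neighbors kept after selection.

    `selected` is a list of column indices (the hypothetical reduced feature set).
    A score of 1.0 means every neighborhood is unchanged; 0.0 means none overlap.
    """
    X = np.asarray(X, dtype=float)
    full = knn_indices(X, k)                  # neighbors using all features
    reduced = knn_indices(X[:, selected], k)  # neighbors using the subset only
    overlaps = [len(set(f) & set(r)) / k for f, r in zip(full, reduced)]
    return float(np.mean(overlaps))


# Example call on synthetic data (values are random, so no output is claimed here).
rng = np.random.default_rng(0)
data = rng.normal(size=(20, 6))
score = neighbor_preservation(data, selected=[0, 2, 5], k=3)
```

A score like this needs no class labels, which is why a neighborhood-based measure suits unlabelled data. Handling variable-length series or expert-supplied weights, both mentioned in the abstract, would require a different distance function and is outside this sketch.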