Searching for structures in complex bio-molecular data is a central issue in several branches of bioinformatics. In particular, the reliability of clusters discovered by a given clustering algorithm has recently been assessed through methods based on the concept of stability with respect to random perturbations of the data. In this context, a major problem is assessing the confidence of these reliability measures. We discuss a partially "distribution independent" method, based on the classical Bernstein inequality, for assessing the statistical significance of the discovered clusterings. Experimental results with gene expression data show the effectiveness of the proposed approach.
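
As a rough illustration of the kind of bound involved, the classical Bernstein inequality for the mean of n independent random variables with common mean mu, variance at most sigma^2, and deviations from the mean bounded by M states that P(mean - mu >= delta) <= exp(-n delta^2 / (2 sigma^2 + 2 M delta / 3)). The sketch below evaluates this textbook form only. The function name `bernstein_tail_bound`, its parameters, and the choice M = 1 (which assumes stability scores lying in [0, 1]) are illustrative assumptions, not the procedure used in the paper.

```python
import math


def bernstein_tail_bound(n, delta, variance, bound=1.0):
    """Upper bound on P(sample mean - true mean >= delta).

    Textbook Bernstein inequality for n independent variables whose
    deviations from the mean are at most `bound` in absolute value and
    whose variance is at most `variance`. All names are hypothetical.
    """
    if n <= 0:
        raise ValueError("n must be a positive number of samples")
    if variance < 0 or bound <= 0:
        raise ValueError("variance must be >= 0 and bound must be > 0")
    if delta <= 0:
        # The inequality gives no information for non-positive deviations.
        return 1.0
    exponent = -n * delta ** 2 / (2.0 * variance + 2.0 * bound * delta / 3.0)
    return math.exp(exponent)


if __name__ == "__main__":
    # Hypothetical numbers: 100 perturbed clusterings, an observed gap of
    # 0.1 between two mean stability scores, variance at most 0.01.
    print(bernstein_tail_bound(n=100, delta=0.1, variance=0.01))
```

Because the bound depends only on the range and variance of the scores, it requires no assumption about the shape of their distribution, which is one way to read the phrase "distribution independent" above.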