Microarrays are one of the latest breakthroughs in experimental molecular biology. Thousands of research groups generate tens of thousands of microarray gene expression profiles from different tissues, species, and conditions. Combining such a vast number of microarray data sets is an important yet challenging problem. In this paper, we introduce a “correlation signature” method that allows the coherent interpretation and integration of microarray data across disparate sources. The proposed algorithm first builds, for each gene (row) in a table, a correlation signature that captures the system-wide dependencies between that gene and the other genes in the table, and then compares the signatures across tables for further analysis. We validate our framework with an experimental study using real microarray data sets; the results suggest that such an approach can be a viable solution to the microarray data integration and analysis problems.
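The two steps described above (build a per-gene signature within each table, then compare signatures across tables) can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes, purely for illustration, that a signature is the vector of Pearson correlations between one gene and every other gene in the same table, and that two signatures are compared by the Pearson correlation between them. The function names `correlation_signatures` and `signature_similarity` and the synthetic tables are hypothetical.

```python
import numpy as np


def correlation_signatures(table):
    """Return a genes x genes matrix; row i is gene i's signature.

    Assumption: the signature is the Pearson correlation of gene i
    with every gene (row) in the same table.
    """
    return np.corrcoef(table)


def signature_similarity(sig_a, sig_b):
    """Compare signatures of the same genes taken from two tables.

    Assumption: similarity is the Pearson correlation between the two
    signature vectors, skipping each gene's self-correlation entry.
    """
    n = sig_a.shape[0]
    sims = np.empty(n)
    for i in range(n):
        others = np.arange(n) != i  # drop the trivial self-correlation
        sims[i] = np.corrcoef(sig_a[i, others], sig_b[i, others])[0, 1]
    return sims


# Synthetic data: 5 genes measured on 20 samples (hypothetical values).
rng = np.random.default_rng(0)
table_a = rng.normal(size=(5, 20))

# A second "source" reporting the same genes on a different scale and
# offset per gene, mimicking incompatible measurement units.
scale = np.array([[2.0], [0.5], [3.0], [1.5], [4.0]])
offset = np.array([[10.0], [-3.0], [0.0], [7.0], [100.0]])
table_b = scale * table_a + offset

sig_a = correlation_signatures(table_a)
sig_b = correlation_signatures(table_b)
sims = signature_similarity(sig_a, sig_b)
print(sims)
```

Because Pearson correlation is unchanged by a positive rescaling and shift of each row, the raw values in the two tables differ widely while the signatures agree, which is the kind of cross-source comparability the abstract is after. Real data sets would differ in samples and noise, so similarities would fall below 1.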