Integrating heterogeneous microarray data sources using correlation signatures

Authors:
Jaewoo Kang;Jiong Yang;Wanhong Xu;Pankaj Chopra
Affiliations:
NC State University, Raleigh, NC;Case Western Reserve University, Cleveland, OH;NC State University, Raleigh, NC;NC State University, Raleigh, NC
Venue:
DILS'05 Proceedings of the Second international conference on Data Integration in the Life Sciences
Year:
2005



Abstract

Microarrays are one of the latest breakthroughs in experimental molecular biology. Thousands of research groups generate tens of thousands of microarray gene expression profiles from different tissues, species, and conditions. Combining such a vast number of microarray data sets is an important yet challenging problem. In this paper, we introduce a “correlation signature” method that allows the coherent interpretation and integration of microarray data across disparate sources. The proposed algorithm first builds, for each gene (row) in a table, a correlation signature that captures the system-wide dependencies between that gene and the other genes in the table, and then compares the signatures across tables for further analysis. We validate our framework with an experimental study on real microarray data sets, the results of which suggest that the approach can be a viable solution to the microarray data integration and analysis problems.
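
The two steps described in the abstract (build a per-gene signature within each table, then compare signatures across tables) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that both tables contain the same genes in the same row order, that a signature is simply a gene's vector of Pearson correlations with the other genes in its own table, and that signatures are compared by Pearson correlation. The function names `correlation_signatures` and `signature_similarity` are hypothetical.

```python
import numpy as np


def correlation_signatures(table):
    """Return an (n_genes, n_genes) matrix whose row i is the signature of
    gene i: its Pearson correlation with every gene (row) in the same table.
    Assumption: rows are genes, columns are experimental conditions."""
    return np.corrcoef(table)


def signature_similarity(sig_a, sig_b):
    """Compare matching genes across two tables by the Pearson correlation
    of their signatures, ignoring each gene's self-correlation entry.
    Assumption: both tables list the same genes in the same row order."""
    n = sig_a.shape[0]
    off_diag = ~np.eye(n, dtype=bool)
    # Boolean indexing flattens row by row, so reshaping restores one row
    # of n - 1 off-diagonal correlations per gene.
    a = sig_a[off_diag].reshape(n, n - 1)
    b = sig_b[off_diag].reshape(n, n - 1)
    a = a - a.mean(axis=1, keepdims=True)
    b = b - b.mean(axis=1, keepdims=True)
    norms = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    return (a * b).sum(axis=1) / norms


# Illustrative synthetic data only: two "sources" measuring the same 5 genes
# under different numbers of conditions and on different scales.
rng = np.random.default_rng(0)
table_a = rng.normal(size=(5, 8))
table_b = 3.0 * rng.normal(size=(5, 12)) + 10.0

similarity = signature_similarity(
    correlation_signatures(table_a),
    correlation_signatures(table_b),
)
print(similarity)  # one score in [-1, 1] per gene
```

Because a signature is built only from correlations within a single table, the two tables may differ in their number of conditions and in measurement scale; under the assumptions above, only the gene list has to line up.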