Multidimensional partitioning and bi-partitioning: analysis and application to gene expression data sets

Authors:
Gabriela Kalna;J. Keith Vass;Desmond J. Higham
Affiliations:
Department of Mathematics, University of Strathclyde, Glasgow, UK;The Beatson Institute for Cancer Research, Glasgow, UK;Department of Mathematics, University of Strathclyde, Glasgow, UK
Venue:
International Journal of Computer Mathematics - Recent Advances in Computational and Applied Mathematics in Science and Engineering
Year:
2008

Abstract

Eigenvectors and, more generally, singular vectors have proved to be useful tools for data mining and dimension reduction. Spectral clustering and reordering algorithms have been designed and implemented in many disciplines, and they can be motivated from several different standpoints. Here we give a general, unified derivation from an applied linear algebra perspective. We use a variational approach that has the benefit of (a) naturally introducing an appropriate scaling, (b) allowing for a solution in any desired dimension, and (c) dealing with both the clustering and bi-clustering issues in the same framework. The motivation and analysis are then backed up with examples involving two large data sets from modern, high-throughput, experimental cell biology. Here, the objects of interest are genes and tissue samples, and the experimental data represent gene activity. We show that looking beyond the dominant, or Fiedler, direction reveals important information.
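As a rough illustration of the kind of computation the abstract describes, the sketch below bi-partitions the rows and columns of a small nonnegative matrix using singular vectors of a scaled copy of the data. It is not taken from the paper: the scaling by inverse square roots of row and column sums is an assumption (one common choice in spectral bi-clustering), the function name `spectral_bipartition` is hypothetical, and the toy matrix is invented purely for demonstration.

```python
import numpy as np


def spectral_bipartition(W, k=1):
    """Return k-dimensional spectral coordinates for the rows and columns of W.

    W is a nonnegative (rows x columns) matrix, e.g. genes x samples.
    ASSUMPTION: the scaling D_row^{-1/2} W D_col^{-1/2} is used here; the
    paper derives its own scaling, which may differ in detail.
    """
    W = np.asarray(W, dtype=float)
    d_row = W.sum(axis=1)  # row sums
    d_col = W.sum(axis=0)  # column sums
    if np.any(d_row <= 0) or np.any(d_col <= 0):
        raise ValueError("every row and column must have a positive sum")

    r = 1.0 / np.sqrt(d_row)
    c = 1.0 / np.sqrt(d_col)
    W_scaled = r[:, None] * W * c[None, :]

    U, s, Vt = np.linalg.svd(W_scaled, full_matrices=False)

    # The leading singular pair is trivial under this scaling (singular
    # value 1), so the next k pairs carry the partitioning information.
    row_coords = r[:, None] * U[:, 1:k + 1]
    col_coords = c[:, None] * Vt[1:k + 1, :].T
    return row_coords, col_coords, s


# Toy data (invented): two strongly coupled blocks with weak cross-links.
W = np.array([
    [5.0, 4.0, 0.1, 0.1],
    [4.0, 5.0, 0.1, 0.1],
    [0.1, 0.1, 5.0, 4.0],
    [0.1, 0.1, 4.0, 5.0],
    [0.1, 0.1, 4.0, 4.0],
])

row_coords, col_coords, s = spectral_bipartition(W, k=2)

# Bi-partition by the sign of the first non-trivial (Fiedler-like) direction.
row_labels = (row_coords[:, 0] > 0).astype(int)
col_labels = (col_coords[:, 0] > 0).astype(int)
print("row labels:", row_labels)
print("column labels:", col_labels)
```

Rows and columns that receive the same label form one bi-cluster. Which group is labelled 0 or 1 depends on the arbitrary sign of the singular vectors. Setting `k` larger than 1 returns further directions beyond the first non-trivial one, which is the setting in which the abstract reports that additional information appears.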