Spectral clustering and its use in bioinformatics

Authors:
Desmond J. Higham; Gabriela Kalna; Milla Kibble
Affiliations:
Department of Mathematics, University of Strathclyde, Glasgow, G1 1XH Scotland, UK; Department of Mathematics, University of Strathclyde, Glasgow, G1 1XH Scotland, UK; Department of Mathematics, University of Turku, FIN-20014 Turku, Finland
Venue:
Journal of Computational and Applied Mathematics
Year:
2007

Abstract

We formulate a discrete optimization problem that leads to a simple and informative derivation of a widely used class of spectral clustering algorithms. Regarding the algorithms as attempting to bi-partition a weighted graph with N vertices, our derivation indicates that they are inherently tuned to tolerate all partitions into two non-empty sets, independently of the cardinality of the two sets. This approach also helps to explain the difference in behaviour observed between methods based on the unnormalized and normalized graph Laplacian. We also give a direct explanation of why Laplacian eigenvectors beyond the Fiedler vector may contain fine-detail information of relevance to clustering. We show numerical results on synthetic data to support the analysis. Further, we provide examples where normalized and unnormalized spectral clustering are applied to microarray data; here the graph summarizes similarity of gene activity across different tissue samples, and accurate clustering of samples is a key task in bioinformatics.
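
The bi-partitioning idea in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the function name `fiedler_bipartition`, the six-vertex weight matrix and the sign-based split are all assumptions chosen for illustration. The sketch builds the graph Laplacian of a small weighted graph, takes the eigenvector of the second-smallest eigenvalue (the Fiedler vector) and splits the vertices by sign. The normalized variant is assumed here to mean the generalized problem L x = lambda D x, with D the diagonal matrix of weighted degrees.

```python
import numpy as np


def fiedler_bipartition(W, normalized=False):
    """Split the vertices of a weighted graph by the sign of the Fiedler vector.

    W is assumed to be a symmetric, non-negative weight matrix with zero
    diagonal, describing a connected graph (so every degree is positive).
    Returns (labels, v): a boolean label per vertex and the Fiedler vector.
    """
    W = np.asarray(W, dtype=float)
    d = W.sum(axis=1)              # weighted degree of each vertex
    L = np.diag(d) - W             # unnormalized graph Laplacian
    if normalized:
        # Solve L x = lambda D x through the symmetric matrix
        # D^{-1/2} L D^{-1/2}, then map back with x = D^{-1/2} y.
        s = 1.0 / np.sqrt(d)
        _, vecs = np.linalg.eigh(s[:, None] * L * s[None, :])
        v = s * vecs[:, 1]
    else:
        _, vecs = np.linalg.eigh(L)
        v = vecs[:, 1]             # eigh sorts eigenvalues in ascending order
    return v >= 0, v


# Hypothetical example: two triangles (vertices 0-2 and 3-5) joined by one weak edge.
W = np.zeros((6, 6))
for i, j, w in [(0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0),
                (3, 4, 1.0), (3, 5, 1.0), (4, 5, 1.0),
                (2, 3, 0.1)]:
    W[i, j] = W[j, i] = w

labels_unnorm, v_unnorm = fiedler_bipartition(W)
labels_norm, v_norm = fiedler_bipartition(W, normalized=True)
print(labels_unnorm.tolist())
print(labels_norm.tolist())
```

On this toy graph both variants separate the two triangles. Which side is labelled `True` depends on the arbitrary sign of the eigenvector returned by the solver, so only the grouping is meaningful. Splitting at zero is one simple choice; other thresholds, such as the median entry, are also used in practice.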