δ-Clusters: Capturing Subspace Correlation in a Large Data Set

Venue: ICDE '02 Proceedings of the 18th International Conference on Data Engineering
Year: 2002



Abstract

Clustering has been an active research area of great practical importance in recent years. Most previous clustering models have focused on grouping objects with similar values on a (sub)set of dimensions (e.g., subspace cluster) and assumed that every object has an associated value on every dimension (e.g., bicluster). These existing cluster models may not always be adequate for capturing the coherence exhibited among objects. Strong coherence may still exist among a set of objects (on a subset of attributes) even if they take quite different values on each attribute and the attribute values are not fully specified. This is very common in many applications, including bioinformatics analysis and collaborative filtering analysis, where the data may be incomplete and subject to biases. In bioinformatics, a bicluster model has recently been proposed to capture coherence among a subset of the attributes. Here, we introduce a more general model, referred to as the delta-cluster model, to capture coherence exhibited by a subset of objects on a subset of attributes, while allowing absent attribute values. A move-based algorithm (FLOC) is devised to efficiently produce near-optimal clustering results. The delta-cluster model takes the bicluster model as a special case, where the FLOC algorithm performs far better than the bicluster algorithm. We demonstrate the correctness and efficiency of the delta-cluster model and the FLOC algorithm on a number of real and synthetic data sets.
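
The abstract does not state how coherence is measured, so the following is only an illustrative sketch. It assumes the mean squared residue score familiar from the bicluster literature, with every mean taken over the specified entries only, so that absent attribute values are skipped. The function name `mean_squared_residue`, the use of `None` for an absent value, and the toy matrix are all hypothetical choices made for this example; this is not the paper's definition and not the FLOC algorithm.

```python
from typing import Optional, Sequence

Matrix = Sequence[Sequence[Optional[float]]]


def mean_squared_residue(data: Matrix, rows: Sequence[int], cols: Sequence[int]) -> float:
    """Score the submatrix data[rows][cols]; lower means more coherent.

    Assumed scoring (not taken from the abstract): the residue of an entry is
    value - row_mean - col_mean + overall_mean, and the score is the average
    squared residue. Entries equal to None are treated as absent and skipped.
    """
    present = [(i, j) for i in rows for j in cols if data[i][j] is not None]
    if not present:
        raise ValueError("submatrix has no specified entries")

    # Collect the specified values of each selected row and column.
    row_vals = {i: [] for i in rows}
    col_vals = {j: [] for j in cols}
    for i, j in present:
        row_vals[i].append(data[i][j])
        col_vals[j].append(data[i][j])

    row_mean = {i: sum(v) / len(v) for i, v in row_vals.items() if v}
    col_mean = {j: sum(v) / len(v) for j, v in col_vals.items() if v}
    all_mean = sum(data[i][j] for i, j in present) / len(present)

    total = 0.0
    for i, j in present:
        residue = data[i][j] - row_mean[i] - col_mean[j] + all_mean
        total += residue * residue
    return total / len(present)


# Toy data: rows 0-2 follow the same pattern on columns 0-2 but are shifted
# by a per-row offset (different values, same trend). Row 3 does not follow
# the pattern, and one value in row 2 is absent.
data = [
    [1.0, 2.0, 3.0, 9.0],
    [3.0, 4.0, 5.0, 1.0],
    [6.0, 7.0, 8.0, None],
    [5.0, 1.0, 9.0, 2.0],
]

coherent = mean_squared_residue(data, rows=[0, 1, 2], cols=[0, 1, 2])
with_outlier_row = mean_squared_residue(data, rows=[0, 1, 2, 3], cols=[0, 1, 2])
with_missing = mean_squared_residue(data, rows=[0, 1, 2], cols=[0, 1, 2, 3])

print(coherent)          # approximately zero: the shifted rows are coherent
print(with_outlier_row)  # 1.625
print(with_missing)      # positive; the None entry is skipped, not imputed
```

In this sketch, rows whose values differ only by a constant offset score near zero even though their raw values are far apart, which is the kind of coherence the abstract describes. A move-based search could, under the same assumption, repeatedly add or remove a row or column whenever doing so lowers this score.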