OP-Cluster: Clustering by Tendency in High Dimensional Space

Authors:
Jinze Liu; Wei Wang
Venue:
ICDM '03 Proceedings of the Third IEEE International Conference on Data Mining
Year:
2003


Abstract

Clustering is the process of grouping a set of objects into classes of similar objects. Because the hidden patterns in a data set are not known in advance, the definition of similarity is very subtle. Until recently, similarity measures were typically based on distances, e.g., Euclidean distance and cosine distance. In this paper, we propose a flexible yet powerful clustering model, namely OP-Cluster (Order Preserving Cluster). Under this new model, two objects are similar on a subset of dimensions if the values of these two objects induce the same relative order of those dimensions. Such a cluster might arise when the expression levels of (co-regulated) genes rise or fall synchronously in response to a sequence of environmental stimuli. Hence, the discovery of OP-Clusters is essential in revealing significant gene regulatory networks. A deterministic algorithm is designed and implemented to discover all the significant OP-Clusters. A set of extensive experiments has been done on several real biological data sets to demonstrate its effectiveness and efficiency in detecting co-regulated patterns.
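
The similarity notion in the abstract (two objects are similar on a subset of dimensions when their values induce the same relative order of those dimensions) can be illustrated with a minimal Python sketch. This is not the paper's algorithm; it only checks the order-preserving condition on a given set of objects and dimensions. The function names `induced_order` and `is_order_preserving`, the toy expression values, and the rule of breaking ties by dimension index are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def induced_order(values: Sequence[float], dims: Sequence[int]) -> Tuple[int, ...]:
    """Return the dimensions in `dims` sorted by the object's value on each.

    Ties are broken by dimension index; this is a simplifying assumption
    of the sketch, not something stated in the abstract.
    """
    return tuple(sorted(dims, key=lambda d: (values[d], d)))


def is_order_preserving(objects: Sequence[Sequence[float]], dims: Sequence[int]) -> bool:
    """True if every object induces the same relative order on `dims`."""
    orders = {induced_order(obj, dims) for obj in objects}
    return len(orders) <= 1


# Hypothetical expression levels of three genes under four conditions.
g1 = [10.0, 50.0, 30.0, 70.0]
g2 = [1.0, 9.0, 4.0, 12.0]
g3 = [5.0, 2.0, 8.0, 3.0]

all_dims = [0, 1, 2, 3]
print(induced_order(g1, all_dims))                  # (0, 2, 1, 3)
print(is_order_preserving([g1, g2], all_dims))      # True
print(is_order_preserving([g1, g2, g3], all_dims))  # False
print(is_order_preserving([g1, g2, g3], [0, 2]))    # True
```

Here `g1` and `g2` differ greatly in magnitude, so a distance-based measure would separate them, yet they rise and fall together across all four conditions. `g3` joins them only on the subset of dimensions `[0, 2]`, which is why the model is defined over subsets of dimensions rather than the full space.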