Mining frequent patterns without candidate generation
SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Algorithms for association rule mining — a general survey and comparison
ACM SIGKDD Explorations Newsletter
Clustering by pattern similarity in large data sets
Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Discovering local structure in gene expression data: the order-preserving submatrix problem
Proceedings of the sixth annual international conference on Computational biology
ICDE '95 Proceedings of the Eleventh International Conference on Data Engineering
Biclustering of Expression Data
Proceedings of the Eighth International Conference on Intelligent Systems for Molecular Biology
Fast Algorithms for Mining Association Rules in Large Databases
VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
d-Clusters: Capturing Subspace Correlation in a Large Data Set
ICDE '02 Proceedings of the 18th International Conference on Data Engineering
Biclustering in Gene Expression Data by Tendency
CSB '04 Proceedings of the 2004 IEEE Computational Systems Bioinformatics Conference
Biclustering Algorithms for Biological Data Analysis: A Survey
IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
Mining Sequential Patterns by Pattern-Growth: The PrefixSpan Approach
IEEE Transactions on Knowledge and Data Engineering
Order preserving clustering over multiple time course experiments
EC'05 Proceedings of the 3rd European conference on Applications of Evolutionary Computing
Hi-index | 0.00 |
This paper concerns the discovery of Order Preserving Clusters (OP-Clusters) in gene expression data, in each of which a subset of genes induce a similar linear ordering along a subset of conditions. After converting each gene vector into an ordered label sequence. The problem is transferred into finding frequent orders appearing in the sequence set. We propose an algorithm of finding the frequent orders by iteratively Combining the most Frequent Prefixes and Suffixes (CFPS) in a statistical way. We also define the significance of an OP-Cluster. Our method has good scale-up property with dimension of the dataset and size of the cluster. Experimental study on both synthetic datasets and real gene expression dataset shows our approach is very effective and efficient.