Although most biclustering formulations are NP-hard, in time series expression data analysis it is reasonable to restrict the problem to the identification of maximal biclusters with contiguous columns, which correspond to coherent expression patterns shared by a group of genes over consecutive time points. This restriction leads to a tractable problem. We propose CCC-Biclustering, an algorithm that finds and reports all maximal contiguous column coherent (CCC) biclusters in time linear in the size of the expression matrix. The linear time complexity of CCC-Biclustering relies on the use of a discretized matrix and efficient string processing techniques based on suffix trees. We also propose a method for ranking biclusters by their statistical significance and a methodology for filtering highly overlapping and, therefore, redundant biclusters. We report results on synthetic and real data showing the effectiveness of the approach and its relevance to the discovery of regulatory modules. Results obtained using the transcriptomic expression patterns occurring in Saccharomyces cerevisiae in response to heat stress show not only the ability of the proposed methodology to extract relevant information compatible with documented biological knowledge but also the utility of this algorithm in the study of other environmental stresses and of regulatory modules in general.
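To make the notion of a maximal bicluster with contiguous columns concrete, the following is a minimal sketch and not the CCC-Biclustering algorithm itself. It replaces the suffix-tree machinery with brute-force enumeration of column ranges, so it is not linear in the size of the matrix. The three-symbol alphabet (`U`, `N`, `D`), the `threshold` parameter, and the function names `discretize` and `ccc_biclusters` are assumptions made for illustration only.

```python
from collections import defaultdict


def discretize(matrix, threshold=1.0):
    """Map each expression value to a symbol (assumed alphabet, for illustration):
    'U' above +threshold, 'D' below -threshold, 'N' otherwise."""
    def symbol(value):
        if value > threshold:
            return "U"
        if value < -threshold:
            return "D"
        return "N"

    return ["".join(symbol(v) for v in row) for row in matrix]


def ccc_biclusters(rows, min_rows=2):
    """Brute-force search for maximal biclusters with contiguous columns.

    `rows` is a list of equal-length strings, one per gene.
    Returns tuples (row_indices, first_col, last_col, pattern).
    """
    n_cols = len(rows[0]) if rows else 0
    found = []
    for start in range(n_cols):
        for end in range(start, n_cols):
            # Grouping by substring yields every row sharing the pattern,
            # so each group is already maximal with respect to rows.
            groups = defaultdict(list)
            for i, row in enumerate(rows):
                groups[row[start:end + 1]].append(i)
            for pattern, members in groups.items():
                if len(members) < min_rows:
                    continue
                # Not maximal if all members agree on the column to the left.
                if start > 0 and len({rows[i][start - 1] for i in members}) == 1:
                    continue
                # Not maximal if all members agree on the column to the right.
                if end < n_cols - 1 and len({rows[i][end + 1] for i in members}) == 1:
                    continue
                found.append((members, start, end, pattern))
    return found


if __name__ == "__main__":
    expression = [
        [2.0, -2.0, 0.0],
        [1.5, -1.5, 1.5],
        [0.2, -3.0, 0.1],
    ]
    for bicluster in ccc_biclusters(discretize(expression)):
        print(bicluster)
```

In this sketch each gene becomes a string over the discretization alphabet, and a bicluster is a set of genes sharing the same substring at the same column positions. A suffix-tree approach, as described above, would avoid re-scanning every column range.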