Information retrieval: data structures and algorithms
Information retrieval: data structures and algorithms
Algorithms on strings, trees, and sequences: computer science and computational biology
Algorithms on strings, trees, and sequences: computer science and computational biology
New techniques for the union-find problem
SODA '90 Proceedings of the first annual ACM-SIAM symposium on Discrete algorithms
A Space-Economical Suffix Tree Construction Algorithm
Journal of the ACM (JACM)
Supporting electronic ink databases
Information Systems
On effective multi-dimensional indexing for strings
SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Finding motifs using random projections
RECOMB '01 Proceedings of the fifth annual international conference on Computational biology
Efficient Similarity Search In Sequence Databases
FODO '93 Proceedings of the 4th International Conference on Foundations of Data Organization and Algorithms
Combinatorial Approaches to Finding Subtle Signals in DNA Sequences
Proceedings of the Eighth International Conference on Intelligent Systems for Molecular Biology
Fast Algorithms for Mining Association Rules in Large Databases
VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
An Algorithm for Approximate Tandem Repeats
CPM '93 Proceedings of the 4th Annual Symposium on Combinatorial Pattern Matching
On effective classification of strings with wavelets
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
Clustering Algorithms and Validity Measures
SSDBM '01 Proceedings of the 13th International Conference on Scientific and Statistical Database Management
Coding and Information Theory
Hi-index | 0.00 |
Consensus patterns, like motifs and tandem repeats, are highly conserved patterns with very few substitutions where no gaps are allowed. In this paper, we present a progressive hierarchical clustering technique for discovering consensus patterns in biological databases over a certain length range. This technique can discover consensus patterns with various requirements by applying a post-processing phase. The progressive nature of the hierarchical clustering algorithm makes it scalable and efficient. Experiments to discover motifs and tandem repeats on real biological databases show significant performance gain over non-progressive clustering techniques.