A clustering algorithm is introduced that combines the strengths of clustering and motif-finding techniques. As in motif-finding algorithms, clusters are identified from unambiguously defined sequence sections. The definition of similarity within clusters allows transitive matches and thereby enables the discovery of remote homologies that motif-finding algorithms cannot find. Directed acyclic graph (DAG) structures are constructed that link short clusters to longer ones. We compare the clustering results to the corresponding domains in the InterPro database. A second comparison shows that annotations based on our domains are inherently more consistent than those based on InterPro domains.
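The idea of transitive matches can be illustrated with a minimal sketch. It is not the algorithm from the paper. It assumes, purely for illustration, that sequence sections are equal-length strings, that two sections match directly when they differ in at most `max_mismatches` positions, and that clusters are the connected components of that match relation, computed here with union-find. The names `hamming`, `transitive_clusters`, and `max_mismatches` are hypothetical.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def transitive_clusters(segments, max_mismatches=1):
    """Group segments into connected components of the direct-match relation.

    Assumption for illustration: a direct match means equal length and at
    most `max_mismatches` differing positions. Two segments share a cluster
    if a chain of direct matches connects them.
    """
    parent = list(range(len(segments)))

    def find(i):
        # Follow parent links to the root, halving the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(segments)), 2):
        same_length = len(segments[i]) == len(segments[j])
        if same_length and hamming(segments[i], segments[j]) <= max_mismatches:
            parent[find(i)] = find(j)  # merge the two components

    groups = {}
    for i, seg in enumerate(segments):
        groups.setdefault(find(i), []).append(seg)
    return sorted(sorted(group) for group in groups.values())


segments = ["ACDEF", "ACDEG", "ACHEG", "WWWWW"]
clusters = transitive_clusters(segments)
print(clusters)
```

In this toy input, `ACDEF` and `ACHEG` differ in two positions and so do not match directly, yet they end up in the same cluster because each matches `ACDEG`. A pairwise comparison against a single motif would miss that link, which is the kind of remote relationship the abstract attributes to transitive matching.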