Unsupervised Learning of Multiple Motifs in Biopolymers Using Expectation Maximization
Machine Learning - Special issue on applications in molecular biology
Approximate nearest neighbors: towards removing the curse of dimensionality
STOC '98 Proceedings of the thirtieth annual ACM symposium on Theory of computing
Motif discovery without alignment or enumeration (extended abstract)
RECOMB '98 Proceedings of the second annual international conference on Computational molecular biology
An algorithm for finding novel gapped motifs in DNA sequences
RECOMB '98 Proceedings of the second annual international conference on Computational molecular biology
Algorithms for phylogenetic footprinting
RECOMB '01 Proceedings of the fifth annual international conference on Computational biology
Proceedings of the Seventh International Conference on Intelligent Systems for Molecular Biology
An Exact Algorithm to Identify Motifs in Orthologous Sequences from Multiple Species
Proceedings of the Eighth International Conference on Intelligent Systems for Molecular Biology
Combinatorial Approaches to Finding Subtle Signals in DNA Sequences
Proceedings of the Eighth International Conference on Intelligent Systems for Molecular Biology
A Statistical Method for Finding Transcription Factor Binding Sites
Proceedings of the Eighth International Conference on Intelligent Systems for Molecular Biology
Similarity Search in High Dimensions via Hashing
VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
Spelling Approximate Repeated or Common Motifs Using a Suffix Tree
LATIN '98 Proceedings of the Third Latin American Symposium on Theoretical Informatics
Provably sensitive Indexing strategies for biosequence similarity search
Proceedings of the sixth annual international conference on Computational biology
Finding motifs in the twilight zone
Proceedings of the sixth annual international conference on Computational biology
From promoter sequence to expression: a probabilistic framework
Proceedings of the sixth annual international conference on Computational biology
Modeling dependencies in protein-DNA binding sites
RECOMB '03 Proceedings of the seventh annual international conference on Research in computational molecular biology
A Simple Hyper-Geometric Approach for Discovering Putative Transcription Factor Binding Sites
WABI '01 Proceedings of the First International Workshop on Algorithms in Bioinformatics
Assessing the Statistical Significance of Overrepresented Oligonucleotides
WABI '01 Proceedings of the First International Workshop on Algorithms in Bioinformatics
Statistical Identification of Uniformly Mutated Segments within Repeats
CPM '02 Proceedings of the 13th Annual Symposium on Combinatorial Pattern Matching
Pattern discovery in sequences under a Markov assumption
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
Accelerating Approximate Subsequence Search on Large Protein Sequence Databases
CSB '02 Proceedings of the IEEE Computer Society Conference on Bioinformatics
A symbolic representation of time series, with implications for streaming algorithms
DMKD '03 Proceedings of the 8th ACM SIGMOD workshop on Research issues in data mining and knowledge discovery
Finding Higher Order Motifs under the Levenshtein Measure
CSB '03 Proceedings of the IEEE Computer Society Conference on Bioinformatics
Probabilistic discovery of time series motifs
Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
RECOMB '04 Proceedings of the eighth annual international conference on Research in computational molecular biology
A combined model and a varied Gibbs sampling algorithm used for motif discovery
APBC '04 Proceedings of the second conference on Asia-Pacific bioinformatics - Volume 29
Particle Swarm Optimisation for Protein Motif Discovery
Genetic Programming and Evolvable Machines
Locality-sensitive hashing scheme based on p-stable distributions
SCG '04 Proceedings of the twentieth annual symposium on Computational geometry
PRUNER: Algorithms for Finding Monad Patterns in DNA Sequences
CSB '04 Proceedings of the 2004 IEEE Computational Systems Bioinformatics Conference
Efficient algorithms for substring near neighbor problem
SODA '06 Proceedings of the seventeenth annual ACM-SIAM symposium on Discrete algorithms
NP-completeness for all computer science undergraduates: a novel project-based curriculum
Journal of Computing Sciences in Colleges
Algorithms for extracting motifs from biological weighted sequences
Journal of Discrete Algorithms
An Exact Data Mining Method for Finding Center Strings and All Their Instances
IEEE Transactions on Knowledge and Data Engineering
Detecting time series motifs under uniform scaling
Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining
Experiencing SAX: a novel symbolic representation of time series
Data Mining and Knowledge Discovery
Fast and Practical Algorithms for Planted (l, d) Motif Search
IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions
Communications of the ACM - 50th anniversary issue: 1958 - 2008
DNA Motif Representation with Nucleotide Dependency
IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
An efficient motif discovery algorithm with unknown motif length and number of binding sites
International Journal of Data Mining and Bioinformatics
Efficiently finding unusual shapes in large image databases
Data Mining and Knowledge Discovery
MOGAMOD: Multi-objective genetic algorithm for motif discovery
Expert Systems with Applications: An International Journal
Establishing relationships among patterns in stock market data
Data & Knowledge Engineering
Efficient discovery of unusual patterns in time series
New Generation Computing
Toward unsupervised activity discovery using multi-dimensional motif detection in time series
IJCAI'09 Proceedings of the 21st international joint conference on Artificial intelligence
A simple algorithm for (l, d) motif search
CIBCB'09 Proceedings of the 6th Annual IEEE conference on Computational Intelligence in Bioinformatics and Computational Biology
Efficient selection of unique and popular oligos for large EST databases
CPM'03 Proceedings of the 14th annual conference on Combinatorial pattern matching
Space and time efficient algorithms to discover endogenous RNAi patterns in complete genome data
ISBRA'07 Proceedings of the 3rd international conference on Bioinformatics research and applications
Protein sequence motif discovery on distributed supercomputer
GPC'08 Proceedings of the 3rd international conference on Advances in grid and pervasive computing
Approximate variable-length time series motif discovery using grammar inference
Proceedings of the Tenth International Workshop on Multimedia Data Mining
A parallel combinatorial algorithm for subtle motifs
International Journal of Bioinformatics Research and Applications
Generalised Sequence Signatures through symbolic clustering
International Journal of Data Mining and Bioinformatics
A frequent pattern mining method for finding planted (l, d)-motifs of unknown length
RSKT'10 Proceedings of the 5th international conference on Rough set and knowledge technology
Characterizing the fine structure of a neural sensory code through information distortion
Journal of Computational Neuroscience
Searching historical manuscripts for near-duplicate figures
Proceedings of the 2011 Workshop on Historical Document Imaging and Processing
Discovering consensus patterns in biological databases
VDMB'06 Proceedings of the First international conference on Data Mining and Bioinformatics
Sharper upper and lower bounds for an approximation scheme for consensus-pattern
CPM'05 Proceedings of the 16th annual conference on Combinatorial Pattern Matching
Assessing the significance of sets of words
CPM'05 Proceedings of the 16th annual conference on Combinatorial Pattern Matching
Flexible pattern discovery with (extended) disjunctive logic programming
ISMIS'05 Proceedings of the 15th international conference on Foundations of Intelligent Systems
Generalized planted (l,d)-motif problem with negative set
WABI'05 Proceedings of the 5th International conference on Algorithms in Bioinformatics
Integrating heterogeneous microarray data sources using correlation signatures
DILS'05 Proceedings of the Second international conference on Data Integration in the Life Sciences
Discovering time series motifs based on multidimensional index and early abandoning
ICCCI'12 Proceedings of the 4th international conference on Computational Collective Intelligence: technologies and applications - Volume Part I
Combining motif information and neural network for time series prediction
International Journal of Business Intelligence and Data Mining
LSH-based large scale chinese calligraphic character recognition
Proceedings of the 13th ACM/IEEE-CS joint conference on Digital libraries
Pevzner and Sze [23] considered a precise version of the motif discovery problem and simultaneously issued an algorithmic challenge: find a motif M of length 15, where each planted instance differs from M in 4 positions. Whereas previous algorithms all failed to solve this (15,4)-motif problem, Pevzner and Sze introduced algorithms that succeeded. However, their algorithms failed to solve the considerably more difficult (14,4)-, (16,5)-, and (18,6)-motif problems.

We introduce a novel motif discovery algorithm based on the use of random projections of the input's substrings. Experiments on simulated data demonstrate that this algorithm performs better than existing algorithms and, in particular, typically solves the difficult (14,4)-, (16,5)-, and (18,6)-motif problems quite efficiently. A probabilistic estimate shows that the small values of d for which the algorithm fails to recover the planted (l, d)-motif are in all likelihood inherently impossible to solve. We also present experimental results on realistic biological data by identifying ribosome binding sites in prokaryotes as well as a number of known transcriptional regulatory motifs in eukaryotes.
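
To make the planted (l, d)-motif setting and the idea of projecting substrings more concrete, the following Python sketch builds a synthetic instance and hashes every l-mer by the letters found at k randomly chosen positions. It illustrates only the bucketing idea, not the algorithm of the paper, which the abstract does not spell out. The function names, the projection size k, and the instance size (20 sequences of length 600) are assumptions chosen for illustration.

```python
import random
from collections import defaultdict

ALPHABET = "ACGT"


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def plant_motif(num_seqs, seq_len, l, d, rng):
    """Build a synthetic (l, d) instance (hypothetical helper).

    Each random background sequence receives one copy of a random
    motif of length l, mutated in exactly d positions.
    """
    motif = "".join(rng.choice(ALPHABET) for _ in range(l))
    seqs, starts = [], []
    for _ in range(num_seqs):
        instance = list(motif)
        for pos in rng.sample(range(l), d):
            # Substitute a letter that is guaranteed to differ.
            instance[pos] = rng.choice([c for c in ALPHABET if c != motif[pos]])
        background = [rng.choice(ALPHABET) for _ in range(seq_len)]
        start = rng.randrange(seq_len - l + 1)
        background[start:start + l] = instance
        seqs.append("".join(background))
        starts.append(start)
    return motif, seqs, starts


def project_into_buckets(seqs, l, k, rng):
    """Hash every l-mer by its letters at k random positions.

    l-mers that agree on the sampled positions share a bucket, so
    planted instances whose mutations all miss those positions
    collide with each other and with the motif's own projection.
    """
    positions = sorted(rng.sample(range(l), k))
    buckets = defaultdict(list)
    for i, s in enumerate(seqs):
        for j in range(len(s) - l + 1):
            key = "".join(s[j + p] for p in positions)
            buckets[key].append((i, j))
    return positions, buckets


if __name__ == "__main__":
    rng = random.Random(0)
    # Assumed instance size; the abstract fixes only l = 15 and d = 4.
    motif, seqs, starts = plant_motif(20, 600, 15, 4, rng)
    positions, buckets = project_into_buckets(seqs, 15, 7, rng)
    key, members = max(buckets.items(), key=lambda kv: len(kv[1]))
    print("sampled positions:", positions)
    print("largest bucket:", key, "with", len(members), "l-mers")
```

A single projection is unlikely to isolate the motif on its own. In a sketch like this one would repeat the projection with fresh random positions and treat unusually full buckets as candidates for further refinement.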