Stochastic models such as hidden Markov models (HMMs) or stochastic context-free grammars (SCFGs) can fail to return the correct, maximum-likelihood solution in the case of semantic ambiguity. This problem arises when the algorithm implementing the model inspects the same solution in different guises. It is a difficult problem in the sense that proving semantic nonambiguity has been shown to be algorithmically undecidable, while compensating for it (by coalescing scores of equivalent solutions) has been shown to be NP-hard. For SCFGs modeling RNA secondary structure, it has been shown that the distortion of results can be quite severe. Much less is known about the case where SCFGs model the matching of a query sequence to an implicit consensus structure for an RNA family. We find that three different, meaningful semantics can be associated with the matching of a query against the model: a structure semantics, an alignment semantics, and a trace semantics. Rfam models correctly implement the alignment semantics and are ambiguous with respect to the other two semantics, which are more abstract. We show how provably correct models can be generated for the trace semantics. For approaches where such a proof is not possible, we present an automated pipeline to check post factum for ambiguity of the generated models. We propose that both the structure and the trace semantics are worthwhile concepts for further study, possibly better suited to capture remotely related family members.
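To make the effect of semantic ambiguity concrete, the following is a minimal sketch on a toy HMM that is not taken from the paper. All names and probabilities (`LABEL`, `INIT`, `TRANS`, `EMIT`, the states `A`, `B1`, `B2`) are hypothetical. Two states carry the same label `b`, so several state paths represent the same annotation "in different guises". The sketch enumerates all paths by brute force rather than running Viterbi, which is only feasible for very short inputs, and compares the single most probable path with the most probable annotation obtained by coalescing the scores of equivalent paths.

```python
from collections import defaultdict
from itertools import product

# Hypothetical toy HMM: states B1 and B2 share the label "b",
# so distinct state paths can encode the same annotation.
LABEL = {"A": "a", "B1": "b", "B2": "b"}
INIT = {"A": 0.4, "B1": 0.3, "B2": 0.3}
TRANS = {
    "A": {"A": 1.0},
    "B1": {"B1": 0.5, "B2": 0.5},
    "B2": {"B1": 0.5, "B2": 0.5},
}
# Every state emits "x" or "y" with equal probability (an assumption
# chosen to keep the arithmetic easy to follow).
EMIT = {state: {"x": 0.5, "y": 0.5} for state in LABEL}


def path_probability(path, observed):
    """Joint probability of one state path and the observed symbols."""
    prob = INIT[path[0]] * EMIT[path[0]][observed[0]]
    for prev, cur, sym in zip(path, path[1:], observed[1:]):
        prob *= TRANS[prev].get(cur, 0.0) * EMIT[cur][sym]
    return prob


def best_path_and_annotation(observed):
    """Brute-force comparison of the best path and the best annotation."""
    path_scores = {}
    annotation_scores = defaultdict(float)
    for path in product(LABEL, repeat=len(observed)):
        p = path_probability(path, observed)
        path_scores[path] = p
        # Coalesce the scores of all paths that mean the same annotation.
        annotation_scores["".join(LABEL[s] for s in path)] += p
    best_path = max(path_scores, key=path_scores.get)
    best_annotation = max(annotation_scores, key=annotation_scores.get)
    return (best_path, path_scores[best_path],
            best_annotation, annotation_scores[best_annotation])


if __name__ == "__main__":
    path, p_path, annotation, p_annotation = best_path_and_annotation("xx")
    print("most probable path:      ", path, p_path)
    print("most probable annotation:", annotation, p_annotation)
```

In this toy model the single best path is `A, A`, which reads as annotation `aa`, while the annotation `bb` has the larger total probability because its mass is split across four paths. A path-maximizing algorithm would therefore report `aa` even though `bb` is the more likely meaning. Brute-force coalescing grows exponentially with input length, in line with the hardness result mentioned above.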