Optimizing Multiple Seeds for Protein Homology Search

Authors:
Daniel G. Brown
Venue:
IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
Year:
2005

Citing 9

A threshold of ln n for approximating set cover
Journal of the ACM (JACM)

Designing seeds for similarity search in genomic DNA
RECOMB '03 Proceedings of the seventh annual international conference on Research in computational molecular biology

Designing multiple simultaneous seeds for DNA similarity search
RECOMB '04 Proceedings of the eighth annual international conference on Research in computational molecular biology

Sensitivity analysis and efficient method for identifying optimal spaced seeds
Journal of Computer and System Sciences

On spaced seeds for similarity search
Discrete Applied Mathematics

Estimating Seed Sensitivity on Homogeneous Alignments
BIBE '04 Proceedings of the 4th IEEE Symposium on Bioinformatics and Bioengineering

Vector seeds: An extension to spaced seeds
Journal of Computer and System Sciences - Special issue on bioinformatics II

Good spaced seeds for homology search
Bioinformatics

tPatternHunter: gapped, fast and sensitive translated homology search
Bioinformatics

Cited 6

Computing Alignment Seed Sensitivity with Probabilistic Arithmetic Automata
WABI '08 Proceedings of the 8th international workshop on Algorithms in Bioinformatics

On Subset Seeds for Protein Alignment
IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)

New algorithms for the spaced seeds
FAW'07 Proceedings of the 1st annual international conference on Frontiers in algorithmics

Protein similarity search with subset seeds on a dedicated reconfigurable hardware
PPAM'07 Proceedings of the 7th international conference on Parallel processing and applied mathematics

Quality of algorithms for sequence comparison
PReMI'11 Proceedings of the 4th international conference on Pattern recognition and machine intelligence

A unifying framework for seed sensitivity and its application to subset seeds
WABI'05 Proceedings of the 5th International conference on Algorithms in Bioinformatics

Abstract

We present a framework for improving local protein alignment algorithms. Specifically, we discuss how to extend local protein aligners to use a collection of vector seeds or ungapped alignment seeds to reduce noise hits. We model the choice of a set of seed models as an integer programming problem and give algorithms for choosing such a set. Although the problem is NP-hard, and quasi-NP-hard to approximate to within a logarithmic factor, it can be solved easily in practice. A good set of seeds that we have chosen yields four to five times fewer false positive hits than BLASTP while preserving essentially the same sensitivity.
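The seed selection problem described in the abstract has the shape of a weighted set cover: each candidate seed hits some set of true alignments and carries a false positive cost, and the goal is to hit the alignments at low total cost. The sketch below is not the paper's integer programming method. It swaps in the standard greedy heuristic for weighted set cover, whose approximation guarantee is logarithmic, in line with the hardness bound the abstract mentions. The function name `greedy_seed_cover`, the seed labels, and the toy hit sets and costs are all hypothetical and chosen only for illustration.

```python
import math
from typing import Dict, List, Set, Tuple


def greedy_seed_cover(
    hits: Dict[str, Set[int]],
    fp_cost: Dict[str, float],
    num_alignments: int,
    target_fraction: float = 1.0,
) -> Tuple[List[str], Set[int]]:
    """Greedy weighted set cover (illustrative, not the paper's algorithm).

    hits[s]     -- indices of training alignments that seed s hits (assumed input)
    fp_cost[s]  -- false positive cost attributed to seed s (assumed input)
    Seeds are added until target_fraction of the alignments are hit,
    or until no remaining seed hits anything new.
    """
    # Number of alignments that must be hit; the small epsilon guards
    # against floating-point noise pushing ceil() up by one.
    needed = math.ceil(target_fraction * num_alignments - 1e-9)
    chosen: List[str] = []
    covered: Set[int] = set()
    remaining = set(hits)

    while len(covered) < needed:
        best, best_ratio = None, math.inf
        # sorted() makes tie-breaking deterministic
        for seed in sorted(remaining):
            gain = len(hits[seed] - covered)
            if gain == 0:
                continue
            ratio = fp_cost[seed] / gain  # cost per newly hit alignment
            if ratio < best_ratio:
                best, best_ratio = seed, ratio
        if best is None:
            break  # no seed adds coverage; target cannot be reached
        chosen.append(best)
        covered |= hits[best]
        remaining.discard(best)

    return chosen, covered


if __name__ == "__main__":
    # Toy data: four training alignments, three hypothetical seeds.
    toy_hits = {"seed_a": {0, 1, 2}, "seed_b": {2, 3}, "seed_c": {0, 1, 2, 3}}
    toy_cost = {"seed_a": 1.0, "seed_b": 1.0, "seed_c": 5.0}
    print(greedy_seed_cover(toy_hits, toy_cost, num_alignments=4))
```

On the toy data, two cheap seeds together hit every alignment at lower total cost than the single expensive seed that hits all four, which is the kind of trade-off that motivates using a collection of seeds. An exact solution would instead require an integer programming solver, as the abstract indicates.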