Exhaustive peptide searching using relations

  • Author: Ela Hunt
  • Affiliation: Department of Computer Science, ETH Zurich, Zurich, Switzerland
  • Venue: BNCOD'07: Proceedings of the 24th British National Conference on Databases
  • Year: 2007

Abstract

We present a new, robust solution to short peptide searching, tested on a relational platform with a set of biological queries. Our algorithm is suited to large-scale scientific data analysis and has been tested on 1.4 GB of amino acids. Protein sequences are indexed as short overlapping string windows and stored in a relation. To find approximate matches, we use a neighbourhood generation algorithm; the words in the neighbourhood are then fetched and stored in a relation. We measure execution time and compare the matches found with those delivered by BLAST. We report some performance gains in exact matching and in searching within edit distance 1, and very significant quality improvements over heuristics, since we guarantee to deliver all relevant matches.
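
The general idea of window indexing plus neighbourhood generation can be illustrated with a small sketch. This is not the paper's implementation: the table names (`win`, `neighbour`), the function names, the choice of SQLite, and the decision to index windows of lengths k-1, k and k+1 are all assumptions made for illustration. It enumerates every word within edit distance 1 of the query, stores those words in a relation, and joins them against the relation of sequence windows, so no matching window is missed.

```python
import sqlite3

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def windows(sequence, k):
    """Yield (offset, window) for every overlapping length-k window."""
    for i in range(len(sequence) - k + 1):
        yield i, sequence[i:i + k]

def neighbourhood(word, alphabet=AMINO_ACIDS):
    """Return all strings within edit distance 1 of word, including word."""
    result = {word}
    for i in range(len(word) + 1):
        for c in alphabet:
            result.add(word[:i] + c + word[i:])          # insertion
        if i < len(word):
            result.add(word[:i] + word[i + 1:])          # deletion
            for c in alphabet:
                result.add(word[:i] + c + word[i + 1:])  # substitution
    return result

def build_index(proteins, lengths):
    """Store overlapping windows of the given lengths in a relation (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE win (protein_id TEXT, pos INTEGER, word TEXT)")
    conn.executemany(
        "INSERT INTO win VALUES (?, ?, ?)",
        ((pid, pos, w)
         for pid, seq in proteins.items()
         for k in lengths
         for pos, w in windows(seq, k)),
    )
    conn.execute("CREATE INDEX win_word ON win (word)")
    return conn

def search(conn, query):
    """Return every indexed window within edit distance 1 of query."""
    conn.execute("CREATE TEMP TABLE IF NOT EXISTS neighbour (word TEXT PRIMARY KEY)")
    conn.execute("DELETE FROM neighbour")
    conn.executemany("INSERT INTO neighbour VALUES (?)",
                     ((w,) for w in neighbourhood(query)))
    rows = conn.execute(
        "SELECT w.protein_id, w.pos, w.word FROM win AS w "
        "JOIN neighbour AS n ON n.word = w.word "
        "ORDER BY w.protein_id, w.pos, w.word"
    )
    return rows.fetchall()

if __name__ == "__main__":
    # Toy sequences, invented for this example.
    proteins = {"P1": "MKTAYIAK", "P2": "MKTWYIAK"}
    conn = build_index(proteins, lengths=(3, 4, 5))
    for row in search(conn, "KTAY"):
        print(row)
```

Because a query of length k has neighbours of length k-1 (one deletion), k (exact or one substitution) and k+1 (one insertion), the sketch indexes all three window lengths. The neighbourhood grows with the alphabet size and query length, which is why this kind of exhaustive enumeration is practical mainly for short peptides and small edit distances.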