UASMAs (universal automated SNP mapping algorithms): a set of algorithms to instantaneously map SNPs in real time to aid functional SNP discovery

  • Authors:
  • James T. L. Mah;Danny C. C. Poo;Shaojiang Cai

  • Affiliations:
  • A*STAR, Singapore;National University of Singapore;National University of Singapore

  • Venue:
  • Proceedings of the VLDB Endowment
  • Year:
  • 2010

Abstract

Currently, new SNP entries are submitted to SNP repositories such as NCBI's dbSNP through manual curation, which gives rise to errors and ambiguities in SNP data entries. Given the exponential increase in SNP discovery, algorithms are needed that accurately and rapidly map SNPs in real time as they are discovered and deposit the entries automatically into a central SNP database. UASMAs are a set of algorithms that map SNPs efficiently and accurately to their unique chromosome positions in real time. They result from integrating the data structures and algorithms of the state-of-the-art alignment methods MAQ, BWT-SW, Bowtie, SOAP2 and BWA. Against a benchmark of BLAST, the tool employed by NCBI, whose recall was at most 91%, the Bowtie and BWA components achieved much better recall, up to 99% for longer reads. Similarly, Bowtie and BWA achieved precision greater than 91%, whereas BLAST achieved only 78-88%. BLAST performed poorly in both recall and precision for longer reads. The Bowtie and BWA algorithms in UASMAs were superior at aligning longer sequences and at locating the precise chromosome position of any SNP with respect to the NCBI reference assembly. Results are obtained quickly and accurately. UASMAs thus prove fast and optimal for mapping new variants onto the genome with a view to depositing these entries accurately into a central database. Because mapping is done in real time and with increased accuracy, recall and precision, the database created will be complete, up to date and devoid of ambiguities and redundancies.
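
The recall and precision figures quoted above can be made concrete with a small sketch. The abstract does not define either metric, so the definitions here are assumptions: a mapping counts as correct only if the reported chromosome and coordinate match the known position exactly, precision is the share of reported mappings that are correct, and recall is the share of all SNPs that are mapped correctly. The function name `precision_recall` and the toy data are hypothetical and are not taken from the paper.

```python
from typing import Dict, Optional, Tuple

Position = Tuple[str, int]  # (chromosome, coordinate); an assumed representation


def precision_recall(
    truth: Dict[str, Position],
    mapped: Dict[str, Optional[Position]],
) -> Tuple[float, float]:
    """Return (precision, recall) for a set of SNP mappings.

    Assumed definitions (not stated in the abstract):
      precision = correct mappings / mappings reported by the aligner
      recall    = correct mappings / all SNPs with a known position
    """
    # Keep only SNPs for which the aligner reported some position.
    reported = {snp: pos for snp, pos in mapped.items() if pos is not None}
    # A reported position is correct only if it matches the known one exactly.
    correct = sum(1 for snp, pos in reported.items() if truth.get(snp) == pos)
    precision = correct / len(reported) if reported else 0.0
    recall = correct / len(truth) if truth else 0.0
    return precision, recall


# Toy data, invented for illustration only.
truth = {
    "snp1": ("chr1", 1000),
    "snp2": ("chr2", 2500),
    "snp3": ("chrX", 42),
    "snp4": ("chr7", 900),
}
mapped = {
    "snp1": ("chr1", 1000),  # correct
    "snp2": ("chr2", 2501),  # off by one, counted as wrong
    "snp3": ("chrX", 42),    # correct
    "snp4": None,            # aligner reported no position
}

precision, recall = precision_recall(truth, mapped)
print(f"precision={precision:.2f} recall={recall:.2f}")
# prints: precision=0.67 recall=0.50
```

Under these assumed definitions, an unmapped SNP lowers recall but not precision, while a SNP mapped to the wrong position lowers both.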