Compression of DNA sequence reads in FASTQ format

Authors:
Sebastian Deorowicz;Szymon Grabowski
Affiliations:
-;-
Venue:
Bioinformatics
Year:
2011

Citing 0
Cited 4

A New Efficient Data Structure for Storage and Retrieval of Multiple Biosequences
IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)

KungFQ: A Simple and Powerful Approach to Compress fastq Files
IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)

High-Throughput Compression of FASTQ Data with SeqDB
IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)

Practical compression for multi-alignment genomic files
ACSC '13 Proceedings of the Thirty-Sixth Australasian Computer Science Conference - Volume 135

Quantified Score

Hi-index	3.85

Abstract

Motivation: Modern sequencing instruments are able to generate at least hundreds of millions of short reads of genomic data. These huge volumes of data require effective means to store them, to provide quick access to any record and to enable fast decompression.

Results: We present a specialized compression algorithm for genomic data in FASTQ format which dominates its competitor, G-SQZ, as is shown on a number of datasets from the 1000 Genomes Project (www.1000genomes.org).

Availability: DSRC is freely available at http://sun.aei.polsl.pl/dsrc.

Contact: sebastian.deorowicz@polsl.pl

Supplementary information: Supplementary data are available at Bioinformatics online.