Benchmarking the performance of human antibody gene alignment utilities using a 454 sequence dataset

Authors:
Katherine J. L. Jackson;Scott Boyd;Bruno A. Gaëta;Andrew M. Collins
Venue:
Bioinformatics
Year:
2010

Citing 0
Cited 1

Predicting v(d)j recombination using conditional random fields

PRIB'12 Proceedings of the 7th IAPR international conference on Pattern Recognition in Bioinformatics

Abstract

Motivation: Immunoglobulin heavy chain genes are formed by recombination of genes randomly selected from sets of IGHV, IGHD and IGHJ genes. Utilities have been developed to identify the genes that contribute to observed VDJ rearrangements, but in the absence of datasets of known rearrangements, the evaluation of these utilities is problematic. We have analyzed thousands of VDJ rearrangements from an individual (S22) whose IGHV, IGHD and IGHJ genotype can be inferred from the dataset. Knowledge of this genotype means that the Stanford_S22 dataset can serve to benchmark the performance of IGH alignment utilities.

Results: We evaluated the performance of seven utilities. Failure to partition a sequence into genes present in the S22 genome was considered an error, and error rates for different utilities ranged from 7.1% to 13.7%.

Availability: Supplementary data include the S22 genotypes and alignments. The Stanford_S22 dataset and an evaluation tool are available at http://www.emi.unsw.edu.au/~ihmmune/IGHUtilityEval/.

Contact: katherine.jackson@unsw.edu.au

Supplementary information: Supplementary data are available at Bioinformatics online.
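The error criterion described under Results can be illustrated with a minimal sketch. This is not the authors' evaluation tool: the `Alignment` record, the function names and the toy genotype below are all hypothetical, and the gene names are placeholders rather than the actual S22 genotype. The sketch also assumes that a missing gene call counts as an error, which the abstract does not state.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set


@dataclass(frozen=True)
class Alignment:
    """Gene calls a utility made for one VDJ rearrangement (None = no call)."""
    v: Optional[str]
    d: Optional[str]
    j: Optional[str]


def is_error(aln: Alignment, genotype: Set[str]) -> bool:
    # Assumption: an alignment is an error if any of its V, D or J calls
    # is missing or names a gene that is absent from the inferred genotype.
    return any(call is None or call not in genotype
               for call in (aln.v, aln.d, aln.j))


def error_rate(alignments: Iterable[Alignment], genotype: Set[str]) -> float:
    """Fraction of alignments that fail the genotype check."""
    alignments = list(alignments)
    if not alignments:
        raise ValueError("no alignments to evaluate")
    return sum(is_error(a, genotype) for a in alignments) / len(alignments)


# Toy data for illustration only; NOT the real S22 genotype.
toy_genotype = {"IGHV3-23*01", "IGHD3-10*01", "IGHJ4*02"}
toy_calls = [
    Alignment("IGHV3-23*01", "IGHD3-10*01", "IGHJ4*02"),  # consistent
    Alignment("IGHV3-23*01", "IGHD3-10*01", "IGHJ4*02"),  # consistent
    Alignment("IGHV1-69*01", "IGHD3-10*01", "IGHJ4*02"),  # V gene not in genotype
    Alignment("IGHV3-23*01", None, "IGHJ4*02"),           # no D call
]

if __name__ == "__main__":
    print(f"error rate: {error_rate(toy_calls, toy_genotype):.1%}")
```

Under these assumptions, each utility's error rate is simply the share of its alignments that use at least one gene outside the genotype, which is the kind of figure reported above as ranging from 7.1% to 13.7%.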