A linear-time algorithm for studying genetic variation

Authors:
Nikola Stojanovic; Piotr Berman
Affiliations:
Department of Computer Science and Engineering, The University of Texas at Arlington, Arlington, Texas; Department of Computer Science and Engineering, The Pennsylvania State University, University Park, Pennsylvania
Venue:
WABI'06 Proceedings of the 6th international conference on Algorithms in Bioinformatics
Year:
2006

Abstract

The study of variation in DNA sequences, for instance within the framework of phylogeny or population genetics, is one of the most important subjects in modern genomics. We present a new linear-time algorithm for finding maximal k-regions in alignments of three sequences. It can be used to detect segments featuring a certain degree of similarity, as well as the boundaries of distinct genomic environments such as gene clusters or haplotype blocks. A k-region is a region that has a center sequence whose Hamming distance from each of the alignment rows is at most k; determining such regions in the general case is known to be NP-hard.
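
To make the definition of a k-region concrete, the following is a minimal brute-force sketch in Python. It is not the linear-time algorithm of the paper, which the abstract does not describe. It only checks, by exhaustive search, whether a given window of alignment columns admits a center sequence within Hamming distance k of every row. The function names `hamming` and `is_k_region`, the half-open column range, and the representation of the alignment as equal-length strings are assumptions made for illustration.

```python
from itertools import product


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def is_k_region(rows, start, end, k):
    """Check whether columns [start, end) of the alignment form a k-region.

    A window is a k-region if some center sequence lies within Hamming
    distance k of every row restricted to that window. This is an
    exponential-time check intended only for tiny examples.
    """
    window = [row[start:end] for row in rows]
    # A letter that does not occur in a column mismatches every row there,
    # so it suffices to try only the letters present in each column.
    choices = [sorted(set(column)) for column in zip(*window)]
    for center in product(*choices):
        if all(hamming(center, w) <= k for w in window):
            return True
    return False


# Hypothetical three-row alignment used only for illustration.
alignment = ["ACGT", "ACGA", "TCGT"]
print(is_k_region(alignment, 0, 4, 1))  # True: center "ACGT" is within 1 of each row
print(is_k_region(alignment, 0, 4, 0))  # False: the rows are not identical
```

The search space grows exponentially with the window length, which is why this sketch is useful only for checking the definition on small inputs.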