Efficient algorithms for locating the length-constrained heaviest segments with applications to biomolecular sequence analysis

Authors:
Yaw-Ling Lin;Tao Jiang;Kun-Mao Chao
Affiliations:
Department of Computer Science and Information Management, Providence University, 200 Chung Chi Road, Shalu, Taichung County, 433 Taiwan;Department of Computer Science, University of California Riverside, Riverside, CA;Department of Life Science, National Yang-Ming University, Taipei, 112 Taiwan
Venue:
Journal of Computer and System Sciences - Computational biology 2002
Year:
2002

Abstract

We study two fundamental problems concerning the search for interesting regions in sequences: (i) given a sequence of real numbers of length n and an upper bound U, find a consecutive subsequence of length at most U with the maximum sum and (ii) given a sequence of real numbers of length n and a lower bound L, find a consecutive subsequence of length at least L with the maximum average. We present an O(n)-time algorithm for the first problem and an O(n log L)-time algorithm for the second. The algorithms have potential applications in several areas of biomolecular sequence analysis including locating GC-rich regions in a genomic DNA sequence, post-processing sequence alignments, annotating multiple sequence alignments, and computing length-constrained ungapped local alignment. Our preliminary tests on both simulated and real data demonstrate that the algorithms are very efficient and able to locate useful (such as GC-rich) regions.
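
The two problems stated in the abstract can be made concrete with a short sketch. This is not the authors' method, which the abstract does not describe; it is an illustration under stated assumptions. For problem (i) it swaps in a standard prefix-sum technique with a monotonic deque, which also runs in O(n) time. For problem (ii) it uses a simple O(nL) baseline, not the O(n log L) algorithm of the paper, relying on the fact that any segment of length at least 2L can be split into two segments of length at least L, one of which has an average no smaller than the whole. The function names, the half-open `(start, end)` index convention, and the sample data are assumptions made for illustration.

```python
from collections import deque
from itertools import accumulate


def max_sum_segment(a, U):
    """Problem (i): non-empty a[start:end] with end - start <= U and maximum sum.

    Illustrative sliding-window-minimum approach over prefix sums
    (an assumption, not the paper's algorithm). Returns (sum, start, end).
    """
    if U < 1 or not a:
        raise ValueError("need a non-empty sequence and U >= 1")
    prefix = [0, *accumulate(a)]  # prefix[k] = a[0] + ... + a[k-1]
    window = deque()              # candidate starts, prefix values increasing
    best = None
    for end in range(1, len(prefix)):
        start = end - 1           # newest admissible start index
        while window and prefix[window[-1]] >= prefix[start]:
            window.pop()          # dominated: a later, smaller prefix exists
        window.append(start)
        while window[0] < end - U:
            window.popleft()      # segment would be longer than U
        total = prefix[end] - prefix[window[0]]
        if best is None or total > best[0]:
            best = (total, window[0], end)
    return best


def max_average_segment(a, L):
    """Problem (ii): a[start:end] with end - start >= L and maximum average.

    O(n * L) baseline: only lengths L .. 2L - 1 need to be examined.
    Returns (average, start, end).
    """
    n = len(a)
    if not 1 <= L <= n:
        raise ValueError("need 1 <= L <= len(a)")
    prefix = [0, *accumulate(a)]
    best = None
    for start in range(n - L + 1):
        for end in range(start + L, min(n, start + 2 * L - 1) + 1):
            avg = (prefix[end] - prefix[start]) / (end - start)
            if best is None or avg > best[0]:
                best = (avg, start, end)
    return best


if __name__ == "__main__":
    scores = [2, -3, 4, -1, 2, -5, 3]
    print(max_sum_segment(scores, 3))

    # Hypothetical GC indicator sequence: 1 for G or C, 0 for A or T.
    gc = [1, 0, 1, 1, 0, 0, 1, 1, 1, 0]
    print(max_average_segment(gc, 3))
```

Encoding a DNA sequence as a 0/1 indicator, as in the second call, is one way to phrase the search for GC-rich regions mentioned in the abstract as a maximum-average problem with a minimum length.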