Analogs and Duals of the MAST Problem for Sequences and Trees

  • Authors:
  • Michael R. Fellows;Michael T. Hallett;Chantal Korostensky;Ulrike Stege

  • Affiliations:
  • -;-;-;-

  • Venue:
  • ESA '98 Proceedings of the 6th Annual European Symposium on Algorithms
  • Year:
  • 1998

Abstract

Two natural kinds of problems about "structured collections of symbols" can be generally referred to as the LARGEST COMMON SUBOBJECT and the SMALLEST COMMON SUPEROBJECT problems, which we consider here as the dual problems of interest. For the case of rooted binary trees where the symbols occur as leaf-labels and a subobject is defined by label-respecting hereditary topological containment, both of these problems are NP-complete, as are the analogous problems for sequences (the well-known LONGEST COMMON SUBSEQUENCE and SHORTEST COMMON SUPERSEQUENCE problems). However, when the trees are restricted by allowing each symbol to occur as a leaf-label at most once (which we call a phylogenetic tree or p-tree), then the LARGEST COMMON SUBOBJECT problem, better known as the MAXIMUM AGREEMENT SUBTREE (MAST) problem, is solvable in polynomial time. We explore the complexity of the basic subobject and superobject problems for sequences and binary trees when the inputs are restricted to p-trees and p-sequences (p-sequences are sequences where each symbol occurs at most once). We prove that the sequence analog of MAST can be solved in polynomial time. The SHORTEST COMMON SUPERSEQUENCE problem restricted to inputs consisting of a collection of p-sequences (pSCS) remains NP-complete, as does the analogous SMALLEST COMMON SUPERTREE problem restricted to p-trees (pSCT). We also show that both problems are hard for the parameterized complexity class W[1], where the parameter is the number of input trees or sequences. We prove fixed-parameter tractability for pSCS and pSCT when the k input sequences (trees) are restricted to be complete: every symbol of Σ occurs exactly once in each object, the question is whether there is a common superobject of size bounded by |Σ|+r, and the parameter is the pair (k, r). We show that without this restriction, both problems are harder than DIRECTED FEEDBACK VERTEX SET, for which parameterized complexity is famously unresolved.
We describe an application of the tractability result for pSCT in the study of gene duplication events, where k and r are naturally small.
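
To make the polynomial-time claim for the sequence analog of MAST concrete, the following is a minimal sketch of one standard way to compute a longest common subsequence of p-sequences. It is not taken from the paper and may differ from the authors' proof. The function name `p_lcs` and the input format (a list of strings or lists) are assumptions made for illustration. The idea: since no symbol repeats, "a precedes b in every input" defines a partial order on the symbols common to all inputs, and a longest chain in that order is a longest common subsequence.

```python
def p_lcs(sequences):
    """Longest common subsequence of p-sequences (hypothetical helper).

    Each input must contain every symbol at most once. Runs in
    O(k * n^2) time for k sequences over n common symbols.
    """
    if not sequences:
        return []

    # Map each symbol to its position in each sequence, and reject
    # inputs that are not p-sequences.
    positions = []
    for seq in sequences:
        pos = {sym: i for i, sym in enumerate(seq)}
        if len(pos) != len(seq):
            raise ValueError("not a p-sequence: a symbol repeats")
        positions.append(pos)

    # Symbols present in every input, listed in the order of the first
    # sequence. That order is a topological order of the precedence
    # relation, so a single left-to-right pass suffices.
    common = [s for s in sequences[0] if all(s in pos for pos in positions)]

    # best[b] = (length of the longest chain ending at b, predecessor)
    best = {}
    for j, b in enumerate(common):
        best[b] = (1, None)
        for a in common[:j]:
            # a may precede b in a common subsequence only if it does
            # so in every input sequence.
            if all(pos[a] < pos[b] for pos in positions):
                if best[a][0] + 1 > best[b][0]:
                    best[b] = (best[a][0] + 1, a)

    if not best:
        return []

    # Walk back from the end of a longest chain.
    end = max(best, key=lambda s: best[s][0])
    chain = []
    while end is not None:
        chain.append(end)
        end = best[end][1]
    return chain[::-1]
```

For example, `p_lcs(["abcde", "baced", "acbde"])` returns a common subsequence of length 3. The same approach does not extend to general sequences, where repeated symbols mean that precedence between symbols is no longer well defined.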