Approximation algorithms for multiple sequence alignment under a fixed evolutionary tree

Research output: Contribution to journalArticle

11 Scopus citations

Abstract

We consider the problem of multiple sequence alignment under a fixed evolutionary tree: given a tree whose leaves are labeled by sequences, find ancestral sequences to label its internal nodes so as to minimize the total length of the tree, where the length of an edge is the edit distance between the sequences labeling its endpoints. We present a new polynomial-time approximation algorithm for this problem, and analyze its performance on regular d-ary trees with d a constant. On such a tree, the algorithm finds a solution within a factor (d + 1)/(d - 1) of the minimum in Q(kdT(d, n)+k2dn2) time, where k is the number of leaves in the tree, n is the length of the longest sequence labeling a leaf, and T(d, n) is the time to compute a Steiner point for d sequences of length at most n. (A Steiner point for a set script capital L sign of sequences is a sequence P that minimizes the sum of the edit distances from P to each sequence in script capital L sign. The time T(d, n) is O(d2dnd), given O(dsd+l)-time preprocessing for an alphabet of size s.) The approximation algorithm is conceptually simple and easy to implement, and actually applies to any metric space in which a Steiner point for any fixed-sized set can be computed in polynomial time. We also introduce a new problem, bottleneck tree-alignment, in which the objective is to label the internal nodes of the tree so as to minimize the length of the longest edge. We describe an exponential-time exact algorithm for the case of unit-cost edit operations, and show there is a simple linear-time approximation algorithm for the general case that finds a solution within a factor O(log k) of the minimum. 1998 Published by Elsevier Science B.V. All rights reserved.

Original languageEnglish (US)
Pages (from-to)355-366
Number of pages12
JournalDiscrete Applied Mathematics
Volume88
Issue number1-3
DOIs
StatePublished - Nov 9 1998
Externally publishedYes

Keywords

  • Approximation algorithms
  • Computational biology
  • Evolutionary trees
  • Multiple sequence alignment

ASJC Scopus subject areas

  • Discrete Mathematics and Combinatorics
  • Applied Mathematics

Fingerprint Dive into the research topics of 'Approximation algorithms for multiple sequence alignment under a fixed evolutionary tree'. Together they form a unique fingerprint.

  • Cite this