A polyhedral approach to sequence alignment problems

John D Kececioglu, Hans Peter Lenhof, Kurt Mehlhorn, Petra Mutzel, Knut Reinert, Martin Vingron

Research output: Contribution to journal (Article)

33 Citations (Scopus)

Abstract

We study two new problems in sequence alignment from both a practical and a theoretical point of view, using tools from combinatorial optimization to develop branch-and-cut algorithms. The generalized maximum trace formulation captures several forms of multiple sequence alignment problems in a common framework, among them the original formulation of maximum trace. The RNA sequence alignment problem captures the comparison of RNA molecules on the basis of their primary sequence and their secondary structure. Both problems have a characterization in terms of graphs, which we reformulate in terms of integer linear programming. We then study the polytopes (the convex hulls of all feasible solutions) associated with the integer linear programs for both problems. For each polytope we derive several classes of facet-defining inequalities and show that for some of these classes the corresponding separation problem can be solved in polynomial time. This leads to a polynomial-time algorithm for pairwise sequence alignment that is not based on dynamic programming. Moreover, for multiple sequences the branch-and-cut algorithms for both sequence alignment problems are able to solve to optimality instances that are beyond the range of present dynamic programming approaches.
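As a rough illustration of the feasible solutions whose convex hull the abstract refers to, the sketch below treats the simplest case only: two sequences, where a set of alignment edges (i, j) forms a trace when the edges can be ordered so that both endpoints strictly increase, that is, no two edges cross or share a character. It enumerates all edge subsets by brute force. This is not the paper's branch-and-cut algorithm; the function names `is_trace` and `max_weight_trace` and the example weights are assumptions made for illustration.

```python
from itertools import combinations


def is_trace(edges):
    """Return True if the edges (i, j) can all be realized by one pairwise alignment.

    Assumption for this sketch: i indexes the first sequence, j the second,
    and two edges are compatible only if both endpoints strictly increase.
    """
    ordered = sorted(edges)
    # Checking adjacent pairs suffices because strict increase is transitive.
    return all(a[0] < b[0] and a[1] < b[1] for a, b in zip(ordered, ordered[1:]))


def max_weight_trace(weights):
    """Brute-force search over all edge subsets (exponential; tiny inputs only)."""
    edges = list(weights)
    best, best_weight = (), 0
    for size in range(1, len(edges) + 1):
        for subset in combinations(edges, size):
            if is_trace(subset):
                total = sum(weights[e] for e in subset)
                if total > best_weight:
                    best, best_weight = subset, total
    return set(best), best_weight


# Hypothetical edge weights between positions of two short sequences.
example_weights = {(0, 0): 2, (0, 1): 3, (1, 0): 3, (1, 1): 2, (2, 2): 1}
trace, weight = max_weight_trace(example_weights)
```

In an integer linear programming view, each edge would get a 0/1 variable and each feasible subset found by `is_trace` would be one integer point; the exhaustive search here only stands in for the optimization over those points.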

Original language: English (US)
Pages (from-to): 143-186
Number of pages: 44
Journal: Discrete Applied Mathematics
Volume: 104
Issue number: 1-3
State: Published - Aug 15 2000
Externally published: Yes

Keywords

  • Branch-and-cut
  • Combinatorial optimization
  • Computational biology
  • Multiple sequence alignment
  • RNA sequence alignment

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Applied Mathematics
  • Discrete Mathematics and Combinatorics
  • Theoretical Computer Science

Cite this

Kececioglu, J. D., Lenhof, H. P., Mehlhorn, K., Mutzel, P., Reinert, K., & Vingron, M. (2000). A polyhedral approach to sequence alignment problems. Discrete Applied Mathematics, 104(1-3), 143-186.

@article{3a11040ff7bc4f24b5695347d8fc2350,
title = "A polyhedral approach to sequence alignment problems",
keywords = "Branch-and-cut, Combinatorial optimization, Computational biology, Multiple sequence alignment, RNA sequence alignment",
author = "Kececioglu, {John D} and Lenhof, {Hans Peter} and Mehlhorn, Kurt and Mutzel, Petra and Reinert, Knut and Vingron, Martin",
year = "2000",
month = "8",
day = "15",
language = "English (US)",
volume = "104",
pages = "143--186",
journal = "Discrete Applied Mathematics",
issn = "0166-218X",
publisher = "Elsevier",
number = "1-3",
}
