### Abstract

We define a new problem in multiple sequence alignment, called maximum weight trace. The problem formalizes in a natural way the common practice of merging pairwise alignments to form multiple sequence alignments, and contains a version of the minimum sum-of-pairs alignment problem as a special case. Informally, the input is a set of pairs of matched characters from the sequences; each pair has an associated weight. The output is a subset of the pairs of maximum total weight that satisfies the following property: there is a multiple alignment that places each pair of characters selected by the subset together in the same column. A set of pairs with this property is called a trace. Intuitively, a trace of maximum weight specifies a multiple alignment that agrees as much as possible with the character matches of the input. We develop a branch-and-bound algorithm for maximum weight trace. Though the problem is NP-complete, an implementation of the algorithm shows we can solve instances with as many as 6 sequences of length 250 in a few minutes. These are among the largest instances that have been solved to optimality to date for any formulation of multiple sequence alignment.
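The problem statement in the abstract can be illustrated on a toy instance. The sketch below is not the branch-and-bound algorithm of the paper; it is an exhaustive search that only makes the definition concrete. It assumes one common way to test the trace property: merge matched characters into candidate columns, then check that the columns can be ordered left to right consistently with every sequence. The function names `is_trace` and `max_weight_trace` and the input encoding (positions as `(sequence, index)` tuples) are hypothetical choices made for this example.

```python
from itertools import combinations


def is_trace(lengths, pairs):
    """Return True if some multiple alignment puts every pair in one column.

    lengths[s] is the length of sequence s. Each pair is ((s, i), (t, j)):
    character i of sequence s is matched with character j of sequence t.
    """
    # Union-find over character positions: matched characters share a column.
    parent = {(s, i): (s, i) for s, n in enumerate(lengths) for i in range(n)}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        parent[find(a)] = find(b)

    # Add an arc from the column of (s, i) to the column of (s, i + 1).
    succ = {}
    for s, n in enumerate(lengths):
        for i in range(n - 1):
            u, v = find((s, i)), find((s, i + 1))
            if u == v:
                return False  # two characters of one sequence in one column
            succ.setdefault(u, set()).add(v)

    # The columns can be ordered iff this column graph has no directed cycle.
    state = {}  # 1 = on the DFS stack, 2 = finished

    def has_cycle(u):
        state[u] = 1
        for v in succ.get(u, ()):
            if state.get(v) == 1 or (v not in state and has_cycle(v)):
                return True
        state[u] = 2
        return False

    return not any(u not in state and has_cycle(u) for u in list(succ))


def max_weight_trace(lengths, weighted_pairs):
    """Exhaustive search over subsets; exponential, for tiny inputs only."""
    best_weight, best = 0, []
    for k in range(len(weighted_pairs) + 1):
        for subset in combinations(weighted_pairs, k):
            weight = sum(w for _, _, w in subset)
            if weight > best_weight and is_trace(
                lengths, [(a, b) for a, b, _ in subset]
            ):
                best_weight, best = weight, list(subset)
    return best_weight, best


# Toy instance: two sequences of length 2 and four weighted candidate matches.
toy_lengths = [2, 2]
toy_pairs = [
    ((0, 0), (1, 1), 3),  # crosses the next pair, so both cannot be kept
    ((0, 1), (1, 0), 2),
    ((0, 0), (1, 0), 2),
    ((0, 1), (1, 1), 2),
]
toy_weight, toy_trace = max_weight_trace(toy_lengths, toy_pairs)
```

In this toy instance the heaviest single pair conflicts with every other pair, so the search prefers the two compatible pairs that align the sequences position by position. Because the search enumerates all subsets, it is usable only for a handful of pairs, which is consistent with the NP-completeness noted in the abstract.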

Original language | English (US) |
---|---|
Title of host publication | Combinatorial Pattern Matching - 4th Annual Symposium, CPM 1993, Proceedings |
Publisher | Springer Verlag |
Pages | 106-119 |
Number of pages | 14 |
Volume | 684 LNCS |
ISBN (Print) | 9783540567646 |
State | Published - 1993 |
Externally published | Yes |
Event | 4th Annual Symposium on Combinatorial Pattern Matching, CPM 1993 - Padova, Italy |
Duration | Jun 2 1993 → Jun 4 1993 |

### Publication series

Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 684 LNCS |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |

### Other

Other | 4th Annual Symposium on Combinatorial Pattern Matching, CPM 1993 |
---|---|
Country | Italy |
City | Padova |
Period | 6/2/93 → 6/4/93 |

### ASJC Scopus subject areas

- Theoretical Computer Science
- Computer Science (all)

### Cite this

Kececioglu, J. D. (1993). **The maximum weight trace problem in multiple sequence alignment.** In *Combinatorial Pattern Matching - 4th Annual Symposium, CPM 1993, Proceedings* (Vol. 684 LNCS, pp. 106-119). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 684 LNCS). Springer Verlag.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

TY - GEN

T1 - The maximum weight trace problem in multiple sequence alignment

AU - Kececioglu, John D

PY - 1993

Y1 - 1993

UR - http://www.scopus.com/inward/record.url?scp=85010105210&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85010105210&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:85010105210

SN - 9783540567646

VL - 684 LNCS

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 106

EP - 119

BT - Combinatorial Pattern Matching - 4th Annual Symposium, CPM 1993, Proceedings

PB - Springer Verlag

ER -