The maximum weight trace problem in multiple sequence alignment

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

84 Citations (Scopus)

Abstract

We define a new problem in multiple sequence alignment, called maximum weight trace. The problem formalizes in a natural way the common practice of merging pairwise alignments to form multiple sequence alignments, and contains a version of the minimum sum of pairs alignment problem as a special case. Informally, the input is a set of pairs of matched characters from the sequences; each pair has an associated weight. The output is a subset of the pairs of maximum total weight that satisfies the following property: there is a multiple alignment that places each pair of characters selected by the subset together in the same column. A set of pairs with this property is called a trace. Intuitively, a trace of maximum weight specifies a multiple alignment that agrees as much as possible with the character matches of the input. We develop a branch-and-bound algorithm for maximum weight trace. Though the problem is NP-complete, an implementation of the algorithm shows that we can solve instances on as many as 6 sequences of length 250 in a few minutes. These are among the largest instances that have been solved to optimality to date for any formulation of multiple sequence alignment.
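The definition of a trace can be made concrete with a small sketch. The Python below is an illustration only and is not the branch-and-bound algorithm of the paper: it enumerates every subset of pairs, so it is usable only on toy inputs. It assumes a consistency test that the abstract does not spell out: characters joined by selected pairs are merged into columns, and the subset is accepted when the columns can be ordered left to right without reversing the order of characters within any sequence. The names `is_trace` and `max_weight_trace`, and the `(sequence index, position)` encoding of a character, are hypothetical choices made for this example.

```python
from itertools import combinations


def is_trace(lengths, pairs):
    """Return True if some multiple alignment puts every pair in one column.

    lengths[s] is the length of sequence s; a character is (s, position).
    Assumed test: merge matched characters into columns, then require that
    the "next character" arcs between columns form no cycle.
    """
    # Union-find over all characters: matched characters share a column.
    parent = {(s, p): (s, p) for s, n in enumerate(lengths) for p in range(n)}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        parent[find(a)] = find(b)

    # Arc from the column of each character to the column of its successor.
    succ = {}
    for s, n in enumerate(lengths):
        for p in range(n - 1):
            succ.setdefault(find((s, p)), set()).add(find((s, p + 1)))

    # Depth-first search for a cycle (a self-loop counts as a cycle).
    state = {}  # 1 = on the DFS stack, 2 = finished

    def has_cycle(u):
        state[u] = 1
        for v in succ.get(u, ()):
            if state.get(v) == 1 or (v not in state and has_cycle(v)):
                return True
        state[u] = 2
        return False

    roots = {find(x) for x in list(parent)}
    return not any(r not in state and has_cycle(r) for r in roots)


def max_weight_trace(lengths, weighted_pairs):
    """Brute force over all subsets; exponential, for toy inputs only."""
    best_weight, best = 0, []
    for k in range(1, len(weighted_pairs) + 1):
        for subset in combinations(weighted_pairs, k):
            weight = sum(w for _, _, w in subset)
            if weight > best_weight and is_trace(
                lengths, [(a, b) for a, b, _ in subset]
            ):
                best_weight, best = weight, list(subset)
    return best_weight, best


# Toy instance: two sequences of length 2, four candidate matches.
toy_pairs = [
    ((0, 0), (1, 1), 3),  # conflicts with every other pair
    ((0, 1), (1, 0), 2),  # crosses the pair above
    ((0, 0), (1, 0), 2),
    ((0, 1), (1, 1), 2),
]
print(max_weight_trace([2, 2], toy_pairs))
```

On this toy instance the heaviest single pair (weight 3) is not part of the optimum: the two non-crossing pairs of weight 2 can share one alignment, for a total weight of 4.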

Original language: English (US)
Title of host publication: Combinatorial Pattern Matching - 4th Annual Symposium, CPM 1993, Proceedings
Publisher: Springer Verlag
Pages: 106-119
Number of pages: 14
Volume: 684 LNCS
ISBN (Print): 9783540567646
State: Published - 1993
Externally published: Yes
Event: 4th Annual Symposium on Combinatorial Pattern Matching, CPM 1993 - Padova, Italy
Duration: Jun 2 1993 to Jun 4 1993

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 684 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 4th Annual Symposium on Combinatorial Pattern Matching, CPM 1993
Country: Italy
City: Padova
Period: 6/2/93 to 6/4/93

Fingerprint

Multiple Sequence Alignment
Trace
Alignment
Subset
Branch and Bound Algorithm
Merging
Pairwise
Optimality
NP-complete problem
Computational complexity
Formulation
Output
Character

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Kececioglu, J. D. (1993). The maximum weight trace problem in multiple sequence alignment. In Combinatorial Pattern Matching - 4th Annual Symposium, CPM 1993, Proceedings (Vol. 684 LNCS, pp. 106-119). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 684 LNCS). Springer Verlag.

@inproceedings{bdd9934eda05499684203bb848425deb,
title = "The maximum weight trace problem in multiple sequence alignment",
author = "Kececioglu, {John D}",
year = "1993",
language = "English (US)",
isbn = "9783540567646",
volume = "684 LNCS",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer Verlag",
pages = "106--119",
booktitle = "Combinatorial Pattern Matching - 4th Annual Symposium, CPM 1993, Proceedings",
address = "Germany",

}

TY - GEN

T1 - The maximum weight trace problem in multiple sequence alignment

AU - Kececioglu, John D

PY - 1993

Y1 - 1993

UR - http://www.scopus.com/inward/record.url?scp=85010105210&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85010105210&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:85010105210

SN - 9783540567646

VL - 684 LNCS

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 106

EP - 119

BT - Combinatorial Pattern Matching - 4th Annual Symposium, CPM 1993, Proceedings

PB - Springer Verlag

ER -