Inverse sequence alignment from partial examples

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Citations (Scopus)

Abstract

When aligning biological sequences, the choice of parameter values for the alignment scoring function is critical. Small changes in gap penalties, for example, can yield radically different alignments. A rigorous way to compute parameter values that are appropriate for biological sequences is inverse parametric sequence alignment. Given a collection of examples of biologically correct alignments, this is the problem of finding parameter values that make the example alignments score close to optimal. We extend prior work on inverse alignment to partial examples and to an improved model based on minimizing the average error of the examples. Experiments on benchmark biological alignments show we can find parameters that generalize across protein families and that boost the recovery rate for multiple sequence alignment by up to 25%.
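The inverse problem stated in the abstract can be illustrated with a toy sketch. The abstract does not spell out the algorithm, so the code below is not the authors' method: it substitutes a brute-force search over candidate gap costs. It also assumes complete (not partial) example alignments, a fixed mismatch cost of 1, a linear gap cost, and absolute error (example cost minus optimal cost). All function names and the example data are hypothetical.

```python
from typing import List, Sequence, Tuple

Example = Tuple[str, str]  # two equal-length rows, '-' marks a gap


def alignment_cost(a: str, b: str, gap: float, mismatch: float = 1.0) -> float:
    """Cost of a given alignment under a linear gap cost (assumed model)."""
    assert len(a) == len(b), "alignment rows must have equal length"
    cost = 0.0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            cost += gap
        elif x != y:
            cost += mismatch
    return cost


def optimal_cost(s: str, t: str, gap: float, mismatch: float = 1.0) -> float:
    """Minimum global alignment cost of s and t (Needleman-Wunsch recurrence)."""
    prev = [j * gap for j in range(len(t) + 1)]
    for i in range(1, len(s) + 1):
        cur = [i * gap]
        for j in range(1, len(t) + 1):
            sub = prev[j - 1] + (0.0 if s[i - 1] == t[j - 1] else mismatch)
            cur.append(min(sub, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]


def average_error(examples: Sequence[Example], gap: float) -> float:
    """Mean of (example cost - optimal cost); zero means every example is optimal."""
    total = 0.0
    for a, b in examples:
        s, t = a.replace("-", ""), b.replace("-", "")
        total += alignment_cost(a, b, gap) - optimal_cost(s, t, gap)
    return total / len(examples)


def fit_gap(examples: Sequence[Example], candidates: Sequence[float]) -> float:
    """Pick the candidate gap cost with the smallest average error (grid search)."""
    return min(candidates, key=lambda g: average_error(examples, g))


# Hypothetical examples: the first prefers a mismatch, the second prefers gaps.
EXAMPLES: List[Example] = [
    ("ACGT", "AGGT"),
    ("-AAAC", "CAAA-"),
]

best_gap = fit_gap(EXAMPLES, [0.25, 0.75, 1.5, 3.0])
```

In this toy setting the two examples pull in opposite directions: a gap cost that is too low makes the first example suboptimal, and one that is too high makes the second suboptimal, so only an intermediate value makes both score as optimal. Fixing the mismatch cost also rules out the degenerate all-zero parameter choice, under which every alignment would be trivially optimal.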

Original language: English (US)
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Pages: 359-370
Number of pages: 12
Volume: 4645 LNBI
State: Published - 2007
Event: 7th International Workshop on Algorithms in Bioinformatics, WABI 2007 - Philadelphia, PA, United States
Duration: Sep 8 2007 to Sep 9 2007

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 4645 LNBI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 7th International Workshop on Algorithms in Bioinformatics, WABI 2007
Country: United States
City: Philadelphia, PA
Period: 9/8/07 to 9/9/07

Fingerprint

Sequence Alignment
Alignment
Partial
Benchmarking
Proteins
Multiple Sequence Alignment
Scoring
Penalty
Recovery
Model-based
Benchmark
Protein
Generalise
Experiment

ASJC Scopus subject areas

  • Computer Science (all)
  • Biochemistry, Genetics and Molecular Biology (all)
  • Theoretical Computer Science

Cite this

Kim, E., & Kececioglu, J. D. (2007). Inverse sequence alignment from partial examples. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4645 LNBI, pp. 359-370). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 4645 LNBI).

Inverse sequence alignment from partial examples. / Kim, Eagu; Kececioglu, John D.

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). Vol. 4645 LNBI 2007. p. 359-370 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 4645 LNBI).

@inproceedings{73520fa85cc64295b9032a1d0c006564,
title = "Inverse sequence alignment from partial examples",
abstract = "When aligning biological sequences, the choice of parameter values for the alignment scoring function is critical. Small changes in gap penalties, for example, can yield radically different alignments. A rigorous way to compute parameter values that are appropriate for biological sequences is inverse parametric sequence alignment. Given a collection of examples of biologically correct alignments, this is the problem of finding parameter values that make the example alignments score close to optimal. We extend prior work on inverse alignment to partial examples and to an improved model based on minimizing the average error of the examples. Experiments on benchmark biological alignments show we can find parameters that generalize across protein families and that boost the recovery rate for multiple sequence alignment by up to 25{\%}.",
author = "Eagu Kim and Kececioglu, {John D}",
year = "2007",
language = "English (US)",
isbn = "9783540741251",
volume = "4645 LNBI",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "359--370",
booktitle = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

}

TY - GEN

T1 - Inverse sequence alignment from partial examples

AU - Kim, Eagu

AU - Kececioglu, John D

PY - 2007

Y1 - 2007

N2 - When aligning biological sequences, the choice of parameter values for the alignment scoring function is critical. Small changes in gap penalties, for example, can yield radically different alignments. A rigorous way to compute parameter values that are appropriate for biological sequences is inverse parametric sequence alignment. Given a collection of examples of biologically correct alignments, this is the problem of finding parameter values that make the example alignments score close to optimal. We extend prior work on inverse alignment to partial examples and to an improved model based on minimizing the average error of the examples. Experiments on benchmark biological alignments show we can find parameters that generalize across protein families and that boost the recovery rate for multiple sequence alignment by up to 25%.

UR - http://www.scopus.com/inward/record.url?scp=37249040464&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=37249040464&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:37249040464

SN - 9783540741251

VL - 4645 LNBI

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 359

EP - 370

BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

ER -