### Abstract

Motivated by the problem in computational biology of reconstructing the series of chromosome inversions by which one organism evolved from another, we consider the problem of computing the shortest series of reversals that transform one permutation to another. The permutations describe the order of genes on corresponding chromosomes, and a reversal takes an arbitrary substring of elements and reverses their order. For this problem, we develop two algorithms: a greedy approximation algorithm that finds a solution provably close to optimal in O(n^2) time and O(n) space for n-element permutations, and a branch-and-bound exact algorithm that finds an optimal solution in O(mL(n, n)) time and O(n^2) space, where m is the size of the branch-and-bound search tree, and L(n, n) is the time to solve a linear program of n variables and n constraints. The greedy algorithm is the first to come within a constant factor of the optimum; it guarantees a solution that uses no more than twice the minimum number of reversals. The lower and upper bounds of the branch-and-bound algorithm are a novel application of maximum-weight matchings, shortest paths, and linear programming. In a series of experiments, we study the performance of an implementation on random permutations, and permutations generated by random reversals. For permutations differing by k random reversals, we find that the average upper bound on reversal distance estimates k to within one reversal for k < n/2 and n < 100. For the difficult case of random permutations, we find that the average difference between the upper and lower bounds is less than three reversals for n < 50. Due to the tightness of these bounds, we can solve, to optimality, problems on 30 elements in a few minutes of computer time. This approaches the scale of mitochondrial genomes.
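To make the reversal operation and the notion of reversal distance concrete, the following is a minimal Python sketch. It is not the greedy or branch-and-bound algorithm of the paper: `reversal_distance` is a plain breadth-first search that is only practical for very small n, and the function names are chosen here for illustration. The `breakpoints` helper relies on a standard fact that the abstract does not state: a reversal changes at most two adjacencies, so the reversal distance is at least half the number of breakpoints.

```python
from collections import deque
from itertools import combinations


def reverse(perm, i, j):
    """Return a copy of perm with the substring perm[i..j] (inclusive) reversed."""
    return perm[:i] + perm[i:j + 1][::-1] + perm[j + 1:]


def breakpoints(perm):
    """Count adjacent pairs that are not consecutive integers,
    after framing perm with 0 on the left and n+1 on the right."""
    framed = (0,) + tuple(perm) + (len(perm) + 1,)
    return sum(1 for a, b in zip(framed, framed[1:]) if abs(a - b) != 1)


def reversal_distance(perm):
    """Exact (unsigned) reversal distance from perm to the identity,
    found by brute-force breadth-first search over all permutations."""
    start = tuple(perm)
    target = tuple(sorted(start))
    dist = {start: 0}
    queue = deque([start])
    while queue:
        cur = queue.popleft()
        if cur == target:
            return dist[cur]
        # Try every reversal of a substring of length at least two.
        for i, j in combinations(range(len(cur)), 2):
            nxt = reverse(cur, i, j)
            if nxt not in dist:
                dist[nxt] = dist[cur] + 1
                queue.append(nxt)
    return None  # not reached when perm is a permutation
```

For example, `reversal_distance((3, 1, 2))` is 2, and `breakpoints((3, 1, 2))` is 3, so the breakpoint lower bound of half the breakpoint count, rounded up, is met exactly in this case. The search visits up to n! permutations, which is why the bounds described in the abstract matter for anything beyond toy inputs.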

Original language | English (US) |
---|---|
Pages (from-to) | 180-210 |
Number of pages | 31 |
Journal | Algorithmica |
Volume | 13 |
Issue number | 1-2 |
DOIs | https://doi.org/10.1007/BF01188586 |
State | Published - Feb 1995 |
Externally published | Yes |

### Keywords

- Approximation algorithms
- Branch-and-bound algorithms
- Chromosome inversions
- Computational biology
- Edit distance
- Experimental analysis of algorithms
- Genome rearrangements
- Permutations
- Sorting by reversals

### ASJC Scopus subject areas

- Applied Mathematics
- Safety, Risk, Reliability and Quality
- Software
- Computer Graphics and Computer-Aided Design
- Computer Science Applications
- Computer Science(all)

### Cite this

Kececioglu, J. D., & Sankoff, D. (1995). Exact and approximation algorithms for sorting by reversals, with application to genome rearrangement. *Algorithmica*, *13*(1-2), 180-210. https://doi.org/10.1007/BF01188586

Research output: Contribution to journal › Article

TY - JOUR

T1 - Exact and approximation algorithms for sorting by reversals, with application to genome rearrangement

AU - Kececioglu, John D

AU - Sankoff, D.

PY - 1995/2

Y1 - 1995/2

KW - Approximation algorithms

KW - Branch-and-bound algorithms

KW - Chromosome inversions

KW - Computational biology

KW - Edit distance

KW - Experimental analysis of algorithms

KW - Genome rearrangements

KW - Permutations

KW - Sorting by reversals

UR - http://www.scopus.com/inward/record.url?scp=0029185212&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0029185212&partnerID=8YFLogxK

U2 - 10.1007/BF01188586

DO - 10.1007/BF01188586

M3 - Article

AN - SCOPUS:0029185212

VL - 13

SP - 180

EP - 210

JO - Algorithmica

JF - Algorithmica

SN - 0178-4617

IS - 1-2

ER -