Identifying a species tree subject to random lateral gene transfer

Mike Steel, Simone Linz, Daniel H. Huson, Michael Sanderson

Research output: Contribution to journal (Article)

11 Citations (Scopus)

Abstract

A major problem for inferring species trees from gene trees is that evolutionary processes can sometimes favor gene tree topologies that conflict with an underlying species tree. In the case of incomplete lineage sorting, this phenomenon has recently been well studied, and some elegant solutions for species tree reconstruction have been proposed. One particularly simple and statistically consistent estimator of the species tree under incomplete lineage sorting is to combine three-taxon analyses, which are phylogenetically robust to incomplete lineage sorting. In this paper, we consider whether such an approach will also work under lateral gene transfer (LGT). By providing an exact analysis of some cases of this model, we show that there is a zone of inconsistency when majority-rule three-taxon gene trees are used to reconstruct species trees under LGT. However, a triplet-based approach will consistently reconstruct a species tree under models of LGT, provided that the expected number of LGT events is not too high. Our analysis involves a novel connection between the LGT problem and random walks on cyclic graphs. We have implemented a procedure for reconstructing trees subject to LGT or lineage sorting in settings where taxon coverage may be patchy, and we illustrate its use on two sample data sets.
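The majority-rule three-taxon step mentioned in the abstract can be illustrated with a short Python sketch. This is not the authors' implementation: the nested-tuple tree encoding and the function names (`clusters`, `triplet_votes`, `majority_triplets`) are assumptions made for illustration. Each rooted gene tree votes for the pair of taxa it groups together within every triple it contains, gene trees missing one of the three taxa cast no vote for that triple (patchy coverage), and the most frequent pairing is kept. Assembling the winning triplets into a full species tree is a separate step that is not shown here.

```python
from collections import Counter
from itertools import combinations


def clusters(tree):
    """Return (leaf set, leaf sets of all internal nodes) for a nested-tuple rooted tree."""
    if not isinstance(tree, tuple):
        return frozenset([tree]), []
    leaves, found = frozenset(), []
    for child in tree:
        child_leaves, child_clusters = clusters(child)
        leaves |= child_leaves
        found.extend(child_clusters)
    found.append(leaves)  # children are listed before their parent
    return leaves, found


def triplet_votes(gene_trees):
    """Tally, for every taxon triple, which pair each gene tree groups together."""
    votes = {}
    for tree in gene_trees:
        leaves, node_clusters = clusters(tree)
        # Only triples fully present in this gene tree are counted.
        for triple in combinations(sorted(leaves), 3):
            key = frozenset(triple)
            for cluster in node_clusters:
                inside = cluster & key
                if len(inside) == 2:  # this node separates a pair from the third taxon
                    votes.setdefault(key, Counter())[inside] += 1
                    break
    return votes


def majority_triplets(gene_trees):
    """Keep, for each triple, the pairing with strictly the most votes (ties are dropped)."""
    winners = {}
    for key, counter in triplet_votes(gene_trees).items():
        ranked = counter.most_common(2)
        if len(ranked) == 1 or ranked[0][1] > ranked[1][1]:
            winners[key] = ranked[0][0]
    return winners


# Hypothetical toy input: four gene trees, the last one lacking taxon "d".
gene_trees = [
    ((("a", "b"), "c"), "d"),
    (("a", "b"), ("c", "d")),
    ((("a", "c"), "b"), "d"),
    (("a", "b"), "c"),
]
votes = triplet_votes(gene_trees)
winners = majority_triplets(gene_trees)
```

In this toy input, three of the four gene trees group a with b relative to c, so the majority-rule triplet for {a, b, c} is ab|c. As the abstract notes, this majority-rule choice can be inconsistent under LGT in some parameter ranges, so the sketch only shows the counting step, not a guaranteed estimator.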

Original language: English (US)
Pages (from-to): 81-93
Number of pages: 13
Journal: Journal of Theoretical Biology
Volume: 322
DOI: 10.1016/j.jtbi.2013.01.009
State: Published - Apr 7 2013

Fingerprint

Gene transfer
Horizontal Gene Transfer
Lateral
Gene
Sorting
Genes
Majority Rule
Topology
Consistent Estimator
Inconsistency
Random walk
Coverage

Keywords

  • Modeling LGT
  • Phylogenetic tree
  • Poisson process
  • Statistical consistency

ASJC Scopus subject areas

  • Medicine(all)
  • Immunology and Microbiology(all)
  • Biochemistry, Genetics and Molecular Biology(all)
  • Agricultural and Biological Sciences(all)
  • Modeling and Simulation
  • Statistics and Probability
  • Applied Mathematics

Cite this

Identifying a species tree subject to random lateral gene transfer. / Steel, Mike; Linz, Simone; Huson, Daniel H.; Sanderson, Michael.

In: Journal of Theoretical Biology, Vol. 322, 07.04.2013, p. 81-93.

@article{0f3046a4d7a24431b99e7e61403e98f3,
title = "Identifying a species tree subject to random lateral gene transfer",
abstract = "A major problem for inferring species trees from gene trees is that evolutionary processes can sometimes favor gene tree topologies that conflict with an underlying species tree. In the case of incomplete lineage sorting, this phenomenon has recently been well-studied, and some elegant solutions for species tree reconstruction have been proposed. One particularly simple and statistically consistent estimator of the species tree under incomplete lineage sorting is to combine three-taxon analyses, which are phylogenetically robust to incomplete lineage sorting. In this paper, we consider whether such an approach will also work under lateral gene transfer (LGT). By providing an exact analysis of some cases of this model, we show that there is a zone of inconsistency when majority-rule three-taxon gene trees are used to reconstruct species trees under LGT. However, a triplet-based approach will consistently reconstruct a species tree under models of LGT, provided that the expected number of LGT transfers is not too high. Our analysis involves a novel connection between the LGT problem and random walks on cyclic graphs. We have implemented a procedure for reconstructing trees subject to LGT or lineage sorting in settings where taxon coverage may be patchy and illustrate its use on two sample data sets.",
keywords = "Modeling LGT, Phylogenetic tree, Poisson process, Statistical consistency",
author = "Mike Steel and Simone Linz and Huson, {Daniel H.} and Michael Sanderson",
year = "2013",
month = "4",
day = "7",
doi = "10.1016/j.jtbi.2013.01.009",
language = "English (US)",
volume = "322",
pages = "81--93",
journal = "Journal of Theoretical Biology",
issn = "0022-5193",
publisher = "Academic Press Inc.",
}
