Disentangling methodological and biological sources of gene tree discordance on Oryza (Poaceae) chromosome 3

Derrick J. Zwickl, Joshua C. Stein, Rod A Wing, Doreen Ware, Michael Sanderson

Research output: Contribution to journal › Article

29 Citations (Scopus)

Abstract

We describe new methods for characterizing gene tree discordance in phylogenomic data sets, which screen for deviations from neutral expectations, summarize variation in statistical support among gene trees, and allow comparison of the patterns of discordance induced by various analysis choices. Using an exceptionally complete set of genome sequences for the short arm of chromosome 3 in Oryza (rice) species, we applied these methods to identify the causes and consequences of differing patterns of discordance in the sets of gene trees inferred using a panel of 20 distinct analysis pipelines. We found that discordance patterns were strongly affected by aspects of data selection, alignment, and alignment masking. Unusual patterns of discordance evident when using certain pipelines were reduced or eliminated by using alternative pipelines, suggesting that they were the product of methodological biases rather than evolutionary processes. In some cases, once such biases were eliminated, evolutionary processes such as introgression could be implicated. Additionally, patterns of gene tree discordance had significant downstream impacts on species tree inference. For example, inference from supermatrices was positively misleading when pipelines that led to biased gene trees were used. Several results may generalize to other data sets: we found that gene tree and species tree inference gave more reasonable results when intron sequence was included during sequence alignment and tree inference, the alignment software PRANK was used, and detectable "block-shift" alignment artifacts were removed. We discuss our findings in the context of well-established relationships in Oryza and continuing controversies regarding the domestication history of O. sativa.

Original language: English (US)
Pages (from-to): 645-659
Number of pages: 15
Journal: Systematic Biology
Volume: 63
Issue number: 5
DOI: 10.1093/sysbio/syu027
State: Published - 2014

Keywords

  • Gene trees
  • Multilocus data
  • Oryza
  • Phylogenomics
  • Phylogeny reconstruction
  • Species trees

ASJC Scopus subject areas

  • Ecology, Evolution, Behavior and Systematics
  • Genetics

Cite this

Disentangling methodological and biological sources of gene tree discordance on Oryza (Poaceae) chromosome 3. / Zwickl, Derrick J.; Stein, Joshua C.; Wing, Rod A; Ware, Doreen; Sanderson, Michael.

In: Systematic Biology, Vol. 63, No. 5, 2014, p. 645-659.

@article{b6421eed865547a8a692f752c4401a34,
title = "Disentangling methodological and biological sources of gene tree discordance on Oryza (Poaceae) chromosome 3",
abstract = "We describe new methods for characterizing gene tree discordance in phylogenomic data sets, which screen for deviations from neutral expectations, summarize variation in statistical support among gene trees, and allow comparison of the patterns of discordance induced by various analysis choices. Using an exceptionally complete set of genome sequences for the short arm of chromosome 3 in Oryza (rice) species, we applied these methods to identify the causes and consequences of differing patterns of discordance in the sets of gene trees inferred using a panel of 20 distinct analysis pipelines. We found that discordance patterns were strongly affected by aspects of data selection, alignment, and alignment masking. Unusual patterns of discordance evident when using certain pipelines were reduced or eliminated by using alternative pipelines, suggesting that they were the product of methodological biases rather than evolutionary processes. In some cases, once such biases were eliminated, evolutionary processes such as introgression could be implicated. Additionally, patterns of gene tree discordance had significant downstream impacts on species tree inference. For example, inference from supermatrices was positively misleading when pipelines that led to biased gene trees were used. Several results may generalize to other data sets: we found that gene tree and species tree inference gave more reasonable results when intron sequence was included during sequence alignment and tree inference, the alignment software PRANK was used, and detectable {"}block-shift{"} alignment artifacts were removed. We discuss our findings in the context of well-established relationships in Oryza and continuing controversies regarding the domestication history of O. sativa.",
keywords = "Gene trees, Multilocus data, Oryza, Phylogenomics, Phylogeny reconstruction, Species trees",
author = "Zwickl, {Derrick J.} and Stein, {Joshua C.} and Wing, {Rod A} and Doreen Ware and Michael Sanderson",
year = "2014",
doi = "10.1093/sysbio/syu027",
language = "English (US)",
volume = "63",
pages = "645--659",
journal = "Systematic Biology",
issn = "1063-5157",
publisher = "Oxford University Press",
number = "5",

}

TY - JOUR

T1 - Disentangling methodological and biological sources of gene tree discordance on Oryza (Poaceae) chromosome 3

AU - Zwickl, Derrick J.

AU - Stein, Joshua C.

AU - Wing, Rod A

AU - Ware, Doreen

AU - Sanderson, Michael

PY - 2014

Y1 - 2014

N2 - We describe new methods for characterizing gene tree discordance in phylogenomic data sets, which screen for deviations from neutral expectations, summarize variation in statistical support among gene trees, and allow comparison of the patterns of discordance induced by various analysis choices. Using an exceptionally complete set of genome sequences for the short arm of chromosome 3 in Oryza (rice) species, we applied these methods to identify the causes and consequences of differing patterns of discordance in the sets of gene trees inferred using a panel of 20 distinct analysis pipelines. We found that discordance patterns were strongly affected by aspects of data selection, alignment, and alignment masking. Unusual patterns of discordance evident when using certain pipelines were reduced or eliminated by using alternative pipelines, suggesting that they were the product of methodological biases rather than evolutionary processes. In some cases, once such biases were eliminated, evolutionary processes such as introgression could be implicated. Additionally, patterns of gene tree discordance had significant downstream impacts on species tree inference. For example, inference from supermatrices was positively misleading when pipelines that led to biased gene trees were used. Several results may generalize to other data sets: we found that gene tree and species tree inference gave more reasonable results when intron sequence was included during sequence alignment and tree inference, the alignment software PRANK was used, and detectable "block-shift" alignment artifacts were removed. We discuss our findings in the context of well-established relationships in Oryza and continuing controversies regarding the domestication history of O. sativa.

AB - We describe new methods for characterizing gene tree discordance in phylogenomic data sets, which screen for deviations from neutral expectations, summarize variation in statistical support among gene trees, and allow comparison of the patterns of discordance induced by various analysis choices. Using an exceptionally complete set of genome sequences for the short arm of chromosome 3 in Oryza (rice) species, we applied these methods to identify the causes and consequences of differing patterns of discordance in the sets of gene trees inferred using a panel of 20 distinct analysis pipelines. We found that discordance patterns were strongly affected by aspects of data selection, alignment, and alignment masking. Unusual patterns of discordance evident when using certain pipelines were reduced or eliminated by using alternative pipelines, suggesting that they were the product of methodological biases rather than evolutionary processes. In some cases, once such biases were eliminated, evolutionary processes such as introgression could be implicated. Additionally, patterns of gene tree discordance had significant downstream impacts on species tree inference. For example, inference from supermatrices was positively misleading when pipelines that led to biased gene trees were used. Several results may generalize to other data sets: we found that gene tree and species tree inference gave more reasonable results when intron sequence was included during sequence alignment and tree inference, the alignment software PRANK was used, and detectable "block-shift" alignment artifacts were removed. We discuss our findings in the context of well-established relationships in Oryza and continuing controversies regarding the domestication history of O. sativa.

KW - Gene trees

KW - Multilocus data

KW - Oryza

KW - Phylogenomics

KW - Phylogeny reconstruction

KW - Species trees

UR - http://www.scopus.com/inward/record.url?scp=84906251283&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84906251283&partnerID=8YFLogxK

U2 - 10.1093/sysbio/syu027

DO - 10.1093/sysbio/syu027

M3 - Article

VL - 63

SP - 645

EP - 659

JO - Systematic Biology

JF - Systematic Biology

SN - 1063-5157

IS - 5

ER -