Assessment of the accuracy of matrix representation with parsimony analysis supertree construction

Olaf R P Bininda-Emonds, Michael Sanderson

Research output: Contribution to journal › Article

121 Citations (Scopus)

Abstract

Despite the growing popularity of supertree construction for combining phylogenetic information to produce more inclusive phylogenies, large-scale performance testing of this method has not been done. Through simulation, we tested the accuracy of the most widely used supertree method, matrix representation with parsimony analysis (MRP), with respect to a (maximum parsimony) total evidence solution and a known model tree. When source trees overlap completely, MRP provided a reasonable approximation of the total evidence tree; agreement was usually >85%. Performance improved slightly when using smaller, more numerous, or more congruent source trees, and especially when elements were weighted in proportion to the bootstrap frequencies of the nodes they represented on each source tree ("weighted MRP"). Although total evidence always estimated the model tree slightly better than nonweighted MRP methods, weighted MRP in turn usually out-performed total evidence slightly. When source studies were even moderately nonoverlapping (i.e., sharing only three-quarters of the taxa), the high proportion of missing data caused a loss in resolution that severely degraded the performance for all methods, including total evidence. In such cases, even combining more trees, which had positive effects elsewhere, did not improve accuracy. Instead, "seeding" the supertree or total evidence analyses with a single largely complete study improved performance substantially. This finding could be an important strategy for any studies that seek to combine phylogenetic information. Overall, our results suggest that MRP supertree construction provides a reasonable approximation of a total evidence solution and that weighted MRP should be used whenever possible.
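
As a rough illustration of the matrix representation step that the abstract refers to, the sketch below encodes a set of source trees as a binary matrix using the standard MRP coding: each clade of a source tree becomes one character, scored 1 for taxa inside the clade, 0 for taxa present in that tree but outside the clade, and ? for taxa absent from that tree. For "weighted MRP", each character carries a weight taken from the bootstrap support of its clade. The input layout (a taxon set plus a list of clade/support pairs), the function names `mrp_matrix` and `missing_fraction`, and the toy trees are all assumptions made for this example and do not come from the paper. The parsimony search on the resulting matrix and the all-zero outgroup row commonly added for rooting are left out.

```python
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical input layout: (taxa in the tree, [(clade, bootstrap support), ...])
SourceTree = Tuple[FrozenSet[str], List[Tuple[FrozenSet[str], float]]]


def mrp_matrix(trees: List[SourceTree]) -> Tuple[Dict[str, str], List[float]]:
    """Encode source trees as an MRP matrix plus one weight per character."""
    all_taxa = sorted(set().union(*(taxa for taxa, _ in trees)))
    rows: Dict[str, List[str]] = {taxon: [] for taxon in all_taxa}
    weights: List[float] = []
    for taxa, clades in trees:
        for clade, support in clades:
            for taxon in all_taxa:
                if taxon not in taxa:
                    rows[taxon].append("?")  # taxon not sampled in this source tree
                elif taxon in clade:
                    rows[taxon].append("1")  # taxon belongs to the clade
                else:
                    rows[taxon].append("0")  # sampled, but outside the clade
            weights.append(support)  # bootstrap support used as character weight
    return {taxon: "".join(cells) for taxon, cells in rows.items()}, weights


def missing_fraction(matrix: Dict[str, str]) -> float:
    """Proportion of '?' cells, which grows as source trees overlap less."""
    total = sum(len(row) for row in matrix.values())
    missing = sum(row.count("?") for row in matrix.values())
    return missing / total if total else 0.0


# Two toy source trees that share only some of their taxa (illustrative data).
example_trees: List[SourceTree] = [
    (frozenset("ABCD"), [(frozenset("AB"), 95.0), (frozenset("ABC"), 60.0)]),
    (frozenset("BCDE"), [(frozenset("DE"), 80.0)]),
]

if __name__ == "__main__":
    matrix, weights = mrp_matrix(example_trees)
    for taxon, row in matrix.items():
        print(taxon, row)
    print("weights:", weights)
    print("missing:", missing_fraction(matrix))
```

Setting every weight to the same value gives the nonweighted variant. The `?` cells are the missing data that, according to the abstract, degrade resolution once source studies stop overlapping completely.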

Original language: English (US)
Pages (from-to): 565-579
Number of pages: 15
Journal: Systematic Biology
Volume: 50
Issue number: 4
State: Published - Aug 2001
Externally published: Yes

Fingerprint

parsimony analysis
matrix
phylogeny
phylogenetics
seeding
methodology
sowing
method
simulation

Keywords

  • Accuracy
  • Matrix representation
  • Missing data
  • MRP
  • Phylogenetic supertrees
  • Resolution
  • Taxonomic congruence
  • Total evidence

ASJC Scopus subject areas

  • Ecology, Evolution, Behavior and Systematics

Cite this

Assessment of the accuracy of matrix representation with parsimony analysis supertree construction. / Bininda-Emonds, Olaf R P; Sanderson, Michael.

In: Systematic Biology, Vol. 50, No. 4, 08.2001, p. 565-579.

@article{70c63631143042f1a59bb54d38e9a909,
title = "Assessment of the accuracy of matrix representation with parsimony analysis supertree construction",
keywords = "Accuracy, Matrix representation, Missing data, MRP, Phylogenetic supertrees, Resolution, Taxonomic congruence, Total evidence",
author = "Bininda-Emonds, {Olaf R P} and Michael Sanderson",
year = "2001",
month = "8",
language = "English (US)",
volume = "50",
pages = "565--579",
journal = "Systematic Biology",
issn = "1063-5157",
publisher = "Oxford University Press",
number = "4",

}

TY - JOUR

T1 - Assessment of the accuracy of matrix representation with parsimony analysis supertree construction

AU - Bininda-Emonds, Olaf R P

AU - Sanderson, Michael

PY - 2001/8

Y1 - 2001/8

KW - Accuracy

KW - Matrix representation

KW - Missing data

KW - MRP

KW - Phylogenetic supertrees

KW - Resolution

KW - Taxonomic congruence

KW - Total evidence

UR - http://www.scopus.com/inward/record.url?scp=0035527411&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0035527411&partnerID=8YFLogxK

M3 - Article

C2 - 12116654

AN - SCOPUS:0035527411

VL - 50

SP - 565

EP - 579

JO - Systematic Biology

JF - Systematic Biology

SN - 1063-5157

IS - 4

ER -