A novel approach to detecting and measuring recombination: New insights into evolution in viruses, bacteria, and mitochondria

Research output: Contribution to journal › Article

97 Scopus citations

Abstract

An accurate estimate of the extent of recombination is important whenever phylogenetic methods are applied to potentially recombining nucleotide sequences. Here, data sets from viruses, bacteria, and mitochondria were examined for deviations from clonality using a new approach for detecting and measuring recombination. The apparent rate heterogeneity (ARH) among sites in a sequence alignment can be inflated as an artifact of recombination. However, the composition of polymorphic sites will differ in a data set with recombination-generated ARH versus a clonal data set that exhibits the equivalent degree of rate heterogeneity. This is because recombinant data sets, encompassing regions of conflicting phylogenetic history, tend to yield "starlike" trees that are superficially similar to those inferred from clonal data sets with weak phylogenetic signal throughout. Specifically, a recombinant data set will be unexpectedly rich in conflicting phylogenetic information compared with clonally generated data sets supporting the same tree shape. Its value of q, defined as the proportion of two-state parsimony-informative sites to all polymorphic sites, will be greater than that expected for nonrecombinant data. The method proposed here, the informative-sites test, compares the value of q against a null distribution of values found using Monte Carlo-simulated data evolved under the null hypothesis of clonality. A significant excess of q indicates that the assumption of clonality is not valid and hence that the ARH in the data is at least partly an artifact of recombination. Investigations of the procedure using simulated sequences indicated that it can successfully detect and measure recombination and that it is unlikely to produce "false positives." Simulations also showed that for recombinant data, naïve use of maximum-likelihood models incorporating rate heterogeneity can lead to overestimation of the time to the most recent common ancestor.
Application of the test to real data revealed for the first time that populations of viruses, like those of bacteria, can be brought close to complete linkage equilibrium by pervasive recombination. On the other hand, the test did not reject the hypothesis of clonality when applied to a data set from the coding region of human mitochondrial DNA, despite its high level of ARH and homoplasy.
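
As a rough illustration of the statistic described in the abstract, the sketch below computes q for a small alignment and a one-sided Monte Carlo p-value for an excess of q. It is not the authors' implementation. The function names, the decision to ignore any character other than A, C, G, and T, and the +1 correction in the p-value are all assumptions made for this example. The null values of q are taken as given here; in the actual test they would come from sequences simulated under clonality, which this sketch does not attempt.

```python
from collections import Counter


def informative_site_ratio(alignment):
    """Return q: two-state parsimony-informative sites / all polymorphic sites.

    `alignment` is a list of equal-length sequence strings (hypothetical input
    format). Characters other than A, C, G, T are ignored, which is an
    assumption of this sketch rather than something stated in the abstract.
    """
    polymorphic = 0
    informative = 0
    for column in zip(*alignment):
        counts = Counter(base for base in column if base in "ACGT")
        if len(counts) < 2:
            continue  # invariant site: not polymorphic
        polymorphic += 1
        # Two-state and parsimony-informative: exactly two bases at the site,
        # each carried by at least two sequences.
        if len(counts) == 2 and min(counts.values()) >= 2:
            informative += 1
    return informative / polymorphic if polymorphic else float("nan")


def monte_carlo_p_value(observed_q, null_qs):
    """One-sided p-value for an excess of q relative to simulated null values.

    The +1 in numerator and denominator is a common Monte Carlo convention,
    assumed here; the abstract does not specify how significance is computed.
    """
    exceed = sum(1 for q in null_qs if q >= observed_q)
    return (exceed + 1) / (len(null_qs) + 1)


# Toy alignment: 4 polymorphic sites, 3 of them two-state and informative.
toy_alignment = ["AAGTC", "AAGTT", "ACGAT", "ACTAC"]
q_observed = informative_site_ratio(toy_alignment)

# Placeholder null values standing in for q computed on clonal simulations.
placeholder_null = [0.2, 0.3, 0.8, 0.4]
p_value = monte_carlo_p_value(q_observed, placeholder_null)
```

A small p-value would correspond to the "significant excess of q" that the abstract treats as evidence against clonality; with only a handful of placeholder null values, as here, the result carries no statistical meaning.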

Original language: English (US)
Pages (from-to): 1425-1434
Number of pages: 10
Journal: Molecular Biology and Evolution
Volume: 18
Issue number: 8
State: Published - Jan 1 2001
Externally published: Yes

Keywords

  • Clonal
  • GB virus C
  • Maximum likelihood
  • Mitochondria
  • Rate heterogeneity
  • Recombination

ASJC Scopus subject areas

  • Ecology, Evolution, Behavior and Systematics
  • Molecular Biology
  • Genetics
