Joint estimation of gene conversion rates and mean conversion tract lengths from population SNP data

Junming Yin, Michael I. Jordan, Yun S. Song

Research output: Contribution to journal (Article)

13 Citations (Scopus)

Abstract

Motivation: Two known types of meiotic recombination are crossovers and gene conversions. Although they leave behind different footprints in the genome, it is a challenging task to tease apart their relative contributions to the observed genetic variation. In particular, for a given population SNP dataset, the joint estimation of the crossover rate, the gene conversion rate and the mean conversion tract length is widely viewed as a very difficult problem.

Results: In this article, we devise a likelihood-based method using an interleaved hidden Markov model (HMM) that can jointly estimate the aforementioned three parameters fundamental to recombination. Our method significantly improves upon a recently proposed method based on a factorial HMM. We show that modeling overlapping gene conversions is crucial for improving the joint estimation of the gene conversion rate and the mean conversion tract length. We test the performance of our method on simulated data. We then apply our method to analyze real biological data from the telomere of the X chromosome of Drosophila melanogaster, and show that the ratio of the gene conversion rate to the crossover rate for the region may not be nearly as high as previously claimed.
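The abstract does not spell out the interleaved HMM, so the following is only a generic illustration of what a likelihood-based HMM estimate involves: the forward algorithm evaluates the likelihood of an observed sequence, and a parameter is chosen by maximizing that likelihood over a grid. The two-state toy model, the `switch_prob` parameter, the emission probabilities and all function names are assumptions made for illustration; none of them come from the article, and this is not the authors' model.

```python
import math


def forward_log_likelihood(obs, init, trans, emit):
    """Scaled forward algorithm: log P(obs) under a discrete HMM.

    obs   : list of observation indices
    init  : init[i]     = P(state_0 = i)
    trans : trans[i][j] = P(state_{t+1} = j | state_t = i)
    emit  : emit[i][o]  = P(observation = o | state = i)
    """
    n_states = len(init)
    # alpha[i] is proportional to P(obs[0..t], state_t = i)
    alpha = [init[i] * emit[i][obs[0]] for i in range(n_states)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step through the transition matrix, then
            # weight by the emission probability of the new observation.
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n_states))
                * emit[j][obs[t]]
                for j in range(n_states)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # observation impossible under the model
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_lik


def toy_model(switch_prob):
    """Hypothetical two-state HMM with one free parameter (illustration only)."""
    init = [0.5, 0.5]
    trans = [
        [1.0 - switch_prob, switch_prob],
        [switch_prob, 1.0 - switch_prob],
    ]
    emit = [[0.9, 0.1], [0.1, 0.9]]
    return init, trans, emit


# Made-up binary observations, not real SNP data.
obs = [0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0]
grid = [0.05, 0.1, 0.2, 0.3, 0.5]

# Maximum-likelihood estimate of switch_prob restricted to the grid.
best = max(grid, key=lambda p: forward_log_likelihood(obs, *toy_model(p)))
print("grid point with the highest log-likelihood:", best)
```

A joint estimate of several parameters, such as the three named in the abstract, would follow the same pattern with a multi-dimensional grid or a numerical optimizer in place of the one-dimensional search, at a correspondingly higher computational cost.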

Original language: English (US)
Journal: Bioinformatics
Volume: 25
Issue number: 12
DOIs: 10.1093/bioinformatics/btp229
State: Published - 2009
Externally published: Yes

Fingerprint

Gene Conversion
Single Nucleotide Polymorphism
Genes
Joints
Gene
Population
Hidden Markov models
Crossover
Genetic Recombination
Overlapping Genes
Recombination
Markov Model
Telomere
X Chromosome
Chromosomes
Drosophila melanogaster
Genetic Variation
Drosophilidae
Factorial
Genome

ASJC Scopus subject areas

  • Biochemistry
  • Molecular Biology
  • Computational Theory and Mathematics
  • Computer Science Applications
  • Computational Mathematics
  • Statistics and Probability

Cite this

Joint estimation of gene conversion rates and mean conversion tract lengths from population SNP data. / Yin, Junming; Jordan, Michael I.; Song, Yun S.

In: Bioinformatics, Vol. 25, No. 12, 2009.

@article{6ee8551578d84eba9f2cfa02fef90850,
title = "Joint estimation of gene conversion rates and mean conversion tract lengths from population SNP data",
abstract = "Motivation: Two known types of meiotic recombination are crossovers and gene conversions. Although they leave behind different footprints in the genome, it is a challenging task to tease apart their relative contributions to the observed genetic variation. In particular, for a given population SNP dataset, the joint estimation of the crossover rate, the gene conversion rate and the mean conversion tract length is widely viewed as a very difficult problem. Results: In this article, we devise a likelihood-based method using an interleaved hidden Markov model (HMM) that can jointly estimate the aforementioned three parameters fundamental to recombination. Our method significantly improves upon a recently proposed method based on a factorial HMM. We show that modeling overlapping gene conversions is crucial for improving the joint estimation of the gene conversion rate and the mean conversion tract length. We test the performance of our method on simulated data. We then apply our method to analyze real biological data from the telomere of the X chromosome of Drosophila melanogaster, and show that the ratio of the gene conversion rate to the crossover rate for the region may not be nearly as high as previously claimed.",
author = "Junming Yin and Jordan, {Michael I.} and Song, {Yun S.}",
year = "2009",
doi = "10.1093/bioinformatics/btp229",
language = "English (US)",
volume = "25",
journal = "Bioinformatics",
issn = "1367-4803",
publisher = "Oxford University Press",
number = "12",

}

TY - JOUR

T1 - Joint estimation of gene conversion rates and mean conversion tract lengths from population SNP data

AU - Yin, Junming

AU - Jordan, Michael I.

AU - Song, Yun S.

PY - 2009

Y1 - 2009

AB - Motivation: Two known types of meiotic recombination are crossovers and gene conversions. Although they leave behind different footprints in the genome, it is a challenging task to tease apart their relative contributions to the observed genetic variation. In particular, for a given population SNP dataset, the joint estimation of the crossover rate, the gene conversion rate and the mean conversion tract length is widely viewed as a very difficult problem. Results: In this article, we devise a likelihood-based method using an interleaved hidden Markov model (HMM) that can jointly estimate the aforementioned three parameters fundamental to recombination. Our method significantly improves upon a recently proposed method based on a factorial HMM. We show that modeling overlapping gene conversions is crucial for improving the joint estimation of the gene conversion rate and the mean conversion tract length. We test the performance of our method on simulated data. We then apply our method to analyze real biological data from the telomere of the X chromosome of Drosophila melanogaster, and show that the ratio of the gene conversion rate to the crossover rate for the region may not be nearly as high as previously claimed.

UR - http://www.scopus.com/inward/record.url?scp=66349109784&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=66349109784&partnerID=8YFLogxK

U2 - 10.1093/bioinformatics/btp229

DO - 10.1093/bioinformatics/btp229

M3 - Article

C2 - 19477993

AN - SCOPUS:66349109784

VL - 25

JO - Bioinformatics

JF - Bioinformatics

SN - 1367-4803

IS - 12

ER -