Agrobacterium T-DNA integration in Arabidopsis is correlated with DNA sequence compositions that occur frequently in gene promoter regions

Richard G. Schneeberger, Ke Zhang, Tatiana Tatarinova, Max Troukhan, Shing F. Kwok, Josh Drais, Kevin Klinger, Francis Orejudos, Kimberly Macy, Amit Bhakta, James Burns, Gopal Subramanian, Jonathan Donson, Richard Flavell, Kenneth A Feldmann

Research output: Contribution to journal › Article

35 Citations (Scopus)

Abstract

Mobile insertion elements such as transposons and T-DNA generate useful genetic variation and are important tools for functional genomics studies in plants and animals. The spectrum of mutations obtained in different systems can be highly influenced by target site preferences inherent in the mechanism of DNA integration. We investigated the target site preferences of Agrobacterium T-DNA insertions in the chromosomes of the model plant Arabidopsis thaliana. The relative frequencies of insertions in genic and intergenic regions of the genome were calculated and DNA composition features associated with the insertion site flanking sequences were identified. Insertion frequencies across the genome indicate that T-strand integration is suppressed near centromeres and rDNA loci, progressively increases towards telomeres, and is highly correlated with gene density. At the gene level, T-DNA integration events show a statistically significant preference for insertion in the 5′ and 3′ flanking regions of protein coding sequences as well as the promoter region of RNA polymerase I transcribed rRNA gene repeats. The increased insertion frequencies in 5′ upstream regions compared to coding sequences are positively correlated with gene expression activity and DNA sequence composition. Analysis of the relationship between DNA sequence composition and gene activity further demonstrates that DNA sequences with high CG-skew ratios are consistently correlated with T-DNA insertion site preference and high gene expression. The results demonstrate genomic and gene-specific preferences for T-strand integration and suggest that DNA sequences with a pronounced transition in CG- and AT-skew ratios are preferred targets for T-DNA integration.
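The abstract refers to CG- and AT-skew ratios without defining them. The following minimal sketch shows one common way such ratios are computed over a sliding window. It assumes the definitions CG-skew = (C − G)/(C + G) and AT-skew = (A − T)/(A + T); the window size, the step, the function names `skew` and `skew_profile`, and the toy sequence are all hypothetical and are not taken from the paper.

```python
from typing import List, Tuple


def skew(seq: str, first: str, second: str) -> float:
    """Return (first - second) / (first + second) for base counts in seq.

    Assumed definition; returns 0.0 when neither base occurs.
    """
    seq = seq.upper()
    a, b = seq.count(first), seq.count(second)
    return (a - b) / (a + b) if (a + b) else 0.0


def skew_profile(seq: str, window: int = 50, step: int = 10) -> List[Tuple[int, float, float]]:
    """Slide a window along seq and return (start, CG-skew, AT-skew) per window."""
    profile = []
    for start in range(0, max(len(seq) - window, 0) + 1, step):
        chunk = seq[start:start + window]
        profile.append((start, skew(chunk, "C", "G"), skew(chunk, "A", "T")))
    return profile


if __name__ == "__main__":
    # Toy sequence (not real Arabidopsis data) with an abrupt change in composition.
    toy = "G" * 30 + "T" * 20 + "C" * 30 + "A" * 20
    for start, cg, at in skew_profile(toy, window=50, step=25):
        print(f"{start:4d}  CG-skew={cg:+.2f}  AT-skew={at:+.2f}")
```

In a profile like this, a pronounced transition in skew shows up as a sign change between neighbouring windows, which is the kind of feature the abstract associates with preferred insertion sites.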

Original language: English (US)
Pages (from-to): 240-253
Number of pages: 14
Journal: Functional and Integrative Genomics
Volume: 5
Issue number: 4
DOI: https://doi.org/10.1007/s10142-005-0138-1
State: Published - Oct 2005
Externally published: Yes

Fingerprint

Agrobacterium
Arabidopsis
Genetic Promoter Regions
Genes
Plant Chromosomes
Genome
3' Flanking Region
RNA Polymerase I
Gene Expression
Intergenic DNA
Centromere
5' Flanking Region
DNA
Telomere
Genomics
Ribosomal DNA
rRNA Genes
T-DNA
Mutation
Proteins

Keywords

  • Arabidopsis
  • Integration
  • Promoter
  • T-DNA

ASJC Scopus subject areas

  • Genetics

Cite this

Agrobacterium T-DNA integration in Arabidopsis is correlated with DNA sequence compositions that occur frequently in gene promoter regions. / Schneeberger, Richard G.; Zhang, Ke; Tatarinova, Tatiana; Troukhan, Max; Kwok, Shing F.; Drais, Josh; Klinger, Kevin; Orejudos, Francis; Macy, Kimberly; Bhakta, Amit; Burns, James; Subramanian, Gopal; Donson, Jonathan; Flavell, Richard; Feldmann, Kenneth A.

In: Functional and Integrative Genomics, Vol. 5, No. 4, 10.2005, p. 240-253.

Schneeberger, RG, Zhang, K, Tatarinova, T, Troukhan, M, Kwok, SF, Drais, J, Klinger, K, Orejudos, F, Macy, K, Bhakta, A, Burns, J, Subramanian, G, Donson, J, Flavell, R & Feldmann, KA 2005, 'Agrobacterium T-DNA integration in Arabidopsis is correlated with DNA sequence compositions that occur frequently in gene promoter regions', Functional and Integrative Genomics, vol. 5, no. 4, pp. 240-253. https://doi.org/10.1007/s10142-005-0138-1
@article{5bd6af4b00d84754b96dc58d78e276b8,
title = "Agrobacterium T-DNA integration in Arabidopsis is correlated with DNA sequence compositions that occur frequently in gene promoter regions",
keywords = "Arabidopsis, Integration, Promoter, T-DNA",
author = "Schneeberger, {Richard G.} and Ke Zhang and Tatiana Tatarinova and Max Troukhan and Kwok, {Shing F.} and Josh Drais and Kevin Klinger and Francis Orejudos and Kimberly Macy and Amit Bhakta and James Burns and Gopal Subramanian and Jonathan Donson and Richard Flavell and Feldmann, {Kenneth A}",
year = "2005",
month = "10",
doi = "10.1007/s10142-005-0138-1",
language = "English (US)",
volume = "5",
pages = "240--253",
journal = "Functional and Integrative Genomics",
issn = "1438-793X",
publisher = "Springer Verlag",
number = "4",

}

TY - JOUR
T1 - Agrobacterium T-DNA integration in Arabidopsis is correlated with DNA sequence compositions that occur frequently in gene promoter regions
AU - Schneeberger, Richard G.
AU - Zhang, Ke
AU - Tatarinova, Tatiana
AU - Troukhan, Max
AU - Kwok, Shing F.
AU - Drais, Josh
AU - Klinger, Kevin
AU - Orejudos, Francis
AU - Macy, Kimberly
AU - Bhakta, Amit
AU - Burns, James
AU - Subramanian, Gopal
AU - Donson, Jonathan
AU - Flavell, Richard
AU - Feldmann, Kenneth A
PY - 2005/10
Y1 - 2005/10
KW - Arabidopsis
KW - Integration
KW - Promoter
KW - T-DNA
UR - http://www.scopus.com/inward/record.url?scp=26444537931&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=26444537931&partnerID=8YFLogxK
U2 - 10.1007/s10142-005-0138-1
DO - 10.1007/s10142-005-0138-1
M3 - Article
VL - 5
SP - 240
EP - 253
JO - Functional and Integrative Genomics
JF - Functional and Integrative Genomics
SN - 1438-793X
IS - 4
ER -