The First Plant Genome Sequence-Arabidopsis thaliana

Research output: Contribution to journal (Article)

2 Citations (Scopus)

Abstract

The Arabidopsis thaliana genome was the first plant genome to be sequenced. The substrates for sequencing consisted of a minimum tiling path of BAC, P1, YAC, TAC and cosmid clones, anchored to the genetic map. Using these substrates, 10 contigs were developed from 1569 clones. Annotation at the time the sequence was finished identified 25,498 protein-coding genes. With the continued development of software trained on Arabidopsis genes, along with the availability of large numbers of ESTs and additional plant genome sequences, the number of annotated genes has increased. The final TAIR (TAIR10) genome annotation release contains 27,202 nuclear protein-coding genes, 4827 pseudogenes and transposable element genes and 1359 noncoding RNAs. Gene density (kb/gene) is 4.35, with 5.89 exons/gene, an average exon length of 296 nt and an average intron length of 165 nt. Gene density decreases and transposon density increases near the centromeres. Multiple splice variants have been identified for >60% of intron-containing genes. Arabidopsis has experienced a genome triplication and two duplication events during its evolution, giving rise to multiple segmental duplications. These polyploidizations, along with tandem and dispersed single-gene duplications, have contributed to the expansion of gene families and provided raw material for functional divergence.

Original language: English (US)
Pages (from-to): 91-117
Number of pages: 27
Journal: Advances in Botanical Research
Volume: 69
DOI: 10.1016/B978-0-12-417163-3.00004-4
State: Published - 2014

Keywords

  • Arabidopsis thaliana
  • cDNAs
  • Gene number
  • Genome sequence
  • Protein families
  • Segmental duplications
  • Tandem duplications

ASJC Scopus subject areas

  • Plant Science

Cite this

The First Plant Genome Sequence-Arabidopsis thaliana. / Feldmann, Kenneth A; Goff, Stephen A.

In: Advances in Botanical Research, Vol. 69, 2014, p. 91-117.

@article{3669dd2db66f42ae8052219595a69358,
title = "The First Plant Genome Sequence-Arabidopsis thaliana",
abstract = "The Arabidopsis thaliana genome was the first plant genome to be sequenced. The substrates for sequencing consisted of a minimum tiling path of BAC, P1, YAC, TAC and cosmid clones, anchored to the genetic map. Using these substrates, 10 contigs were developed from 1569 clones. Annotation at the time the sequence was finished identified 25,498 protein-coding genes. With the continued development of software trained on Arabidopsis genes, along with the availability of large numbers of ESTs and additional plant genome sequences, the number of annotated genes has increased. The final TAIR (TAIR10) genome annotation release contains 27,202 nuclear protein-coding genes, 4827 pseudogenes and transposable element genes and 1359 noncoding RNAs. Gene density (kb/gene) is 4.35, with 5.89 exons/gene, an average exon length of 296 nt and an average intron length of 165 nt. Gene density decreases and transposon density increases near the centromeres. Multiple splice variants have been identified for >60{\%} of intron-containing genes. Arabidopsis has experienced a genome triplication and two duplication events during its evolution, giving rise to multiple segmental duplications. These polyploidizations, along with tandem and dispersed single-gene duplications, have contributed to the expansion of gene families and provided raw material for functional divergence.",
keywords = "Arabidopsis thaliana, cDNAs, Gene number, Genome sequence, Protein families, Segmental duplications, Tandem duplications",
author = "Feldmann, {Kenneth A} and Goff, {Stephen A}",
year = "2014",
doi = "10.1016/B978-0-12-417163-3.00004-4",
language = "English (US)",
volume = "69",
pages = "91--117",
journal = "Advances in Botanical Research",
issn = "0065-2296",
publisher = "Academic Press Inc.",

}
