Extending assembly of short DNA sequences to handle error

William R. Jeck, Josephine A. Reinhardt, David A. Baltrus, Matthew T. Hickenbotham, Vincent Magrini, Elaine R. Mardis, Jeffery L. Dangl, Corbin D. Jones

Research output: Contribution to journal (Article)

169 Scopus citations

Abstract

Inexpensive de novo genome sequencing, particularly in organisms with small genomes, is now possible using several new sequencing technologies. Some of these technologies, such as Illumina's Solexa sequencing, produce high genomic coverage by generating a very large number of small reads (∼30 bp). While prior work shows that partial assembly can be performed by k-mer extension in error-free reads, this algorithm is unsuccessful with the sequencing error rates found in practice. We present VCAKE (Verified Consensus Assembly by K-mer Extension), a modification of simple k-mer extension that overcomes error by using high depth of coverage. Though it is a simple modification of a previous approach, we show significant improvements in assembly results on simulated and experimental datasets that include error.
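The idea of overcoming read errors through depth of coverage can be illustrated with a minimal sketch. This is not the VCAKE implementation, which the abstract does not describe in detail. It is a simplified greedy k-mer extension in which each next base is chosen by majority vote among the reads. The function names (`build_successor_votes`, `extend_seed`) and the thresholds (`min_votes`, `min_fraction`, `max_len`) are assumptions made for illustration only.

```python
from collections import Counter, defaultdict


def build_successor_votes(reads, k):
    """Count, for every k-mer seen in the reads, which base follows it."""
    votes = defaultdict(Counter)
    for read in reads:
        for i in range(len(read) - k):
            votes[read[i:i + k]][read[i + k]] += 1
    return votes


def extend_seed(seed, reads, k, min_votes=2, min_fraction=0.6, max_len=10_000):
    """Greedily extend `seed` to the right, one consensus base at a time.

    Illustrative thresholds (not taken from the paper): a base is accepted
    only if at least `min_votes` reads support it and it receives at least
    `min_fraction` of all votes cast for the current k-mer.
    """
    votes = build_successor_votes(reads, k)
    contig = seed
    while len(contig) < max_len:  # hard cap guards against cycling on repeats
        tally = votes.get(contig[-k:])
        if not tally:
            break  # no read continues this k-mer
        base, count = tally.most_common(1)[0]
        total = sum(tally.values())
        if count < min_votes or count / total < min_fraction:
            break  # too little support, or the vote is ambiguous
        contig += base
    return contig
```

With enough overlapping reads, a single erroneous read is outvoted at the position where it disagrees, so the extension follows the consensus base. With low coverage or a tied vote, the sketch stops extending rather than guessing.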

Original language: English (US)
Pages (from-to): 2942-2944
Number of pages: 3
Journal: Bioinformatics
Volume: 23
Issue number: 21
DOI: https://doi.org/10.1093/bioinformatics/btm451
State: Published - Nov 1 2007

ASJC Scopus subject areas

  • Statistics and Probability
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Computational Theory and Mathematics
  • Computational Mathematics

Cite this

Jeck, W. R., Reinhardt, J. A., Baltrus, D. A., Hickenbotham, M. T., Magrini, V., Mardis, E. R., Dangl, J. L., & Jones, C. D. (2007). Extending assembly of short DNA sequences to handle error. Bioinformatics, 23(21), 2942-2944. https://doi.org/10.1093/bioinformatics/btm451