An informative approach on differential abundance analysis for time-course metagenomic sequencing data

Dan Luo, Sara Ziebell, Lingling An

Research output: Research - peer-review › Article

Abstract

Motivation: The advent of high-throughput next generation sequencing technology has greatly promoted the field of metagenomics where previously unattainable information about microbial communities can be discovered. Detecting differentially abundant features (e.g. species or genes) plays a critical role in revealing the contributors (i.e. pathogens) to the biological or medical status of microbial samples. However, currently available statistical methods lack power in detecting differentially abundant features contrasting different biological or medical conditions, in particular, for time series metagenomic sequencing data. We have proposed a novel procedure, metaDprof, which is built upon a spline-based method assuming heterogeneous error, to meet the challenges of detecting differentially abundant features from metagenomic samples by comparing different biological/medical conditions across time. It contains two stages: (i) global detection on features and (ii) time interval detection for significant features. The detection procedures in both stages are based on sound statistical support.

Results: Compared with existing methods the new method metaDprof shows the best performance in comprehensive simulation studies. Not only can it accurately detect features relating to the biological condition or disease status of samples but it also can accurately detect the starting and ending time points when the differences arise. The proposed method is also applied to a real metagenomic dataset and the results provide an interesting angle to understand the relationship between the microbiota in mouse gut and diet type.
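The two-stage idea in the abstract (a global test per feature, then locating the time interval where the groups differ) can be illustrated with a deliberately simplified sketch. This is not the metaDprof procedure: it assumes per-time-point group means in place of the spline fit, a plain permutation test in place of the paper's statistical machinery, and a fixed threshold for the interval step. All function names and the toy data are hypothetical.

```python
import random
from statistics import mean


def mean_curve(samples):
    """Average abundance at each time point across the samples of one group."""
    return [mean(values) for values in zip(*samples)]


def area_between(group_a, group_b):
    """Sum of absolute gaps between the two group mean curves."""
    curve_a, curve_b = mean_curve(group_a), mean_curve(group_b)
    return sum(abs(a - b) for a, b in zip(curve_a, curve_b))


def global_test(group_a, group_b, n_perm=999, seed=0):
    """Stage 1 (illustrative): permutation p-value for any difference over time."""
    rng = random.Random(seed)
    observed = area_between(group_a, group_b)
    pooled = group_a + group_b
    hits = 0
    for _ in range(n_perm):
        shuffled = pooled[:]
        rng.shuffle(shuffled)  # reassign whole time courses to groups at random
        if area_between(shuffled[:len(group_a)], shuffled[len(group_a):]) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def differing_intervals(group_a, group_b, times, threshold):
    """Stage 2 (illustrative): runs of time points whose mean gap exceeds threshold."""
    curve_a, curve_b = mean_curve(group_a), mean_curve(group_b)
    gaps = [abs(a - b) for a, b in zip(curve_a, curve_b)]
    intervals, start = [], None
    for i, gap in enumerate(gaps):
        if gap > threshold and start is None:
            start = i  # a differing run begins here
        elif gap <= threshold and start is not None:
            intervals.append((times[start], times[i - 1]))
            start = None
    if start is not None:
        intervals.append((times[start], times[-1]))
    return intervals


# Toy data (made up): one feature, four samples per group, six time points.
times = [0, 1, 2, 3, 4, 5]
group_a = [
    [1.0, 1.1, 1.0, 0.9, 1.0, 1.0],
    [0.9, 1.0, 1.1, 1.0, 1.1, 1.0],
    [1.1, 0.9, 1.0, 1.1, 0.9, 1.1],
    [1.0, 1.0, 0.9, 1.0, 1.0, 0.9],
]
group_b = [
    [1.0, 1.0, 3.0, 3.1, 2.9, 1.0],
    [1.1, 0.9, 3.1, 3.0, 3.0, 1.1],
    [0.9, 1.1, 2.9, 3.0, 3.1, 0.9],
    [1.0, 1.0, 3.0, 2.9, 3.0, 1.0],
]

p_value = global_test(group_a, group_b)
intervals = differing_intervals(group_a, group_b, times, threshold=1.0)
print(p_value, intervals)
```

In this toy setting the second stage is only run on a feature that passes the first, mirroring the "global detection, then time interval detection" ordering described above; a real analysis would also need normalization and multiple-testing control across features, which this sketch omits.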

Language: English (US)
Pages: 1286-1292
Number of pages: 7
Journal: Bioinformatics
Volume: 33
Issue number: 9
DOI: 10.1093/bioinformatics/btw828
State: Published - May 1 2017

Fingerprint

Pathogens
Nutrition
Splines
Time series
Statistical methods
Genes
Throughput
Acoustic waves
Sequencing
Metagenomics
Statistical method
High Throughput
Spline
Mouse
Simulation Study
Gene
Angle
Interval
Community
Sound

ASJC Scopus subject areas

  • Statistics and Probability
  • Medicine(all)
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Computational Theory and Mathematics
  • Computational Mathematics

Cite this

An informative approach on differential abundance analysis for time-course metagenomic sequencing data. / Luo, Dan; Ziebell, Sara; An, Lingling.

In: Bioinformatics, Vol. 33, No. 9, 01.05.2017, p. 1286-1292.

@article{ca2bbe2d2ff24f2bb04fe239d4e0a7db,
title = "An informative approach on differential abundance analysis for time-course metagenomic sequencing data",
abstract = "Motivation: The advent of high-throughput next generation sequencing technology has greatly promoted the field of metagenomics where previously unattainable information about microbial communities can be discovered. Detecting differentially abundant features (e.g. species or genes) plays a critical role in revealing the contributors (i.e. pathogens) to the biological or medical status of microbial samples. However, currently available statistical methods lack power in detecting differentially abundant features contrasting different biological or medical conditions, in particular, for time series metagenomic sequencing data. We have proposed a novel procedure, metaDprof, which is built upon a spline-based method assuming heterogeneous error, to meet the challenges of detecting differentially abundant features from metagenomic samples by comparing different biological/medical conditions across time. It contains two stages: (i) global detection on features and (ii) time interval detection for significant features. The detection procedures in both stages are based on sound statistical support. Results: Compared with existing methods the new method metaDprof shows the best performance in comprehensive simulation studies. Not only can it accurately detect features relating to the biological condition or disease status of samples but it also can accurately detect the starting and ending time points when the differences arise. The proposed method is also applied to a real metagenomic dataset and the results provide an interesting angle to understand the relationship between the microbiota in mouse gut and diet type.",
author = "Dan Luo and Sara Ziebell and Lingling An",
year = "2017",
month = "5",
doi = "10.1093/bioinformatics/btw828",
volume = "33",
pages = "1286--1292",
journal = "Bioinformatics",
issn = "1367-4803",
publisher = "Oxford University Press",
number = "9"
}

TY - JOUR

T1 - An informative approach on differential abundance analysis for time-course metagenomic sequencing data

AU - Luo, Dan

AU - Ziebell, Sara

AU - An, Lingling

PY - 2017/5/1

Y1 - 2017/5/1

AB - Motivation: The advent of high-throughput next generation sequencing technology has greatly promoted the field of metagenomics where previously unattainable information about microbial communities can be discovered. Detecting differentially abundant features (e.g. species or genes) plays a critical role in revealing the contributors (i.e. pathogens) to the biological or medical status of microbial samples. However, currently available statistical methods lack power in detecting differentially abundant features contrasting different biological or medical conditions, in particular, for time series metagenomic sequencing data. We have proposed a novel procedure, metaDprof, which is built upon a spline-based method assuming heterogeneous error, to meet the challenges of detecting differentially abundant features from metagenomic samples by comparing different biological/medical conditions across time. It contains two stages: (i) global detection on features and (ii) time interval detection for significant features. The detection procedures in both stages are based on sound statistical support. Results: Compared with existing methods the new method metaDprof shows the best performance in comprehensive simulation studies. Not only can it accurately detect features relating to the biological condition or disease status of samples but it also can accurately detect the starting and ending time points when the differences arise. The proposed method is also applied to a real metagenomic dataset and the results provide an interesting angle to understand the relationship between the microbiota in mouse gut and diet type.

UR - http://www.scopus.com/inward/record.url?scp=85019690003&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85019690003&partnerID=8YFLogxK

U2 - 10.1093/bioinformatics/btw828

DO - 10.1093/bioinformatics/btw828

M3 - Article

VL - 33

SP - 1286

EP - 1292

JO - Bioinformatics

T2 - Bioinformatics

JF - Bioinformatics

SN - 1367-4803

IS - 9

ER -