Statistical approach of functional profiling for a microbial community

Lingling An, Nauromal Pookhao, Hongmei Jiang, Jiannong Xu

Research output: Contribution to journal (Article)

1 Citation (Scopus)

Abstract

BACKGROUND: Metagenomics is a relatively new but fast-growing field within environmental biology and medical sciences. It enables researchers to understand the diversity of microbes, their functions, cooperation, and evolution in a particular ecosystem. Traditional methods in genomics and microbiology are not efficient at capturing the structure of the microbial community in an environment. High-throughput next-generation sequencing technologies now drive metagenomic studies. However, there is an urgent need for efficient statistical methods and computational algorithms that can rapidly analyze massive metagenomic short-read sequencing data and accurately detect the features and functions present in the microbial community. Although several issues concerning the functions of metagenomes at the pathway or subsystem level have been investigated, few studies focus on functional analysis at a low level of a hierarchical functional tree, such as the SEED subsystem tree.

RESULTS: A two-step statistical procedure (metaFunction) is proposed to detect all possible functional roles at the low level from a metagenomic sample or community. In the first step, a statistical mixture model is proposed at the base of gene codons to estimate the abundances of the candidate functional roles, taking sequencing error into account. Because a gene can be involved in multiple biological processes, the functional assignment is adjusted in the second step by using an error distribution. The performance of the proposed procedure is evaluated through comprehensive simulation studies. Compared with existing methods for metagenomic functional analysis, the new approach assigns reads to functional roles more accurately, and is therefore also more accurate at more general levels. The method is also applied to two real data sets.

CONCLUSIONS: metaFunction is a powerful tool for accurately profiling functions in a metagenomic sample.
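The first step described in the abstract estimates functional-role abundances with a mixture model. As a rough illustration of that general idea only, the sketch below runs a textbook expectation-maximization (EM) loop for the mixing proportions of a finite mixture whose per-read likelihoods are already known. It is not the metaFunction implementation: the function name `estimate_role_abundances`, the likelihood matrix, and the toy data are all hypothetical, and neither the codon-level model, the sequencing-error term, nor the second-step adjustment is modeled here.

```python
import numpy as np


def estimate_role_abundances(likelihood, n_iter=200, tol=1e-10):
    """EM estimate of mixing proportions for a mixture with fixed likelihoods.

    likelihood[i, k] is an assumed (hypothetical) probability of observing
    read i given that it originates from functional role k.
    Returns (abundances, responsibilities).
    """
    likelihood = np.asarray(likelihood, dtype=float)
    if np.any(likelihood.sum(axis=1) <= 0):
        raise ValueError("every read needs a positive likelihood for some role")

    n_reads, n_roles = likelihood.shape
    pi = np.full(n_roles, 1.0 / n_roles)  # start from uniform abundances

    for _ in range(n_iter):
        # E-step: posterior probability that read i belongs to role k.
        weighted = likelihood * pi
        resp = weighted / weighted.sum(axis=1, keepdims=True)
        # M-step: abundance of role k is its average responsibility.
        new_pi = resp.mean(axis=0)
        converged = np.abs(new_pi - pi).max() < tol
        pi = new_pi
        if converged:
            break

    return pi, resp


# Toy data: 3 reads match only role 0, 1 read matches only role 1,
# and 2 reads match both roles equally well.
toy_likelihood = np.array([
    [1.0, 0.0],
    [1.0, 0.0],
    [1.0, 0.0],
    [0.0, 1.0],
    [0.5, 0.5],
    [0.5, 0.5],
])

abundances, responsibilities = estimate_role_abundances(toy_likelihood)
print(abundances)
```

On this toy input the ambiguous reads are split in proportion to the evidence from the unambiguous ones, so the estimated abundances settle near 0.75 and 0.25 rather than the 0.67 and 0.33 that an even split of the ambiguous reads would give.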

Original language: English (US)
Pages (from-to): e106588
Journal: PLoS One
Volume: 9
Issue number: 9
DOIs: 10.1371/journal.pone.0106588
State: Published - 2014

Fingerprint

Metagenomics
microbial communities
Functional analysis
Genes
Microbiology
Metagenome
Ecosystems
Biological Phenomena
Statistical methods
medical sciences
Throughput
Statistical Models
Genomics
Codon
Ecosystem
microbiology
codons
Research Personnel
statistical analysis
Technology

ASJC Scopus subject areas

  • Medicine(all)

Cite this

Statistical approach of functional profiling for a microbial community. / An, Lingling; Pookhao, Nauromal; Jiang, Hongmei; Xu, Jiannong.

In: PLoS One, Vol. 9, No. 9, 2014, p. e106588.

An, Lingling ; Pookhao, Nauromal ; Jiang, Hongmei ; Xu, Jiannong. / Statistical approach of functional profiling for a microbial community. In: PLoS One. 2014 ; Vol. 9, No. 9. pp. e106588.
@article{366878cbb97f444388af2473d44d8e02,
title = "Statistical approach of functional profiling for a microbial community",
abstract = "BACKGROUND: Metagenomics is a relatively new but fast-growing field within environmental biology and medical sciences. It enables researchers to understand the diversity of microbes, their functions, cooperation, and evolution in a particular ecosystem. Traditional methods in genomics and microbiology are not efficient at capturing the structure of the microbial community in an environment. High-throughput next-generation sequencing technologies now drive metagenomic studies. However, there is an urgent need for efficient statistical methods and computational algorithms that can rapidly analyze massive metagenomic short-read sequencing data and accurately detect the features and functions present in the microbial community. Although several issues concerning the functions of metagenomes at the pathway or subsystem level have been investigated, few studies focus on functional analysis at a low level of a hierarchical functional tree, such as the SEED subsystem tree. RESULTS: A two-step statistical procedure (metaFunction) is proposed to detect all possible functional roles at the low level from a metagenomic sample or community. In the first step, a statistical mixture model is proposed at the base of gene codons to estimate the abundances of the candidate functional roles, taking sequencing error into account. Because a gene can be involved in multiple biological processes, the functional assignment is adjusted in the second step by using an error distribution. The performance of the proposed procedure is evaluated through comprehensive simulation studies. Compared with existing methods for metagenomic functional analysis, the new approach assigns reads to functional roles more accurately, and is therefore also more accurate at more general levels. The method is also applied to two real data sets. CONCLUSIONS: metaFunction is a powerful tool for accurately profiling functions in a metagenomic sample.",
author = "Lingling An and Nauromal Pookhao and Hongmei Jiang and Jiannong Xu",
year = "2014",
doi = "10.1371/journal.pone.0106588",
language = "English (US)",
volume = "9",
pages = "e106588",
journal = "PLoS One",
issn = "1932-6203",
publisher = "Public Library of Science",
number = "9",

}

TY - JOUR

T1 - Statistical approach of functional profiling for a microbial community

AU - An, Lingling

AU - Pookhao, Nauromal

AU - Jiang, Hongmei

AU - Xu, Jiannong

PY - 2014

Y1 - 2014

AB - BACKGROUND: Metagenomics is a relatively new but fast-growing field within environmental biology and medical sciences. It enables researchers to understand the diversity of microbes, their functions, cooperation, and evolution in a particular ecosystem. Traditional methods in genomics and microbiology are not efficient at capturing the structure of the microbial community in an environment. High-throughput next-generation sequencing technologies now drive metagenomic studies. However, there is an urgent need for efficient statistical methods and computational algorithms that can rapidly analyze massive metagenomic short-read sequencing data and accurately detect the features and functions present in the microbial community. Although several issues concerning the functions of metagenomes at the pathway or subsystem level have been investigated, few studies focus on functional analysis at a low level of a hierarchical functional tree, such as the SEED subsystem tree. RESULTS: A two-step statistical procedure (metaFunction) is proposed to detect all possible functional roles at the low level from a metagenomic sample or community. In the first step, a statistical mixture model is proposed at the base of gene codons to estimate the abundances of the candidate functional roles, taking sequencing error into account. Because a gene can be involved in multiple biological processes, the functional assignment is adjusted in the second step by using an error distribution. The performance of the proposed procedure is evaluated through comprehensive simulation studies. Compared with existing methods for metagenomic functional analysis, the new approach assigns reads to functional roles more accurately, and is therefore also more accurate at more general levels. The method is also applied to two real data sets. CONCLUSIONS: metaFunction is a powerful tool for accurately profiling functions in a metagenomic sample.

UR - http://www.scopus.com/inward/record.url?scp=84929942685&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84929942685&partnerID=8YFLogxK

U2 - 10.1371/journal.pone.0106588

DO - 10.1371/journal.pone.0106588

M3 - Article

C2 - 25198674

AN - SCOPUS:84929942685

VL - 9

SP - e106588

JO - PLoS One

JF - PLoS One

SN - 1932-6203

IS - 9

ER -