Testing in microbiome-profiling studies with MiRKAT, the microbiome regression-based kernel association test

Ni Zhao, Jun Chen, Ian M. Carroll, Tamar Ringel-Kulka, Michael P. Epstein, Hua Zhou, Jin Zhou, Yehuda Ringel, Hongzhe Li, Michael C. Wu

Research output: Contribution to journal › Article

67 Citations (Scopus)

Abstract

High-throughput sequencing technology has enabled population-based studies of the role of the human microbiome in disease etiology and exposure response. Distance-based analysis is a popular strategy for evaluating the overall association between microbiome diversity and outcome, wherein the phylogenetic distance between individuals' microbiome profiles is computed and tested for association via permutation. Despite their practical popularity, distance-based approaches suffer from important challenges, especially in selecting the best distance and extending the methods to alternative outcomes, such as survival outcomes. We propose the microbiome regression-based kernel association test (MiRKAT), which directly regresses the outcome on the microbiome profiles via the semi-parametric kernel machine regression framework. MiRKAT allows for easy covariate adjustment and extension to alternative outcomes while non-parametrically modeling the microbiome through a kernel that incorporates phylogenetic distance. It uses a variance-component score statistic to test for the association with analytical p value calculation. The model also allows simultaneous examination of multiple distances, alleviating the problem of choosing the best distance. Our simulations demonstrated that MiRKAT provides correctly controlled type I error and adequate power in detecting overall association. "Optimal" MiRKAT, which considers multiple candidate distances, is robust in that it suffers from little power loss in comparison to when the best distance is used and can achieve tremendous power gain in comparison to when a poor distance is chosen. Finally, we applied MiRKAT to real microbiome datasets to show that microbial communities are associated with smoking and with fecal protease levels after confounders are controlled for.
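
The pipeline described in the abstract (turn a pairwise distance matrix into a kernel, regress the outcome on covariates, then test the residuals against the kernel with a variance-component score statistic) can be illustrated with a minimal sketch for a continuous outcome. It is not the authors' implementation. The function names are hypothetical, the eigenvalue clipping is just one possible way to force a positive semi-definite kernel, and a residual permutation stands in for the analytical p value calculation that the abstract mentions.

```python
import numpy as np


def distance_to_kernel(D):
    """Turn an n x n distance matrix into a kernel by Gower centering.

    K = -1/2 * J D^2 J with J = I - 11'/n. Negative eigenvalues are clipped
    to zero so that K is positive semi-definite (an assumption of this sketch).
    """
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    K = -0.5 * J @ (D ** 2) @ J
    w, U = np.linalg.eigh((K + K.T) / 2)
    return (U * np.clip(w, 0, None)) @ U.T


def score_statistic(y, X, K):
    """Return Q = r' K r and r, the residuals of y regressed on [1, X]."""
    Z = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    r = y - Z @ beta
    return float(r @ K @ r), r


def permutation_pvalue(y, X, K, n_perm=999, seed=0):
    """Approximate p value obtained by permuting the null-model residuals.

    This replaces the analytical p value of the paper with a simpler,
    approximate permutation scheme (an assumption of this sketch).
    """
    rng = np.random.default_rng(seed)
    q_obs, r = score_statistic(y, X, K)
    hits = 0
    for _ in range(n_perm):
        rp = rng.permutation(r)
        if rp @ K @ rp >= q_obs:
            hits += 1
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    profiles = rng.normal(size=(30, 4))   # toy stand-in for microbiome profiles
    D = np.linalg.norm(profiles[:, None, :] - profiles[None, :, :], axis=2)
    covariate = rng.normal(size=30)       # toy confounder
    outcome = 2.0 * profiles[:, 0] + rng.normal(size=30)
    print(permutation_pvalue(outcome, covariate, distance_to_kernel(D)))
```

In practice `D` would be a UniFrac or Bray-Curtis distance matrix computed from the sequencing data. Euclidean distance on simulated profiles is used here only to keep the example self-contained.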

Original language: English (US)
Pages (from-to): 797-807
Number of pages: 11
Journal: American Journal of Human Genetics
Volume: 96
Issue number: 5
DOI: https://doi.org/10.1016/j.ajhg.2015.04.003
State: Published - May 7 2015

ASJC Scopus subject areas

  • Genetics
  • Genetics (clinical)

Cite this

Testing in microbiome-profiling studies with MiRKAT, the microbiome regression-based kernel association test. / Zhao, Ni; Chen, Jun; Carroll, Ian M.; Ringel-Kulka, Tamar; Epstein, Michael P.; Zhou, Hua; Zhou, Jin; Ringel, Yehuda; Li, Hongzhe; Wu, Michael C.

In: American Journal of Human Genetics, Vol. 96, No. 5, 07.05.2015, p. 797-807.

Zhao, N, Chen, J, Carroll, IM, Ringel-Kulka, T, Epstein, MP, Zhou, H, Zhou, J, Ringel, Y, Li, H & Wu, MC 2015, 'Testing in microbiome-profiling studies with MiRKAT, the microbiome regression-based kernel association test', American Journal of Human Genetics, vol. 96, no. 5, pp. 797-807. https://doi.org/10.1016/j.ajhg.2015.04.003
Zhao, Ni ; Chen, Jun ; Carroll, Ian M. ; Ringel-Kulka, Tamar ; Epstein, Michael P. ; Zhou, Hua ; Zhou, Jin ; Ringel, Yehuda ; Li, Hongzhe ; Wu, Michael C. / Testing in microbiome-profiling studies with MiRKAT, the microbiome regression-based kernel association test. In: American Journal of Human Genetics. 2015 ; Vol. 96, No. 5. pp. 797-807.
@article{1de8e914f0eb47cebf11d3c34727b412,
title = "Testing in microbiome-profiling studies with MiRKAT, the microbiome regression-based kernel association test",
abstract = "High-throughput sequencing technology has enabled population-based studies of the role of the human microbiome in disease etiology and exposure response. Distance-based analysis is a popular strategy for evaluating the overall association between microbiome diversity and outcome, wherein the phylogenetic distance between individuals' microbiome profiles is computed and tested for association via permutation. Despite their practical popularity, distance-based approaches suffer from important challenges, especially in selecting the best distance and extending the methods to alternative outcomes, such as survival outcomes. We propose the microbiome regression-based kernel association test (MiRKAT), which directly regresses the outcome on the microbiome profiles via the semi-parametric kernel machine regression framework. MiRKAT allows for easy covariate adjustment and extension to alternative outcomes while non-parametrically modeling the microbiome through a kernel that incorporates phylogenetic distance. It uses a variance-component score statistic to test for the association with analytical p value calculation. The model also allows simultaneous examination of multiple distances, alleviating the problem of choosing the best distance. Our simulations demonstrated that MiRKAT provides correctly controlled type I error and adequate power in detecting overall association. {"}Optimal{"} MiRKAT, which considers multiple candidate distances, is robust in that it suffers from little power loss in comparison to when the best distance is used and can achieve tremendous power gain in comparison to when a poor distance is chosen. Finally, we applied MiRKAT to real microbiome datasets to show that microbial communities are associated with smoking and with fecal protease levels after confounders are controlled for.",
author = "Ni Zhao and Jun Chen and Carroll, {Ian M.} and Tamar Ringel-Kulka and Epstein, {Michael P.} and Hua Zhou and Jin Zhou and Yehuda Ringel and Hongzhe Li and Wu, {Michael C.}",
year = "2015",
month = "5",
day = "7",
doi = "10.1016/j.ajhg.2015.04.003",
language = "English (US)",
volume = "96",
pages = "797--807",
journal = "American Journal of Human Genetics",
issn = "0002-9297",
publisher = "Cell Press",
number = "5",

}

TY - JOUR

T1 - Testing in microbiome-profiling studies with MiRKAT, the microbiome regression-based kernel association test

AU - Zhao, Ni

AU - Chen, Jun

AU - Carroll, Ian M.

AU - Ringel-Kulka, Tamar

AU - Epstein, Michael P.

AU - Zhou, Hua

AU - Zhou, Jin

AU - Ringel, Yehuda

AU - Li, Hongzhe

AU - Wu, Michael C.

PY - 2015/5/7

Y1 - 2015/5/7

AB - High-throughput sequencing technology has enabled population-based studies of the role of the human microbiome in disease etiology and exposure response. Distance-based analysis is a popular strategy for evaluating the overall association between microbiome diversity and outcome, wherein the phylogenetic distance between individuals' microbiome profiles is computed and tested for association via permutation. Despite their practical popularity, distance-based approaches suffer from important challenges, especially in selecting the best distance and extending the methods to alternative outcomes, such as survival outcomes. We propose the microbiome regression-based kernel association test (MiRKAT), which directly regresses the outcome on the microbiome profiles via the semi-parametric kernel machine regression framework. MiRKAT allows for easy covariate adjustment and extension to alternative outcomes while non-parametrically modeling the microbiome through a kernel that incorporates phylogenetic distance. It uses a variance-component score statistic to test for the association with analytical p value calculation. The model also allows simultaneous examination of multiple distances, alleviating the problem of choosing the best distance. Our simulations demonstrated that MiRKAT provides correctly controlled type I error and adequate power in detecting overall association. "Optimal" MiRKAT, which considers multiple candidate distances, is robust in that it suffers from little power loss in comparison to when the best distance is used and can achieve tremendous power gain in comparison to when a poor distance is chosen. Finally, we applied MiRKAT to real microbiome datasets to show that microbial communities are associated with smoking and with fecal protease levels after confounders are controlled for.

UR - http://www.scopus.com/inward/record.url?scp=84929159912&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84929159912&partnerID=8YFLogxK

U2 - 10.1016/j.ajhg.2015.04.003

DO - 10.1016/j.ajhg.2015.04.003

M3 - Article

C2 - 25957468

AN - SCOPUS:84929159912

VL - 96

SP - 797

EP - 807

JO - American Journal of Human Genetics

JF - American Journal of Human Genetics

SN - 0002-9297

IS - 5

ER -