Expression quantitative trait loci information improves predictive modeling of disease relevance of non-coding genetic variation

Damien C. Croteau-Chonka, Angela J. Rogers, Towfique Raj, Michael J. McGeachie, Weiliang Qiu, John P. Ziniti, Benjamin J. Stubbs, Liming Liang, Fernando Martinez, Robert C. Strunk, Robert F. Lemanske, Andrew H. Liu, Barbara E. Stranger, Vincent J. Carey, Benjamin A. Raby

Research output: Contribution to journal › Article

10 Citations (Scopus)

Abstract

Disease-associated loci identified through genome-wide association studies (GWAS) frequently localize to non-coding sequence. We and others have demonstrated strong enrichment of such single nucleotide polymorphisms (SNPs) for expression quantitative trait loci (eQTLs), supporting an important role for regulatory genetic variation in complex disease pathogenesis. Herein we describe our initial efforts to develop a predictive model of disease-associated variants leveraging eQTL information. We first catalogued cis-acting eQTLs (SNPs within 100 kb of target gene transcripts) by meta-analyzing four studies of three blood-derived tissues (n = 586). At a false discovery rate < 5%, we mapped eQTLs for 6,535 genes; these were enriched for disease-associated genes (P < 10⁻⁴), particularly those related to immune diseases and metabolic traits. Based on eQTL information and other variant annotations (distance from target gene transcript, minor allele frequency, and chromatin state), we created multivariate logistic regression models to predict SNP membership in reported GWAS. The complete model revealed independent contributions of specific annotations as strong predictors, including evidence for an eQTL (odds ratio (OR) = 1.2-2.0, P < 10⁻¹¹) and the chromatin states of active promoters, different classes of strong or weak enhancers, or transcriptionally active regions (OR = 1.5-2.3, P < 10⁻¹¹). This complete prediction model including eQTL association information ultimately allowed for better discrimination of SNPs with higher probabilities of GWAS membership (6.3-10.0%, compared to 3.5% for a random SNP) than the other two models excluding eQTL information. This eQTL-based prediction model of disease relevance can help systematically prioritize non-coding GWAS SNPs for further functional characterization.
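As a reading aid for the odds ratios quoted in the abstract, the following is a minimal sketch of how a logistic regression model turns per-annotation odds ratios into a predicted probability of GWAS membership. It is not the authors' fitted model. The function and variable names are hypothetical, the odds ratios are illustrative values picked from within the reported ranges, and treating the 3.5% figure for a random SNP as the model's baseline (all annotations absent) is a simplifying assumption.

```python
import math


def logit(p):
    """Convert a probability to log-odds."""
    return math.log(p / (1.0 - p))


def inv_logit(z):
    """Convert log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def predicted_probability(baseline_prob, odds_ratios, annotations):
    """Predicted probability under a logistic model.

    In logistic regression each coefficient is the log of an odds ratio,
    so every annotation that is present adds log(OR) to the log-odds.
    baseline_prob is the probability when no annotation is present
    (an assumption of this sketch, not a value fitted by the paper).
    """
    z = logit(baseline_prob)
    for name, present in annotations.items():
        if present:
            z += math.log(odds_ratios[name])
    return inv_logit(z)


# Illustrative inputs only: hypothetical annotation names, with ORs chosen
# from inside the ranges reported in the abstract.
BASELINE = 0.035
ODDS_RATIOS = {"eqtl": 2.0, "active_promoter": 2.3}

p_none = predicted_probability(BASELINE, ODDS_RATIOS, {})
p_eqtl = predicted_probability(BASELINE, ODDS_RATIOS, {"eqtl": True})
p_both = predicted_probability(
    BASELINE, ODDS_RATIOS, {"eqtl": True, "active_promoter": True}
)

print(f"no annotation:   {p_none:.3f}")
print(f"eQTL only:       {p_eqtl:.3f}")
print(f"eQTL + promoter: {p_both:.3f}")
```

Because odds ratios multiply on the odds scale rather than the probability scale, an OR of 2.0 slightly less than doubles a small baseline probability; the sketch makes that distinction explicit.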

Original language: English (US)
Article number: 140758
Journal: PLoS One
Volume: 10
Issue number: 10
DOIs: https://doi.org/10.1371/journal.pone.0140758
State: Published - Oct 16 2015

Fingerprint

Quantitative Trait Loci
quantitative trait loci
Genes
Polymorphism
genetic variation
single nucleotide polymorphism
Single Nucleotide Polymorphism
Nucleotides
Genome-Wide Association Study
Chromatin
odds ratio
chromatin
genes
Logistic Models
Odds Ratio
prediction
disease models
Immune System Diseases
Logistics
Gene Frequency

ASJC Scopus subject areas

  • Agricultural and Biological Sciences(all)
  • Biochemistry, Genetics and Molecular Biology(all)
  • Medicine(all)

Cite this

Croteau-Chonka, D. C., Rogers, A. J., Raj, T., McGeachie, M. J., Qiu, W., Ziniti, J. P., ... Raby, B. A. (2015). Expression quantitative trait loci information improves predictive modeling of disease relevance of non-coding genetic variation. PLoS One, 10(10), [140758]. https://doi.org/10.1371/journal.pone.0140758

@article{aa5b40659cf640fca61702834e1e932c,
title = "Expression quantitative trait loci information improves predictive modeling of disease relevance of non-coding genetic variation",
author = "Croteau-Chonka, {Damien C.} and Rogers, {Angela J.} and Towfique Raj and McGeachie, {Michael J.} and Weiliang Qiu and Ziniti, {John P.} and Stubbs, {Benjamin J.} and Liming Liang and Fernando Martinez and Strunk, {Robert C.} and Lemanske, {Robert F.} and Liu, {Andrew H.} and Stranger, {Barbara E.} and Carey, {Vincent J.} and Raby, {Benjamin A.}",
year = "2015",
month = "10",
day = "16",
doi = "10.1371/journal.pone.0140758",
language = "English (US)",
volume = "10",
journal = "PLoS One",
issn = "1932-6203",
publisher = "Public Library of Science",
number = "10",

}

TY - JOUR

T1 - Expression quantitative trait loci information improves predictive modeling of disease relevance of non-coding genetic variation

AU - Croteau-Chonka, Damien C.

AU - Rogers, Angela J.

AU - Raj, Towfique

AU - McGeachie, Michael J.

AU - Qiu, Weiliang

AU - Ziniti, John P.

AU - Stubbs, Benjamin J.

AU - Liang, Liming

AU - Martinez, Fernando

AU - Strunk, Robert C.

AU - Lemanske, Robert F.

AU - Liu, Andrew H.

AU - Stranger, Barbara E.

AU - Carey, Vincent J.

AU - Raby, Benjamin A.

PY - 2015/10/16

Y1 - 2015/10/16

UR - http://www.scopus.com/inward/record.url?scp=84948974979&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84948974979&partnerID=8YFLogxK

U2 - 10.1371/journal.pone.0140758

DO - 10.1371/journal.pone.0140758

M3 - Article

C2 - 26474488

AN - SCOPUS:84948974979

VL - 10

JO - PLoS One

JF - PLoS One

SN - 1932-6203

IS - 10

M1 - 140758

ER -