Evaluating single-subject study methods for personal transcriptomic interpretations to advance precision medicine

Samir Rachid Zaim, Colleen Kenost, Joanne Berghout, Francesca Vitali, Hao Zhang, Yves A Lussier

Research output: Contribution to journal › Article

Abstract

Background: Gene expression profiling has benefited medicine by providing clinically relevant insights at the molecular candidate and systems levels. However, to adopt a more 'precision' approach that integrates individual variability including 'omics data into risk assessments, diagnoses, and therapeutic decision making, whole transcriptome expression needs to be interpreted meaningfully for single subjects. We propose an "all-against-one" framework that uses biological replicates in isogenic conditions for testing differentially expressed genes (DEGs) in a single subject (ss) in the absence of an appropriate external reference standard or replicates. To evaluate our proposed "all-against-one" framework, we constructed reference standards (RSs) with five conventional replicate-anchored analyses (NOISeq, DEGseq, edgeR, DESeq, DESeq2); the remainder were treated separately as single-subject sample pairs for ss analyses (without replicates).

Results: Eight ss methods (NOISeq, DEGseq, edgeR, mixture model, DESeq, DESeq2, iDEG, and ensemble) for identifying genes with differential expression were compared in Yeast (parental line versus snf2 deletion mutant; n = 42/condition) and an MCF7 breast-cancer cell line (baseline versus stimulated with estradiol; n = 7/condition). Receiver operating characteristic (ROC) and precision-recall plots were determined for the eight ss methods against each of the five RSs in both datasets. Consistent with prior analyses of these data, ~ 50% and ~ 15% DEGs were obtained in the Yeast and MCF7 datasets respectively, regardless of the RS method. NOISeq, edgeR, and DESeq were the most concordant for creating an RS. Single-subject versions of NOISeq, DEGseq, and an ensemble learner achieved the best median ROC area under the curve when comparing two transcriptomes without replicates, regardless of the RS method and dataset (> 90% in Yeast, > 0.75 in MCF7). Further, different single-subject methods performed better depending on the proportion of DEGs.

Conclusions: The "all-against-one" framework provides an honest evaluation framework for single-subject DEG studies since these methods are evaluated, by design, against reference standards produced by unrelated DEG methods. The ss-ensemble method was the only one to reliably produce higher accuracies in all conditions tested in this conservative evaluation framework. However, single-subject methods for identifying DEGs from paired samples need improvement, as no method performed with precision > 90% and obtained moderate levels of recall. http://www.lussiergroup.org/publications/EnsembleBiomarker
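As a rough illustration of the evaluation described in the abstract, the sketch below scores one single-subject method's per-gene output against a reference standard using precision, recall, and ROC area under the curve. It is not the authors' code: the function names, the 0.5 calling threshold, and the eight-gene toy data are all hypothetical and chosen only to show how the metrics relate.

```python
from typing import Sequence, Tuple


def precision_recall(calls: Sequence[bool], reference: Sequence[bool]) -> Tuple[float, float]:
    """Precision and recall of binary DEG calls against a reference standard."""
    tp = sum(1 for c, r in zip(calls, reference) if c and r)
    fp = sum(1 for c, r in zip(calls, reference) if c and not r)
    fn = sum(1 for c, r in zip(calls, reference) if not c and r)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def roc_auc(scores: Sequence[float], reference: Sequence[bool]) -> float:
    """ROC AUC as the probability that a reference DEG outscores a non-DEG.

    Uses the pairwise (Mann-Whitney) formulation; ties count as one half.
    """
    pos = [s for s, r in zip(scores, reference) if r]
    neg = [s for s, r in zip(scores, reference) if not r]
    if not pos or not neg:
        raise ValueError("reference standard needs both DEGs and non-DEGs")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: eight genes, three of which the reference standard
# (built from replicates) labels as differentially expressed.
reference = [True, True, True, False, False, False, False, False]
# Hypothetical per-gene scores from one single-subject method
# (higher means more evidence of differential expression).
scores = [0.95, 0.80, 0.40, 0.60, 0.30, 0.20, 0.10, 0.05]
# Hypothetical calling threshold of 0.5.
calls = [s >= 0.5 for s in scores]

precision, recall = precision_recall(calls, reference)
auc = roc_auc(scores, reference)
print(f"precision={precision:.3f} recall={recall:.3f} auc={auc:.3f}")
```

In the study's design this comparison would be repeated for each single-subject method against each of the five reference standards, so that a method is never judged solely by a reference standard derived from its own replicate-based counterpart.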

Original language: English (US)
Article number: 96
Journal: BMC Medical Genomics
Volume: 12
DOI: 10.1186/s12920-019-0513-8
State: Published - Jul 11 2019

Fingerprint

Precision Medicine
Genes
Yeasts
Transcriptome
Gene Expression Profiling
Area Under Curve
Publications
Estradiol
Decision Making
Medicine
Breast Neoplasms
Cell Line

Keywords

  • Genomic medicine
  • Medical genomics
  • N-of-1
  • N-of-1 studies
  • Precision medicine
  • Single-subject studies
  • Transcriptome

ASJC Scopus subject areas

  • Genetics
  • Genetics(clinical)

Cite this

Evaluating single-subject study methods for personal transcriptomic interpretations to advance precision medicine. / Rachid Zaim, Samir; Kenost, Colleen; Berghout, Joanne; Vitali, Francesca; Zhang, Hao; Lussier, Yves A.

In: BMC Medical Genomics, Vol. 12, 96, 11.07.2019.

@article{aeda845eba9d4f5ca6ed8d7ddbb709f4,
title = "Evaluating single-subject study methods for personal transcriptomic interpretations to advance precision medicine",
keywords = "Genomic medicine, Medical genomics, N-of-1, N-of-1 studies, Precision medicine, Single-subject studies, Transcriptome",
author = "{Rachid Zaim}, Samir and Colleen Kenost and Joanne Berghout and Francesca Vitali and Hao Zhang and Lussier, {Yves A}",
year = "2019",
month = "7",
day = "11",
doi = "10.1186/s12920-019-0513-8",
language = "English (US)",
volume = "12",
journal = "BMC Medical Genomics",
issn = "1755-8794",
publisher = "BioMed Central",

}

TY - JOUR

T1 - Evaluating single-subject study methods for personal transcriptomic interpretations to advance precision medicine

AU - Rachid Zaim, Samir

AU - Kenost, Colleen

AU - Berghout, Joanne

AU - Vitali, Francesca

AU - Zhang, Hao

AU - Lussier, Yves A

PY - 2019/7/11

Y1 - 2019/7/11

KW - Genomic medicine

KW - Medical genomics

KW - N-of-1

KW - N-of-1 studies

KW - Precision medicine

KW - Single-subject studies

KW - Transcriptome

UR - http://www.scopus.com/inward/record.url?scp=85069431597&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85069431597&partnerID=8YFLogxK

U2 - 10.1186/s12920-019-0513-8

DO - 10.1186/s12920-019-0513-8

M3 - Article

C2 - 31296218

AN - SCOPUS:85069431597

VL - 12

JO - BMC Medical Genomics

JF - BMC Medical Genomics

SN - 1755-8794

M1 - 96

ER -