Comparative shotgun proteomics using spectral count data and quasi-likelihood modeling

Ming Li, William Gray, Haixia Zhang, Christine H. Chung, David D Billheimer, Wendell G. Yarbrough, Daniel C. Liebler, Yu Shyr, Robbert J C Slebos

Research output: Contribution to journal › Article

78 Citations (Scopus)

Abstract

Shotgun proteomics, based on liquid chromatography-tandem mass spectrometry (LC-MS/MS), provides the most powerful analytical platform for global inventory of complex proteomes and allows a global analysis of protein changes. Nevertheless, sampling of complex proteomes by current shotgun proteomics platforms is incomplete, and this contributes to variability in the assessment of peptide and protein inventories by spectral counting approaches. Thus, shotgun proteomics data pose challenges in comparing proteomes from different biological states. We developed an analysis strategy using quasi-likelihood Generalized Linear Modeling (GLM), included in a graphical interface software package (QuasiTel) that reads standard output from protein assemblies created by IDPicker, an HTML-based user interface to query shotgun proteomic data sets. This approach was compared to four other statistical analysis strategies: Student's t test, the Wilcoxon rank test, Fisher's exact test, and Poisson-based GLM. We analyzed the performance of these tests in identifying differences in protein levels based on spectral counts in a shotgun data set in which equimolar amounts of 48 human proteins were spiked at different levels into whole yeast lysates. Both GLM approaches and Fisher's exact test performed adequately, each with its own limitations. We subsequently compared the proteomes of normal tonsil epithelium and head and neck squamous cell carcinoma (HNSCC) using this approach and identified 86 proteins with differential spectral counts between the two tissue types. We selected 18 of these proteins for verification of protein levels between the individual normal and tumor tissues using liquid chromatography-multiple reaction monitoring mass spectrometry (LC-MRM-MS). This analysis confirmed the magnitude and direction of the protein expression differences in all 6 proteins for which reliable data could be obtained. Our analysis demonstrates that shotgun proteomic data sets from different tissue phenotypes are sufficiently rich in quantitative information and that statistically significant differences in protein spectral counts reflect the underlying biology of the samples.
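As a rough illustration of the quasi-likelihood idea named in the abstract, the sketch below compares one protein's spectral counts between two groups with a quasi-Poisson model. It is not QuasiTel's implementation: the function names, the use of total spectral counts per sample as the exposure, the Pearson-based dispersion estimate, and the example numbers are all assumptions made for illustration.

```python
import math


def poisson_deviance(counts, fitted):
    """Poisson deviance 2 * sum(y * log(y / mu) - (y - mu)), taking 0 * log(0) as 0."""
    total = 0.0
    for y, mu in zip(counts, fitted):
        log_term = y * math.log(y / mu) if y > 0 else 0.0
        total += 2.0 * (log_term - (y - mu))
    return total


def quasi_poisson_two_group(counts, totals, groups):
    """Quasi-Poisson comparison of one protein between groups 0 and 1.

    counts: spectral counts for the protein, one value per sample
    totals: total spectral counts per sample (assumed exposure/offset)
    groups: 0/1 group label per sample (assumes at least 3 samples overall)
    """
    n = len(counts)

    # Null model: a single rate shared by every sample.
    null_rate = sum(counts) / sum(totals)
    fitted_null = [null_rate * tot for tot in totals]

    # Two-group model: for a log-link Poisson model with an exposure offset,
    # the fitted rate in each group is (group counts) / (group exposure).
    group_rate = {}
    for g in (0, 1):
        group_count = sum(y for y, k in zip(counts, groups) if k == g)
        group_total = sum(tot for tot, k in zip(totals, groups) if k == g)
        if group_count == 0:
            # An empty or all-zero group gives a zero fitted mean, which this
            # simple sketch does not handle.
            raise ValueError("each group needs at least one nonzero count")
        group_rate[g] = group_count / group_total
    fitted_alt = [group_rate[k] * tot for tot, k in zip(totals, groups)]

    # Dispersion estimated from the Pearson statistic of the two-group model.
    pearson = sum((y - mu) ** 2 / mu for y, mu in zip(counts, fitted_alt))
    dispersion = pearson / (n - 2)

    # Quasi-likelihood F statistic: drop in deviance (1 df) scaled by dispersion.
    deviance_drop = poisson_deviance(counts, fitted_null) - poisson_deviance(counts, fitted_alt)
    f_stat = deviance_drop / dispersion

    return {
        "rate_ratio": group_rate[1] / group_rate[0],
        "dispersion": dispersion,
        "f_stat": f_stat,
        "df": (1, n - 2),
    }


# Hypothetical data: three normal (0) and three tumor (1) samples.
result = quasi_poisson_two_group(
    counts=[10, 12, 8, 25, 30, 20],
    totals=[1000, 1000, 1000, 1000, 1000, 1000],
    groups=[0, 0, 0, 1, 1, 1],
)
print(result)
```

The F statistic would be referred to an F distribution with the returned degrees of freedom to obtain a p-value. A plain Poisson GLM corresponds to fixing the dispersion at 1, so the two GLM approaches in the abstract differ in how they treat extra-Poisson variability between replicates.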

Original language: English (US)
Pages (from-to): 4295-4305
Number of pages: 11
Journal: Journal of Proteome Research
Volume: 9
Issue number: 8
DOI: https://doi.org/10.1021/pr100527g
State: Published - Aug 6 2010

Fingerprint

Proteomics
Proteins
Proteome
Palatine Tonsil
Liquid Chromatography
Mass Spectrometry
Epithelium
Tissue
Equipment and Supplies
HTML
Nonparametric Statistics
Tandem Mass Spectrometry
Software packages
Yeast
User interfaces
Interfaces (computer)
Tumors

Keywords

  • Generalized Linear Model
  • head and neck carcinoma
  • LC-MS/MS
  • multiple reaction monitoring (MRM)
  • shotgun proteomics
  • spectral counting

ASJC Scopus subject areas

  • Biochemistry
  • Chemistry (all)

Cite this

Li, M., Gray, W., Zhang, H., Chung, C. H., Billheimer, D. D., Yarbrough, W. G., ... Slebos, R. J. C. (2010). Comparative shotgun proteomics using spectral count data and quasi-likelihood modeling. Journal of Proteome Research, 9(8), 4295-4305. https://doi.org/10.1021/pr100527g

Comparative shotgun proteomics using spectral count data and quasi-likelihood modeling. / Li, Ming; Gray, William; Zhang, Haixia; Chung, Christine H.; Billheimer, David D; Yarbrough, Wendell G.; Liebler, Daniel C.; Shyr, Yu; Slebos, Robbert J C.

In: Journal of Proteome Research, Vol. 9, No. 8, 06.08.2010, p. 4295-4305.

Li, M, Gray, W, Zhang, H, Chung, CH, Billheimer, DD, Yarbrough, WG, Liebler, DC, Shyr, Y & Slebos, RJC 2010, 'Comparative shotgun proteomics using spectral count data and quasi-likelihood modeling', Journal of Proteome Research, vol. 9, no. 8, pp. 4295-4305. https://doi.org/10.1021/pr100527g
Li, Ming ; Gray, William ; Zhang, Haixia ; Chung, Christine H. ; Billheimer, David D ; Yarbrough, Wendell G. ; Liebler, Daniel C. ; Shyr, Yu ; Slebos, Robbert J C. / Comparative shotgun proteomics using spectral count data and quasi-likelihood modeling. In: Journal of Proteome Research. 2010 ; Vol. 9, No. 8. pp. 4295-4305.
@article{3a52d0d29f8e4e4bb209ec296cf460e9,
title = "Comparative shotgun proteomics using spectral count data and quasi-likelihood modeling",
keywords = "Generalized Linear Model, head and neck carcinoma, LC-MS/MS, multiple reaction monitoring (MRM), shotgun proteomics, spectral counting",
author = "Ming Li and William Gray and Haixia Zhang and Chung, {Christine H.} and Billheimer, {David D} and Yarbrough, {Wendell G.} and Liebler, {Daniel C.} and Yu Shyr and Slebos, {Robbert J C}",
year = "2010",
month = "8",
day = "6",
doi = "10.1021/pr100527g",
language = "English (US)",
volume = "9",
pages = "4295--4305",
journal = "Journal of Proteome Research",
issn = "1535-3893",
publisher = "American Chemical Society",
number = "8",

}

TY - JOUR

T1 - Comparative shotgun proteomics using spectral count data and quasi-likelihood modeling

AU - Li, Ming

AU - Gray, William

AU - Zhang, Haixia

AU - Chung, Christine H.

AU - Billheimer, David D

AU - Yarbrough, Wendell G.

AU - Liebler, Daniel C.

AU - Shyr, Yu

AU - Slebos, Robbert J C

PY - 2010/8/6

Y1 - 2010/8/6

KW - Generalized Linear Model

KW - head and neck carcinoma

KW - LC-MS/MS

KW - multiple reaction monitoring (MRM)

KW - shotgun proteomics

KW - spectral counting

UR - http://www.scopus.com/inward/record.url?scp=77955458799&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=77955458799&partnerID=8YFLogxK

U2 - 10.1021/pr100527g

DO - 10.1021/pr100527g

M3 - Article

C2 - 20586475

AN - SCOPUS:77955458799

VL - 9

SP - 4295

EP - 4305

JO - Journal of Proteome Research

JF - Journal of Proteome Research

SN - 1535-3893

IS - 8

ER -