JUMPn: A Streamlined Application for Protein Co-Expression Clustering and Network Analysis in Proteomics

Published: October 19, 2021
doi: 10.3791/62796

Summary

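We present JUMPn, a systems biology tool for performing and visualizing network analysis of quantitative proteomics data. The detailed protocol covers data pre-processing, co-expression clustering, pathway enrichment, and protein-protein interaction network analysis.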

Abstract

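With recent advances in mass spectrometry-based proteomics technologies, deep profiling of hundreds of proteomes has become increasingly feasible. However, deriving biological insights from such valuable datasets is challenging. Here we introduce JUMPn, a systems biology-based software, and its associated protocol for organizing the proteome into protein co-expression clusters across samples and protein-protein interaction (PPI) networks connected by modules (e.g., protein complexes). Built on the R/Shiny platform, the JUMPn software streamlines co-expression clustering, pathway enrichment, and PPI module detection through integrated data visualization and a user-friendly interface. The main steps of the protocol are installing the JUMPn software, defining the differentially expressed proteins or the (dys)regulated proteome, determining meaningful co-expression clusters and PPI modules, and visualizing the results. Although the protocol is demonstrated with an isobaric labeling-based proteome profile, JUMPn is generally applicable to a wide range of quantitative datasets (e.g., label-free proteomics). The JUMPn software and protocol thus provide a powerful tool for biological interpretation in quantitative proteomics.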

Introduction

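Mass spectrometry-based shotgun proteomics has become the key approach for analyzing proteome diversity in complex samples1. With recent advances in mass spectrometry instrumentation2,3, chromatography4,5, ion mobility detection6, acquisition methods (data-independent7 and data-dependent acquisition8), quantification approaches (multiplexed isobaric peptide labeling, e.g., TMT9,10, and label-free quantification11,12), and data analysis strategies and software development13,14,15,16,17,18, quantification of the whole proteome (e.g., over 10,000 proteins) is now routine19,20,21. However, gaining mechanistic insights from such deep quantitative datasets remains a challenge22. Initial attempts to investigate these datasets relied mainly on annotating individual elements of the data, treating each component (protein) independently. Yet biological systems and their behavior cannot be explained solely by examining individual components23. A systems approach that places the quantified biomolecules in the context of interaction networks is therefore essential for understanding complex systems and their associated processes, such as embryogenesis, the immune response, and the pathogenesis of human diseases24.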

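Network-based systems biology has emerged as a powerful paradigm for analyzing large-scale quantitative proteomics data25,26,27,28,29,30,31,32,33. Conceptually, a complex system such as a mammalian cell can be modeled as a hierarchical network34,35, in which the whole system is represented in tiers: first by a number of large components, each of which is then modeled iteratively by smaller subsystems. Technically, the structure of proteome dynamics can be presented as interconnected networks of co-expressed protein clusters (because co-expressed genes/proteins often share similar biological functions or regulatory mechanisms36) and physically interacting PPI modules37. As a recent example25, we generated temporal profiles of the whole proteome and phosphoproteome during T cell activation and used integrative co-expression networks with PPIs to identify functional modules that mediate T cell quiescence exit. Multiple bioenergetics-related modules were highlighted and experimentally validated (e.g., the mitochondrial and complex IV modules25, and the one-carbon module38). In another example26, we extended our approach to study the pathogenesis of Alzheimer's disease and successfully prioritized protein modules and molecules associated with disease progression. Importantly, many of our unbiased discoveries were validated by independent patient cohorts26,29 and/or disease mouse models26. These examples illustrate the power of the systems biology approach for dissecting molecular mechanisms through quantitative proteomics and integration with other omics data.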
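To make the idea of PPI modules within a co-expression cluster more concrete, the following Python sketch overlays one cluster on a PPI edge list and reports each connected group of interacting cluster members as a candidate module. It is a minimal illustration and not JUMPn's implementation: the function name `ppi_modules`, the `min_size` default, the toy protein names, and the use of plain connected components are all assumptions made for this example.

```python
from collections import defaultdict


def ppi_modules(cluster, ppi_edges, min_size=2):
    """Return connected groups of cluster members linked by PPI edges.

    cluster   -- iterable of protein names in one co-expression cluster
    ppi_edges -- iterable of (protein_a, protein_b) interaction pairs
    min_size  -- smallest group reported as a module (hypothetical default)
    """
    members = set(cluster)
    # Keep only interactions where both partners belong to the cluster.
    neighbours = defaultdict(set)
    for a, b in ppi_edges:
        if a in members and b in members and a != b:
            neighbours[a].add(b)
            neighbours[b].add(a)

    seen = set()
    modules = []
    for start in sorted(neighbours):
        if start in seen:
            continue
        # Depth-first search collects one connected component.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        if len(component) >= min_size:
            modules.append(sorted(component))
    return modules


# Toy data: protein names are made up for illustration only.
cluster = ["P1", "P2", "P3", "P4", "P5"]
edges = [("P1", "P2"), ("P2", "P3"), ("P4", "P5"), ("P5", "X9")]
modules = ppi_modules(cluster, edges)
print(modules)
```

The edge to `X9` is ignored because `X9` is not part of the cluster, so the toy cluster splits into two candidate modules. Real module detection typically looks for densely connected subnetworks rather than any connected group, so this sketch only conveys the overlay step.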

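Here we introduce JUMPn, a streamlined software that explores quantitative proteomics data using network-based systems biology approaches. JUMPn serves as the downstream component of the established JUMP proteomics software suite13,14,39 and aims to fill the gap between individual protein quantifications and biologically meaningful pathways and protein modules. Taking the quantification matrix of differentially expressed (or the most variable) proteins as input, JUMPn organizes the proteome into a tiered hierarchy of protein clusters co-expressed across samples and densely connected PPI modules (e.g., protein complexes), which are further annotated against public pathway databases by over-representation (or enrichment) analysis (Figure 1). JUMPn is developed with the R/Shiny platform40 for a user-friendly interface and integrates three major functional modules: co-expression clustering analysis, pathway enrichment analysis, and PPI network analysis (Figure 1). After each analysis, the results are automatically visualized, can be adjusted via the R/Shiny widget functions, and can be downloaded as publication tables in Microsoft Excel format. In the following protocol, we use quantitative whole-proteome data as an example and describe the major steps of using JUMPn: installing the JUMPn software, defining the differentially expressed proteins or the (dys)regulated proteome, co-expression network analysis and PPI module analysis, result visualization and interpretation, and troubleshooting. The JUMPn software is freely available on GitHub41.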
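Over-representation analysis asks whether a cluster contains more members of a pathway than expected by chance. One common way to score this is a one-sided hypergeometric test, sketched below in Python. The text does not state which test JUMPn applies, so the choice of test, the function name `enrichment_p_value`, and the toy numbers are assumptions for illustration only.

```python
from math import comb


def enrichment_p_value(cluster_size, pathway_size, overlap, background_size):
    """One-sided hypergeometric p-value, P(X >= overlap).

    X is the number of pathway members observed when `cluster_size`
    proteins are drawn without replacement from `background_size`
    proteins, of which `pathway_size` belong to the pathway.
    """
    max_overlap = min(cluster_size, pathway_size)
    total = comb(background_size, cluster_size)
    tail = sum(
        comb(pathway_size, k)
        * comb(background_size - pathway_size, cluster_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Toy numbers: 20 background proteins, a 5-protein pathway,
# and a 4-protein cluster that contains 3 pathway members.
p = enrichment_p_value(cluster_size=4, pathway_size=5,
                       overlap=3, background_size=20)
print(f"p = {p:.4f}")
```

When many pathways are tested per cluster, the resulting p-values need a multiple-testing correction before they are interpreted.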

Protocol

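NOTE: In this protocol, the use of JUMPn is illustrated with a published dataset of whole-proteome profiling during B cell differentiation, quantified with TMT isobaric labeling reagents27.

1. Setup of the JUMPn software

NOTE: Two options are provided for setting up the JUMPn software: (i) installation on a local computer for personal use; and (ii) deployment on a remote Shiny Server for multiple…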

Representative Results

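We used published deep proteomics datasets25,26,27,30 (Figure 5 and Figure 6) as well as data simulations57 (Table 1) to optimize and evaluate JUMPn performance. For co-expression protein clustering analysis via WGCNA, we recommend using proteins that change significantly across samples as input (e.g., …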
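The recommendation to feed only strongly varying proteins into co-expression clustering can be sketched in Python as a variance filter followed by the unsigned soft-threshold adjacency used in weighted co-expression network analysis49, |r|^beta. This is a simplified illustration rather than the WGCNA package or the JUMPn code: the function names, the cut-off of three proteins, the value `beta=6`, and the toy intensities are all assumptions.

```python
from statistics import mean, pvariance


def top_variable(matrix, n):
    """Return the names of the n proteins with the largest variance."""
    ranked = sorted(matrix, key=lambda name: pvariance(matrix[name]),
                    reverse=True)
    return ranked[:n]


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def soft_adjacency(matrix, names, beta=6):
    """Unsigned soft-threshold adjacency, |r| ** beta, per protein pair."""
    return {
        (a, b): abs(pearson(matrix[a], matrix[b])) ** beta
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }


# Toy log2 intensities (made-up values), four samples per protein.
profiles = {
    "P1": [1.0, 2.0, 3.0, 4.0],
    "P2": [2.0, 4.0, 6.0, 8.0],
    "P3": [4.0, 3.0, 2.0, 1.0],
    "P4": [5.0, 5.1, 5.0, 5.1],
}
selected = top_variable(profiles, 3)      # drops the nearly flat P4
adjacency = soft_adjacency(profiles, selected)
```

Because the adjacency is unsigned, the anti-correlated pair P1 and P3 receives the same weight as the positively correlated pair P1 and P2. Raising |r| to a power above 1 suppresses weak correlations, which is why nearly flat proteins such as P4 add noise rather than signal and are better removed first.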

Discussion

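Here we introduced the JUMPn software and its protocol, which have been applied in multiple projects to dissect molecular mechanisms using deep quantitative proteomics data25,26,27,30,64. The JUMPn software and protocol have been fully optimized, including the consideration of DE proteins for co-expression network analysis, the compilation of a comprehensive and high-quality PPI network, stringent statistical analysis (e.g., by …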

Disclosures

The authors have nothing to disclose.

Acknowledgements

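Funding support was provided by the National Institutes of Health (NIH) (R01AG047928, R01AG053987, RF1AG064909, RF1AG068581, and U54NS110435) and ALSAC (American Lebanese Syrian Associated Charities). The MS analysis was carried out in the Center for Proteomics and Metabolomics at St. Jude Children's Research Hospital, which is partially supported by the NIH Cancer Center Support Grant (P30CA021765). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.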

Materials

Name | Company | Catalog Number | Comments
MacBook Pro with a 2.3 GHz Quad-Core Processor running OS 10.15.7 | Apple Inc. | MacBook Pro 13'' | Hardware used for software development and testing
Anaconda | Anaconda, Inc. | version 4.9.2 | https://docs.anaconda.com/anaconda/install/
miniconda | Anaconda, Inc. | version 4.9.2 | https://docs.conda.io/en/latest/miniconda.html
RStudio | RStudio Public-benefit corporation | version 4.0.3 | https://www.rstudio.com/products/rstudio/download/
Shiny Server | RStudio Public-benefit corporation | | https://shiny.rstudio.com/articles/shinyapps.html

References

  1. Aebersold, R., Mann, M. Mass-spectrometric exploration of proteome structure and function. Nature. 537, 347-355 (2016).
  2. Senko, M. W., et al. Novel parallelized quadrupole/linear ion trap/orbitrap tribrid mass spectrometer improving proteome coverage and peptide identification rates. Analytical Chemistry. 85, 11710-11714 (2013).
  3. Eliuk, S., Makarov, A. Evolution of orbitrap mass spectrometry instrumentation. Annual Review of Analytical Chemistry. 8, 61-80 (2015).
  4. Wang, H., et al. Systematic optimization of long gradient chromatography mass spectrometry for deep analysis of brain proteome. Journal of Proteome Research. 14, 829-838 (2015).
  5. Blue, L. E. Recent advances in capillary ultrahigh pressure liquid chromatography. Journal of Chromatography A. 1523, 17-39 (2017).
  6. Meier, F., et al. Online parallel accumulation-serial fragmentation (PASEF) with a novel trapped ion mobility mass spectrometer. Molecular & Cellular Proteomics. 17, 2534-2545 (2018).
  7. Ludwig, C., et al. Data-independent acquisition-based SWATH-MS for quantitative proteomics: a tutorial. Molecular Systems Biology. 14 (8), 8126 (2018).
  8. Zhang, Y. Y., Fonslow, B. R., Shan, B., Baek, M. C., Yates, J. R. Protein analysis by shotgun/bottom-up proteomics. Chemical Reviews. 113, 2343-2394 (2013).
  9. Wang, Z., et al. 27-Plex tandem mass tag mass spectrometry for profiling brain proteome in Alzheimer’s disease. Analytical Chemistry. 92, 7162-7170 (2020).
  10. Li, J. M., et al. TMTpro reagents: a set of isobaric labeling mass tags enables simultaneous proteome-wide measurements across 16 samples. Nature Methods. 17 (4), 399-404 (2020).
  11. Collins, B. C., et al. Multi-laboratory assessment of reproducibility, qualitative and quantitative performance of SWATH-mass spectrometry. Nature Communications. 8 (1), 291 (2017).
  12. Navarro, P., et al. A multicenter study benchmarks software tools for label-free proteome quantification. Nature Biotechnology. 34, 1130 (2016).
  13. Wang, X. S., et al. A tag-based database search tool for peptide identification with high sensitivity and accuracy. Molecular & Cellular Proteomics. 13, 3663-3673 (2014).
  14. Li, Y. X., et al. JUMPg: An integrative proteogenomics pipeline identifying unannotated proteins in human brain and cancer cells. Journal of Proteome Research. 15, 2309-2320 (2016).
  15. Cox, J., Mann, M. MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification. Nature Biotechnology. 26, 1367-1372 (2008).
  16. Kong, A. T., Leprevost, F. V., Avtonomov, D. M., Mellacheruvu, D., Nesvizhskii, A. I. MSFragger: ultrafast and comprehensive peptide identification in mass spectrometry-based proteomics. Nature Methods. 14, 513 (2017).
  17. Chi, H., et al. Comprehensive identification of peptides in tandem mass spectra using an efficient open search engine. Nature Biotechnology. 36, 1059 (2018).
  18. Demichev, V., Messner, C. B., Vernardis, S. I., Lilley, K. S., Ralser, M. DIA-NN neural networks and interference correction enable deep proteome coverage in high throughput. Nature Methods. 17, 41 (2020).
  19. High, A. A., et al. Deep proteome profiling by isobaric labeling, extensive liquid chromatography, mass spectrometry, and software-assisted quantification. Journal of Visualized Experiments: JoVE. (129), e56474 (2017).
  20. Wang, Z., et al. High-throughput and deep-proteome profiling by 16-plex tandem mass tag labeling coupled with two-dimensional chromatography and mass spectrometry. Journal of Visualized Experiments: JoVE. (162), e61684 (2020).
  21. Meier, F., Geyer, P. E., Winter, S. V., Cox, J., Mann, M. BoxCar acquisition method enables single-shot proteomics at a depth of 10,000 proteins in 100 minutes. Nature Methods. 15, 440 (2018).
  22. Sinitcyn, P., Rudolph, J. D., Cox, J. Computational methods for understanding mass spectrometry-based shotgun proteomics data. Annual Review of Biomedical Data Science. 1, 207-234 (2018).
  23. Ideker, T., Galitski, T., Hood, L. A new approach to decoding life: Systems biology. Annual Review of Genomics and Human Genetics. 2, 343-372 (2001).
  24. Barabasi, A. L., Oltvai, Z. N. Network biology: understanding the cell’s functional organization. Nature Reviews Genetics. 5, 101-113 (2004).
  25. Tan, H., et al. Integrative proteomics and phosphoproteomics profiling reveals dynamic signaling networks and bioenergetics pathways underlying T cell activation. Immunity. 46, 488-503 (2017).
  26. Bai, B., et al. Deep multilayer brain proteomics identifies molecular networks in Alzheimer’s disease progression. Neuron. 105, 975-991 (2020).
  27. Zeng, H., et al. Discrete roles and bifurcation of PTEN signaling and mTORC1-mediated anabolic metabolism underlie IL-7-driven B lymphopoiesis. Science Advances. 4, 5701 (2018).
  28. Seyfried, N. T., et al. A multi-network approach identifies protein-specific co-expression in asymptomatic and symptomatic Alzheimer’s disease. Cell Systems. 4, 60-72 (2017).
  29. Johnson, E. C. B., et al. Large-scale proteomic analysis of Alzheimer’s disease brain and cerebrospinal fluid reveals early changes in energy metabolism associated with microglia and astrocyte activation. Nature Medicine. 26, 769-780 (2020).
  30. Stewart, E., et al. Identification of therapeutic targets in rhabdomyosarcoma through integrated genomic, epigenomic, and proteomic analyses. Cancer Cell. 34, 411-426 (2018).
  31. Rudolph, J. D., Cox, J. A network module for the perseus software for computational proteomics facilitates proteome interaction graph analysis. Journal of Proteome Research. 18, 2052-2064 (2019).
  32. Zhang, B., et al. Proteogenomic characterization of human colon and rectal cancer. Nature. 513, 382 (2014).
  33. Petralia, F., et al. Integrated proteogenomic characterization across major histological types of pediatric brain cancer. Cell. 183, 1962 (2020).
  34. Dutkowski, J., et al. A gene ontology inferred from molecular networks. Nature Biotechnology. 31, 38 (2013).
  35. Yu, M. K., et al. Translation of genotype to phenotype by a hierarchy of cell subsystems. Cell Systems. 2, 77-88 (2016).
  36. Jansen, R., Greenbaum, D., Gerstein, M. Relating whole-genome expression data with protein-protein interactions. Genome Research. 12, 37-46 (2002).
  37. Huttlin, E. L., et al. Architecture of the human interactome defines protein communities and disease networks. Nature. 545, 505-509 (2017).
  38. Ron-Harel, N., et al. Mitochondrial biogenesis and proteome remodeling promote one-carbon metabolism for T cell activation. Cell Metabolism. 24, 104-117 (2016).
  39. Niu, M. M., et al. Extensive peptide fractionation and y(1) ion-based interference detection method for enabling accurate quantification by isobaric labeling and mass spectrometry. Analytical Chemistry. 89, 2956-2963 (2017).
  40. Chang, W. shiny: Web Application Framework for R. (2021).
  41. JUMPn. Available from: https://github.com/VanderwallDavid/JUMPn_1.0.0 (2021).
  42. Anaconda. Available from: https://docs.anaconda.com/anaconda/install/ (2021).
  43. miniconda. Available from: https://docs.conda.io/en/latest/miniconda.html (2021).
  44. RStudio. Available from: https://www.rstudio.com/products/rstudio/download/ (2021).
  45. Shiny Server. Available from: https://shiny.rstudio.com/articles/shinyapps.html (2021).
  46. Tyanova, S., Temu, T., Cox, J. The MaxQuant computational platform for mass spectrometry-based shotgun proteomics. Nature Protocols. 11, 2301-2319 (2016).
  47. R code. Available from: https://github.com/VanderwallDavid/JUMPn_1.0.0/tree/main/JUMPn_preprocessing (2021).
  48. Florens, L., et al. Analyzing chromatin remodeling complexes using shotgun proteomics and normalized spectral abundance factors. Methods. 40, 303-311 (2006).
  49. Zhang, B., Horvath, S. A general framework for weighted gene co-expression network analysis. Statistical Applications in Genetics and Molecular Biology. 4, (2005).
  50. Voineagu, I., et al. Transcriptomic analysis of autistic brain reveals convergent molecular pathology. Nature. 474, 380 (2011).
  51. Langfelder, P., Zhang, B., Horvath, S. Defining clusters from a hierarchical cluster tree: the Dynamic Tree Cut package for R. Bioinformatics. 24, 719-720 (2008).
  52. Ravasz, E., Somera, A. L., Mongru, D. A., Oltvai, Z. N., Barabasi, A. L. Hierarchical organization of modularity in metabolic networks. Science. 297, 1551-1555 (2002).
  53. Kuleshov, M. V., et al. Enrichr: a comprehensive gene set enrichment analysis web server 2016 update. Nucleic Acids Research. 44, 90-97 (2016).
  54. Liberzon, A., et al. Molecular signatures database (MSigDB) 3.0. Bioinformatics. 27, 1739-1740 (2011).
  55. FlyEnrichr. Available from: https://maayanlab.cloud/FlyEnrichr/#stats (2021).
  56. JUMPn GitHub. Available from: https://github.com/VanderwallDavid/JUMPn_1.0.0/tree/main/resources/example_fly (2021).
  57. Langfelder, P., Horvath, S. Eigengene networks for studying the relationships between co-expression modules. BMC Systems Biology. 1, 54 (2007).
  58. Benjamini, Y., Hochberg, Y. Controlling the false discovery rate – a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society: Series B. 57, 289-300 (1995).
  59. Szklarczyk, D., et al. STRING v10: protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Research. 43, 447-452 (2015).
  60. Szklarczyk, D., et al. STRING v11: protein-protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Research. 47, 607-613 (2019).
  61. Huttlin, E. L., et al. The BioPlex network: A systematic exploration of the human interactome. Cell. 162, 425-440 (2015).
  62. Huttlin, E. L., et al. Dual proteome-scale networks reveal cell-specific remodeling of the human interactome. Cell. 184, 3022-3040 (2021).
  63. Li, T., et al. A scored human protein-protein interaction network to catalyze genomic interpretation. Nature Methods. 14, 61-64 (2017).
  64. Wang, H., et al. Deep multiomics profiling of brain tumors identifies signaling networks downstream of cancer driver genes. Nature Communications. 10, 3718 (2019).
  65. Gerstein, M. B., et al. Architecture of the human regulatory network derived from ENCODE data. Nature. 489, 91-100 (2012).
  66. Yu, J., Peng, J., Chi, H. Systems immunology: Integrating multi-omics data to infer regulatory networks and hidden drivers of immunity. Current Opinion in Systems Biology. 15, 19-29 (2019).
  67. Califano, A., Alvarez, M. J. The recurrent architecture of tumour initiation, progression and drug sensitivity. Nature Reviews Cancer. 17, 116-130 (2017).
  68. Hein, M. Y., et al. A human interactome in three quantitative dimensions organized by stoichiometries and abundances. Cell. 163, 712-723 (2015).
  69. Liang, Z., Xu, M., Teng, M. K., Niu, L. W. Comparison of protein interaction networks reveals species conservation and divergence. BMC Bioinformatics. 7, 457 (2006).
  70. Shou, C., et al. Measuring the evolutionary rewiring of biological networks. PLOS Computational Biology. 7, 1001050 (2011).
  71. Zhou, Y., et al. Metascape provides a biologist-oriented resource for the analysis of systems-level datasets. Nature Communications. 10, 1523 (2019).
  72. Cline, M. S., et al. Integration of biological networks and gene expression data using Cytoscape. Nature Protocols. 2, 2366-2382 (2007).

Cite This Article
Vanderwall, D., Suresh, P., Fu, Y., Cho, J., Shaw, T. I., Mishra, A., High, A. A., Peng, J., Li, Y. JUMPn: A Streamlined Application for Protein Co-Expression Clustering and Network Analysis in Proteomics. J. Vis. Exp. (176), e62796, doi:10.3791/62796 (2021).
