Differential Integration of Transcriptome and Proteome Identifies Pan-cancer Prognostic Biomarkers.
Frontiers in Genetics
High-throughput analyses of the transcriptome and the proteome are each used to interrogate complex oncogenic processes in cancer. However, an outstanding challenge is how to combine these complementary, yet partially disparate, data sources to accurately identify tumor-specific gene products and clinical biomarkers. Here, we introduce inteGREAT for robust and scalable differential integration of high-throughput measurements. With inteGREAT, each data source is represented as a co-expression network, which is analyzed to characterize the local and global structure of each node across networks. inteGREAT scores the degree to which the topology of each gene in both the transcriptome and proteome networks is conserved within a tumor type, yet different from that in other normal or malignant cells. We demonstrated the high performance of inteGREAT through several analyses: deconvolving synthetic networks, rediscovering known diagnostic biomarkers, establishing relationships between tumor lineages, and elucidating putative prognostic biomarkers, which we experimentally validated. Furthermore, we introduce the application of a clumpiness measure to quantitatively describe tumor lineage similarity. Taken together, inteGREAT not only infers functional and clinical insights from the integration of transcriptomic and proteomic data sources in cancer, but can also be readily applied to other heterogeneous high-throughput data sources. inteGREAT is open source and available for download from https://github.com/faryabib/inteGREAT.
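
The idea of scoring each gene by how well its network neighborhood is conserved between transcriptome and proteome within a tumor, yet differs from a reference condition, can be illustrated with a toy sketch. This is not the inteGREAT implementation: the function names, the use of absolute Pearson correlation as edge weight, the cosine similarity of edge profiles, and the way the two terms are multiplied are all assumptions made purely for illustration.

```python
import numpy as np

def coexpression_network(expr):
    """Toy co-expression network: absolute Pearson correlation between genes.

    expr has shape (genes, samples). Self-edges are set to zero.
    """
    adj = np.nan_to_num(np.abs(np.corrcoef(expr)))
    np.fill_diagonal(adj, 0.0)
    return adj

def node_similarity(net_a, net_b):
    """Cosine similarity between each gene's edge-weight profile in two networks."""
    num = np.sum(net_a * net_b, axis=1)
    den = np.linalg.norm(net_a, axis=1) * np.linalg.norm(net_b, axis=1)
    sim = np.divide(num, den, out=np.zeros_like(num), where=den > 0)
    return np.clip(sim, 0.0, 1.0)

def toy_differential_score(rna_tumor, prot_tumor, rna_ref, prot_ref):
    """Hypothetical per-gene score (an assumption, not the published method).

    High when a gene's neighborhood agrees between the tumor RNA and protein
    networks, and differs between the tumor and the reference condition.
    """
    rt, pt, rr, pr = (coexpression_network(x)
                      for x in (rna_tumor, prot_tumor, rna_ref, prot_ref))
    conserved = node_similarity(rt, pt)
    divergent = 1.0 - 0.5 * (node_similarity(rt, rr) + node_similarity(pt, pr))
    return conserved * divergent

# Usage on random data: 5 genes measured in 20 samples per matrix.
rng = np.random.default_rng(0)
data = [rng.normal(size=(5, 20)) for _ in range(4)]
scores = toy_differential_score(*data)
```

Each score lies in [0, 1], and a gene whose tumor networks are identical to the reference networks receives a score of zero. The published method characterizes both local and global node structure, which this sketch does not attempt.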