A multi-scale integrated analysis identifies KRT8 as a pan-cancer early biomarker. Pacific Symposium on Biocomputing. Scott, M. K., Limaye, M., Schaffert, S., West, R., Ozawa, M. G., Chu, P., Nair, V. S., Koong, A. C., Khatri, P. 2021; 26: 297–308


An early biomarker would transform our ability to screen and treat patients with cancer. The large amount of multi-scale molecular data in public repositories from various cancers provides unprecedented opportunities to find such a biomarker. However, despite identification of numerous molecular biomarkers using these public data, fewer than 1% have proven robust enough to translate into clinical practice. One of the most important factors affecting successful translation to clinical practice is the lack of real-world patient population heterogeneity in the discovery process. Almost all biomarker studies analyze only a single cohort of patients with the same cancer using a single modality. Recent studies in other diseases have demonstrated the advantage of leveraging biological and technical heterogeneity across multiple independent cohorts to identify robust disease biomarkers. Here we analyzed 17149 samples from patients with one of 23 cancers that were profiled using DNA methylation, bulk and single-cell gene expression, or protein expression in tumor and serum. First, we analyzed DNA methylation profiles of 9855 samples across 23 cancers from The Cancer Genome Atlas (TCGA). We then examined the gene expression profile of the most significantly hypomethylated gene, KRT8, in 6781 samples from 57 independent microarray datasets from NCBI GEO. KRT8 was significantly over-expressed across cancers except colon cancer (summary effect size = 1.05; p < 0.0001). Further, single-cell RNAseq analysis of 7447 single cells from lung tumors showed that genes that significantly correlated with KRT8 (p < 0.05) were involved in p53-related pathways. Immunohistochemistry in tumor biopsies from 294 patients with lung cancer showed that high protein expression of KRT8 is a prognostic marker of poor survival (HR = 1.73, p = 0.01). Finally, detectable KRT8 in serum as measured by ELISA distinguished patients with pancreatic cancer from healthy controls with an AUROC of 0.94.
In summary, our analysis demonstrates that KRT8 (1) is differentially expressed in several cancers across all molecular modalities and (2) may be useful as a biomarker to identify patients who should be further tested for cancer.

PubMedID 33691026