Protein signatures compared with existing studies
Proteomic studies have explored potential biomarkers for ESCC diagnosis using different biological samples, such as body fluids (plasma, serum, etc.), tumor tissues (fresh frozen or formalin-fixed paraffin-embedded tissues) and cells in vitro. In 2016, Harada et al. summarized 18 non-targeted, mass spectrometry-based proteomic studies of ESCC diagnosis that used serum, tissue and cell line samples with limited sample sizes, and identified several novel ESCC diagnostic markers, such as apolipoprotein A-I, tubulin beta chain, filamin A alpha and HSP70 [22]. Blood-based tests have long been used as a cost-effective and rapid screening tool for understanding diseases and evaluating treatment efficacy, and organ-specific proteins in plasma could mirror organ dysfunction [23]. Development of a liquid biopsy method for early ESCC detection could substantially improve the efficiency of subsequent gastroscopy examination, especially for asymptomatic high-risk populations.
Recently, a study identified 13 protein biomarkers in serum using the protein chip AAH-BLG-507 from RayBiotech for discriminating 10 early ESCC patients from 10 healthy controls in China [24]. Liao et al. reported that a combination of plasma FAPα and traditional biomarkers (CEA, CYFRA211, SCCA), measured by ELISA, could significantly discriminate (AUC = 0.745) ESCC patients (n = 151; stage I: 29, stage II: 59, stage III/IV: 63) from non-malignant controls (n = 230; healthy: 194, benign: 36) [25]. Huang et al. reported an AUC of 0.725 for serum IGFBP7 in a study including 107 controls and 37 early ESCC patients [26]. Xu et al. reported that a serum autoantibody panel (p53, MMP-7, HSP70, Prx VI and Bmi-1) could distinguish early-stage ESCC patients (n = 76) from normal controls (n = 134) with a sensitivity of 45% and a specificity of 96% in a validation cohort [13]. In our study, 23 proteins, namely ANXA1, hK8, CDKN1A, ABL1, SCAMP3, EGF, LYN, MetAP2, KLK13, ADAMTS15, hK14, VIM, TXLNA, GPC1, RSPO3, hK11, TRAIL, X5NT, CPE, FADD, TGFR2, SEZ6L and CD160, showed potential diagnostic utility for distinguishing early ESCC from controls, and their serum levels showed a significant dose-response relationship with ESCC stage. However, few proteins overlapped between our study and those mentioned above, which may be due to differences in candidate protein signatures, sample sizes, ESCC stages, the biological nature of the samples (plasma vs. serum) and the detection methods used (PEA vs. protein chip vs. ELISA).
This is the first study to evaluate the efficiency of the Olink Oncology II panel for the early diagnosis of ESCC. Although this panel was not designed specifically for identifying ESCC patients, the majority of the proteins on the Oncology II panel are secreted proteins that show abnormal expression in the tissues or sera of multiple types of cancer [21, 27, 28]. In particular, several proteins, such as ANXA1, CEACAM5 (also known as CEA), VIM, ALB1 and IL6, have been reported as potential biomarkers for the diagnosis of ESCC [24, 25, 29,30,31,32]. However, most proteins on the Oncology II panel have not yet been examined for their expression in blood samples from ESCC patients.
Model performance
To avoid overfitting and to keep the classifier clinically feasible for the early diagnosis of ESCC, we built a concise multi-protein classifier containing ANXA1, hK8, hK14, VIM and RSPO3. The AUC of the five-protein classifier for differentiating early ESCC from controls was 0.936 (95% CI: 0.899 to 0.973). At the optimal Youden’s index, the specificity and sensitivity were 78.6% and 96.7%, respectively, and the classification accuracy was 0.888. We used five-fold cross-validation to estimate the average accuracy of the five-protein classifier, which was 0.861 in the discovery set and 0.825 in the validation set. Overall, the discriminative performance of our multi-protein classifier compared favorably with that reported in other studies [13, 25, 26, 33].
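To illustrate how the performance metrics in this subsection are defined, the following Python sketch computes an AUC, the cutoff that maximizes Youden's index (J = sensitivity + specificity - 1), and the accuracy at that cutoff. It is a minimal sketch on made-up toy data, not the analysis code of this study: the function names, the toy scores and the decision rule "classify as ESCC when score >= cutoff" are assumptions introduced here for illustration only.

```python
from typing import List, Tuple


def roc_auc(labels: List[int], scores: List[float]) -> float:
    """AUC as the probability that a random case (label 1) scores higher
    than a random control (label 0); ties count as one half."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def youden_optimal_cutoff(
    labels: List[int], scores: List[float]
) -> Tuple[float, float, float, float]:
    """Return (cutoff, sensitivity, specificity, J) for the cutoff that
    maximizes J = sensitivity + specificity - 1.
    Assumed decision rule: predict a case when score >= cutoff.
    If several cutoffs tie, the lowest one is returned."""
    n_cases = sum(1 for y in labels if y == 1)
    n_controls = len(labels) - n_cases
    if n_cases == 0 or n_controls == 0:
        raise ValueError("need at least one case and one control")
    best = None
    for cutoff in sorted(set(scores)):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= cutoff)
        tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < cutoff)
        sensitivity = tp / n_cases
        specificity = tn / n_controls
        j = sensitivity + specificity - 1.0
        if best is None or j > best[3]:
            best = (cutoff, sensitivity, specificity, j)
    return best


def accuracy_at_cutoff(
    labels: List[int], scores: List[float], cutoff: float
) -> float:
    """Fraction of samples classified correctly under score >= cutoff."""
    correct = sum(
        1 for y, s in zip(labels, scores) if (s >= cutoff) == (y == 1)
    )
    return correct / len(labels)


if __name__ == "__main__":
    # Hypothetical classifier scores: 4 controls (0) and 4 cases (1).
    toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]
    toy_scores = [0.1, 0.2, 0.35, 0.6, 0.4, 0.7, 0.8, 0.9]

    print("AUC:", roc_auc(toy_labels, toy_scores))  # 0.9375
    cutoff, sens, spec, j = youden_optimal_cutoff(toy_labels, toy_scores)
    print("cutoff:", cutoff, "sensitivity:", sens, "specificity:", spec)
    print("accuracy:", accuracy_at_cutoff(toy_labels, toy_scores, cutoff))
```

On the toy data, the cutoff of 0.4 gives a sensitivity of 1.0 and a specificity of 0.75, which shows how a Youden-optimal threshold can favor sensitivity over specificity. In a cross-validated setting, the cutoff would typically be chosen on the training folds and then applied unchanged to the held-out fold.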
Biological functions
In our study, 92 tumor-related candidate proteins were measured in serum from ESCC patients at various stages and from healthy controls to predict cancer status, and 23 proteins were preliminarily identified as potential diagnostic biomarkers for ESCC. Functional pathway enrichment analyses showed that these 23 proteins were involved in signaling receptor binding, the extracellular space, regulation of response to stimulus and the TP53 network, which is implicated in the development of ESCC. Thus, their serum levels could mirror the pro-tumorigenic ESCC microenvironment and might be used to monitor the progression of ESCC.
ANXA1, hK8, hK14, VIM and RSPO3 were selected for our final diagnostic classifier for early-stage ESCC. Serum levels of ANXA1 and VIM were elevated in ESCC patients, whereas serum levels of hK8, hK14 and RSPO3 were decreased.
ANXA1 (annexin A1), known as an endogenous anti-inflammatory protein, is now recognized to be closely related to tumor cell proliferation, invasion, differentiation, apoptosis, metastasis and chemotherapy sensitivity via modulation of various cancer-associated pathways [34, 35]. Moreover, ANXA1 shows contrasting expression profiles across cancer types: it is over-expressed in lung cancer, colorectal cancer and pancreatic cancer, among others, whereas its expression is lost in cervical cancer, prostate cancer, nasopharyngeal carcinoma, etc. [34, 36]. We found a high level of ANXA1 in the serum of ESCC patients, which is consistent with a previous study showing upregulated ANXA1 in ESCC tissues versus matched normal tissues [30]. However, most previous studies reported that ANXA1 expression was significantly downregulated in ESCC cell lines and in tissues from ESCC patients compared with adjacent normal tissues [29, 32, 37,38,39]. Further studies are needed to examine the correlation between tumor ANXA1 expression and its serum level.
VIM (vimentin), a class III intermediate filament protein, is involved in cytoskeletal integrity, cell adhesion and cell migration via epithelial-mesenchymal transition [40, 41], and upregulated VIM levels in tissues have been reported as a potential diagnostic and prognostic marker in multiple types of cancer, such as prostate cancer, breast cancer, malignant melanoma and lung cancer [42]. Over-expression of VIM has been reported in ESCC tissues compared with adjacent normal tissues [30], which is broadly consistent with the results of our study. The expression of vimentin is regulated by the transcription factors Twist, Zeb1, Snail and Slug, which are induced by TGF-β signal transduction [43].
Dysregulation of kallikrein-related peptidases (KLKs) is related to differential expression signatures in various types of cancer [44, 45], but little is known about their role in ESCC development. Four proteins from the kallikrein-related peptidase family, namely hK8 (kallikrein-8), hK11 (kallikrein-11), KLK13 (kallikrein-13) and hK14 (kallikrein-14), were detected by the Olink Oncology II panel, and we found that all of them had lower serum levels in ESCC patients than in healthy controls, regardless of tumor stage. KLKs, the largest family of secreted serine proteases, are involved in cancer cell growth, migration, invasion and chemo-resistance through activation of PARs, release of active growth factors, modulation of the proteolytic network and activation of androgen receptor signaling [45, 46].
RSPO3 (R-spondin-3), an activator of the canonical Wnt signaling and PI3K/AKT pathways and a key regulator of angiogenesis and epithelial-mesenchymal transition, has shown low expression in colorectal cancer, squamous cell carcinoma of the lung and prostate cancer, but upregulated expression in bladder cancer, ovarian cancer and lung adenocarcinoma [47,48,49,50]. Our study showed that the serum RSPO3 level was inversely associated with ESCC progression.
Limitations and future perspectives
The results of our models should be interpreted with caution. First, the study was conducted in a high-risk area for ESCC in China, which might limit the generalizability of our five-protein classifier to areas of lower risk. Second, although we found an overall good dose-response relationship between the serum levels of the identified biomarkers and ESCC stage, the trends for certain proteins were not perfect, which suggests that external, independent studies are needed to validate and generalize our findings. Moreover, the identified protein biomarkers for ESCC are generally common to multiple types of tumors. Further work is needed to determine the specificity of our five-protein classifier for ESCC diagnosis versus other cancer types. If a three-level hierarchical screening strategy, i.e. “environment exposure + blood biopsy + esophagogastroduodenoscopy”, is established in ESCC high-incidence areas, our serum multi-protein classifier with high sensitivity and specificity would have promising application value in high-risk populations. The identified ESCC biomarkers are also involved in ESCC progression, which highlights their possible application as prognostic biomarkers.
In summary, we identified and established a multi-protein classifier for discriminating early ESCC patients from healthy controls, which might contribute to improving the three-level hierarchical screening strategy for decreasing the ESCC burden in high-incidence areas. However, the results need to be further validated in prospective cohort studies.