Identification of recurrent risk-related genes and establishment of support vector machine prediction model for gastric cancer
Abstract:
This study sought to investigate genes related to recurrent risk and to establish a support vector machine (SVM) classifier for predicting recurrent risk in gastric cancer (GC). Based on the gene expression profiling dataset GSE26253, feature genes significantly associated with survival time and status were screened out. Subsequently, a protein-protein interaction (PPI) network was constructed for these feature genes, and the genes in this network were optimized using a betweenness centrality algorithm to identify genes potentially correlated with GC (named GCGs). In total, 1202 feature genes were found to be significantly associated with survival time and status in GC, among which 65 genes were identified as a classifier able to distinguish recurrent from nonrecurrent GC cases with high sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV) and area under the receiver operating characteristic curve (AUC). Furthermore, the classifier was able to reasonably classify tumor samples in GSE15459 into high and low recurrent risk groups. Among those genes, a set of genes were predicted to interact (e.g. RHOA with TGFBR1, PRKACA and PLCG1; TGFBR1 with TGFBR2) and to be involved in pathways such as MAPK signaling (e.g. TGFBR1 and TGFBR2), adherens junction (e.g. RHOA) and apoptosis (e.g. PRKACA). The genes in the classifier model may be related to GC recurrence, and the classifier model may contribute to the prediction of recurrent risk in GC.
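The performance measures named in the abstract (sensitivity, specificity, PPV and NPV) all follow from the four cells of a binary confusion matrix. The sketch below is illustrative only: the function names and the toy labels are hypothetical and are not taken from the study, labels are assumed to be coded 1 for recurrence and 0 for nonrecurrence, and every denominator is assumed to be non-zero.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for binary labels (1 = recurrence, 0 = nonrecurrence)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def classifier_metrics(y_true, y_pred):
    """Return sensitivity, specificity, PPV and NPV (assumes non-zero denominators)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "sensitivity": tp / (tp + fn),  # recurrent cases correctly flagged
        "specificity": tn / (tn + fp),  # nonrecurrent cases correctly cleared
        "ppv": tp / (tp + fp),          # predicted recurrences that are real
        "npv": tn / (tn + fn),          # predicted nonrecurrences that are real
    }


# Hypothetical toy labels, NOT data from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
metrics = classifier_metrics(y_true, y_pred)
```

AUC is not covered here because it requires continuous decision scores rather than hard class labels.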
Received date: 05/07/2017
Accepted date: 07/12/2017
Ahead of print publish date: 05/16/2018
Issue: 3/2018
Volume: 65
Pages: 360-366
Keywords: gastric cancer, support vector machine, recurrence, gene, network
Supplementary files:
Tables - TE.docx
DOI: 10.4149/neo_2018_170507N326