Research Paper, Volume 12, Issue 20, pp 20471-20482
Development of a susceptibility gene based novel predictive model for the diagnosis of ulcerative colitis using random forest and artificial neural network
- 1 Division of Gastroenterology and Hepatology, Key Laboratory of Gastroenterology and Hepatology, Ministry of Health, Inflammatory Bowel Disease Research Center, Shanghai 200127, China
- 2 Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai 200127, China
- 3 Shanghai Institute of Digestive Disease, Shanghai 200127, China
Received: May 7, 2020 Accepted: July 21, 2020 Published: October 24, 2020
https://doi.org/10.18632/aging.103861
Copyright: © 2020 Li et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Abstract
Ulcerative colitis is a type of inflammatory bowel disease characterized by chronic, recurrent, nonspecific inflammation of the intestinal tract. To find susceptibility genes and develop a novel predictive model of ulcerative colitis, we downloaded two gene expression data sets containing ulcerative colitis cases and controls (training set GSE109142 and validation set GSE92415) and used them to identify differentially expressed genes. A total of 781 upregulated and 127 downregulated differentially expressed genes were identified in GSE109142. A random forest algorithm was then used to select the 30 differentially expressed genes (29 upregulated and 1 downregulated) that contributed most to the occurrence of ulcerative colitis. The expression data of these 30 genes were transformed into gene expression scores, and an artificial neural network model was developed to calculate the weight of each gene for ulcerative colitis. We established a universal molecular prognostic score (mPS) based on the expression data of the 30 genes and verified the mPS system with GSE92415. The prediction results agreed with those of an independent data set (ROC-AUC = 0.9506, PR-AUC = 0.9747). Our research provides a reliable predictive model for the diagnosis of ulcerative colitis and an alternative marker panel for further research on early disease screening.
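The scoring step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the rule used to turn expression values into gene expression scores, so the median cutoff below is an assumption, as are the gene names (GENE_A, GENE_B), the toy expression values, and the hand-set weights. In the study the weights come from a trained artificial neural network over 30 genes, which this sketch does not reproduce.

```python
from statistics import median

def to_gene_scores(samples, directions):
    """Binarize expression per gene (assumed rule: compare with the cohort median)."""
    cutoffs = {g: median(s[g] for s in samples) for g in directions}
    scored = []
    for s in samples:
        row = {}
        for g, d in directions.items():
            above = s[g] > cutoffs[g]
            # Upregulated genes score 1 when above the cutoff, downregulated genes when not above it.
            row[g] = int(above) if d == "up" else int(not above)
        scored.append(row)
    return scored

def weighted_score(row, weights):
    """Combine binary gene scores into one per-sample score using per-gene weights."""
    return sum(weights[g] * row[g] for g in weights)

def roc_auc(labels, scores):
    """ROC-AUC as the fraction of (case, control) pairs ranked correctly; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy cohort: two cases (label 1) and two controls (label 0).
samples = [
    {"GENE_A": 9.1, "GENE_B": 2.0},
    {"GENE_A": 8.4, "GENE_B": 2.5},
    {"GENE_A": 5.0, "GENE_B": 6.1},
    {"GENE_A": 4.2, "GENE_B": 7.0},
]
labels = [1, 1, 0, 0]
directions = {"GENE_A": "up", "GENE_B": "down"}  # hypothetical genes
weights = {"GENE_A": 0.75, "GENE_B": 0.25}       # illustrative, not learned

scores = [weighted_score(r, weights) for r in to_gene_scores(samples, directions)]
auc = roc_auc(labels, scores)
print(scores, auc)
```

Binarizing each gene before weighting keeps the combined score independent of each gene's raw expression scale. The toy cohort is far too small to say anything about real diagnostic performance.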