Eksmote Machine Learning Model to Predict the Disease Severity of Covid Patients Using Big Data Analytics


Ms. K. Sindhu, Dr. V. Baby Deepa


COVID-19 was first recognized at the end of 2019. Patients infected with SARS-CoV-2 may show no symptoms, either because they are pre-symptomatic or because they are asymptomatic. Both groups can still spread the virus to other susceptible people, which makes COVID-19 difficult to control. New biomarkers at different omics levels are therefore required for large-scale screening and diagnosis of COVID-19. Although initial analyses have identified a group of candidate gene biomarkers, previous work has not yet identified biomarkers suitable for clinical use, which requires a diagnosis that distinguishes COVID-19 from other infectious diseases. In this work, the data were analysed using three classification algorithms: logistic regression (LR), random forest (RF), and extreme gradient boosting (XGB). The data were first pre-processed using several pre-processing techniques. Next, 10-fold cross-validation was applied for data partitioning, and EKSMOTE was used to alleviate the data imbalance. Experiments were performed using twenty clinical features identified as significant for predicting whether COVID-19 patients survived or died. RF outperformed the other classifiers, with an accuracy of 0.95 and an area under the curve (AUC) of 0.99. The proposed model can assist health care professionals in decision-making through early and effective identification of at-risk COVID-19 patients.
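
The partitioning and rebalancing steps described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how EKSMOTE works, so the code below swaps in plain SMOTE (interpolation between a minority sample and one of its nearest minority neighbours) as a stand-in. All function names are hypothetical, a binary outcome is assumed, and the classifiers themselves are left out. The sketch oversamples the training part of each fold only, so that synthetic samples never reach the held-out fold; this is a common practice, not something the abstract confirms.

```python
import math
import random


def stratified_folds(labels, k=10, seed=0):
    """Split sample indices into k folds with similar class proportions."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    offset = 0
    for label in sorted(by_class):
        idxs = by_class[label]
        rng.shuffle(idxs)
        # Deal indices round-robin, continuing where the last class stopped.
        for j, idx in enumerate(idxs):
            folds[(offset + j) % k].append(idx)
        offset += len(idxs)
    return folds


def smote(minority, n_new, k_neighbors=5, seed=0):
    """Plain SMOTE (a stand-in for EKSMOTE): create n_new synthetic rows."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest minority neighbours of the chosen sample (Euclidean distance).
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k_neighbors]
        other = rng.choice(neighbours)
        gap = rng.random()
        # New point lies on the segment between the sample and its neighbour.
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


def oversampled_training_folds(X, y, k=10, seed=0):
    """Yield (X_train, y_train, test_indices), rebalancing the training part only."""
    folds = stratified_folds(y, k, seed)
    for i, test_idx in enumerate(folds):
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        X_train = [X[idx] for idx in train_idx]
        y_train = [y[idx] for idx in train_idx]
        counts = {label: y_train.count(label) for label in set(y_train)}
        minority_label = min(counts, key=counts.get)
        deficit = max(counts.values()) - counts[minority_label]
        minority = [x for x, label in zip(X_train, y_train) if label == minority_label]
        new_rows = smote(minority, deficit, seed=seed + i)
        yield X_train + new_rows, y_train + [minority_label] * deficit, test_idx
```

Each yielded training set could then be passed to a classifier such as LR, RF, or XGB, with accuracy and AUC computed on the untouched test indices of that fold.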
