The global impact of COVID-19 includes millions of infections and a significant death toll. Predicting infection severity is crucial for early intervention, optimizing healthcare resources, and saving lives. Deep learning methods, such as Convolutional Neural Networks (CNNs) for processing biological sequences and Long Short-Term Memory (LSTM) networks for modeling disease dynamics, show promise for predicting infection severity. Our study introduces a hybrid CNN-LSTM model to predict the likely evolution of patient health status using clinical data from 3467 GISAID samples from South American countries, including age, gender, lineage, clade information, and Spike protein FASTA sequences of different variants. The CNN extracts features while the LSTM models temporal patterns. Descriptors include amino acid composition, sequence length, amino acid diversity, hydrophobicity, net charge, secondary structure, polarity, and numerical sequence descriptors. The model comprises a CNN layer with 128 filters (kernel size 4, pool size 2) for feature capture and an LSTM layer with 64 units for long-term dependencies. Training uses a learning rate of 0.0025, and a dropout rate of 0.166 is applied to prevent overfitting. Performance evaluation employs cross-validation to assess generalizability, reporting Precision, Recall, F1 Score, ROC-AUC Score, Sensitivity, and Specificity alongside the confusion matrix. With these hyperparameters, the model performed well in capturing complex patterns and long-term dependencies in the sequence data. The confusion matrix showed 391 True Negatives, 76 False Positives, 44 False Negatives, and 183 True Positives. The Precision Score was approximately 0.836, the Recall Score approximately 0.827, and the F1 Score, which balances precision and recall, approximately 0.830, suggesting a good balance between the two metrics.
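The reported scores can be checked against the confusion-matrix counts. The sketch below is an illustration, not the authors' evaluation code. It assumes that the Precision, Recall, and F1 values are support-weighted averages over both classes. The abstract does not state the averaging scheme, but the arithmetic reproduces the reported values under that assumption, whereas the positive-class precision alone would be about 0.707. The helper name `prf` is hypothetical.

```python
# Confusion-matrix counts as reported in the abstract.
TN, FP, FN, TP = 391, 76, 44, 183


def prf(tp, fp, fn):
    """Precision, recall and F1 for one class treated as the target class."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Per-class scores. When class 0 is the target, the roles of FP and FN swap.
pos = prf(TP, FP, FN)
neg = prf(TN, FN, FP)

# Class supports: number of true samples in each class.
n_pos, n_neg = TP + FN, TN + FP
total = n_pos + n_neg

# ASSUMPTION: support-weighted averaging (not stated in the source).
weighted_precision, weighted_recall, weighted_f1 = (
    (n_pos * p + n_neg * n) / total for p, n in zip(pos, neg)
)

# Sensitivity and specificity follow directly from the counts.
sensitivity = TP / (TP + FN)
specificity = TN / (TN + FP)

print(f"weighted precision: {weighted_precision:.3f}")
print(f"weighted recall:    {weighted_recall:.3f}")
print(f"weighted F1:        {weighted_f1:.3f}")
print(f"sensitivity:        {sensitivity:.3f}")
print(f"specificity:        {specificity:.3f}")
```

Under this reading, the weighted recall equals the overall accuracy (574 of 694 samples), which explains why it differs from the sensitivity (183 of 227 positive samples) reported for the same confusion matrix.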
Additionally, the model's ROC-AUC Score was 0.914, the Sensitivity Score was about 0.806, and the Specificity Score was approximately 0.837, indicating high overall performance. In summary, our study demonstrated that the hybrid CNN-LSTM model is effective in predicting the evolution of health status based on clinical data and Spike protein sequences of SARS-CoV-2. With an F1 Score of approximately 0.830 and a ROC-AUC Score of 0.914, the model could have a substantial impact on public health strategies and resource management.
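Several of the sequence descriptors named in the abstract (sequence length, amino acid composition, amino acid diversity, hydrophobicity, net charge) can be computed directly from a protein string. The following is a minimal sketch under stated assumptions, not the authors' feature pipeline: diversity is taken as the Shannon entropy of the composition, hydrophobicity as the mean Kyte-Doolittle hydropathy, and net charge as a simple count of basic (K, R) minus acidic (D, E) residues. The function name `describe_sequence` and the toy peptide are hypothetical.

```python
import math
from collections import Counter

# Kyte-Doolittle hydropathy scale for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def describe_sequence(seq):
    """Return simple numerical descriptors for a protein sequence."""
    seq = seq.strip().upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    n = len(seq)
    counts = Counter(seq)

    # Fraction of each residue type present in the sequence.
    composition = {aa: counts[aa] / n for aa in sorted(counts)}

    # ASSUMPTION: diversity measured as Shannon entropy (bits) of the composition.
    diversity = -sum(p * math.log2(p) for p in composition.values())

    # ASSUMPTION: hydrophobicity as mean Kyte-Doolittle hydropathy;
    # non-standard symbols (e.g. X) contribute 0.
    hydrophobicity = sum(KYTE_DOOLITTLE.get(aa, 0.0) for aa in seq) / n

    # ASSUMPTION: crude net charge at neutral pH, basic minus acidic residues.
    net_charge = counts["K"] + counts["R"] - counts["D"] - counts["E"]

    return {
        "length": n,
        "composition": composition,
        "diversity": diversity,
        "hydrophobicity": hydrophobicity,
        "net_charge": net_charge,
    }


# Toy peptide for illustration only; it is not a Spike protein fragment.
features = describe_sequence("MKKDA")
print(features)
```

In a pipeline like the one described, scalar descriptors of this kind would typically be combined with a numerical encoding of the sequence before being passed to the network; how the authors combined them is not specified in the abstract.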