Symptom-Based Classification of Common Syndromes Using Machine Learning: A Review
DOI: https://doi.org/10.32628/CSEIT26121
Keywords: Symptom Classification, Machine Learning, Disease Prediction, Natural Language Processing, Healthcare AI
Abstract
Symptom-based disease classification has emerged as a critical application of machine learning (ML) and artificial intelligence (AI) in modern healthcare. With the growing availability of electronic health records, patient-reported outcomes, and conversational health systems, automated interpretation of symptoms offers an efficient approach for early diagnosis and clinical decision support. This review systematically analyzes recent advancements in symptom-based classification of common syndromes using traditional machine learning, deep learning, natural language processing (NLP), and large language models (LLMs). The study highlights methods that process structured symptom vectors as well as unstructured free-text symptom descriptions obtained from chatbots and clinical narratives. A comprehensive comparison of twenty recent studies published between 2024 and 2025 is presented, focusing on employed methodologies, key advantages, and reported limitations. Furthermore, the review synthesizes major research findings, identifies open challenges such as data sparsity, explainability, and clinical reliability, and discusses future research directions. This paper aims to serve as a consolidated reference for researchers and practitioners working on intelligent symptom-based disease prediction systems and to guide future development of robust, interpretable, and clinically deployable models.
License
Copyright (c) 2026 International Journal of Scientific Research in Computer Science, Engineering and Information Technology

This work is licensed under a Creative Commons Attribution 4.0 International License.