Machine Learning Classifiers for Differentiation between Iron Deficiency Anaemia and Beta Thalassemia Trait: comparative study

Authors

  • Salma Abdulbaki Mahmood, University of Basrah, Basrah, Iraq

DOI:

https://doi.org/10.22399/ijcesen.2858

Keywords:

Machine Learning, Microcytosis Anaemia, Iron Deficiency, Beta Thalassemia Trait, Clinical Decision Support

Abstract

This study compares machine learning (ML) classifiers for differentiating Iron Deficiency Anaemia (IDA) from Beta Thalassemia Trait (BTT), with the goal of identifying the classifiers best suited to complex, overlapping medical data. Twelve ML algorithms were systematically evaluated on a locally curated dataset of 2,160 participants (1,080 diagnosed with IDA and 1,040 with BTT). The models comprise eight classical classifiers (Logistic Regression, Decision Tree, Random Forest, Support Vector Machine (SVM), Multilayer Perceptron (MLP), Linear Discriminant Analysis (LDA), k-Nearest Neighbors (KNN), and Naive Bayes) and four ensemble-based methods (Gradient Boosting, AdaBoost, XGBoost, and CatBoost). Pre-processing included outlier removal, imputation of missing values, class balancing, and feature scaling. An 80:20 train-test split was used, and performance was validated with 5-fold cross-validation to mitigate the risk of overfitting. SVM achieved the highest accuracy (97.0%), with an AUC of 99.1%, sensitivity of 95.9%, specificity of 98.7%, and the fastest execution time (0.37 seconds). AdaBoost and MLP followed closely, with accuracies of 96.9% and 96.8%, respectively. All models demonstrated high robustness, with F1-scores exceeding 96.8%. SVM offered the best trade-off between diagnostic performance and computational efficiency. These results highlight the promise of enhanced ML models, especially SVM, as dependable and economical diagnostic aids for IDA-BTT differentiation in clinical settings, offering an alternative to costly laboratory tests while maintaining diagnostic precision. The proposed framework underscores the feasibility of deploying ML techniques in haematological diagnostics based on routine clinical data.
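The evaluation metrics named in the abstract (accuracy, sensitivity, specificity, F1-score, and AUC) can be illustrated with a minimal Python sketch. It is not the author's code: the function names `binary_metrics` and `roc_auc`, the choice of BTT as the positive class, and the toy labels and scores are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    sensitivity: float
    specificity: float
    f1: float


def binary_metrics(y_true, y_pred, positive=1):
    """Compute confusion-matrix metrics for a two-class problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    # Sensitivity (recall): share of true positives that were detected.
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    # Specificity: share of true negatives that were correctly rejected.
    specificity = tn / (tn + fp) if tn + fp else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    # F1 is the harmonic mean of precision and sensitivity.
    f1 = (2 * precision * sensitivity / (precision + sensitivity)
          if precision + sensitivity else 0.0)
    return BinaryMetrics(
        accuracy=(tp + tn) / len(pairs),
        sensitivity=sensitivity,
        specificity=specificity,
        f1=f1,
    )


def roc_auc(y_true, scores, positive=1):
    """AUC as the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative one (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy, made-up labels and classifier scores; NOT data from the study.
# Assumption: 1 = BTT (positive class), 0 = IDA.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
y_pred = [int(s >= 0.5) for s in scores]  # threshold at 0.5

m = binary_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(m, auc)
```

Accuracy, sensitivity, specificity, and F1 depend on the decision threshold (0.5 in this sketch), whereas AUC summarises ranking quality across all thresholds, which is why a model's AUC can be higher than its accuracy.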

Published

2025-06-03

How to Cite

Salma Abdulbaki Mahmood. (2025). Machine Learning Classifiers for Differentiation between Iron Deficiency Anaemia and Beta Thalassemia Trait: comparative study. International Journal of Computational and Experimental Science and Engineering, 11(3). https://doi.org/10.22399/ijcesen.2858

Issue

Section

Research Article