How Do Credit Card Fraud Detection Models Make Decisions? An Explainable Machine Learning Analysis
DOI: https://doi.org/10.32628/CSEIT2612120

Keywords: Credit Card Fraud Detection, Machine Learning, Explainable AI, SHAP, Class Imbalance

Abstract
Credit card fraud detection is an important application of machine learning in which incorrect predictions can lead to financial losses and reduced customer trust. Although many fraud detection models achieve high accuracy, their decision-making process is often opaque, which limits their adoption in real financial systems. This paper presents an experimental study of machine learning models for credit card fraud detection with a focus on explainability. Logistic Regression, Random Forest, and Gradient Boosting models are evaluated on a real-world credit card transaction dataset with extreme class imbalance. Model performance is assessed using precision, recall, F1-score, ROC-AUC, and confusion matrix analysis, with particular emphasis on fraud recall. To address class imbalance, the Synthetic Minority Over-sampling Technique (SMOTE) is applied during training. To understand model decision-making, SHAP-based global and local explanations are generated for the best-performing model. Experimental results show that Random Forest provides the most balanced performance, achieving high fraud recall while controlling false positives. The explainability analysis shows that fraud predictions are driven by a small number of influential features. These findings highlight the importance of evaluating both performance and explainability when deploying fraud detection systems in financial environments.
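As a concrete illustration of the pipeline the abstract describes, the sketch below trains and evaluates a Random Forest with SMOTE applied only to the training split. It assumes the widely used ULB/Kaggle credit card transactions dataset (a creditcard.csv file with a Class label); the file name, split ratio, and hyperparameters are illustrative assumptions, not the paper's reported settings.

```python
# Minimal sketch of the training/evaluation pipeline described in the abstract.
# Dataset path, split ratio, and hyperparameters are illustrative assumptions.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, roc_auc_score, confusion_matrix
from imblearn.over_sampling import SMOTE

df = pd.read_csv("creditcard.csv")
X, y = df.drop(columns=["Class"]), df["Class"]

# Stratified split so the rare fraud class appears in both partitions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# SMOTE is applied to the training fold only; the test fold keeps its natural imbalance.
X_res, y_res = SMOTE(random_state=42).fit_resample(X_train, y_train)

model = RandomForestClassifier(n_estimators=200, n_jobs=-1, random_state=42)
model.fit(X_res, y_res)

y_pred = model.predict(X_test)
y_prob = model.predict_proba(X_test)[:, 1]

print(classification_report(y_test, y_pred, digits=4))   # precision, recall, F1 per class
print("ROC-AUC:", roc_auc_score(y_test, y_prob))
print("Confusion matrix:\n", confusion_matrix(y_test, y_pred))
```

Evaluating on the untouched test split keeps the reported fraud recall and false-positive counts representative of the original class distribution.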
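The second sketch illustrates the SHAP analysis step on the forest fitted above. The subsample size and choice of plots are assumptions; the abstract only states that global and local SHAP explanations were generated for the best-performing model.

```python
# Sketch of SHAP-based global and local explanations for the fitted Random Forest.
import numpy as np
import shap

explainer = shap.TreeExplainer(model)
X_sample = X_test.sample(1000, random_state=42)   # subsample keeps SHAP computation tractable

sv = explainer.shap_values(X_sample)
# Older shap releases return a per-class list for classifiers; newer ones return a single
# (samples, features, classes) array. Either way, keep the fraud (positive-class) values.
fraud_sv = sv[1] if isinstance(sv, list) else sv[..., 1]
base_value = np.ravel(explainer.expected_value)[-1]

# Global explanation: which features drive fraud predictions across the sample.
shap.summary_plot(fraud_sv, X_sample)

# Local explanation: feature contributions behind one individual prediction.
shap.force_plot(base_value, fraud_sv[0], X_sample.iloc[0], matplotlib=True)
```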
References
L. Breiman, “Random forests,” Machine Learning, vol. 45, no. 1, pp. 5–32, 2001. DOI: https://doi.org/10.1023/A:1010933404324
J. H. Friedman, “Greedy function approximation: A gradient boosting machine,” Annals of Statistics, vol. 29, no. 5, pp. 1189–1232, 2001. DOI: https://doi.org/10.1214/aos/1013203451
S. M. Lundberg and S.-I. Lee, “A unified approach to interpreting model predictions,” Advances in Neural Information Processing Systems (NeurIPS), pp. 4765–4774, 2017.
N. V. Chawla, K. W. Bowyer, L. O. Hall, and W. P. Kegelmeyer, “SMOTE: Synthetic minority over-sampling technique,” Journal of Artificial Intelligence Research, vol. 16, pp. 321–357, 2002. DOI: https://doi.org/10.1613/jair.953
A. Dal Pozzolo, G. Bontempi, et al., “Adversarial drift detection in streaming data,” IEEE Transactions on Neural Networks and Learning Systems, vol. 25, no. 10, pp. 1929–1942, 2014.
C. Whitrow, D. J. Hand, P. Juszczak, D. Weston, and N. M. Adams, “Transaction aggregation as a strategy for credit card fraud detection,” Data Mining and Knowledge Discovery, vol. 18, no. 1, pp. 30–55, 2009. DOI: https://doi.org/10.1007/s10618-008-0116-z
A. Dal Pozzolo, G. Bontempi, M. Snoeck, et al., “Calibrating probability with undersampling for unbalanced classification,” in Proc. IEEE Symposium Series on Computational Intelligence, 2015, pp. 159–166. DOI: https://doi.org/10.1109/SSCI.2015.33
H. He and E. A. Garcia, “Learning from imbalanced data,” IEEE Transactions on Knowledge and Data Engineering, vol. 21, no. 9, pp. 1263–1284, 2009. DOI: https://doi.org/10.1109/TKDE.2008.239
A. Fernández, S. García, F. Herrera, and N. V. Chawla, “SMOTE for learning from imbalanced data: Progress and challenges, marking the 15-year anniversary,” Journal of Artificial Intelligence Research, vol. 61, pp. 863–905, 2018.
S. M. Lundberg, G. Erion, H. Chen, et al., “From local explanations to global understanding with explainable AI for trees,” Nature Machine Intelligence, vol. 2, no. 1, pp. 56–67, 2020. DOI: https://doi.org/10.1038/s42256-019-0138-9
License
Copyright (c) 2026 International Journal of Scientific Research in Computer Science, Engineering and Information Technology

This work is licensed under a Creative Commons Attribution 4.0 International License.