Developing a Deep Learning–Based Model for Predicting and Detecting Fraud in Financial Statements

Document Type: Original Article

Authors

1 Faculty of Social Sciences and Economics, Alzahra University, Tehran, Iran. E-mail: n.mehrabi@alzahra.ac.ir

2 Alzahra University, Tehran, Iran (Corresponding Author). E-mail: gh.soleymani@alzahra.ac.ir

Abstract
This study developed a data-driven framework for financial statement fraud detection by benchmarking machine learning, deep learning, and hybrid classifiers under a unified, leakage-resistant evaluation protocol. Fraud cases were identified from the U.S. Securities and Exchange Commission’s Accounting and Auditing Enforcement Releases (AAERs) and matched with Compustat data over 1991–2014, producing 122,526 firm-year observations, of which 902 were confirmed fraud cases. Four structured-input configurations were evaluated: 28 raw financial statement items, 14 financial ratios, their combined set (28+14), and a parsimonious seven-feature subset (six ratios plus Altman’s Z-score). Features were selected using minimum redundancy–maximum relevance (mRMR), class imbalance was addressed via cost-sensitive learning, and performance was assessed with a firm-level 80/20 train/test split and stratified group-based five-fold cross-validation within the training set. The empirical results indicated that deep and hybrid models consistently outperformed classical tabular baselines, which is consistent with fraud signals that are non-linear and interaction-driven. The Transformer achieved the highest and most stable overall performance, reaching an accuracy of 0.98898 and an F1-score of 0.51087 under the seven-feature configuration. The combined raw-item and ratio inputs outperformed the ratios alone, implying that raw accounting items carry incremental predictive value, while the best overall outcomes were obtained with the parsimonious seven-feature subset. Collectively, the findings supported the study’s hypotheses and demonstrated the effectiveness of attention-based modeling for financial statement fraud detection.
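
The leakage-resistant protocol described in the abstract (a firm-level 80/20 split followed by stratified, group-based five-fold cross-validation inside the training set) can be illustrated with a minimal sketch. The abstract does not give the authors' implementation, so everything below is an assumption made for illustration: the function names `firm_level_split` and `grouped_stratified_folds` are hypothetical, the data are synthetic, and the greedy fold assignment is a simple stand-in for a library routine such as scikit-learn's `StratifiedGroupKFold`. The point it shows is that whole firms, not individual firm-years, are assigned to a partition, so no firm contributes observations to both sides of any split.

```python
import random
from collections import defaultdict


def firm_level_split(firm_ids, test_frac=0.2, seed=0):
    """Split row indices so that every firm lands entirely in train or test."""
    firms = sorted(set(firm_ids))
    rng = random.Random(seed)
    rng.shuffle(firms)
    n_test = max(1, round(test_frac * len(firms)))
    test_firms = set(firms[:n_test])
    train_idx = [i for i, f in enumerate(firm_ids) if f not in test_firms]
    test_idx = [i for i, f in enumerate(firm_ids) if f in test_firms]
    return train_idx, test_idx


def grouped_stratified_folds(firm_ids, labels, indices, k=5):
    """Greedily assign whole firms to k folds (illustrative heuristic).

    Firms with fraud years are spread so fraud counts stay balanced;
    the remaining firms are then used to balance fold sizes.
    """
    pos = defaultdict(int)   # fraud firm-years per firm
    size = defaultdict(int)  # total firm-years per firm
    for i in indices:
        pos[firm_ids[i]] += labels[i]
        size[firm_ids[i]] += 1

    fold_pos = [0] * k
    fold_size = [0] * k
    fold_of = {}
    # Place firms with the most fraud years first, then the largest firms.
    for firm in sorted(size, key=lambda f: (-pos[f], -size[f], f)):
        if pos[firm] > 0:
            j = min(range(k), key=lambda j: (fold_pos[j], fold_size[j]))
        else:
            j = min(range(k), key=lambda j: (fold_size[j], fold_pos[j]))
        fold_of[firm] = j
        fold_pos[j] += pos[firm]
        fold_size[j] += size[firm]

    folds = [[] for _ in range(k)]
    for i in indices:
        folds[fold_of[firm_ids[i]]].append(i)
    return folds


# Synthetic panel (assumption): 50 firms x 4 years, one fraud year in every 5th firm.
firm_ids = [f for f in range(50) for _ in range(4)]
labels = [1 if (f % 5 == 0 and y == 0) else 0 for f in range(50) for y in range(4)]

train_idx, test_idx = firm_level_split(firm_ids, test_frac=0.2, seed=0)
folds = grouped_stratified_folds(firm_ids, labels, train_idx, k=5)

for n, fold in enumerate(folds):
    n_fraud = sum(labels[i] for i in fold)
    print(f"fold {n}: {len(fold)} firm-years, {n_fraud} fraud")
```

In each cross-validation round, one fold would serve as the validation set and the other four as training data, while the held-out 20% of firms is used only for the final evaluation. Splitting by firm matters because consecutive firm-years of the same company are highly correlated, so a row-level split could let a model recognize a firm rather than detect fraud.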
