Proactive Detection of Tax Fraud Using Explainable AI Techniques: A Hybrid Approach

Tax fraud continues to cause significant financial losses for governments worldwide. In the United States alone, the Internal Revenue Service estimates an annual tax gap of approximately $441 billion, much of it due to intentional evasion and fraud. Traditional detection methods rely on static rules, manual audits, and retrospective analysis, which often fail to keep pace with increasingly sophisticated and adaptive fraud schemes. In this study, we develop a hybrid AI framework for tax fraud detection that combines a gradient-boosted decision tree (GBDT) model (via XGBoost) with a deep neural network (DNN) that incorporates an attention mechanism. Crucially, we integrate explainable AI (XAI) techniques (e.g., SHAP and attention heatmaps) so that the model’s predictions are transparent to tax auditors. On a realistic synthetic dataset of 10,000 tax returns (10% fraudulent), our hybrid model achieved 92% accuracy, 88% recall, 91% precision, and 0.95 AUC, significantly outperforming conventional approaches (p < 0.01). Moreover, it enables proactive detection by identifying potentially fraudulent returns early in the filing process (before full processing) and generating a clear explanation for each flagged return. This approach advances both theoretical research and practical application: it demonstrates the effectiveness of hybrid modeling and XAI in regulatory settings, and it provides a scalable roadmap for tax authorities to modernize fraud detection within legal, ethical, and operational constraints. (Source code: https://github.com/aalosbeh/tax-fraud-xai)

Anas AlSobeh
Southern Illinois University Carbondale
United States
anas.alsobeh@siu.edu

Mustafa Abo El Rob
University of Denver
United States
mustafa.aboelrob@du.edu

Kamel Rouibah
Kuwait University
Kuwait
kamel.rouibah@ku.edu.kw

Amani Shatnawi
Weber State University
United States
amanishtanawi1@weber.edu