Executive Summary
Key fraud detection performance metrics and model recommendations
Random Forest achieved the best balance, with an F1 score of 0.934 and AUC of 0.973 on the held-out test set of 599 transactions. Fraud recall (sensitivity) of 89.1% means the model caught 89.1% of fraudulent transactions, which is critical for minimizing fraud loss. The original dataset has a 10.80% fraud rate; after SMOTE oversampling of the minority class in the training set, all models were less biased toward predicting legitimate transactions.
Fraud Class Imbalance
Distribution of fraudulent vs legitimate transactions in the original dataset
The original dataset contains 2000 transactions with only 216 frauds (10.80% fraud rate), illustrating severe class imbalance typical of real-world fraud detection problems. This imbalance is why SMOTE oversampling is essential during training—without it, models would achieve high accuracy by simply predicting 'legitimate' for every transaction.
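The SMOTE idea can be sketched in plain NumPy: each synthetic fraud sample is a random interpolation between a minority sample and one of its k nearest minority neighbours. This is a minimal illustration only; in practice a library implementation such as `imblearn.over_sampling.SMOTE` would be used, and the 2-D fraud points below are toy data, not the real transactions.

```python
import numpy as np

def smote_oversample(X_min, n_new, k=5, seed=0):
    """Generate n_new synthetic minority samples by interpolating
    between each sample and one of its k nearest minority neighbours."""
    rng = np.random.default_rng(seed)
    n = len(X_min)
    # pairwise distances within the minority class
    d = np.linalg.norm(X_min[:, None] - X_min[None, :], axis=-1)
    np.fill_diagonal(d, np.inf)                  # exclude self-matches
    neighbours = np.argsort(d, axis=1)[:, :k]    # k nearest neighbours per sample
    synth = []
    for _ in range(n_new):
        i = rng.integers(n)                                  # pick a minority sample
        j = neighbours[i, rng.integers(min(k, n - 1))]       # and one of its neighbours
        lam = rng.random()                                   # interpolation factor in [0, 1]
        synth.append(X_min[i] + lam * (X_min[j] - X_min[i]))
    return np.vstack(synth)

# toy minority class: 6 "fraud" samples in a 2-D feature space
X_fraud = np.array([[0., 0.], [1., 0.], [0., 1.], [1., 1.], [0.5, 0.5], [0.2, 0.8]])
X_new = smote_oversample(X_fraud, n_new=10)
print(X_new.shape)  # (10, 2)
```

Because each synthetic point lies on a segment between two real minority samples, the oversampled class fills in the minority region rather than duplicating exact rows.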
Feature Correlations with Fraud
Top 10 principal components ranked by correlation strength with fraud label
The strongest correlation with fraud is pc_14 at -0.805, making it the single most predictive component. Correlations range from -0.805 to 0.693, showing that the fraud signal is spread across multiple components rather than concentrated in one. This supports the use of ensemble methods that can capture non-linear combinations of these features.
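Ranking components by correlation with the fraud label can be reproduced with a short NumPy sketch. The data here is synthetic, with a negative signal injected into a stand-in for pc_14 purely for illustration; the real correlations come from the actual PCA-transformed transactions.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 2000
y = (rng.random(n) < 0.108).astype(float)   # ~10.8% fraud labels, as in the dataset
X = rng.standard_normal((n, 15))            # hypothetical pc_1 .. pc_15
X[:, 13] -= 2.0 * y                         # inject a negative fraud signal into "pc_14"

# Pearson correlation of each component with the fraud label
corrs = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
ranked = np.argsort(-np.abs(corrs))         # rank by correlation magnitude
print(f"top component: pc_{ranked[0] + 1}, r = {corrs[ranked[0]]:.3f}")
```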
Model Performance Comparison
Accuracy, precision, recall, and F1 scores for three classification algorithms on the test set
| Model Name | Accuracy | Precision | Recall | F1 Score |
|---|---|---|---|---|
| Logistic Regression | 0.98 | 0.9333 | 0.875 | 0.9032 |
| Random Forest | 0.9866 | 0.9828 | 0.8906 | 0.9344 |
| XGBoost | 0.9866 | 0.9828 | 0.8906 | 0.9344 |
Random Forest and XGBoost produced identical test-set metrics, sharing the best F1 score of 0.9344 and balancing precision (0.9828) against recall (0.8906). Recall is critical for fraud detection because a missed fraud (false negative) is more costly than a false alarm. All three models perform well, with Random Forest (the final model) offering a strong trade-off between catching fraud and limiting false positives.
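A comparison along these lines can be reproduced with scikit-learn. This sketch uses synthetic data in place of the real transactions and `GradientBoostingClassifier` as a stand-in for XGBoost (which requires the separate `xgboost` package), so its numbers will differ from the table above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split

# synthetic stand-in for the PCA-transformed transactions (~10.8% fraud)
X, y = make_classification(n_samples=2000, n_features=15, weights=[0.892],
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, stratify=y,
                                          random_state=0)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),  # XGBoost stand-in
}
results = {}
for name, model in models.items():
    pred = model.fit(X_tr, y_tr).predict(X_te)
    results[name] = {"precision": precision_score(y_te, pred),
                     "recall": recall_score(y_te, pred),
                     "f1": f1_score(y_te, pred)}
    print(name, {k: round(v, 3) for k, v in results[name].items()})
```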
Feature Importance (Top Features)
Mean Decrease in Gini importance for top 10 principal components from Random Forest model
pc_14 is the most important feature (importance 49.706), followed by pc_10 (36.574). These top features should be prioritized in future data collection and model monitoring. The importance distribution shows that fraud detection relies on an ensemble of many features rather than a single dominant predictor, justifying the use of non-linear models.
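With scikit-learn's Random Forest, per-feature importances (mean decrease in impurity) can be read off directly; note that scikit-learn normalizes them to sum to 1, unlike the raw Gini values quoted above. Feature names and data here are illustrative.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# synthetic stand-in: 15 "principal components", a few of them informative
X, y = make_classification(n_samples=1000, n_features=15, n_informative=4,
                           random_state=0)
rf = RandomForestClassifier(random_state=0).fit(X, y)

# impurity-based (mean decrease in Gini) importances, one per component
names = [f"pc_{i + 1}" for i in range(X.shape[1])]
order = np.argsort(rf.feature_importances_)[::-1]   # most important first
for i in order[:3]:
    print(f"{names[i]}: {rf.feature_importances_[i]:.3f}")
```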
ROC Curve (Best Model)
Trade-off between true positive rate (fraud detection) and false positive rate across all classification thresholds
The ROC curve shows AUC of 0.973 for Random Forest, indicating strong discrimination ability between fraudulent and legitimate transactions. At a false positive rate of 5%, the model achieves ~93.8% fraud detection (TPR). This curve enables threshold selection based on business priorities: lower threshold for aggressive fraud prevention (high recall), higher threshold to minimize false alarms.
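AUC and the best TPR achievable under a 5% false-positive cap can both be computed from the output of `roc_curve`. This sketch uses synthetic stand-in data, so its values will differ from the 0.973 / ~93.8% reported above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split

# synthetic stand-in data (~10.8% fraud), as in the other sketches
X, y = make_classification(n_samples=2000, n_features=15, weights=[0.892],
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, stratify=y,
                                          random_state=0)
proba = RandomForestClassifier(random_state=0).fit(X_tr, y_tr).predict_proba(X_te)[:, 1]

fpr, tpr, thresholds = roc_curve(y_te, proba)
auc = roc_auc_score(y_te, proba)
# best fraud-detection rate while keeping false positives at or below 5%
tpr_at_5pct_fpr = tpr[fpr <= 0.05].max()
print(f"AUC = {auc:.3f}, TPR at FPR<=5% = {tpr_at_5pct_fpr:.3f}")
```

Scanning the `(fpr, tpr, thresholds)` triples is exactly the threshold-selection exercise the report describes: pick the threshold whose operating point matches the business tolerance for false alarms.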
Confusion Matrix
Breakdown of correct and incorrect predictions for fraudulent and legitimate transactions
Out of 599 test transactions, the model correctly classified 534 legitimate transactions (specificity 99.8%) and 57 frauds (sensitivity 89.1%). There was 1 false positive (a legitimate transaction flagged as fraud) and 7 false negatives (frauds missed). The 7 missed frauds are the most costly errors in practice.
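The reported cells and rates are internally consistent, as a quick check with `confusion_matrix` on toy labels reproducing the four counts shows:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# toy labels reproducing the reported cells: 534 TN, 1 FP, 7 FN, 57 TP
y_true = np.array([0] * 535 + [1] * 64)
y_pred = np.array([0] * 534 + [1] + [0] * 7 + [1] * 57)

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
sensitivity = tp / (tp + fn)   # fraud recall: 57 / 64
specificity = tn / (tn + fp)   # 534 / 535
print(tn, fp, fn, tp, f"sens={sensitivity:.3f}", f"spec={specificity:.4f}")
```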
Final Model Metrics
Summary of key performance metrics and deployment-ready threshold recommendations
| Metric Name | Metric Value |
|---|---|
| AUC-ROC | 0.9734 |
| Accuracy | 0.9866 |
| Balanced Accuracy | 0.9444 |
| Sensitivity (Recall) | 0.8906 |
| Specificity | 0.9981 |
| Precision | 0.9828 |
| F1 Score | 0.9344 |
| Deployment Threshold (80% Fraud Recall) | 0.785 |
The Random Forest model achieves AUC 0.973, indicating strong ability to discriminate frauds from legitimate transactions. At the deployment threshold of 0.785 (chosen to guarantee at least 80% fraud recall), sensitivity is 89.1% and specificity is 99.8%. Balanced accuracy of 94.4% accounts for both fraud detection and false alarm minimization. These metrics qualify the model for production deployment.
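Choosing a threshold that guarantees a minimum fraud recall amounts to sweeping the score distribution from the top down until the target recall is reached. A minimal sketch, assuming hypothetical scores in which frauds tend to score higher than legitimate transactions:

```python
import numpy as np

def threshold_for_recall(y_true, scores, target_recall=0.80):
    """Highest threshold at which recall on the positive class first
    reaches target_recall (classify as fraud when score >= threshold)."""
    order = np.argsort(scores)[::-1]                  # descending by fraud score
    hits = np.asarray(y_true)[order]
    recall_curve = np.cumsum(hits) / hits.sum()       # recall after each cut point
    idx = np.argmax(recall_curve >= target_recall)    # first point meeting the target
    return scores[order][idx]

# hypothetical scores: ~10.8% frauds, scoring well above legitimates
rng = np.random.default_rng(0)
y = rng.random(1000) < 0.108
scores = 0.7 * y + 0.2 * rng.random(1000)

t = threshold_for_recall(y, scores, target_recall=0.80)
achieved = ((scores >= t) & y).sum() / y.sum()
print(f"threshold={t:.3f}, achieved recall={achieved:.3f}")
```

Because the cut includes every transaction scoring at or above the returned threshold, the achieved recall is always at least the target, mirroring how the 0.785 threshold targets 80% recall while the model actually delivers 89.1%.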