Executive Summary
Key model performance metrics for fraud detection
Logistic regression achieved AUC 0.984 and isolation forest achieved AUC 0.882 on the 5,000-transaction sample containing 492 confirmed fraud cases (9.84%). At the F1-optimal threshold, the logistic model catches 86.8% of fraud cases with a precision of 97.7%: 427 fraud cases are correctly flagged alongside 10 legitimate transactions.
Fraud Prevalence: Sample vs Original Dataset
Fraud rate comparison between the 5,000-row fraud-enriched sample and the full Kaggle dataset
The training sample has a fraud rate of 9.84%, compared to just 0.17% in the full 284,807-transaction dataset, so it is enriched rather than balanced. This roughly 57-fold enrichment of the fraud class ensures both models see enough fraud examples to learn discriminative patterns. Because precision depends on prevalence, the precision measured on this sample will overstate what can be expected in production; estimates of false-positive volume should be rescaled to the original 0.17% fraud rate.
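The prevalence caveat can be made concrete with a short calculation. The sketch below rescales the logistic model's precision from the sample fraud rate to the full-dataset rate using Bayes' rule. It rests on two assumptions that the report does not confirm: that the 4,508 legitimate rows in the sample are representative of legitimate traffic, and that the model's true-positive and false-positive rates carry over unchanged. The function name `precision_at_prevalence` is illustrative.

```python
def precision_at_prevalence(tpr: float, fpr: float, prevalence: float) -> float:
    """Expected precision at a given fraud rate, via Bayes' rule."""
    true_pos = tpr * prevalence
    false_pos = fpr * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Counts reported for the logistic model at its F1-optimal threshold.
tp, fn, fp = 427, 65, 10
legit_in_sample = 5000 - 492  # legitimate transactions in the sample

tpr = tp / (tp + fn)          # recall
fpr = fp / legit_in_sample    # false-positive rate (assumed stable)

sample_precision = precision_at_prevalence(tpr, fpr, 492 / 5000)
full_precision = precision_at_prevalence(tpr, fpr, 492 / 284807)

print(f"precision at sample prevalence:       {sample_precision:.3f}")
print(f"precision at full-dataset prevalence: {full_precision:.3f}")
```

Under these assumptions the same threshold that yields 97.7% precision on the sample would yield a precision of about 0.40 at the original prevalence, which is the figure to use when sizing analyst review queues.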
Isolation Forest Anomaly Score by Class
Distribution of isolation forest anomaly scores for fraudulent vs legitimate transactions
Fraudulent transactions have a median isolation forest score of 0.612 versus 0.587 for legitimate transactions, a gap of 0.025. Higher scores indicate observations that are easier to isolate, meaning they fall in sparse regions of feature space. How far apart the two distributions sit, and how much they overlap, shows how well the unsupervised model can discriminate without access to the fraud label.
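For readers unfamiliar with how path length becomes a score, the sketch below implements the anomaly score from the original isolation forest formulation, s(x, n) = 2^(-E[h(x)] / c(n)). The report does not state which implementation produced its scores, so this is an illustration of the standard definition rather than the exact computation used here; the subsample size of 256 is chosen purely for the example. Some libraries use a different sign convention (scikit-learn's `score_samples`, for instance, returns the negated score).

```python
import math

EULER_GAMMA = 0.5772156649


def average_path_length(n: int) -> float:
    """c(n): expected path length of an unsuccessful search in a binary tree of n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + EULER_GAMMA  # approximation of H(n - 1)
    return 2.0 * harmonic - 2.0 * (n - 1) / n


def anomaly_score(mean_path_length: float, subsample_size: int) -> float:
    """Score in (0, 1]; values near 1 mean the point is isolated after few splits."""
    return 2.0 ** (-mean_path_length / average_path_length(subsample_size))


n = 256  # illustrative subsample size, not taken from the report
typical = anomaly_score(average_path_length(n), n)  # path length equal to c(n)
easy_to_isolate = anomaly_score(4.0, n)             # much shorter path

print(f"typical point:   {typical:.3f}")
print(f"short-path point: {easy_to_isolate:.3f}")
```

A point whose average path length equals c(n) scores exactly 0.5, so medians of 0.587 and 0.612 both sit somewhat above the "unremarkable" level on this scale.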
Logistic Regression Coefficients (Top Features)
Log-odds coefficients from logistic regression showing each PCA component's contribution to fraud probability
Of the top 20 features by coefficient magnitude, 9 are statistically significant at p < 0.05. The largest-magnitude coefficient is V4 (+0.855), meaning a one-unit increase in V4 shifts the fraud log-odds more than a one-unit increase in any other feature. Positive coefficients increase estimated fraud probability while negative coefficients reduce it. Features such as V14 and V17 are well-known fraud discriminators in this dataset.
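A log-odds coefficient is easier to read once converted to an odds ratio. The sketch below does this for the reported V4 coefficient and shows the effect on a probability. The 10% baseline is an arbitrary illustration, and "one unit" refers to whatever scale V4 had when the model was fitted, which the report does not specify.

```python
import math


def shift_probability(baseline_prob: float, coef: float, delta: float = 1.0) -> float:
    """Probability after moving a feature by `delta`, holding other features fixed."""
    odds = baseline_prob / (1.0 - baseline_prob)
    new_odds = odds * math.exp(coef * delta)
    return new_odds / (1.0 + new_odds)


coef_v4 = 0.855                    # log-odds coefficient reported for V4
odds_ratio = math.exp(coef_v4)     # multiplicative change in odds per unit of V4

baseline = 0.10                    # hypothetical starting fraud probability
shifted = shift_probability(baseline, coef_v4)

print(f"odds ratio per unit of V4: {odds_ratio:.2f}")
print(f"probability {baseline:.2f} becomes {shifted:.3f} after +1 unit of V4")
```

The odds ratio is roughly 2.35, so each additional unit of V4 a little more than doubles the odds of fraud with all other components held constant.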
ROC Curve — Logistic vs Isolation Forest
ROC curves comparing logistic regression and isolation forest across all classification thresholds
Logistic regression achieves AUC = 0.984 and isolation forest achieves AUC = 0.882 on this dataset. AUC measures the probability that a randomly chosen fraud transaction is ranked above a randomly chosen legitimate transaction. At any given false-positive rate, the model whose curve sits higher (closer to the top-left corner) catches more fraud, which makes it preferable for production use where analyst capacity limits review volume.
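The ranking interpretation of AUC can be checked directly by comparing every fraud score against every legitimate score. The sketch below does so on made-up scores that are not taken from the report; counting ties as half a win is the usual convention.

```python
def pairwise_auc(fraud_scores, legit_scores):
    """Fraction of (fraud, legit) pairs in which the fraud case scores higher."""
    wins = 0.0
    for f in fraud_scores:
        for g in legit_scores:
            if f > g:
                wins += 1.0
            elif f == g:
                wins += 0.5  # ties count as half a win
    return wins / (len(fraud_scores) * len(legit_scores))


# Made-up scores for illustration only.
fraud = [0.9, 0.8, 0.4]
legit = [0.7, 0.3, 0.2, 0.1]

auc = pairwise_auc(fraud, legit)
print(f"AUC = {auc:.3f}")
```

Here 11 of the 12 pairs are ordered correctly, giving an AUC of about 0.917. The quadratic loop is only practical for small inputs; rank-based formulas give the same value on large samples.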
Precision-Recall Curve — Logistic vs Isolation Forest
PR curves showing the precision-recall trade-off for both models under class imbalance
Under the class imbalance present in the sample (9.84% fraud), precision-recall curves give a more informative picture of operational performance than ROC curves, because precision is directly affected by the number of false positives relative to the small number of fraud cases. A model with high PR-AUC maintains strong precision even as recall is pushed toward 1.0, minimising the number of legitimate transactions sent to a human reviewer for each additional fraud case caught.
Confusion Matrix at Optimal F1 Threshold
Confusion matrix for logistic regression at the F1-maximising probability threshold (0.428)
At the F1-optimal threshold of 0.428, the logistic model correctly identifies 427 of 492 fraud cases (recall = 86.8%) while generating 10 false positives. The remaining 65 fraud cases are missed (false negatives); these represent the highest operational risk because they pass through undetected.
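The report does not show how the threshold was selected, so the following is a minimal sketch of one common approach: try each distinct predicted probability as a cut-off and keep the one with the highest F1. The function names and the toy data are illustrative. The last lines compute the F1 implied by the counts reported above.

```python
def confusion_counts(labels, probs, threshold):
    """Return (tp, fp, fn) when flagging every probability >= threshold."""
    tp = sum(1 for y, p in zip(labels, probs) if y == 1 and p >= threshold)
    fp = sum(1 for y, p in zip(labels, probs) if y == 0 and p >= threshold)
    fn = sum(1 for y, p in zip(labels, probs) if y == 1 and p < threshold)
    return tp, fp, fn


def f1_from_counts(tp, fp, fn):
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_f1_threshold(labels, probs):
    """Sweep candidate thresholds and return (best_f1, best_threshold)."""
    best_f1, best_t = 0.0, None
    for t in sorted(set(probs)):
        score = f1_from_counts(*confusion_counts(labels, probs, t))
        if score > best_f1:
            best_f1, best_t = score, t
    return best_f1, best_t


# Toy data for illustration only (1 = fraud, 0 = legitimate).
toy_labels = [1, 1, 0, 1, 0, 0]
toy_probs = [0.95, 0.80, 0.60, 0.45, 0.30, 0.10]
toy_f1, toy_threshold = best_f1_threshold(toy_labels, toy_probs)

# F1 implied by the reported counts: 427 TP, 10 FP, 65 FN.
reported_f1 = f1_from_counts(tp=427, fp=10, fn=65)
print(f"toy best threshold: {toy_threshold}, F1 = {toy_f1:.3f}")
print(f"F1 from reported counts: {reported_f1:.3f}")
```

The reported counts correspond to an F1 of about 0.919. Because F1 weights false positives and false negatives equally, a threshold chosen this way may not match the business cost of a missed fraud case; a cost-weighted objective is a reasonable alternative.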
Top 20 Most Anomalous Transactions
Highest-risk transactions ranked by blended isolation forest and logistic regression score
| Transaction ID | Amount ($) | Fraud Probability | Anomaly Score | Class Label |
|---|---|---|---|---|
| 1999 | 0.01 | 1 | 0.7242 | Fraud |
| 4597 | 2.28 | 1 | 0.7222 | Fraud |
| 4927 | 0 | 1 | 0.7183 | Fraud |
| 28 | 1 | 1 | 0.7154 | Fraud |
| 737 | 1 | 1 | 0.7101 | Fraud |
| 1554 | 1 | 1 | 0.7101 | Fraud |
| 3238 | 1 | 1 | 0.7101 | Fraud |
| 3644 | 1 | 1 | 0.7101 | Fraud |
| 1297 | 2.28 | 1 | 0.7077 | Fraud |
| 1996 | 1 | 1 | 0.7039 | Fraud |
| 2911 | 1.63 | 1 | 0.7025 | Fraud |
| 3938 | 9.82 | 1 | 0.701 | Fraud |
| 301 | 364.2 | 1 | 0.6944 | Fraud |
| 1733 | 1.63 | 1 | 0.6888 | Fraud |
| 1976 | 106.5 | 1 | 0.6874 | Fraud |
| 1371 | 8.64 | 1 | 0.6869 | Fraud |
| 1506 | 1219 | 1 | 0.6851 | Fraud |
| 4223 | 1 | 1 | 0.6851 | Fraud |
| 1654 | 139.9 | 1 | 0.6823 | Fraud |
| 3025 | 30.31 | 1 | 0.6745 | Fraud |
All 20 transactions with the highest blended score are confirmed fraud cases; no legitimate transaction appears in the top 20. The blended score averages the normalised isolation forest score and the logistic regression probability, prioritising transactions that both models agree are suspicious. Because the logistic probability rounds to 1 for every row shown, the ordering within this list is determined by the isolation forest score. Transaction amounts range from $0 to $1,218.89 (displayed as 1219 in the table) across the top 20.
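The blending step can be sketched as follows. The report says the isolation forest score is normalised before averaging but not how, so min-max scaling is an assumption here, as are the function names and the example values.

```python
def min_max(values):
    """Rescale values to [0, 1]; a constant input maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def blended_scores(anomaly_scores, fraud_probs):
    """Average of the normalised anomaly score and the logistic probability."""
    normalised = min_max(anomaly_scores)
    return [(a + p) / 2.0 for a, p in zip(normalised, fraud_probs)]


def top_k(ids, scores, k):
    """IDs of the k highest-scoring transactions, best first."""
    ranked = sorted(zip(ids, scores), key=lambda pair: pair[1], reverse=True)
    return [tx_id for tx_id, _ in ranked[:k]]


# Made-up transactions for illustration only.
ids = ["a", "b", "c", "d"]
anomaly = [0.70, 0.50, 0.60, 0.40]
prob = [0.90, 0.95, 0.20, 0.05]

scores = blended_scores(anomaly, prob)
print(top_k(ids, scores, 2))
```

In this example transaction "c" has the second-highest anomaly score but drops out of the top two because its logistic probability is low, which is the intended effect of requiring agreement between the two models. Normalisation should be computed over the whole scored population, not just the rows being displayed.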
Feature Importance — Isolation Forest
Which PCA components contribute most to the isolation forest's anomaly detection
The most important feature for isolation forest is V19, the PCA component on which the model splits most often and at the shallowest tree depths when isolating anomalies. Features that appear at shallow splits are particularly discriminating because they alone can separate outliers from the bulk of the data. Comparing this ranking to the logistic regression coefficients reveals whether supervised and unsupervised methods agree on which PCA components carry fraud signal.