Upload your data and get a complete credit card fraud anomaly detection report. Free.
Combines an unsupervised isolation forest with a supervised logistic regression to score each transaction for fraud likelihood, enabling side-by-side comparison of anomaly signals and interpretable coefficient analysis
Use when you have labeled fraud data and want to compare supervised vs unsupervised approaches, or when you need both interpretable coefficients and anomaly-based detection
Do not use if you have no labeled data at all (use pure isolation forest instead), or if the dataset has fewer than 200 transactions
Built for: Fraud analysts, risk management officers, data scientists, AML/KYC compliance analysts, financial crime investigators
Typical data source: CSV with transaction records including amount, timestamp, fraud label, and behavioral or PCA-transformed features
Dataset with 31 columns
Minimum 100 rows
The Kaggle Credit Card Fraud dataset has 284,807 transactions, of which 492 (0.173%) are fraudulent. Features V1-V28 are PCA-transformed (anonymized for privacy), plus Time (seconds elapsed since the first transaction) and Amount (transaction value). We use a fraud-enriched 5,000-row sample (all 492 positives + 4,508 negatives, seed=42), which raises the fraud rate to about 9.8% so the model sees enough positive signal. The isolation forest is fit with the solitude R package; the logistic regression uses stats::glm, with ROC/AUC computed by pROC.
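The sampling step described above can be sketched as follows. This is a minimal illustration in Python, not the report's own R code: the function name `enriched_sample` is hypothetical, the labels are synthetic placeholders that only reproduce the class counts quoted above, and a seed of 42 in Python will not select the same rows as the same seed in R.

```python
import random


def enriched_sample(labels, n_total, seed=42):
    """Return sorted row indices: every positive plus randomly drawn negatives."""
    positives = [i for i, y in enumerate(labels) if y == 1]
    negatives = [i for i, y in enumerate(labels) if y == 0]
    n_neg = n_total - len(positives)
    if n_neg < 0 or n_neg > len(negatives):
        raise ValueError("n_total is incompatible with the class counts")
    rng = random.Random(seed)  # local generator, so the draw is reproducible
    return sorted(positives + rng.sample(negatives, n_neg))


# Synthetic labels with the class counts quoted above (not the real data).
labels = [1] * 492 + [0] * (284_807 - 492)
idx = enriched_sample(labels, n_total=5_000)

full_rate = sum(labels) / len(labels)
sample_rate = sum(labels[i] for i in idx) / len(idx)
print(f"full: {full_rate:.3%}  sample: {sample_rate:.2%}  "
      f"enrichment: {sample_rate / full_rate:.0f}x")
```

Because the sample keeps every fraud case and drops most legitimate ones, predicted probabilities from a model trained on it reflect the sample's fraud rate rather than the original 0.173% base rate.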
Compares the fraud rate in the enriched training sample with that of the full original dataset to show the degree of enrichment
Shows whether isolation forest anomaly scores separate fraud from legitimate transactions
Shows logistic regression coefficient magnitudes with significance flags for each PCA component
ROC curves for both models showing AUC and the false-positive rate at 80% recall
Precision-recall curves showing which model maintains higher precision at key recall thresholds
Confusion matrix at the F1-optimal threshold showing false negatives and false positives
Top 20 highest-risk transactions ranked by combined isolation forest and logistic probability scores
Isolation forest feature importance showing which PCA components drive anomaly detection
Plain-English interpretation — what the numbers mean, what's significant, and what to do next.
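For readers who want to see how the ROC and confusion-matrix figures listed above are defined, here is a minimal sketch in plain Python. It is not the report's pROC-based R code; the helper names (`roc_points`, `auc`, `fpr_at_recall`, `best_f1`) and the ten toy scores are hypothetical, and the F1 search simply tries every observed score as a threshold.

```python
def roc_points(y_true, scores):
    """Return (fpr, tpr) points, sweeping the threshold from high to low."""
    pairs = sorted(zip(scores, y_true), key=lambda p: -p[0])
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        s = pairs[i][0]
        # Rows tied at the same score cross the threshold together.
        while i < len(pairs) and pairs[i][0] == s:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Trapezoidal area under the ROC curve."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def fpr_at_recall(points, target=0.80):
    """Smallest false-positive rate among thresholds reaching the target recall."""
    return min(fpr for fpr, tpr in points if tpr >= target)


def best_f1(y_true, scores):
    """Return (threshold, f1, (tp, fp, fn, tn)) for the F1-maximizing threshold."""
    best = None
    for t in sorted(set(scores)):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= t and y == 0)
        fn = sum(1 for y, s in zip(y_true, scores) if s < t and y == 1)
        tn = len(y_true) - tp - fp - fn
        f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
        if best is None or f1 > best[1]:
            best = (t, f1, (tp, fp, fn, tn))
    return best


# Toy labels (1 = fraud) and model scores, for illustration only.
y = [1, 1, 0, 1, 0, 1, 0, 0, 1, 0]
p = [0.95, 0.90, 0.85, 0.80, 0.60, 0.55, 0.40, 0.30, 0.20, 0.10]

pts = roc_points(y, p)
threshold, f1, (tp, fp, fn, tn) = best_f1(y, p)
print(auc(pts), fpr_at_recall(pts), threshold, f1, (tp, fp, fn, tn))
```

Running the same functions on each model's scores gives the side-by-side comparison: the model with the lower false-positive rate at 80% recall generates fewer unnecessary alerts for the same share of fraud caught.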
Need something simpler? Tf038 Live Ttest — When you just need to test whether average transaction amounts or feature values differ significantly between fraud and non-fraud groups, without needing a full anomaly scoring model.
Similar: Churn Drivers, Attrition Drivers
The ranking table surfaces the top 20 most anomalous transactions scored by both isolation forest and logistic regression, giving investigators a clear triage list.
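The page does not say how the two scores are combined into a single ranking. One plausible approach, shown here purely as an assumption, is to convert each model's scores to percentile ranks (so they share a scale) and average them. The function names and the five toy transactions below are hypothetical.

```python
from bisect import bisect_right


def percentile_ranks(values):
    """Map each value to the fraction of values less than or equal to it."""
    ordered = sorted(values)
    n = len(values)
    return [bisect_right(ordered, v) / n for v in values]


def top_risk(ids, iso_scores, logit_probs, k=20):
    """Rank transactions by the mean percentile rank of the two scores.

    Assumption: equal weighting of both models; the actual report may
    combine the scores differently.
    """
    iso_r = percentile_ranks(iso_scores)
    log_r = percentile_ranks(logit_probs)
    combined = [(a + b) / 2 for a, b in zip(iso_r, log_r)]
    ranked = sorted(zip(ids, combined), key=lambda pair: -pair[1])
    return ranked[:k]


# Toy scores for five transactions, for illustration only.
ids = ["a", "b", "c", "d", "e"]
iso = [0.70, 0.45, 0.62, 0.50, 0.48]    # isolation forest anomaly scores
prob = [0.91, 0.02, 0.88, 0.85, 0.05]   # logistic regression probabilities

top2 = top_risk(ids, iso, prob, k=2)
print(top2)
```

Averaging ranks rather than raw values keeps either model from dominating just because its scores span a wider numeric range.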
See our FAQ for details on pricing, data privacy, and how the analysis works. Every report includes a Methodology section showing the statistical test, assumptions checked, and diagnostics run.
Run any analysis on your own data — validated R analyses, interactive reports, AI insights, and PDF export.