Upload your data and get a complete Pima Indians diabetes risk drivers report. Free.
Logistic regression with odds ratios, ROC/AUC, and a confusion matrix on the Pima Indians diabetes dataset, with transparent handling of zero-coded missing values
Use when you have binary clinical outcome data and want to identify which continuous measurements best predict the outcome
Do not use if you have fewer than 100 observations or if predictors are highly collinear without regularization
Built for: Clinical data analysts, population health analysts, biostatisticians, clinical researchers, public health epidemiologists, healthcare quality improvement managers
Typical data source: Patient records CSV with clinical biomarkers (glucose, BMI, blood pressure, insulin, skin thickness, age) and a binary diabetes diagnosis outcome column
Dataset with 9 columns
Minimum 30 rows
768 female patients, 8 clinical predictors, binary outcome (Outcome: 1=diabetic, 0=not diabetic). The canonical UCI/Kaggle health classification dataset. Zeros in glucose, blood pressure, and BMI are known artifacts that represent missing data, so the analysis must treat them as missing rather than as real measurements.
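One way to handle zero-coded values is to recode them as missing before any modelling and to report how many were found per column. The following is a minimal sketch in Python, not the report's own R code. The column names (Glucose, BloodPressure, BMI) are assumed Kaggle-style headers that this page does not state, and the sample rows are made up for illustration.

```python
# Columns where a literal 0 is treated as a missing measurement.
# These names are assumptions (Kaggle-style headers), not taken from this page.
ZERO_AS_MISSING = ("Glucose", "BloodPressure", "BMI")


def recode_zeros(rows, columns=ZERO_AS_MISSING):
    """Return (cleaned_rows, counts), with zeros in `columns` replaced by None.

    The input rows are left unmodified; `counts` maps each column to the
    number of zero-coded values that were recoded.
    """
    counts = {col: 0 for col in columns}
    cleaned = []
    for row in rows:
        new_row = dict(row)  # copy so the caller's data is untouched
        for col in columns:
            if new_row.get(col) == 0:
                new_row[col] = None
                counts[col] += 1
        cleaned.append(new_row)
    return cleaned, counts


# Tiny illustrative sample (hypothetical values, not real patient data).
sample = [
    {"Glucose": 148, "BloodPressure": 72, "BMI": 33.6, "Outcome": 1},
    {"Glucose": 0, "BloodPressure": 66, "BMI": 26.6, "Outcome": 0},
    {"Glucose": 183, "BloodPressure": 0, "BMI": 0.0, "Outcome": 1},
]
cleaned, counts = recode_zeros(sample)
print(counts)
```

The Outcome column is deliberately left out of the recoding list, because 0 is a valid value there.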
How clinical measurements differ between diabetic and non-diabetic patients
Which clinical variables have the most zero-coded missing values
Which measurements show greatest distributional separation by outcome
Logistic regression odds ratios showing which predictors significantly increase diabetes odds
ROC curve showing model discrimination across all thresholds
Confusion matrix at the Youden-optimal threshold, showing the balance of true positives, false positives, and false negatives
Variable importance ranking by absolute standardized log-odds
Plain-English interpretation — what the numbers mean, what's significant, and what to do next.
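The ROC and confusion-matrix items above can be illustrated with a short Python sketch. It assumes you already have predicted probabilities from a fitted model. The function names and the toy labels and scores are hypothetical, and the report's own R code may compute these quantities differently. AUC is computed here as the share of positive/negative pairs in which the positive case scores higher, and the Youden threshold is the cut-off that maximizes sensitivity + specificity - 1.

```python
def confusion_at(y_true, scores, threshold):
    """Return (tp, fp, fn, tn), predicting positive when score >= threshold."""
    tp = fp = fn = tn = 0
    for y, s in zip(y_true, scores):
        predicted_positive = s >= threshold
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive:
            fp += 1
        elif y == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def youden_threshold(y_true, scores):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1."""
    best_t, best_j = None, -1.0
    for t in sorted(set(scores)):  # every observed score is a candidate cut-off
        tp, fp, fn, tn = confusion_at(y_true, scores, t)
        j = tp / (tp + fn) + tn / (tn + fp) - 1
        if j > best_j:  # on ties, the lowest threshold is kept
            best_t, best_j = t, j
    return best_t, best_j


def auc(y_true, scores):
    """AUC as P(random positive outscores random negative); ties count half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels and predicted probabilities (hypothetical, for illustration only).
y = [0, 0, 0, 1, 1, 0, 1, 1, 1]
p = [0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.80, 0.90]

threshold, j_stat = youden_threshold(y, p)
matrix = confusion_at(y, p, threshold)
area = auc(y, p)
print(threshold, j_stat, matrix, area)
```

Both functions assume the data contain at least one positive and one negative case; otherwise sensitivity or specificity is undefined.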
Need something simpler? Tf038 Live Ttest — When you only need to compare a single clinical measurement (e.g., glucose or BMI) between diabetic and non-diabetic groups without building a full predictive model with odds ratios.
Need more power? Fraud Anomaly — When your outcome is extremely rare (anomaly detection approach) rather than a standard binary classification problem with ~35% event rate like diabetes onset.
Similar: Attrition Drivers, Churn Drivers
Uses logistic regression odds ratios with 95% confidence intervals to rank clinical predictors by statistical significance and effect size, controlling for all other variables simultaneously.
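To show how an odds ratio and its 95% confidence interval follow from a fitted logistic regression coefficient, here is a minimal Python sketch. It assumes a Wald-type interval, exp(beta ± z·SE); this page does not say which interval method the report uses. The coefficient and standard error are hypothetical values chosen for illustration.

```python
import math
from statistics import NormalDist


def odds_ratio_ci(beta, se, level=0.95):
    """Return (odds_ratio, lower, upper) using a Wald interval on the log-odds scale.

    beta: fitted log-odds coefficient for a one-unit increase in the predictor
    se:   standard error of that coefficient
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 when level=0.95
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Hypothetical coefficient and standard error, not results from the dataset.
or_, lo, hi = odds_ratio_ci(beta=0.035, se=0.004)

# A predictor is conventionally called significant at the 5% level when
# the 95% interval for its odds ratio excludes 1.
significant = lo > 1 or hi < 1
print(round(or_, 4), round(lo, 4), round(hi, 4), significant)
```

An odds ratio above 1 means that each one-unit increase in the predictor multiplies the odds of the outcome by that factor, with the other predictors in the model held fixed.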
See our FAQ for details on pricing, data privacy, and how the analysis works. Every report includes a Methodology section showing the statistical test, assumptions checked, and diagnostics run.
Run any analysis on your own data — validated R analyses, interactive reports, AI insights, and PDF export.
Try Free — No Credit Card