Upload your data and get a complete medical insurance cost prediction report. Free.
Maximum upload size: 3 MB
The report includes interactive charts, statistical results, R code, and AI insights.
Generalized Linear Model (GLM) with a Gamma distribution and log link, including a smoker-by-BMI interaction term, fitted to predict individual annual medical insurance charges from demographic and lifestyle factors
Use when predicting medical costs or insurance premiums with right-skewed positive outcome variables and suspected multiplicative interaction effects between lifestyle factors
Do not use if the outcome variable can be negative, if the sample size is very small (under 100), or if you need exact prediction intervals rather than mean cost estimates
Built for: Actuaries, insurance pricing analysts, underwriters, health benefits managers, data scientists in insurance
Typical data source: Policyholder records with age, BMI, smoking status, region, number of dependents, and annual medical charges billed
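To make the log link and the smoker-by-BMI interaction concrete, here is a minimal Python sketch of how such a model turns predictors into an expected charge. The coefficient values are hypothetical placeholders chosen for illustration, not fitted results, and region and dependents are left out for brevity. The report itself ships R code; Python is used here only as a convenient notation.

```python
import math

# Hypothetical log-scale coefficients. These are illustrative assumptions,
# not estimates from any real fit.
COEF = {
    "intercept": 7.0,
    "age": 0.03,
    "bmi": 0.005,
    "smoker": 0.2,
    "smoker_x_bmi": 0.04,  # interaction: extra BMI slope for smokers
}


def predict_mean_charge(age: float, bmi: float, smoker: bool) -> float:
    """Expected annual charge under a log link: mu = exp(linear predictor)."""
    s = 1.0 if smoker else 0.0
    eta = (
        COEF["intercept"]
        + COEF["age"] * age
        + COEF["bmi"] * bmi
        + COEF["smoker"] * s
        + COEF["smoker_x_bmi"] * s * bmi
    )
    return math.exp(eta)


def smoker_cost_ratio(bmi: float) -> float:
    """Ratio of expected charges, smoker vs non-smoker, at a given BMI."""
    return math.exp(COEF["smoker"] + COEF["smoker_x_bmi"] * bmi)


for bmi in (22, 30, 38):
    print(f"BMI {bmi}: smoker/non-smoker cost ratio = {smoker_cost_ratio(bmi):.2f}")
```

Because the link is logarithmic, each coefficient acts multiplicatively on the expected charge. With an interaction term, the smoker effect is no longer a single multiplier: under these assumed coefficients it grows with BMI, which is the pattern the interaction is meant to capture.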
Dataset with 7 columns
Minimum 30 rows
Cornerstone #15 — GLM with interaction effects on insurance medical cost (3,178 votes)
Distribution of medical charges showing right-skew
Average charges comparison between smokers and non-smokers
BMI vs charges scatter showing smoker interaction effect
GLM coefficient magnitudes showing predictor importance
Average medical costs by US geographic region
Actual vs predicted charges showing model fit quality
Residuals vs fitted values for GLM diagnostic assessment
Descriptive statistics for all numeric variables
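For readers who want to see what a residuals-vs-fitted check might compute for a Gamma model, the sketch below calculates Gamma deviance residuals in plain Python. This is one common choice of residual for GLM diagnostics; whether the report uses deviance residuals or another type is an assumption here, and the actual and predicted values are made up for illustration.

```python
import math


def gamma_deviance_residual(actual: float, predicted: float) -> float:
    """Signed square root of the Gamma unit deviance for one observation."""
    if actual <= 0 or predicted <= 0:
        raise ValueError("Gamma models require strictly positive values")
    unit_deviance = 2.0 * (
        (actual - predicted) / predicted - math.log(actual / predicted)
    )
    # Guard against tiny negative values caused by floating-point rounding.
    magnitude = math.sqrt(max(unit_deviance, 0.0))
    return math.copysign(magnitude, actual - predicted)


# Made-up (actual, predicted) charge pairs, for illustration only.
pairs = [(12000.0, 10000.0), (4000.0, 5000.0), (30000.0, 30000.0)]
for actual, predicted in pairs:
    resid = gamma_deviance_residual(actual, predicted)
    print(f"actual={actual:>8.0f} predicted={predicted:>8.0f} residual={resid:+.3f}")
```

Plotted against fitted values, residuals like these should scatter around zero without a clear trend or funnel shape if the Gamma variance assumption and log link are reasonable for the data.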
Plain-English interpretation — what the numbers mean, what's significant, and what to do next.
Need something simpler? Diabetes Risk Drivers — When you need to identify which health and demographic factors are associated with disease risk rather than predict a continuous cost amount
Need more power? Cancer Classification — When you need to classify policyholders into discrete high/medium/low risk tiers using a classification model rather than predict their continuous cost
Similar: Price Drivers Geo, Happiness Regression
Actuarial Premium Pricing
Insurers upload policyholder demographics to build a transparent, auditable GLM that justifies premium tiers based on quantified risk factors — smoker status, BMI, age, and region — without black-box complexity.
See our FAQ for details on pricing, data privacy, and how the analysis works. Every report includes a Methodology section showing the statistical test, assumptions checked, and diagnostics run.
Run any analysis on your own data — validated R analyses, interactive reports, AI insights, and PDF export.
Try Free — No Credit Card