Executive Summary
Key regression findings including model fit and strongest predictor
The six-factor regression explains 77.9% of the variation in national happiness scores (R² = 0.779, adjusted R² = 0.77) across 156 countries. After controlling for all other predictors, social_support emerges as the strongest independent predictor of happiness (standardized beta = 0.302). Residual diagnostics suggest that the model assumptions are broadly satisfied, although the residuals show a mild left skew (skewness of about -0.493).
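The adjusted R² can be cross-checked from the reported R², the number of countries, and the number of predictors. This is a minimal sketch, assuming the standard adjustment formula for a model with an intercept; the function name `adjusted_r2` is illustrative and not part of the original analysis.

```python
def adjusted_r2(r2: float, n: int, p: int) -> float:
    """Adjusted R-squared for a model with p predictors and n observations."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


# Values reported in the summary: R² = 0.779, 156 countries, 6 predictors.
adj = adjusted_r2(0.779, n=156, p=6)
print(round(adj, 3))  # about 0.77, matching the reported adjusted R²
```

The small gap between R² and adjusted R² reflects the fact that six predictors are few relative to 156 observations.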
Predictor Descriptive Statistics
Mean, SD, min, and max for each happiness predictor across all countries
| Variable | Mean | SD | Min | Max |
|---|---|---|---|---|
| gdp_per_capita | 0.9051 | 0.3984 | 0 | 1.684 |
| social_support | 1.209 | 0.2992 | 0 | 1.624 |
| healthy_life_expectancy | 0.7252 | 0.2421 | 0 | 1.141 |
| freedom | 0.3926 | 0.1433 | 0 | 0.631 |
| generosity | 0.1848 | 0.0953 | 0 | 0.566 |
| corruption | 0.1106 | 0.0945 | 0 | 0.453 |
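Across the 156 countries analysed, GDP per capita has the largest standard deviation (0.3984) and the widest absolute range (0 to 1.684), while corruption has the widest spread relative to its mean (SD/mean of about 0.85). Generosity and corruption have the smallest absolute ranges. Social support (1.209) and GDP per capita (0.9051) have the highest mean values among the six predictors, so these two factors take the largest values in a typical country. Every predictor has a minimum of 0. The outcome variable (happiness score) has a mean of 5.407.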
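Relative spread can be compared across predictors with the coefficient of variation (SD divided by mean). The sketch below recomputes it from the table values; the dictionary layout and variable names are illustrative choices, not part of the original analysis.

```python
# (mean, SD) pairs copied from the descriptive statistics table.
stats = {
    "gdp_per_capita": (0.9051, 0.3984),
    "social_support": (1.209, 0.2992),
    "healthy_life_expectancy": (0.7252, 0.2421),
    "freedom": (0.3926, 0.1433),
    "generosity": (0.1848, 0.0953),
    "corruption": (0.1106, 0.0945),
}

# Coefficient of variation: spread expressed as a fraction of the mean.
cv = {name: sd / mean for name, (mean, sd) in stats.items()}

# Rank predictors from widest to narrowest relative spread.
ranked = sorted(cv, key=cv.get, reverse=True)
for name in ranked:
    print(f"{name:25s} CV = {cv[name]:.3f}")
```

On these figures corruption has the largest coefficient of variation and social support the smallest.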
Predictor Correlation Matrix
Pairwise Pearson correlations among happiness score and all six predictors
The correlation matrix covers happiness score and all six predictors. Among the predictors, gdp_per_capita shows the strongest bivariate correlation with happiness score (r = 0.794). High inter-predictor correlations (|r| > 0.7) could signal multicollinearity; the VIF table below provides a more direct diagnostic.
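The screening rule described above (flag predictor pairs with |r| > 0.7) can be sketched as follows. The helper names `pearson` and `high_pairs` and the toy columns are hypothetical and serve only to illustrate the mechanics; they are not the report's data.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def high_pairs(columns, threshold=0.7):
    """Return (name_a, name_b, r) for every pair with |r| above threshold."""
    flagged = []
    for (name_a, xa), (name_b, xb) in combinations(columns.items(), 2):
        r = pearson(xa, xb)
        if abs(r) > threshold:
            flagged.append((name_a, name_b, round(r, 3)))
    return flagged


# Hypothetical toy columns: "b" closely tracks "a", "c" is weakly related.
toy = {
    "a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "b": [1.1, 1.9, 3.2, 3.9, 5.1],
    "c": [2.0, -1.0, 0.5, -0.5, 1.0],
}
flagged = high_pairs(toy)
print(flagged)  # only the (a, b) pair exceeds the threshold
```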
Standardized Regression Coefficients
Beta weights ranking each predictor's independent effect on happiness
Standardized beta coefficients rank the six predictors by their independent contribution to happiness after controlling for all others. social_support has the largest beta (0.302), meaning a one-SD increase in this factor is associated with a 0.302-SD increase in happiness, holding all else constant. Predictors with betas close to zero have little unique explanatory power once the other five factors are accounted for.
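For reference, a standardized beta is conventionally obtained by rescaling the unstandardized slope by the ratio of the predictor's SD to the outcome's SD. The sketch below assumes that convention. The report lists neither unstandardized slopes nor the SD of the happiness score, so the numbers in the usage line are hypothetical placeholders chosen only to show the arithmetic.

```python
def standardized_beta(slope: float, sd_x: float, sd_y: float) -> float:
    """Convert an unstandardized slope into SD units of x and y."""
    return slope * sd_x / sd_y


# Hypothetical values, NOT taken from the report:
# a slope of 2.0 outcome units per predictor unit, SD(x) = 0.5, SD(y) = 4.0.
beta = standardized_beta(slope=2.0, sd_x=0.5, sd_y=4.0)
print(beta)  # 0.25: a one-SD rise in x goes with a quarter-SD rise in y
```

Because the result is unit-free, betas for predictors measured on different scales can be ranked against one another.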
Top 10 Happiest Countries
Countries with the highest happiness scores
Finland tops the rankings with a happiness score of 7.769. The gap between the happiest country and the 10th-ranked country is 0.523 points, showing that the top tier is relatively tightly clustered. All top-10 countries score well above the global mean of 5.407.
Bottom 10 Least Happy Countries
Countries with the lowest happiness scores
South Sudan has the lowest happiness score in the dataset at 2.853. The gap between the least happy and the happiest country is 4.916 points, which is the full range of scores in the dataset. Countries at the bottom tend to share low scores across multiple predictors, reinforcing the multidimensional nature of wellbeing.
VIF Multicollinearity Diagnostics
VIF scores indicating collinearity between happiness predictors
| Predictor | VIF |
|---|---|
| gdp_per_capita | 4.116 |
| healthy_life_expectancy | 3.573 |
| social_support | 2.736 |
| freedom | 1.575 |
| corruption | 1.432 |
| generosity | 1.224 |
Variance Inflation Factors assess how much each predictor's coefficient variance is inflated by correlations with other predictors. All VIF values are below 5 (maximum: 4.116 for gdp_per_capita), indicating acceptable multicollinearity by the common rule of thumb. Coefficients for predictors with VIF below 5 can be interpreted without serious concern about inflated standard errors; any predictor above 5 would have a less precisely estimated coefficient and should be interpreted with caution.
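Each VIF corresponds to the R² obtained when that predictor is regressed on the other five, through VIF = 1 / (1 - R²). The sketch below inverts this identity for the table values and applies the threshold of 5 used above; the variable names are illustrative.

```python
# VIF values copied from the diagnostics table.
vif = {
    "gdp_per_capita": 4.116,
    "healthy_life_expectancy": 3.573,
    "social_support": 2.736,
    "freedom": 1.575,
    "corruption": 1.432,
    "generosity": 1.224,
}

# Invert VIF = 1 / (1 - R2_j): the share of each predictor's variance
# that is explained by the remaining predictors.
implied_r2 = {name: 1.0 - 1.0 / value for name, value in vif.items()}

# Predictors at or above the rule-of-thumb threshold used in the text.
above_threshold = [name for name, value in vif.items() if value >= 5.0]

for name, r2 in implied_r2.items():
    print(f"{name:25s} VIF = {vif[name]:.3f}  implied R2 = {r2:.3f}")
print("Above threshold:", above_threshold)
```

Read this way, roughly three quarters of the variance in gdp_per_capita is shared with the other predictors, while generosity is largely independent of them.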
Residuals vs Fitted Values
Scatter plot of regression residuals against fitted values to check homoskedasticity
The residuals vs fitted plot assesses whether the OLS assumption of homoskedasticity holds: residuals should be randomly scattered around zero with no systematic pattern. Here the residuals have a standard deviation of 0.5231 and a maximum absolute value of 1.753. Any funnel shape or curve in this plot would indicate heteroskedasticity or non-linearity requiring model refinement.
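A funnel shape can also be checked numerically by comparing residual variance in the lower and upper halves of the fitted values. This is a crude sketch of that idea, not a formal test and not necessarily what the original analysis did; the function name `spread_ratio` and the toy data are hypothetical.

```python
from statistics import pvariance


def spread_ratio(fitted, residuals):
    """Residual variance in the upper half of fitted values divided by
    the variance in the lower half. Values far from 1 hint at a funnel."""
    pairs = sorted(zip(fitted, residuals))  # order observations by fitted value
    half = len(pairs) // 2
    low = [r for _, r in pairs[:half]]
    high = [r for _, r in pairs[-half:]]
    return pvariance(high) / pvariance(low)


# Hypothetical toy data: residual spread grows with the fitted value.
fitted = [1, 2, 3, 4, 5, 6, 7, 8]
residuals = [0.1, -0.1, 0.1, -0.1, 0.4, -0.4, 0.4, -0.4]
ratio = spread_ratio(fitted, residuals)
print(ratio)  # well above 1, so the spread widens as fitted values rise
```

A ratio close to 1 is what homoskedastic residuals would produce; a formal alternative would be a test such as Breusch-Pagan.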
Residual Distribution
Histogram of regression residuals to validate the normality assumption
The histogram of residuals provides a visual check of the normality assumption underpinning OLS inference. A bell-shaped, symmetric distribution centred near zero supports valid p-values and confidence intervals. The approximate skewness of the residuals is -0.493. A value near zero would indicate a symmetric distribution; the observed value points to a mild left skew, a modest departure from normality.
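Skewness can be computed directly from the residuals as the third central moment divided by the 1.5 power of the second. The sketch below uses this moment-based (population) definition; the report does not state which estimator produced -0.493, so this choice is an assumption, and the toy residuals are hypothetical.

```python
def skewness(values):
    """Moment-based skewness: m3 / m2**1.5, with no bias correction."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical toy residuals with a longer left tail than right tail.
toy_residuals = [-3.0, -1.0, 0.0, 1.0, 1.0, 2.0]
skew = skewness(toy_residuals)
print(skew)  # negative: the left tail is heavier
```

A negative result, as reported for the model residuals, means the largest errors tend to be over-predictions (observed scores below fitted scores).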