Overview

Ridge Regression Configuration

Analysis overview and configuration

Configuration

Analysis Type: Ridge
Company: Test Company
Objective: Predict sales from multi-channel advertising spend using ridge regression with cross-validated regularization
Analysis Date: 2026-03-15
Processing ID: analytics__statistical__regression__ridge_test_20260315_114803
Total Observations: 200

Module Parameters

lambda_selection: 1se
n_folds: 10
confidence_level: 0.95
n_lambda: 100
Ridge analysis for Test Company

Interpretation

Purpose

This analysis applies ridge regression to predict sales from three advertising channels (TikTok, Facebook, Google Ads) using L2 regularization. Ridge regression balances predictive accuracy with coefficient stability by penalizing large coefficients, making it ideal for multicollinear advertising spend data where all channels likely contribute to sales.
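To make the setup concrete, here is a minimal sketch of fitting a ridge model on synthetic, deliberately multicollinear "channel spend" data (the data, seed, and coefficient values are illustrative stand-ins, not the report's actual dataset):

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
n = 200
# Three correlated spend columns sharing a common component, mimicking
# multicollinear advertising channels.
base = rng.normal(50, 10, n)
X = np.column_stack([
    base + rng.normal(0, 5, n),   # channel 1
    base + rng.normal(0, 5, n),   # channel 2
    base + rng.normal(0, 5, n),   # channel 3
])
y = 0.3 * X[:, 0] + 0.4 * X[:, 1] + 1.0 * X[:, 2] + rng.normal(0, 3, n)

# alpha is scikit-learn's name for the ridge penalty lambda.
model = Ridge(alpha=500.0)
model.fit(X, y)
print(model.coef_, model.score(X, y))
```

Because the L2 penalty only shrinks coefficients rather than zeroing them, all three channels keep nonzero weights even under heavy regularization.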

Key Findings

  • R-squared: 0.763 – The model explains 76.3% of sales variance, indicating strong predictive performance for a real-world advertising dataset
  • RMSE: $1,310 – Average prediction deviation is approximately $1,310 against a mean sales of $10,668 (12.3% relative deviation)
  • Optimal Lambda: 509.14 – Selected via the one-standard-error (1se) rule, providing conservative regularization that prioritizes stability over minimal loss
  • Coefficient Shrinkage: 15–17% – Ridge reduced all coefficients relative to OLS, with Google Ads shrinking most (17%), indicating it had the largest unconstrained effect

Interpretation

The model successfully captures the relationship between advertising spend and sales while controlling for multicollinearity through regularization. Google Ads shows the strongest effect (coefficient 1.01), followed by Facebook (0.41) and TikTok (0.31). The 1se lambda selection trades a small increase in cross-validation error for a more conservative, stable model.

Data Preparation

Data preprocessing and column mapping

Data Quality

Initial Rows: 200
Final Rows: 200
Rows Removed: 0
Retention Rate: 100%

Processed 200 observations, retained 200 (100.0%) after cleaning.

Interpretation

Purpose

This section documents the data cleaning and preparation phase for the ridge regression model predicting sales from multi-channel advertising spend. Perfect data retention (100%) indicates no observations were removed during preprocessing, meaning the full dataset of 200 records was available for model training and evaluation.

Key Findings

  • Initial Rows: 200 observations with no missing values in predictors or Sales columns
  • Final Rows: 200 observations retained after preprocessing
  • Retention Rate: 100% — no data loss occurred during cleaning
  • Data Split: Not explicitly documented in preprocessing; cross-validation (10 folds) was used instead of traditional train/test split

Interpretation

The absence of data removal suggests the dataset was clean and complete at intake, with no missing values requiring exclusion. This full retention is favorable for model stability and statistical power. However, the lack of a documented train/test split indicates the model evaluation relied entirely on cross-validation metrics rather than holdout validation, which may slightly overestimate generalization performance compared to independent test set evaluation.
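A 10-fold cross-validation evaluation of this kind can be sketched as follows (synthetic data and a placeholder penalty value, purely to illustrate the mechanics, not the report's actual pipeline):

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold, cross_val_score

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y = X @ np.array([0.3, 0.4, 1.0]) + rng.normal(0, 0.5, 200)

# Every observation serves as held-out data exactly once across the 10 folds,
# so no separate holdout split is required.
cv = KFold(n_splits=10, shuffle=True, random_state=1)
scores = cross_val_score(Ridge(alpha=1.0), X, y, cv=cv, scoring="r2")
print(scores.mean())
```

Averaging the per-fold scores yields the out-of-fold performance estimate that substitutes here for a train/test split.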

Context

The preprocessing quality directly supports the model's R² of 0.763 and RMSE of 1310.17, as no data quality issues compromised the analysis. The 10-fold cross-validation approach (evident from the cv_plot_data) substitutes for explicit train/test documentation but provides robust

Executive Summary

Ridge Regression Results

Key Metrics

r_squared: 0.7635
rmse: 1310.1733
lambda_optimal: 509.1368
n_predictors: 3
total_observations: 200

Key Findings

Model Type: Ridge Regression (L2)
Observations Used: 200
Predictors: 3
R-squared: 0.7635 (76.3% variance explained)
RMSE: 1310.1733
MAE: 1106.3535
Optimal Lambda: 509.1368
Lambda Selection: 1SE
CV Folds: 10

Summary

Bottom Line: Ridge regression model shows good fit (R² = 0.7635) using 3 predictor(s) from 200 observations.

Key Findings:
• Model explains 76.3% of the variance in Sales
• RMSE = 1310.1733 — prediction deviation in original Sales units
• Optimal regularization: lambda = 509.1368 (selected via the 1SE method)
• Ridge shrinkage visible in coefficient comparison vs OLS

Recommendation: With R² above the 0.6 threshold for a good fit, this model is suitable for prediction. Review the Ridge Trace to understand which predictors drive Sales. If multicollinearity was a concern, ridge regression has stabilized the coefficient estimates. For variable selection (a sparse model), consider Lasso regression instead.

Interpretation

Purpose

This ridge regression model was developed to predict sales from multi-channel advertising spend (TikTok, Facebook, Google Ads) using regularized linear regression. The analysis demonstrates whether the three advertising channels can reliably forecast sales outcomes and how regularization improves model stability compared to standard linear regression.

Key Findings

  • R-squared (0.763): Model explains 76.3% of sales variance—a strong fit indicating the three advertising channels are meaningful predictors of sales performance
  • RMSE (1,310.17): Average prediction deviation of approximately $1,310 in sales units, roughly 12% of the mean sales of $10,668
  • Optimal Lambda (509.14): Regularization parameter selected via 1SE method balances bias-variance tradeoff; shrinkage reduced all coefficients 15-17% versus OLS estimates
  • Coefficient Stability: Ridge shrinkage demonstrates multicollinearity management; Google Ads shows strongest effect (1.01), followed by Facebook (0.41) and TikTok (0.31)

Interpretation

The model achieves the stated objective of predicting sales from advertising spend with solid predictive power. Ridge regularization successfully stabilized coefficient estimates by penalizing large weights, reducing overfitting risk. The 1SE selection method prioritizes generalization over minimal cross-validation error, accepting slightly stronger shrinkage in exchange for a more stable model.

Figure 4

Cross-Validation Error

MSE vs Regularization Strength

Cross-validation error across lambda values with optimal regularization selection

Interpretation

Purpose

This section evaluates regularization strength through 10-fold cross-validation across 100 lambda candidates. It identifies the optimal balance between model complexity and generalization loss—critical for ensuring the ridge regression model performs well on unseen data rather than overfitting to the training set.

Key Findings

  • Lambda Optimal (1se): 509.14 - Selected regularization parameter using the one-standard-error (1se) rule, prioritizing stability and generalization over minimal training loss
  • Lambda Minimum: 182.97 - Lowest cross-validated MSE, but more aggressive coefficient shrinkage may reduce robustness
  • CV Folds & Lambda Grid: 10 folds evaluated across 100 lambda values, providing robust loss estimates with wide regularization range (log-lambda: 5.21–14.42)
  • Loss Curve Pattern: MSE increases sharply at high lambda values (heavy regularization), indicating over-shrinkage degrades predictions
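The 1se selection procedure described above can be sketched directly: for each candidate lambda, compute the per-fold MSE, then pick the largest lambda whose mean CV error stays within one standard error of the minimum. The data and seed below are synthetic stand-ins, not the report's dataset:

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold, cross_val_score

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 3))
y = X @ np.array([0.3, 0.4, 1.0]) + rng.normal(0, 1.0, 200)

lambdas = np.logspace(-2, 4, 100)            # 100 candidate penalties
cv = KFold(n_splits=10, shuffle=True, random_state=2)

mean_mse, se_mse = [], []
for lam in lambdas:
    # cross_val_score returns negative MSE; flip the sign per fold.
    mse = -cross_val_score(Ridge(alpha=lam), X, y, cv=cv,
                           scoring="neg_mean_squared_error")
    mean_mse.append(mse.mean())
    se_mse.append(mse.std(ddof=1) / np.sqrt(len(mse)))

mean_mse, se_mse = np.array(mean_mse), np.array(se_mse)
i_min = mean_mse.argmin()
threshold = mean_mse[i_min] + se_mse[i_min]
# 1se rule: largest lambda whose mean CV error is within one SE of the minimum.
i_1se = max(i for i in range(len(lambdas)) if mean_mse[i] <= threshold)
lambda_min, lambda_1se = lambdas[i_min], lambdas[i_1se]
print(lambda_min, lambda_1se)
```

By construction lambda_1se ≥ lambda_min, which is why the 1se choice is always the more conservative (more strongly regularized) of the two.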

Interpretation

The 1se selection criterion trades ~7% higher CV loss (1,768,833 vs. 1,652,192 MSE) for a more conservative model with stronger regularization. This choice reflects the principle that simpler, more stable models generalize better when performance gains are marginal. The wide confidence bands around the loss curve confirm substantial fold-to-fold variability in the error estimates, supporting the more conservative 1se choice.

Figure 5

Ridge Trace

Coefficient Shrinkage Paths

Coefficient shrinkage paths showing how each predictor's coefficient changes as regularization increases

Interpretation

Purpose

The ridge trace visualizes how advertising channel coefficients shrink as regularization intensity increases across the full lambda path. This section demonstrates the core mechanism of L2 regularization: smooth, continuous coefficient reduction that prevents overfitting while retaining all predictors. Understanding this path is essential for validating why the optimal lambda (509.14) balances prediction accuracy with model stability.

Key Findings

  • Google Ads Dominance: Maintains the largest coefficient (1.13 at minimum lambda) and remains most influential even at high regularization, indicating robust predictive power for sales
  • Smooth Shrinkage Pattern: All three predictors follow continuous decay curves with no exact zeros, confirming L2 regularization behavior
  • Optimal Lambda (509.14): Located at log(lambda) ≈ 6.23, representing the 1se selection criterion, which accepts a minimal MSE increase in exchange for greater coefficient stability
  • Differential Sensitivity: TikTok and Facebook shrink more aggressively at high lambda values, suggesting lower robustness compared to Google Ads
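A ridge trace like the one in this figure is just the matrix of fitted coefficients over a lambda grid. A minimal sketch on synthetic data (illustrative values, not the report's dataset):

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 3))
y = X @ np.array([0.3, 0.4, 1.0]) + rng.normal(0, 1.0, 200)

lambdas = np.logspace(-2, 6, 50)
# paths[i, j] is the coefficient of predictor j at penalty lambdas[i];
# plotting each column against log(lambda) produces the ridge trace.
paths = np.array([Ridge(alpha=lam).fit(X, y).coef_ for lam in lambdas])
print(paths[0], paths[-1])
```

At the low end of the grid the coefficients approximate the OLS solution; at the high end they are shrunk close to (but never exactly) zero, which is the smooth-decay, no-exact-zeros pattern the figure describes.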

Interpretation

The ridge trace reveals that Google Ads spending has the most stable relationship with sales across regularization levels, while TikTok and Facebook show greater coefficient volatility. At the selected lambda of 509.14, all three channels retain meaningful influence on sales predictions (coefficients: Google Ads 1.01, Facebook 0.41, TikTok 0.31).

Figure 6

Actual vs Predicted

Model Prediction Accuracy

Actual vs predicted values showing overall model fit quality

Interpretation

Purpose

This section evaluates how well the ridge regression model predicts sales from multi-channel advertising spend. Model fit quality directly determines whether predictions are reliable for business decisions. Strong fit metrics indicate the three advertising channels (TikTok, Facebook, Google Ads) effectively explain sales variation.

Key Findings

  • R² = 0.7635: The model explains 76.3% of sales variance, exceeding the 0.6 threshold for good fit in cross-sectional data and indicating strong predictive power
  • RMSE = $1,310.17: Average prediction deviation in original sales units; relative to mean sales of $10,668, this represents ~12% typical deviation
  • MAE = $1,106.35: Mean absolute deviation, slightly lower than the RMSE, suggesting a relatively symmetric deviation distribution without extreme outliers
  • Residual Pattern: Mean residual = 0 confirms unbiased predictions; median = -$113.99 indicates slight systematic underprediction at lower sales levels
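The fit metrics above come from standard formulas. The sketch below computes them on a tiny hypothetical set of actual vs. predicted sales values (the numbers are made up for illustration, not drawn from this analysis):

```python
import numpy as np
from sklearn.metrics import r2_score, mean_absolute_error, mean_squared_error

# Hypothetical actual/predicted sales values, chosen only to show the formulas.
actual = np.array([10200.0, 9800.0, 11500.0, 12100.0, 9400.0])
predicted = np.array([10050.0, 10100.0, 11200.0, 11800.0, 9700.0])

r2 = r2_score(actual, predicted)                      # share of variance explained
rmse = np.sqrt(mean_squared_error(actual, predicted)) # penalizes large errors more
mae = mean_absolute_error(actual, predicted)          # typical absolute error
print(r2, rmse, mae)
```

RMSE ≥ MAE always holds; a small gap between them, as in this report ($1,310 vs. $1,106), is what suggests errors without extreme outliers.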

Interpretation

The ridge regression achieves solid predictive accuracy for the advertising spend-to-sales relationship. The 76.3% variance explained demonstrates that advertising channels are primary sales drivers, though 23.7% of variation remains unexplained, likely reflecting unmeasured factors (seasonality, pricing, competition). Residual statistics show the model's errors are centered near zero and roughly symmetric.

Figure 7

Coefficient Comparison

Ridge vs OLS Coefficients

Comparison of ridge (regularized) versus OLS (unregularized) coefficients

Interpretation

Purpose

This section compares ridge-regularized coefficients against unregularized OLS estimates to reveal how regularization adjusts the model's parameter estimates. It demonstrates the bias-variance tradeoff inherent in ridge regression: shrinking coefficients reduces overfitting risk while introducing controlled bias. Understanding these adjustments is critical for interpreting the model's predictive behavior in the sales forecasting objective.

Key Findings

  • Google Ads Shrinkage (17%): The largest coefficient reduction (1.22 → 1.01), suggesting either the strongest true effect or highest correlation with other advertising channels
  • TikTok Shrinkage (14.9%): The smallest adjustment (0.36 → 0.31), indicating relatively independent predictive signal with minimal multicollinearity
  • Facebook Shrinkage (15.7%): Moderate regularization (0.49 → 0.41), positioned between TikTok and Google Ads
  • Optimal Lambda (509.14): Selected via the one-standard-error (1se) rule, balancing model complexity against cross-validation loss
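The shrinkage percentages can be reproduced from the coefficients reported in this section. Note that recomputing from the rounded values printed here gives slightly different figures (e.g. ~13.9% for TikTok vs. the reported 14.9%), presumably because the report used unrounded coefficients:

```python
# OLS vs. ridge coefficients as reported (rounded to two decimals).
ols   = {"TikTok": 0.36, "Facebook": 0.49, "Google Ads": 1.22}
ridge = {"TikTok": 0.31, "Facebook": 0.41, "Google Ads": 1.01}

# Percentage reduction of each coefficient under the ridge penalty.
shrinkage = {k: 100 * (ols[k] - ridge[k]) / ols[k] for k in ols}
print(shrinkage)
```

The ordering is preserved either way: Google Ads shrinks the most and TikTok the least, consistent with the bullets above.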

Interpretation

All three advertising channels retain positive coefficients after regularization, confirming their continued relevance to sales prediction. The modest shrinkage percentages (14.9%–17%) indicate that multicollinearity among the predictors is present but not severe.

Table 8

Model Performance

Performance Metrics at Optimal Lambda

Detailed model performance metrics at optimal lambda

R-squared: 0.7635
RMSE: 1310.1733
MAE: 1106.3535
Lambda (optimal): 509.1368
Lambda selection: 1se
Observations used: 200
Predictors: 3
CV folds: 10

Interpretation

Purpose

This section evaluates how well the ridge regression model predicts sales from multi-channel advertising spend at the optimal regularization level. These metrics quantify prediction accuracy and validate whether the model provides useful forecasting capability for the business objective.

Key Findings

  • R-squared (0.7635): The model explains 76.3% of sales variance, indicating a strong fit that captures most systematic patterns in the data.
  • RMSE (1310.17): Average prediction deviation of $1,310, representing 48.5% of the Sales variable's standard deviation—well below the 50% threshold for useful predictive power.
  • MAE (1106.35): Mean absolute deviation of $1,106; typical deviations are slightly lower than the RMSE, suggesting a relatively symmetric deviation distribution.
  • Optimal Lambda (509.14): Selected via the one-standard-error (1se) rule, balancing the bias-variance tradeoff by shrinking coefficients 15–17% from their OLS estimates.

Interpretation

The ridge model demonstrates solid predictive performance for sales forecasting across TikTok, Facebook, and Google Ads channels. The 76% variance explained indicates the three advertising channels collectively capture the primary drivers of sales variation. RMSE relative to the outcome's spread confirms predictions are sufficiently accurate for practical business use, though unmeasured factors still account for roughly a quarter of sales variation.
