Overview

Ridge Regression Configuration

Analysis overview and configuration

Configuration

Analysis Type: Ridge
Company: Test Company
Objective: Predict sales from multi-channel advertising spend using ridge regression with cross-validated regularization
Analysis Date: 2026-03-15
Processing ID: analytics__statistical__regression__ridge_test_20260315_114803
Total Observations: 200

Module Parameters

lambda_selection: 1se
n_folds: 10
confidence_level: 0.95
n_lambda: 100
Ridge analysis for Test Company

Interpretation

Purpose

This analysis applies ridge regression to predict sales from three advertising channels (TikTok, Facebook, Google Ads) using L2 regularization. Ridge regression balances predictive accuracy with coefficient stability by penalizing large coefficients, making it ideal for multicollinear advertising spend data where all channels likely contribute to sales.
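To make the setup concrete, here is a minimal sketch of fitting a ridge model on synthetic, deliberately multicollinear "channel spend" data (the data, seed, and coefficient values are illustrative stand-ins, not the report's actual dataset):

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
n = 200
# Three correlated spend columns sharing a common component, mimicking
# multicollinear advertising channels.
base = rng.normal(50, 10, n)
X = np.column_stack([
    base + rng.normal(0, 5, n),   # channel 1
    base + rng.normal(0, 5, n),   # channel 2
    base + rng.normal(0, 5, n),   # channel 3
])
y = 0.3 * X[:, 0] + 0.4 * X[:, 1] + 1.0 * X[:, 2] + rng.normal(0, 3, n)

# alpha is scikit-learn's name for the ridge penalty lambda.
model = Ridge(alpha=500.0)
model.fit(X, y)
print(model.coef_, model.score(X, y))
```

Because the L2 penalty only shrinks coefficients rather than zeroing them, all three channels keep nonzero weights even under heavy regularization.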

Key Findings

  • R-squared: 0.763 – The model explains 76.3% of sales variance, indicating strong predictive performance for a real-world advertising dataset
  • RMSE: $1,310 – Average prediction deviation is approximately $1,310 against a mean sales of $10,668 (12.3% relative deviation)
  • Optimal Lambda: 509.14 – Selected via the one-standard-error (1se) rule, providing conservative regularization that prioritizes stability over minimal loss
  • Coefficient Shrinkage: 15–17% – Ridge reduced all coefficients relative to OLS, with Google Ads shrinking most (17%), indicating it had the largest unconstrained effect

Interpretation

The model successfully captures the relationship between advertising spend and sales while controlling for multicollinearity through regularization. Google Ads shows the strongest effect (coefficient 1.01), followed by Facebook (0.41) and TikTok (0.31). The 1se lambda selection trades a small increase in cross-validation error for a more conservative, stable model.

Data Preparation

Data preprocessing and column mapping

Data Quality

Initial Rows: 200
Final Rows: 200
Rows Removed: 0
Retention Rate: 100%

Processed 200 observations, retained 200 (100.0%) after cleaning.

Interpretation

Purpose

This section documents the data cleaning and preparation phase for the ridge regression model predicting sales from multi-channel advertising spend. Perfect data retention (100%) indicates no observations were removed during preprocessing, meaning the full dataset of 200 records was available for model training and evaluation.

Key Findings

  • Initial Rows: 200 observations with no missing values in predictors or Sales columns
  • Final Rows: 200 observations retained after preprocessing
  • Retention Rate: 100% — no data loss occurred during cleaning
  • Data Split: Not explicitly documented in preprocessing; cross-validation (10 folds) was used instead of traditional train/test split

Interpretation

The absence of data removal suggests the dataset was clean and complete at intake, with no missing values requiring exclusion. This full retention is favorable for model stability and statistical power. However, the lack of a documented train/test split indicates the model evaluation relied entirely on cross-validation metrics rather than holdout validation, which may slightly overestimate generalization performance compared to independent test set evaluation.
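A 10-fold cross-validation evaluation of this kind can be sketched as follows (synthetic data and a placeholder penalty value, purely to illustrate the mechanics, not the report's actual pipeline):

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold, cross_val_score

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y = X @ np.array([0.3, 0.4, 1.0]) + rng.normal(0, 0.5, 200)

# Every observation serves as held-out data exactly once across the 10 folds,
# so no separate holdout split is required.
cv = KFold(n_splits=10, shuffle=True, random_state=1)
scores = cross_val_score(Ridge(alpha=1.0), X, y, cv=cv, scoring="r2")
print(scores.mean())
```

Averaging the per-fold scores yields the out-of-fold performance estimate that substitutes here for a train/test split.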

Context

The preprocessing quality directly supports the model's R² of 0.763 and RMSE of 1310.17, as no data quality issues compromised the analysis. The 10-fold cross-validation approach (evident from the cv_plot_data) substitutes for explicit train/test documentation but provides robust

Executive Summary

Ridge Regression Results

Key Metrics

r_squared: 0.7635
rmse: 1310.1733
lambda_optimal: 509.1368
n_predictors: 3
total_observations: 200

Key Findings

Model Type: Ridge Regression (L2)
Observations Used: 200
Predictors: 3
R-squared: 0.7635 (76.3% variance explained)
RMSE: 1310.1733
MAE: 1106.3535
Optimal Lambda: 509.1368
Lambda Selection: 1SE
CV Folds: 10

Summary

Bottom Line: Ridge regression model shows good fit (R² = 0.7635) using 3 predictor(s) from 200 observations.

Key Findings:
• Model explains 76.3% of the variance in Sales
• RMSE = 1310.1733 — prediction deviation in original Sales units
• Optimal regularization: lambda = 509.1368 (selected via the 1SE method)
• Ridge shrinkage visible in coefficient comparison vs OLS

Recommendation: With R² above the 0.6 threshold for a good fit, this model is suitable for prediction. Review the Ridge Trace to understand which predictors drive Sales. If multicollinearity was a concern, ridge regression has stabilized the coefficient estimates. For variable selection (a sparse model), consider Lasso regression instead.

Interpretation

Purpose

This ridge regression model was developed to predict sales from multi-channel advertising spend (TikTok, Facebook, Google Ads) using regularized linear regression. The analysis demonstrates whether the three advertising channels can reliably forecast sales outcomes and how regularization improves model stability compared to standard linear regression.

Key Findings

  • R-squared (0.763): Model explains 76.3% of sales variance—a strong fit indicating the three advertising channels are meaningful predictors of sales performance
  • RMSE (1,310.17): Average prediction deviation of approximately $1,310 in sales units, roughly 12% of the mean sales of $10,668
  • Optimal Lambda (509.14): Regularization parameter selected via 1SE method balances bias-variance tradeoff; shrinkage reduced all coefficients 15-17% versus OLS estimates
  • Coefficient Stability: Ridge shrinkage demonstrates multicollinearity management; Google Ads shows strongest effect (1.01), followed by Facebook (0.41) and TikTok (0.31)

Interpretation

The model achieves the stated objective of predicting sales from advertising spend with solid predictive power. Ridge regularization successfully stabilized coefficient estimates by penalizing large weights, reducing overfitting risk. The 1SE selection method prioritizes generalization over minimal cross-validation error, accepting slightly stronger shrinkage in exchange for a more stable model.

Figure 4

Cross-Validation Error

MSE vs Regularization Strength

Cross-validation error across lambda values with optimal regularization selection

Interpretation

Purpose

This section evaluates regularization strength through 10-fold cross-validation across 100 lambda candidates. It identifies the optimal balance between model complexity and generalization loss—critical for ensuring the ridge regression model performs well on unseen data rather than overfitting to the training set.

Key Findings

  • Lambda Optimal (1se): 509.14 - Selected regularization parameter using the one-standard-error (1se) rule, prioritizing stability and generalization over minimal training loss
  • Lambda Minimum: 182.97 - Lowest cross-validated MSE, but more aggressive coefficient shrinkage may reduce robustness
  • CV Folds & Lambda Grid: 10 folds evaluated across 100 lambda values, providing robust loss estimates with wide regularization range (log-lambda: 5.21–14.42)
  • Loss Curve Pattern: MSE increases sharply at high lambda values (heavy regularization), indicating over-shrinkage degrades predictions
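The 1se selection procedure described above can be sketched directly: for each candidate lambda, compute the per-fold MSE, then pick the largest lambda whose mean CV error stays within one standard error of the minimum. The data and seed below are synthetic stand-ins, not the report's dataset:

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold, cross_val_score

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 3))
y = X @ np.array([0.3, 0.4, 1.0]) + rng.normal(0, 1.0, 200)

lambdas = np.logspace(-2, 4, 100)            # 100 candidate penalties
cv = KFold(n_splits=10, shuffle=True, random_state=2)

mean_mse, se_mse = [], []
for lam in lambdas:
    # cross_val_score returns negative MSE; flip the sign per fold.
    mse = -cross_val_score(Ridge(alpha=lam), X, y, cv=cv,
                           scoring="neg_mean_squared_error")
    mean_mse.append(mse.mean())
    se_mse.append(mse.std(ddof=1) / np.sqrt(len(mse)))

mean_mse, se_mse = np.array(mean_mse), np.array(se_mse)
i_min = mean_mse.argmin()
threshold = mean_mse[i_min] + se_mse[i_min]
# 1se rule: largest lambda whose mean CV error is within one SE of the minimum.
i_1se = max(i for i in range(len(lambdas)) if mean_mse[i] <= threshold)
lambda_min, lambda_1se = lambdas[i_min], lambdas[i_1se]
print(lambda_min, lambda_1se)
```

By construction lambda_1se ≥ lambda_min, which is why the 1se choice is always the more conservative (more strongly regularized) of the two.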

Interpretation

The 1se selection criterion trades ~7% higher CV loss (1,768,833 vs. 1,652,192 MSE) for a more conservative model with stronger regularization. This choice reflects the principle that simpler, more stable models generalize better when performance gains are marginal. The wide confidence bands around the loss curve confirm substantial fold-to-fold variability in the error estimates, supporting the more conservative 1se choice.

Figure 5

Ridge Trace

Coefficient Shrinkage Paths

Coefficient shrinkage paths showing how each predictor's coefficient changes as regularization increases

Interpretation

Purpose

The ridge trace visualizes how advertising channel coefficients shrink as regularization intensity increases across the full lambda path. This section demonstrates the core mechanism of L2 regularization: smooth, continuous coefficient reduction that prevents overfitting while retaining all predictors. Understanding this path is essential for validating why the optimal lambda (509.14) balances prediction accuracy with model stability.

Key Findings

  • Google Ads Dominance: Maintains the largest coefficient (1.13 at minimum lambda) and remains most influential even at high regularization, indicating robust predictive power for sales
  • Smooth Shrinkage Pattern: All three predictors follow continuous decay curves with no exact zeros, confirming L2 regularization behavior
  • Optimal Lambda (509.14): Located at log(lambda) ≈ 6.23, representing the 1se selection criterion, which accepts a minimal MSE increase in exchange for greater coefficient stability
  • Differential Sensitivity: TikTok and Facebook shrink more aggressively at high lambda values, suggesting lower robustness compared to Google Ads
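A ridge trace like the one in this figure is just the matrix of fitted coefficients over a lambda grid. A minimal sketch on synthetic data (illustrative values, not the report's dataset):

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 3))
y = X @ np.array([0.3, 0.4, 1.0]) + rng.normal(0, 1.0, 200)

lambdas = np.logspace(-2, 6, 50)
# paths[i, j] is the coefficient of predictor j at penalty lambdas[i];
# plotting each column against log(lambda) produces the ridge trace.
paths = np.array([Ridge(alpha=lam).fit(X, y).coef_ for lam in lambdas])
print(paths[0], paths[-1])
```

At the low end of the grid the coefficients approximate the OLS solution; at the high end they are shrunk close to (but never exactly) zero, which is the smooth-decay, no-exact-zeros pattern the figure describes.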

Interpretation

The ridge trace reveals that Google Ads spending has the most stable relationship with sales across regularization levels, while TikTok and Facebook show greater coefficient volatility. At the selected lambda of 509.14, all three channels retain meaningful influence on sales predictions (coefficients: Google Ads 1.01, Facebook 0.41, TikTok 0.31).

Figure 6

Actual vs Predicted

Model Prediction Accuracy

Actual vs predicted values showing overall model fit quality

Interpretation

Purpose

This section evaluates how well the ridge regression model predicts sales from multi-channel advertising spend. Model fit quality directly determines whether predictions are reliable for business decisions. Strong fit metrics indicate the three advertising channels (TikTok, Facebook, Google Ads) effectively explain sales variation.

Key Findings

  • R² = 0.7635: The model explains 76.3% of sales variance, exceeding the 0.6 threshold for good fit in cross-sectional data and indicating strong predictive power
  • RMSE = $1,310.17: Average prediction deviation in original sales units; relative to mean sales of $10,668, this represents ~12% typical deviation
  • MAE = $1,106.35: Mean absolute deviation, slightly lower than the RMSE, suggesting a relatively symmetric deviation distribution without extreme outliers
  • Residual Pattern: Mean residual = 0 confirms unbiased predictions; median = -$113.99 indicates slight systematic underprediction at lower sales levels
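The fit metrics above come from standard formulas. The sketch below computes them on a tiny hypothetical set of actual vs. predicted sales values (the numbers are made up for illustration, not drawn from this analysis):

```python
import numpy as np
from sklearn.metrics import r2_score, mean_absolute_error, mean_squared_error

# Hypothetical actual/predicted sales values, chosen only to show the formulas.
actual = np.array([10200.0, 9800.0, 11500.0, 12100.0, 9400.0])
predicted = np.array([10050.0, 10100.0, 11200.0, 11800.0, 9700.0])

r2 = r2_score(actual, predicted)                      # share of variance explained
rmse = np.sqrt(mean_squared_error(actual, predicted)) # penalizes large errors more
mae = mean_absolute_error(actual, predicted)          # typical absolute error
print(r2, rmse, mae)
```

RMSE ≥ MAE always holds; a small gap between them, as in this report ($1,310 vs. $1,106), is what suggests errors without extreme outliers.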

Interpretation

The ridge regression achieves solid predictive accuracy for the advertising spend-to-sales relationship. The 76.3% variance explained demonstrates that advertising channels are primary sales drivers, though 23.7% of variation remains unexplained, likely reflecting unmeasured factors (seasonality, pricing, competition). Residual statistics show the model's errors are centered near zero and roughly symmetric.

Figure 7

Coefficient Comparison

Ridge vs OLS Coefficients

Comparison of ridge (regularized) versus OLS (unregularized) coefficients

Interpretation

Purpose

This section compares ridge-regularized coefficients against unregularized OLS estimates to reveal how regularization adjusts the model's parameter estimates. It demonstrates the bias-variance tradeoff inherent in ridge regression: shrinking coefficients reduces overfitting risk while introducing controlled bias. Understanding these adjustments is critical for interpreting the model's predictive behavior in the sales forecasting objective.

Key Findings

  • Google Ads Shrinkage (17%): The largest coefficient reduction (1.22 → 1.01), suggesting either the strongest true effect or highest correlation with other advertising channels
  • TikTok Shrinkage (14.9%): The smallest adjustment (0.36 → 0.31), indicating relatively independent predictive signal with minimal multicollinearity
  • Facebook Shrinkage (15.7%): Moderate regularization (0.49 → 0.41), positioned between TikTok and Google Ads
  • Optimal Lambda (509.14): Selected via the one-standard-error (1se) rule, balancing model complexity against cross-validation loss
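The shrinkage percentages can be reproduced from the coefficients reported in this section. Note that recomputing from the rounded values printed here gives slightly different figures (e.g. ~13.9% for TikTok vs. the reported 14.9%), presumably because the report used unrounded coefficients:

```python
# OLS vs. ridge coefficients as reported (rounded to two decimals).
ols   = {"TikTok": 0.36, "Facebook": 0.49, "Google Ads": 1.22}
ridge = {"TikTok": 0.31, "Facebook": 0.41, "Google Ads": 1.01}

# Percentage reduction of each coefficient under the ridge penalty.
shrinkage = {k: 100 * (ols[k] - ridge[k]) / ols[k] for k in ols}
print(shrinkage)
```

The ordering is preserved either way: Google Ads shrinks the most and TikTok the least, consistent with the bullets above.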

Interpretation

All three advertising channels retain positive coefficients after regularization, confirming their continued relevance to sales prediction. The modest shrinkage percentages (14.9%–17%) indicate that multicollinearity among the predictors is present but not severe.

Table 8

Model Performance

Performance Metrics at Optimal Lambda

Detailed model performance metrics at optimal lambda

R-squared: 0.7635
RMSE: 1310.1733
MAE: 1106.3535
Lambda (optimal): 509.1368
Lambda selection: 1se
Observations used: 200
Predictors: 3
CV folds: 10

Interpretation

Purpose

This section evaluates how well the ridge regression model predicts sales from multi-channel advertising spend at the optimal regularization level. These metrics quantify prediction accuracy and validate whether the model provides useful forecasting capability for the business objective.

Key Findings

  • R-squared (0.7635): The model explains 76.3% of sales variance, indicating a strong fit that captures most systematic patterns in the data.
  • RMSE (1310.17): Average prediction deviation of $1,310, representing 48.5% of the Sales variable's standard deviation—well below the 50% threshold for useful predictive power.
  • MAE (1106.35): Mean absolute deviation of $1,106; typical deviations are slightly lower than the RMSE, suggesting a relatively symmetric deviation distribution.
  • Optimal Lambda (509.14): Selected via the one-standard-error (1se) rule, balancing the bias-variance tradeoff by shrinking coefficients 15–17% from their OLS estimates.

Interpretation

The ridge model demonstrates solid predictive performance for sales forecasting across TikTok, Facebook, and Google Ads channels. The 76% variance explained indicates the three advertising channels collectively capture the primary drivers of sales variation. RMSE relative to the outcome's spread confirms predictions are sufficiently accurate for practical business use, though unmeasured factors still account for roughly a quarter of sales variation.
