Analysis overview and configuration
| Parameter | Value |
|---|---|
| alpha | 1 |
| n_folds | 10 |
| lambda_choice | lambda.1se |
| standardize | TRUE |
This LASSO regression analysis identifies which advertising channels drive sales by applying automatic variable selection to 300 observations across 6 predictors. The analysis uses cross-validated regularization to balance model complexity with predictive accuracy, selecting only the most influential channels while excluding noise.
The model successfully identifies 4 advertising channels as meaningful sales drivers while eliminating 2 as redundant. The strong R² indicates these selected channels capture the essential sales dynamics. The conservative lambda choice (1se vs. min) suggests the analysis prioritizes generalization and parsimony over marginal gains in training fit.
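The configuration above (alpha = 1, standardized predictors, lambda.1se) matches an R/glmnet-style workflow. As an illustration only, here is a minimal NumPy sketch of the coordinate-descent update at the heart of LASSO fitting; the function names `soft_threshold` and `lasso_cd` are hypothetical and not part of the original analysis code.

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator: the closed-form solution of each
    one-dimensional LASSO subproblem."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Minimal cyclic coordinate descent for the objective
    (1/2n)||y - Xb||^2 + lam * ||b||_1.
    Assumes columns of X are standardized (mean 0, unit variance),
    mirroring the standardize = TRUE setting above."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual: remove every predictor's contribution except j's.
            r = y - X @ beta + X[:, j] * beta[j]
            rho = X[:, j] @ r / n
            beta[j] = soft_threshold(rho, lam) / col_sq[j]
    return beta
```

With a moderate penalty, coefficients on uninformative predictors are driven exactly to zero, which is the variable-selection behavior this analysis relies on.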
Data preprocessing and column mapping
| Metric | Value |
|---|---|
| Initial Rows | 300 |
| Final Rows | 300 |
| Rows Removed | 0 |
| Retention Rate | 100% |
This section documents the data preprocessing pipeline for the LASSO regression analysis. It shows that all 300 observations were retained without any rows removed during cleaning, indicating either pristine input data or minimal data quality issues. Understanding preprocessing integrity is critical for validating whether model performance (R² = 0.834) reflects true predictive power or data artifacts.
Perfect data retention suggests the dataset arrived clean and complete, with no missing values or anomalies requiring removal. This is favorable for model stability, though it raises a subtle question: the absence of any cleaning may reflect either exceptional data quality or insufficient validation rigor. Taken at face value, the LASSO model's strong performance (RMSE = 5.033, MAE = 4.113) is more likely attributable to genuine predictive relationships than to data artifacts.
The lack of explicit train/test split documentation is notable; the analysis relied on 10-fold cross-validation for regularization parameter selection rather than evaluation on a held-out test set.
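A complete-case retention check of the kind summarized in the table above can be sketched in plain Python; `retention_report` is a hypothetical helper, not the pipeline's actual code, and it assumes missing values are encoded as `None`.

```python
def retention_report(rows):
    """Summarize complete-case filtering.

    rows: list of dicts, one per observation; None marks a missing value.
    Returns initial/final row counts, rows removed, and retention rate.
    """
    complete = [r for r in rows if all(v is not None for v in r.values())]
    removed = len(rows) - len(complete)
    rate = 100.0 * len(complete) / len(rows) if rows else 0.0
    return {"initial": len(rows), "final": len(complete),
            "removed": removed, "retention_pct": rate}
```

On the dataset described here, such a check would report 300 initial rows, 300 final rows, and 100% retention.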
Key model findings
| Finding | Value |
|---|---|
| Model Quality | Good fit (R² > 0.7) |
| Variables Selected | 4 of 6 predictors |
| Variables Excluded | 2 predictors set to 0 |
| R-Squared | 83.4% |
| RMSE | 5.033 |
| Optimal Lambda | 0.6614 |
This LASSO regression analysis evaluated 6 predictor variables to identify which ones meaningfully contribute to predicting the outcome while minimizing overfitting. The model successfully reduced the feature set through automatic variable selection, a key objective of LASSO regularization for improving model parsimony and interpretability.
The model achieved strong explanatory power while reducing complexity from 6 to 4 predictors. The 1se lambda selection method prioritizes generalization over training fit, suggesting the selected variables are robust and unlikely to be artifacts of overfitting. All 300 observations were retained with no data loss.
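The fit metrics reported above follow standard definitions; as a reference, here is a small NumPy sketch (the name `fit_metrics` is an assumption for illustration).

```python
import numpy as np

def fit_metrics(y, y_hat):
    """RMSE, MAE, and R-squared as reported in the summary tables."""
    resid = y - y_hat
    rmse = np.sqrt(np.mean(resid ** 2))          # root mean squared error
    mae = np.mean(np.abs(resid))                 # mean absolute error
    r2 = 1.0 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)
    return {"rmse": rmse, "mae": mae, "r2": r2}
```

A perfect prediction yields R² = 1; constant predictions at the mean yield R² = 0.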
Coefficient trajectories across the regularization path as lambda varies
The regularization path visualizes how predictor coefficients shrink toward zero as the LASSO penalty (lambda) increases. This reveals the order and timing of variable selection—which predictors are most important (enter earliest at high lambda) versus least important (enter only at low lambda). Understanding this path is essential for identifying the core drivers of the model and validating the stability of selected features.
The regularization path demonstrates that predictor_1 enters the model first (at highest lambda ~9.82), marking it as the strongest signal. As lambda decreases, additional predictors sequentially activate, with all 6 variables eventually entering as the penalty approaches zero (lambda → 0).
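The entry-order behavior described above has a clean closed form under an (assumed) orthonormal design: each LASSO coefficient is the soft-thresholded OLS estimate, so predictor j enters exactly when lambda drops below |beta_ols[j]|. This sketch illustrates that mechanism only; the actual analysis path is computed over a correlated design, where entry order also depends on inter-predictor correlations.

```python
import numpy as np

def entry_lambdas(beta_ols):
    """Under an orthonormal design, predictor j enters the LASSO path
    exactly when lambda falls below |beta_ols[j]|."""
    return np.abs(np.asarray(beta_ols, dtype=float))

def coef_path(beta_ols, lambdas):
    """Coefficient trajectories: soft-threshold each OLS estimate
    at every lambda on the grid (rows = lambda values)."""
    b = np.asarray(beta_ols, dtype=float)
    return np.array([np.sign(b) * np.maximum(np.abs(b) - lam, 0.0)
                     for lam in lambdas])
```

Larger OLS magnitudes therefore enter earlier (at higher lambda), matching the intuition that the earliest entrants carry the strongest signal.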
Cross-validation error across lambda values with optimal lambda selection
This section identifies the optimal regularization strength (lambda) for the LASSO model through 10-fold cross-validation. The lambda parameter controls the trade-off between model complexity and predictive accuracy—a critical decision that directly impacts which predictors are retained and how well the model generalizes to unseen data.
The analysis demonstrates a classic regularization trade-off: lambda.min minimizes error but at the cost of model complexity, while lambda.1se sacrifices minimal predictive power (~0.01 MSE difference) to achieve substantial simplification. This conservative choice aligns with the principle of parsimony—retaining only 4 of 6 predictors while maintaining strong cross-validated performance (R² = 0.834).
The 10-fold cross-validation design ensures robust lambda selection across multiple data splits. The narrow confidence intervals at low lambda values suggest the model's performance is stable and reliable at the chosen regularization level.
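The lambda.min / lambda.1se distinction discussed above follows the glmnet convention: lambda.min minimizes cross-validated error, while lambda.1se is the largest (most regularized) lambda whose error stays within one standard error of that minimum. A minimal NumPy sketch of the selection rule, with hypothetical inputs standing in for the actual CV curve:

```python
import numpy as np

def select_lambda(lambdas, cv_mean, cv_se):
    """Pick lambda.min and lambda.1se from a cross-validation curve.

    lambdas: candidate penalty values
    cv_mean: mean CV error at each lambda
    cv_se:   standard error of the CV error at each lambda
    """
    lambdas, cv_mean, cv_se = map(np.asarray, (lambdas, cv_mean, cv_se))
    i_min = int(np.argmin(cv_mean))
    threshold = cv_mean[i_min] + cv_se[i_min]   # one-standard-error band
    within = lambdas[cv_mean <= threshold]
    return {"lambda_min": float(lambdas[i_min]),
            "lambda_1se": float(within.max())}
```

Because larger lambdas shrink more coefficients to zero, choosing lambda.1se trades a small, statistically insignificant increase in error for a sparser model.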
Non-zero coefficients at the selected lambda — the variables chosen by LASSO
This section identifies which variables drive the outcome and quantifies their individual impact. LASSO regularization automatically selected 4 of 6 predictors by shrinking irrelevant coefficients to zero, creating a parsimonious model that balances predictive accuracy with simplicity. Understanding coefficient magnitudes and directions reveals the relative importance and directional effect of each retained variable.
The model identifies predictor_1 as the dominant driver, followed by predictor_2 and predictor_3. The negative coefficient on predictor_3 indicates an inverse relationship: higher values decrease the predicted outcome. Predictors 4 and 5 were eliminated entirely, suggesting they add no independent predictive value beyond noise after accounting for the retained predictors.
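Extracting the surviving variables from a fitted coefficient vector is a one-liner; the helper name `selected_coefficients` is an assumption for illustration, and the tolerance guards against numerically-tiny non-zeros.

```python
def selected_coefficients(names, coefs, tol=1e-10):
    """Return the variables LASSO kept (|coefficient| above tol),
    sorted by absolute effect size, largest first."""
    kept = {n: c for n, c in zip(names, coefs) if abs(c) > tol}
    return dict(sorted(kept.items(), key=lambda kv: -abs(kv[1])))
```

Applied to a 6-predictor fit with two exact zeros, this returns the 4 retained variables in order of dominance.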
Actual vs predicted scatter plot showing model fit quality
This section evaluates how accurately the LASSO regression model captures the relationship between predictors and the outcome variable. Model fit quality is essential for assessing whether the selected variables (4 of 6) provide reliable predictions and whether the regularization approach successfully balanced complexity with accuracy.
The model demonstrates strong fit quality, with the selected four predictors capturing the underlying data structure effectively. The tight alignment between R-squared and deviance explained (both 0.834) confirms the LASSO regularization successfully eliminated noise-bearing variables (predictor_4 and predictor_5) without sacrificing predictive power.
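The match between R-squared and deviance explained is expected rather than coincidental: for a Gaussian model the deviance is simply the residual sum of squares, so the fraction of deviance explained reduces to R². A small sketch of that identity (function name hypothetical):

```python
import numpy as np

def deviance_explained(y, y_hat):
    """For a Gaussian model the deviance is the residual sum of squares,
    so fraction of deviance explained coincides with R-squared."""
    null_dev = np.sum((y - y.mean()) ** 2)   # deviance of intercept-only model
    resid_dev = np.sum((y - y_hat) ** 2)     # deviance of the fitted model
    return 1.0 - resid_dev / null_dev
```

The two metrics diverge only for non-Gaussian families (e.g. logistic or Poisson models), where the deviance is no longer a sum of squares.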
Complete model performance metrics and parameter summary
| Metric | Value |
|---|---|
| RMSE | 5.033 |
| MAE | 4.113 |
| R-Squared | 0.8341 |
| Deviance Explained | 0.8341 |
| Lambda (1se) | 0.6614 |
| Lambda (min) | 0.0854 |
| Variables Selected | 4 |
| Total Predictors | 6 |
This section provides a comprehensive snapshot of the LASSO regression model's predictive performance and feature selection efficiency. It answers whether the model achieves adequate accuracy while maintaining interpretability through automatic variable selection, which is central to the analysis objective of balancing prediction quality with model simplicity.
The model demonstrates robust performance with strong explanatory power and effective dimensionality reduction. The gap between lambda.min and lambda.1se selection reflects a conservative regularization strategy that trades minimal training improvement for substantially improved generalization potential.