Analysis overview and configuration
| Parameter | Value |
|---|---|
| n_components | 4 |
| scale_data | TRUE |
| variance_threshold | 0.8 |
This PCA analysis reduces 4 marketing and sales features into 2 principal components to identify the primary dimensions of variation in the dataset. By compressing the feature space while retaining 75.4% of total variance, the analysis enables simpler visualization and interpretation of marketing spend patterns without losing critical information.
The analysis successfully distills marketing spend and sales variation into two interpretable dimensions. PC1 appears to capture a scale or intensity factor (strong negative loadings on Sales and TikTok), while PC2 is dominated by a contrast involving Google Ads. The 75.4% cumulative variance falls just short of the configured 0.8 threshold, so a third component could be added if higher fidelity is required.
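As a rough sketch of the pipeline this report describes (with scale_data=TRUE), the steps can be reproduced in plain NumPy. The 200×4 matrix below is synthetic stand-in data, not the actual marketing dataset:

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-in for the 200x4 marketing matrix; not the report's data
X = rng.normal(size=(200, 4))
X[:, 2] += 0.8 * X[:, 0]            # induce some cross-feature correlation

# scale_data=TRUE: center each feature and scale to unit variance
Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# PCA via SVD of the standardized matrix
U, S, Vt = np.linalg.svd(Xs, full_matrices=False)
eigvals = S**2 / (len(Xs) - 1)      # component eigenvalues
ratio = eigvals / eigvals.sum()     # share of variance per PC
print("variance ratio:", ratio.round(3))
print("cumulative:   ", ratio.cumsum().round(3))
```

Because the features are scaled to unit variance, the eigenvalues sum to the number of features (here 4), which is what makes the Kaiser criterion used later in this report meaningful.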
Data preprocessing and column mapping
| Metric | Value |
|---|---|
| Initial Rows | 200 |
| Final Rows | 200 |
| Rows Removed | 0 |
| Retention Rate | 100% |
This section documents the data quality and retention outcomes during preprocessing for the PCA analysis. Perfect retention is critical for dimensionality reduction, as PCA requires complete feature matrices to compute meaningful variance structures across all 200 marketing observations.
The perfect retention rate indicates robust data quality in the marketing spend and sales dataset. No missing values or anomalies triggered removal, allowing the full 200-observation sample to contribute to principal component calculations. This maximizes statistical power for identifying variance dimensions and ensures the 75.4% cumulative variance explained by PC1 and PC2 is based on complete information rather than imputed or filtered data.
While 100% retention is favorable, the analysis assumes features were already standardized (scale_data=TRUE) during PCA execution. The lack of train/test splitting reflects PCA's unsupervised nature; however, this means no independent validation set exists to confirm that the component structure generalizes beyond these 200 observations.
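The complete-case filtering step described above can be sketched as follows; the data here is a synthetic stand-in, and the real pipeline's column handling is not shown in the report:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 4))       # stand-in for the raw 200x4 feature matrix

# PCA needs a complete matrix: keep only rows with no missing values
complete = ~np.isnan(X).any(axis=1)
X_clean = X[complete]
retention = complete.mean()
print(f"{len(X)} -> {len(X_clean)} rows ({retention:.0%} retained)")
```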
| Finding | Value |
|---|---|
| Features Analyzed | 4 |
| Recommended Components | 2 |
| Variance Captured | 75.4% |
| PC1 Variance | 48% |
| Observations Used | 200 |
This PCA analysis successfully reduced 4 marketing spend and sales features into 2 principal components, achieving the stated objective of identifying key dimensions of variation in the dataset. The analysis enables downstream modeling with simplified feature space while retaining meaningful variance structure.
The analysis reveals that marketing spend and sales data cluster around two primary dimensions of variation. PC1 represents a scale or magnitude dimension (its top loadings share the same sign, so those features rise and fall together; the overall sign of a component is arbitrary), while PC2 captures a distinct orthogonal pattern. Together, these components preserve three-quarters of the original information, making them suitable for clustering or predictive modeling without substantial degradation.
PCA assumes linear relationships and requires standardized inputs; because the features were scaled here, the components reflect correlation structure rather than raw spend magnitudes, and any nonlinear relationships among channels would not be captured.
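Projecting observations onto the two retained components for downstream use can be sketched like this (synthetic stand-in data). The resulting score columns are uncorrelated by construction, which is the property that makes them convenient model inputs:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))       # synthetic stand-in data
Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# Keep only the first two components for downstream modeling
U, S, Vt = np.linalg.svd(Xs, full_matrices=False)
scores = Xs @ Vt[:2].T              # 200x2 matrix of PC1/PC2 scores

# PC scores are uncorrelated by construction
corr = np.corrcoef(scores, rowvar=False)[0, 1]
print(scores.shape, f"PC1-PC2 correlation: {corr:.2e}")
```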
Variance explained by each principal component
The scree plot visualizes the variance contribution of each principal component, helping identify which dimensions capture the most meaningful variation in marketing spend and sales data. This section is critical for determining dimensionality reduction effectiveness—showing how much information is retained when moving from 4 original features to fewer principal components.
The scree plot reveals a concentration of variance in the first two components, supporting the PCA recommendation to retain only PC1 and PC2. This pattern suggests the four original marketing features share correlated structure that can be compressed into two uncorrelated dimensions with limited information loss. The decline from PC2 (27.4%) to PC3 (21.6%) is modest, however, so the two-component cut rests more on the Kaiser criterion and the cumulative-variance target than on a sharp scree elbow; only PC4 (2.9%) is clearly negligible.
PCA assumes linear relationships among features, so any nonlinear structure in the marketing data would not appear in the variance pattern shown here.
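Using the eigenvalues from the component summary table in this report, the scree profile can be checked numerically:

```python
import numpy as np

# Eigenvalues from the component summary table in this report
eigvals = np.array([1.92, 1.097, 0.865, 0.117])
ratio = eigvals / eigvals.sum()
drops = -np.diff(ratio)             # decrease in variance share between PCs

for i, d in enumerate(drops, start=1):
    print(f"PC{i} -> PC{i+1}: drop of {d:.1%}")
```

The largest drop is between PC1 and PC2, and PC2 to PC3 is comparatively flat, confirming that there is no clean elbow at the two-component cut.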
Observations projected onto the first two principal components
This score plot projects 200 marketing observations onto the first two principal components, revealing the underlying structure of variation in marketing spend and sales data. By visualizing observations in reduced dimensional space, the plot identifies natural groupings, outliers, and patterns that would be invisible when examining the original four features individually.
The relatively even scatter across PC space suggests marketing spend and sales metrics vary continuously across the 200 observations rather than forming distinct clusters. The left-skewed PC1 distribution indicates one observation exhibits an extreme pattern in the primary dimension of variation—likely representing either an outlier or a genuinely distinct observation that merits individual review before downstream modeling.
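Extreme observations on PC1 can be flagged with a simple z-score rule. This is a sketch on synthetic data with one injected outlier; the cutoff of 3 standard deviations is an arbitrary choice, not something the report specifies:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
X[0] = [8.0, 8.0, 8.0, 8.0]         # inject one extreme observation

Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
U, S, Vt = np.linalg.svd(Xs, full_matrices=False)
pc1 = Xs @ Vt[0]                    # PC1 score for each observation

# Flag scores more than 3 standard deviations from the mean
z = (pc1 - pc1.mean()) / pc1.std(ddof=1)
outliers = np.flatnonzero(np.abs(z) > 3)
print("flagged rows:", outliers)
```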
Contribution of each original variable to each principal component
The loadings heatmap reveals how each of the 4 original marketing features contributes to the principal components. By identifying which variables load strongly together on the same component, this section enables you to assign business meaning to the mathematical dimensions—transforming abstract PCs into interpretable marketing dimensions that explain variation in spend and sales patterns.
The negative loadings on PC1 (Sales, TikTok) suggest these metrics move together along the primary dimension. PC2's opposing loadings reveal a trade-off dynamic: Google Ads loads strongly in one direction (−0.80) while another channel loads in the other (+0.60), so observations scoring high on PC2 emphasize one channel at the expense of the other.
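In the SVD formulation, the loadings matrix is simply the rows of Vᵀ, each a unit-norm weighting of the original features (synthetic stand-in data, so the values will not match the report's):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))       # synthetic stand-in data
Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
U, S, Vt = np.linalg.svd(Xs, full_matrices=False)

# Each row of Vt expresses one PC as a unit-norm mix of the 4 features
loadings = Vt                       # shape: (components, features)
print(np.round(loadings, 3))
```

Because each row has unit norm, a loading near ±1 means that component is essentially a single feature, while evenly sized loadings indicate a blended dimension.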
Cumulative variance explained as more components are added
This section quantifies the trade-off between dimensionality reduction and information retention. It demonstrates how many principal components are needed to capture meaningful variance in the marketing spend and sales data, guiding decisions about which components to retain for downstream analysis or visualization.
The analysis reveals that two components effectively summarize three-quarters of the variation in the original four features. This suggests the underlying marketing and sales metrics share substantial covariance—likely reflecting common business drivers. While the 75.4% figure falls modestly below the 80% threshold, retaining only two dimensions reduces the feature space by 50% while preserving most meaningful variation, making it suitable for visualization and interpretation of marketing dynamics.
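Given the reported eigenvalues, the cumulative curve and the component count needed to strictly meet the configured threshold are easy to check:

```python
import numpy as np

eigvals = np.array([1.92, 1.097, 0.865, 0.117])   # from the component summary table
cum = np.cumsum(eigvals / eigvals.sum())
print("cumulative:", cum.round(3))

# Smallest k whose cumulative variance reaches variance_threshold = 0.8
k = int(np.searchsorted(cum, 0.8) + 1)
print("components needed for 80%:", k)
```

Strictly meeting the 0.8 threshold would take three components; the report keeps two (75.4%) on Kaiser-criterion grounds.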
Summary statistics for each principal component
| Component | Eigenvalue | Variance_Pct | Cumulative_Pct | Recommended |
|---|---|---|---|---|
| PC1 | 1.92 | 48% | 48% | ✓ Retain |
| PC2 | 1.097 | 27.4% | 75.4% | ✓ Retain |
| PC3 | 0.865 | 21.6% | 97.1% | |
| PC4 | 0.117 | 2.9% | 100% | |
This section identifies which principal components merit retention based on statistical criteria. It shows how much variance each component captures and whether it meets the Kaiser criterion (eigenvalue > 1 for scaled data). This directly supports the marketing analytics objective by determining how many dimensions are needed to represent the key variation in marketing spend and sales data.
The two-component solution efficiently summarizes the marketing dataset's structure. PC1 represents nearly half the total variation, while PC2 adds substantial explanatory power. Together, they capture three-quarters of the data's variance while eliminating noise from weaker components. This 2D representation enables simplified visualization and analysis of marketing spend-sales relationships without substantial information loss.
The Kaiser criterion (retain eigenvalues above 1) is a heuristic, not a guarantee: PC3's eigenvalue of 0.865 is not far below the cutoff and still carries 21.6% of the variance, so analyses sensitive to that residual variation may warrant a three-component solution.
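The retention decision in the table reduces to a one-line check on the eigenvalues:

```python
import numpy as np

eigvals = np.array([1.92, 1.097, 0.865, 0.117])   # from the table above
# Kaiser criterion for scaled data: retain components with eigenvalue > 1
retain = eigvals > 1.0
print("retain:", retain, "->", int(retain.sum()), "components")
```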
Top variable loadings per component (top 3 by absolute value)
| Component | Variable | Loading | Abs_Loading |
|---|---|---|---|
| PC1 | Sales | -0.6984 | 0.6984 |
| PC1 | TikTok | -0.556 | 0.556 |
| PC1 | | -0.3783 | 0.3783 |
| PC2 | Google Ads | -0.7986 | 0.7986 |
| PC2 | | 0.6 | 0.6 |
| PC2 | Sales | -0.0482 | 0.0482 |
| PC3 | TikTok | 0.6595 | 0.6595 |
| PC3 | | -0.6028 | 0.6028 |
| PC3 | Google Ads | -0.447 | 0.447 |
| PC4 | Sales | -0.7129 | 0.7129 |
| PC4 | TikTok | 0.5058 | 0.5058 |
| PC4 | | 0.3653 | 0.3653 |
This section identifies which original features most strongly define each principal component by ranking variables by absolute loading magnitude. High absolute loadings (near ±1) reveal the core drivers of variation in each dimension, enabling interpretation of what each PC represents in the marketing spend and sales context.
The analysis reveals that marketing spend and sales variation is primarily captured by Sales and TikTok (PC1: 48% of variance), with Google Ads providing the dominant orthogonal contrast (PC2: 27.4% of variance). The consistent appearance of the table's unnamed variable (feature_2) in every component's top loadings suggests it contributes moderate, diffuse variation across several dimensions rather than defining any single one.
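The top-3 ranking in the table can be reproduced with an argsort on absolute loadings. Note two assumptions: the Google Ads PC1 value below (−0.245) is hypothetical, since the table reports only each component's top three, and "feature_2" stands in for the table's unnamed column:

```python
import numpy as np

variables = np.array(["Sales", "TikTok", "Google Ads", "feature_2"])
# PC1 loadings from the table; the Google Ads value (-0.245) is hypothetical
pc1 = np.array([-0.6984, -0.556, -0.245, -0.3783])

# Rank variables by absolute loading and keep the top 3
order = np.argsort(-np.abs(pc1))[:3]
for var, load in zip(variables[order], pc1[order]):
    print(f"PC1  {var:<10s} {load:+.4f}")
```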