Executive Summary
Key metrics from customer RFM segmentation
Analysis Overview
Analysis configuration and dataset characteristics
Analysis: Customer RFM Segmentation using K-Means clustering. The dataset contains 2000 transaction records; after filtering, 941 unique customers were analyzed. Method: K-Means clustering on Recency, Frequency, and Monetary (RFM) metrics with 4 segments. Log transformation: enabled. Minimum transaction threshold: 1. RFM aggregation groups transactions by customer_id to compute recency (days since last purchase), frequency (transaction count), and monetary value (total spending).
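The aggregation step described above can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not the pipeline that produced this report: the function name `rfm_by_customer`, the row layout `(customer_id, purchase_date, amount)`, and the `snapshot_date` reference point are all hypothetical, and frequency is counted per row here, whereas the report may count distinct invoices.

```python
from collections import defaultdict
from datetime import date

def rfm_by_customer(transactions, snapshot_date):
    """Aggregate (customer_id, purchase_date, amount) rows into RFM metrics."""
    last_purchase = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for customer_id, purchase_date, amount in transactions:
        # Keep the most recent purchase date seen for each customer.
        if customer_id not in last_purchase or purchase_date > last_purchase[customer_id]:
            last_purchase[customer_id] = purchase_date
        frequency[customer_id] += 1      # one count per transaction row (assumption)
        monetary[customer_id] += amount  # total spending
    return {
        cid: {
            "recency_days": (snapshot_date - last_purchase[cid]).days,
            "frequency": frequency[cid],
            "monetary": round(monetary[cid], 2),
        }
        for cid in last_purchase
    }

# Hypothetical toy rows, not taken from the report's dataset.
rows = [
    ("A", date(2024, 1, 10), 20.0),
    ("A", date(2024, 3, 1), 15.5),
    ("B", date(2024, 2, 15), 8.0),
]
rfm = rfm_by_customer(rows, snapshot_date=date(2024, 3, 31))
print(rfm)
```

The choice of `snapshot_date` shifts every recency value by the same amount, so it affects the reported day counts but not the ordering of customers by recency.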
Data Quality
Data filtering and quality metrics
Started with 2000 transaction records. Quality filtering removed 556 rows (27.8%) with missing customer_id, missing dates, negative quantities, or zero unit prices, leaving 1444 clean transactions. Aggregating by customer yielded 941 unique customers, each with at least 1 transaction. Final dataset for clustering: 941 customers across 4 segments, validated with silhouette scores.
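As a rough sketch of the filtering rules listed above, the following applies the four exclusion criteria to a handful of rows and checks that the removed and retained counts add up to the starting count. The field names (`customer_id`, `date`, `quantity`, `unit_price`) and the helper `is_clean` are assumptions for illustration; the actual column names and filter order are not given in this report.

```python
def is_clean(row):
    """Return True when a transaction row passes all four quality filters."""
    return (
        row["customer_id"] is not None   # drop missing customer_id
        and row["date"] is not None      # drop missing dates
        and row["quantity"] >= 0         # drop negative quantities
        and row["unit_price"] != 0       # drop zero unit prices
    )

# Hypothetical toy rows: two clean, four failing one rule each.
rows = [
    {"customer_id": "A", "date": "2024-01-10", "quantity": 2, "unit_price": 4.5},
    {"customer_id": "B", "date": "2024-01-11", "quantity": 1, "unit_price": 9.0},
    {"customer_id": None, "date": "2024-01-12", "quantity": 1, "unit_price": 3.0},
    {"customer_id": "C", "date": None, "quantity": 1, "unit_price": 3.0},
    {"customer_id": "D", "date": "2024-01-13", "quantity": -1, "unit_price": 3.0},
    {"customer_id": "E", "date": "2024-01-14", "quantity": 1, "unit_price": 0.0},
]

clean = [r for r in rows if is_clean(r)]
removed = len(rows) - len(clean)
removed_pct = round(100 * removed / len(rows), 1)

# Sanity check: removed rows plus retained rows must equal the starting count.
assert removed + len(clean) == len(rows)
print(len(rows), removed, removed_pct, len(clean))
```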
RFM Statistics
Central tendency and spread of recency, frequency, and monetary metrics across all customers
| Metric | Min | Max | Mean | Median | Std Dev |
|---|---|---|---|---|---|
| Recency (Days) | 1 | 374 | 142.9 | 108 | 111.7 |
| Frequency (Transactions) | 1 | 27 | 1.395 | 1 | 1.573 |
| Monetary Value | 0.19 | 1790 | 31.21 | 16.35 | 87.86 |
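The gap between mean and median in the table (for example, monetary mean 31.21 against median 16.35, with a maximum of 1790) is the kind of right skew that a log transformation is meant to reduce before distance-based clustering. The sketch below shows one common way to do this, `log1p` followed by standardization. Whether this report used `log1p` or a plain logarithm, and whether it standardized afterwards, is not stated, so treat both choices and the helper name `log_standardize` as assumptions.

```python
from math import log1p
from statistics import mean, pstdev

def log_standardize(values):
    """Apply log1p, then rescale to zero mean and unit (population) std dev."""
    logged = [log1p(v) for v in values]
    mu, sigma = mean(logged), pstdev(logged)
    return [(x - mu) / sigma for x in logged]

# Illustrative inputs only: the min, median, mean, and max monetary values
# from the table above, used here as four sample points.
monetary = [0.19, 16.35, 31.21, 1790.0]
scaled = log_standardize(monetary)

# On the raw scale the maximum is roughly 57x the mean; on the log scale
# the same two values are far closer together.
print(max(monetary) / 31.21)
print(log1p(1790.0) / log1p(31.21))
print(scaled)
```

Without a step like this, the few very large monetary values would dominate the Euclidean distances that K-Means minimizes.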
Cluster Size Distribution
Number and proportion of customers in each segment
Customers are distributed across 4 segments, with an average of 235 customers per segment. Segment sizes range from 65 to 406 customers (6.9% to 43.1% of the total). The distribution is uneven, so targeted marketing campaigns need size-specific strategies.
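The shares quoted above follow directly from the segment sizes. A short check, using the customer counts from the Segment Profiles table, might look like this:

```python
# Customer counts per segment, as listed in the Segment Profiles table.
sizes = {"Segment 1": 406, "Segment 2": 65, "Segment 3": 266, "Segment 4": 204}

total = sum(sizes.values())
mean_size = total / len(sizes)
# Share of all customers in each segment, as a percentage rounded to one decimal.
shares = {name: round(100 * n / total, 1) for name, n in sizes.items()}

print(total, mean_size)
print(shares)
```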
Segment Profiles
Average RFM characteristics and cluster quality (silhouette coefficient) for each segment
| Cluster | N Customers | Mean Recency | Mean Monetary | Mean Frequency | Silhouette Coeff |
|---|---|---|---|---|---|
| Segment 1 | 406 | 195.3 | 39.42 | 1.16 | 0.3084 |
| Segment 2 | 65 | 47.17 | 124.7 | 4.74 | 0.232 |
| Segment 3 | 266 | 175.9 | 5.85 | 1.06 | 0.3141 |
| Segment 4 | 204 | 26.08 | 18.17 | 1.23 | 0.2335 |
Segment 1 has the highest mean recency (195.3 days since last purchase), while Segment 4 has the lowest (26.1 days), meaning its customers purchased most recently. Mean monetary value ranges from 5.85 (Segment 3) to 124.71 (Segment 2). The silhouette coefficients (0.232 to 0.314) suggest that Segment 3 is the most cohesive segment and Segment 2 the most heterogeneous, although all four values are low in absolute terms.
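For readers unfamiliar with the silhouette coefficient, the sketch below computes it from scratch for a toy set of 2-D points. For each point, `a` is the mean distance to the other members of its own cluster, `b` is the smallest mean distance to the members of any other cluster, and the score is `(b - a) / max(a, b)`. The function name `silhouette_values`, the toy points, and the use of Euclidean distance are assumptions for illustration; the report does not state which implementation or distance metric it used.

```python
from math import dist

def silhouette_values(points, labels):
    """Per-point silhouette s = (b - a) / max(a, b), Euclidean distance."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        # (the point's zero distance to itself adds nothing to the sum).
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(dist(p, q) for q in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return scores

# Hypothetical toy data: two tight, well-separated pairs of points.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
labels = [0, 0, 1, 1]
scores = silhouette_values(points, labels)
print([round(s, 3) for s in scores])
```

A per-segment coefficient like the one in the table is typically the mean of these per-point scores over the segment's members.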
Frequency vs Monetary (by Segment)
Customer distribution in frequency-monetary space, colored by cluster membership
The scatter shows how the segments separate in frequency-monetary space. Segment 2 contains the highest-spending customers (average monetary: 124.71) and is the only segment with a mean frequency well above 1 (4.74), while the other segments show lower spending profiles. Most customers cluster in the low-to-moderate frequency and monetary ranges, with occasional high-value outliers representing premium customers.
Cluster Quality (Silhouette Scores)
Cohesion measure for each cluster (1=perfect separation, 0=borderline, <0=misclassified)
Silhouette scores range from 0.232 to 0.314, with an unweighted average across the four clusters of 0.272. None of the 4 clusters shows strong cohesion (>0.5), and 2 clusters contain borderline or overlapping points. Scores near 1 indicate tight, well-separated clusters suitable for targeting, while lower scores suggest that segment boundaries may be ambiguous.
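The 0.272 figure matches a simple average of the four per-segment coefficients. Because the segments differ widely in size, a size-weighted average gives a somewhat different number. The sketch below computes both from the Segment Profiles table. It assumes each tabulated coefficient is the mean of per-customer silhouette values within that segment, in which case the weighted figure corresponds to the mean over all 941 customers.

```python
# (customer count, silhouette coefficient) per segment, from the Segment Profiles table.
profiles = {
    "Segment 1": (406, 0.3084),
    "Segment 2": (65, 0.2320),
    "Segment 3": (266, 0.3141),
    "Segment 4": (204, 0.2335),
}

# Unweighted: every segment counts equally, regardless of size.
unweighted = sum(s for _, s in profiles.values()) / len(profiles)

# Weighted: every customer counts equally, so large segments carry more weight.
total_customers = sum(n for n, _ in profiles.values())
weighted = sum(n * s for n, s in profiles.values()) / total_customers

print(round(unweighted, 3), round(weighted, 3))
```

The weighted value is higher here because the two largest segments (1 and 3) also have the highest coefficients.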
Top Customers by Monetary Value
Highest-spending customers and their RFM characteristics by segment
| Cluster | Customer ID | Recency Days | Monetary Value | Frequency Count |
|---|---|---|---|---|
| Segment 1 | 15838 | 65 | 1790 | 1 |
| Segment 2 | 17450 | 71 | 1155 | 4 |
| Segment 2 | 14646 | 3 | 867.9 | 8 |
| Segment 2 | 14911 | 3 | 686.1 | 27 |
| Segment 2 | 14156 | 33 | 586 | 5 |
| Segment 1 | 16333 | 124 | 540 | 1 |
| Segment 1 | 13082 | 255 | 390 | 1 |
| Segment 2 | 12748 | 5 | 294.8 | 26 |
| Segment 2 | 13408 | 18 | 270.4 | 4 |
| Segment 1 | 15061 | 212 | 244.8 | 1 |
| Segment 2 | 14298 | 53 | 244 | 4 |
| Segment 2 | 14096 | 5 | 218.2 | 10 |
| Segment 1 | 12415 | 144 | 217.4 | 2 |
| Segment 1 | 14608 | 30 | 204 | 1 |
| Segment 1 | 12590 | 212 | 203.4 | 1 |
| Segment 1 | 15189 | 205 | 194.1 | 2 |
| Segment 1 | 16684 | 310 | 183.6 | 1 |
| Segment 2 | 12433 | 1 | 180.9 | 4 |
| Segment 1 | 13798 | 94 | 179 | 1 |
| Segment 1 | 16003 | 360 | 175.2 | 1 |
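Segment 1 supplies 11 of the top 20 customers by spending and Segment 2 the other 9. Relative to segment size, however, Segment 2 is far more concentrated in this list: 9 of its 65 customers (13.8%) appear, versus 11 of 406 (2.7%) for Segment 1. Segment 2 also has the highest mean monetary value (124.71) and mean frequency (4.74), which makes it the more likely candidate for your most valuable and retention-critical segment. The Segment 1 entries are mostly single-purchase customers, many with long recency. Across the top 20, spending ranges from 175.20 to 1790.00, recency from 1 to 360 days, and transaction frequency from 1 to 27. Prioritizing retention and upsell strategies for Segment 2, and win-back efforts for the high spenders in Segment 1, can maximize customer lifetime value.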
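Raw counts in a top-20 list tend to favor larger segments, and Segment 1 is more than six times the size of Segment 2 (406 versus 65 customers). One way to compare segments on an equal footing is to divide each segment's top-20 count by its size. The sketch below does this using the Cluster column of the table above and the segment sizes from the Segment Profiles table; the variable names are illustrative.

```python
from collections import Counter

# Segment number for each of the 20 rows in the table above, in order.
top20_segments = [1, 2, 2, 2, 2, 1, 1, 2, 2, 1, 2, 2, 1, 1, 1, 1, 1, 2, 1, 1]

# Segment sizes from the Segment Profiles table.
segment_sizes = {1: 406, 2: 65, 3: 266, 4: 204}

counts = Counter(top20_segments)
# Percentage of each segment's customers that appear in the top 20.
representation = {
    seg: round(100 * counts.get(seg, 0) / size, 1)
    for seg, size in segment_sizes.items()
}

print(dict(counts))
print(representation)
```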
Recency Distribution by Segment
Distribution of days since last purchase for each customer segment
Segment 4 shows the most recent purchase behavior (median: 24 days), while Segment 1 contains customers at higher risk of churn, with longer recency (median: 193 days). Recency variation within segments (visible in the spread of each box) reveals customer engagement patterns: compact boxes indicate consistent behavior, while elongated boxes suggest mixed engagement levels that call for segment-specific retention strategies.