Survey Response Analysis — Find the Story in Categorical Data

You ran a survey and now you have a spreadsheet full of responses. Which demographic groups scored highest? Are the differences between departments real or just noise? Do certain categorical variables predict outcomes? This module breaks down survey responses by every categorical variable in your data, runs chi-square tests for independence, ANOVA for group comparisons, and Tukey HSD for pairwise differences — all from a single CSV upload.

What Is Categorical Survey Analysis?

Categorical survey analysis examines how responses differ across groups defined by categorical variables — demographics like gender, age bracket, department, region, education level, or any other grouping in your data. The analysis answers two fundamental questions: Do the groups differ? And if so, which groups differ from which?

The first question is answered by ANOVA (Analysis of Variance) for numeric outcomes and chi-square tests for categorical outcomes. ANOVA compares the average score (or rating, or revenue, or any numeric measure) across all groups simultaneously and tells you whether at least one group stands apart. Chi-square tests examine whether two categorical variables are independent — for example, whether the distribution of satisfaction ratings is the same across regions or whether it varies systematically.

The second question — which specific groups differ — is answered by Tukey HSD (Honestly Significant Difference) post-hoc comparisons. After ANOVA finds a significant overall difference, Tukey tests every pair of groups and tells you which specific pairs are meaningfully different, with proper correction for the multiple comparison problem.

For example, consider an employee satisfaction survey with columns for department (Engineering, Sales, Marketing, Support), tenure bracket (0-1 years, 1-3 years, 3+ years), and a satisfaction score (1-100). ANOVA might find that departments differ significantly in satisfaction (F = 8.3, p = 0.001). Tukey HSD then reveals that the specific difference is between Support (mean 62) and Engineering (mean 78) — the other pairs are not significantly different. Meanwhile, chi-square tests might reveal that department and tenure bracket are not independent: Engineering has a disproportionate share of 3+ year employees.
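The flow above can be sketched in a few lines of base R. The data below is simulated to mirror the hypothetical example (department names, means, and sample sizes are all illustrative, not real survey results):

```r
# Simulated data mirroring the example above; all numbers are illustrative.
set.seed(42)
dept  <- factor(rep(c("Engineering", "Sales", "Marketing", "Support"), each = 30))
score <- c(rnorm(30, 78, 10), rnorm(30, 72, 10),
           rnorm(30, 74, 10), rnorm(30, 62, 10))

# ANOVA: does at least one department stand apart?
fit <- aov(score ~ dept)
summary(fit)       # F-statistic and overall p-value

# Tukey HSD: which specific pairs differ, with adjusted p-values and CIs
TukeyHSD(fit)
```

With four groups, Tukey HSD reports all six pairwise comparisons; the pairs whose confidence intervals exclude zero are the significant ones.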

When to Use Survey Analysis

This module is designed for any dataset that combines categorical grouping variables with numeric outcome variables. The most common use case is survey data, but it works equally well for educational assessments, clinical trial outcomes, employee performance reviews, customer feedback, or any structured data where you want to compare numeric outcomes across categories.

HR and employee engagement: Compare satisfaction, engagement, or eNPS scores across departments, locations, tenure groups, or management levels. Identify which segments are disengaged and whether the differences are statistically significant or just noise from small sample sizes.

Education and training: Compare test scores or assessment results across class sections, teaching methods, demographic groups, or prior experience levels. The multi-subject comparison feature is particularly useful here — if you have scores for multiple subjects (math, reading, science), the module compares all of them across groups simultaneously.

Customer research: Break down satisfaction ratings by customer segment, product line, region, or support channel. Identify which segments are underserved and quantify the gap with confidence intervals.

Clinical and research surveys: Compare outcome measures across treatment groups, demographic categories, or study sites. The chi-square independence tests check whether categorical variables (like treatment group and outcome category) are associated.

What Data Do You Need?

You need a CSV with at least two columns:

Required:
- categorical_1 — the primary grouping variable (e.g., "department", "region", "gender", "treatment_group"). This defines the groups you want to compare.
- numeric_target_1 — the primary numeric outcome (e.g., "satisfaction_score", "test_score", "rating", "revenue").

Optional (for richer analysis):
- categorical_2 through categorical_5 — additional grouping variables. Each one enables cross-tabulation and chi-square independence testing. With two categorical variables you get a 2D heatmap of their association; with five, a comprehensive picture of how all the categories interact.
- numeric_target_2 and numeric_target_3 — additional numeric outcomes. If you have scores for multiple subjects or dimensions (e.g., math score, reading score, overall score), map them here. The module produces correlation matrices and multi-subject comparison charts across groups.

For reliable statistical testing, aim for at least 5 observations per group (configurable via the min_group_size parameter). More is better — 20+ per group gives good statistical power. The significance level defaults to 0.05 and the pass threshold for performance tier grading defaults to 50 (set to 0 to disable grading).
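Before trusting any group comparison, it is worth counting respondents per category yourself. A minimal sketch of that check in base R (the data frame and column name are hypothetical):

```r
# Hypothetical respondent table; the column name is illustrative.
set.seed(7)
survey <- data.frame(
  department = sample(c("Eng", "Sales", "Marketing", "Support"),
                      120, replace = TRUE)
)

counts <- table(survey$department)
counts

# Mirror the min_group_size idea: flag categories too small to test reliably
min_group_size <- 5
names(counts)[counts < min_group_size]   # empty result: every group clears the bar
```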

The module is designed to handle messy survey data gracefully. It automatically handles groups with small sample sizes by flagging them rather than producing misleading statistics. Empty responses and sparse categories are reported in the preprocessing summary.

How to Read the Report

The report opens with the Categorical Analysis Overview and Data Preprocessing slides, showing total respondents, column types detected, and data quality metrics.

The Score Distributions slide shows density curves for each numeric target variable. Overlapping curves suggest similar distributions; separated curves with different centers reveal outcome differences. If you mapped multiple numeric targets, all density curves appear together, making it easy to compare the shape of scores across subjects.
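The overlaid curves on that slide come from kernel density estimation, which you can reproduce with base R's density(). A sketch with two made-up score columns:

```r
# Two hypothetical score columns with different centers.
set.seed(1)
math    <- rnorm(200, 70, 12)
reading <- rnorm(200, 75, 8)

d_math    <- density(math)
d_reading <- density(reading)

plot(d_math, main = "Score distributions", xlab = "Score",
     xlim = range(d_math$x, d_reading$x),
     ylim = range(d_math$y, d_reading$y))
lines(d_reading, lty = 2)
legend("topleft", legend = c("math", "reading"), lty = 1:2)
```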

The Score Correlations heatmap (when you have 2+ numeric targets) shows Pearson correlations between your numeric variables. High correlation (r > 0.7) between two scores suggests they measure similar things. Low or negative correlation suggests they capture distinct dimensions. This is valuable for understanding whether your survey dimensions are truly independent.
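The numbers behind that heatmap are an ordinary Pearson correlation matrix. A minimal sketch with simulated scores, where one pair is built to correlate and a third column is built to be independent:

```r
set.seed(2)
math    <- rnorm(100, 70, 10)
reading <- 0.6 * math + rnorm(100, 0, 8)  # constructed to correlate with math
science <- rnorm(100, 72, 9)              # constructed as a separate dimension

scores <- data.frame(math, reading, science)
round(cor(scores), 2)   # Pearson by default
```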

The Categorical Distributions chart shows the frequency breakdown of each categorical variable — how many respondents fall into each category. Look for severely imbalanced variables (e.g., one category holding 90% of responses), since imbalance reduces the power of group comparisons.

The Categorical Relationships slide shows cross-tabulation heatmaps with chi-square test results. Each pair of categorical variables is tested for independence. A significant chi-square (p < 0.05) means the variables are associated — for example, department and education level are not independent. The heatmap shows where the concentrations are.
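Each of those tests is a chi-square on a cross-tabulation. A sketch with hypothetical department and tenure columns, where an association is deliberately built in:

```r
# Hypothetical department/tenure data with a built-in association:
# Engineering skews toward longer tenure.
set.seed(3)
dept   <- sample(c("Eng", "Sales", "Support"), 300, replace = TRUE)
tenure <- ifelse(
  dept == "Eng",
  sample(c("0-1", "1-3", "3+"), 300, replace = TRUE, prob = c(0.1, 0.2, 0.7)),
  sample(c("0-1", "1-3", "3+"), 300, replace = TRUE)
)

tab <- xtabs(~ dept + tenure)
tab
chisq.test(tab)   # small p-value: dept and tenure are not independent
```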

The Multi-Subject Comparison chart (when multiple numeric targets are mapped) shows mean scores by category for each numeric target, side by side. This is the "at a glance" view of how groups compare across all outcome dimensions simultaneously.
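The table behind that chart is just per-group means for every numeric target, which base R's aggregate() produces directly (the data frame and column names below are illustrative):

```r
# Hypothetical multi-subject data; column names are illustrative.
set.seed(6)
df <- data.frame(
  dept    = rep(c("Eng", "Sales"), each = 50),
  math    = rnorm(100, 70, 10),
  reading = rnorm(100, 75, 8)
)

# Mean of every numeric target per category — the numbers behind the chart
aggregate(cbind(math, reading) ~ dept, data = df, FUN = mean)
```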

The ANOVA Results and Post-Hoc Comparisons slides show the formal statistical tests. ANOVA tells you whether groups differ overall (F-statistic and p-value). Tukey HSD then tells you which specific pairs differ, with adjusted p-values and confidence intervals for each pairwise difference. Focus on pairs where the confidence interval does not cross zero — those are the statistically significant differences.

The Performance Tiers slide (if the pass threshold is configured) grades each group based on their score distribution relative to the threshold, categorizing them as high performers, adequate, or underperforming.

The Executive Summary distills all findings into key takeaways and recommended actions.

When to Use Something Else

If you only have two groups (not three or more), use a t-test instead — it gives the same answer as ANOVA with two groups but provides a simpler, directional result.
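For the two-group case, R's t.test() is a one-liner; note that R defaults to Welch's t-test, which does not assume equal variances (the group names and numbers below are made up):

```r
# Two hypothetical groups; Welch's t-test is R's default.
set.seed(4)
remote <- rnorm(40, 74, 10)
onsite <- rnorm(40, 68, 10)
t.test(remote, onsite)   # signed difference plus a confidence interval
```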

If your numeric outcome is ordinal (like a 1-5 Likert scale) rather than continuous, and you have concerns about normality, consider the Kruskal-Wallis test, a rank-based alternative to ANOVA that does not assume a normal distribution.
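Kruskal-Wallis is also a base R one-liner. A sketch with simulated Likert responses (values and group labels are illustrative):

```r
# Hypothetical 1-5 Likert responses across three groups.
set.seed(5)
likert <- sample(1:5, 90, replace = TRUE)
group  <- factor(rep(c("A", "B", "C"), each = 30))
kruskal.test(likert ~ group)   # rank-based; no normality assumption
```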

If you want to test the association between two categorical variables only (no numeric outcomes), a standalone chi-square test provides deeper analysis with residual plots and effect sizes.

If you want to compare groups while controlling for a confounding continuous variable (e.g., compare department scores while controlling for tenure), use ANCOVA instead.

If your goal is to predict which group a new observation belongs to based on its features, you need a classification tool like logistic regression or Naive Bayes.

The R Code Behind the Analysis

Every report includes the exact R code used to produce the results — reproducible, auditable, and citable. This is not AI-generated code that changes every run. The same data produces the same analysis every time.

The analysis uses aov() for ANOVA F-tests, TukeyHSD() for pairwise post-hoc comparisons, and chisq.test() for chi-square independence tests — all from base R. Frequency analysis uses table() and prop.table(). Score distributions use kernel density estimation via density(). Correlation matrices use cor() with Pearson coefficients. Cross-tabulations use xtabs(). These are the same functions used in academic research and peer-reviewed publications. Every step is visible in the code tab of your report, so you or a statistician can verify exactly what was done.