Analysis Overview
Analysis overview and configuration
| Parameter | Value |
|---|---|
| significance_level | 0.05 |
| posthoc_method | tukey |
| min_group_size | 5 |
Purpose
This one-way ANOVA analysis tests whether math test scores differ significantly across five student demographic groups. The analysis examines 1,000 observations with no data loss, establishing whether group membership explains meaningful variation in student performance outcomes.
Key Findings
- F-statistic (14.59): Highly significant (p < 0.001); differences this large would be very unlikely if all group means were equal
- Effect Size (η² = 0.055): Groups explain approximately 5.5% of score variance; statistically significant but practically small
- Equal Variances Supported: Levene's test (p = 0.67) shows no evidence against the ANOVA assumption of homogeneity across groups
- Normality Violated: Shapiro-Wilk test (p < 0.001) indicates non-normal residuals, though the ANOVA F-test is generally robust to this with a sample as large as n = 1,000
- Post-hoc Comparisons: 6 of 10 pairwise comparisons significant; Group E (mean=73.82) differs substantially from Groups A-D (means 61.63–67.36)
Interpretation
Math scores demonstrate statistically significant differences across demographic groups, with Group E substantially outperforming others by 6–12 points. However, the small effect size indicates group membership alone
Data preprocessing and column mapping
Purpose
This section documents the data preprocessing pipeline for the one-way ANOVA analysis comparing five groups. It shows that no data cleaning or filtering was applied, meaning all 1,000 observations proceeded directly to statistical testing. Understanding preprocessing decisions is critical because they affect the validity of group comparisons and the reliability of the significant findings (p < 0.001).
Key Findings
- Retention Rate: 100% (1,000 of 1,000 rows retained) - No observations were removed during preprocessing, preserving the full dataset for analysis
- Rows Removed: 0 - No filtering, outlier removal, or missing value imputation occurred
- Data Quality: Complete dataset with no documented transformations applied before ANOVA testing
- Train/Test Split: Not applicable - This is a descriptive/inferential analysis rather than a predictive modeling task
Interpretation
The 100% retention rate shows that no rows were dropped, which is consistent with a complete dataset. However, because no preprocessing steps were applied, potential outliers or data quality issues were not explicitly addressed before statistical testing. Given that the ANOVA result is statistically significant (F=14.59, p<0.001) but the Q-Q plot and Shapiro-Wilk test (p<0.001) indicate non-normality, the lack of preprocessing decisions, such as transformation or outlier handling,
Executive Summary
Executive summary of ANOVA results and recommendations
| Finding | Value |
|---|---|
| ANOVA Result | Significant Difference Found (p < 0.001) |
| Effect Size | eta2=0.0554 (Small effect) |
| Post-Hoc | 6/10 pairs significant (Tukey HSD) |
| Homogeneity | Levene p=0.6697 (OK) |
| Normality | Shapiro p=0.0001 (Check) |
Effect Size: Eta-squared = 0.0554 (Small effect — group membership explains 5.5% of variance)
Post-Hoc (Tukey HSD): 6 of 10 pairwise comparisons were significant.
Recommendation: Focus on the groups identified as significantly different in the Tukey HSD results for targeted intervention.
Purpose
This analysis tested whether meaningful differences exist across five groups using one-way ANOVA with 1,000 observations. The results determine whether group membership is a statistically significant predictor of the measured outcome and quantify the practical magnitude of those differences.
Key Findings
- F-Statistic: F(4, 995) = 14.594 with p < 0.001; highly statistically significant, so the hypothesis of equal group means is rejected
- Effect Size (η²): 0.0554 — Small effect; group membership explains only 5.5% of outcome variance
- Significant Pairwise Comparisons: 6 of 10 pairs differ significantly; Group E shows the largest mean (73.82) versus Group A (61.63), a 12.19-point difference
- Variance Homogeneity: Levene's test (p=0.67) shows no evidence of unequal variances across groups, supporting this ANOVA assumption
- Normality Caveat: Shapiro-Wilk test (p<0.001) indicates non-normal distribution, though ANOVA is robust with large samples
Interpretation
Statistical significance is confirmed, but the small effect size indicates that while group differences are real and reproducible, they account
ANOVA Results
One-Way ANOVA test results with F-statistic, degrees of freedom, and effect sizes
| Source | df | SS | MS | F_value | p_value | eta_squared | omega_squared |
|---|---|---|---|---|---|---|---|
| Between Groups | 4 | 1.273e+04 | 3182 | 14.59 | < 0.001 | 0.0554 | 0.0516 |
| Within Groups | 995 | 2.17e+05 | 218.1 | | | | |
| Total | 999 | 2.297e+05 | | | | | |
Purpose
This section presents the one-way ANOVA F-test results, which determine whether statistically significant differences exist among the five groups. The test evaluates whether observed group differences are unlikely to occur by chance, serving as the foundation for subsequent pairwise comparisons and effect size interpretation.
Key Findings
- F-Statistic: F(4,995) = 14.594 — Substantially larger than 1, indicating between-group variance exceeds within-group variance
- P-Value: p < 0.001; highly significant: if all five group means were equal, an F-statistic at least this large would occur less than 0.1% of the time
- Eta-Squared (η²): 0.0554 — Groups explain only 5.5% of total variance; remaining 94.5% attributable to within-group variation
- Omega-Squared: 0.0516; a less biased estimate that corrects the upward bias of eta-squared, and it likewise indicates a small practical effect
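The figures in the ANOVA table can be cross-checked by hand. The sketch below recomputes the mean squares, F, eta-squared, omega-squared, and Cohen's f from the rounded sums of squares reported in the table. Variable names are illustrative, and the omega-squared and Cohen's f formulas are the standard textbook forms; it is an assumption that the report's software used the same ones.

```python
import math

# Values copied from the ANOVA table (rounded as reported there).
ss_between, df_between = 1.273e4, 4
ss_within, df_within = 2.17e5, 995
ss_total = 2.297e5

# Mean squares and the F ratio.
ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_value = ms_between / ms_within

# Effect sizes (textbook formulas, assumed to match the report).
eta_squared = ss_between / ss_total
omega_squared = (ss_between - df_between * ms_within) / (ss_total + ms_within)
cohens_f = math.sqrt(eta_squared / (1 - eta_squared))

print(f"F({df_between}, {df_within}) = {f_value:.2f}")
print(f"eta^2 = {eta_squared:.4f}, omega^2 = {omega_squared:.4f}, f = {cohens_f:.3f}")
```

Because the inputs are rounded to about four significant figures, the recomputed F can differ from the reported 14.594 in the third decimal place.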
Interpretation
The ANOVA confirms statistically significant differences exist across the five groups, with high confidence. However, the small effect size indicates these differences, while real and unlikely due to chance, account for minimal variance in the outcome. This distinction is critical: statistical significance reflects sample size and precision, whereas effect size reflects practical magnitude. The large sample (n=1000) enables detection of small but
Group Distributions
Distribution of the math score variable within each group shown as box plots
Purpose
This section visualizes the distribution of math scores across five groups, revealing the spread, central tendency, and variability within each group. Box plots enable quick visual comparison of group characteristics and identify potential outliers, providing context for the formal ANOVA statistical tests that follow.
Key Findings
- Overall Mean Score: 66.09 across all 1,000 observations with minimal skewness (0.02), indicating a symmetric distribution
- Group C Dominance: Represents 31.9% of the sample (319 observations), while Group A is smallest with 89 observations
- Score Range: Values span 0–100 with an overall standard deviation of 15.16, showing substantial spread in individual scores
- Visual Separation: Group E's box sits visibly higher than those of Groups A–C (median 74.5 versus 61–65), suggesting a meaningful difference in central tendency
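The overall mean and the group shares follow directly from the per-group sizes and means. A minimal sketch, using the rounded values from the Group Statistics table (the dictionary layout is illustrative only):

```python
# group: (N, mean), copied from the Group Statistics table
groups = {
    "group A": (89, 61.63),
    "group B": (190, 63.45),
    "group C": (319, 64.46),
    "group D": (262, 67.36),
    "group E": (140, 73.82),
}

n_total = sum(n for n, _ in groups.values())

# The grand mean is the size-weighted average of the group means.
grand_mean = sum(n * mean for n, mean in groups.values()) / n_total

# Share of the sample contributed by each group.
shares = {name: n / n_total for name, (n, _) in groups.items()}

print(f"N = {n_total}, grand mean = {grand_mean:.2f}")
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```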
Interpretation
The box plot distributions reveal that Group E exhibits notably higher median scores compared to Groups A, B, and C, while Group D occupies an intermediate position. The relatively consistent within-group variability (similar box heights) across groups supports the ANOVA assumption of equal variances, confirmed by Levene's test (p=0.67). This visual pattern aligns with the significant ANOVA result (F=14.594, p<0.001), indicating that group membership explains meaningful variation in
Group Means
Group means with 95% confidence interval error bars
Purpose
This section visualizes the central tendency and precision of each group's measurements through means and 95% confidence intervals. It provides a visual complement to the formal ANOVA test, allowing direct comparison of group locations and uncertainty ranges across the five groups.
Key Findings
- Group A Mean: 61.63 (95% CI: 58.57–64.69) — lowest mean value with widest confidence interval due to smallest sample size (n=89)
- Group E Mean: 73.82 (95% CI: 71.23–76.42) — highest mean value with moderate precision (n=140)
- Mean Range: 12.19-point spread from Group A to Group E indicates substantial between-group variation
- Confidence Interval Overlap: The intervals for Groups A, B, and C overlap one another, and Group D's interval overlaps only Group C's; Group E's interval (71.23–76.42) overlaps none of the others, suggesting clearer separation at the upper end
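The overlap pattern can be checked mechanically from the reported interval limits. The sketch below uses the CI bounds from the Group Statistics table; the helper name `overlaps` is hypothetical. Overlap of individual 95% intervals is only a rough visual guide and is not equivalent to a formal pairwise test.

```python
from itertools import combinations

# group: (CI lower, CI upper), copied from the Group Statistics table
ci = {
    "A": (58.57, 64.69),
    "B": (61.24, 65.67),
    "C": (62.83, 66.10),
    "D": (65.69, 69.04),
    "E": (71.23, 76.42),
}

def overlaps(a, b):
    """Return True if two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

# Evaluate every unordered pair of groups.
overlap = {(g1, g2): overlaps(ci[g1], ci[g2]) for g1, g2 in combinations(ci, 2)}

for (g1, g2), result in overlap.items():
    print(f"{g1} vs {g2}: {'overlap' if result else 'no overlap'}")
```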
Interpretation
The monotonic increase in means from Group A through Group E (61.63 → 73.82) aligns with the significant ANOVA result (F=14.594, p<0.001). The confidence intervals narrow for larger groups (C and D), reflecting greater precision. Non-overlapping intervals between Groups A/B and Groups D/E provide visual evidence of meaningful differences, though formal post-h
Tukey HSD Comparisons
Tukey HSD confidence intervals for all pairwise group comparisons
Purpose
This section identifies which specific group pairs differ significantly from one another following the overall ANOVA result. The Tukey HSD method controls the family-wise error rate across all 10 comparisons, so the 6 significant pairs (60%) are unlikely to be artifacts of testing many pairs simultaneously. This bridges the omnibus ANOVA finding to actionable group-level insights.
Key Findings
- Significant Pairs: 6 of 10 comparisons (60%) show confidence intervals excluding zero, indicating statistically significant mean differences
- Largest Difference: Group E vs. Group A (12.19 points, p<0.001) represents the most pronounced separation
- Group E Pattern: Consistently significant against all other groups (A, B, C, D), suggesting it occupies a distinct performance tier
- Smallest Differences: Groups A–C show minimal separation (1.01–2.83 points), with confidence intervals spanning zero
Interpretation
Group E stands apart with substantially higher values (mean=73.82) than the lower-performing cluster (Groups A–C, means 61.6–64.5). Cohen's f=0.242 and eta-squared (0.055) both fall just below Cohen's conventional medium benchmarks (f=0.25, η²=0.06), indicating that while differences are statistically reliable, group membership explains only ~5.
Group Statistics
Descriptive statistics for each group — mean, SD, SE, confidence intervals
| Group | N | Mean | Median | SD | SE | CI_Lower | CI_Upper | Min | Max |
|---|---|---|---|---|---|---|---|---|---|
| group A | 89 | 61.63 | 61 | 14.52 | 1.539 | 58.57 | 64.69 | 28 | 100 |
| group B | 190 | 63.45 | 63 | 15.47 | 1.122 | 61.24 | 65.67 | 8 | 97 |
| group C | 319 | 64.46 | 65 | 14.85 | 0.832 | 62.83 | 66.1 | 0 | 98 |
| group D | 262 | 67.36 | 69 | 13.77 | 0.851 | 65.69 | 69.04 | 26 | 100 |
| group E | 140 | 73.82 | 74.5 | 15.53 | 1.313 | 71.23 | 76.42 | 30 | 100 |
Purpose
This section provides foundational descriptive statistics for each of the five groups, enabling direct comparison of central tendencies, variability, and precision. These metrics establish the baseline distributions that underpin the ANOVA test and post-hoc comparisons, allowing users to understand both the magnitude and reliability of observed group differences.
Key Findings
- Mean Range (61.63–73.82): Group E shows the highest mean (73.82), while Group A is lowest (61.63)—a 12.19-point spread that aligns with the significant ANOVA result (p < 0.001)
- Sample Size Imbalance: Group C dominates with 319 observations; Group A has only 89, affecting standard error precision
- Consistent Variability: Standard deviations cluster tightly (13.77–15.53), indicating homogeneous within-group spread across all groups
- Confidence Interval Width: Narrower CIs for larger groups (Group C: ±1.64) versus smaller groups (Group A: ±3.06) reflect sampling precision differences
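The SE and CI columns are linked by simple arithmetic: SE = SD / sqrt(N), and the CI half-width is a critical multiplier times SE. The sketch below recomputes each SE from the table and backs out the multiplier implied by the reported limits. Names are illustrative, and it is an assumption (not stated in the report) that the intervals are t-based on each group's own SD.

```python
import math

# group: (N, SD, CI lower, CI upper), copied from the Group Statistics table
rows = {
    "group A": (89, 14.52, 58.57, 64.69),
    "group B": (190, 15.47, 61.24, 65.67),
    "group C": (319, 14.85, 62.83, 66.10),
    "group D": (262, 13.77, 65.69, 69.04),
    "group E": (140, 15.53, 71.23, 76.42),
}

se = {}
multiplier = {}
for name, (n, sd_value, lower, upper) in rows.items():
    # Standard error of the mean.
    se[name] = sd_value / math.sqrt(n)
    # Critical multiplier implied by the reported CI half-width.
    multiplier[name] = (upper - lower) / 2 / se[name]

for name in rows:
    print(f"{name}: SE = {se[name]:.3f}, implied multiplier = {multiplier[name]:.3f}")
```

The implied multipliers sit slightly above the normal-theory value of 1.96 and are largest for the smallest group, which is what t-based intervals would produce.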
Interpretation
The progressive increase in means from Group A through Group E (61.63 → 73.82) demonstrates a clear directional trend. Comparable standard deviations (~14.8 average) satisfy the homogeneity
Post-Hoc Results
Tukey HSD pairwise comparison table with adjusted p-values
| Comparison | Mean_Diff | Lower_95CI | Upper_95CI | p_adj | Significant |
|---|---|---|---|---|---|
| group B-group A | 1.823 | -3.36 | 7.007 | 0.8724 | No |
| group C-group A | 2.835 | -2.003 | 7.672 | 0.4968 | No |
| group D-group A | 5.733 | 0.782 | 10.68 | 0.0138 | Yes |
| group E-group A | 12.19 | 6.722 | 17.66 | < 0.0001 | Yes |
| group C-group B | 1.011 | -2.687 | 4.709 | 0.9452 | No |
| group D-group B | 3.91 | 0.065 | 7.755 | 0.044 | Yes |
| group E-group B | 10.37 | 5.874 | 14.86 | < 0.0001 | Yes |
| group D-group C | 2.899 | -0.466 | 6.263 | 0.129 | No |
| group E-group C | 9.357 | 5.266 | 13.45 | < 0.0001 | Yes |
| group E-group D | 6.459 | 2.234 | 10.68 | 0.0003 | Yes |
Purpose
This section identifies which specific group pairs differ significantly after controlling for multiple comparisons. The Tukey HSD procedure protects against inflated Type I error rates across all 10 pairwise comparisons, which makes it the appropriate follow-up for locating where the group differences detected by the overall ANOVA lie.
Key Findings
- Significant Comparisons: 6 of 10 pairs (60%) show statistically significant differences at α=0.05; Group E differs from all four other groups, and Group D differs from Groups A and B
- Largest Mean Difference: Group E vs. Group A (12.19 points, p<0.001) with confidence interval entirely above zero
- Smallest Significant Difference: Group D vs. Group B (3.91 points, p=0.044), with a confidence interval lower bound (0.065) only just above zero
- Non-Significant Pairs: 4 comparisons (40%) lack sufficient evidence of difference: the three pairs among Groups A, B, and C, plus Group D vs. Group C
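The interval widths in the Tukey table can be related back to the ANOVA output. Under the Tukey-Kramer form for unequal group sizes (an assumption; the report does not state which variant was used), the half-width is q × sqrt(MS_within / 2 × (1/n_i + 1/n_j)). The sketch below backs out the critical value q implied by each reported interval and flags pairs whose interval excludes zero; all names are illustrative.

```python
import math

ms_within = 218.1  # within-groups mean square from the ANOVA table
n = {"A": 89, "B": 190, "C": 319, "D": 262, "E": 140}

# (group_1, group_2): (lower, upper) 95% limits from the Tukey HSD table
intervals = {
    ("B", "A"): (-3.36, 7.007),
    ("C", "A"): (-2.003, 7.672),
    ("D", "A"): (0.782, 10.68),
    ("E", "A"): (6.722, 17.66),
    ("C", "B"): (-2.687, 4.709),
    ("D", "B"): (0.065, 7.755),
    ("E", "B"): (5.874, 14.86),
    ("D", "C"): (-0.466, 6.263),
    ("E", "C"): (5.266, 13.45),
    ("E", "D"): (2.234, 10.68),
}

implied_q = {}
significant = {}
for (g1, g2), (lower, upper) in intervals.items():
    # Tukey-Kramer standard error term for unequal group sizes (assumed form).
    se = math.sqrt(ms_within / 2 * (1 / n[g1] + 1 / n[g2]))
    implied_q[(g1, g2)] = (upper - lower) / 2 / se
    # An interval that excludes zero corresponds to a significant pair.
    significant[(g1, g2)] = lower > 0 or upper < 0

for pair, q in implied_q.items():
    label = "significant" if significant[pair] else "not significant"
    print(f"{pair[0]} - {pair[1]}: implied q = {q:.3f} ({label})")
```

Every pair yields nearly the same implied q, close to the tabulated studentized-range critical value of roughly 3.86 for five groups and large error degrees of freedom, which supports reading the table as Tukey-Kramer intervals.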
Interpretation
Group E emerges as distinctly elevated, with mean differences ranging 6.46–12.19 points versus other groups. Groups A, B, and C form a lower cluster with minimal pairwise separation (differences ≤2.84). Group D occupies an intermediate position, differing significantly from Groups A, B, and E but not C. These patterns align
Assumption Checks
Residual Q-Q plot and ANOVA assumption checks (Levene's test, Shapiro-Wilk)
Purpose
This section validates whether the data meets critical assumptions required for ANOVA validity. The equal variances and normality assumptions are foundational to ANOVA's reliability; violations can compromise the validity of the F-test and subsequent pairwise comparisons. These diagnostics determine whether the parametric ANOVA results are trustworthy or if non-parametric alternatives should be considered.
Key Findings
- Levene's Test (F=0.590, p=0.6697): Equal variances assumption is MET. Homogeneity of variance across the five groups is satisfied, supporting ANOVA's validity on this dimension.
- Shapiro-Wilk Test (W=0.9929, p<0.0001): Normality assumption is VIOLATED. The residuals deviate significantly from normality, but W is close to 1, so the departure is small in magnitude; with n=1,000 the test has enough power to flag even minor deviations.
- Q-Q Plot Pattern: Systematic deviations visible in the lower and upper tails (sample quantiles compress relative to theoretical values), confirming non-normal residual distribution.
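As an informal complement to Levene's test, the spread of the group standard deviations can be summarized as a ratio. The sketch below uses the SD column of the Group Statistics table. The threshold mentioned afterwards is a common rule of thumb, not a criterion taken from this report.

```python
# Group standard deviations, copied from the Group Statistics table
sd = {
    "group A": 14.52,
    "group B": 15.47,
    "group C": 14.85,
    "group D": 13.77,
    "group E": 15.53,
}

# Ratio of the largest to the smallest standard deviation, and of variances.
sd_ratio = max(sd.values()) / min(sd.values())
variance_ratio = sd_ratio ** 2

print(f"largest/smallest SD ratio = {sd_ratio:.2f}")
print(f"largest/smallest variance ratio = {variance_ratio:.2f}")
```

A frequently cited heuristic treats a largest-to-smallest SD ratio below about 2 as unproblematic for ANOVA; the ratio here is well under that, in line with the non-significant Levene result.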
Interpretation
The analysis presents a mixed assumption picture. While equal variances across groups strengthen confidence in the ANOVA F-statistic (F=14.594, p<0.0001), the normality