Analysis overview and configuration
| Parameter | Value |
|---|---|
| significance_level | 0.05 |
| posthoc_method | tukey |
| min_group_size | 5 |
This one-way ANOVA analysis tests whether math test scores differ significantly across five student demographic groups. The analysis examines 1,000 observations with no data loss, establishing whether group membership explains meaningful variation in student performance outcomes.
Math scores differ significantly across demographic groups, with Group E outperforming the other groups by roughly 6–12 points on average. However, the small effect size (eta-squared = 0.0554) indicates that group membership alone explains only about 5.5% of the variance in scores.
Data preprocessing and column mapping
| Metric | Value |
|---|---|
| Initial Rows | 1,000 |
| Final Rows | 1,000 |
| Rows Removed | 0 |
| Retention Rate | 100% |
This section documents the data preprocessing pipeline for the one-way ANOVA analysis comparing five groups. It shows that no data cleaning or filtering was applied, meaning all 1,000 observations proceeded directly to statistical testing. Understanding preprocessing decisions is critical because they affect the validity of group comparisons and the reliability of the significant findings (p < 0.001).
The 100% retention rate shows that no rows were dropped for missing values or other quality problems. However, because no preprocessing steps were applied, potential outliers were not explicitly addressed before statistical testing. The ANOVA result is statistically significant (F = 14.59, p < 0.001), but the Shapiro-Wilk test indicates non-normal residuals (p = 0.0001), so the absence of preprocessing decisions such as transformation or outlier handling should be kept in mind when interpreting the results.
| Finding | Value |
|---|---|
| ANOVA Result | Significant Difference Found (p < 0.001) |
| Effect Size | eta2=0.0554 (Small effect) |
| Post-Hoc | 6/10 pairs significant (Tukey HSD) |
| Homogeneity | Levene p=0.6697 (OK) |
| Normality | Shapiro p=0.0001 (Check) |
This analysis tested whether meaningful differences exist across five groups using one-way ANOVA with 1,000 observations. The results determine whether group membership is a statistically significant predictor of the measured outcome and quantify the practical magnitude of those differences.
Statistical significance is confirmed, but the small effect size indicates that while group differences are real and reproducible, they account for only about 5.5% of the variance in the outcome (eta-squared = 0.0554).
One-Way ANOVA test results with F-statistic, degrees of freedom, and effect sizes
| Source | df | SS | MS | F_value | p_value | eta_squared | omega_squared |
|---|---|---|---|---|---|---|---|
| Between Groups | 4 | 1.273e+04 | 3182 | 14.59 | <0.001 | 0.0554 | 0.0516 |
| Within Groups | 995 | 2.17e+05 | 218.1 | | | | |
| Total | 999 | 2.297e+05 | | | | | |
This section presents the one-way ANOVA F-test results, which determine whether statistically significant differences exist among the five groups. The test evaluates whether observed group differences are unlikely to occur by chance, serving as the foundation for subsequent pairwise comparisons and effect size interpretation.
The ANOVA confirms that statistically significant differences exist across the five groups (F = 14.59, p < 0.001). However, the small effect size indicates these differences, while real and unlikely due to chance, account for minimal variance in the outcome. This distinction is critical: statistical significance reflects sample size and precision, whereas effect size reflects practical magnitude. The large sample (n=1000) enables detection of small but statistically reliable differences.
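The effect sizes in the ANOVA table follow directly from the sums of squares and degrees of freedom. The sketch below recomputes them from the rounded table values, so small rounding differences are expected; the variable names are illustrative and not taken from the report's own code.

```python
import math

# Rounded values copied from the ANOVA table above
ss_between, df_between = 1.273e4, 4
ss_within, df_within = 2.17e5, 995

# Mean squares and the F ratio
ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_value = ms_between / ms_within

# Effect sizes: share of total variation attributable to group membership
ss_total = ss_between + ss_within
eta_squared = ss_between / ss_total
# Omega-squared corrects eta-squared for its upward bias
omega_squared = (ss_between - df_between * ms_within) / (ss_total + ms_within)
# Cohen's f expressed in terms of eta-squared
cohens_f = math.sqrt(eta_squared / (1 - eta_squared))

print(f"F = {f_value:.2f}")
print(f"eta^2 = {eta_squared:.4f}, omega^2 = {omega_squared:.4f}")
print(f"Cohen's f = {cohens_f:.3f}")
```

Because eta-squared divides by the total sum of squares, it stays small here even though the F ratio is large: the F ratio grows with the within-group degrees of freedom, while eta-squared does not.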
Distribution of the math score variable within each group shown as box plots
This section visualizes the distribution of math scores across five groups, revealing the spread, central tendency, and variability within each group. Box plots enable quick visual comparison of group characteristics and identify potential outliers, providing context for the formal ANOVA statistical tests that follow.
The box plots show that Group E has a notably higher median score than Groups A, B, and C, while Group D occupies an intermediate position. The relatively consistent within-group variability (similar box heights) across groups is consistent with the ANOVA assumption of equal variances, as is Levene's test (p=0.67). This visual pattern aligns with the significant ANOVA result (F=14.594, p<0.001), indicating that group membership explains meaningful variation in math scores.
Group means with 95% confidence interval error bars
This section visualizes the central tendency and precision of each group's measurements through means and 95% confidence intervals. It provides a visual complement to the formal ANOVA test, allowing direct comparison of group locations and uncertainty ranges across the five groups.
The monotonic increase in means from Group A through Group E (61.63 → 73.82) aligns with the significant ANOVA result (F=14.594, p<0.001). The confidence intervals narrow for larger groups (C and D), reflecting greater precision. Non-overlapping intervals between Groups A/B and Groups D/E provide visual evidence of meaningful differences, though formal post-hoc tests are needed to confirm which pairs differ.
Tukey HSD confidence intervals for all pairwise group comparisons
This section identifies which specific group pairs differ significantly from one another following the overall ANOVA result. The Tukey HSD method controls for multiple comparisons, ensuring that the 60% significance rate (6 of 10 pairs) reflects genuine differences rather than false positives from testing many pairs simultaneously. This bridges the omnibus ANOVA finding to actionable group-level insights.
Group E emerges as a clear outlier with substantially higher values (mean=73.82) compared to the lower-performing cluster (Groups A–C, means 61.6–64.5). The moderate effect size (Cohen's f=0.242) and small eta-squared (0.055) indicate that while differences are statistically reliable, group membership explains only ~5.5% of the variance in scores.
Descriptive statistics for each group — mean, SD, SE, confidence intervals
| Group | N | Mean | Median | SD | SE | CI_Lower | CI_Upper | Min | Max |
|---|---|---|---|---|---|---|---|---|---|
| group A | 89 | 61.63 | 61 | 14.52 | 1.539 | 58.57 | 64.69 | 28 | 100 |
| group B | 190 | 63.45 | 63 | 15.47 | 1.122 | 61.24 | 65.67 | 8 | 97 |
| group C | 319 | 64.46 | 65 | 14.85 | 0.832 | 62.83 | 66.1 | 0 | 98 |
| group D | 262 | 67.36 | 69 | 13.77 | 0.851 | 65.69 | 69.04 | 26 | 100 |
| group E | 140 | 73.82 | 74.5 | 15.53 | 1.313 | 71.23 | 76.42 | 30 | 100 |
This section provides foundational descriptive statistics for each of the five groups, enabling direct comparison of central tendencies, variability, and precision. These metrics establish the baseline distributions that underpin the ANOVA test and post-hoc comparisons, allowing users to understand both the magnitude and reliability of observed group differences.
The progressive increase in means from Group A through Group E (61.63 → 73.82) demonstrates a clear directional trend. Comparable standard deviations (~14.8 average) are consistent with the homogeneity-of-variance assumption (Levene p = 0.6697).
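The descriptive table contains enough information to rebuild the ANOVA quantities without the raw data. The following sketch, which uses only the N, Mean, and SD columns and illustrative variable names, recomputes the standard errors, the between-group sum of squares, and the pooled within-group mean square. Results differ slightly from the reported values because the inputs are rounded.

```python
import math

# (N, mean, SD) per group, copied from the descriptive statistics table
groups = {
    "A": (89, 61.63, 14.52),
    "B": (190, 63.45, 15.47),
    "C": (319, 64.46, 14.85),
    "D": (262, 67.36, 13.77),
    "E": (140, 73.82, 15.53),
}

n_total = sum(n for n, _, _ in groups.values())
k = len(groups)

# Standard error of each group mean: SD / sqrt(N)
standard_errors = {g: sd / math.sqrt(n) for g, (n, _, sd) in groups.items()}

# Grand mean is the N-weighted average of the group means
grand_mean = sum(n * m for n, m, _ in groups.values()) / n_total

# Between-group SS: weighted squared distance of each group mean from the grand mean
ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups.values())

# Within-group SS: pooled from the per-group sample variances
ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups.values())
ms_within = ss_within / (n_total - k)

f_value = (ss_between / (k - 1)) / ms_within

print(f"grand mean = {grand_mean:.2f}")
print(f"SS_between = {ss_between:.0f}, MS_within = {ms_within:.1f}, F = {f_value:.2f}")
```

The larger groups (C and D) get the smallest standard errors because the divisor grows with the square root of N, which is why their confidence intervals are the narrowest.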
Tukey HSD pairwise comparison table with adjusted p-values
| Comparison | Mean_Diff | Lower_95CI | Upper_95CI | p_adj | Significant |
|---|---|---|---|---|---|
| group B-group A | 1.823 | -3.36 | 7.007 | 0.8724 | No |
| group C-group A | 2.835 | -2.003 | 7.672 | 0.4968 | No |
| group D-group A | 5.733 | 0.782 | 10.68 | 0.0138 | Yes |
| group E-group A | 12.19 | 6.722 | 17.66 | <0.0001 | Yes |
| group C-group B | 1.011 | -2.687 | 4.709 | 0.9452 | No |
| group D-group B | 3.91 | 0.065 | 7.755 | 0.044 | Yes |
| group E-group B | 10.37 | 5.874 | 14.86 | <0.0001 | Yes |
| group D-group C | 2.899 | -0.466 | 6.263 | 0.129 | No |
| group E-group C | 9.357 | 5.266 | 13.45 | <0.0001 | Yes |
| group E-group D | 6.459 | 2.234 | 10.68 | 0.0003 | Yes |
This section identifies which specific group pairs differ significantly after controlling for multiple comparisons. The Tukey HSD procedure protects against inflated Type I error rates across all 10 pairwise comparisons, making it the appropriate follow-up for locating which groups drive the overall ANOVA finding.
Group E emerges as distinctly elevated, with mean differences of 6.46–12.19 points versus the other groups. Groups A, B, and C form a lower cluster with minimal pairwise separation (differences ≤2.84). Group D occupies an intermediate position, differing significantly from Groups A, B, and E but not C. These patterns align with the significant omnibus ANOVA result.
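One way to read the Tukey table is that a pair is significant at the 0.05 level exactly when its 95% interval excludes zero. The sketch below applies that rule to the intervals copied from the table and checks it against the adjusted p-values; the data structure and names are illustrative only.

```python
ALPHA = 0.05  # significance_level from the configuration table

# (comparison, lower 95% bound, upper 95% bound, adjusted p) from the Tukey table.
# Adjusted p-values printed as 0 in the report are stored here as 0.0.
tukey_rows = [
    ("B-A", -3.360, 7.007, 0.8724),
    ("C-A", -2.003, 7.672, 0.4968),
    ("D-A", 0.782, 10.68, 0.0138),
    ("E-A", 6.722, 17.66, 0.0),
    ("C-B", -2.687, 4.709, 0.9452),
    ("D-B", 0.065, 7.755, 0.044),
    ("E-B", 5.874, 14.86, 0.0),
    ("D-C", -0.466, 6.263, 0.129),
    ("E-C", 5.266, 13.45, 0.0),
    ("E-D", 2.234, 10.68, 0.0003),
]

# A pair is flagged when zero lies outside its simultaneous confidence interval
by_interval = [name for name, lo, hi, _ in tukey_rows if lo > 0 or hi < 0]
# The same pairs should be flagged by the adjusted p-values
by_p_value = [name for name, _, _, p in tukey_rows if p < ALPHA]

print(f"{len(by_interval)}/{len(tukey_rows)} pairs significant: {by_interval}")
print("interval rule agrees with p-values:", by_interval == by_p_value)
```

The D-B comparison is the closest call: its lower bound of 0.065 sits just above zero, matching its adjusted p-value of 0.044.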
Residual Q-Q plot and ANOVA assumption checks (Levene's test, Shapiro-Wilk)
This section validates whether the data meets critical assumptions required for ANOVA validity. The equal variances and normality assumptions are foundational to ANOVA's reliability; violations can compromise the validity of the F-test and subsequent pairwise comparisons. These diagnostics determine whether the parametric ANOVA results are trustworthy or if non-parametric alternatives should be considered.
The analysis presents a mixed assumption picture. While equal variances across groups (Levene p = 0.6697) strengthen confidence in the ANOVA F-statistic (F=14.594, p<0.0001), the normality assumption is not supported (Shapiro-Wilk p = 0.0001).
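To make the equal-variance check concrete, the sketch below computes a Levene-type W statistic using the median-centred (Brown-Forsythe) variant. This is an assumption: the report does not state which centring it used, the function name is hypothetical, and the toy data are invented for illustration rather than taken from the 1,000 observations. The statistic is an ANOVA F ratio computed on absolute deviations from each group's median; its p-value would come from an F distribution with (k-1, N-k) degrees of freedom, which this stdlib-only sketch does not compute.

```python
from statistics import median


def brown_forsythe_w(groups):
    """Return the median-centred Levene (Brown-Forsythe) W statistic."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)

    # Absolute deviation of each observation from its own group's median
    z = []
    for g in groups:
        centre = median(g)
        z.append([abs(y - centre) for y in g])

    z_means = [sum(zi) / len(zi) for zi in z]
    z_grand = sum(sum(zi) for zi in z) / n_total

    # One-way ANOVA on the deviations: between-group vs within-group variation
    between = sum(len(zi) * (zm - z_grand) ** 2 for zi, zm in zip(z, z_means))
    within = sum((v - zm) ** 2 for zi, zm in zip(z, z_means) for v in zi)
    return (n_total - k) / (k - 1) * between / within


# Hypothetical toy data, not from the report
similar_spread = [[1, 2, 3], [11, 12, 13]]
different_spread = [[1, 2, 3], [0, 10, 20]]

w_similar = brown_forsythe_w(similar_spread)
w_different = brown_forsythe_w(different_spread)
print(f"W (similar spread)   = {w_similar:.3f}")
print(f"W (different spread) = {w_different:.3f}")
```

Groups with the same spread give a W near zero regardless of how far apart their means are, which is why a large Levene p-value can coexist with a highly significant ANOVA.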