Overview

Analysis Overview

Analysis overview and configuration

Analysis Type: ANOVA
Company: Educational Research Institute
Objective: Test whether math scores differ significantly across student ethnic groups using one-way ANOVA
Analysis Date: 2026-03-13
Processing Id: test_1773386485
Total Observations: 1000

Parameter            Value
significance_level   0.05
posthoc_method       tukey
min_group_size       5
Interpretation

Purpose

This one-way ANOVA analysis tests whether math test scores differ significantly across five student demographic groups. The analysis examines 1,000 observations with no data loss, establishing whether group membership explains meaningful variation in student performance outcomes.

Key Findings

  • F-statistic (14.59): Highly significant (p < 0.001), so differences this large would be very unlikely if all group means were equal
  • Effect Size (η² = 0.055): Groups explain approximately 5.5% of score variance; statistically significant but practically small
  • Equal Variances Supported: Levene's test (p = 0.67) shows no evidence against homogeneity of variance across groups
  • Normality Violated: Shapiro-Wilk test (p < 0.001) indicates non-normal residuals, though the F-test is robust to this with a large sample (n = 1,000)
  • Post-hoc Comparisons: 6 of 10 pairwise comparisons significant; Group E (mean = 73.82) differs significantly from each of Groups A–D (means 61.63–67.36)

Interpretation

Math scores demonstrate statistically significant differences across demographic groups, with Group E substantially outperforming others by 6–12 points. However, the small effect size indicates group membership alone

Data preprocessing and column mapping

Initial Rows: 1000
Final Rows: 1000
Rows Removed: 0
Retention Rate: 100%
Interpretation

Purpose

This section documents the data preprocessing pipeline for the one-way ANOVA analysis comparing five groups. It shows that no data cleaning or filtering was applied, meaning all 1,000 observations proceeded directly to statistical testing. Understanding preprocessing decisions is critical because they affect the validity of group comparisons and the reliability of the significant findings (p < 0.001).

Key Findings

  • Retention Rate: 100% (1,000 of 1,000 rows retained) - No observations were removed during preprocessing, preserving the full dataset for analysis
  • Rows Removed: 0 - No filtering, outlier removal, or missing value imputation occurred
  • Data Quality: Complete dataset with no documented transformations applied before ANOVA testing
  • Train/Test Split: Not applicable - This is a descriptive/inferential analysis rather than a predictive modeling task
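
The retention figures above follow from a simple row count before and after cleaning. The sketch below is a hypothetical illustration, not the pipeline used for this report: the column names `group` and `score`, the helper `preprocess`, and the toy rows are all assumptions. It drops rows with a missing group or score, enforces the configured min_group_size of 5, and reports the retention rate.

```python
from collections import Counter

MIN_GROUP_SIZE = 5  # min_group_size from the report's configuration table


def preprocess(rows, min_group_size=MIN_GROUP_SIZE):
    """Drop incomplete rows and undersized groups; return (kept_rows, summary).

    `rows` is a list of dicts with hypothetical keys "group" and "score".
    """
    complete = [r for r in rows
                if r.get("group") is not None and r.get("score") is not None]
    sizes = Counter(r["group"] for r in complete)
    kept = [r for r in complete if sizes[r["group"]] >= min_group_size]
    initial, final = len(rows), len(kept)
    summary = {
        "initial_rows": initial,
        "final_rows": final,
        "rows_removed": initial - final,
        "retention_rate": 100.0 * final / initial if initial else 0.0,
    }
    return kept, summary


# Toy input (illustrative only): six valid rows plus one with a missing score.
toy = [{"group": "A", "score": s} for s in (55, 60, 65, 70, 75, 80)]
toy.append({"group": "A", "score": None})
kept, summary = preprocess(toy)
print(summary)
```

A dataset like the one in this report, where nothing is dropped, would yield a retention rate of 100 and zero rows removed.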

Interpretation

The 100% retention rate shows that no rows were dropped, which is consistent with a complete dataset with no missing values. However, the absence of any preprocessing steps means potential outliers or data quality issues were not explicitly addressed before statistical testing. Given the ANOVA results show statistical significance (F=14.59, p<0.001) but the Q-Q plot reveals non-normality (Shapiro-Wilk p<0.001), the lack of preprocessing decisions, such as transformation or outlier handling,

Executive Summary

Executive summary of ANOVA results and recommendations

Overall Result: Significant Difference Found
F-Statistic: F(4, 995) = 14.594
P-Value: p < 0.001
Effect Size (η²): 0.0554 (Small)
Sig. Pairs: 6 of 10 pairs significant

Finding        Value
ANOVA Result   Significant Difference Found (p < 0.001)
Effect Size    η² = 0.0554 (Small effect)
Post-Hoc       6/10 pairs significant (Tukey HSD)
Homogeneity    Levene p = 0.6697 (OK)
Normality      Shapiro p = 0.0001 (Check)

Bottom Line: Group means differ significantly (F(4,995)=14.594, p < 0.001).

Effect Size: Eta-squared = 0.0554 (Small effect — group membership explains 5.5% of variance)

Post-Hoc (Tukey HSD): 6 of 10 pairwise comparisons were significant.

Recommendation: Focus on the groups identified as significantly different in the Tukey HSD results for targeted intervention.
Interpretation

EXECUTIVE SUMMARY: ANOVA ANALYSIS RESULTS

Purpose

This analysis tested whether meaningful differences exist across five groups using one-way ANOVA with 1,000 observations. The results determine whether group membership is a statistically significant predictor of the measured outcome and quantify the practical magnitude of those differences.

Key Findings

  • F-Statistic: F(4, 995) = 14.594 with p < 0.001; highly statistically significant, so the hypothesis that all group means are equal is rejected
  • Effect Size (η²): 0.0554; small effect, with group membership explaining only 5.5% of outcome variance
  • Significant Pairwise Comparisons: 6 of 10 pairs differ significantly; Group E shows the largest mean (73.82) versus Group A (61.63), a 12.19-point difference
  • Variance Homogeneity: Levene's test (p=0.67) shows no evidence of unequal variances across groups, supporting this ANOVA assumption
  • Normality Caveat: Shapiro-Wilk test (p<0.001) indicates non-normal residuals, though ANOVA is robust to this with large samples
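
Eta-squared can also be restated on the Cohen's f scale, which is often used in power analysis. The sketch below applies the standard conversion f = sqrt(η² / (1 − η²)) to the reported η². The function names are illustrative, and the labels follow Cohen's conventional benchmarks (0.10 small, 0.25 medium, 0.40 large), which are rules of thumb rather than part of this report's method.

```python
import math


def cohens_f(eta_squared):
    """Convert eta-squared to Cohen's f: f = sqrt(eta^2 / (1 - eta^2))."""
    if not 0.0 <= eta_squared < 1.0:
        raise ValueError("eta_squared must be in [0, 1)")
    return math.sqrt(eta_squared / (1.0 - eta_squared))


def label_f(f):
    """Label f using Cohen's conventional benchmarks (rules of thumb)."""
    if f < 0.10:
        return "negligible"
    if f < 0.25:
        return "small"
    if f < 0.40:
        return "medium"
    return "large"


f = cohens_f(0.0554)  # eta-squared reported in this summary
print(round(f, 3), label_f(f))
```

The result is about 0.242, just under the 0.25 benchmark for a medium effect, so the effect sits at the upper end of the small range.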

Interpretation

Statistical significance is confirmed, but the small effect size indicates that while group differences are real and reproducible, they account

Data Table

ANOVA Results

One-Way ANOVA test results with F-statistic, degrees of freedom, and effect sizes

Source           df    SS          MS      F_value   p_value   eta_squared   omega_squared
Between Groups   4     1.273e+04   3182    14.59     <0.001    0.0554        0.0516
Within Groups    995   2.17e+05    218.1
Total            999   2.297e+05

Interpretation

Purpose

This section presents the one-way ANOVA F-test results, which determine whether statistically significant differences exist among the five groups. The test evaluates whether observed group differences are unlikely to occur by chance, serving as the foundation for subsequent pairwise comparisons and effect size interpretation.

Key Findings

  • F-Statistic: F(4, 995) = 14.594; the between-groups mean square (3182) is about 14.6 times the within-groups mean square (218.1)
  • P-Value: p < 0.001; if all group means were equal, an F-statistic this large would be expected less than 0.1% of the time
  • Eta-Squared (η²): 0.0554; groups explain only 5.5% of total variance, with the remaining 94.5% attributable to within-group variation
  • Omega-Squared (ω²): 0.0516; a less biased estimate than η² that also indicates a small practical effect
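
The entries in the ANOVA table can be approximately reproduced from per-group sample sizes, means, and standard deviations. The sketch below uses the rounded values from the Group Statistics table of this report, so its results match the table only up to rounding. The function and variable names are illustrative.

```python
# (n, mean, sd) per group, rounded values from the Group Statistics table
groups = {
    "A": (89, 61.63, 14.52),
    "B": (190, 63.45, 15.47),
    "C": (319, 64.46, 14.85),
    "D": (262, 67.36, 13.77),
    "E": (140, 73.82, 15.53),
}


def anova_from_summary(groups):
    """One-way ANOVA quantities from group sizes, means, and SDs."""
    n_total = sum(n for n, _, _ in groups.values())
    k = len(groups)
    grand_mean = sum(n * m for n, m, _ in groups.values()) / n_total
    # Between-groups SS: size-weighted squared distance of each mean
    # from the grand mean.
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups.values())
    # Within-groups SS: each sample variance times its degrees of freedom.
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups.values())
    df_between, df_within = k - 1, n_total - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    ss_total = ss_between + ss_within
    return {
        "df": (df_between, df_within),
        "F": ms_between / ms_within,
        "eta_squared": ss_between / ss_total,
        # omega-squared adjusts eta-squared for its upward bias
        "omega_squared": (ss_between - df_between * ms_within)
                         / (ss_total + ms_within),
    }


result = anova_from_summary(groups)
for name, value in result.items():
    print(name, value)
```

Small discrepancies from the reported F of 14.594 are expected because the inputs are rounded to two decimals.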

Interpretation

The ANOVA confirms statistically significant differences exist across the five groups, with high confidence. However, the small effect size indicates these differences, while real and unlikely due to chance, account for minimal variance in the outcome. This distinction is critical: statistical significance reflects sample size and precision, whereas effect size reflects practical magnitude. The large sample (n=1000) enables detection of small but

Visualization

Group Distributions

Distribution of the math score variable within each group shown as box plots

Interpretation

Purpose

This section visualizes the distribution of math scores across five groups, revealing the spread, central tendency, and variability within each group. Box plots enable quick visual comparison of group characteristics and identify potential outliers, providing context for the formal ANOVA statistical tests that follow.

Key Findings

  • Overall Mean Score: 66.09 across all 1,000 observations with minimal skewness (0.02), indicating a symmetric distribution
  • Group C Dominance: Represents 31.9% of the sample (319 observations), while Group A is smallest with 89 observations
  • Score Range: Values span 0–100 with an overall standard deviation of 15.16; within-group standard deviations are similar (13.77–15.53)
  • Visual Separation: Group E's box sits higher than those of Groups A–C (median 74.5 versus 61–65), suggesting meaningful differences in central tendency

Interpretation

The box plot distributions reveal that Group E exhibits notably higher median scores compared to Groups A, B, and C, while Group D occupies an intermediate position. The relatively consistent within-group variability (similar box heights) across groups supports the ANOVA assumption of equal variances, confirmed by Levene's test (p=0.67). This visual pattern aligns with the significant ANOVA result (F=14.594, p<0.001), indicating that group membership explains meaningful variation in

Visualization

Group Means

Group means with 95% confidence interval error bars

Interpretation

Purpose

This section visualizes the central tendency and precision of each group's measurements through means and 95% confidence intervals. It provides a visual complement to the formal ANOVA test, allowing direct comparison of group locations and uncertainty ranges across the five groups.

Key Findings

  • Group A Mean: 61.63 (95% CI: 58.57–64.69) — lowest mean value with widest confidence interval due to smallest sample size (n=89)
  • Group E Mean: 73.82 (95% CI: 71.23–76.42) — highest mean value with moderate precision (n=140)
  • Mean Range: 12.19-point spread from Group A to Group E indicates substantial between-group variation
  • Confidence Interval Overlap: Groups A, B, and C show overlapping intervals; Groups D and E show minimal overlap with earlier groups, suggesting clearer separation at the upper end
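
The overlap pattern described above can be checked directly from the interval endpoints. The sketch below uses the 95% CI bounds from the Group Statistics table of this report; the helper name is illustrative. Overlap of individual intervals is only a rough visual cue and not a substitute for the Tukey HSD comparisons.

```python
from itertools import combinations

# 95% confidence intervals for group means (lower, upper), from the report
ci = {
    "A": (58.57, 64.69),
    "B": (61.24, 65.67),
    "C": (62.83, 66.10),
    "D": (65.69, 69.04),
    "E": (71.23, 76.42),
}


def overlaps(a, b):
    """Return True when closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


overlap = {
    (g1, g2): overlaps(ci[g1], ci[g2])
    for g1, g2 in combinations(ci, 2)
}
separated = [pair for pair, hit in overlap.items() if not hit]
print(separated)
```

In this report the six separated pairs happen to match the six pairs flagged as significant by Tukey HSD; that agreement is not guaranteed in general.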

Interpretation

The monotonic increase in means from Group A through Group E (61.63 → 73.82) aligns with the significant ANOVA result (F=14.594, p<0.001). The confidence intervals narrow for larger groups (C and D), reflecting greater precision. Non-overlapping intervals between Groups A/B and Groups D/E provide visual evidence of meaningful differences, though formal post-h

Visualization

Tukey HSD Comparisons

Tukey HSD confidence intervals for all pairwise group comparisons

Interpretation

Purpose

This section identifies which specific group pairs differ significantly from one another following the overall ANOVA result. The Tukey HSD method controls the family-wise error rate across all 10 comparisons, so the 6 significant pairs (60%) are unlikely to be false positives produced by testing many pairs simultaneously. This bridges the omnibus ANOVA finding to actionable group-level insights.

Key Findings

  • Significant Pairs: 6 of 10 comparisons (60%) show confidence intervals excluding zero, indicating statistically significant mean differences
  • Largest Difference: Group E vs. Group A (12.19 points, p<0.001) represents the most pronounced separation
  • Group E Pattern: Consistently significant against all other groups (A, B, C, D), suggesting it occupies a distinct performance tier
  • Smallest Differences: Groups A–C show minimal separation (1.01–2.83 points), with confidence intervals spanning zero
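
Significance in the Tukey HSD plot corresponds to a confidence interval that excludes zero. The sketch below applies that rule to the interval bounds from the Post-Hoc Results table of this report; the tuple layout and variable names are illustrative.

```python
# (comparison, mean_diff, lower_95, upper_95) from the Post-Hoc Results table
tukey = [
    ("B-A", 1.823, -3.360, 7.007),
    ("C-A", 2.835, -2.003, 7.672),
    ("D-A", 5.733, 0.782, 10.68),
    ("E-A", 12.19, 6.722, 17.66),
    ("C-B", 1.011, -2.687, 4.709),
    ("D-B", 3.910, 0.065, 7.755),
    ("E-B", 10.37, 5.874, 14.86),
    ("D-C", 2.899, -0.466, 6.263),
    ("E-C", 9.357, 5.266, 13.45),
    ("E-D", 6.459, 2.234, 10.68),
]

# A pair is significant when its interval lies entirely on one side of zero.
significant = [name for name, _, lo, hi in tukey if lo > 0 or hi < 0]
share = len(significant) / len(tukey)
print(significant, f"{share:.0%}")
```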

Interpretation

Group E emerges as a clear outlier with substantially higher values (mean=73.82) compared to the lower-performing clusters (Groups A–C, means 61.6–64.5). The moderate effect size (Cohen's f=0.242) and small eta-squared (0.055) indicate that while differences are statistically reliable, group membership explains only ~5.

Data Table

Group Statistics

Descriptive statistics for each group — mean, SD, SE, confidence intervals

Group     N     Mean    Median   SD      SE      CI_Lower   CI_Upper   Min   Max
group A   89    61.63   61       14.52   1.539   58.57      64.69      28    100
group B   190   63.45   63       15.47   1.122   61.24      65.67      8     97
group C   319   64.46   65       14.85   0.832   62.83      66.1       0     98
group D   262   67.36   69       13.77   0.851   65.69      69.04      26    100
group E   140   73.82   74.5     15.53   1.313   71.23      76.42      30    100

Interpretation

Purpose

This section provides foundational descriptive statistics for each of the five groups, enabling direct comparison of central tendencies, variability, and precision. These metrics establish the baseline distributions that underpin the ANOVA test and post-hoc comparisons, allowing users to understand both the magnitude and reliability of observed group differences.

Key Findings

  • Mean Range (61.63–73.82): Group E shows the highest mean (73.82), while Group A is lowest (61.63)—a 12.19-point spread that aligns with the significant ANOVA result (p < 0.001)
  • Sample Size Imbalance: Group C dominates with 319 observations; Group A has only 89, affecting standard error precision
  • Consistent Variability: Standard deviations cluster tightly (13.77–15.53), indicating homogeneous within-group spread across all groups
  • Confidence Interval Width: Narrower CIs for larger groups (Group C: ±1.64) versus smaller groups (Group A: ±3.06) reflect sampling precision differences
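
The SE and confidence-interval columns are linked by simple formulas: SE = SD / sqrt(N), and the CI half-width is a critical value times SE. The sketch below recomputes SE from the table and backs out the implied critical value. Values between roughly 1.97 and 1.99 would be consistent with a t-based interval, though the report does not state which method it used, so that reading is an assumption.

```python
import math

# (n, sd, ci_lower, ci_upper) per group, from the Group Statistics table
stats = {
    "A": (89, 14.52, 58.57, 64.69),
    "B": (190, 15.47, 61.24, 65.67),
    "C": (319, 14.85, 62.83, 66.10),
    "D": (262, 13.77, 65.69, 69.04),
    "E": (140, 15.53, 71.23, 76.42),
}

summary = {}
for group, (n, sd, lower, upper) in stats.items():
    se = sd / math.sqrt(n)             # standard error of the mean
    half_width = (upper - lower) / 2   # the "±" value of the interval
    summary[group] = {
        "se": se,
        "half_width": half_width,
        "implied_critical": half_width / se,
    }

for group, row in summary.items():
    print(group, {k: round(v, 3) for k, v in row.items()})
```

The smallest group (A) has the largest implied critical value, as expected when the multiplier depends on the group's degrees of freedom.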

Interpretation

The progressive increase in means from Group A through Group E (61.63 → 73.82) demonstrates a clear directional trend. Comparable standard deviations (~14.8 average) satisfy the homogeneity

Data Table

Post-Hoc Results

Tukey HSD pairwise comparison table with adjusted p-values

Comparison        Mean_Diff   Lower_95CI   Upper_95CI   p_adj     Significant
group B-group A   1.823       -3.36        7.007        0.8724    No
group C-group A   2.835       -2.003       7.672        0.4968    No
group D-group A   5.733       0.782        10.68        0.0138    Yes
group E-group A   12.19       6.722        17.66        <0.0001   Yes
group C-group B   1.011       -2.687       4.709        0.9452    No
group D-group B   3.91        0.065        7.755        0.044     Yes
group E-group B   10.37       5.874        14.86        <0.0001   Yes
group D-group C   2.899       -0.466       6.263        0.129     No
group E-group C   9.357       5.266        13.45        <0.0001   Yes
group E-group D   6.459       2.234        10.68        0.0003    Yes

Interpretation

Purpose

This section identifies which specific group pairs differ significantly after controlling for multiple comparisons. The Tukey HSD procedure protects against inflated Type I error rates across all 10 pairwise comparisons, so it can pinpoint which pairs drive the overall ANOVA finding that group differences exist.
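
The need for a correction follows from simple counting. The sketch below enumerates the pairwise comparisons among five groups and estimates how often at least one false positive would occur if each pair were tested at α = 0.05 without adjustment. The independence assumption behind 1 − (1 − α)^m is a simplification (pairwise tests share data), so the figure is a rough illustration rather than an exact rate.

```python
from itertools import combinations

ALPHA = 0.05  # significance_level from the report's configuration
groups = ["A", "B", "C", "D", "E"]

pairs = list(combinations(groups, 2))   # all unordered pairs of groups
m = len(pairs)                          # equals k * (k - 1) / 2

# Chance of at least one false positive across m unadjusted tests,
# assuming (as a simplification) that the tests are independent.
familywise_unadjusted = 1 - (1 - ALPHA) ** m

print(m, round(familywise_unadjusted, 3))
```

With 10 comparisons the unadjusted figure is roughly 40%, which is the inflation the Tukey adjustment is designed to prevent.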

Key Findings

  • Significant Comparisons: 6 of 10 pairs (60%) show statistically significant differences at α=0.05; Group E differs from each of Groups A, B, C, and D
  • Largest Mean Difference: Group E vs. Group A (12.19 points, p<0.001) with confidence interval entirely above zero
  • Smallest Significant Difference: Group D vs. Group B (3.91 points, p=0.044), barely crossing the significance threshold
  • Non-Significant Pairs: 4 comparisons (40%) lack sufficient evidence of difference, primarily among Groups A, B, and C

Interpretation

Group E emerges as distinctly elevated, with mean differences ranging 6.46–12.19 points versus other groups. Groups A, B, and C form a lower cluster with minimal pairwise separation (differences ≤2.84). Group D occupies an intermediate position, differing significantly from Groups A, B, and E but not C. These patterns align

Visualization

Assumption Checks

Residual Q-Q plot and ANOVA assumption checks (Levene's test, Shapiro-Wilk)

Interpretation

Purpose

This section validates whether the data meets critical assumptions required for ANOVA validity. The equal variances and normality assumptions are foundational to ANOVA's reliability; violations can compromise the validity of the F-test and subsequent pairwise comparisons. These diagnostics determine whether the parametric ANOVA results are trustworthy or if non-parametric alternatives should be considered.

Key Findings

  • Levene's Test (F=0.590, p=0.6697): Equal variances assumption is MET. Homogeneity of variance across the five groups is satisfied, supporting ANOVA's validity on this dimension.
  • Shapiro-Wilk Test (W=0.9929, p<0.0001): Normality assumption is VIOLATED. The p-value indicates residuals deviate significantly from normality, despite the high W-statistic suggesting near-normal behavior.
  • Q-Q Plot Pattern: Systematic deviations visible in the lower and upper tails (sample quantiles compress relative to theoretical values), confirming non-normal residual distribution.
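
Levene's test compares group spread by running a one-way ANOVA on absolute deviations from each group's center. The sketch below computes only the test statistic, in pure Python on toy data. Whether this report centered on the mean or the median is not stated, so the median default here is an assumption, and the p-value (from an F distribution with k − 1 and N − k degrees of freedom) is left out.

```python
from statistics import mean, median


def levene_statistic(groups, center=median):
    """Levene's W for a list of samples.

    W = ((N - k) / (k - 1)) * sum(n_i * (zbar_i - zbar)^2)
                            / sum((z_ij - zbar_i)^2)
    where z_ij = |y_ij - center(group i)|.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Absolute deviations of each observation from its group's center
    z = [[abs(y - center(g)) for y in g] for g in groups]
    z_means = [mean(zi) for zi in z]
    z_grand = sum(sum(zi) for zi in z) / n_total
    between = sum(len(zi) * (zm - z_grand) ** 2 for zi, zm in zip(z, z_means))
    within = sum((v - zm) ** 2 for zi, zm in zip(z, z_means) for v in zi)
    return (n_total - k) / (k - 1) * between / within


# Toy samples (illustrative only): the second group is twice as spread out.
toy = [[1, 2, 3], [2, 4, 6]]
w = levene_statistic(toy)
print(round(w, 3))
```

A small statistic, like the F=0.590 reported above, means the groups' typical deviations from their centers are similar.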

Interpretation

The analysis presents a mixed assumption picture. While equal variances across groups strengthen confidence in the ANOVA F-statistic (F=14.594, p<0.0001), the normality
