Overview

Naive Bayes Classifier

Binary Classification Analysis

Analysis overview and configuration

Configuration

Analysis Type: Naive Bayes
Company: Educational Research Institute
Objective: Identify student characteristics that predict test preparation completion using Naive Bayes classification
Analysis Date: 2026-03-15
Processing ID: test_1773599677
Total Observations: 1,000

Module Parameters

Parameter                  Value
confidence_level           0.95
test_size                  0.3
classification_threshold   0.5
positive_class             completed
laplace                    1
Naive Bayes analysis for Educational Research Institute
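The configuration above maps directly onto a standard scikit-learn workflow. The sketch below is a hypothetical reconstruction, not the institute's actual pipeline: the synthetic data and column roles are assumptions. Note also that scikit-learn's GaussianNB, the appropriate variant for three numeric predictors, has no Laplace smoothing parameter; `laplace=1` corresponds to `alpha=1` in the categorical/multinomial variants.

```python
# Hypothetical sketch of the reported setup; data and column roles are assumed.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(42)
n = 1000
X = rng.normal(loc=[66.0, 70.0, 68.0], scale=14.0, size=(n, 3))  # three numeric predictors
y = (rng.random(n) < 0.358).astype(int)  # ~35.8% positive class ("completed")

# test_size=0.3 mirrors the module parameter (70/30 split)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42)

# GaussianNB estimates per-class means/variances; its var_smoothing parameter
# is the analogue of Laplace smoothing for the Gaussian case.
model = GaussianNB().fit(X_train, y_train)
proba = model.predict_proba(X_test)[:, 1]   # P(completed)
pred = (proba >= 0.5).astype(int)           # classification_threshold = 0.5
```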

Interpretation

Purpose

This analysis applies Naive Bayes classification to identify which student characteristics predict test preparation completion. The model was trained on 701 observations and tested on 299 to evaluate its ability to distinguish between students who completed versus did not complete test preparation, using three numeric predictors.

Key Findings

  • AUC (0.673): Acceptable discriminative ability—the model performs moderately better than random chance at ranking completed vs. non-completed students
  • Accuracy (0.629): The model correctly classifies approximately 63% of test cases, indicating moderate overall performance
  • Precision (0.484): When the model predicts completion, it is correct only 48% of the time, indicating a high false-positive rate
  • Sensitivity (0.579): The model identifies 58% of students who actually completed preparation, missing 42% of true completers
  • Class Imbalance: 64.2% of students did not complete preparation versus 35.8% who did, reflecting an imbalanced dataset
  • Feature Separation: All three predictors show meaningful differences between completion groups (mean differences of 5–10 points), with predictor_3 showing the largest separation
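The separation figures in the last bullet come from a per-class mean comparison, which is straightforward to reproduce. This is an illustrative sketch on toy data; the column names `predictor_3` and `prep_status` are assumptions, not the dataset's actual schema.

```python
import pandas as pd

# Toy stand-in for the real dataset; only the computation pattern is the point.
df = pd.DataFrame({
    "predictor_3": [74, 80, 70, 60, 66, 58],
    "prep_status": ["completed", "completed", "completed", "none", "none", "none"],
})

# Per-class means, then the gap between them (the "separation" reported above)
class_means = df.groupby("prep_status")["predictor_3"].mean()
separation = class_means["completed"] - class_means["none"]
```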

Interpretation

The Naive Bayes model demonstrates moderate predictive capability but with notable limitations. It predicts completion more often than it actually occurs (128 predicted vs. 107 actual completers in the test set), and with precision of 48.4%, slightly more than half of those positive predictions are wrong, so individual "completed" predictions warrant caution.

Data Preparation

Data Preprocessing

Data Quality & Train/Test Split

Data preprocessing and column mapping

Data Quality

Metric            Value
Initial Rows      1,000
Final Rows        1,000
Rows Removed      0
Retention Rate    100%

Processed 1,000 observations, retained 1,000 (100.0%) after cleaning

Interpretation

Purpose

This section documents the data preprocessing pipeline for a binary classification model predicting task completion. It shows that no data loss occurred during cleaning, which is critical for understanding whether the subsequent model performance (AUC=0.673, Accuracy=0.629) reflects true predictive capability or is constrained by data quality issues.

Key Findings

  • Retention Rate: 100% (1,000 rows preserved) - No observations were removed during preprocessing, indicating either pristine input data or minimal quality filtering applied
  • Rows Removed: 0 - No missing values, duplicates, or outliers were flagged for exclusion
  • Data Completeness: All 1,000 observations proceeded to model training and testing without attrition

Interpretation

The perfect retention rate suggests the dataset arrived clean and required no corrective preprocessing. However, this raises a subtle concern: the model's moderate performance (Kappa=0.226, Precision=0.484) may not stem from data quality issues but rather from weak predictive signal in the three numeric predictors themselves. The class imbalance (64.2% "none" vs. 35.8% "completed") was preserved unchanged, which appropriately reflects real-world distribution but may explain the lower sensitivity (0.579) for the minority class.

Context

The train/test split details are not reported in this section; per the module parameters, a 30% holdout was used, yielding 701 training and 299 test observations.

Executive Summary


Key Findings & Model Performance

Key Metrics

AUC: 0.673
Accuracy: 62.9%
F1 Score: 0.528
Kappa: 0.226

Key Findings

Finding           Value    Assessment
AUC               0.673    Acceptable
Accuracy          62.9%    Moderate
Sensitivity       57.9%    Moderate
Specificity       65.6%    Moderate
Precision (PPV)   48.4%    Moderate
F1 Score          0.528    Moderate
Kappa             0.226    Moderate

Summary

Bottom Line: The Naive Bayes classifier predicts 'completed' vs 'none' with acceptable discriminatory power (AUC = 0.673).

Key Findings:
• Accuracy: 62.9% on held-out test set (n=299)
• Correctly identified 57.9% of 'completed' cases
• Correctly identified 65.6% of 'none' cases
• Kappa = 0.226 (moderate agreement beyond chance)
• F1 = 0.528 (balance of precision and recall)

Recommendation: Model performance is moderate. Consider adding more features, checking class balance, or comparing with logistic regression.

Interpretation


Purpose

This section synthesizes the Naive Bayes classification model's performance across all evaluation metrics to assess whether the model achieves acceptable predictive capability for distinguishing "completed" from "none" outcomes. Understanding these results is critical for determining whether the model is ready for operational deployment or requires further refinement.

Key Findings

  • AUC (0.673): Acceptable discriminatory power—the model performs moderately better than random chance at ranking positive cases, though substantial room for improvement exists
  • Accuracy (62.9%): Nearly 2 in 3 predictions correct on the test set, indicating moderate overall classification performance
  • Sensitivity (57.9%): The model misses approximately 42% of actual "completed" cases, representing a meaningful false-negative rate
  • Specificity (65.6%): Better at identifying "none" cases, though still produces false positives at a notable rate
  • Kappa (0.226): Moderate agreement beyond chance alone; suggests the model captures real signal but with substantial noise
  • F1 Score (0.528): Reflects the tension between precision (48.4%) and recall (57.9%), indicating the model trades accuracy for coverage

Interpretation

The model demonstrates acceptable but limited predictive utility. With an AUC of 0.673, it ranks cases better than chance but falls short of strong discrimination, so the recommendations to add features, address class imbalance, or benchmark against logistic regression are well founded.

Figure 4

ROC Curve

Sensitivity vs. Specificity Tradeoff

ROC curve showing Naive Bayes model discrimination ability

Interpretation

Purpose

The ROC curve visualizes the Naive Bayes model's ability to discriminate between "completed" and "none" classes across all possible classification thresholds. This section evaluates whether the model performs meaningfully better than random chance (AUC = 0.5) in ranking positive and negative instances, which is essential for understanding overall predictive reliability.

Key Findings

  • AUC (0.673): The model correctly ranks 67.3% of positive-negative pairs, indicating acceptable but modest discriminatory power—substantially better than random guessing but with room for improvement.
  • Optimal Threshold (0.454): Youden's J statistic identifies 0.454 as the threshold that best balances sensitivity (true positive rate) and specificity (true negative rate) for this dataset.
  • Threshold-Performance Tradeoff: Mean FPR of 0.44 and mean TPR of 0.61 across evaluated thresholds reflect the inherent tension between catching true positives and minimizing false positives.
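Youden's J statistic simply maximizes TPR − FPR over the ROC thresholds. A minimal sketch on toy labels and scores (not the model's actual probabilities, which are not included in the report):

```python
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

# Toy labels/scores standing in for the model's test-set probabilities.
y_true = np.array([0, 0, 0, 1, 0, 1, 1, 0, 1, 0])
scores = np.array([0.2, 0.4, 0.35, 0.8, 0.3, 0.6, 0.45, 0.5, 0.7, 0.1])

fpr, tpr, thresholds = roc_curve(y_true, scores)
auc = roc_auc_score(y_true, scores)

# Youden's J = sensitivity + specificity - 1 = TPR - FPR;
# the threshold at its maximum best balances the two rates.
j = tpr - fpr
optimal_threshold = thresholds[np.argmax(j)]
```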

Interpretation

The AUC of 0.673 indicates the model has acceptable but limited discrimination ability. This aligns with the overall accuracy (62.9%) and moderate kappa (0.226), suggesting the three numeric predictors provide meaningful but incomplete separation between classes. The optimal threshold of 0.454, slightly below the default cutoff of 0.5, indicates that a modestly lower cutoff better balances sensitivity and specificity on this imbalanced dataset.

Figure 5

Confusion Matrix

Classification Accuracy by Class

Confusion matrix heatmap showing prediction accuracy by class

Interpretation

Purpose

The confusion matrix quantifies how the Naive Bayes classifier performed on the test set (n=299), breaking down predictions into four categories: correct and incorrect classifications for each class. This section is essential for understanding not just overall accuracy, but the specific types and frequencies of errors the model makes, which reveals whether misclassifications are balanced or skewed toward one class.

Key Findings

  • Overall Accuracy: 62.9% (188 of 299 correct) — slightly below the 64.2% majority-class baseline, so the model adds little raw accuracy over always predicting "none"
  • True Negatives: 126 (42.1%) — the largest single cell, showing the model excels at identifying "none" cases
  • False Positives: 66 (22.1%) — the model over-predicts "completed" status, incorrectly flagging "none" cases as "completed"
  • False Negatives: 45 (15.1%) — the model misses some actual "completed" cases, under-predicting this class
  • Class Imbalance Effect: The 64.2% prevalence of "none" in the data drives the high TN count, masking weaker performance on the minority "completed" class
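These counts can be reconciled with the reported metrics directly. TP = 62 is inferred (188 correct overall minus 126 true negatives); everything else follows from the standard definitions:

```python
# Confusion-matrix cells from the report (TP inferred as 188 - 126 = 62).
TP, FP, FN, TN = 62, 66, 45, 126
n = TP + FP + FN + TN                       # 299 test cases

accuracy    = (TP + TN) / n                 # 188/299 ≈ 0.629
sensitivity = TP / (TP + FN)                # 62/107  ≈ 0.579
specificity = TN / (TN + FP)                # 126/192 ≈ 0.656
precision   = TP / (TP + FP)                # 62/128  ≈ 0.484
npv         = TN / (TN + FN)                # 126/171 ≈ 0.737
f1 = 2 * precision * sensitivity / (precision + sensitivity)  # ≈ 0.528
```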

Interpretation

The model demonstrates asymmetric error patterns. While it correctly identifies negative cases (TN=126), it struggles with the minority "completed" class: the 66 false positives slightly outnumber the 62 true positives (188 correct minus 126 TN), which is why precision falls below 50%.

Figure 6

Predicted Probability Distribution

Class Separation by Predicted Score

Distribution of predicted probabilities by actual class

Interpretation

Purpose

This section visualizes how well the Naive Bayes model separates predicted probabilities between the two classes. Good discrimination occurs when "completed" observations cluster at high probabilities and "none" observations cluster at low probabilities. This distribution directly reflects the model's ability to rank-order cases by likelihood of completion.

Key Findings

  • Sensitivity (57.9%): The model correctly identifies 58% of actual "completed" cases, indicating moderate true positive detection at the 0.454 optimal threshold.
  • Specificity (65.6%): The model correctly identifies 66% of actual "none" cases, showing slightly better performance at avoiding false positives.
  • Probability Range (0–0.83): Maximum predicted probability of only 0.83 reflects Naive Bayes' characteristic probability miscalibration; probabilities are compressed and don't reach extreme values.
  • Class Imbalance (64.2% "none"): The dataset skew toward the negative class influences both metrics and threshold optimization.

Interpretation

The modest separation between distributions indicates the three numeric predictors provide moderate discriminative power. The sensitivity-specificity balance (57.9% and 65.6%) suggests the model performs slightly better at identifying non-completions than completions. This aligns with the overall AUC of 0.673: the two probability distributions overlap enough that no threshold achieves strong separation.

Figure 7

Feature Profiles

Conditional Feature Means by Class

Conditional feature means by class showing predictive power of each feature

Interpretation

Purpose

Feature profiles reveal the conditional distributions (mean and standard deviation) of each predictor within each class, representing the core parameters learned by the Naive Bayes classifier. Features exhibiting larger mean differences between "completed" and "none" classes possess greater discriminative power for predicting task completion. This section directly illuminates which predictors drive the model's 0.673 AUC performance.

Key Findings

  • Predictor_3 Separation: Mean difference of 9.92 points (74.42 vs 64.5) — the strongest class discriminator
  • Predictor_2 Separation: Mean difference of 7.36 points (73.89 vs 66.53) — moderate predictive value
  • Predictor_1 Separation: Mean difference of 5.62 points (69.7 vs 64.08) — weakest but still meaningful
  • Consistent Variance: Standard deviations range 13.38–15.19 across all features and classes, indicating stable within-class spread
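Gaussian Naive Bayes turns these class-conditional means and SDs directly into predictions: each feature value is scored under both class profiles and combined with the class priors via Bayes' rule. A sketch using the reported predictor_3 means (the SD of 14 is a representative value picked from the stated 13.38-15.19 range, and the input score 72.0 is hypothetical):

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Normal density: the class-conditional likelihood GaussianNB learns."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

x = 72.0  # hypothetical predictor_3 score
lik_completed = gaussian_pdf(x, mu=74.42, sigma=14.0)  # reported "completed" mean
lik_none      = gaussian_pdf(x, mu=64.50, sigma=14.0)  # reported "none" mean

# Bayes' rule with the reported priors (35.8% completed, 64.2% none):
post = (0.358 * lik_completed) / (0.358 * lik_completed + 0.642 * lik_none)
```

Even though 72 lies closer to the "completed" mean, the posterior from this single feature stays below 0.5 because the 64.2% "none" prior dominates; with all three predictors, the likelihood ratios multiply, which is how combined evidence can still push a case past the 0.454 threshold.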

Interpretation

All three numeric predictors show consistent upward shifts in the "completed" class relative to "none," confirming they contribute positively to classification. The modest mean separations (5-10 points) align with the model's moderate performance metrics (accuracy 0.629, F1 0.528): within-class spread (SDs of roughly 13-15) is large relative to these gaps, which caps the separation any classifier built on these features can achieve.

Table 8

Performance Metrics

Classification Accuracy Breakdown

Comprehensive classification performance metrics

Metric                 Value    Interpretation
AUC                    0.6731   Acceptable
Accuracy               0.6288   Moderate
Sensitivity (Recall)   0.5794   True positive rate
Specificity            0.6562   True negative rate
Precision (PPV)        0.4844   Positive predictive value
NPV                    0.7368   Negative predictive value
F1 Score               0.5277   Harmonic mean of precision & recall
Kappa                  0.2259   Moderate agreement
Optimal Threshold      0.4538   Youden J optimal cutoff

Interpretation

Purpose

This section evaluates the Naive Bayes classifier's overall discriminative ability and predictive reliability on the test set (n=299). These metrics collectively assess whether the model generalizes beyond training data and whether its predictions are trustworthy for the "completed" vs. "none" classification task.

Key Findings

  • AUC = 0.673: Acceptable discrimination power; the model ranks positive cases better than random (0.5) but with modest separation between classes
  • Accuracy = 62.9%: Moderate overall correctness, though slightly below the 64.2% majority-class baseline; given the class imbalance, accuracy alone overstates the model's value
  • F1 Score = 0.528: Balanced precision-recall trade-off is weak; reflects low precision (0.48) despite reasonable sensitivity (0.58)
  • Kappa = 0.226: Only fair agreement beyond chance; the model's improvement over random assignment is limited, suggesting substantial room for refinement
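Cohen's kappa can be verified from the same confusion-matrix cells (TP = 62 inferred from 188 correct minus 126 true negatives); it compares observed agreement with the agreement expected from the marginal prediction rates alone:

```python
# Kappa = (observed agreement - chance agreement) / (1 - chance agreement)
TP, FP, FN, TN = 62, 66, 45, 126
n = TP + FP + FN + TN

p_o = (TP + TN) / n                                            # observed agreement
p_e = ((TP + FP) * (TP + FN) + (FN + TN) * (FP + TN)) / n**2   # chance agreement
kappa = (p_o - p_e) / (1 - p_e)
```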

Interpretation

The model demonstrates acceptable but suboptimal performance. While the AUC suggests it can distinguish between classes across thresholds, the low F1 score and kappa reveal that predictions lack reliability, particularly for identifying "completed" cases (precision only 48%). The imbalanced dataset (64% negative class) makes raw accuracy a weak yardstick; the model performs better at ruling out completion (specificity 65.6%, NPV 73.7%) than at confirming it.
