Overview

Naive Bayes Classifier

Binary Classification Analysis

Analysis overview and configuration

Configuration

Analysis Type: Naive Bayes
Company: Educational Research Institute
Objective: Identify student characteristics that predict test preparation completion using Naive Bayes classification
Analysis Date: 2026-03-15
Processing ID: test_1773599677
Total Observations: 1,000

Module Parameters

Parameter                  Value
confidence_level           0.95
test_size                  0.3
classification_threshold   0.5
positive_class             completed
laplace                    1
Naive Bayes analysis for Educational Research Institute
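The configuration above maps directly onto a standard scikit-learn workflow. The sketch below is a hypothetical reconstruction, not the institute's actual pipeline: the synthetic data and column roles are assumptions. Note also that scikit-learn's GaussianNB, the appropriate variant for three numeric predictors, has no Laplace smoothing parameter; `laplace=1` corresponds to `alpha=1` in the categorical/multinomial variants.

```python
# Hypothetical sketch of the reported setup; data and column roles are assumed.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(42)
n = 1000
X = rng.normal(loc=[66.0, 70.0, 68.0], scale=14.0, size=(n, 3))  # three numeric predictors
y = (rng.random(n) < 0.358).astype(int)  # ~35.8% positive class ("completed")

# test_size=0.3 mirrors the module parameter (70/30 split)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42)

# GaussianNB estimates per-class means/variances; its var_smoothing parameter
# is the analogue of Laplace smoothing for the Gaussian case.
model = GaussianNB().fit(X_train, y_train)
proba = model.predict_proba(X_test)[:, 1]   # P(completed)
pred = (proba >= 0.5).astype(int)           # classification_threshold = 0.5
```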

Interpretation

Purpose

This analysis applies Naive Bayes classification to identify which student characteristics predict test preparation completion. The model was trained on 701 observations and tested on 299 to evaluate its ability to distinguish between students who completed versus did not complete test preparation, using three numeric predictors.

Key Findings

  • AUC (0.673): Acceptable discriminative ability—the model performs moderately better than random chance at ranking completed vs. non-completed students
  • Accuracy (0.629): The model correctly classifies approximately 63% of test cases, indicating moderate overall performance
  • Precision (0.484): When the model predicts completion, it is correct only 48% of the time, indicating a high false-positive rate
  • Sensitivity (0.579): The model identifies 58% of students who actually completed preparation, missing 42% of true completers
  • Class Imbalance: 64.2% of students did not complete preparation versus 35.8% who did, reflecting an imbalanced dataset
  • Feature Separation: All three predictors show meaningful differences between completion groups (mean differences of 5–10 points), with predictor_3 showing the largest separation
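The separation figures in the last bullet come from a per-class mean comparison, which is straightforward to reproduce. This is an illustrative sketch on toy data; the column names `predictor_3` and `prep_status` are assumptions, not the dataset's actual schema.

```python
import pandas as pd

# Toy stand-in for the real dataset; only the computation pattern is the point.
df = pd.DataFrame({
    "predictor_3": [74, 80, 70, 60, 66, 58],
    "prep_status": ["completed", "completed", "completed", "none", "none", "none"],
})

# Per-class means, then the gap between them (the "separation" reported above)
class_means = df.groupby("prep_status")["predictor_3"].mean()
separation = class_means["completed"] - class_means["none"]
```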

Interpretation

The Naive Bayes model demonstrates moderate predictive capability but with notable limitations. It predicts completion more often than it actually occurs (128 predicted vs. 107 actual completers in the test set), and with precision of 48.4%, slightly more than half of those positive predictions are wrong, so individual "completed" predictions warrant caution.

Data Preparation

Data Preprocessing

Data Quality & Train/Test Split

Data preprocessing and column mapping

Data Quality

Metric            Value
Initial Rows      1,000
Final Rows        1,000
Rows Removed      0
Retention Rate    100%

Processed 1,000 observations, retained 1,000 (100.0%) after cleaning

Interpretation

Purpose

This section documents the data preprocessing pipeline for a binary classification model predicting task completion. It shows that no data loss occurred during cleaning, which is critical for understanding whether the subsequent model performance (AUC=0.673, Accuracy=0.629) reflects true predictive capability or is constrained by data quality issues.

Key Findings

  • Retention Rate: 100% (1,000 rows preserved) - No observations were removed during preprocessing, indicating either pristine input data or minimal quality filtering applied
  • Rows Removed: 0 - No missing values, duplicates, or outliers were flagged for exclusion
  • Data Completeness: All 1,000 observations proceeded to model training and testing without attrition

Interpretation

The perfect retention rate suggests the dataset arrived clean and required no corrective preprocessing. However, this raises a subtle concern: the model's moderate performance (Kappa=0.226, Precision=0.484) may not stem from data quality issues but rather from weak predictive signal in the three numeric predictors themselves. The class imbalance (64.2% "none" vs. 35.8% "completed") was preserved unchanged, which appropriately reflects real-world distribution but may explain the lower sensitivity (0.579) for the minority class.

Context

The train/test split details are not reported in this section; per the module parameters, a 30% holdout was used, yielding 701 training and 299 test observations.

Executive Summary


Key Findings & Model Performance

Key Metrics

AUC: 0.673
Accuracy: 62.9%
F1 Score: 0.528
Kappa: 0.226

Key Findings

Finding           Value    Assessment
AUC               0.673    Acceptable
Accuracy          62.9%    Moderate
Sensitivity       57.9%    Moderate
Specificity       65.6%    Moderate
Precision (PPV)   48.4%    Moderate
F1 Score          0.528    Moderate
Kappa             0.226    Moderate

Summary

Bottom Line: The Naive Bayes classifier predicts 'completed' vs 'none' with acceptable discriminatory power (AUC = 0.673).

Key Findings:
• Accuracy: 62.9% on held-out test set (n=299)
• Correctly identified 57.9% of 'completed' cases
• Correctly identified 65.6% of 'none' cases
• Kappa = 0.226 (moderate agreement beyond chance)
• F1 = 0.528 (balance of precision and recall)

Recommendation: Model performance is moderate. Consider adding more features, checking class balance, or comparing with logistic regression.

Interpretation


Purpose

This section synthesizes the Naive Bayes classification model's performance across all evaluation metrics to assess whether the model achieves acceptable predictive capability for distinguishing "completed" from "none" outcomes. Understanding these results is critical for determining whether the model is ready for operational deployment or requires further refinement.

Key Findings

  • AUC (0.673): Acceptable discriminatory power—the model performs moderately better than random chance at ranking positive cases, though substantial room for improvement exists
  • Accuracy (62.9%): Nearly 2 in 3 predictions correct on the test set, indicating moderate overall classification performance
  • Sensitivity (57.9%): The model misses approximately 42% of actual "completed" cases, representing a meaningful false-negative rate
  • Specificity (65.6%): Better at identifying "none" cases, though still produces false positives at a notable rate
  • Kappa (0.226): Moderate agreement beyond chance alone; suggests the model captures real signal but with substantial noise
  • F1 Score (0.528): Reflects the tension between precision (48.4%) and recall (57.9%), indicating the model trades accuracy for coverage

Interpretation

The model demonstrates acceptable but limited predictive utility. With an AUC of 0.673, it ranks cases better than chance but falls short of strong discrimination, so the recommendations to add features, address class imbalance, or benchmark against logistic regression are well founded.

Figure 4

ROC Curve

Sensitivity vs. Specificity Tradeoff

ROC curve showing Naive Bayes model discrimination ability

Interpretation

Purpose

The ROC curve visualizes the Naive Bayes model's ability to discriminate between "completed" and "none" classes across all possible classification thresholds. This section evaluates whether the model performs meaningfully better than random chance (AUC = 0.5) in ranking positive and negative instances, which is essential for understanding overall predictive reliability.

Key Findings

  • AUC (0.673): The model correctly ranks 67.3% of positive-negative pairs, indicating acceptable but modest discriminatory power—substantially better than random guessing but with room for improvement.
  • Optimal Threshold (0.454): Youden's J statistic identifies 0.454 as the threshold that best balances sensitivity (true positive rate) and specificity (true negative rate) for this dataset.
  • Threshold-Performance Tradeoff: Mean FPR of 0.44 and mean TPR of 0.61 across evaluated thresholds reflect the inherent tension between catching true positives and minimizing false positives.
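Youden's J statistic simply maximizes TPR − FPR over the ROC thresholds. A minimal sketch on toy labels and scores (not the model's actual probabilities, which are not included in the report):

```python
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

# Toy labels/scores standing in for the model's test-set probabilities.
y_true = np.array([0, 0, 0, 1, 0, 1, 1, 0, 1, 0])
scores = np.array([0.2, 0.4, 0.35, 0.8, 0.3, 0.6, 0.45, 0.5, 0.7, 0.1])

fpr, tpr, thresholds = roc_curve(y_true, scores)
auc = roc_auc_score(y_true, scores)

# Youden's J = sensitivity + specificity - 1 = TPR - FPR;
# the threshold at its maximum best balances the two rates.
j = tpr - fpr
optimal_threshold = thresholds[np.argmax(j)]
```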

Interpretation

The AUC of 0.673 indicates the model has acceptable but limited discrimination ability. This aligns with the overall accuracy (62.9%) and moderate kappa (0.226), suggesting the three numeric predictors provide meaningful but incomplete separation between classes. The optimal threshold of 0.454, slightly below the default cutoff of 0.5, indicates that a modestly lower cutoff better balances sensitivity and specificity on this imbalanced dataset.

Figure 5

Confusion Matrix

Classification Accuracy by Class

Confusion matrix heatmap showing prediction accuracy by class

Interpretation

Purpose

The confusion matrix quantifies how the Naive Bayes classifier performed on the test set (n=299), breaking down predictions into four categories: correct and incorrect classifications for each class. This section is essential for understanding not just overall accuracy, but the specific types and frequencies of errors the model makes, which reveals whether misclassifications are balanced or skewed toward one class.

Key Findings

  • Overall Accuracy: 62.9% (188 of 299 correct) — slightly below the 64.2% majority-class baseline, so the model adds little raw accuracy over always predicting "none"
  • True Negatives: 126 (42.1%) — the largest single cell, showing the model excels at identifying "none" cases
  • False Positives: 66 (22.1%) — the model over-predicts "completed" status, incorrectly flagging "none" cases as "completed"
  • False Negatives: 45 (15.1%) — the model misses some actual "completed" cases, under-predicting this class
  • Class Imbalance Effect: The 64.2% prevalence of "none" in the data drives the high TN count, masking weaker performance on the minority "completed" class
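These counts can be reconciled with the reported metrics directly. TP = 62 is inferred (188 correct overall minus 126 true negatives); everything else follows from the standard definitions:

```python
# Confusion-matrix cells from the report (TP inferred as 188 - 126 = 62).
TP, FP, FN, TN = 62, 66, 45, 126
n = TP + FP + FN + TN                       # 299 test cases

accuracy    = (TP + TN) / n                 # 188/299 ≈ 0.629
sensitivity = TP / (TP + FN)                # 62/107  ≈ 0.579
specificity = TN / (TN + FP)                # 126/192 ≈ 0.656
precision   = TP / (TP + FP)                # 62/128  ≈ 0.484
npv         = TN / (TN + FN)                # 126/171 ≈ 0.737
f1 = 2 * precision * sensitivity / (precision + sensitivity)  # ≈ 0.528
```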

Interpretation

The model demonstrates asymmetric error patterns. While it correctly identifies negative cases (TN=126), it struggles with the minority "completed" class: the 66 false positives slightly outnumber the 62 true positives (188 correct minus 126 TN), which is why precision falls below 50%.

Figure 6

Predicted Probability Distribution

Class Separation by Predicted Score

Distribution of predicted probabilities by actual class

Interpretation

Purpose

This section visualizes how well the Naive Bayes model separates predicted probabilities between the two classes. Good discrimination occurs when "completed" observations cluster at high probabilities and "none" observations cluster at low probabilities. This distribution directly reflects the model's ability to rank-order cases by likelihood of completion.

Key Findings

  • Sensitivity (57.9%): The model correctly identifies 58% of actual "completed" cases, indicating moderate true positive detection at the 0.454 optimal threshold.
  • Specificity (65.6%): The model correctly identifies 66% of actual "none" cases, showing slightly better performance at avoiding false positives.
  • Probability Range (0–0.83): Maximum predicted probability of only 0.83 reflects Naive Bayes' characteristic probability miscalibration; probabilities are compressed and don't reach extreme values.
  • Class Imbalance (64.2% "none"): The dataset skew toward the negative class influences both metrics and threshold optimization.

Interpretation

The modest separation between distributions indicates the three numeric predictors provide moderate discriminative power. The sensitivity-specificity balance (57.9% and 65.6%) suggests the model performs slightly better at identifying non-completions than completions. This aligns with the overall AUC of 0.673: the two probability distributions overlap enough that no threshold achieves strong separation.

Figure 7

Feature Profiles

Conditional Feature Means by Class

Conditional feature means by class showing predictive power of each feature

Interpretation

Purpose

Feature profiles reveal the conditional distributions (mean and standard deviation) of each predictor within each class, representing the core parameters learned by the Naive Bayes classifier. Features exhibiting larger mean differences between "completed" and "none" classes possess greater discriminative power for predicting task completion. This section directly illuminates which predictors drive the model's 0.673 AUC performance.

Key Findings

  • Predictor_3 Separation: Mean difference of 9.92 points (74.42 vs 64.5) — the strongest class discriminator
  • Predictor_2 Separation: Mean difference of 7.36 points (73.89 vs 66.53) — moderate predictive value
  • Predictor_1 Separation: Mean difference of 5.62 points (69.7 vs 64.08) — weakest but still meaningful
  • Consistent Variance: Standard deviations range 13.38–15.19 across all features and classes, indicating stable within-class spread
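Gaussian Naive Bayes turns these class-conditional means and SDs directly into predictions: each feature value is scored under both class profiles and combined with the class priors via Bayes' rule. A sketch using the reported predictor_3 means (the SD of 14 is a representative value picked from the stated 13.38-15.19 range, and the input score 72.0 is hypothetical):

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Normal density: the class-conditional likelihood GaussianNB learns."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

x = 72.0  # hypothetical predictor_3 score
lik_completed = gaussian_pdf(x, mu=74.42, sigma=14.0)  # reported "completed" mean
lik_none      = gaussian_pdf(x, mu=64.50, sigma=14.0)  # reported "none" mean

# Bayes' rule with the reported priors (35.8% completed, 64.2% none):
post = (0.358 * lik_completed) / (0.358 * lik_completed + 0.642 * lik_none)
```

Even though 72 lies closer to the "completed" mean, the posterior from this single feature stays below 0.5 because the 64.2% "none" prior dominates; with all three predictors, the likelihood ratios multiply, which is how combined evidence can still push a case past the 0.454 threshold.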

Interpretation

All three numeric predictors show consistent upward shifts in the "completed" class relative to "none," confirming they contribute positively to classification. The modest mean separations (5-10 points) align with the model's moderate performance metrics (accuracy 0.629, F1 0.528): within-class spread (SDs of roughly 13-15) is large relative to these gaps, which caps the separation any classifier built on these features can achieve.

Table 8

Performance Metrics

Classification Accuracy Breakdown

Comprehensive classification performance metrics

Metric                 Value    Interpretation
AUC                    0.6731   Acceptable
Accuracy               0.6288   Moderate
Sensitivity (Recall)   0.5794   True positive rate
Specificity            0.6562   True negative rate
Precision (PPV)        0.4844   Positive predictive value
NPV                    0.7368   Negative predictive value
F1 Score               0.5277   Harmonic mean of precision & recall
Kappa                  0.2259   Moderate agreement
Optimal Threshold      0.4538   Youden J optimal cutoff

Interpretation

Purpose

This section evaluates the Naive Bayes classifier's overall discriminative ability and predictive reliability on the test set (n=299). These metrics collectively assess whether the model generalizes beyond training data and whether its predictions are trustworthy for the "completed" vs. "none" classification task.

Key Findings

  • AUC = 0.673: Acceptable discrimination power; the model ranks positive cases better than random (0.5) but with modest separation between classes
  • Accuracy = 62.9%: Moderate overall correctness, though slightly below the 64.2% majority-class baseline; given the class imbalance, accuracy alone overstates the model's value
  • F1 Score = 0.528: Balanced precision-recall trade-off is weak; reflects low precision (0.48) despite reasonable sensitivity (0.58)
  • Kappa = 0.226: Only fair agreement beyond chance; the model's improvement over random assignment is limited, suggesting substantial room for refinement
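Cohen's kappa can be verified from the same confusion-matrix cells (TP = 62 inferred from 188 correct minus 126 true negatives); it compares observed agreement with the agreement expected from the marginal prediction rates alone:

```python
# Kappa = (observed agreement - chance agreement) / (1 - chance agreement)
TP, FP, FN, TN = 62, 66, 45, 126
n = TP + FP + FN + TN

p_o = (TP + TN) / n                                            # observed agreement
p_e = ((TP + FP) * (TP + FN) + (FN + TN) * (FP + TN)) / n**2   # chance agreement
kappa = (p_o - p_e) / (1 - p_e)
```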

Interpretation

The model demonstrates acceptable but suboptimal performance. While the AUC suggests it can distinguish between classes across thresholds, the low F1 score and kappa reveal that predictions lack reliability, particularly for identifying "completed" cases (precision only 48%). The imbalanced dataset (64% negative class) makes raw accuracy a weak yardstick; the model performs better at ruling out completion (specificity 65.6%, NPV 73.7%) than at confirming it.
