Analysis overview and configuration
| Parameter | Value |
|---|---|
| confidence_level | 0.95 |
| ties_method | efron |
| group_col | group_col |
This Cox proportional hazards analysis identifies patient factors predicting time to disease recurrence after treatment. The analysis evaluates six predictors across 500 patients to quantify their individual effects on recurrence risk, enabling clinicians to stratify patients by prognosis and tailor follow-up strategies.
Data preprocessing and column mapping
| Metric | Value |
|---|---|
| Initial Rows | 500 |
| Final Rows | 500 |
| Rows Removed | 0 |
| Retention Rate | 100% |
This section documents the data preprocessing pipeline for a survival analysis model with 500 observations. Perfect retention (100%) indicates no rows were removed during cleaning, which suggests either exceptionally clean source data or minimal data quality validation. Understanding preprocessing decisions is critical for assessing whether the model's reported performance (concordance: 0.654, all 6 predictors significant) reflects genuine predictive power or data quality issues masked by incomplete validation.
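To make the retention figures concrete, here is a minimal sketch of the kind of row-level check a preprocessing step could run. The column names `days_observed` and `event_status` come from the column mapping in this report; the validity rules (positive follow-up time, event status coded 0 or 1) and both function names are assumptions for illustration, not the pipeline's documented behavior.

```python
def retention_summary(initial_rows, final_rows):
    """Summarize how many rows survived cleaning."""
    removed = initial_rows - final_rows
    return {"removed": removed, "retention_pct": 100.0 * final_rows / initial_rows}


def find_invalid_rows(rows):
    """Return indices of rows that fail basic survival-data checks.

    Assumed rules: follow-up time must be present and positive,
    and event status must be coded 0 (censored) or 1 (event).
    """
    bad = []
    for i, row in enumerate(rows):
        t = row.get("days_observed")
        e = row.get("event_status")
        if t is None or e is None or t <= 0 or e not in (0, 1):
            bad.append(i)
    return bad


# Figures reported above: 500 rows in, 500 rows out.
print(retention_summary(500, 500))  # {'removed': 0, 'retention_pct': 100.0}
```

A 100% retention rate is only reassuring if checks like these were actually run and found nothing to remove.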
The perfect retention rate is unusual for real-world data and raises questions about preprocessing rigor. With 467 events across 500 observations (a 93.4% event rate), the data appears complete but potentially unvalidated. No train/test split is documented, so the model performance metrics (concordance, AIC) may reflect training-set performance rather than generalization capability. This is particularly concerning given that all 6 predictors achieved statistical significance, a pattern that could indicate overfitting.
This analysis presents the results of a Cox proportional hazards regression model designed to identify and quantify risk factors associated with time-to-event outcomes. The model's performance and predictor significance directly inform risk stratification and intervention prioritization decisions.
Overall model performance and discrimination statistics
| metric | value |
|---|---|
| N Observations | 500 |
| N Events | 467 |
| Event Rate (%) | 93.4 |
| Concordance (C-statistic) | 0.6536 |
| Concordance SE | 0.0133 |
| Log-rank p-value | 0 |
| AIC | 4870.64 |
| N Predictors | 6 |
| N Significant | 6 |
This section evaluates the overall discriminative performance and statistical significance of the Cox proportional hazards model. It answers whether the model reliably distinguishes between subjects at different risk levels and whether the included predictors meaningfully explain survival variation in the dataset.
The model demonstrates moderate discriminative ability (concordance 0.654). All six predictors are statistically significant, and the log-rank p-value is reported as 0 (that is, below display precision), which indicates that the model identifies survival-relevant factors and that the cohort separates into meaningfully different risk groups. With 467 events for six estimated coefficients (roughly 78 events per coefficient), the data support parameter estimation without sparse-data bias.
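The concordance statistic can be read as the share of comparable patient pairs that the model ranks correctly. The sketch below computes Harrell's C on a small invented dataset and derives a normal-approximation 95% interval from the concordance and standard error reported above. The toy data and function names are illustrative assumptions; pairs with tied times are simply skipped here, which production implementations handle more carefully.

```python
def harrell_c(times, events, risk_scores):
    """Harrell's concordance index for right-censored data.

    A pair (i, j) is comparable when subject i has an observed event
    strictly before subject j's time. The pair is concordant when the
    subject who failed earlier has the higher risk score; ties in the
    score count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / comparable


def normal_ci(estimate, se, z=1.96):
    """Normal-approximation confidence interval."""
    return (estimate - z * se, estimate + z * se)


# Invented toy data: four subjects, one censored.
print(harrell_c([2, 4, 6, 8], [1, 1, 0, 1], [0.9, 0.4, 0.5, 0.1]))  # 0.8

# Interval for the reported C-statistic (0.6536, SE 0.0133).
low, high = normal_ci(0.6536, 0.0133)
print(round(low, 3), round(high, 3))  # 0.628 0.68
```

The interval of roughly 0.63 to 0.68 excludes 0.5, so discrimination is better than chance but well short of strong.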
Concordance of 0.65 is typical for survival models in observational data; clinical utility depends
Hazard ratios and 95% confidence intervals for all predictors
This section presents the Cox proportional hazards model results, quantifying how each of the 6 predictors influences event risk. The forest plot visualizes hazard ratios with 95% confidence intervals, allowing rapid assessment of effect direction and statistical significance. All predictors achieved significance (p < 0.05), indicating robust associations with the outcome in this 500-observation cohort with 93.4% event rate.
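All six model terms demonstrate statistically significant associations with event hazard. Risk factors (HR > 1) are patient_age, biomarker_level, predictor_3late, and predictor_4male; group_colB and group_colC are protective (HR < 1) relative to the reference group.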
Full coefficient table with hazard ratios, confidence intervals, and p-values
| term | coef | hr | se | z_score | p_value | hr_lower | hr_upper | significance |
|---|---|---|---|---|---|---|---|---|
| patient_age | 0.0204 | 1.021 | 0.0039 | 5.174 | < 0.0001 | 1.013 | 1.029 | *** |
| biomarker_level | 0.1073 | 1.113 | 0.0227 | 4.719 | < 0.0001 | 1.065 | 1.164 | *** |
| predictor_3late | 0.473 | 1.605 | 0.095 | 4.98 | < 0.0001 | 1.332 | 1.933 | *** |
| predictor_4male | 0.1952 | 1.216 | 0.0943 | 2.069 | 0.0385 | 1.01 | 1.462 | * |
| group_colB | -0.5086 | 0.6013 | 0.1141 | -4.459 | < 0.0001 | 0.4809 | 0.752 | *** |
| group_colC | -0.8298 | 0.4361 | 0.1181 | -7.025 | < 0.0001 | 0.346 | 0.5498 | *** |
| Semantic | Actual |
|---|---|
| days_observed | days_observed |
| event_status | event_status |
| treatment_group | treatment_group |
| patient_age | patient_age |
| biomarker_level | biomarker_level |
| disease_stage | disease_stage |
| patient_gender | patient_gender |
| Predictor | HR | CI_Lower | CI_Upper | P_Value | Significance |
|---|---|---|---|---|---|
| patient_age | 1.021 | 1.013 | 1.029 | 0.0000 | *** |
| biomarker_level | 1.113 | 1.065 | 1.164 | 0.0000 | *** |
| predictor_3late | 1.605 | 1.332 | 1.933 | 0.0000 | *** |
| predictor_4male | 1.216 | 1.01 | 1.462 | 0.0385 | * |
| group_colB | 0.601 | 0.481 | 0.752 | 0.0000 | *** |
| group_colC | 0.436 | 0.346 | 0.55 | 0.0000 | *** |
This section presents the Cox proportional hazards regression coefficients for all 6 predictors in the survival model. It quantifies how each predictor affects the instantaneous risk of the event (hazard), enabling identification of protective and risk-elevating factors while accounting for right-censoring in the 500-observation cohort.
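The columns of the coefficient table are linked by simple arithmetic: the hazard ratio is the exponentiated coefficient, the z-score is the coefficient divided by its standard error, and the confidence limits exponentiate the coefficient plus or minus 1.96 standard errors. The sketch below reproduces the predictor_4male row from its reported `coef` and `se`. The function name is hypothetical, and small differences from the table in the last digit are expected because the inputs are already rounded.

```python
import math


def wald_summary(coef, se, z_crit=1.96):
    """Derive HR, z-score, two-sided p-value and 95% CI from coef and SE."""
    z = coef / se
    return {
        "hr": math.exp(coef),
        "z": z,
        # Two-sided p-value under the standard normal distribution.
        "p": math.erfc(abs(z) / math.sqrt(2.0)),
        "hr_lower": math.exp(coef - z_crit * se),
        "hr_upper": math.exp(coef + z_crit * se),
    }


# predictor_4male: coef = 0.1952, se = 0.0943 (from the table above).
row = wald_summary(0.1952, 0.0943)
for key, value in row.items():
    print(f"{key}: {value:.4f}")
```

A term is significant at the 5% level exactly when its 95% interval excludes 1, which is why predictor_4male (lower limit about 1.01) only narrowly qualifies.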
The model identifies two protective terms (group_colB and group_colC, each relative to the reference group) and four risk-elevating terms (patient_age, biomarker_level, predictor_3late, and predictor_4male). The high event rate (93.4
Survival probability curves over time, stratified by group
This section visualizes Kaplan-Meier survival curves for three distinct groups, showing how the probability of survival changes over time (0–1,491 days). Wider separation between curves indicates stronger group differences in survival outcomes. The 95% confidence intervals (shaded bands) quantify uncertainty around each estimate, enabling assessment of whether observed differences are statistically meaningful.
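For readers unfamiliar with how the curves are built, the following is a minimal sketch of the Kaplan-Meier product-limit estimator on invented data. It omits the confidence bands shown in the plot, and the function name and toy values are assumptions for illustration only. In a stratified analysis the same function would be applied to each group's rows separately.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    At each event time the survival estimate is multiplied by
    (1 - deaths / number still at risk). Censored subjects leave
    the risk set without changing the estimate.
    """
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    survival = 1.0
    curve = []
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        survival *= 1.0 - deaths / at_risk
        curve.append((t, survival))
    return curve


# Invented toy data: five subjects, two censored (event = 0).
for t, s in kaplan_meier([5, 8, 8, 12, 15], [1, 1, 0, 1, 0]):
    print(t, round(s, 3))
# 5 0.8
# 8 0.6
# 12 0.3
```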
The curves demonstrate substantial group stratification in survival outcomes. Early separation suggests groups experience markedly different hazard rates. The log-rank p-value, reported as 0 in the overall metrics (that is, below display precision), confirms these differences are statistically significant. This aligns with the Cox model's concordance of 0.654, indicating
Cumulative hazard over time by group
The cumulative hazard plot visualizes accumulated risk over time using the Nelson-Aalen estimator. It supports two checks: whether the hazard is roughly constant (a straight-line cumulative hazard, as under an exponential distribution) and whether the groups' curves stay proportional to one another, as the proportional hazards assumption requires. Only the second check bears on the validity of the Cox model used in the overall survival analysis, because the Cox model leaves the baseline hazard unspecified.
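The Nelson-Aalen estimator is a running sum of deaths divided by the number at risk at each event time. The sketch below shows that calculation on invented data; the function name and toy values are assumptions for illustration.

```python
def nelson_aalen(times, events):
    """Return [(t, H(t))], the cumulative hazard at each distinct event time."""
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    cumulative = 0.0
    curve = []
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        cumulative += deaths / at_risk  # hazard increment at time t
        curve.append((t, cumulative))
    return curve


# Invented toy data: five subjects, two censored (event = 0).
for t, h in nelson_aalen([5, 8, 8, 12, 15], [1, 1, 0, 1, 0]):
    print(t, round(h, 3))
# 5 0.2
# 8 0.45
# 12 0.95
```

Under a constant hazard these points would fall on a straight line through the origin. Under proportional hazards, the curves for two groups differ by a constant multiplicative factor, so their logarithms run parallel.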
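The cumulative hazard curves do not appear strictly linear, indicating that the hazard is not constant over time and that survival times are not exponentially distributed. The groups separate clearly. Because group_colC has the lowest hazard ratio in the Cox model (HR = 0.436 relative to the reference group), Group C should show the lowest cumulative hazard of the three groups; a Group C curve lying above Groups A and B would point to mislabeled groups or a plotting error.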
Proportional hazards assumption test using Schoenfeld residuals
This section validates a core assumption of the Cox proportional hazards model: that hazard ratios remain constant over time. The test uses Schoenfeld residuals to detect whether any predictor's effect changes as follow-up time increases. Violations suggest that a predictor's impact on survival is time-dependent rather than constant, which affects the validity of the reported hazard ratios.
The Cox model assumes constant hazard ratios, but predictor_3 shows evidence of a time-varying effect: its effect on the hazard changes as follow-up time progresses. This is particularly notable given predictor_3's strong effect in the main model (predictor_3late: HR = 1.605, p < 0.001), which should therefore be read as an average over the follow-up period rather than a constant ratio.
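The idea behind the test can be sketched for a single covariate. At each event time, the Schoenfeld residual is the covariate value of the subject who failed minus the risk-weighted average of that covariate among everyone still at risk. If the effect is constant over time, the residuals should show no trend against time. The code below is a simplified illustration that correlates residuals with event time; it is not the test statistic that standard software reports, and the data, the coefficient value, and the function names are invented assumptions.

```python
import math


def schoenfeld_residuals(times, events, x, beta):
    """Schoenfeld residuals for one covariate at a given coefficient beta."""
    residuals = []
    for i, (t_i, e_i) in enumerate(zip(times, events)):
        if e_i != 1:
            continue  # residuals are defined only at observed events
        risk_set = [j for j, t_j in enumerate(times) if t_j >= t_i]
        weights = [math.exp(beta * x[j]) for j in risk_set]
        weighted_mean = sum(w * x[j] for w, j in zip(weights, risk_set)) / sum(weights)
        residuals.append((t_i, x[i] - weighted_mean))
    return residuals


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


# Invented toy data: binary covariate, all events observed, beta set to 0.
res = schoenfeld_residuals([1, 2, 3, 4], [1, 1, 1, 1], [1, 0, 1, 0], beta=0.0)
event_times = [t for t, _ in res]
values = [r for _, r in res]
print([round(v, 3) for v in values])  # [0.5, -0.333, 0.5, 0.0]
print(pearson(event_times, values))   # trend of residuals against time
```

A correlation near zero is consistent with proportional hazards for that covariate; a clear trend suggests the coefficient drifts with follow-up time.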
Interpretation guide for hazard ratios and model output
This section interprets the Cox proportional hazards model results, translating hazard ratios into clinically or operationally meaningful risk changes. It explains how each predictor affects instantaneous event hazard and validates overall model discrimination ability, enabling stakeholders to understand which factors most strongly influence survival outcomes.
The model demonstrates that group membership and late presentation status are primary drivers of event risk. The 0.654 concordance suggests the model correctly ranks risk pairs approximately 65% of the time, reflecting meaningful but incomplete discrimination. Five of the six terms have p-values below 0.0001; predictor_4male is the exception (p = 0.0385, with a 95% confidence interval lower bound of 1.01), so its effect is the least certain of the set.
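Two conversions are useful when reading the hazard ratios. For a continuous predictor, the hazard ratio for a change of several units is the exponentiated coefficient times that change, not the per-unit ratio multiplied by the change. For any term, the percentage change in hazard is the hazard ratio minus one, times 100. The sketch below applies both to values from the coefficient table; the function names are hypothetical.

```python
import math


def hazard_ratio_for_change(coef, delta):
    """Hazard ratio associated with a change of `delta` units in a predictor."""
    return math.exp(coef * delta)


def percent_change_in_hazard(hr):
    """Percentage change in hazard implied by a hazard ratio."""
    return (hr - 1.0) * 100.0


# patient_age: coef = 0.0204 per unit, so a 10-unit increase gives:
print(round(hazard_ratio_for_change(0.0204, 10), 3))  # 1.226

# group_colC: HR = 0.4361 relative to the reference group.
print(round(percent_change_in_hazard(0.4361), 1))  # -56.4
```

Read this way, a 10-unit increase in patient_age corresponds to roughly a 23% higher hazard, and membership in group C to roughly a 56% lower hazard than the reference group, holding the other predictors fixed.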