Overview

Analysis Overview

Cox Proportional Hazards Configuration

Analysis overview and configuration

Configuration

Analysis Type: Proportional Hazards
Company: Clinical Research Institute
Objective: Identify patient factors that predict time to disease recurrence after treatment
Analysis Date: 2026-03-15
Processing Id: cox_test_20260315_143721
Total Observations: 500

Module Parameters

Parameter          Value
confidence_level   0.95
ties_method        efron
group_col          group_col
Proportional Hazards analysis for Clinical Research Institute

Interpretation

Purpose

This Cox proportional hazards analysis identifies patient factors predicting time to disease recurrence after treatment. The analysis evaluates six predictors across 500 patients to quantify their individual effects on recurrence risk, enabling clinicians to stratify patients by prognosis and tailor follow-up strategies.

Key Findings

  • Event Rate: 93.4% (467 of 500 patients experienced recurrence) — exceptionally high event frequency provides strong statistical power for detecting predictor effects
  • Model Concordance: 0.654 - moderate discriminative ability; for about 65.4% of comparable patient pairs the model assigns the higher predicted risk to the patient who recurred first, which is better than chance (0.5) but leaves room for improvement
  • All Predictors Significant: All 6 predictors achieved statistical significance (p < 0.05), with 5 at p < 0.001 level
  • Strongest Effect: predictor_3late shows highest hazard ratio (HR=1.60, 95% CI: 1.33–1.93), indicating 60% increased recurrence risk
  • Protective Factor: group_colC demonstrates strongest protective effect (HR=0.44, 95% CI: 0.35–0.55), reducing recurrence hazard by 56%
  • Proportional Hazards Assumption: Mostly satisfied globally (p
Data Preparation

Data Preprocessing

Data Quality & Completeness

Data preprocessing and column mapping

Data Quality

Metric           Value
Initial Rows     500
Final Rows       500
Rows Removed     0
Retention Rate   100%
Processed 500 observations, retained 500 (100.0%) after cleaning

Interpretation

Purpose

This section documents the data preprocessing pipeline for a survival analysis model with 500 observations. Perfect retention (100%) indicates no rows were removed during cleaning, suggesting either exceptionally clean source data or minimal data quality validation. Understanding preprocessing decisions is critical for assessing whether the model's reported performance (moderate concordance of 0.654, all 6 predictors significant) reflects genuine predictive power or potential data quality issues masked by incomplete validation.

Key Findings

  • Retention Rate: 100% (500/500 rows retained) - No observations were excluded during preprocessing
  • Rows Removed: 0 - Complete absence of filtering, outlier removal, or missing value handling
  • Train/Test Split: Not documented - No explicit validation strategy is recorded in the preprocessing stage
  • Data Transformations: No transformations explicitly noted despite survival analysis typically requiring time-to-event and event indicator preparation
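The kind of validation that a survival pipeline would typically run before fitting can be sketched as follows. This is a minimal illustration, not the report's actual pipeline: the column names days_observed and event_status come from the column mapping elsewhere in this report, while the validation rules (positive duration, 0/1 event flag), the helper name clean_survival_rows, and the toy rows are assumptions.

```python
def clean_survival_rows(rows):
    """Keep rows with a positive duration and a 0/1 event flag.

    Hypothetical helper: these rules are assumptions for illustration,
    not the report's documented preprocessing.
    """
    kept = []
    for row in rows:
        duration = row.get("days_observed")
        event = row.get("event_status")
        if duration is None or event is None:
            continue  # missing time or event indicator
        if duration <= 0 or event not in (0, 1):
            continue  # impossible duration or malformed flag
        kept.append(row)
    retention_pct = 100.0 * len(kept) / len(rows) if rows else 0.0
    return kept, retention_pct


# Toy rows, made up for illustration only.
toy_rows = [
    {"days_observed": 151, "event_status": 1},
    {"days_observed": 0, "event_status": 1},      # dropped: non-positive time
    {"days_observed": 90, "event_status": None},  # dropped: missing flag
    {"days_observed": 400, "event_status": 0},
]
kept_rows, retention = clean_survival_rows(toy_rows)
print(len(kept_rows), retention)
```

A retention rate of 100% from checks like these would mean every row passed; a retention rate of 100% with no checks recorded says nothing about data quality.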

Interpretation

The perfect retention rate is unusual for real-world data and raises questions about preprocessing rigor. With 93.4% event rate and 467 events across 500 observations, the data appears complete but potentially unvalidated. The absence of documented train/test splits means model performance metrics (concordance, AIC) may reflect training set performance rather than generalization capability. This is particularly concerning given all 6 predictors achieved statistical significance—a pattern that could indicate ov

Executive Summary

Key Findings & Recommendations

Key Metrics

concordance: 0.6536
n_events: 467
event_rate_pct: 93.4
n_significant: 6
n_predictors: 6

Summary

Bottom Line: Cox proportional hazards regression identified 6 significant predictors of time-to-event out of 6 total. The model achieves concordance of 0.6536 (Moderate discrimination).

Key Findings:
• Event rate: 93.4% (467 events in 500 subjects)
• Top significant predictors: predictor_1, predictor_2, predictor_3late
• Model AIC: 4870.64
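An AIC value is only meaningful relative to other models fitted to the same data, but it can be unpacked into its two parts. The sketch below assumes the standard definition AIC = 2k - 2 log L, with L the Cox partial likelihood and k = 6 estimated coefficients; the report does not state how its AIC was computed, so the implied log partial likelihood is illustrative only.

```python
def implied_log_partial_likelihood(aic, n_params):
    """Invert AIC = 2k - 2*logL for logL (standard definition, assumed here)."""
    return (2 * n_params - aic) / 2.0


# AIC and predictor count as reported; k = 6 is an assumption about the fit.
log_pl = implied_log_partial_likelihood(4870.64, 6)
print(round(log_pl, 2))
```

Comparing this model against a candidate with more or fewer terms would use the difference in AIC, where lower is better.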

Recommendation: Use hazard ratios to prioritize risk factors. Subjects with HR > 1 on significant covariates have elevated event risk and may warrant targeted intervention. Verify proportional hazards assumption holds before relying on estimates.

Interpretation

EXECUTIVE SUMMARY: COX PROPORTIONAL HAZARDS MODEL

Purpose

This analysis presents the results of a Cox proportional hazards regression model designed to identify and quantify risk factors associated with time-to-event outcomes. The model's performance and predictor significance directly inform risk stratification and intervention prioritization decisions.

Key Findings

  • Concordance (C-statistic): 0.654 – Indicates moderate discriminative ability; the model correctly orders event timing approximately 65% of the time, suggesting reasonable but not exceptional predictive power
  • Event Rate: 93.4% (467 of 500 observations) – Extremely high event prevalence indicates a mature cohort with substantial follow-up time
  • Significant Predictors: All 6 predictors achieved statistical significance (p < 0.05), with predictor_3late showing the strongest effect (HR = 1.60, 95% CI: 1.33–1.93)
  • Model Fit: AIC of 4870.64; the log-rank p-value is reported as 0, meaning it falls below the displayed precision (a p-value is never exactly zero), which supports overall model significance
  • Proportional Hazards Assumption: Global test p-value = 0.11 (holds); however, predictor_3 violates the assumption (p = 0.02)

Interpretation

Table 4

Model Performance

Concordance & Discrimination Statistics

Overall model performance and discrimination statistics

metric                      value
N Observations              500
N Events                    467
Event Rate (%)              93.4
Concordance (C-statistic)   0.6536
Concordance SE              0.0133
Log-rank p-value            0
AIC                         4870.64
N Predictors                6
N Significant               6
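One way to read the concordance together with its standard error is a normal-approximation confidence interval. The sketch below assumes the estimate is approximately normally distributed and uses the report's confidence_level of 0.95; normal_ci is a hypothetical helper name, not part of the report's software.

```python
from statistics import NormalDist


def normal_ci(estimate, se, confidence_level=0.95):
    """Two-sided normal-approximation interval: estimate +/- z * se."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    return estimate - z * se, estimate + z * se


# Concordance and its SE as reported in the table above.
c_lower, c_upper = normal_ci(0.6536, 0.0133)
print(round(c_lower, 3), round(c_upper, 3))
```

Under this approximation the whole interval sits above 0.5, so the model discriminates better than chance, while also staying below the 0.75 level the report treats as strong discrimination.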

Interpretation

Purpose

This section evaluates the overall discriminative performance and statistical significance of the Cox proportional hazards model. It answers whether the model reliably distinguishes between subjects at different risk levels and whether the included predictors meaningfully explain survival variation in the dataset.

Key Findings

  • Concordance (C-statistic): 0.654 - Indicates moderate ability to rank subjects by risk; better than random (0.5) but below strong discrimination (>0.75)
  • All 6 Predictors Significant: 100% of included terms achieved p < 0.05, indicating that each term is associated with the outcome in this sample
  • Log-rank p-value: reported as 0 (below display precision; a p-value is never exactly zero) - Highly significant overall test, confirming the model captures meaningful stratification
  • Event Rate: 93.4% (467/500 events) - High event prevalence supports robust model estimation with sufficient outcome variation
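The pairwise definition behind the C-statistic can be sketched in a few lines. This is an illustrative pure-Python version of Harrell's concordance index on made-up toy data; the function name harrell_c and the toy values are assumptions, and the report's own software may treat ties differently.

```python
def harrell_c(times, events, risk_scores):
    """Harrell's concordance index for right-censored data.

    A pair is comparable when the subject with the shorter time had an
    event. It is concordant when that subject also has the higher risk
    score; tied scores count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Subject i must have an observed event strictly before time j.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (illustrative only): the third subject is censored.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.4, 0.5, 0.1]
c_index = harrell_c(toy_times, toy_events, toy_risk)
print(c_index)
```

In the toy data five pairs are comparable and four are ordered correctly, which is the same kind of proportion the reported 0.654 summarizes over the full cohort.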

Interpretation

The model demonstrates moderate but clinically meaningful discriminative ability. All six predictors are significant and the log-rank p-value is reported as 0 (below display precision), which indicates that the model identifies survival-relevant factors and stratifies the cohort into meaningfully different risk groups. The high event rate ensures adequate statistical power for parameter estimation without sparse-data bias.

Context

Concordance of 0.65 is typical for survival models in observational data; clinical utility depends

Figure 5

Hazard Ratios (Forest Plot)

Effect Size with 95% Confidence Intervals

Hazard ratios and 95% confidence intervals for all predictors

Interpretation

Purpose

This section presents the Cox proportional hazards model results, quantifying how each of the 6 predictors influences event risk. The forest plot visualizes hazard ratios with 95% confidence intervals, allowing rapid assessment of effect direction and statistical significance. All predictors achieved significance (p < 0.05), indicating robust associations with the outcome in this 500-observation cohort with 93.4% event rate.

Key Findings

  • Predictor_3late: HR = 1.60 (95% CI: 1.33–1.93) - strongest risk factor; 60% higher hazard for the late level versus its reference level
  • Group_colC: HR = 0.44 (95% CI: 0.35–0.55) - strongest protective effect; 56% hazard reduction versus reference
  • Predictor_1: HR = 1.02 (95% CI: 1.01–1.03) - smallest but highly significant effect (z = 5.17); this HR applies per one-unit increase
  • Confidence Interval Width: Ranges from about 0.02 (predictor_1) to 0.60 (predictor_3late) on the hazard-ratio scale, reflecting varying precision; narrower intervals indicate more stable estimates

Interpretation

All six predictors demonstrate statistically significant associations with event hazard. Risk factors (HR > 1) include predictor_1, predictor_2, and predictor

Table 6

Model Coefficients

Hazard Ratios, Confidence Intervals, and P-values

Full coefficient table with hazard ratios, confidence intervals, and p-values

term              coef      hr       se       z_score   p_value   hr_lower   hr_upper   significance
patient_age       0.0204    1.021    0.0039    5.174    0         1.013      1.029      ***
biomarker_level   0.1073    1.113    0.0227    4.719    0         1.065      1.164      ***
predictor_3late   0.473     1.605    0.095     4.98     0         1.332      1.933      ***
predictor_4male   0.1952    1.216    0.0943    2.069    0.0385    1.01       1.462      *
group_colB        -0.5086   0.6013   0.1141   -4.459    0         0.4809     0.752      ***
group_colC        -0.8298   0.4361   0.1181   -7.025    0         0.346      0.5498     ***

Semantic          Actual
days_observed     days_observed
event_status      event_status
treatment_group   treatment_group
patient_age       patient_age
biomarker_level   biomarker_level
disease_stage     disease_stage
patient_gender    patient_gender

Predictor         HR      CI_Lower   CI_Upper   P_Value   Significance
patient_age       1.021   1.013      1.029      0.0000    ***
biomarker_level   1.113   1.065      1.164      0.0000    ***
predictor_3late   1.605   1.332      1.933      0.0000    ***
predictor_4male   1.216   1.01       1.462      0.0385    *
group_colB        0.601   0.481      0.752      0.0000    ***
group_colC        0.436   0.346      0.55       0.0000    ***
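The columns of the coefficient table are linked by simple Wald formulas, which makes any row easy to sanity-check. The sketch below assumes the usual relationships (HR = exp(coef), CI = exp(coef ± z·se), z_score = coef / se, two-sided normal p-value) and the report's confidence_level of 0.95; wald_summary is a hypothetical helper, and small differences from the table come from rounding of the published coef and se.

```python
import math
from statistics import NormalDist


def wald_summary(coef, se, confidence_level=0.95):
    """Derive HR, its confidence interval, z and p from a coefficient and SE."""
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    z = coef / se
    # Two-sided tail probability of a standard normal at |z|.
    p = math.erfc(abs(z) / math.sqrt(2))
    return {
        "hr": math.exp(coef),
        "hr_lower": math.exp(coef - z_crit * se),
        "hr_upper": math.exp(coef + z_crit * se),
        "z": z,
        "p": p,
    }


# predictor_4male, using coef and se as printed in the table.
row = wald_summary(0.1952, 0.0943)
print({name: round(value, 4) for name, value in row.items()})
```

The same check explains why a p_value shown as 0 is a display artifact: for z scores near 5 or 7 the tail probability is tiny but positive.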

Interpretation

Purpose

This section presents the Cox proportional hazards regression coefficients for all 6 predictors in the survival model. It quantifies how each predictor affects the instantaneous risk of the event (hazard), enabling identification of protective and risk-elevating factors while accounting for censoring in the 500-observation cohort.

Key Findings

  • predictor_3late: HR=1.60 (95% CI: 1.33–1.93, p<0.001) – Strongest risk factor; increases hazard by 60%
  • group_colC: HR=0.44 (95% CI: 0.35–0.55, p<0.001) – Strongest protective effect; reduces hazard by 56%
  • All 6 predictors significant: Five at p<0.001 (***), one at p=0.04 (*); all 95% CIs exclude 1.0
  • Hazard ratio range: 0.44–1.60, indicating substantial effect heterogeneity across predictors

Interpretation

The model identifies two protective terms (group_colB and group_colC, each relative to the reference group) and four risk-elevating terms (predictor_1, predictor_2, predictor_3late, predictor_4male). The high event rate (93.4

Figure 7

Survival Curves

Survival Probability Over Time by Group

Survival probability curves over time, stratified by group

Interpretation

Purpose

This section visualizes Kaplan-Meier survival curves for three distinct groups, showing how the probability of survival changes over time (0–1,491 days). Wider separation between curves indicates stronger group differences in survival outcomes. The 95% confidence intervals (shaded bands) quantify uncertainty around each estimate, enabling assessment of whether observed differences are statistically meaningful.

Key Findings

  • Time Range: Observations span 1 to 1,491 days (median 151 days), with right-skewed distribution indicating longer follow-up tails
  • Survival Decline: Mean survival probability decreases from 0.98 at early timepoints to 0.04–0.05 at late timepoints, reflecting cumulative event occurrence
  • Group Distribution: Groups A, B, and C are nearly balanced (129–138 observations each), enabling fair comparison
  • Confidence Intervals: Narrow at early times, widening substantially at later timepoints due to reduced sample size from censoring/events
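The survival curves in this section are Kaplan-Meier estimates, which can be sketched as a running product over event times. The version below is a minimal pure-Python illustration on made-up data, without confidence bands; kaplan_meier and the toy values are assumptions, not the report's implementation.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    S(t) is the running product of (1 - deaths / at_risk) over event times.
    """
    survival = 1.0
    curve = []
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        survival *= 1.0 - deaths / at_risk
        curve.append((t, survival))
    return curve


# Toy data (illustrative only): event flag 0 marks a censored subject.
km_curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
print(km_curve)
```

Running the same function separately on each group's rows would give one curve per group, which is what the stratified plot shows.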

Interpretation

The curves demonstrate substantial group stratification in survival outcomes. Early separation suggests groups experience markedly different hazard rates. The log-rank p-value, reported as 0 in the overall metrics (below display precision), confirms these differences are statistically significant. This aligns with the Cox model's concordance of 0.654, indicating

Figure 8

Cumulative Hazard

Accumulated Risk Over Time by Group

Cumulative hazard over time by group

Interpretation

Purpose

The cumulative hazard plot visualizes accumulated risk over time using the Nelson-Aalen estimator. A straight line would indicate a constant hazard (as under an exponential model), and roughly proportional curves across groups support the proportional hazards assumption. Only the latter is required by the Cox proportional hazards model used in the overall survival analysis, which leaves the shape of the baseline hazard unspecified.

Key Findings

  • Time Range: Events tracked from 1 to 1,491 days (mean=213 days), capturing the full follow-up period with right-skewed distribution
  • Cumulative Hazard Range: 0.01 to 23.03 across groups, with Group C showing substantially higher accumulated risk (3.32 at endpoint) compared to earlier timepoints
  • Group Distribution: Balanced representation across three groups (A: 129, B: 138, C: 134 observations), enabling fair cross-group comparison
  • Hazard Accumulation Pattern: Curves show non-linear acceleration, particularly for Group C, suggesting increasing hazard rates over time rather than constant exponential hazard
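The Nelson-Aalen estimator named above is a running sum rather than a running product. The sketch below is a minimal pure-Python illustration on made-up data; nelson_aalen and the toy values are assumptions, not the report's implementation.

```python
def nelson_aalen(times, events):
    """Return [(t, H(t))] at each distinct event time.

    H(t) is the running sum of deaths / at_risk over event times.
    """
    cumulative = 0.0
    curve = []
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        cumulative += deaths / at_risk
        curve.append((t, cumulative))
    return curve


# Toy data (illustrative only): event flag 0 marks a censored subject.
na_curve = nelson_aalen([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
print(na_curve)
```

If the hazard were constant, the points would fall close to a straight line through the origin; increments that grow over time produce the upward-bending pattern described in the findings.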

Interpretation

The cumulative hazard curves do not appear strictly linear, indicating that the hazard is not constant over time (an exponential model would not fit perfectly). Group C is reported as showing markedly elevated cumulative hazard relative to Groups A and B, which does not match the direction of the Cox model results, where group_colC has the lowest hazard ratio (HR=

Figure 9

PH Assumption Diagnostics

Schoenfeld Residuals — Test for Proportional Hazards

Proportional hazards assumption test using Schoenfeld residuals

Interpretation

Purpose

This section validates a core assumption of the Cox proportional hazards model: that hazard ratios remain constant over time. The test uses Schoenfeld residuals to detect whether any predictor's effect changes as follow-up time increases. Violations suggest that a predictor's impact on survival is time-dependent rather than constant, which affects the validity of the reported hazard ratios.

Key Findings

  • Number of Violations: 1 predictor violates the proportional hazards assumption (p < 0.05)
  • Predictor_3: Identified as the violating variable with p-value = 0.02, indicating its hazard ratio is not constant over time
  • Majority Compliance: 5 of 6 predictors satisfy the PH assumption (p-values range from 0.12 to 0.89)
  • Global Test: The overall model passes (p = 0.11), suggesting the violation is isolated and does not invalidate the entire model
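The residuals behind this diagnostic can be sketched for the single-covariate case. At each event time, the Schoenfeld residual is the covariate value of the subject who had the event minus the risk-weighted average of that covariate among subjects still at risk. The function name, the toy data, and the no-ties simplification below are assumptions for illustration; the report's software and its exact test statistic are not documented here.

```python
import math


def schoenfeld_residuals(times, events, x, beta):
    """Schoenfeld residuals for one covariate at a given coefficient beta.

    Assumes no tied event times. Returns [(event_time, residual)] in
    time order.
    """
    residuals = []
    for i, (t_i, e_i) in enumerate(zip(times, events)):
        if e_i != 1:
            continue  # censored subjects contribute no residual
        risk_set = [j for j, t_j in enumerate(times) if t_j >= t_i]
        weights = [math.exp(beta * x[j]) for j in risk_set]
        expected = sum(w * x[j] for w, j in zip(weights, risk_set)) / sum(weights)
        residuals.append((t_i, x[i] - expected))
    return sorted(residuals)


# Toy data (illustrative only), evaluated at beta = 0 for easy checking.
ph_resid = schoenfeld_residuals([1, 2, 3, 4], [1, 1, 0, 1], [1, 0, 1, 0], 0.0)
print(ph_resid)
```

A test of the proportional hazards assumption then asks whether such residuals trend with time; a clear trend, as reported for predictor_3, suggests the coefficient is not constant over follow-up.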

Interpretation

The Cox model assumes constant hazard ratios, but predictor_3 shows evidence of time-varying effects—its impact on survival probability changes as follow-up time progresses. This is particularly notable given predictor_3's strong effect (HR = 1.60, p < 0.001) in the main model.

Table 10

Model Interpretation

Hazard Ratio Interpretation Guide

Interpretation guide for hazard ratios and model output

metric                      value
N Observations              500
N Events                    467
Event Rate (%)              93.4
Concordance (C-statistic)   0.6536
Concordance SE              0.0133
Log-rank p-value            0
AIC                         4870.64
N Predictors                6
N Significant               6

Interpretation

Purpose

This section interprets the Cox proportional hazards model results, translating hazard ratios into clinically or operationally meaningful risk changes. It explains how each predictor affects instantaneous event hazard and validates overall model discrimination ability, enabling stakeholders to understand which factors most strongly influence survival outcomes.

Key Findings

  • Most Protective Predictor: group_colC (HR = 0.44) reduces hazard by 56.4%, indicating substantially lower event risk in this group
  • Highest Risk Predictor: predictor_3late (HR = 1.605) increases hazard by 60.5%, representing the strongest adverse effect among all predictors
  • Model Discrimination: Concordance = 0.654 indicates moderate predictive accuracy—substantially better than random (0.5) but with room for improvement toward perfect (1.0)
  • Statistical Strength: All 6 predictors are significant (p < 0.05); 467 events provide robust evidence
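The percentages quoted above follow directly from the hazard ratios, and the same arithmetic extends to multi-unit changes in a continuous predictor. The sketch below uses values from the coefficient table; the helper names are hypothetical, and the 10-unit example for patient_age assumes nothing about its unit, which the report does not state.

```python
import math


def pct_change_in_hazard(hr):
    """Percent change in hazard implied by a hazard ratio."""
    return (hr - 1.0) * 100.0


def hr_for_increase(coef, units):
    """Hazard ratio for a `units`-sized increase in a continuous covariate."""
    return math.exp(coef * units)


pct_group_c = pct_change_in_hazard(0.4361)  # group_colC
pct_late = pct_change_in_hazard(1.605)      # predictor_3late
hr_age_10 = hr_for_increase(0.0204, 10)     # patient_age, 10-unit increase
print(round(pct_group_c, 1), round(pct_late, 1), round(hr_age_10, 3))
```

The last line shows why a per-unit HR of 1.02 is not negligible: the effect compounds multiplicatively, so a 10-unit difference corresponds to a hazard ratio of roughly 1.23.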

Interpretation

The model demonstrates that group membership and late presentation status are primary drivers of event risk. The 0.654 concordance suggests the model correctly ranks risk pairs approximately 65% of the time, reflecting meaningful but incomplete discrimination. All six terms clear the 0.05 significance threshold, five of them at p < 0.001 and predictor_4male more narrowly (p = 0.0385).
