Code for Cox PH Model Table
Table of Contents
Introduction
This is the code to generate summary table of uni-/multi-variate Cox PH models results.
Add details later.
Package and Data
library(tidyverse)
library(survival)
data("lung")
data = lung %>%
mutate(
surv_mon = time/30.437,
surv_status = status - 1,
sex = factor(sex, levels = c(1,2), labels = c("Male","Female"))
)
head(data)
## inst time status age sex ph.ecog ph.karno pat.karno meal.cal wt.loss
## 1 3 306 2 74 Male 1 90 100 1175 NA
## 2 3 455 2 68 Male 0 90 90 1225 15
## 3 3 1010 1 56 Male 0 90 90 NA 15
## 4 5 210 2 57 Male 1 90 60 1150 11
## 5 1 883 2 60 Male 0 100 90 NA 0
## 6 12 1022 1 74 Male 1 50 80 513 0
## surv_mon surv_status
## 1 10.053553 1
## 2 14.948911 1
## 3 33.183297 0
## 4 6.899497 1
## 5 29.010744 1
## 6 33.577554 0
We use the lung
dataset available from the survival package survival. The data contain subjects with advanced lung cancer from the North Central Cancer Treatment Group. It includes the 10 following variables:
inst
: Institution codetime
: Survival time in days. We calculate the survival time in month atsurv_mon
.status
: Censoring status 1=censored, 2=dead. We re-code it as 0=censored, 1=dead atsurv_status
.age
: Age in yearssex
: Male=1 Female=2ph.ecog
: ECOG performance score as rated by the physician. 0=asymptomatic, 1= symptomatic but completely ambulatory, 2= in bed <50% of the day, 3= in bed > 50% of the day but not bedbound, 4 = bedboundph.karno
: Karnofsky performance score (bad=0-good=100) rated by physicianpat.karno
: Karnofsky performance score (0 = bad, 100 = good) as rated by patientmeal.cal
: Calories consumed at mealswt.loss
: Weight loss in last six months
Code and Example
Univariate Models
Input:
data
= Your cleaned-up survival dataset.outcomeTimeVar
= Your outcome time variable, please make sure the unit is correct. The unit is Months by default.outcomeVar
= The numeric outcome indicator, 0 = censored, 1 = event.covList
= Names of variables of interest.labelList
= Label of variables of interest.outcomeStart
= Start time of outcome, NULL if you have calculated survival time.outcomeStop
= Stop time of outcome, NULL if you have calculated survival time.cluster
= Will remove this option later, please keep it as NULL (default).fgwt
= Weight, using the weights gives you a weighted estimator. The resulting estimates of β and λ are the same as if you had wi identical copies of observation i (except that wi doesn’t have to be an integer).estimand
= Name of estimand if you use propensity score weighting method. (ARGUMENT to be added)
UniCoxTable <- function(outcomeTimeVar = "outcomeTime",
outcomeVar = "outcome",
covList = NamesList,
labelList = LabelsList,
data = CleanData,
outcomeStart = NULL,
outcomeStop = NULL,
cluster = NULL,
fgwt = NULL
){
ResTable = NULL
for (i in 1: length(covList)){
v = covList[i]
l = labelList[i]
if(length(table(data[[v]])) <2){
ResTable = rbind(ResTable, data.frame(Name = l, Levels = "No enough levels, must have >=2 levels." ,Ref = "" ,HR = "", low95 = "", up95 = "", pVal = ""))
} else {
if (is.null(outcomeTimeVar)) {
survF = as.formula(paste0("Surv(", outcomeStart,",",outcomeStop,",",outcomeVar,") ~",paste(c(v,cluster), collapse = "+")))
} else {
survF = as.formula(paste0("Surv(",outcomeTimeVar,",",outcomeVar,") ~", paste(c(v,cluster), collapse = "+")))
}
sdf.cox = coxph(survF, data = data, weights = fgwt)
if (is.null(cluster)) {
p.val = round(summary(sdf.cox)$coefficient[,5],4)
} else {
p.val = round(summary(sdf.cox)$coefficient[,6],4)
}
HR = round(summary(sdf.cox)$conf.int[,1],2)
up95 = round(summary(sdf.cox)$conf.int[,4],2)
low95 = round(summary(sdf.cox)$conf.int[,3],2)
if(is.null(sdf.cox$xlevels)){
Ref = ""
Levels = ""
Name = l
} else{
tempxlevel = sdf.cox$xlevel[[1]]
Ref = rep(tempxlevel[1], length(tempxlevel)-1)
Levels = tempxlevel[-1]
Name = rep("", length(tempxlevel)-1)
Name[1] = l
}
ResTable = rbind(ResTable, data.frame(Name = Name, Levels = Levels ,Ref = Ref ,HR = HR, low95 = low95, up95 = up95, pVal = p.val))
}
}
row.names(ResTable) = NULL
return(ResTable)
}
UniCoxTable(outcomeTimeVar = "surv_mon",
outcomeVar = "surv_status",
covList = c("age","sex","ph.ecog","wt.loss"),
labelList = c("Age","Gender","ECOG","Weight loss"),
data = data,
cluster = NULL,fgwt = NULL)
## Name Levels Ref HR low95 up95 pVal
## 1 Age 1.02 1.00 1.04 0.0419
## 2 Gender Female Male 0.59 0.42 0.82 0.0015
## 3 ECOG 1.61 1.29 2.01 0.0000
## 4 Weight loss 1.00 0.99 1.01 0.8282
Multivariate Models
Input:
data
= Your cleaned-up survival dataset.outcomeTimeVar
= Your outcome time variable, please make sure the unit is correct. The unit is Months by default.outcomeVar
= The numeric outcome indicator, 0 = censored, 1 = event.covList
= Names of variables of interest.labelList
= Label of variables of interest.exposureVar
= If you have interaction terms, argument needs to be improved, please keep it as NULL for now.exposureLab
= If you have interaction terms, argument needs to be improved, please keep it as NULL for now.outcomeStart
= Start time of outcome, NULL if you have calculated survival time.outcomeStop
= Stop time of outcome, NULL if you have calculated survival time.cluster
= Will remove this option later, please keep it as NULL (default).fgwt
= Weight, using the weights gives you a weighted estimator. The resulting estimates of β and λ are the same as if you had wi identical copies of observation i (except that wi doesn’t have to be an integer).estimand
= Name of estimand if you use propensity score weighting method.
MultiCoxTable <- function(outcomeTimeVar = "outcomeTime",
outcomeVar = "outcomeIndicator",
exposureVar = NULL,
exposureLab = NULL,
covList = MultiCovList,
labelList = MultiCovLabelList,
data = surv_data,
outcomeStart = NULL,
outcomeStop = NULL,
cluster = NULL,
fgwt = NULL,
estimand = NULL){
if (length(exposureVar)>1){
covListSub = c(exposureVar, covList,paste0(exposureVar, collapse = "*"))
covList = covListSub
labelList = c(exposureLab, labelList, paste0(exposureVar, collapse = "*"))
} else {
covList = c(exposureVar, covList)
covListSub = covList
labelList = c(exposureLab, labelList)
}
if (is.null(outcomeTimeVar)) {
survF = as.formula(paste0("Surv(", outcomeStart,",",outcomeStop,",",outcomeVar,") ~",paste(c(covListSub), collapse = "+")))
} else {
survF = as.formula(paste0("Surv(",outcomeTimeVar,",",outcomeVar,") ~", paste(c(covListSub), collapse = "+")))
}
sdf.cox = coxph(survF, data = data, weights = fgwt)
if (is.null(cluster)) {
p.val = round(summary(sdf.cox)$coefficient[,5],4)
} else {
p.val = round(summary(sdf.cox)$coefficient[,6],4)
}
HR = round(summary(sdf.cox)$conf.int[,1],2)
up95 = round(summary(sdf.cox)$conf.int[,4],2)
low95 = round(summary(sdf.cox)$conf.int[,3],2)
Ref = NULL
Levels = NULL
Name = NULL
for(i in 1:(length(covListSub))){
v = covListSub[i]
l = labelList[which(covList == v)]
if(is.null(sdf.cox$xlevels[[v]])){
Ref_add = ""
Levels_add = ""
Name_add = l
} else{
tempxlevel = sdf.cox$xlevel[[v]]
Ref_add = rep(tempxlevel[1], length(tempxlevel)-1)
Levels_add = tempxlevel[-1]
Name_add = rep("", length(tempxlevel)-1)
Name_add[1] = l
}
Name = c(Name, Name_add)
Ref = c(Ref, Ref_add)
Levels = c(Levels, Levels_add)
}
ResTable = data.frame(Name = Name, Levels = Levels ,Ref = Ref,
hr_ci = paste0(HR," [",low95,", ",up95,"]"),
p = p.val)
ResTable <- setNames(ResTable, c("Name","Levels","Ref", paste0(estimand, " HR (95% CI)"), paste0(estimand, " P-Value")))
row.names(ResTable) = NULL
return(ResTable)
}
MultiCoxTable(outcomeTimeVar = "surv_mon",
outcomeVar = "surv_status",
exposureVar = NULL, exposureLab = NULL,
covList = c("age","sex","ph.ecog","wt.loss"),
labelList = c("Age","Gender","ECOG","Weight loss"),
data = data,
cluster = NULL,fgwt = NULL, estimand = NULL)
## Name Levels Ref HR (95% CI) P-Value
## 1 Age 1.01 [0.99, 1.03] 0.1650
## 2 Gender Female Male 0.55 [0.39, 0.78] 0.0008
## 3 ECOG 1.67 [1.31, 2.14] 0.0000
## 4 Weight loss 0.99 [0.98, 1] 0.1761
Session Info
sessionInfo()
## R version 4.4.0 (2024-04-24 ucrt)
## Platform: x86_64-w64-mingw32/x64
## Running under: Windows 11 x64 (build 22631)
##
## Matrix products: default
##
##
## locale:
## [1] LC_COLLATE=English_United States.utf8
## [2] LC_CTYPE=English_United States.utf8
## [3] LC_MONETARY=English_United States.utf8
## [4] LC_NUMERIC=C
## [5] LC_TIME=English_United States.utf8
##
## time zone: America/New_York
## tzcode source: internal
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] survival_3.6-4 lubridate_1.9.0 timechange_0.1.1 forcats_1.0.0
## [5] stringr_1.5.1 dplyr_1.1.3 purrr_1.0.1 readr_2.1.4
## [9] tidyr_1.3.1 tibble_3.2.1 ggplot2_3.4.4 tidyverse_2.0.0
##
## loaded via a namespace (and not attached):
## [1] sass_0.4.5 utf8_1.2.3 generics_0.1.3 blogdown_1.18
## [5] stringi_1.7.12 lattice_0.22-6 hms_1.1.3 digest_0.6.33
## [9] magrittr_2.0.3 evaluate_0.22 grid_4.4.0 bookdown_0.35
## [13] fastmap_1.1.1 jsonlite_1.8.7 Matrix_1.7-0 fansi_1.0.4
## [17] scales_1.3.0 jquerylib_0.1.4 cli_3.6.1 rlang_1.1.1
## [21] munsell_0.5.1 splines_4.4.0 withr_3.0.0 cachem_1.0.8
## [25] yaml_2.3.6 tools_4.4.0 tzdb_0.3.0 colorspace_2.1-0
## [29] vctrs_0.6.3 R6_2.5.1 lifecycle_1.0.4 pkgconfig_2.0.3
## [33] pillar_1.9.0 bslib_0.5.1 gtable_0.3.5 glue_1.6.2
## [37] xfun_0.43 tidyselect_1.2.1 rstudioapi_0.15.0 knitr_1.46
## [41] htmltools_0.5.4 rmarkdown_2.25 compiler_4.4.0