Code for Cox PH Model Table

Last updated on Oct 16, 2024 7 min read Biostatistics

Table of Contents

Introduction

This is the code to generate summary table of uni-/multi-variate Cox PH models results.

Add details later.

Package and Data

library(tidyverse)
library(survival)

data("lung")

data = lung %>% 
  mutate(
    surv_mon = time/30.437,
    surv_status = status - 1,
    sex = factor(sex, levels = c(1,2), labels = c("Male","Female"))
  )

head(data)

##   inst time status age  sex ph.ecog ph.karno pat.karno meal.cal wt.loss
## 1    3  306      2  74 Male       1       90       100     1175      NA
## 2    3  455      2  68 Male       0       90        90     1225      15
## 3    3 1010      1  56 Male       0       90        90       NA      15
## 4    5  210      2  57 Male       1       90        60     1150      11
## 5    1  883      2  60 Male       0      100        90       NA       0
## 6   12 1022      1  74 Male       1       50        80      513       0
##    surv_mon surv_status
## 1 10.053553           1
## 2 14.948911           1
## 3 33.183297           0
## 4  6.899497           1
## 5 29.010744           1
## 6 33.577554           0

We use the lung dataset available from the survival package survival. The data contain subjects with advanced lung cancer from the North Central Cancer Treatment Group. It includes the 10 following variables:

inst: Institution code
time: Survival time in days. We calculate the survival time in month at surv_mon.
status: Censoring status 1=censored, 2=dead. We re-code it as 0=censored, 1=dead at surv_status.
age: Age in years
sex: Male=1 Female=2
ph.ecog: ECOG performance score as rated by the physician. 0=asymptomatic, 1= symptomatic but completely ambulatory, 2= in bed <50% of the day, 3= in bed > 50% of the day but not bedbound, 4 = bedbound
ph.karno: Karnofsky performance score (bad=0-good=100) rated by physician
pat.karno: Karnofsky performance score (0 = bad, 100 = good) as rated by patient
meal.cal: Calories consumed at meals
wt.loss: Weight loss in last six months

Code and Example

Univariate Models

Input:

data = Your cleaned-up survival dataset.
outcomeTimeVar = Your outcome time variable, please make sure the unit is correct. The unit is Months by default.
outcomeVar = The numeric outcome indicator, 0 = censored, 1 = event.
covList = Names of variables of interest.
labelList = Label of variables of interest.
outcomeStart = Start time of outcome, NULL if you have calculated survival time.
outcomeStop = Stop time of outcome, NULL if you have calculated survival time.
cluster = Will remove this option later, please keep it as NULL (default).
fgwt = Weight, using the weights gives you a weighted estimator. The resulting estimates of β and λ are the same as if you had wi identical copies of observation i (except that wi doesn’t have to be an integer).
estimand = Name of estimand if you use propensity score weighting method. (ARGUMENT to be added)

UniCoxTable <- function(outcomeTimeVar = "outcomeTime", 
                        outcomeVar = "outcome",
                        covList = NamesList,
                        labelList = LabelsList,
                        data = CleanData,
                        outcomeStart = NULL,
                        outcomeStop = NULL,
                        cluster = NULL,
                        fgwt = NULL
){
    ResTable = NULL
    for (i in 1: length(covList)){
        v = covList[i]
        l = labelList[i]
        if(length(table(data[[v]])) <2){
            ResTable = rbind(ResTable, data.frame(Name = l, Levels = "No enough levels, must have >=2 levels." ,Ref = "" ,HR = "", low95 = "", up95 = "", pVal = ""))
        } else {
            if (is.null(outcomeTimeVar)) {
                survF = as.formula(paste0("Surv(", outcomeStart,",",outcomeStop,",",outcomeVar,") ~",paste(c(v,cluster), collapse = "+")))
                
            } else {
                survF = as.formula(paste0("Surv(",outcomeTimeVar,",",outcomeVar,") ~", paste(c(v,cluster), collapse = "+")))
                
            }
            sdf.cox = coxph(survF, data = data, weights = fgwt)
            if (is.null(cluster)) {
                p.val = round(summary(sdf.cox)$coefficient[,5],4)
                
            } else {
                p.val = round(summary(sdf.cox)$coefficient[,6],4)
                
            }
            HR = round(summary(sdf.cox)$conf.int[,1],2)
            up95 = round(summary(sdf.cox)$conf.int[,4],2)
            low95 = round(summary(sdf.cox)$conf.int[,3],2)
            if(is.null(sdf.cox$xlevels)){
                Ref = ""
                Levels = ""
                Name = l
                
            } else{
                tempxlevel = sdf.cox$xlevel[[1]]
                Ref = rep(tempxlevel[1], length(tempxlevel)-1)
                Levels = tempxlevel[-1]
                Name = rep("", length(tempxlevel)-1)
                Name[1] = l
            }
            ResTable = rbind(ResTable, data.frame(Name = Name, Levels = Levels ,Ref = Ref ,HR = HR, low95 = low95, up95 = up95, pVal = p.val)) 
        }

        
        
}
    row.names(ResTable) = NULL
    
    return(ResTable)
    
}

UniCoxTable(outcomeTimeVar = "surv_mon",
            outcomeVar = "surv_status",
            covList = c("age","sex","ph.ecog","wt.loss"),
            labelList = c("Age","Gender","ECOG","Weight loss"),
            data = data,
            cluster = NULL,fgwt = NULL)

##          Name Levels  Ref   HR low95 up95   pVal
## 1         Age             1.02  1.00 1.04 0.0419
## 2      Gender Female Male 0.59  0.42 0.82 0.0015
## 3        ECOG             1.61  1.29 2.01 0.0000
## 4 Weight loss             1.00  0.99 1.01 0.8282

Multivariate Models

Input:

data = Your cleaned-up survival dataset.
outcomeTimeVar = Your outcome time variable, please make sure the unit is correct. The unit is Months by default.
outcomeVar = The numeric outcome indicator, 0 = censored, 1 = event.
covList = Names of variables of interest.
labelList = Label of variables of interest.
exposureVar = If you have interaction terms, argument needs to be improved, please keep it as NULL for now.
exposureLab = If you have interaction terms, argument needs to be improved, please keep it as NULL for now.
outcomeStart = Start time of outcome, NULL if you have calculated survival time.
outcomeStop = Stop time of outcome, NULL if you have calculated survival time.
cluster = Will remove this option later, please keep it as NULL (default).
fgwt = Weight, using the weights gives you a weighted estimator. The resulting estimates of β and λ are the same as if you had wi identical copies of observation i (except that wi doesn’t have to be an integer).
estimand = Name of estimand if you use propensity score weighting method.

MultiCoxTable <- function(outcomeTimeVar = "outcomeTime",
                          outcomeVar = "outcomeIndicator",
                          exposureVar = NULL,
                          exposureLab = NULL,
                          covList = MultiCovList, 
                          labelList = MultiCovLabelList,
                          data = surv_data,                          
                          outcomeStart = NULL,
                          outcomeStop = NULL,
                          cluster = NULL,
                          fgwt = NULL,
                          estimand = NULL){
  if (length(exposureVar)>1){
    covListSub = c(exposureVar, covList,paste0(exposureVar, collapse = "*"))
    covList = covListSub
    labelList = c(exposureLab, labelList, paste0(exposureVar, collapse = "*"))
  } else {
    covList = c(exposureVar, covList)
    covListSub = covList
    labelList = c(exposureLab, labelList)
  }
  

    
    if (is.null(outcomeTimeVar)) {
        survF = as.formula(paste0("Surv(", outcomeStart,",",outcomeStop,",",outcomeVar,") ~",paste(c(covListSub), collapse = "+")))
        
    } else {
        survF = as.formula(paste0("Surv(",outcomeTimeVar,",",outcomeVar,") ~", paste(c(covListSub), collapse = "+")))
        
    }
    
    sdf.cox = coxph(survF, data = data, weights = fgwt)
    
    if (is.null(cluster)) {
      p.val = round(summary(sdf.cox)$coefficient[,5],4)
      
    } else {
      p.val = round(summary(sdf.cox)$coefficient[,6],4)
      
    }
    HR = round(summary(sdf.cox)$conf.int[,1],2)
    up95 = round(summary(sdf.cox)$conf.int[,4],2)
    low95 = round(summary(sdf.cox)$conf.int[,3],2)
    
    Ref = NULL
    Levels = NULL
    Name = NULL
    for(i in 1:(length(covListSub))){
        v = covListSub[i]
        l = labelList[which(covList == v)]
        if(is.null(sdf.cox$xlevels[[v]])){
            Ref_add = ""
            Levels_add = ""
            Name_add = l
            
        } else{
            tempxlevel = sdf.cox$xlevel[[v]]
            Ref_add = rep(tempxlevel[1], length(tempxlevel)-1)
            Levels_add = tempxlevel[-1]
            Name_add = rep("", length(tempxlevel)-1)
            Name_add[1] = l
        }
        Name = c(Name, Name_add)
        Ref = c(Ref, Ref_add)
        Levels = c(Levels, Levels_add)
        
    }
    
    ResTable = data.frame(Name = Name, Levels = Levels ,Ref = Ref,
                          hr_ci = paste0(HR," [",low95,", ",up95,"]"), 
                          p = p.val) 
    
    ResTable <- setNames(ResTable, c("Name","Levels","Ref", paste0(estimand, " HR (95% CI)"), paste0(estimand, " P-Value"))) 
    
    row.names(ResTable) = NULL
    
    return(ResTable)
    
}

MultiCoxTable(outcomeTimeVar = "surv_mon",
            outcomeVar = "surv_status",
            exposureVar = NULL, exposureLab = NULL,
            covList = c("age","sex","ph.ecog","wt.loss"),
            labelList = c("Age","Gender","ECOG","Weight loss"),
            data = data,
            cluster = NULL,fgwt = NULL, estimand = NULL)

##          Name Levels  Ref       HR (95% CI)  P-Value
## 1         Age             1.01 [0.99, 1.03]   0.1650
## 2      Gender Female Male 0.55 [0.39, 0.78]   0.0008
## 3        ECOG             1.67 [1.31, 2.14]   0.0000
## 4 Weight loss                0.99 [0.98, 1]   0.1761

Session Info

sessionInfo()

## R version 4.4.0 (2024-04-24 ucrt)
## Platform: x86_64-w64-mingw32/x64
## Running under: Windows 11 x64 (build 22631)
## 
## Matrix products: default
## 
## 
## locale:
## [1] LC_COLLATE=English_United States.utf8 
## [2] LC_CTYPE=English_United States.utf8   
## [3] LC_MONETARY=English_United States.utf8
## [4] LC_NUMERIC=C                          
## [5] LC_TIME=English_United States.utf8    
## 
## time zone: America/New_York
## tzcode source: internal
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] survival_3.6-4   lubridate_1.9.0  timechange_0.1.1 forcats_1.0.0   
##  [5] stringr_1.5.1    dplyr_1.1.3      purrr_1.0.1      readr_2.1.4     
##  [9] tidyr_1.3.1      tibble_3.2.1     ggplot2_3.4.4    tidyverse_2.0.0 
## 
## loaded via a namespace (and not attached):
##  [1] sass_0.4.5        utf8_1.2.3        generics_0.1.3    blogdown_1.18    
##  [5] stringi_1.7.12    lattice_0.22-6    hms_1.1.3         digest_0.6.33    
##  [9] magrittr_2.0.3    evaluate_0.22     grid_4.4.0        bookdown_0.35    
## [13] fastmap_1.1.1     jsonlite_1.8.7    Matrix_1.7-0      fansi_1.0.4      
## [17] scales_1.3.0      jquerylib_0.1.4   cli_3.6.1         rlang_1.1.1      
## [21] munsell_0.5.1     splines_4.4.0     withr_3.0.0       cachem_1.0.8     
## [25] yaml_2.3.6        tools_4.4.0       tzdb_0.3.0        colorspace_2.1-0 
## [29] vctrs_0.6.3       R6_2.5.1          lifecycle_1.0.4   pkgconfig_2.0.3  
## [33] pillar_1.9.0      bslib_0.5.1       gtable_0.3.5      glue_1.6.2       
## [37] xfun_0.43         tidyselect_1.2.1  rstudioapi_0.15.0 knitr_1.46       
## [41] htmltools_0.5.4   rmarkdown_2.25    compiler_4.4.0

R Code Table