How to Prepare a DSMB Report

Last updated on Apr 24, 2025. 15 min read.


⚠️ Make Sure You Understand the Code Before Using It ⚠️

What Is a DSMB Meeting

The function of Data and Safety Monitoring Boards (DSMBs) is to review accumulating data (e.g. safety, efficacy, accrual rate) during a trial to advise sponsors and investigators as to whether to continue the trial unaltered.

We’ll collaborate with the study team to draft the report. Trials may undergo review either semi-annually or annually, based on the associated treatment risks.

Tips and Warnings

The data structure for most trials will be similar. The main focus should be on ensuring data completeness and checking for consistency, since the data may contain typos, misclassifications, and discrepancies.
I’ll share the code for a two-arm randomized trial here. Modifying the code to suit a one-arm or multi-arm trial should be straightforward. I’ll show an image of the table shell at the beginning of each corresponding section. However, the template and requirements may vary depending on the specific study.
We only need the code; we don’t generate the report directly from the .Rmd file.
Due to confidentiality concerns, I won’t display the datasets or the final report, and the code cannot be used directly. For more details, please reach out to Hanfei at hqi11@jhmi.edu.

Report Section Code Breakdown

Metadata Section

For DSMB reports, we typically don’t knit the .Rmd to PDF or HTML. Instead, we copy and paste Table 1 (demographics) into the report and write the other tables into .rtf shell tables. However, feel free to choose the settings that work best for you.

---
title: "JXXXX_DSMB_Summer/Winter_2023"
author: "YOUR NAME"
date: "`r format(Sys.time(), '%B %d, %Y')`"
output:
  html_document:
    df_print: paged
    toc: yes
    toc_depth: '4'
  pdf_document:
    df_print: kable
    fig_caption: yes
    fig_height: 6
    fig_width: 7
    number_sections: yes
    toc: yes
    toc_depth: 2
  word_document:
    toc: yes
    toc_depth: '2'
---

Loading Libraries and Functions

To generate tables in an .rtf file, we use CG’s code to replace each ‘AAAA’ in the shell table (a table with AAAA in every cell to be filled) with the corresponding value.

knitr::opts_chunk$set(echo = FALSE, message=FALSE, warning=FALSE)
options(knitr.kable.NA = "")
rm(list = ls())

## ATTENTION: ALWAYS check your date data!!!!
as.date=function(x) as.Date(x,origin='1970-01-01') # For excel, '1899-12-30'

knit_format = "html"
# knit_format = "latex"


data_lock_date = as.date("2023-03-08") # Edit the data lock date
data_dir = paste0(dirname(rprojroot::find_rstudio_root_file()),"/Data/")


library(readxl)
library(knitr)
library(kableExtra)
library(tidyverse)
library(survival)
library(ggfortify)
library(survminer)
library(rlang)
library(xtable)
library(table1)


## CG's code
## The 'shells' need to be modified based on specific project
## Replace AAAA in each cell with your data frame cell
tk_exp_rst <- function(numbers, shell_f,  rst_f = NULL,
                       sub_str = "AAAA", ext = "_rst.rtf",
                       append = FALSE) {
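    if (!file.exists(shell_f)) {
        ## a bare `return` does not exit the function; warn and exit explicitly
        warning("Shell file not found: ", shell_f)
        return(invisible(NULL))
    }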

    ##read
    tpla <- readChar(shell_f, file.info(shell_f)$size)

    ##substitute
    for (i in seq_len(length(numbers))) {
        tpla <- sub(sub_str, numbers[i], tpla)
    }

    ##write out
    if (is.null(rst_f))
        rst_f <- sub("\\.([^\\.]*)$", ext, shell_f)

    write(tpla, file = rst_f, append = append)
}
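
To make the replacement order concrete, here is a minimal sketch of the same idea on a toy shell held in memory rather than in an .rtf file. The helper name `fill_shell_text`, the shell string, and the toy data are all hypothetical. Because `sub()` replaces only the first match on each call, values are consumed in order, and `t()` on a data frame produces a matrix whose column-major order matches reading the original table row by row.

```r
# Hypothetical helper: fill placeholders in a string, one value per placeholder, in order
fill_shell_text <- function(values, shell, placeholder = "AAAA") {
  for (v in as.character(values)) {
    # fixed = TRUE treats both the placeholder and the value literally (no regex)
    shell <- sub(placeholder, v, shell, fixed = TRUE)
  }
  shell
}

# Toy 2 x 2 shell, read left to right, top to bottom
shell <- "Arm A | AAAA | AAAA\nArm B | AAAA | AAAA"

# Toy results: one row per arm, all columns numeric
toy <- data.frame(n = c(10, 12), pct = c(45.5, 54.5))

# t() makes the values come out row by row: 10, 45.5, 12, 54.5
to_fill <- t(toy)
filled <- fill_shell_text(to_fill, shell)
cat(filled, "\n")
```

If the shell has more placeholders than values, the leftover AAAA cells stay visible in the output, which makes a mismatch easy to spot.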

Data Import

We now need to load each CRF into a separate data frame. Keep in mind that not all trials have these CRFs, so add or remove imports to match the specific project.

crms_report = 
  read_excel(
    paste0(data_dir,"JXXXX Statistician Report.xlsx"), 
    sheet = "Sheet1",
    na = "NULL"
    )

data_demographics = 
  read_excel(
    paste0(data_dir,"JXXXX Demographics.xlsx"), 
    sheet = "XXX"
    )


data_adverse_events = 
  read_excel(
    paste0(data_dir,"JXXXX AE Log.xlsx"), 
    sheet = "XXX"
    )


data_enrollment_randomization = 
  read_excel(
    paste0(data_dir,"JXXXX Randomization.xlsx"), 
    sheet = "XXX"
    )


data_drugA_admin = 
  read_excel(
    paste0(data_dir,"JXXXX Drug A Administration.xlsx"),
    sheet = "XXX"
    )


data_drugB_admin = 
  read_excel(
    paste0(data_dir,"JXXXX Drug B Administration.xlsx"),
    sheet = "XXX"
    )


data_end_of_treatment = read_excel(
    paste0(data_dir,"JXXXX EOT.xlsx"),
    sheet = "XXX"
    )

data_end_of_study = read_excel(
    paste0(data_dir,"JXXXX Death_EOS.xlsx"),
    sheet = "XXX"
    )

data_follow_up = read_excel(
    paste0(data_dir,"JXXXX Follow Up.xlsx"),
    sheet = "XXX"
    )

data_deviations = read_excel(
  paste0(data_dir,"JXXXX Deviation Log.xlsx"),
  sheet = "XXX"
)

data_surgery = read_excel(
  paste0(data_dir,"JXXXX Surgery.xlsx"),
  sheet = "XXX"
)
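
The reminder above to always check date data matters most when a date column arrives as Excel serial numbers instead of dates. The sketch below shows the conversion with the Excel origin mentioned in the `as.date` comment; the serial values and the plausibility check are made up for illustration.

```r
# Hypothetical example: a date column that arrived as Excel serial numbers
raw_serial <- c(44927, 44958)

# Excel (1900 date system) counts days from 1899-12-30
excel_dates <- as.Date(raw_serial, origin = "1899-12-30")
print(excel_dates)

# Simple plausibility check against the data lock date (assumed value)
data_lock_date <- as.Date("2023-03-08")
stopifnot(all(excel_dates <= data_lock_date))
```

A date after the data lock date, or decades in the past, usually points to a wrong origin or a typo in the CRF.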

Data Wrangling

Wrangle and label the datasets based on the code book. Sometimes the same variable has a different format in different CRFs.

Special case example: 1 patient hasn’t been included in the randomization CRF since they were assigned directly to one arm instead of being randomized.

data_demographics_wrg = 
  data_demographics %>%
  mutate(
    Ethnicity = factor(
      Ethnicity, 
      levels = c("Hispanic or Latino","Not Hispanic or Latino","Declined to Answer","Unknown"), 
      labels = c("Hispanic or Latino","Not Hispanic or Latino","Declined to Answer","Unknown")
      ),
    Race = factor(
      Race,
      levels = c("White","Black","Asian","Other","Unknown"),
      labels = c("White","Black","Asian","Other","Unknown")),
    Gender = factor(
      Gender,
      levels = c("Male","Female"),
      labels = c("Male","Female")
      )
    )

data_enrollment_randomization_wrg = 
  data_enrollment_randomization %>%
  add_row(`Subject ID` = "JXXXX-XX", `Randomization Assignment` = "Arm X", `Complete?` = "Complete") %>% 
  mutate(`Randomization Assignment` = factor(`Randomization Assignment`))

ENROLLMENT STATUS

Summarize the current enrollment status. Make sure you confirm the Projected Closure Date with the research coordinator. The expected accrual rate is the accrual target divided by the planned enrollment duration in months. The actual accrual rate is the current accrual divided by the time in months from the study activation date to the data cutoff date. Confirm with the team whether they want the table separated by arm.

The starting date is usually either the open-to-enroll date (OnCore->Institution->Status Date) or the date the first patient was enrolled, depending on the study. If the study took a long time to enroll its first patient, use the open-to-enroll date. If the first patient was enrolled soon after opening, either date is fine.

Figure: Sample Enrollment Status Table Shell

total_expected_enroll = 30
total_expected_evaluable = 30 
total_expected_duration_month = 50 #Edit

expected_accrual_rate = round(total_expected_enroll/total_expected_duration_month, 2) 
study_activation_date = as.date("20YY-MM-DD") #Edit, usually the date of first patient enrolled into study OR open to enroll date, depending on the specific study, if the study took too long to enroll the first patient, use the open to enroll date.
first_accrual_date = "20YY-MM-DD"

projected_closure_date = "June 2025" #Edit

actual_enroll = sum(crms_report$`Enrollment Status` %in% c("Enrolled","Off Study","Follow Up","Eligible")) ## eligible = 004
actual_duration_month = as.numeric(difftime(time1 = data_lock_date, time2 = study_activation_date, units = "days"))/(365/12)
actual_accrual_rate = round(actual_enroll/actual_duration_month, 2)


## using enrollment report from CRMS

header = c("Accrual Target (# patients)",
           "Current Accrual (# patients)", 
           "Expected Accrual Rate (# patients/month)",
           "Actual Accrual Rate (# patients/month)",
           "Date of First Patient Accrual",
           "Projected Closure Date (enrollment completion)")

## This is the table that will be extracted
number = c(total_expected_enroll, 
           actual_enroll, 
           expected_accrual_rate, 
           actual_accrual_rate, 
           first_accrual_date,
           projected_closure_date)

accrual_table = cbind(header, number)

## Take a look at R studio

kable(accrual_table, knit_format, booktabs=T, longtable =T,
      col.names = c(" ", "Enrolled")) %>%
  row_spec(0, bold = TRUE) %>%
  column_spec(1, bold = TRUE) %>%
  kable_classic(full_width = F)



## rtf
to_fill <- t(number)
tk_exp_rst(to_fill, shell_f = "JXXXX_shell_accrual.rtf", rst_f = "JXXXX_rst_accrual.rtf")
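
As a worked example of the two accrual rates, here is a small sketch with made-up numbers. The dates and counts are assumptions for illustration only, and months are approximated as 365/12 days, as in the code above.

```r
# All values below are made up for illustration
data_lock_date        <- as.Date("2023-03-08")
study_activation_date <- as.Date("2021-03-08")
actual_enroll         <- 18
total_expected_enroll <- 30
total_expected_duration_month <- 50

# Months elapsed, using an average month length of 365/12 days
actual_duration_month <- as.numeric(
  difftime(data_lock_date, study_activation_date, units = "days")
) / (365 / 12)

# Expected: target / planned duration; actual: enrolled so far / months elapsed
expected_accrual_rate <- round(total_expected_enroll / total_expected_duration_month, 2)
actual_accrual_rate   <- round(actual_enroll / actual_duration_month, 2)

c(expected = expected_accrual_rate, actual = actual_accrual_rate)
```

With these numbers, 730 days correspond to 24 months, so the expected rate is 0.6 and the actual rate is 0.75 patients per month.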

EXCLUSIONS

The Patient Inclusion and Exclusion table reflects the screening steps. After providing consent, patients will undergo screening. Some might withdraw their consent before being enrolled in the trial. Withdrawals after enrollment aren’t included in this table.

After signing consent, patients will receive a study subject ID. However, there are instances where the study team may not record this information. Therefore, we need to confirm with the study coordinator how many patients failed screening.

Figure: Sample Exclusions Table Shell

data_exclusions = left_join(
  crms_report, 
  data_enrollment_randomization_wrg[,c("Subject ID", "Randomization Assignment")],
  by = c("Subject #" = "Subject ID")
  )

patient_consented = 
  data_exclusions %>%
  filter(!(is.na(`Consent Date`))) %>%
  nrow()

patient_screen_failure = 
  data_exclusions %>% 
  filter(!(is.na(`Consent Date`))) %>%
  filter(`Enrollment Status` == "Not eligible") %>% 
  nrow()


patient_withdrawn_consent = data_exclusions %>% 
  filter(!(is.na(`Consent Date`))) %>%
  filter(is.na(`On Study`)) %>%
  filter(`Off Study Reason` == "Patient withdrew consent") %>% 
  nrow()

patient_in_screening = data_exclusions %>% 
  filter(!(is.na(`Consent Date`))) %>%
  filter(`Enrollment Status` == "Candidate") %>% 
  nrow()
  

ArmA_enrolled = data_exclusions %>%
  filter(`Randomization Assignment` == "Arm A") %>%
  filter(`Enrollment Status` %in% c("Enrolled","Off Study", "Follow Up", "Eligible")) %>%
  nrow()

ArmB_enrolled = data_exclusions %>%
  filter(`Randomization Assignment` == "Arm B") %>%
  filter(`Enrollment Status` %in% c("Enrolled","Off Study", "Follow Up", "Eligible")) %>%
  nrow()


header = c("Consented",
           "Screen Failures",
           "Withdrawn Consent",
           "Currently in Screening",
           "Enrolled")

number = c(patient_consented, 
           patient_screen_failure, 
           patient_withdrawn_consent,
           patient_in_screening,
           paste(ArmA_enrolled, ArmB_enrolled))

exclusion_table = cbind(header, number)

## Take a look at the table in R markdown
kable(exclusion_table, knit_format, booktabs=T, longtable =T,
      col.names = c(" ", "Number of Patients")) %>%
  row_spec(0, bold = TRUE) %>%
  column_spec(1, bold = TRUE) %>%
  kable_classic(full_width = F) 

## rtf

to_fill <- t(c(patient_consented, patient_screen_failure, patient_withdrawn_consent, patient_in_screening, ArmA_enrolled, ArmB_enrolled))
 tk_exp_rst(to_fill, shell_f = "JXXXX_shell_exclusions.rtf", rst_f = "JXXXX_rst_exclusions.rtf")
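
One possible sanity check before sending the table to the study coordinator is to see whether the categories add up to the number of consented patients. This is only a sketch: it assumes the categories are mutually exclusive and cover every consented patient, which may not hold for a given study, and the counts are toy values.

```r
# Toy counts (hypothetical); in the report these come from the code above
patient_consented         <- 26
patient_screen_failure    <- 3
patient_withdrawn_consent <- 1
patient_in_screening      <- 2
ArmA_enrolled             <- 10
ArmB_enrolled             <- 9

accounted_for <- patient_screen_failure + patient_withdrawn_consent +
  patient_in_screening + ArmA_enrolled + ArmB_enrolled

# A non-zero gap is a prompt to ask the study coordinator, not proof of an error
unaccounted <- patient_consented - accounted_for
if (unaccounted != 0) {
  message("Consented patients not accounted for: ", unaccounted)
}
```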

PATIENT DEMOGRAPHICS

You won’t need to create a .rtf file for this table. Just open the ’table1’ object in your browser, and you can directly copy and paste the table into a Word document.

data_demographics_wrg = left_join(
  data_demographics_wrg,
  data_enrollment_randomization_wrg[,c("Subject ID","Randomization Assignment")],
  by = "Subject ID"
  )

## generate table1

units(data_demographics_wrg$`Age On Study`) <- "years"
demo = table1(~ `Age On Study` + Race + Gender + Ethnicity | `Randomization Assignment`, 
       data = data_demographics_wrg, overall = "Total")
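
Before trusting the joined table, it can help to check the join keys. The base R sketch below uses toy data with hypothetical subject IDs. It looks for IDs that appear more than once in the randomization form, which would duplicate demographic rows after a left join, and for IDs with demographics but no arm assignment.

```r
# Toy data (hypothetical IDs); one subject appears twice in the randomization form
demo <- data.frame(`Subject ID` = c("P-01", "P-02", "P-03"),
                   Gender = c("Male", "Female", "Female"),
                   check.names = FALSE)
rand <- data.frame(`Subject ID` = c("P-01", "P-02", "P-02"),
                   `Randomization Assignment` = c("Arm A", "Arm B", "Arm B"),
                   check.names = FALSE)

# IDs that would duplicate demographic rows after a left join
dup_ids <- unique(rand$`Subject ID`[duplicated(rand$`Subject ID`)])

# IDs with demographics but no arm assignment
no_arm_ids <- setdiff(demo$`Subject ID`, rand$`Subject ID`)

dup_ids
no_arm_ids
```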

DATA COMPLETENESS

Data completeness can be challenging to assess. Sometimes data entry personnel might not label a form as ‘Incomplete’, or they may add empty incomplete forms as placeholders once they have the planned treatment schedule for the patient. Additionally, it’s hard for us to determine whether a form is missing entirely, since we can’t always tell whether a form should have been present.

The best approach is to work closely with the study coordinator on the data completeness table. Add or delete entries in the header and CRF list below as necessary.

Figure: Sample Completeness Table Shell

header = c("Adverse Events",
           "Demographics",
           "Enrollment/Randomization",
           "Drug A Administration",
           "Drug B Administration",
           "End of Treatment",
           "Deviations",
           "End of Study",
           "Follow Up",
           "Surgery")

crf_list = list(data_adverse_events, 
                data_demographics, 
                data_enrollment_randomization_wrg, 
                data_drugA_admin,
                data_drugB_admin, 
                data_end_of_treatment, 
                data_deviations,
                data_end_of_study,
                data_follow_up,
                data_surgery)

crf_df = data.frame(crf = NA, 
                    unverified = NA,
                    complete = NA,
                    incomplete = NA, 
                    n = NA)

for (i in seq_along(crf_list)) {

  H = header[i]
  CRF = crf_list[[i]]
  
  if (is.numeric(CRF)) {
    
    crf_df[i,1] = H
    crf_df[i,2] = 0
    crf_df[i,3] = 0
    crf_df[i,4] = 0
    crf_df[i,5] = 0
  
  } else {
  
  # print(H)
  NT = CRF$`Complete?`
  N = length(NT)
  
  crf_df[i,1] = H
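  ## na.rm = TRUE so that forms with a blank status do not turn the counts into NA
  crf_df[i,2] = sum(NT == "Unverified", na.rm = TRUE)
  crf_df[i,3] = sum(NT == "Complete", na.rm = TRUE)
  crf_df[i,4] = sum(NT == "Incomplete", na.rm = TRUE)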
  crf_df[i,5] = N
  
  }
  
}


crf_df2 = crf_df %>%
  janitor::adorn_totals() %>%
  mutate(percent_incomplete = round(incomplete/n*100,1),
         percent_incomplete = replace(percent_incomplete, is.na(percent_incomplete), "N/A")) %>%
  select(crf, incomplete, n, percent_incomplete)

kable(crf_df2, knit_format, booktabs=T, longtable =T,
      col.names = c("Case Report Form", "Number of Incomplete forms", "Number of Expected Forms","Percent of Incomplete Forms")) %>%
  row_spec(0, bold = TRUE) %>%
  column_spec(1, bold = TRUE) %>%
  kable_classic(full_width = F)


## rtf
to_fill <- t(crf_df2)
# tk_exp_rst(to_fill, shell_f = "JXXXX_shell_completeness.rtf", rst_f = "JXXXX_rst_completeness.rtf")
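
Blank statuses deserve a separate look, because comparing a column that contains `NA` against a status string yields `NA`. The sketch below uses a toy status vector (hypothetical values) to show one way to count statuses while keeping the blanks visible.

```r
# Toy status vector (hypothetical); NA stands for a form with no status entered
status <- c("Complete", "Incomplete", NA, "Unverified", "Complete")

# Fixed levels keep zero counts; useNA = "ifany" surfaces blank statuses
status_counts <- table(
  factor(status, levels = c("Unverified", "Complete", "Incomplete")),
  useNA = "ifany"
)
status_counts

# Without na.rm = TRUE this sum would be NA because of the blank status
n_incomplete   <- sum(status == "Incomplete", na.rm = TRUE)
pct_incomplete <- round(n_incomplete / length(status) * 100, 1)
```

Whether a blank status should count as incomplete is a question for the study coordinator; this sketch only makes the blanks visible.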

Adverse Event

Add this sentence at the beginning of the AE section: Each patient is counted only once, at the maximum grade, when multiple instances of the same AE are recorded.

We typically generate 4-5 AE tables: All AEs; Treatment-Related AEs; Serious AEs (SAEs); and Unacceptable AEs. It’s important to review these for any typos or misclassifications. For instance, an AE with a grade of 3 or higher could potentially be classified as a serious AE, an unacceptable AE, or both.

Below is a sample AE table. N is the number of patients who are evaluable for AEs, often defined as patients who have received at least one dose of study treatment. Usually we include only Grade 3 or higher events in the SAE and unacceptable AE tables.

Because the toxicity analysis population usually includes only patients who received at least one dose of study treatment, baseline AEs need attention. These are AEs that occur after a patient signs consent but before receiving any study intervention. We’ll need to check the AE start/onset date to determine which AEs should be included.
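
A minimal base R sketch of the onset-date check is shown here. The column names, the separate first-dose table, and all dates are hypothetical; in practice the first dose date would come from the drug administration CRFs.

```r
# Toy data (hypothetical column names and dates)
ae <- data.frame(
  subject    = c("P-01", "P-01", "P-02"),
  term       = c("Nausea", "Fatigue", "Anemia"),
  start_date = as.Date(c("2023-01-05", "2023-02-10", "2023-01-20"))
)
first_dose <- data.frame(
  subject         = c("P-01", "P-02"),
  first_dose_date = as.Date(c("2023-01-15", "2023-01-18"))
)

ae <- merge(ae, first_dose, by = "subject", all.x = TRUE)

# Flag AEs that started before the first dose (baseline AEs)
ae$baseline <- ae$start_date < ae$first_dose_date

# Keep AEs starting on or after the first dose; which() also drops rows
# with no dose date, i.e. patients who never received study treatment
ae_on_treatment <- ae[which(!ae$baseline), ]
```

Whether an AE that starts on the day of the first dose counts as baseline is a protocol decision, so the comparison operator may need to change.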

Figure: Sample AE Table Shell

data_adverse_events_wrg = 
  data_adverse_events %>%
  left_join(data_enrollment_randomization_wrg[,c("Subject ID", "Randomization Assignment")], by = c("Subject ID" = "Subject ID")) %>%
  mutate(
    ae_ctcae_term = `Adverse Event - CTCAE Term`,
    `Adverse Event` = `Adverse Event - Verbatim`,
    related_to_study_drug = case_when(
      `AE related to Study Drug A?` %in% c("Yes","yes") ~ 1,
      `AE related to Study Drug B?` %in% c("Yes","yes") ~ 1,
      TRUE ~ 0)
    ) %>%
  relocate(ae_ctcae_term, .before = Grade)

## Add code to summarize adverse events for double-check

ae_all =
  data_adverse_events_wrg %>%
  select(`Subject ID`, `Adverse Event - Verbatim`, ae_ctcae_term,
         `Grade`, `Start Date`, `Stop Date`, `AE related to Study Drug A?`, `AE related to Study Drug B?`)

ae_sae =
  data_adverse_events_wrg %>%
  filter(`Serious?` %in% c("Yes","yes")) %>%
  select(`Subject ID`, `Adverse Event - Verbatim`, ae_ctcae_term,
         `Grade`, `Start Date`, `Stop Date`, `AE related to Study Drug A?`, `AE related to Study Drug B?`)

ae_unacceptable =
  data_adverse_events_wrg %>%
  filter(`Unacceptable Toxicity?` %in% c("Yes","yes")) %>%
  select(`Subject ID`,`Randomization Assignment`, `Adverse Event - Verbatim`,ae_ctcae_term, `Grade`, `AE related to Study Drug A?`, `AE related to Study Drug B?`)


ae_attributed = 
  data_adverse_events_wrg %>%
  filter(`AE related to Study Drug A?` %in% c("Yes","yes") | `AE related to Study Drug B?`  %in% c("Yes","yes"))
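
Typos in the CTCAE term split one AE into several table rows. One way to spot candidates is to compare terms after normalizing case and whitespace, as in this sketch with made-up terms. It only catches spellings that differ by case or spacing, not genuine misspellings.

```r
# Toy terms (hypothetical); some spellings differ only by case or spacing
terms <- c("Nausea", "nausea ", "Fatigue", "Anemia", "ANEMIA")

# Normalize for comparison only; keep the original spelling for the report
normalized <- tolower(trimws(terms))

# Number of distinct raw spellings behind each normalized term
variants <- tapply(terms, normalized, function(x) length(unique(x)))

# Normalized terms that map to more than one raw spelling
suspect_terms <- names(variants)[variants > 1]
suspect_terms
```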

All AEs

## Named summary_table because the code below refers to it by that name
summary_table =
  data_adverse_events_wrg %>%
  group_by(`Subject ID`, ae_ctcae_term) %>%
  summarise(
    type = `Randomization Assignment`[1],
    max_grade = factor(max(Grade), levels = c(1,2,3,4,5), labels = c(1,2,3,4,5)),
    max_grade_text = factor(paste("Grade", max_grade), levels = paste("Grade", 1:5)),
    .groups = "drop"
    )


## This is for your own reference to double-check the data.
label(summary_table$ae_ctcae_term) = "AE CTCAE Terminology"
table1( ~ ae_ctcae_term | type * max_grade_text, data = summary_table, overall = FALSE, droplevels = TRUE)

namesA = c("CTCAE","grade1A","grade2A","grade3A","grade4A","grade5A","totalA")
namesB= c("CTCAE","grade1B","grade2B","grade3B","grade4B","grade5B","totalB")

summary_A = summary_table %>% 
  filter(type == "Arm A")

summary_B = summary_table %>% 
  filter(type == "Arm B")

tableA =
  table(summary_A$ae_ctcae_term,
        summary_A$max_grade_text) %>%
  as.data.frame.matrix() %>%
  tibble::rownames_to_column(., "AE CTCAE Terminology") %>%
  mutate(`Total` = `Grade 1` + `Grade 2` + `Grade 3`+ `Grade 4`+ `Grade 5`)
names(tableA) = namesA

tableB =
  table(summary_B$ae_ctcae_term,
        summary_B$max_grade_text) %>%
  as.data.frame.matrix() %>%
  tibble::rownames_to_column(., "AE CTCAE Terminology") %>%
  mutate(`Total` = `Grade 1` + `Grade 2` + `Grade 3`+ `Grade 4`+ `Grade 5`)
names(tableB) = namesB

all_ae_df = merge(tableA, tableB, by = "CTCAE", all=T) %>% 
  relocate("CTCAE","grade1A","grade1B","grade2A","grade2B","grade3A","grade3B","grade4A","grade4B","grade5A","grade5B","totalA","totalB")
all_ae_df[is.na(all_ae_df)] = 0

to_fill <- t(all_ae_df)
tk_exp_rst(to_fill, shell_f = "JXXXX_shell_AE_summary.rtf", rst_f = "JXXXX_rst_AE_summary.rtf")
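
The counting rule, one patient per AE term at the maximum grade, can be checked on a toy log. The sketch below uses base R `aggregate()` instead of the dplyr pipeline above; the data and column names are hypothetical, and it assumes the grade is stored as a number.

```r
# Toy AE log (hypothetical): P-01 has nausea recorded twice
ae <- data.frame(
  subject = c("P-01", "P-01", "P-02", "P-02"),
  term    = c("Nausea", "Nausea", "Nausea", "Fatigue"),
  grade   = c(1, 3, 2, 1)
)

# One row per subject and term, keeping the maximum grade
max_grade <- aggregate(grade ~ subject + term, data = ae, FUN = max)

# Patients per term and max grade; fixed levels keep empty grade columns
counts <- table(max_grade$term,
                factor(max_grade$grade, levels = 1:5,
                       labels = paste("Grade", 1:5)))
counts
```

P-01 contributes to the Grade 3 column for nausea only, not to Grade 1, which is the behavior the AE tables rely on.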

Treatment-Related AEs

summary_table_related =
  data_adverse_events_wrg %>%
  filter(related_to_study_drug == 1) %>%
  group_by(`Subject ID`, ae_ctcae_term) %>%
  summarise(
    type = `Randomization Assignment`[1],
    max_grade = factor(max(Grade), levels = c(1,2,3,4,5), labels = c(1,2,3,4,5)),
    max_grade_text = factor(paste("Grade", max_grade), levels = paste("Grade", rep(1:5)))
    )

label(summary_table_related$ae_ctcae_term) = "AE CTCAE Terminology"

## This is for your own reference to double-check the data.
table1( ~ ae_ctcae_term | type * max_grade_text, data = summary_table_related, overall = FALSE, droplevels = TRUE)

summary_A = summary_table_related %>% 
  filter(type == "Arm A")

summary_B = summary_table_related %>% 
  filter(type == "Arm B")

tableA =
  table(summary_A$ae_ctcae_term,
        summary_A$max_grade_text) %>%
  as.data.frame.matrix() %>%
  tibble::rownames_to_column(., "AE CTCAE Terminology") %>%
  mutate(`Total` = `Grade 1` + `Grade 2` + `Grade 3`+ `Grade 4`+ `Grade 5`)
names(tableA) = namesA

tableB =
  table(summary_B$ae_ctcae_term,
        summary_B$max_grade_text) %>%
  as.data.frame.matrix() %>%
  tibble::rownames_to_column(., "AE CTCAE Terminology") %>%
  mutate(`Total` = `Grade 1` + `Grade 2` + `Grade 3`+ `Grade 4`+ `Grade 5`)
names(tableB) = namesB

related_ae_df = merge(tableA, tableB, by = "CTCAE", all=T) %>% 
  relocate("CTCAE","grade1A","grade1B","grade2A","grade2B","grade3A","grade3B","grade4A","grade4B","grade5A","grade5B","totalA","totalB")
related_ae_df[is.na(related_ae_df)] = 0

to_fill <- t(related_ae_df)
tk_exp_rst(to_fill, shell_f = "JXXXX_shell_related_AE_summary.rtf", rst_f = "JXXXX_rst_related_AE_summary.rtf")

Serious AEs

summary_table_sae = 
  data_adverse_events_wrg %>%
  filter(`Serious?` %in% c("Yes","yes")) %>% 
  group_by(`Subject ID`, ae_ctcae_term) %>%
  summarise(
    type = `Randomization Assignment`[1],
    max_grade = factor(max(Grade), levels = c(1,2,3,4,5), labels = c(1,2,3,4,5)),
    max_grade_text = factor(paste("Grade", max_grade), levels = paste("Grade", rep(1:5)))
    )

summary_A = summary_table_sae %>% 
  filter(type == "Arm A")

summary_B = summary_table_sae %>% 
  filter(type == "Arm B")

tableA =
  table(summary_A$ae_ctcae_term,
        summary_A$max_grade_text) %>%
  as.data.frame.matrix() %>%
  tibble::rownames_to_column(., "AE CTCAE Terminology") %>%
  mutate(`Total` = `Grade 1` + `Grade 2` + `Grade 3`+ `Grade 4`+ `Grade 5`)
names(tableA) = namesA

tableB =
  table(summary_B$ae_ctcae_term,
        summary_B$max_grade_text) %>%
  as.data.frame.matrix() %>%
  tibble::rownames_to_column(., "AE CTCAE Terminology") %>%
  mutate(`Total` = `Grade 1` + `Grade 2` + `Grade 3`+ `Grade 4`+ `Grade 5`)
names(tableB) = namesB

sae_df = merge(tableA, tableB, by = "CTCAE", all=T) %>% 
  relocate("CTCAE","grade1A","grade1B","grade2A","grade2B","grade3A","grade3B","grade4A","grade4B","grade5A","grade5B","totalA","totalB") %>% 
  select("CTCAE","grade3A","grade3B","grade4A","grade4B","grade5A","grade5B","totalA","totalB")
sae_df[is.na(sae_df)] = 0

to_fill <- t(sae_df)
tk_exp_rst(to_fill, shell_f = "JXXXX_shell_SAE_summary.rtf", rst_f = "JXXXX_rst_SAE_summary.rtf")

Unacceptable AEs

summary_table_unacceptable = 
  data_adverse_events_wrg %>%
  filter(`Unacceptable Toxicity?` %in% c("Yes","yes")) %>% 
  group_by(`Subject ID`, ae_ctcae_term) %>%
  summarise(
    type = `Randomization Assignment`[1],
    max_grade = factor(max(Grade), levels = c(1,2,3,4,5), labels = c(1,2,3,4,5)),
    max_grade_text = factor(paste("Grade", max_grade), levels = paste("Grade", rep(1:5)))
    )

summary_A = summary_table_unacceptable %>% 
  filter(type == "Arm A")

summary_B = summary_table_unacceptable %>% 
  filter(type == "Arm B")

tableA =
  table(summary_A$ae_ctcae_term,
        summary_A$max_grade_text) %>%
  as.data.frame.matrix() %>%
  tibble::rownames_to_column(., "AE CTCAE Terminology") %>%
  mutate(`Total` = `Grade 1` + `Grade 2` + `Grade 3`+ `Grade 4`+ `Grade 5`)
names(tableA) = namesA

tableB =
  table(summary_B$ae_ctcae_term,
        summary_B$max_grade_text) %>%
  as.data.frame.matrix() %>%
  tibble::rownames_to_column(., "AE CTCAE Terminology") %>%
  mutate(`Total` = `Grade 1` + `Grade 2` + `Grade 3`+ `Grade 4`+ `Grade 5`)
names(tableB) = namesB

unacceptable_ae_df = merge(tableA, tableB, by = "CTCAE", all=T) %>% 
  relocate("CTCAE","grade1A","grade1B","grade2A","grade2B","grade3A","grade3B","grade4A","grade4B","grade5A","grade5B","totalA","totalB") %>% 
  select("CTCAE","grade3A","grade3B","grade4A","grade4B","grade5A","grade5B","totalA","totalB")
unacceptable_ae_df[is.na(unacceptable_ae_df)] = 0

to_fill <- t(unacceptable_ae_df)
tk_exp_rst(to_fill, shell_f = "JXXXX_shell_unacceptable_AE_summary.rtf", rst_f = "JXXXX_rst_unacceptable_AE_summary.rtf")

Deviations

Most of the time the deviation tables will be provided by the research coordinator. The current requirement is to place major deviations here and to add the complete deviation table as an appendix.

Figure: Sample Deviation Table Shell

data_deviations_wrg = data_deviations %>% 
  select(`Subject ID`,`Deviation severity`,`Description of Deviation`, `Date of Deviation`,`Corrective Action`) %>% 
  mutate(repeat_instance=1, .after = `Subject ID`)

data_deviations_wrg_after_last_dsmb = data_deviations %>% 
  filter(`Date of Deviation` > as.date("2022-09-14")) %>% ## Edit the date to be last DSMB meeting data cut-off date.
  select(`Subject ID`,`Deviation severity`,`Description of Deviation`, `Date of Deviation`,`Corrective Action`) %>% 
  mutate(repeat_instance=1, .after = `Subject ID`)

to_fill = t(data_deviations_wrg)
tk_exp_rst(to_fill, shell_f = "JXXXX_shell_deviations.rtf", rst_f = "JXXXX_rst_deviations.rtf")

to_fill = t(data_deviations_wrg_after_last_dsmb)
tk_exp_rst(to_fill, shell_f = "JXXXX_shell_deviations_after_last_DSMB.rtf", rst_f = "JXXXX_rst_deviations_after_last_DSMB.rtf")
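
The date filter is easy to get wrong at the boundary, so here is a small base R sketch on toy data. The column names, the severity labels, and the cutoff are assumptions for illustration.

```r
# Toy deviation log (hypothetical)
dev <- data.frame(
  subject  = c("P-01", "P-02", "P-03"),
  severity = c("Major", "Minor", "Major"),
  date     = as.Date(c("2022-08-01", "2022-09-14", "2022-11-30"))
)
last_cutoff <- as.Date("2022-09-14")  # previous DSMB data cutoff (assumed)

# Strictly after the cutoff: a deviation dated on the cutoff day is excluded
dev_new <- dev[dev$date > last_cutoff, ]

# Major deviations for the main report body
dev_new_major <- dev_new[dev_new$severity == "Major", ]
```

If deviations dated on the cutoff day were not covered by the previous report, `>=` would be the comparison to use.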

Off Treatment Reasons

N is the total sample size in each arm. The bold row is the total number of off-treatment (EoT) patients, with the percentage calculated as #EoT pts / all pts in the arm. The rows below give the number of patients for each EoT reason, with the percentage calculated as #pts with that reason / #EoT pts in the arm.

Figure: Sample End of Treatment Table Shell

data_off_treatment = data_end_of_treatment %>% 
  filter(`Is patient off treatment?` %in% c("Yes","yes")) %>% 
  left_join(data_enrollment_randomization_wrg[,c("Subject ID", "Randomization Assignment")], by = c("Subject ID" = "Subject ID")) %>% 
  select(`Reason for study treatment discontinuation`,`Randomization Assignment`) %>% 
  mutate(
    `Reason for study treatment discontinuation` = str_replace(`Reason for study treatment discontinuation`, " \\s*\\([^\\)]+\\)", "")
  )

eot_table1 = table1(~ `Reason for study treatment discontinuation` | `Randomization Assignment`, data = data_off_treatment, overall = FALSE, droplevels = TRUE)

eot_table = eot_table1 %>% 
  as.data.frame(row.names = NULL)
eot_table = eot_table[-c(1,2),]


## rtf
to_fill <- t(eot_table)
tk_exp_rst(to_fill, shell_f = "JXXXX_shell_off_treatment.rtf", rst_f = "JXXXX_rst_off_treatment.rtf")
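
The two denominators are worth checking by hand on a toy example. The sketch below uses base R `sub()` with a simplified pattern in place of `str_replace`, plus made-up reasons and an assumed arm size of 10.

```r
# Toy reasons (hypothetical wording)
reasons <- c("Adverse event (specify)", "Disease progression",
             "Disease progression", "Other (specify in comments)")

# Drop a parenthetical such as " (specify)"
reasons_clean <- sub(" *\\([^)]*\\)", "", reasons)

n_arm <- 10                       # all patients in the arm (assumed)
n_eot <- length(reasons_clean)    # patients off treatment

# Bold row: off-treatment patients out of the whole arm
pct_eot <- round(n_eot / n_arm * 100, 1)

# Reason rows: each reason out of the off-treatment patients
reason_pct <- round(prop.table(table(reasons_clean)) * 100, 1)
reason_pct
```

Here 4 of 10 patients are off treatment (40%), while disease progression accounts for 2 of those 4 patients (50%).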

Session Info

sessionInfo()

## R version 4.4.0 (2024-04-24 ucrt)
## Platform: x86_64-w64-mingw32/x64
## Running under: Windows 11 x64 (build 22631)
## 
## Matrix products: default
## 
## 
## locale:
## [1] LC_COLLATE=English_United States.utf8 
## [2] LC_CTYPE=English_United States.utf8   
## [3] LC_MONETARY=English_United States.utf8
## [4] LC_NUMERIC=C                          
## [5] LC_TIME=English_United States.utf8    
## 
## time zone: America/New_York
## tzcode source: internal
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## loaded via a namespace (and not attached):
##  [1] digest_0.6.33     R6_2.5.1          bookdown_0.35     fastmap_1.1.1    
##  [5] xfun_0.43         blogdown_1.18     cachem_1.0.8      knitr_1.46       
##  [9] htmltools_0.5.8.1 rmarkdown_2.25    cli_3.6.1         sass_0.4.9       
## [13] jquerylib_0.1.4   compiler_4.4.0    rstudioapi_0.15.0 tools_4.4.0      
## [17] evaluate_0.22     bslib_0.5.1       yaml_2.3.6        jsonlite_1.8.7   
## [21] rlang_1.1.5
