Often the data from one column is considered the reference/baseline/comparison group and is compared to the data from the other columns.
For example, lets calculate the average age:
library(rtables)
Loading required package: magrittr
Loading required package: formatters
basic_table() %>%
split_cols_by("ARM") %>%
analyze("AGE") %>%
build_table(DM)
A: Drug X B: Placebo C: Combination
——————————————————————————————————————————————
Mean 34.91 33.02 34.57
and then the difference of the average AGE
between the arms to the placebo arm:
basic_table() %>%
split_cols_by("ARM", ref_group = "B: Placebo") %>%
analyze("AGE", afun = function(x, .ref_group) {
in_rows(
"Difference of Averages" = rcell(mean(x) - mean(.ref_group), format = "xx.xx")
)%>%
}) build_table(DM)
B: Placebo A: Drug X C: Combination
————————————————————————————————————————————————————————————————
Difference of Averages 0.00 1.89 1.55
Note that the column order has changed and the reference group is displayed in the first column.
In cases where we want cells to be blank in the reference column, (e.g., “B: Placebo”) we use non_ref_rcell
instead of rcell
, and pass .in_ref_col
as the second argument:
basic_table() %>%
split_cols_by("ARM", ref_group = "B: Placebo") %>%
analyze("AGE", afun = function(x, .ref_group, .in_ref_col){
in_rows("Difference of Averages" = non_ref_rcell(mean(x) - mean(.ref_group),
is_ref = .in_ref_col,
format = "xx.xx"))
%>%
}) build_table(DM)
B: Placebo A: Drug X C: Combination
————————————————————————————————————————————————————————————————
Difference of Averages 1.89 1.55
basic_table() %>%
split_cols_by("ARM", ref_group = "B: Placebo") %>%
analyze("AGE", afun = function(x, .ref_group, .in_ref_col){
in_rows(
"Difference of Averages" = non_ref_rcell(mean(x) - mean(.ref_group),
is_ref = .in_ref_col,
format = "xx.xx"),
"another row" = non_ref_rcell("aaa", .in_ref_col)
)%>%
}) build_table(DM)
B: Placebo A: Drug X C: Combination
————————————————————————————————————————————————————————————————
Difference of Averages 1.89 1.55
another row aaa aaa
You can see which arguments are available for afun
in the manual fro analyze
.
When adding row-splitting the reference data might be one represented by the column with or without row splitting. For example:
basic_table() %>%
split_cols_by("ARM", ref_group = "B: Placebo") %>%
add_colcounts() %>%
split_rows_by("SEX", split_fun = drop_split_levels) %>%
analyze("AGE", afun = function(x, .ref_group, .ref_full, .in_ref_col){
in_rows(
"is reference (.in_ref_col)" = rcell(.in_ref_col),
"ref cell N (.ref_group)" = rcell(length(.ref_group)),
"ref column N (.ref_full)" = rcell(length(.ref_full))
)%>%
}) build_table(subset(DM, SEX %in% c("M", "F")))
B: Placebo A: Drug X C: Combination
(N=106) (N=121) (N=129)
——————————————————————————————————————————————————————————————————————
F
is reference (.in_ref_col) TRUE FALSE FALSE
ref cell N (.ref_group) 56 56 56
ref column N (.ref_full) 106 106 106
M
is reference (.in_ref_col) TRUE FALSE FALSE
ref cell N (.ref_group) 50 50 50
ref column N (.ref_full) 106 106 106
so the data assigned to .ref_full
is the full data of the reference column where the data assigned to .ref_group
respects the subsetting defined by row-splitting and hence is from the same subset as the argument x
or df
to afun
.