This document aims to illustrate the usage of the function exactmed()
and its behavior via additional examples.
We recall that exactmed()
computes natural direct and indirect effects, and controlled direct effects for a binary outcome and a binary mediator. The user can specify the reference levels of the outcome and mediator variables using the input parameters hvalue_m
and hvalue_y
, respectively (see the function help). Controlled direct effects are obtained for both possible mediator values (\(m=0\) and \(m=1\)). Natural and controlled effects can either be unadjusted or adjusted for covariates (that is, conditional effects). Adjusted effect estimates are obtained for covariates fixed at their sample-specific mean values (default) or at specific values (user-provided). Also, by default, exactmed()
incorporates a mediator-exposure interaction term in the outcome model, which can be removed by setting interaction=FALSE
. Concerning interval estimates, exactmed()
generates, by default, \(95\%\) confidence intervals obtained by the delta method. Alternatively, bootstrap confidence intervals, instead of delta method confidence intervals, can be obtained by specifying boot=TRUE
in the function call. In this case, 1000 bootstrap datasets are generated by default.
In all the examples presented below we use the dataset datamed
, available after loading the ExactMed package. Some of the features of this dataset can be found in its corresponding help file (help(datamed)
). We recall that the exactmed()
function only works on data frames with named columns and no missing values.
> library(ExactMed)
>
> head(datamed)
X M Y C1 C21 1 1 0 0 0.3753731
2 1 0 0 1 0.1971635
3 1 0 1 1 -0.5971041
4 0 1 0 0 0.7576990
5 0 1 0 0 1.2056864
6 0 1 0 0 -0.5983204
The following command verifies whether the dataset contains any missing values:
>
> as.logical(sum(is.na(datamed)))
1] FALSE [
Suppose that one wishes to obtain mediation effects estimates for a change in exposure from \(0\) to \(1\), assuming there is no exposure-mediator interaction and using the delta method to construct \(95\%\) confidence intervals.
In this case, a valid call to exactmed()
would be:
>
> results1 <- exactmed(
+ data = datamed, a = 'X', m = 'M', y = 'Y',
+ a1 = 1, a0 = 0, interaction = FALSE
+ )
'exactmed' will compute unadjusted natural effects
>
> results1
$`Natural effects on OR scale`
2.5% 97.5% P.val
Estimate Std.error 2.04899 0.27469 1.57554 2.66472 8.7484e-08
Direct effect 0.88069 0.03097 0.82204 0.94352 0.00030218
Indirect effect 1.80452 0.23756 1.39414 2.33571 7.3256e-06
Total effect
$`Natural effects on RR scale`
2.5% 97.5% P.val
Estimate Std.error 1.30183 0.06489 1.18065 1.43543 1.2130e-07
Direct effect 0.96248 0.01014 0.94281 0.98257 0.00028454
Indirect effect 1.25298 0.06379 1.13399 1.38446 9.4328e-06
Total effect
$`Natural effects on RD scale`
2.5% 97.5% P.val
Estimate Std.error 0.16514 0.03021 0.10593 0.22435 4.5850e-08
Direct effect -0.02672 0.00732 -0.04107 -0.01238 0.00026137
Indirect effect 0.13842 0.03048 0.07868 0.19815 5.5775e-06
Total effect
$`Controlled direct effects (m=0)`
2.5% 97.5% P.val
Estimate Std.error 2.07495 0.28607 1.58363 2.71870 1.1937e-07
OR scale 1.26843 0.05652 1.16234 1.38420 9.5079e-08
RR scale 0.15878 0.02892 0.10210 0.21546 4.0057e-08
RD scale
$`Controlled direct effects (m=1)`
2.5% 97.5% P.val
Estimate Std.error 2.07495 0.28607 1.58363 2.71870 1.1937e-07
OR scale 1.40138 0.09725 1.22317 1.60555 1.1566e-06
RR scale 0.17947 0.03343 0.11395 0.24499 7.9365e-08 RD scale
Mediation effects estimates adjusted for covariates are obtained through the use of the character vectors m_cov
and y_cov
, which contain the names of the covariates to be adjusted for in the mediator and outcome models, respectively. The following call to exactmed()
incorporates covariates C1
and C2
in both the mediator and outcome models:
>
> results2 <- exactmed(
+ data = datamed, a = 'X', m = 'M', y = 'Y', a1 = 1, a0 = 0,
+ m_cov = c('C1', 'C2'), y_cov = c('C1', 'C2'),
+ interaction = FALSE
+ )
>
> results2
$`Natural effects on OR scale`
2.5% 97.5% P.val
Estimate Std.error 1.91812 0.27925 1.44197 2.55150 7.6751e-06
Direct effect 0.99263 0.05626 0.88826 1.10927 0.89619
Indirect effect 1.90399 0.26714 1.44622 2.50664 4.4407e-06
Total effect
$`Natural effects on RR scale`
2.5% 97.5% P.val
Estimate Std.error 1.27053 0.06748 1.14492 1.40991 6.5375e-06
Direct effect 0.99782 0.01667 0.96568 1.03103 0.89593
Indirect effect 1.26776 0.06670 1.14354 1.40547 6.5043e-06
Total effect
$`Natural effects on RD scale`
2.5% 97.5% P.val
Estimate Std.error 0.15019 0.03272 0.08605 0.21432 4.4402e-06
Direct effect -0.00154 0.01178 -0.02463 0.02155 0.89604
Indirect effect 0.14865 0.03195 0.08602 0.21127 3.2871e-06
Total effect
$`Controlled direct effects (m=0)`
2.5% 97.5% P.val
Estimate Std.error 1.91814 0.27936 1.44182 2.55183 7.7378e-06
OR scale 1.26961 0.06625 1.14619 1.40633 4.7657e-06
RR scale 0.15000 0.03236 0.08657 0.21342 3.5640e-06
RD scale
$`Controlled direct effects (m=1)`
2.5% 97.5% P.val
Estimate Std.error 1.91814 0.27936 1.44182 2.55183 7.7378e-06
OR scale 1.27393 0.07778 1.13025 1.43587 7.3286e-05
RR scale 0.15087 0.03454 0.08318 0.21857 1.2534e-05 RD scale
exactmed()
also allows for the specification of two different sets of covariates in the mediator and outcome models. For example, the following specification of m_cov
and y_cov
means that the mediator model is adjusted for C1
and C2
, while the outcome model is adjusted for C1
only.
However, we advise against this practice unless it is known that excluded covariates are independent of the dependent variable (mediator or outcome) being modeled given the rest of covariates.
>
> results3 <- exactmed(
+ data = datamed, a = 'X', m = 'M', y = 'Y', a1 = 1, a0 = 0,
+ m_cov = c('C1', 'C2'), y_cov = c('C1'),
+ interaction = FALSE
+ )
>
> results3
$`Natural effects on OR scale`
2.5% 97.5% P.val
Estimate Std.error 2.07747 0.29467 1.57326 2.74328 2.5397e-07
Direct effect 0.88888 0.04509 0.80476 0.98180 0.020229
Indirect effect 1.84663 0.25710 1.40563 2.42600 1.0554e-05
Total effect
$`Natural effects on RR scale`
2.5% 97.5% P.val
Estimate Std.error 1.29629 0.06554 1.17400 1.43133 2.8560e-07
Direct effect 0.96677 0.01378 0.94013 0.99416 0.017754
Indirect effect 1.25321 0.06518 1.13176 1.38771 1.4271e-05
Total effect
$`Natural effects on RD scale`
2.5% 97.5% P.val
Estimate Std.error 0.16572 0.03132 0.10433 0.22710 1.2175e-07
Direct effect -0.02409 0.01021 -0.04410 -0.00409 0.018258
Indirect effect 0.14162 0.03174 0.07941 0.20384 8.1377e-06
Total effect
$`Controlled direct effects (m=0)`
2.5% 97.5% P.val
Estimate Std.error 2.08515 0.29879 1.57458 2.76128 2.9258e-07
OR scale 1.28132 0.06150 1.16628 1.40771 2.4082e-07
RR scale 0.16264 0.03059 0.10269 0.22259 1.0544e-07
RD scale
$`Controlled direct effects (m=1)`
2.5% 97.5% P.val
Estimate Std.error 2.08515 0.29879 1.57458 2.76128 2.9258e-07
OR scale 1.36112 0.09014 1.19544 1.54976 3.2297e-06
RR scale 0.17702 0.03443 0.10954 0.24450 2.7259e-07 RD scale
By default, the adjusted
parameter is TRUE
. If the adjusted
parameter is set to FALSE
, exactmed()
ignores the values of the vectors m_cov
and y_cov
, and computes unadjusted (crude) effect estimates as in the first example above:
>
> results4 <- exactmed(
+ data = datamed, a = 'X', m = 'M', y = 'Y', a1 = 1, a0 = 0,
+ m_cov = c('C1', 'C2'), y_cov = c('C1'),
+ adjusted = FALSE, interaction = FALSE
+ )
'exactmed' will compute unadjusted natural effects
>
> results4
$`Natural effects on OR scale`
2.5% 97.5% P.val
Estimate Std.error 2.04899 0.27469 1.57554 2.66472 8.7484e-08
Direct effect 0.88069 0.03097 0.82204 0.94352 0.00030218
Indirect effect 1.80452 0.23756 1.39414 2.33571 7.3256e-06
Total effect
$`Natural effects on RR scale`
2.5% 97.5% P.val
Estimate Std.error 1.30183 0.06489 1.18065 1.43543 1.2130e-07
Direct effect 0.96248 0.01014 0.94281 0.98257 0.00028454
Indirect effect 1.25298 0.06379 1.13399 1.38446 9.4328e-06
Total effect
$`Natural effects on RD scale`
2.5% 97.5% P.val
Estimate Std.error 0.16514 0.03021 0.10593 0.22435 4.5850e-08
Direct effect -0.02672 0.00732 -0.04107 -0.01238 0.00026137
Indirect effect 0.13842 0.03048 0.07868 0.19815 5.5775e-06
Total effect
$`Controlled direct effects (m=0)`
2.5% 97.5% P.val
Estimate Std.error 2.07495 0.28607 1.58363 2.71870 1.1937e-07
OR scale 1.26843 0.05652 1.16234 1.38420 9.5079e-08
RR scale 0.15878 0.02892 0.10210 0.21546 4.0057e-08
RD scale
$`Controlled direct effects (m=1)`
2.5% 97.5% P.val
Estimate Std.error 2.07495 0.28607 1.58363 2.71870 1.1937e-07
OR scale 1.40138 0.09725 1.22317 1.60555 1.1566e-06
RR scale 0.17947 0.03343 0.11395 0.24499 7.9365e-08 RD scale
To perform an adjusted mediation analysis allowing for exposure-mediator interaction (by default TRUE
) and using bootstrap based on \(100\) resamples with initial random seed \(= 1991\) to construct \(97\%\) confidence intervals, one should call exactmed()
as follows:
>
> results5 <- exactmed(
+ data = datamed, a = 'X', m = 'M', y = 'Y', a1 = 1, a0 = 0,
+ m_cov = c('C1', 'C2'), y_cov = c('C1', 'C2'),
+ boot = TRUE, nboot = 100, bootseed = 1991, confcoef = 0.97
+ )
>
> results5
$`Natural effects on OR scale`
1.5% 98.5%
Estimate Std.error 2.24870 0.37516 1.68812 3.32099
Direct effect 0.88856 0.07284 0.75874 1.07384
Indirect effect 1.99810 0.28480 1.51926 2.71363
Total effect
$`Natural effects on RR scale`
1.5% 98.5%
Estimate Std.error 1.33700 0.07272 1.20829 1.48229
Direct effect 0.96726 0.02151 0.92984 1.02289
Indirect effect 1.29323 0.06769 1.16111 1.44939
Total effect
$`Natural effects on RD scale`
1.5% 98.5%
Estimate Std.error 0.18403 0.03365 0.12012 0.25767
Direct effect -0.02390 0.01591 -0.05367 0.01544
Indirect effect 0.16013 0.03112 0.09568 0.22870
Total effect
$`Controlled direct effects (m=0)`
1.5% 98.5%
Estimate Std.error 2.61843 0.55335 1.82091 4.25427
OR scale 1.41151 0.09319 1.25366 1.59634
RR scale 0.21741 0.04099 0.14197 0.30413
RD scale
$`Controlled direct effects (m=1)`
1.5% 98.5%
Estimate Std.error 1.30748 0.30771 0.79211 2.15864
OR scale 1.10061 0.09454 0.92150 1.35743
RR scale 0.06150 0.05285 -0.05274 0.18411 RD scale
In the situation where we believe that we are facing a problem of separation or quasi-separation, Firth’s penalization can be used by setting the Firth
parameter to TRUE
(Firth penalized mediation analysis). If this is the case, Firth’s penalization is applied to both the mediator model and the outcome model.
The Firth
parameter implements Firth’s penalization to reduce the bias of the regression coefficients estimators under scarce or sparse data:
>
> results6 <- exactmed(
+ data = datamed, a = 'X', m = 'M', y = 'Y', a1 = 1, a0 = 0,
+ m_cov = c('C1', 'C2'), y_cov = c('C1', 'C2'), Firth = TRUE,
+ boot = TRUE, nboot = 100, bootseed = 1991, confcoef = 0.97
+ )
>
> results6
$`Natural effects on OR scale`
1.5% 98.5%
Estimate Std.error 2.23246 0.36864 1.68036 3.28209
Direct effect 0.88991 0.07197 0.76166 1.07288
Indirect effect 1.98668 0.28090 1.51385 2.69182
Total effect
$`Natural effects on RR scale`
1.5% 98.5%
Estimate Std.error 1.33452 0.07217 1.20680 1.47867
Direct effect 0.96751 0.02135 0.93035 1.02269
Indirect effect 1.29116 0.06719 1.16007 1.44614
Total effect
$`Natural effects on RD scale`
1.5% 98.5%
Estimate Std.error 0.18263 0.03343 0.11919 0.25573
Direct effect -0.02367 0.01576 -0.05316 0.01527
Indirect effect 0.15896 0.03093 0.09498 0.22710
Total effect
$`Controlled direct effects (m=0)`
1.5% 98.5%
Estimate Std.error 2.59835 0.54383 1.81264 4.20506
OR scale 1.40883 0.09257 1.25203 1.59214
RR scale 0.21597 0.04077 0.14098 0.30221
RD scale
$`Controlled direct effects (m=1)`
1.5% 98.5%
Estimate Std.error 1.30556 0.30467 0.79391 2.14590
OR scale 1.10034 0.09396 0.92202 1.35523
RR scale 0.06124 0.05249 -0.05229 0.18295 RD scale
The following call to exactmed()
returns mediation effects when the values of the covariates C1
and C2
are \(0.1\) and \(0.4\), respectively, assuming an exposure-mediator interaction and using the delta method to construct \(95\%\) confidence intervals:
>
> results7 <- exactmed(
+ data = datamed, a = 'X', m = 'M', y = 'Y', a1 = 1, a0 = 0,
+ m_cov = c('C1', 'C2'), y_cov = c('C1', 'C2'),
+ m_cov_cond = c(C1 = 0.1,C2 = 0.4), y_cov_cond = c(C1 = 0.1, C2 = 0.4)
+ )
>
> results7
$`Natural effects on OR scale`
2.5% 97.5% P.val
Estimate Std.error 1.86610 0.27725 1.39466 2.49690 2.6816e-05
Direct effect 0.89347 0.06371 0.77694 1.02747 0.11413480
Indirect effect 1.66730 0.25143 1.24065 2.24066 0.00069917
Total effect
$`Natural effects on RR scale`
2.5% 97.5% P.val
Estimate Std.error 1.37432 0.10503 1.18315 1.59639 3.1743e-05
Direct effect 0.95099 0.03022 0.89357 1.01210 0.11377623
Indirect effect 1.30697 0.10409 1.11809 1.52777 0.00077525
Total effect
$`Natural effects on RD scale`
2.5% 97.5% P.val
Estimate Std.error 0.15465 0.03625 0.08360 0.22571 1.9913e-05
Direct effect -0.02783 0.01759 -0.06230 0.00664 0.11360218
Indirect effect 0.12683 0.03702 0.05427 0.19938 0.00061244
Total effect
$`Controlled direct effects (m=0)`
2.5% 97.5% P.val
Estimate Std.error 2.61843 0.52125 1.77253 3.86803 1.3290e-06
OR scale 1.63173 0.16066 1.34536 1.97906 6.5954e-07
RR scale 0.23603 0.04712 0.14369 0.32838 5.4583e-07
RD scale
$`Controlled direct effects (m=1)`
2.5% 97.5% P.val
Estimate Std.error 1.30748 0.28346 0.85486 1.99974 0.21622
OR scale 1.14677 0.12957 0.91897 1.43104 0.22549
RR scale 0.06689 0.05389 -0.03873 0.17251 0.21448 RD scale
Common adjustment covariates in vectors m_cov
and y_cov
must have the same values; otherwise, the execution of the exactmed()
function is aborted and an error message is displayed in the R console. Example:
> exactmed(
+ data = datamed, a = 'X', m = 'M', y = 'Y', a1 = 1, a0 = 0,
+ m_cov = c('C1','C2'), y_cov = c('C1', 'C2'),
+ m_cov_cond = c(C1 = 0.3, C2 = 0.4), y_cov_cond = c(C1 = 0.1, C2 = 0.4)
+ )
Error in .check_input_param(data = data, a = a, m = m, y = y, a1 = a1, : Covariate C1 has two different values specified
If the covariates specified in m_cov_cond
(y_cov_cond
) constitute some proper subset of m_cov
(y_cov
) then the other covariates are set to their sample-specific mean levels. Hence, the call
>
> results8 <- exactmed(
+ data = datamed, a = 'X', m = 'M', y = 'Y', a1 = 1, a0 = 0,
+ m_cov = c('C1', 'C2'), y_cov = c('C1', 'C2'),
+ m_cov_cond = c(C1 = 0.1), y_cov_cond = c(C1 = 0.1)
+ )
is equivalent to:
>
> mc2 <- mean(datamed$C2)
> mc2
1] -0.04769712
[>
> results9 <- exactmed(
+ data = datamed, a = 'X', m = 'M', y = 'Y', a1 = 1, a0 = 0,
+ m_cov = c('C1', 'C2'), y_cov = c('C1', 'C2'),
+ m_cov_cond = c(C1 = 0.1, C2 = mc2), y_cov_cond = c(C1 = 0.1, C2 = mc2)
+ )
This can be checked by comparing the two outputs:
>
> identical(results8, results9)
1] TRUE [
With this in mind, an error is easily predicted if one makes this call:
> exactmed(
+ data = datamed, a = 'X', m = 'M', y = 'Y', a1 = 1, a0 = 0,
+ m_cov = c('C1', 'C2'), y_cov = c('C1', 'C2'),
+ m_cov_cond = c(C1 = 0.1), y_cov_cond = c(C1 = 0.1, C2 = 0.4)
+ )
Error in .check_input_param(data = data, a = a, m = m, y = y, a1 = a1, : Covariate C2 has two different values specified (one implicitly)
exactmed()
also allows for categorical covariates. Covariates of this type must appear in the data frame as factor, character or logical columns. To illustrate how exactmed()
works with categorical covariates, we replace the covariate C1
in the dataset datamed
by a random factor column:
>
> cate <- factor(sample(c("a", "b", "c"), nrow(datamed), replace =TRUE))
> datamed$C1 <- cate
It is possible to estimate mediation effects at specific values of categorical covariates using the input parameters m_cov_cond
and y_cov_cond
. Note that if the targeted covariates are both numerical and categorical, the above parameters require to be list-type vectors, instead of atomic vectors as when covariates are only numerical.
Hence, if one wants to estimate mediation effects at level ‘a’ for C1
and at value \(0.4\) for C2
, assuming an exposure-mediator interaction and using the delta method to construct \(95\%\) confidence intervals, exactmed()
should be called as follows:
>
> results10 <- exactmed(
+ data = datamed, a = 'X', m = 'M', y = 'Y', a1 = 1, a0 = 0,
+ m_cov = c('C1', 'C2'), y_cov = c('C1', 'C2'),
+ m_cov_cond = list(C1 = 'a', C2 = 0.4), y_cov_cond = list(C1 = 'a', C2 = 0.4)
+ )
>
> results10
$`Natural effects on OR scale`
2.5% 97.5% P.val
Estimate Std.error 1.93666 0.26913 1.47490 2.54298 1.9726e-06
Direct effect 0.80707 0.05473 0.70663 0.92179 0.0015722
Indirect effect 1.56302 0.22095 1.18478 2.06201 0.0015806
Total effect
$`Natural effects on RR scale`
2.5% 97.5% P.val
Estimate Std.error 1.28784 0.07186 1.15443 1.43667 5.7977e-06
Direct effect 0.93157 0.02153 0.89031 0.97474 0.0021622
Indirect effect 1.19971 0.07055 1.06911 1.34627 0.0019593
Total effect
$`Natural effects on RD scale`
2.5% 97.5% P.val
Estimate Std.error 0.15482 0.03215 0.09180 0.21784 1.4724e-06
Direct effect -0.04740 0.01505 -0.07691 -0.01790 0.0016367
Indirect effect 0.10742 0.03377 0.04123 0.17361 0.0014684
Total effect
$`Controlled direct effects (m=0)`
2.5% 97.5% P.val
Estimate Std.error 2.61643 0.50061 1.79824 3.80690 4.9841e-07
OR scale 1.39356 0.09416 1.22072 1.59089 9.0296e-07
RR scale 0.21365 0.04027 0.13473 0.29258 1.1236e-07
RD scale
$`Controlled direct effects (m=1)`
2.5% 97.5% P.val
Estimate Std.error 1.37503 0.28624 0.91436 2.06781 0.12605
OR scale 1.14656 0.10575 0.95694 1.37375 0.13812
RR scale 0.07787 0.05103 -0.02214 0.17789 0.12699 RD scale
If one does not specify a value for the categorical covariate C1
, exactmed()
computes the effects by assigning each dummy variable (created internally by exactmed()
) to a value equal to the proportion of observations in the corresponding category (equivalent to setting each dummy variable to its mean value):
>
> results11 <- exactmed(
+ data = datamed, a = 'X', m = 'M', y = 'Y', a1 = 1, a0 = 0,
+ m_cov = c('C1', 'C2'), y_cov = c('C1', 'C2'),
+ m_cov_cond = c(C2 = 0.4), y_cov_cond = c(C2 = 0.4)
+ )
>
> results11
$`Natural effects on OR scale`
2.5% 97.5% P.val
Estimate Std.error 2.01817 0.28018 1.53741 2.64928 4.2356e-07
Direct effect 0.79861 0.05731 0.69383 0.91921 0.00172510
Indirect effect 1.61173 0.22105 1.23183 2.10879 0.00050099
Total effect
$`Natural effects on RR scale`
2.5% 97.5% P.val
Estimate Std.error 1.31391 0.07090 1.18203 1.46049 4.2158e-07
Direct effect 0.92786 0.02211 0.88552 0.97223 0.0016799
Indirect effect 1.21912 0.06962 1.09003 1.36350 0.0005212
Total effect
$`Natural effects on RD scale`
2.5% 97.5% P.val
Estimate Std.error 0.16525 0.03184 0.10286 0.22765 2.0946e-07
Direct effect -0.04990 0.01579 -0.08084 -0.01896 0.00157225
Indirect effect 0.11536 0.03280 0.05107 0.17964 0.00043667
Total effect
$`Controlled direct effects (m=0)`
2.5% 97.5% P.val
Estimate Std.error 2.61643 0.50061 1.79824 3.80690 4.9841e-07
OR scale 1.40827 0.09257 1.23803 1.60191 1.9076e-07
RR scale 0.21668 0.04026 0.13777 0.29559 7.3675e-08
RD scale
$`Controlled direct effects (m=1)`
2.5% 97.5% P.val
Estimate Std.error 1.37503 0.28624 0.91436 2.06781 0.12605
OR scale 1.15094 0.10884 0.95622 1.38531 0.13713
RR scale 0.07836 0.05129 -0.02217 0.17889 0.12656 RD scale