As of now, semhelpinghands
has four groups of functions: manipulating parameter estimates tables, comparing results from different methods, bootstrapping, and … others.
This article will only introduce very briefly the groups. Some of them will be introduced in details in forthcoming article dedicated to them.
Let’s load the package first, and also load lavaan
.
In using lavaan
, I prefer reading the output of parameterEstimates()
, which is compact and clear to me. I sometimes would like to organize the rows and columns in ways meaningful to a particular research questions. This can certainly be done using base R or dplyr
. However, I am lazy and want to be able to do things using just one or two functions, with just one or two arguments.
This is a sample dataset for illustration, dvs_ivs
, with 3 predictors (x1
, x2
, and x3
), 3 outcome variables (y1
y2
, and y3
), and a group variable (gp
).
First a single sample model:
data(dvs_ivs)
mod <-
"
y1 ~ x1 + x2 + x3
y2 ~ x1 + x3
y3 ~ y2 + x2
"
fit <- sem(model = mod, data = dvs_ivs,
fixed.x = FALSE)
The parameter estimates tables:
est <- parameterEstimates(fit)
est
#> lhs op rhs est se z pvalue ci.lower ci.upper
#> 1 y1 ~ x1 0.206 0.087 2.354 0.019 0.034 0.377
#> 2 y1 ~ x2 0.381 0.093 4.100 0.000 0.199 0.563
#> 3 y1 ~ x3 0.162 0.088 1.852 0.064 -0.009 0.334
#> 4 y2 ~ x1 0.149 0.105 1.425 0.154 -0.056 0.354
#> 5 y2 ~ x3 0.230 0.105 2.187 0.029 0.024 0.437
#> 6 y3 ~ y2 0.044 0.106 0.420 0.675 -0.163 0.251
#> 7 y3 ~ x2 0.295 0.121 2.445 0.014 0.059 0.532
#> 8 y1 ~~ y1 0.695 0.098 7.071 0.000 0.502 0.888
#> 9 y2 ~~ y2 1.013 0.143 7.071 0.000 0.732 1.294
#> 10 y3 ~~ y3 1.206 0.171 7.071 0.000 0.872 1.540
#> 11 y1 ~~ y3 -0.027 0.092 -0.292 0.770 -0.206 0.153
#> 12 x1 ~~ x1 0.926 0.131 7.071 0.000 0.669 1.183
#> 13 x1 ~~ x2 0.118 0.088 1.339 0.180 -0.055 0.291
#> 14 x1 ~~ x3 0.000 0.092 -0.001 0.999 -0.180 0.180
#> 15 x2 ~~ x2 0.828 0.117 7.071 0.000 0.598 1.057
#> 16 x2 ~~ x3 0.073 0.087 0.840 0.401 -0.098 0.244
#> 17 x3 ~~ x3 0.912 0.129 7.071 0.000 0.659 1.165
A two-sample model is also fitted to the dataset:
This is the parameter estimates table:
est_gp <- parameterEstimates(fit_gp)
est_gp
#> lhs op rhs block group est se z pvalue ci.lower ci.upper
#> 1 y1 ~ x1 1 1 0.210 0.149 1.404 0.160 -0.083 0.502
#> 2 y1 ~ x2 1 1 0.243 0.138 1.761 0.078 -0.027 0.513
#> 3 y1 ~ x3 1 1 0.246 0.144 1.710 0.087 -0.036 0.527
#> 4 y2 ~ x1 1 1 0.036 0.179 0.199 0.842 -0.315 0.386
#> 5 y2 ~ x3 1 1 0.217 0.176 1.232 0.218 -0.128 0.561
#> 6 y3 ~ y2 1 1 0.096 0.120 0.802 0.423 -0.139 0.332
#> 7 y3 ~ x2 1 1 0.372 0.135 2.754 0.006 0.107 0.637
#> 8 y1 ~~ y1 1 1 0.754 0.157 4.796 0.000 0.446 1.062
#> 9 y2 ~~ y2 1 1 1.148 0.239 4.796 0.000 0.679 1.617
#> 10 y3 ~~ y3 1 1 0.787 0.164 4.796 0.000 0.466 1.109
#> 11 y1 ~~ y3 1 1 0.010 0.114 0.085 0.932 -0.213 0.232
#> 12 x1 ~~ x1 1 1 0.798 0.166 4.796 0.000 0.472 1.125
#> 13 x1 ~~ x2 1 1 0.223 0.132 1.694 0.090 -0.035 0.481
#> 14 x1 ~~ x3 1 1 0.121 0.121 1.001 0.317 -0.116 0.358
#> 15 x2 ~~ x2 1 1 0.938 0.196 4.796 0.000 0.555 1.321
#> 16 x2 ~~ x3 1 1 0.139 0.131 1.061 0.289 -0.118 0.397
#> 17 x3 ~~ x3 1 1 0.825 0.172 4.796 0.000 0.488 1.162
#> 18 y1 ~1 1 1 -0.295 0.137 -2.148 0.032 -0.564 -0.026
#> 19 y2 ~1 1 1 0.027 0.168 0.163 0.870 -0.302 0.357
#> 20 y3 ~1 1 1 0.233 0.131 1.775 0.076 -0.024 0.490
#> 21 x1 ~1 1 1 0.282 0.132 2.141 0.032 0.024 0.540
#> 22 x2 ~1 1 1 -0.067 0.143 -0.469 0.639 -0.347 0.213
#> 23 x3 ~1 1 1 -0.123 0.134 -0.915 0.360 -0.385 0.140
#> 24 y1 ~ x1 2 2 0.258 0.102 2.530 0.011 0.058 0.458
#> 25 y1 ~ x2 2 2 0.484 0.121 3.998 0.000 0.247 0.721
#> 26 y1 ~ x3 2 2 0.121 0.104 1.165 0.244 -0.082 0.324
#> 27 y2 ~ x1 2 2 0.226 0.127 1.774 0.076 -0.024 0.475
#> 28 y2 ~ x3 2 2 0.261 0.129 2.016 0.044 0.007 0.514
#> 29 y3 ~ y2 2 2 -0.003 0.162 -0.020 0.984 -0.322 0.315
#> 30 y3 ~ x2 2 2 0.291 0.190 1.527 0.127 -0.082 0.664
#> 31 y1 ~~ y1 2 2 0.568 0.109 5.196 0.000 0.354 0.783
#> 32 y2 ~~ y2 2 2 0.882 0.170 5.196 0.000 0.549 1.214
#> 33 y3 ~~ y3 2 2 1.412 0.272 5.196 0.000 0.880 1.945
#> 34 y1 ~~ y3 2 2 0.050 0.122 0.410 0.682 -0.189 0.289
#> 35 x1 ~~ x1 2 2 1.020 0.196 5.196 0.000 0.635 1.405
#> 36 x1 ~~ x2 2 2 0.042 0.117 0.361 0.718 -0.187 0.271
#> 37 x1 ~~ x3 2 2 -0.099 0.137 -0.719 0.472 -0.367 0.170
#> 38 x2 ~~ x2 2 2 0.722 0.139 5.196 0.000 0.450 0.994
#> 39 x2 ~~ x3 2 2 0.013 0.115 0.110 0.912 -0.212 0.238
#> 40 x3 ~~ x3 2 2 0.985 0.190 5.196 0.000 0.613 1.356
#> 41 y1 ~1 2 2 0.007 0.104 0.068 0.946 -0.196 0.211
#> 42 y2 ~1 2 2 -0.050 0.129 -0.392 0.695 -0.303 0.202
#> 43 y3 ~1 2 2 -0.333 0.163 -2.042 0.041 -0.652 -0.013
#> 44 x1 ~1 2 2 0.101 0.137 0.738 0.460 -0.168 0.371
#> 45 x2 ~1 2 2 0.092 0.116 0.797 0.425 -0.134 0.319
#> 46 x3 ~1 2 2 -0.065 0.135 -0.482 0.630 -0.330 0.200
So, these are what semhelpinghands
has for now:
add_sig()
Despite the controversy over null hypothesis significance testing, we still need to report them, for now. They are there, but I have to decode them mentally. Here comes add_sig()
:
add_sig(est)
#> lhs op rhs est sig se z pvalue ci.lower ci.upper
#> 1 y1 ~ x1 0.206 * 0.087 2.354 0.019 0.034 0.377
#> 2 y1 ~ x2 0.381 *** 0.093 4.100 0.000 0.199 0.563
#> 3 y1 ~ x3 0.162 0.088 1.852 0.064 -0.009 0.334
#> 4 y2 ~ x1 0.149 0.105 1.425 0.154 -0.056 0.354
#> 5 y2 ~ x3 0.230 * 0.105 2.187 0.029 0.024 0.437
#> 6 y3 ~ y2 0.044 0.106 0.420 0.675 -0.163 0.251
#> 7 y3 ~ x2 0.295 * 0.121 2.445 0.014 0.059 0.532
#> 8 y1 ~~ y1 0.695 *** 0.098 7.071 0.000 0.502 0.888
#> 9 y2 ~~ y2 1.013 *** 0.143 7.071 0.000 0.732 1.294
#> 10 y3 ~~ y3 1.206 *** 0.171 7.071 0.000 0.872 1.540
#> 11 y1 ~~ y3 -0.027 0.092 -0.292 0.770 -0.206 0.153
#> 12 x1 ~~ x1 0.926 *** 0.131 7.071 0.000 0.669 1.183
#> 13 x1 ~~ x2 0.118 0.088 1.339 0.180 -0.055 0.291
#> 14 x1 ~~ x3 0.000 0.092 -0.001 0.999 -0.180 0.180
#> 15 x2 ~~ x2 0.828 *** 0.117 7.071 0.000 0.598 1.057
#> 16 x2 ~~ x3 0.073 0.087 0.840 0.401 -0.098 0.244
#> 17 x3 ~~ x3 0.912 *** 0.129 7.071 0.000 0.659 1.165
If bootstrapping was used, it also supports using the confidence intervals, which may yield results different from the p-values when bootstrapping confidence intervals are used (but same results in this example):
add_sig(est, use = c("pvalue", "ci"))
#> lhs op rhs est sig ci se z pvalue ci.lower ci.upper
#> 1 y1 ~ x1 0.206 * Sig 0.087 2.354 0.019 0.034 0.377
#> 2 y1 ~ x2 0.381 *** Sig 0.093 4.100 0.000 0.199 0.563
#> 3 y1 ~ x3 0.162 0.088 1.852 0.064 -0.009 0.334
#> 4 y2 ~ x1 0.149 0.105 1.425 0.154 -0.056 0.354
#> 5 y2 ~ x3 0.230 * Sig 0.105 2.187 0.029 0.024 0.437
#> 6 y3 ~ y2 0.044 0.106 0.420 0.675 -0.163 0.251
#> 7 y3 ~ x2 0.295 * Sig 0.121 2.445 0.014 0.059 0.532
#> 8 y1 ~~ y1 0.695 *** Sig 0.098 7.071 0.000 0.502 0.888
#> 9 y2 ~~ y2 1.013 *** Sig 0.143 7.071 0.000 0.732 1.294
#> 10 y3 ~~ y3 1.206 *** Sig 0.171 7.071 0.000 0.872 1.540
#> 11 y1 ~~ y3 -0.027 0.092 -0.292 0.770 -0.206 0.153
#> 12 x1 ~~ x1 0.926 *** Sig 0.131 7.071 0.000 0.669 1.183
#> 13 x1 ~~ x2 0.118 0.088 1.339 0.180 -0.055 0.291
#> 14 x1 ~~ x3 0.000 0.092 -0.001 0.999 -0.180 0.180
#> 15 x2 ~~ x2 0.828 *** Sig 0.117 7.071 0.000 0.598 1.057
#> 16 x2 ~~ x3 0.073 0.087 0.840 0.401 -0.098 0.244
#> 17 x3 ~~ x3 0.912 *** Sig 0.129 7.071 0.000 0.659 1.165
It also works on the standardized solution:
std <- standardizedSolution(fit)
add_sig(std)
#> lhs op rhs est.std sig se z pvalue ci.lower ci.upper
#> 1 y1 ~ x1 0.208 * 0.087 2.394 0.017 0.038 0.378
#> 2 y1 ~ x2 0.364 *** 0.083 4.364 0.000 0.200 0.527
#> 3 y1 ~ x3 0.163 0.087 1.871 0.061 -0.008 0.333
#> 4 y2 ~ x1 0.138 0.096 1.438 0.150 -0.050 0.326
#> 5 y2 ~ x3 0.212 * 0.095 2.236 0.025 0.026 0.397
#> 6 y3 ~ y2 0.041 0.097 0.420 0.674 -0.149 0.231
#> 7 y3 ~ x2 0.237 * 0.094 2.517 0.012 0.053 0.422
#> 8 y1 ~~ y1 0.767 *** 0.074 10.369 0.000 0.622 0.912
#> 9 y2 ~~ y2 0.936 *** 0.047 19.799 0.000 0.844 1.029
#> 10 y3 ~~ y3 0.941 *** 0.046 20.649 0.000 0.852 1.031
#> 11 y1 ~~ y3 -0.029 0.100 -0.292 0.770 -0.225 0.167
#> 12 x1 ~~ x1 1.000 0.000 NA NA 1.000 1.000
#> 13 x1 ~~ x2 0.135 0.098 1.377 0.169 -0.057 0.328
#> 14 x1 ~~ x3 0.000 0.100 -0.001 0.999 -0.196 0.196
#> 15 x2 ~~ x2 1.000 0.000 NA NA 1.000 1.000
#> 16 x2 ~~ x3 0.084 0.099 0.849 0.396 -0.110 0.279
#> 17 x3 ~~ x3 1.000 0.000 NA NA 1.000 1.000
Note: But be careful about the interpretation of the p-values in the standardized solution, which are based on the delta method. See vignette("standardizedSolution_boot_ci")
.
See the help page of add_sig()
for other options.
filter_by()
Its purpose is very simple, selecting rows by commonly used columns: operators (op
), “dependent variables” (lhs
), and “independent variables” (rhs
).
filter_by(est, op = "~")
#> lhs op rhs est se z pvalue ci.lower ci.upper
#> 1 y1 ~ x1 0.206 0.087 2.354 0.019 0.034 0.377
#> 2 y1 ~ x2 0.381 0.093 4.100 0.000 0.199 0.563
#> 3 y1 ~ x3 0.162 0.088 1.852 0.064 -0.009 0.334
#> 4 y2 ~ x1 0.149 0.105 1.425 0.154 -0.056 0.354
#> 5 y2 ~ x3 0.230 0.105 2.187 0.029 0.024 0.437
#> 6 y3 ~ y2 0.044 0.106 0.420 0.675 -0.163 0.251
#> 7 y3 ~ x2 0.295 0.121 2.445 0.014 0.059 0.532
It also supports filtering by group using group labels:
filter_by(est_gp, op = "~", group = "gp1", fit = fit_gp)
#> lhs op rhs block group est se z pvalue ci.lower ci.upper
#> 1 y1 ~ x1 1 1 0.210 0.149 1.404 0.160 -0.083 0.502
#> 2 y1 ~ x2 1 1 0.243 0.138 1.761 0.078 -0.027 0.513
#> 3 y1 ~ x3 1 1 0.246 0.144 1.710 0.087 -0.036 0.527
#> 4 y2 ~ x1 1 1 0.036 0.179 0.199 0.842 -0.315 0.386
#> 5 y2 ~ x3 1 1 0.217 0.176 1.232 0.218 -0.128 0.561
#> 6 y3 ~ y2 1 1 0.096 0.120 0.802 0.423 -0.139 0.332
#> 7 y3 ~ x2 1 1 0.372 0.135 2.754 0.006 0.107 0.637
See the help page of filter_by()
for other options.
group_by_dvs()
and group_by_ivs()
Sometimes I want the conventional iv-column-dv-row or dv-column-iv-row format. That’s what group_by_dvs()
and group_by_ivs()
are for:
group_by_dvs(est)
#> iv est_y1 est_y2 est_y3
#> x1 x1 0.206 0.149 --
#> x2 x2 0.381 -- 0.295
#> x3 x3 0.162 0.230 --
#> y2 y2 -- -- 0.044
group_by_ivs(est)
#> dv est_x1 est_x2 est_x3 est_y2
#> y1 y1 0.206 0.381 0.162 --
#> y2 y2 0.149 -- 0.230 --
#> y3 y3 -- 0.295 -- 0.044
They also supports extracting another column:
group_by_dvs(est, col_name = "pvalue")
#> iv pvalue_y1 pvalue_y2 pvalue_y3
#> x1 x1 0.019 0.154 --
#> x2 x2 0.000 -- 0.014
#> x3 x3 0.064 0.029 --
#> y2 y2 -- -- 0.675
group_by_ivs(est, col_name = "pvalue")
#> dv pvalue_x1 pvalue_x2 pvalue_x3 pvalue_y2
#> y1 y1 0.019 0.000 0.064 --
#> y2 y2 0.154 -- 0.029 --
#> y3 y3 -- 0.014 -- 0.675
See the help page of group_by_dvs()
and group_by_ivs()
for other options.
group_by_groups()
In multiple sample models, one common task is comparing results across groups. I wrote group_by_groups()
for this task, to compare results side-by-side:
group_by_groups(est_gp)
#> lhs op rhs est_1 est_2
#> 1 y1 ~ x1 0.210 0.258
#> 2 y1 ~ x2 0.243 0.484
#> 3 y1 ~ x3 0.246 0.121
#> 4 y2 ~ x1 0.036 0.226
#> 5 y2 ~ x3 0.217 0.261
#> 6 y3 ~ x2 0.372 0.291
#> 7 y3 ~ y2 0.096 -0.003
#> 8 x1 ~~ x1 0.798 1.020
#> 9 x1 ~~ x2 0.223 0.042
#> 10 x1 ~~ x3 0.121 -0.099
#> 11 x2 ~~ x2 0.938 0.722
#> 12 x2 ~~ x3 0.139 0.013
#> 13 x3 ~~ x3 0.825 0.985
#> 14 y1 ~~ y1 0.754 0.568
#> 15 y1 ~~ y3 0.010 0.050
#> 16 y2 ~~ y2 1.148 0.882
#> 17 y3 ~~ y3 0.787 1.412
#> 18 x1 ~1 0.282 0.101
#> 19 x2 ~1 -0.067 0.092
#> 20 x3 ~1 -0.123 -0.065
#> 21 y1 ~1 -0.295 0.007
#> 22 y2 ~1 0.027 -0.050
#> 23 y3 ~1 0.233 -0.333
It also supports extracting several columns:
group_by_groups(est_gp,
col_names = c("est", "pvalue"))
#> lhs op rhs est_1 est_2 pvalue_1 pvalue_2
#> 1 y1 ~ x1 0.210 0.258 0.160 0.011
#> 2 y1 ~ x2 0.243 0.484 0.078 0.000
#> 3 y1 ~ x3 0.246 0.121 0.087 0.244
#> 4 y2 ~ x1 0.036 0.226 0.842 0.076
#> 5 y2 ~ x3 0.217 0.261 0.218 0.044
#> 6 y3 ~ x2 0.372 0.291 0.006 0.127
#> 7 y3 ~ y2 0.096 -0.003 0.423 0.984
#> 8 x1 ~~ x1 0.798 1.020 0.000 0.000
#> 9 x1 ~~ x2 0.223 0.042 0.090 0.718
#> 10 x1 ~~ x3 0.121 -0.099 0.317 0.472
#> 11 x2 ~~ x2 0.938 0.722 0.000 0.000
#> 12 x2 ~~ x3 0.139 0.013 0.289 0.912
#> 13 x3 ~~ x3 0.825 0.985 0.000 0.000
#> 14 y1 ~~ y1 0.754 0.568 0.000 0.000
#> 15 y1 ~~ y3 0.010 0.050 0.932 0.682
#> 16 y2 ~~ y2 1.148 0.882 0.000 0.000
#> 17 y3 ~~ y3 0.787 1.412 0.000 0.000
#> 18 x1 ~1 0.282 0.101 0.032 0.460
#> 19 x2 ~1 -0.067 0.092 0.639 0.425
#> 20 x3 ~1 -0.123 -0.065 0.360 0.630
#> 21 y1 ~1 -0.295 0.007 0.032 0.946
#> 22 y2 ~1 0.027 -0.050 0.870 0.695
#> 23 y3 ~1 0.233 -0.333 0.076 0.041
If the fit object is used, it can print group labels:
group_by_groups(fit_gp,
col_names = c("est", "pvalue"))
#> lhs op rhs est_gp1 est_gp2 pvalue_gp1 pvalue_gp2
#> 1 y1 ~ x1 0.210 0.258 0.160 0.011
#> 2 y1 ~ x2 0.243 0.484 0.078 0.000
#> 3 y1 ~ x3 0.246 0.121 0.087 0.244
#> 4 y2 ~ x1 0.036 0.226 0.842 0.076
#> 5 y2 ~ x3 0.217 0.261 0.218 0.044
#> 6 y3 ~ x2 0.372 0.291 0.006 0.127
#> 7 y3 ~ y2 0.096 -0.003 0.423 0.984
#> 8 x1 ~~ x1 0.798 1.020 0.000 0.000
#> 9 x1 ~~ x2 0.223 0.042 0.090 0.718
#> 10 x1 ~~ x3 0.121 -0.099 0.317 0.472
#> 11 x2 ~~ x2 0.938 0.722 0.000 0.000
#> 12 x2 ~~ x3 0.139 0.013 0.289 0.912
#> 13 x3 ~~ x3 0.825 0.985 0.000 0.000
#> 14 y1 ~~ y1 0.754 0.568 0.000 0.000
#> 15 y1 ~~ y3 0.010 0.050 0.932 0.682
#> 16 y2 ~~ y2 1.148 0.882 0.000 0.000
#> 17 y3 ~~ y3 0.787 1.412 0.000 0.000
#> 18 x1 ~1 0.282 0.101 0.032 0.460
#> 19 x2 ~1 -0.067 0.092 0.639 0.425
#> 20 x3 ~1 -0.123 -0.065 0.360 0.630
#> 21 y1 ~1 -0.295 0.007 0.032 0.946
#> 22 y2 ~1 0.027 -0.050 0.870 0.695
#> 23 y3 ~1 0.233 -0.333 0.076 0.041
See the help page of group_by_groups()
for other options.
group_by_models()
This is inspired by the proposal Rönkkö proposed in a GitHub issue for semTools
. I want something simple for a quick overview and so I wrote group_by_models()
.
Suppose this is the other model fitted:
mod2 <-
"
y1 ~ x1 + x2 + x3
y2 ~ x1 + x2
y3 ~ y2 + x1
"
fit2 <- sem(model = mod2, data = dvs_ivs,
fixed.x = FALSE)
est2 <- parameterEstimates(fit2)
est2
#> lhs op rhs est se z pvalue ci.lower ci.upper
#> 1 y1 ~ x1 0.209 0.087 2.394 0.017 0.038 0.381
#> 2 y1 ~ x2 0.387 0.093 4.171 0.000 0.205 0.569
#> 3 y1 ~ x3 0.162 0.088 1.853 0.064 -0.009 0.334
#> 4 y2 ~ x1 0.176 0.106 1.655 0.098 -0.032 0.384
#> 5 y2 ~ x2 -0.209 0.112 -1.864 0.062 -0.430 0.011
#> 6 y3 ~ y2 0.023 0.109 0.210 0.834 -0.190 0.236
#> 7 y3 ~ x1 -0.157 0.117 -1.339 0.181 -0.388 0.073
#> 8 y1 ~~ y1 0.695 0.098 7.071 0.000 0.502 0.888
#> 9 y2 ~~ y2 1.026 0.145 7.071 0.000 0.741 1.310
#> 10 y3 ~~ y3 1.254 0.177 7.071 0.000 0.906 1.601
#> 11 y1 ~~ y3 -0.028 0.093 -0.299 0.765 -0.211 0.155
#> 12 x1 ~~ x1 0.926 0.131 7.071 0.000 0.669 1.183
#> 13 x1 ~~ x2 0.118 0.088 1.339 0.180 -0.055 0.291
#> 14 x1 ~~ x3 0.000 0.092 -0.001 0.999 -0.180 0.180
#> 15 x2 ~~ x2 0.828 0.117 7.071 0.000 0.598 1.057
#> 16 x2 ~~ x3 0.073 0.087 0.840 0.401 -0.098 0.244
#> 17 x3 ~~ x3 0.912 0.129 7.071 0.000 0.659 1.165
These two models have no nested relationships. To compare the estimates, group_by_models()
can be used:
group_by_models(list(Model1 = est,
Model2 = est2))
#> lhs op rhs est_Model1 est_Model2
#> 1 y1 ~ x1 0.206 0.209
#> 2 y1 ~ x2 0.381 0.387
#> 3 y1 ~ x3 0.162 0.162
#> 4 y2 ~ x1 0.149 0.176
#> 5 y2 ~ x2 NA -0.209
#> 6 y2 ~ x3 0.230 NA
#> 7 y3 ~ x1 NA -0.157
#> 8 y3 ~ x2 0.295 NA
#> 9 y3 ~ y2 0.044 0.023
#> 10 x1 ~~ x1 0.926 0.926
#> 11 x1 ~~ x2 0.118 0.118
#> 12 x1 ~~ x3 0.000 0.000
#> 13 x2 ~~ x2 0.828 0.828
#> 14 x2 ~~ x3 0.073 0.073
#> 15 x3 ~~ x3 0.912 0.912
#> 16 y1 ~~ y1 0.695 0.695
#> 17 y1 ~~ y3 -0.027 -0.028
#> 18 y2 ~~ y2 1.013 1.026
#> 19 y3 ~~ y3 1.206 1.254
It can also compare several columns:
group_by_models(list(Model1 = est,
Model2 = est2),
col_names = c("est", "pvalue"))
#> lhs op rhs est_Model1 est_Model2 pvalue_Model1 pvalue_Model2
#> 1 y1 ~ x1 0.206 0.209 0.019 0.017
#> 2 y1 ~ x2 0.381 0.387 0.000 0.000
#> 3 y1 ~ x3 0.162 0.162 0.064 0.064
#> 4 y2 ~ x1 0.149 0.176 0.154 0.098
#> 5 y2 ~ x2 NA -0.209 NA 0.062
#> 6 y2 ~ x3 0.230 NA 0.029 NA
#> 7 y3 ~ x1 NA -0.157 NA 0.181
#> 8 y3 ~ x2 0.295 NA 0.014 NA
#> 9 y3 ~ y2 0.044 0.023 0.675 0.834
#> 10 x1 ~~ x1 0.926 0.926 0.000 0.000
#> 11 x1 ~~ x2 0.118 0.118 0.180 0.180
#> 12 x1 ~~ x3 0.000 0.000 0.999 0.999
#> 13 x2 ~~ x2 0.828 0.828 0.000 0.000
#> 14 x2 ~~ x3 0.073 0.073 0.401 0.401
#> 15 x3 ~~ x3 0.912 0.912 0.000 0.000
#> 16 y1 ~~ y1 0.695 0.695 0.000 0.000
#> 17 y1 ~~ y3 -0.027 -0.028 0.770 0.765
#> 18 y2 ~~ y2 1.013 1.026 0.000 0.000
#> 19 y3 ~~ y3 1.206 1.254 0.000 0.000
See the help page of group_by_models()
for other options.
sort_by()
The function sort_by()
can be used to sort rows using (a) common fields such as "op"
, "lhs"
, and "rhs"
, (b) operators such as "~"
and "~~"
. It can be used on the output of some other functions that manipulate a parameter estimates table.
out <- group_by_groups(est_gp,
col_names = c("est", "pvalue"))
out <- filter_by(out, op = c("~", "~~"))
sort_by(out, by = c("op", "rhs"))
#> lhs op rhs est_1 est_2 pvalue_1 pvalue_2
#> 1 y1 ~ x1 0.210 0.258 0.160 0.011
#> 2 y2 ~ x1 0.036 0.226 0.842 0.076
#> 3 y1 ~ x2 0.243 0.484 0.078 0.000
#> 4 y3 ~ x2 0.372 0.291 0.006 0.127
#> 5 y1 ~ x3 0.246 0.121 0.087 0.244
#> 6 y2 ~ x3 0.217 0.261 0.218 0.044
#> 7 y3 ~ y2 0.096 -0.003 0.423 0.984
#> 8 x1 ~~ x1 0.798 1.020 0.000 0.000
#> 9 x1 ~~ x2 0.223 0.042 0.090 0.718
#> 10 x2 ~~ x2 0.938 0.722 0.000 0.000
#> 11 x1 ~~ x3 0.121 -0.099 0.317 0.472
#> 12 x2 ~~ x3 0.139 0.013 0.289 0.912
#> 13 x3 ~~ x3 0.825 0.985 0.000 0.000
#> 14 y1 ~~ y1 0.754 0.568 0.000 0.000
#> 15 y2 ~~ y2 1.148 0.882 0.000 0.000
#> 16 y1 ~~ y3 0.010 0.050 0.932 0.682
#> 17 y3 ~~ y3 0.787 1.412 0.000 0.000
Some functions, such as group_by_models()
, will automatically call sort_by()
to sort the results.
The default order should be acceptable in most cases, but can also be customized. See the help page of sort_by()
on customizing the order.
Though not officially supported, piping using |>
can be used with most of the functions that manipulate parameter estimates tables. For example:
est_gp |>
add_sig() |>
group_by_groups(col_names = c("est", "pvalue", "sig"),
group_first = FALSE) |>
filter_by(op = c("~"))
#> lhs op rhs est_1 pvalue_1 sig_1 est_2 pvalue_2 sig_2
#> 1 y1 ~ x1 0.210 0.160 0.258 0.011 *
#> 2 y1 ~ x2 0.243 0.078 0.484 0.000 ***
#> 3 y1 ~ x3 0.246 0.087 0.121 0.244
#> 4 y2 ~ x1 0.036 0.842 0.226 0.076
#> 5 y2 ~ x3 0.217 0.218 0.261 0.044 *
#> 6 y3 ~ x2 0.372 0.006 ** 0.291 0.127
#> 7 y3 ~ y2 0.096 0.423 -0.003 0.984
Though I believe the choice of the estimation method should be justified, suppose we want to assess the sensitivity of the parameter estimate results to the methods used, compare_estimators()
can be used as a quick way to compare results from different methods.
out <- compare_estimators(fit,
estimator = c("ML", "GLS", "MLR"))
group_by_models(out, col_names = c("se", "pvalue"))
#> lhs op rhs se_ML se_GLS se_MLR pvalue_ML pvalue_GLS pvalue_MLR
#> 1 y1 ~ x1 0.087 0.094 0.067 0.019 0.028 0.002
#> 2 y1 ~ x2 0.093 0.100 0.091 0.000 0.000 0.000
#> 3 y1 ~ x3 0.088 0.089 0.084 0.064 0.067 0.053
#> 4 y2 ~ x1 0.105 0.112 0.107 0.154 0.148 0.165
#> 5 y2 ~ x3 0.105 0.107 0.123 0.029 0.028 0.061
#> 6 y3 ~ x2 0.121 0.133 0.111 0.014 0.019 0.008
#> 7 y3 ~ y2 0.106 0.116 0.108 0.675 0.654 0.681
#> 8 x1 ~~ x1 0.131 0.129 0.136 0.000 0.000 0.000
#> 9 x1 ~~ x2 0.088 0.090 0.077 0.180 0.202 0.122
#> 10 x1 ~~ x3 0.092 0.092 0.102 0.999 0.840 0.999
#> 11 x2 ~~ x2 0.117 0.114 0.127 0.000 0.000 0.000
#> 12 x2 ~~ x3 0.087 0.089 0.091 0.401 0.414 0.423
#> 13 x3 ~~ x3 0.129 0.131 0.111 0.000 0.000 0.000
#> 14 y1 ~~ y1 0.098 0.100 0.119 0.000 0.000 0.000
#> 15 y1 ~~ y3 0.092 0.093 0.104 0.770 0.793 0.797
#> 16 y2 ~~ y2 0.143 0.139 0.141 0.000 0.000 0.000
#> 17 y3 ~~ y3 0.171 0.167 0.171 0.000 0.000 0.000
It simply refits the models for each estimator and returns the results. They can then be treated as different “models” and processed by group_by_models()
.
se_ratios()
is a wrapper of group_by_models()
used to compare the standard errors by different estimators in the output of compare_estimators()
:
se_ratios(out, reference = "ML")
#> lhs op rhs se_ML se_GLS se_MLR ratio_ML ratio_GLS ratio_MLR
#> 1 y1 ~ x1 0.087 0.094 0.067 1 1.072 0.765
#> 2 y1 ~ x2 0.093 0.100 0.091 1 1.073 0.977
#> 3 y1 ~ x3 0.088 0.089 0.084 1 1.013 0.958
#> 4 y2 ~ x1 0.105 0.112 0.107 1 1.068 1.027
#> 5 y2 ~ x3 0.105 0.107 0.123 1 1.011 1.165
#> 6 y3 ~ x2 0.121 0.133 0.111 1 1.101 0.918
#> 7 y3 ~ y2 0.106 0.116 0.108 1 1.099 1.020
#> 8 x1 ~~ x1 0.131 0.129 0.136 1 0.985 1.036
#> 9 x1 ~~ x2 0.088 0.090 0.077 1 1.015 0.867
#> 10 x1 ~~ x3 0.092 0.092 0.102 1 0.999 1.114
#> 11 x2 ~~ x2 0.117 0.114 0.127 1 0.974 1.084
#> 12 x2 ~~ x3 0.087 0.089 0.091 1 1.015 1.049
#> 13 x3 ~~ x3 0.129 0.131 0.111 1 1.012 0.859
#> 14 y1 ~~ y1 0.098 0.100 0.119 1 1.015 1.210
#> 15 y1 ~~ y3 0.092 0.093 0.104 1 1.015 1.134
#> 16 y2 ~~ y2 0.143 0.139 0.141 1 0.973 0.987
#> 17 y3 ~~ y3 0.171 0.167 0.171 1 0.979 1.005
See the help page of compare_estimators()
for other options.
One issue with the standardized solution is the confidence intervals. They are based on the delta method even if se = "boot"
is used. For indirect effects, for which bootstrap confidence intervals are commonly used, the confidence intervals for them in the standardized solution are not what usually reported other tools for mediation. There are some other powerful tools on the Internet and the [Google Group for lavaan], see this thread, to address this problem. I wrote standardizedSolution_boot_ci()
not to replace them (it obviously can’t), but to address a very specific case I usually encounter myself: Generating the bootstrap confidence intervals for standardized estimates based on the bootstrap estimates already generated by se = "boot"
.
Please see vignette("standardizedSolution_boot_ci")
for an illustration on standardizedSolution_boot_ci()
.
lavaan::lavaan()
and its wrappers suc has lavaan::sem()
and lavaan::cfa()
allow users to set several options using an estimator
: ML
, GLS
, WLSMV
, and others. However, it is not easy to remember what the options set for an estimator. Instead of finding them from the output of summary()
, this function shows some of them in one table for a quick overview. This is an example:
data(dvs_ivs)
mod <-
"
y1 ~ x1 + x2 + x3
y2 ~ x1 + x3
y3 ~ y2 + x2
"
fit_default <- sem(model = mod, data = dvs_ivs)
show_more_options(fit_default)
#> Options Call Actual
#> Estimator(s) default ML
#> Standard Error (SE) default standard
#> Model Test Statistic(s) default standard
#> How Missing Data is Handled default listwise
#> Information Matrix (for SE) default expected
#> Information Matrix (for Model Test) default expected
#> Mean Structure default No
#> 'x' Fixed default TRUE
fit_MLR <- sem(model = mod, data = dvs_ivs, estimator = "MLR")
show_more_options(fit_MLR)
#> Options Call Actual
#> Estimator(s) MLR ML
#> Standard Error (SE) default robust.huber.white
#> Model Test Statistic(s) default yuan.bentler.mplus
#> How Missing Data is Handled default listwise
#> Information Matrix (for SE) default observed
#> Information Matrix (for Model Test) default observed
#> Mean Structure default No
#> 'x' Fixed default TRUE
fit_MLR_fiml <- sem(model = mod, data = dvs_ivs, estimator = "MLR",
missing = "fiml")
show_more_options(fit_MLR_fiml)
#> Options Call Actual
#> Estimator(s) MLR ML
#> Standard Error (SE) default robust.huber.white
#> Model Test Statistic(s) default yuan.bentler.mplus
#> How Missing Data is Handled fiml ml
#> Information Matrix (for SE) default observed
#> Information Matrix (for Model Test) default observed
#> Mean Structure default Yes
#> 'x' Fixed default TRUE
In structural equation modeling, closed-form solution is rare and optimization (minimization) is used to find the/a solution. Out of curiosity and teaching, I wrote a function to capture the minimization history so I can examine it or even plot the process. The function, record_history()
, is still in early development but should work for now for common simple scenarios. Please refer the help page of record_history()
to learn more about it.
In lavaan
, I rarely need to manually add covariances between exogenous variables (defined in a loose sense: variables appear on the right-hand side but not on the left-hand side of ~
). However, I came into a situation in which lavaan
will not do this (for good reasons). For example, when a covariance between a residual term and an exogenous variable is set to free. I wrote two simple functions, add_exo_cov()
and auto_exo_cov()
, for this purpose. Please refer to their help pages for further information.