cobalt
News and
UpdatesAdded support in bal.plot()
for negative weights
with type = "density"
.
Added support for ps.cont()
objects from the
twangContinuous
package. ps.cont
objects from
WeightIt
are no longer supported.
Major documentation overhaul. More arguments are explained at
help("bal.tab")
and a new package help page can be found at
help("cobalt-package")
.
The function call is no longer included in the
bal.tab()
results for objects from
twang
.
Fixed a bug when some predictors were binary in some clusters and continuous in others. Variables now have a stable type across partitions.
Fixed a bug where binary variables were not being correctly
processed when using the formula
interface.
When using poly
, orthogonal polynomials can be
requested by setting orth = TRUE
.
Improved appearance of conditional examples in
pkgdown
site.
Removed mlogit
from Suggests.
Returned sbw
to Suggests.
Updated the logo, thanks to Ben Stillerman.
When pairwise = FALSE
with binary or multi-category
treatments, the balance statistics now refer to the difference between
each group and the original full sample, unadjusted except possibly by
s.weights
. Previously, they referred to the difference
between each group and the combined adjusted sample.
When subclassification is used and some units are discarded,
bal.tab()
now reports the number of discarded units along
with with the number of units in each subclass in the sample sizes
table.(#59)
Fixed several bugs when using love.plot()
with
subclassification that were caused by the last update. Thanks to Mario
Lawes for pointing them out.
Fixed a bug in how get.w()
computed weights for
Match
objects resulting from Matching::Match()
with estimand = "ATE"
. Results now agree with
Matching::MatchBalance()
.
Fixed a bug that would occur when using cobalt
functions without attaching the package (e.g.,
cobalt::bal.tab()
). (#53)
Fixed a bug that would occur with ordinal treatments.
Added better support for negative weights.
Fixed typos (#54, many identified and fixed by @jessecambon).
Added support for objects from the new version of
MatchThem
.
Fixed a bug and improved speed when using
match.strata
.
Returned cem
to Suggests.
Added ability to display threshold summaries with multiply imputed datasets, clustered datasets, multi-category treatments, and longitudinal treatments.
Added pairwise
argument for binary treatments. When
set to FALSE
, bal.tab()
will display balance
between each treatment group and the full sample (i.e., the target
population). This functionality already existed for multi-category
treatments; indeed, for binary treatments, it works by treating the
treatment as multi-category.
Added two new stats
options in
bal.tab()
and love.plot()
for continuous
treatments: "mean.diffs.target"
(abbreviated as
"m"
) and "ks.statistics.target"
(abbreviated
as "ks"
). These compute (standardized) mean differences and
KS statistics between the weighted and unweighted samples to ensure the
weighted sample is representative of the original population. These
statistics are only computed for the adjusted sample (i.e., they will
not appear in the absence of adjustment).
With subclassification methods, the arguments
which.subclass
and subclass.summary
have been
added to display balance on individual subclasses and control output of
the balance across subclasses summary. These arguments replace the
disp.subclass
argument, which can still be used.
When using bal.plot()
with clustered or multiply
imputed data, the which.cluster
and which.imp
arguments can be set to .none
to display balance ignoring
cluster membership and combining across imputations.
Changed processing of the print()
method. Now there
is only one print()
method (print.bal.tab()
)
for all bal.tab
objects. Processing is a little smoother
and some printing bugs have been fixed. "as.is"
can no
longer be supplied to keep the print setting as-is; simply omit the
corresponding argument to use the options as specified in the call to
bal.tab()
.
An additional argument, disp.call
, can be supplied
to bal.tab()
and print.bal.tab()
to control
printing of the call
component of the input object, which
contains the original function call. Set to FALSE
to hide
the call. This option is documented in ?display_options
and
can also be set using set.cobalt.options()
.
The balance table component of bal.tab
objects is
smaller because some extraneous columns are no longer produced. In
particular, if no threshold is requested, no threshold columns will be
produced. This does not affect display, but makes it easier to extract
balance statistics from bal.tab
objects (e.g., for
exporting as a table). This does mean that previously saved
bal.tab
objects produced by earlier versions of
cobalt
will not be able to be printed correctly.
Fixed a bug where bal.plot()
would incorrectly
process 2-level factor variables (#48).
Fixed a bug where love.plot()
would not display
variables in the correct order when using aggregation and setting
var.order = NULL
. Thanks to Florian Kaiser.
Fixed a bug in love.plot()
where the color of points
could be incorrect.
Fixed a bug in love.plot()
where samples were not
always displayed in the right order. Now they are displayed in the same
order they are in bal.tab()
.
Fixed a bug in love.plot()
when the weight names had
spaces in them.
Added an error message when not all clusters contain all treatment levels. Thanks to Rachel Visontay.
Fixed a bug when supplying the weights
argument as a
list of supported objects (e.g., weightit
objects) if they
were unnamed. Samples are more conveniently named.
Fixed a bug in col_w_mean()
,
col_w_smd()
, and friends that occurred when few nonzero
weights were present. Now an informative error is thrown.
Updates to documentation.
Sampling weights now function correctly with subclassification.
Fixed a bug in print.bal.tab()
when no units were
unmatched but some were discarded.
Fixed an issue with the version number for gridExtra
in DESCRIPTION
. (#47)
cem
removed from Suggests because it has been
removed from CRAN.
Updated to support MatchIt
4.0.0, which includes
sampling weights and improved processing of the covariates.
Fixed bugs in processing functions in formulas, including
rms
functions and poly()
. (#40)
Fixed a bug in how KS statistics were computed with
col_w_ks()
. Results now agree with those from
MatchIt
and twang
.
Fixed bugs in processing small and partially empty subclasses.
In functions that compute weights from matching strata (e.g.,
get.w()
for some types of objects), an
estimand
argument can be supplied to choose which formula
is used to compute the weights. Subclass propensity scores are computed
as the number of treated units in each subclass, and then stabilized
weights are computed from those propensity scores using the standard
formulas.
Effective sample sizes now print only up to two digits (believe me, you don’t need three) and print more cleanly with whole numbers.
Fixed a bug due to new version of sbw
.
Minor improvements to error messages and documentation.
Fixed a bug where int
and poly
were
ignored with binary and continuous treatments.
Fixed a bug where subclass balance statistics were incorrectly computed. Thanks to Mario Lawes.
Improved processing of inappropriately given S4 objects.
Removed bal.tab
methods for atomic vectors (which
were undocumented). The errors they would provide when inappropriately
supplied were unhelpful.
Fixed a bug with backports
1.1.7 not running
correctly.
Fixed a bug with str2expression
for R versions below
3.6.0. Thanks to @kthohr and @jimmyg909.
When data is segmented (i.e., with a multi-category or
longitudinal treatment or when clusters or multiple imputations are
specified), the balance summary across segments will not be computed or
displayed when individual segment balance is requested. See
?display_options
to see the defaults for the different
segment types, some of which have changed.
Updated some warnings.
Added support for Matchby
objects resulting from a
call to Matchby()
in the Matching
package.
These function identically to Match
objects.
When using formula
inputs, interaction terms (e.g.,
X1 * X2
) will now correctly be resolved and displayed as an
interaction term. This makes it easier to check balance on specific
interactions rather than having to set int = TRUE
or create
a separate interaction variable in the data. Interaction terms specified
in this way will be ignored by int
and poly
when they are used. When using var.names
with
love.plot()
, changing the names of the base components of
the interaction will also change their name in the interaction term,
consistent with int
behavior. If a formula
was
supplied in the input object to bal.tab()
or other
functions, terms in that formula will now also be included in balance
reports.
Arguments to addl
can now be specified as a
one-sided formula (e.g., ~ X1 + X2 * X3
). This makes it
easy to take advantage of the above changes to the formula interface to
add additional interaction terms. The formula will look at all available
datasets in the conditioning object or supplied to
bal.tab()
and at the global environment. If supplying a
single variable that exists in the global environment, it makes sense to
supply it as a formula (e.g., addl = ~ X1
) rather than as
just the variable (e.g., addl = X1
). Doing the former will
retain the name of the variable. The same can be done with
distance
. If variables in addl
are perfectly
correlated with or have the same name as supplied covariates, those
variables will be removed from addl
.
If only one argument is provided to f.build()
(e.g.,
f.build("x")
), it will be treated as the right-hand-side of
the formula with no left-hand-side (e.g., the above will evaluate to
~ x
).
Fixed bug that caused match.strata
input to be
ignored.
Improved processing and error reporting when using the default
bal.tab()
method.
Speed improvements due to changes in how formulas are processed
(now using model.matrix()
directly rather than
splitfactor()
to process factors) and other small fixes.
This is what enables the above changes to the formula
capabilities.
Improved documentation for weightitMSM
objects from
WeightIt
and CBMSM
objects from
CBPS
.
Fixed bug when printing bal.tab
objects with
continuous treatments.
Fixed bug when using multi-category treatments with numbers as the level names.
Fixed bug when using mnps
objects from
twang
with multiple stop methods.
Fixed bug when requesting means or standard deviations with segmented data.
Fixed bug where the x-axis in love.plot
was always
“Standardized Mean Differences” even when it wasn’t supposed to be for
stats = "mean.diffs"
.
rlang
is now in IMPORTS.
General speed and stability improvements.
Added support for sbwcau
objects from
sbw
. See Appendix 1 or ?bal.tab.sbw
for an
example.
Added support for cem.match
objects from
cem
. See Appendix 1 or ?bal.tab.cem.match
for
an example.
Added stats
argument to bal.tab()
and
print()
to replace disp.v.ratio
and
disp.ks
. This argument functions similarly to how it does
in love.plot()
; for example, to request mean differences
and variance ratios, one can enter stats = c("m", "v")
. One
consequence of this is that it is possible to request statistics that
don’t include mean differences. See ?display_options
for
more details. The old arguments still work (and probably always will)
but you should use stats
instead. The goal here was to
unify syntax across bal.tab()
, print()
, and
love.plot()
. A new help page specifically for the
stats
argument can be viewed at
?balance.stats
.
Added thresholds
argument to bal.tab()
to replace m.threshold
, v.threshold
, etc. This
argument functions similarly to how it does in love.plot()
;
for example, to request thresholds for mean differences and variance
ratios, one can enter thresholds = c(m = .1, v = 2)
. The
old arguments still work (and probably always will) but you should use
thresholds
instead. The goal here was to unify syntax
across bal.tab()
, print()
, and
love.plot()
.
Added disp.means
option to bal.plot
to
display the mean of the covariate as a line on density plots and
histograms.
Added "hedges"
as an option to
s.d.denom
. This will compute the standardized mean
difference using the formula for the small sample-corrected Hedge’s G as
described in the What Works Clearinghouse Procedures Handbook.
With multi-category treatments when
pairwise = FALSE
, rather than computing balance between
each treatment group and the other treatment groups, balance is now
computed between each treatment group and the entire sample.
In print()
, the arguments
disp.m.threshold
, disp.v.threshold
,
disp.ks.threshold
, and disp.r.threshold
, which
could be set to FALSE
to prevent the corresponding balance
thresholds and summaries from being printed, have been replaced with
disp.thresholds
. Named entries can be set to
FALSE
. The goal here was to unify syntax across
bal.tab()
and print()
.
A new balance statistic, the overlapping coefficient (OVL), is
allowed with binary and multi-category treatments. This is described in
Belitser et al. (2011) and Franklin et al. (2014) for assessing balance.
Generally, for each covariate, the overlapping coefficient is the area
of the probability density functions for each sample that overlap. Here
I follow Franklin et al. (2014) and report 1 - (OVL) so that values
close to zero indicate good balance (i.e., completely overlapping
distributions) and values close to 1 indicate poor balance (i.e.,
completely non-overlapping distributions). To estimate and display the
OVL, set include "ovl"
in the stats
argument
in a call to bal.tab()
or love.plot()
(or you
can use the old syntax by setting disp.ovl = TRUE
). The
balance threshold can be requested by including "ovl"
in
the thresholds
argument (or you can use the old syntax by
using the ovl.threshold
argument).
Spearman correlations can be requested for continuous treatments
by adding "sp"
to the stats
argument.
The argument weights
can now be supplied to any
bal.tab()
call to request balance on additional weights
beyond the weights from the object on which bal.tab()
is
called. This argument takes a named list, where each element is a vector
of weights, the name of a variable containing weights in an available
dataset, or an object with a get.w()
method (e.g., the
output of another preprocessing function). This should make it easier to
compare balancing methods without having to specify the covariates and
treatment using the formula
or data.frame
methods.
ggplot2
version 3.3.0 is required, which removes
some warnings and makes it so ggstance
doesn’t need to be
imported.
When there are more than 900 variables to compute balance
statistics on in bal.tab
(which can happen quickly when
int = TRUE
and categorical variables have many categories),
to avoid major slowdowns, checks for redundancy of variables are
forgone. This will dramatically increase the speed of
bal.tab
in these scenarios. This option can be changed with
the cobalt
option "remove_perfect_col"
which
can be set to TRUE
or or FALSE
. Set to
FALSE
to improve speed at the expense of possibly having
redundant variables appear.
Fixed a bug when using the default bal.tab
method
with objects containing longitudinal treatments.
Fixed a bug when using bal.tab
with continuous
treatments and clusters.
Fixed a bug in love.plot()
when using
subclassification.
Fixed a bug when using bal.tab
with longitudinal
treatments and multiple sets of weights.
Fixed a bug when using col_w_ovl()
. OVL values are
now more accurate.
Speedups and other small fixes.
Major Updates
Added support for mimids
and wimids
objects from MatchThem
.
Major restructuring so that clusters, longitudinal treatments,
multi-category treatments, and multiply imputed data can all be used
with each other. These are layers in the following order: clusters, time
points, treatment categories, and imputations. Summaries across these
layers are handled slightly differently from how they used to be;
importantly, summaries are not nested, and only the lowest layer present
can have a summary. For example, if multiply imputed data is used with
multi-category treatments, there will be a summary across imputations
(the lowest layer) but not across treatment pairs.
love.plot
allows multiple forms of faceting and aggregating
and is extremely flexible in this regard.
Major changes to appearance of bal.plot()
to be more
in line with love.plot()
, including new grid
and position
options to control the presence of the grid
and the position of the legend.
Formula interfaces now accept poly(x, .)
and other
matrix-generating functions of variables, including the
rms
-class-generating functions from the rms
package (e.g., pol()
, rcs()
, etc.) (the
rms
package must be loaded to use these latter ones) and
the basis
-class-generating functions from the
splines
package (i.e., bs()
and
ns()
). A bug in an early version of this was found by @ahinton-mmc.
Minor Updates and Bug Fixes
s.d.denom
and
estimand
s.d.denom
can now use the name of a treatment rather
than just "treated"
or "control"
. In addition,
s.d.denom
can be "weighted"
to use the
weighted sample’s standardization factors, an option available for
continuous treatments, too.
Improved guessing of the estimand when not provided.
Estimands besides ATT can now be used with subclasses. The
estimand can be inferred from the provided subclasses. Works with
match.strata
as well, which function like subclasses. In
addition, it is not always assumed that MatchIt
objects are
targeting the ATT, for example, with subclassification or
calipers.
bal.plot
Added sample.names
argument in bal.plot
in response to this
post on Cross Validated.
Added functionality to the which
argument in
bal.plot
, allowing more specificity when multiple sets of
weights are used.
Added type = "ecdf"
option to bal.plot
for categorical treatments with continuous covariates to display
empirical cumulative density plots as an alternative to density
plots.
When using bal.plot
with continuous treatments and
continuous covariates, the points are shaded based on their weights;
this behavior is controlled by the new alpha.weight
argument, which replaces the functionality of size.weight
(which was kind of ugly and not very informative) and is
TRUE
by default. Now it’s more apparent which points are
influential in the weighted sample. In addition, a line illustrating the
unweighted covariate mean is present.
The default of the grid
argument is now
FALSE
in bal.plot()
and
love.plot()
. Previously it was TRUE
. This make
the plots cleaner at the outset.
Other improvements
Added new function col_w_cov()
to compute
treatment-covariate covariances (i.e., unstandardized correlations) for
continuous treatments. continuous
and binary
can be set to "raw"
in bal.tab()
and
std
can be set to FALSE
in
col_w_cov()
to request treatment-covariate covariances
instead of correlations. col_w_corr()
is now a wrapper for
col_w_cov()
with std = TRUE
. To get more
functionality out of the std
argument (e.g., to standardize
the covariances for some covariates but not others), use
col_w_cov()
.
Balance summary functions (e.g., col_w_sd()
,
col_w_smd()
, etc.) process binary variables slightly
differently. If bin.vars
is missing, the function will
figure out which variables are binary. If NULL
, it will be
assumed no variables are binary. Entering values for
bin.vars
can be done more flexibly. When a factor variable
is supplied as part of mat
and is split internally by
splitfactor()
, extra values will be automatically added to
bin.vars
with the newly created dummies considered binary
variables.
Bug fixes when binary factor treatments are used, thanks to Moaath Mustafa Ali.
bal.tab()
no longer tells you whether it assumes
matching or weighting when certain non-package-related methods are
used.
Improvements to assessment of subclass balance. For binary
treatments, balance statistics other than mean differences can now be
requested. The across-subclass balance summary uses subclassification
weights (processed in the same way match.strata
is) instead
of simply taking a weighted average across subclasses (which is not
valid for non-additive statistics like variance ratios or KS
statistics). For continuous treatments, a balance summary across
subclasses can now be produced. This uses a weighted average of the
subclass-specific balance statistics.
The default in love.plot()
for abs
is
now to be whatever it is in the (implicit) call to
bal.tab()
, which is usually FALSE
. Previously
abs
was not aligned between love.plot()
and
bal.tab()
.
s.weights
can now be manually supplied to methods
that usually come with their own sampling weights, such as
twang
and WeightIt
.
Speedup of splitfactor()
.
splitfactor()
now has a split.with
option to split one or more vectors in concert with the data set being
split.
splitfactor()
and unsplitfactor()
are a
little smarter and more in sync.
All functions work better inside other functions like
lapply()
or purrr::map()
, thanks to @the-Zian.
Updates to the vignettes; Appendix 2 is particularly different.
Other bug fixes and performance improvements here and there.
Added vignette for use of love.plot
.
Changed grid
version requirement.
Updated README.
Fixed bugs that would occur when using love.plot()
with various combinations of var.order
, multiple
stats
, and agg.fun = "range"
.
Fixed bugs that would occur when using bal.tab()
with objects from the Matching
package. Calculated
statistics are now the same as those generated using
Matching::MatchBalance
. Changes based on updates to
get.w.Match()
.
Added balance summary functions col_w_mean
,
col_w_sd
, col_w_smd
, col_w_vr
,
col_w_ks
, col_w_ovl
, and
col_w_corr
. These make it easier to get quick, simple
summaries of balance without calling bal.tab
, for example,
for use in programming other functions. Some of these are now used
inside bal.tab
to increase speed and simplify internal
syntax.
Other small bug fixes.
Added the ability to display balance on multiple measures (e.g.,
mean differences, variance ratios, KS statistics) at the same time with
love.plot()
.
Bug fixes that make bal.tab()
and
love.plot()
more usable within other functions and
especially when called with do.call()
.
Made it easier to get proper bal.tab
output when
using matchit()
with an argument to distance
(in the call to matchit()
). Include the original dataset in
the data
argument of bal.tab()
to get the
variables to display correctly.
Changed the default shape in love.plot()
to
"circle"
, which is a solid circle. I found this a prettier
alternative to the open circle, especially on Windows. To get back open
circles you set shapes = "circle filled"
(yes, that is a
bit confusing).
Added ability to hide the gridlines easily in
love.plot()
.
Changed the calculation of standard deviations (and standardized
differences in proportion) for binary variables to be more in line with
recommendations, as noted by @mbloechl05. Note this will make these
values different from those in MatchIt::summary
by a small
amount.
The KS statistic is now computed for binary variables. It is simply the difference in proportion.
Allowed some methods to accept mids
objects (the
output of a call to mice::mice()
) in the data
argument to supply multiply imputed data. This essentially replaces
data = complete(imp.out, "long"), imp = ".imp"
with
data = imp.put
, assuming imp.out
is a
mids
object.
Other bug fixes and improvements.
Changes to some bal.tab
defaults: quick
is now set to TRUE
by default. Adjusted and unadjusted
means, standard deviations, and mean differences will always be
computed, regardless of quick
. Variance ratios and KS
statistics will only be computed if quick = FALSE
or
disp.v.ratio
or disp.ks
, respectively, are
TRUE
.
Variance ratios now respond to abs
. When
abs = FALSE
, the default in bal.tab
, the
variance is ratio is the variance of the treated (1) divided by the
variance of the control (0). When abs = TRUE
, the numerator
of the variance ratio is the larger variance and the denominator is the
smaller variance, which was the old behavior. v.threshold
still responds as if abs
was set to TRUE
, just
like with mean differences. Any time variance ratios are aggregated
(e.g., across imputations or clusters), the “mean” variance ratio is the
geometric mean to account for the asymmetry in the ratios.
love.plot
has several changes that make it much more
user-friendly. First, rather than supplying a bal.tab
object to love.plot
, you can simply supply the arguments
that would have gone into the bal.tab()
call straight into
love.plot()
. Second, if quick = TRUE
(the new
default) and the first argument to love.plot()
is a call to
bal.tab()
(or arguments provided to bal.tab()
)
and stat
is set to "variance.ratios"
or
"ks.statistics"
, bal.tab()
will be re-called
with the corresponding disp
argument set to
TRUE
so that love.plot()
will display those
statistics regardless of quick
. This will not work if the
argument supplied to love.plot()
is a bal.tab
object. Third, because unadjusted mean differences are computed
regardless of quick
, there will never be a circumstance in
which only adjusted values will be displayed. If
quick = TRUE
, un = FALSE
, and
stat
is "variance.ratios"
or
"ks.statistics"
, un
will automatically be set
to TRUE
in the bal.tab()
re-call.
When using which.
arguments (e.g.,
which.cluster
, which.imp
, etc.), instead of
supplying NULL
and NA
, you can supply
.all
and .none
(not in quotes). This should
make them easier to use. Note that these new inputs are not variables;
they are keywords and are evaluated using nonstandard evaluation. If you
actually have objects with those names, they will be ignored.
Bugs in scoping related to the formula interface have been
solved, in particularly making bal.tab()
more usable within
other functions.
Fixed bug occurring when using matchit
objects
having set discard
to something other than
NULL
and reestimate = TRUE
in the call to
matchit()
. Thank you to Weiyi Xie for finding this
bug.
Fixed bug occurring when using balance thresholds with subclassification.
Fixed bug occurring when printing bal.tab
output for
continuous treatments with clusters.
Fixed bug occurring when using bal.tab()
on
mnps
objects with multiple stop methods.
Added poly
argument to bal.tab()
to
display polynomials of continuous covariates (e.g., squares, cubes,
etc.). This used to only be available with the int
argument, which also displayed all interactions. Now, the polynomials
can be requested separately. When int = TRUE
, squares of
the covariates will no longer be displayed; to replicate the old
behavior, set int = 2
, which is equivalent to
int = TRUE, poly = 2
.
Fixed a bug where using subset
would produce an
error.
Fixed a bug when using multiply imputed data with binary treatments that were factors or characters.
Updated the bal.tab
documentation to make it easier
to navigate to the right page.
Small documentation and syntax updates.
Added the hidden and undocumented argument center
to
bal.tab
, which, when set to TRUE
, centers the
covariates at the mean of the entire unadjusted sample prior to
computing interactions and polynomials.
Added set.cobalt.options
function to more easily set
the global options that can be used as defaults to some arguments. For
example, set.global.options(binary = "std")
makes it so
that standardized mean difference are always displayed for binary
covariates (in the present R session). The options can be retrieved with
get.cobalt.options
.
Several changes to bal.tab()
display options (i.e.,
imbalanced.only
, un
, disp.means
,
disp.v.ratio
, disp.ks
,
disp.bal.tab
, disp.subclass
, and parameters
related to the display of balance tables with multinomial treatments,
clusters, multiple imputations, and longitudinal treatments). First, the
named arguments have been removed from the method-specific functions in
order to clean them up and make it easier to add new functions, but they
are still available to be specified. Second, a help page devoted just to
these functions has been created, which can be accessed with
?options-display
. Third, global options for these arguments
can be set with options()
so they don’t need to be typed
each time. For example, if you wanted un = TRUE
all the
time, you could set options(cobalt_un = TRUE)
once and not
have to include it in the call to bal.tab()
.
Added disp.sds
option to display standard deviations
for each group in bal.tab()
. This works in all the same
places disp.means
does.
Added cluster.fun
and imp.fun
options
to request that only certain functions (e.g., mean or maximum) of the
balance statistics are displayed in the summary across
clusters/imputations. Previously this option was only available by call
print()
. These parameters are part of the display options
described above, so they are documented in ?options-display
and not in the bal.tab
help files.
Added factor_sep
and int_sep
options to
change the separators between variable names when factor variables and
interactions are displayed. This functionality had been available since
version 3.4.0 but was not documented. It is now documented in the new
display_options
help page.
In bal.tab()
, continuous
and
binary
can be specified with the global options
"cobalt_continuous"
and "cobalt_binary"
,
respectively, so that a global setting (e.g., to set
binary = "std"
to view standardized mean difference rather
than raw differences in proportion for binary variables) can be used
instead of specifying the argument each time in the call to
bal.tab()
.
Minor updates to f.build()
to process inputs more
flexibly. The left hand side can now be empty, and the variables on the
right hand side can now contain spaces.
Fixed a bug when logical treatments were used. Thanks to @victorn1.
Fixed a bug that would occur when a variable had only one value. Thanks to @victorn1.
Made it so the names of 0/1 and logical variables are not printed
with "_1"
appended to them. Thanks to @victorn1 for the
suggestion.
Major updates to the organization of the code and help files.
Certain functions have simplified syntax, relying more on
...
, and help pages have been shorted and consolidated for
some methods. In particular, the code and help documents for the
Matching
, optmatch
, ebal
, and
designmatch
methods of bal.tab()
have been
consolidated since they all rely on exactly the same syntax.
Fixed a bug that would occur when
imabalanced.only = TRUE
in bal.tab()
but all
variables were balanced.
Fixed a bug where the mean of a binary variable would be displayed as 1 minus its mean.
Fixed a bug that would occur when missingness patterns were the same for multiple variables.
Fixed a bug that would occur when a distance measure was to be
assessed with bal.tab()
and there were missing values in
the covariates (thanks to Laura Helmkamp).
Fixed a bug that would occur when estimand
was
supplied by the user when using the default
method of
bal.tab()
.
Fixed a bug where non-standard variable names (like
"I(age^2)"
) would cause an error.
Fixed a bug where treatment levels that had different numbers of characters would yield an error.
Added disp.means
option to bal.tab
with
continuous treatments.
Added default
method for bal.tab
so it
can be used with specially formatted output from other packages (e.g.,
from optweight
). bal.plot
should work with
these outputs too. This, of course, will never be completely bug-free
because infinite inputs are possible and cannot all be processed
perfectly. Don’t try to break this function :)
Fixed some bugs occurring when standardized mean differences are not finite, thanks to Noémie Kiefer.
Speed improvements in bal.plot
, especially with
multiple facets, and in bal.tab
.
Added new options to bal.plot
, including the ability
to display histograms rather than densities and mirrored rather than
overlapping plots. This makes it possible to make the popular mirrored
histogram plot for propensity scores. In addition, it’s now easier to
change the colors of the components of the plots.
Made behavior around binary variables with interactions more like
documentation, where interactions with both levels of the variable are
present (thanks to @victorn1). Also, replaced _
with *
as the delimiter between variable names in
interactions. For the old behavior, use int_sep = "_"
in
bal.tab
.
Expanded the flexibility of var.names
in
love.plot
so that replacing the name of a variable will
replace it everywhere it appears, including interactions. Thanks to
@victorn1 for the
suggestion.
Added var.names
function to extract and save
variable names from bal.tab
objects. This makes it a lot
easier to create replacement names for use in love.plot
.
Thanks to @victorn1
for the suggestion.
When weighted correlations are computed for continuous
treatments, the denominator of the correlation now uses the unweighted
standard deviations. See ?bal.tab
for the
rationale.
Added methods for objects from the designmatch
package.
Added methods for ps.cont
objects from the
WeightIt
package.
Fixed bugs resulting form changes to how formula inputs are handled.
Cleaned up some internal functions, also fixing some related bugs
Added subset
option in all bal.tab()
methods (and consequently in bal.plot()
) that allows users
to specify a subset of the data to assess balance on (i.e., instead of
the whole data set). This provides a workaround for methods were the
cluster
option isn’t allowed (e.g., longitudinal
treatments) but balance is desired on subsets of the data. However, in
most cases, cluster
with which.cluster
specified makes more sense.
Updated help files, in particular, more clearly documenting
methods for iptw
objects from twang
and
CBMSM
objects from CBPS
.
Added pretty printing with crayon
, inspired by Jacob
Long’s jtools
package
Added abs
option to bal.tab
to display
absolute values of statistics, which can be especially helpful for
aggregated output. This also affects how love.plot()
handles aggregated balance statistics.
Added support for data with missing covariates.
bal.tab()
will produce balance statistics for the
non-missing values and will automatically create a new variable
indicating whether the variable is missing or not and produce balance
statistics on this variable as well.
Fixed a bug when displaying maximum imbalances with subclassification.
Fixed a bug where the unadjusted statistics were not displayed
when using love.plot()
with subclasses. (Thanks to Megha
Joshi.)
Add the ability to display individual subclass balance using
love.plot()
with subclasses.
Under-the-hood changes to how weightit
objects are
handled.
Objects in the environment are now handled better by
bal.tab()
with the formula interface. The data
argument is now optional if all variables in the formula exist in the
environment.
Fixed a bug when using get.w()
(and
bal.tab()
) with mnps
objects from
twang
with only one stop method.
Fixed a bug when using bal.tab()
with
twang
objects that contained missing covariate
values.
Fixed a bug when using int = TRUE
in
bal.tab()
with few covariates.
Fixed a bug when variable names had special characters.
Added ability to check higher order polynomials by setting
int
to a number.
Changed behavior of bal.tab()
with multinomial
treatments and s.d.denom = "pooled"
to use the pooled
standard deviation from the entire sample, not just the paired
treatments.
Restored some vignettes that required
WeightIt
.
Edits to vignettes and help files to respond to missing packages. Some vignette items may not display if packages are (temporarily) unavailable.
Fixed issue with sampling weights in CBPS
objects.
(Thanks to @kkranker
on Github.)
Added more support for sampling weights in get.w()
and help files.
Added support for longitudinal treatments in
bal.tab()
, bal.plot()
, and
love.plot()
, including output from iptw()
in
twang
, CBMSM()
from CBPS
, and
weightitMSM()
from WeightIt
.
Added a vignette to explain use with longitudinal treatments.
Edits to help files.
Added ability to change density options in
bal.plot()
.
Added support for imp
in bal.tab()
for
weightit
objects.
Fixed bugs when limited variables were present. (One found and fixed by @sumtxt on Github.)
Fixed bug with multiple methods when weights were entered as a list.
Added full support for tibbles.
Examples for weightit
methods in documentation and
vignette now work.
Improved speed and performance.
Added pairwise
option for bal.tab()
with multinomial treatments.
Increased flexibility for displaying balance using
love.plot()
with clustered or multiply imputed
data.
Added imbalanced.only
and disp.bal.tab
options to bal.tab()
.
Fixes to the vignettes. Also, creation of a new vignette to simplify the main one.
Added support for multinomial treatments in
bal.tab()
, including output from CBPS
and
twang
.
Added support for weightit
objects from
WeightIt
, including for multinomial treatments.
Added support for ebalance.trim
objects from
ebal
.
Fixes to the vignette.
Fixes to splitfactor()
to handle tibbles
better.
Fixed bug when using bal.tab()
with multiply imputed
data without adjustment. Fixed bug when using s.weights
with the formula
method of bal.tab()
.
Added disp.ks
and ks.threshold
options
to bal.tab()
to display Kolmogorov-Smirnov statistics
before and after preprocessing.
Added support for sampling weights, which are applied to both
control and treated units, using option s.weights
in
bal.tab()
. Sampling weights are also now compatible with
the sampling weights in ps
objects from twang
;
the default is to apply the sampling weights before and after
adjustment, mimicking the behavior of bal.table()
in
twang
.
Changed behavior of bal.tab()
for ps
objects to allow displaying balance for more than one stop method at a
time, and to default to displaying balance for all available stop
methods. The full.stop.method
argument in
bal.tab()
has been renamed stop.method
, but
full.stop.method
still works. get.w()
for
ps
objects has also gone through some changes to be more
like twang
’s get.weights()
.
Added support in bal.tab()
and
bal.plot()
for subclassification with continuous
treatments.
Added support in splitfactor()
and
unsplitfactor()
for NA
values
Fixed a bug in love.plot()
caused when
var.order
was specified to be a sample that was not
present.
Added support in bal.tab()
, bal.plot()
,
and love.plot()
for examining balance on multiple weight
specifications at a time
Added new utilities splitfactor()
,
unsplitfactor()
, and get.w()
Added option in bal.plot()
to display points sized
by weights when treatment and covariate are continuous
Added which = "both"
option in
bal.plot()
to simultaneously display plots for both
adjusted and unadjusted samples; changed argument syntax to
accommodate
Allowed bal.plot()
to display balance for multiple
clusters and imputations simultaneously
Allowed bal.plot()
to display balance for multiple
subclasses simultaneously with which.sub
Fixes to love.plot()
to ensure adjusted points are
in front of unadjusted points; changed colors and shape defaults and
allowable values
Fixed bug where s.d.denom
and estimand
were not functioning correctly in bal.tab()
distance
, addl
, and
weights
can now be specified as lists of the usual
arguments
Added support for matching using the optmatch
package or by specifying matching strata.
Added full support (bal.tab()
,
love.plot()
, and bal.plot()
) for multiply
imputed data, including for clustered data sets.
Added support for multiple distance measures, including special
treatment in love.plot()
Adjusted specifications in love.plot()
for color and
shape of points, and added option to generate a line connecting the
points.
Adjusted love.plot()
display to perform better on
Windows.
Added capabilities for love.plot()
and
bal.plot()
to display plots for multiple groups at a
time
Added flexibility to f.build()
.
Updated bal.plot()
, giving the capability to view
multiple plots for subclassified or clustered data. Multinomial
treatments are also supported.
Created a new vignette for clustered and multiply imputed data
Speed improvements
Fixed a bug causing mislabelling of categorical variables
Changed calculation of weighted variance to be in line with
recommendations; CBPS
can now be used with standardized
weights
Added support for entropy balancing through the ebal
package.
Changed default color scheme of love.plot()
to be
black and white and added options for color, shape, and size of
points.
Added sample size calculations for continuous treatments.
Edits to the vignette.
Increased capabilities for cluster balance in
bal.tab()
and love.plot()
Increased information and decreased redundancy when assessing balance on interactions
Added quick
option for bal.tab()
to
increase speed
Added options for print()
Bug fixes
Speed improvements
Edits to the vignette
Added support for continuous treatment variables in
bal.tab()
, bal.plot()
, and
love.plot()
Added balance assessment within and across clusters
Other small performance changes to minimize errors and be more intuitive
Major revisions and adjustments to the vignette
Added a vignette.
Fixed error in bal.tab.Match()
that caused wrong
values and and warning messages when used.
Added new capabilities to bal.plot()
, including the
ability to view unadjusted sample distributions, categorical variables
as such, and the distance measure. Also updated documentation to reflect
these changes and make which.sub
more focal.
Allowed subclasses to be different from simply 1:S by treating them like factors once input is numerical
Changed column names in Balance table output to fit more compactly, and updated documentation to reflect these changes.
Other small performance changes to minimize errors and be more intuitive.