- Fixed `generate()` errors when columns are named `x` (#431).
- Fixed `visualize()` when passed `generate()`d `infer_dist` objects that had not been passed to `hypothesize()` (#432).
- Updated `visualize()` output to align with the R 4.1.0+ graphics engine (#438).
- `specify()` and wrapper functions now appropriately handle ordered factors (#439).
- Made `generate()`'s unexpected `type` warnings more permissive; the warning will be raised less often when `type = "bootstrap"` (#425).
- Fixed passing of arguments to `stats::chisq.test` via `...` in `calculate()`. Ellipses are now always passed to the applicable base R hypothesis testing function, when applicable (#414)!
- Shorthand test wrappers now warn when `success` is not supplied explicitly and the value used by default is `TRUE`. Core verbs have warned without an explicit `success` value already, and this change makes behavior consistent with the functions being wrapped by shorthand test wrappers (#440).
- Added support for `stat = "ratio of means"` (#452); a sketch follows this list.
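A hedged sketch of the new statistic, assuming the package's `gss` dataset (with the numeric `hours` and binary `college` variables used elsewhere in these notes):

```r
# ratio of mean hours worked, degree-holders vs. non-degree-holders
gss %>%
  specify(hours ~ college) %>%
  calculate(stat = "ratio of means", order = c("degree", "no degree"))
```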
This release reflects the infer version accepted to the Journal of Open Source Software.

- Updated the package's licensing; see the `LICENSE` and `LICENSE.md` files.
- Materials for the paper are available in `/figs/paper`.
v1.0.0 is the first major release of the {infer} package! By and
large, the core verbs specify()
,
hypothesize()
, generate()
, and
calculate()
will interface as they did before. This release
makes several improvements to behavioral consistency of the package and
introduces support for theory-based inference as well as
randomization-based inference with multiple explanatory variables.
A major change to the package in this release is a set of standards for behavioral consistency of `calculate()` (#356). Namely, the package will now

- supply a consistent error when the supplied `stat` argument isn't well-defined for the variables `specify()`d

```r
gss %>%
  specify(response = hours) %>%
  calculate(stat = "diff in means")
#> Error: A difference in means is not well-defined for a
#> numeric response variable (hours) and no explanatory variable.
```
or
```r
gss %>%
  specify(college ~ partyid, success = "degree") %>%
  calculate(stat = "diff in props")
#> Error: A difference in proportions is not well-defined for a dichotomous categorical
#> response variable (college) and a multinomial categorical explanatory variable (partyid).
```
- supply a consistent message when the user supplies unneeded information via `hypothesize()` to `calculate()` an observed statistic

```r
# supply mu = 40 when it's not needed
gss %>%
  specify(response = hours) %>%
  hypothesize(null = "point", mu = 40) %>%
  calculate(stat = "mean")
#> Message: The point null hypothesis `mu = 40` does not inform calculation of
#> the observed statistic (a mean) and will be ignored.
#> # A tibble: 1 x 1
#>    stat
#>   <dbl>
#> 1  41.4
```
and
- supply a consistent warning and assume a reasonable null value when the user does not supply sufficient information to `calculate()` an observed statistic

```r
# don't hypothesize `p` when it's needed
gss %>%
  specify(response = sex, success = "female") %>%
  calculate(stat = "z")
#> # A tibble: 1 x 1
#>    stat
#>   <dbl>
#> 1 -1.16
#> Warning message:
#> A z statistic requires a null hypothesis to calculate the observed statistic.
#> Output assumes the following null value: `p = .5`.
```
or
```r
# don't hypothesize `p` when it's needed
gss %>%
  specify(response = partyid) %>%
  calculate(stat = "Chisq")
#> # A tibble: 1 x 1
#>    stat
#>   <dbl>
#> 1  334.
#> Warning message:
#> A chi-square statistic requires a null hypothesis to calculate the observed statistic.
#> Output assumes the following null values: `p = c(dem = 0.2, ind = 0.2, rep = 0.2, other = 0.2, DK = 0.2)`.
```
To accommodate this behavior, a number of new `calculate()` methods were added or improved. Namely:

- Extended `calculate()` with `stat = "t"` by passing `mu` to the `calculate()` method for `stat = "t"`, to allow for calculation of t statistics for one numeric variable with a hypothesized mean
- Extended `calculate()` to allow lowercase aliases for `stat` arguments (#373)
- Fixed bugs in `calculate()` to allow for programmatic calculation of statistics

This behavioral consistency also allowed for the implementation of
observe()
, a wrapper function around
specify()
, hypothesize()
, and
calculate()
, to calculate observed statistics. The function
provides a shorthand alternative to calculating observed statistics from
data:
```r
# calculating the observed mean number of hours worked per week
gss %>%
  observe(hours ~ NULL, stat = "mean")
#> # A tibble: 1 x 1
#>    stat
#>   <dbl>
#> 1  41.4

# equivalently, calculating the same statistic with the core verbs
gss %>%
  specify(response = hours) %>%
  calculate(stat = "mean")
#> # A tibble: 1 x 1
#>    stat
#>   <dbl>
#> 1  41.4
```
```r
# calculating a t statistic for hypothesized mu = 40 hours worked/week
gss %>%
  observe(hours ~ NULL, stat = "t", null = "point", mu = 40)
#> # A tibble: 1 x 1
#>    stat
#>   <dbl>
#> 1  2.09

# equivalently, calculating the same statistic with the core verbs
gss %>%
  specify(response = hours) %>%
  hypothesize(null = "point", mu = 40) %>%
  calculate(stat = "t")
#> # A tibble: 1 x 1
#>    stat
#>   <dbl>
#> 1  2.09
```
We don’t anticipate that these changes are “breaking” in the sense that code that previously worked will continue to, though it may now message or warn in a way that it did not used to or error with a different (and hopefully more informative) message.
This release also introduces a more complete and principled interface
for theoretical inference. While the package previously supplied some
methods for visualization of theory-based curves, the interface did not
provide any object that was explicitly a “null distribution” that could
be supplied to helper functions like get_p_value()
and
get_confidence_interval()
. The new interface is based on a
new verb, assume()
, which returns a null distribution that can be interfaced with in the same way as simulation-based null distributions.
As an example, we’ll work through a full infer pipeline for inference
on a mean using infer’s gss
dataset. Suppose that we
believe the true mean number of hours worked by Americans in the past
week is 40.
First, calculating the observed t-statistic:

```r
obs_stat <- gss %>%
  specify(response = hours) %>%
  hypothesize(null = "point", mu = 40) %>%
  calculate(stat = "t")

obs_stat
#> Response: hours (numeric)
#> Null Hypothesis: point
#> # A tibble: 1 x 1
#>    stat
#>   <dbl>
#> 1  2.09
```
The code to define the null distribution is very similar to that
required to calculate a theorized observed statistic, switching out
calculate()
for assume()
and replacing
arguments as needed.
```r
null_dist <- gss %>%
  specify(response = hours) %>%
  assume(distribution = "t")

null_dist
#> A T distribution with 499 degrees of freedom.
```
This null distribution can now be interfaced with in the same way as a simulation-based null distribution elsewhere in the package. For example, calculating a p-value by juxtaposing the observed statistic and null distribution:
```r
get_p_value(null_dist, obs_stat, direction = "both")
#> # A tibble: 1 x 1
#>   p_value
#>     <dbl>
#> 1  0.0376
```
…or visualizing the null distribution alone:
```r
visualize(null_dist)
```
…or juxtaposing the two visually:
```r
visualize(null_dist) +
  shade_p_value(obs_stat, direction = "both")
```
Confidence intervals lie in data space rather than the standardized
scale of the theoretical distributions. Calculating a mean rather than
the standardized t-statistic:

```r
obs_mean <- gss %>%
  specify(response = hours) %>%
  calculate(stat = "mean")
```
The null distribution here just defines the spread for the standard error calculation.
```r
ci <- get_confidence_interval(
  null_dist,
  level = .95,
  point_estimate = obs_mean
)

ci
#> # A tibble: 1 x 2
#>   lower_ci upper_ci
#>      <dbl>    <dbl>
#> 1     40.1     42.7
```
Visualizing the confidence interval results in the theoretical distribution being recentered and rescaled to align with the scale of the observed data:
```r
visualize(null_dist) +
  shade_confidence_interval(ci)
```
Previous methods for interfacing with theoretical distributions are
superseded—they will continue to be supported, though documentation will
forefront the assume()
interface.
The 2016 “Guidelines for Assessment and Instruction in Statistics
Education” [1] state that, in introductory statistics courses,
“[s]tudents should gain experience with how statistical models,
including multivariable models, are used.” In line with this
recommendation, we introduce support for randomization-based inference
with multiple explanatory variables via a new fit.infer
core verb.
If passed an infer
object, the method will parse a
formula out of the formula
or response
and
explanatory
arguments, and pass both it and
data
to a stats::glm
call.
```r
gss %>%
  specify(hours ~ age + college) %>%
  fit()
#> # A tibble: 3 x 2
#>   term          estimate
#>   <chr>            <dbl>
#> 1 intercept     40.6
#> 2 age            0.00596
#> 3 collegedegree  1.53
```
Note that the function returns the model coefficients as `estimate` rather than their associated t-statistics as `stat`.
If passed a `generate()`d object, the model will be fitted
to each replicate.
```r
gss %>%
  specify(hours ~ age + college) %>%
  hypothesize(null = "independence") %>%
  generate(reps = 100, type = "permute") %>%
  fit()
#> # A tibble: 300 x 3
#> # Groups:   replicate [100]
#>    replicate term          estimate
#>        <int> <chr>            <dbl>
#>  1         1 intercept     44.4
#>  2         1 age           -0.0767
#>  3         1 collegedegree  0.121
#>  4         2 intercept     41.8
#>  5         2 age            0.00344
#>  6         2 collegedegree -1.59
#>  7         3 intercept     38.3
#>  8         3 age            0.0761
#>  9         3 collegedegree  0.136
#> 10         4 intercept     43.1
#> # … with 290 more rows
```
If type = "permute"
, a set of unquoted column names in
the data to permute (independently of each other) can be passed via the
variables
argument to generate
. It defaults to
only the response variable.
```r
gss %>%
  specify(hours ~ age + college) %>%
  hypothesize(null = "independence") %>%
  generate(reps = 100, type = "permute", variables = c(age, college)) %>%
  fit()
#> # A tibble: 300 x 3
#> # Groups:   replicate [100]
#>    replicate term          estimate
#>        <int> <chr>            <dbl>
#>  1         1 intercept     39.4
#>  2         1 age            0.0748
#>  3         1 collegedegree -2.98
#>  4         2 intercept     42.8
#>  5         2 age           -0.0190
#>  6         2 collegedegree -1.83
#>  7         3 intercept     40.4
#>  8         3 age            0.0354
#>  9         3 collegedegree -1.31
#> 10         4 intercept     40.9
#> # … with 290 more rows
```
This feature allows for more detailed exploration of the effect of disrupting the correlation structure among explanatory variables on outputted model coefficients.
Each of the auxiliary functions get_p_value()
,
get_confidence_interval()
, visualize()
,
shade_p_value()
, and
shade_confidence_interval()
has methods to handle
fit()
output! See their help-files for example usage. Note
that shade_*
functions now delay evaluation until they are
added to an existing ggplot (e.g. that outputted by
visualize()
) with +
.
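To illustrate how these pieces compose, here is a rough sketch (not from the original notes) juxtaposing an observed model fit against fits to permuted resamples, reusing the `gss` variables from above:

```r
# observed model coefficients
obs_fit <- gss %>%
  specify(hours ~ age + college) %>%
  fit()

# coefficients from models fitted to 100 permuted resamples
null_fits <- gss %>%
  specify(hours ~ age + college) %>%
  hypothesize(null = "independence") %>%
  generate(reps = 100, type = "permute") %>%
  fit()

# one p-value per model term
get_p_value(null_fits, obs_stat = obs_fit, direction = "both")
```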
- The `generate()` type `type = "simulate"` has been renamed to the more evocative `type = "draw"`. We will continue to support `type = "simulate"` indefinitely, though supplying that argument will now prompt a message notifying the user of its preferred alias (#233, #390). (A sketch of the new alias appears below.)
- `specify()` will now drop unused factor levels and message that it has done so (#374, #375, #397, #380).
- Added `two.sided` as an acceptable alias for `two_sided` for the `direction` argument in `get_p_value()` and `shade_p_value()` (#355).

We don't anticipate that any changes made in this release are "breaking" in the usual sense: code that previously worked should continue to work, though it may now message or warn in a way that it did not previously, or error with a different (and hopefully more informative) message. If you currently teach or research with infer, we recommend re-running your materials and noting any changes in messages and warnings.
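As a quick, hedged sketch of the renamed generation type, assuming the `gss` dataset and its binary `sex` variable as used elsewhere in these notes:

```r
# draw values of sex from the distribution implied by the point null
gss %>%
  specify(response = sex, success = "female") %>%
  hypothesize(null = "point", p = .5) %>%
  generate(reps = 100, type = "draw")
```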
- The `GENERATION_TYPES` object is now fully deprecated, and arguments that were relocated from `visualize()` to `shade_p_value()` and `shade_confidence_interval()` are now fully deprecated in `visualize()`. If supplied a deprecated argument, `visualize()` will warn the user and ignore the argument.

[1]: GAISE College Report ASA Revision Committee, "Guidelines for Assessment and Instruction in Statistics Education College Report 2016," http://www.amstat.org/education/gaise.
Added a `prop` argument to `rep_slice_sample()` as an alternative to the `n` argument, for specifying the proportion of rows in the supplied data to sample per replicate (#361, #362, #363). This changes the order of arguments of `rep_slice_sample()` (in order to be more aligned with `dplyr::slice_sample()`), which might break code that doesn't use named arguments (like `rep_slice_sample(df, 5, TRUE)`). To fix this, use named arguments (like `rep_slice_sample(df, 5, replace = TRUE)`). (A sketch of the new argument follows.)
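A minimal sketch of the new argument, assuming the package's `gss` dataset:

```r
# resample 10% of the rows of gss, five times
gss %>%
  rep_slice_sample(prop = 0.1, reps = 5)
```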
- `rep_sample_n()` no longer errors when supplied a `prob` argument (#279).
- Added `rep_slice_sample()`, a light wrapper around `rep_sample_n()`, that more closely resembles `dplyr::slice_sample()` (the function that supersedes `dplyr::sample_n()`) (#325).
- Added `success`, `correct`, and `z` arguments to `prop_test()` (#343, #347, #353).
- `get_confidence_interval()` now uses column names (`lower_ci` and `upper_ci`) in its output that are consistent with other infer functionality (#317).
- `get_confidence_interval()` can now produce bias-corrected confidence intervals by setting `type = "bias-corrected"`. Thanks to @davidbaniadam for the initial implementation (#237, #318)! (A sketch follows this list.)
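A hedged sketch of the bias-corrected interval type, assuming the `gss` dataset and a bootstrap distribution of sample means:

```r
# bootstrap distribution of mean hours worked
boot_dist <- gss %>%
  specify(response = hours) %>%
  generate(reps = 1000, type = "bootstrap") %>%
  calculate(stat = "mean")

# the point estimate the interval should be centered on
obs_mean <- gss %>%
  specify(response = hours) %>%
  calculate(stat = "mean")

get_confidence_interval(
  boot_dist,
  type = "bias-corrected",
  point_estimate = obs_mean
)
```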
- Added `chi_squared` and `anova` (#268).
- Added `hypothesise()` as an alias for `hypothesize()` (#271).
- Updated handling of the `order` argument (#275, #281).
- Updated the `gss` dataset used in examples (#282).
- Added `stat = "ratio of props"` and `stat = "odds ratio"` to `calculate()` (#285).
- Added `prop_test()`, a tidy interface to `prop.test()` (#284, #287). (A sketch follows this list.)
- Updated `visualize()` for compatibility with ggplot2 v3.3.0 (#289).
- Updated for compatibility with dplyr v1.0.0.
- Fixed `generate()` when the response variable is named `x` (#299).
- Added `two-sided` and `two sided` as aliases for `two_sided` for the `direction` argument in `get_p_value()` and `shade_p_value()` (#302).
- Fixed `t_test()` and `t_stat()` ignoring the `order` argument (#310).
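A rough sketch of the wrapper as it works in recent versions of the package; the `gss` dataset and the `success` and `order` arguments shown here are assumptions based on the notes above:

```r
# test for a difference in the proportion of degree-holders by sex
gss %>%
  prop_test(college ~ sex,
            success = "degree",
            order = c("female", "male"))
```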
- `shade_confidence_interval()` now plots vertical lines starting from zero (previously, from the bottom of the plot) (#234).
- `shade_p_value()` now uses an "area under the curve" approach to shading (#229).
- Updated `chisq_test()` to take arguments in a response/explanatory format, perform goodness of fit tests, and default to the approximation approach (#241).
- Updated `chisq_stat()` to do goodness of fit (#241).
- Made `hypothesize()` clearer by adding the options for the point null parameters to the function signature (#242).
- Used the `infer` class more systematically (#219).
- Used `vdiffr` for plot testing (#221).
- Made `get_pvalue()` and `visualize()` more aligned (#205).
- Deprecated `p_value()` (use `get_p_value()` instead) (#180).
- Deprecated `conf_int()` (use `get_confidence_interval()` instead) (#180).
- Deprecated arguments in `visualize()` (use the new functions `shade_p_value()` and `shade_confidence_interval()` instead) (#178).
- Added `shade_p_value()`, a {ggplot2}-like layer function to add information about the p-value region to `visualize()` output; has the alias `shade_pvalue()`.
- Added `shade_confidence_interval()`, a {ggplot2}-like layer function to add information about the confidence interval region to `visualize()` output; has the alias `shade_ci()`.
visualize()
output. Has alias shade_ci()
.NULL
value in left hand side of formula in
specify()
(#156) and type
in
generate()
(#157).set_params()
(#165).calculate()
to not depend on order of
p
for type = "simulate"
(#122).visualize()
to not depend on
method and data volume.visualize()
work for “One sample t” theoretical
type with method = "both"
.stat = "sum"
and stat = "count"
options to calculate()
(#50).t_stat()
to use ...
so
var.equal
worksvar.equal = TRUE
for
specify() %>% calculate(stat = "t")
- Improved `paste()` handling (#155).
- Added a `conf_int` logical argument and a `conf_level` argument to `t_test()`.
- Renamed the `shade_color` argument in `visualize()` to `pvalue_fill`, since fill color for confidence intervals is also added now.
- `visualize()` requires `direction = "between"` to get the green confidence-interval shading.
- Implemented the `conf_int()` function for computing a confidence interval, provided a simulation-based method with a `stat` variable.
  - `get_ci()` and `get_confidence_interval()` are aliases for `conf_int()`; documentation encourages `get_ci()` instead.
- Implemented the `p_value()` function for computing a p-value, provided a simulation-based method with a `stat` variable.
  - `get_pvalue()` is an alias for `p_value()`; documentation encourages `get_pvalue()` instead.
- Fixed `params` being set in `hypothesize()` with the `specify() %>% calculate()` shortcut.
- Determined the `type` argument automatically in `generate()` based on `specify()` and `hypothesize()`; a warning is produced if `type` is given differently than expected.
- Added `specify() %>% calculate()` for getting observed statistics.
  - `visualize()` works with either a 1x1 data frame or a vector for its `obs_stat` argument.
  - Got `stat = "t"` working.
- Split `calculate()` into smaller functions to reduce complexity.
- Errors informatively if `mu` is given in `hypothesize()` but `stat = "median"` is provided in `calculate()`, and for other similar mis-specifications.
- Updated `chisq_stat()` and `t_stat()` to match the `specify() %>% calculate()` framework.
  - Both use the `formula` interface.
  - Added an `order` argument to `t_stat()`.
- Fixed `t_test()` by passing in the `mu` argument to `t.test` from `hypothesize()`.
- Updated the `pkgdown` page to include ToDo's using a {dplyr} example.
- Switched to `!!` instead of `UQ()`, since `UQ()` is deprecated in {rlang} 0.2.0.
- Added `CONDUCT.md`, `CONTRIBUTING.md`, and `TO-DO.md`.
- Added the `t_test()` and `chisq_test()` functions, which use a formula interface and provide an intuitive wrapper to `t.test()` and `chisq.test()`.
- Implemented the `stat = "z"` and `stat = "t"` options.
- Added arguments to `visualize()` to prescribe colors to shade and use for observed statistics and theoretical density curves.
- Adjusted `visualize()` when the number of unique values for generated statistics is small.
- Implemented `method = "theoretical"`.
- Changed `method = "randomization"` to `method = "simulation"`.
- Added theoretical distributions to `visualize()`, both alone and as an overlay, for the currently implemented statistics.
- Added the `order` argument in `calculate()`.
- Fixed `specify()`.
- Created `pkgdown` site materials.