library(scrutiny)
Granularity-related inconsistency of means mapped to error repeats, or GRIMMER, is a test for the mathematical consistency of reported means or proportions with the corresponding standard deviations (SDs) and sample sizes (Anaya 2016; Allard 2018).
GRIMMER builds on GRIM (Brown and Heathers 2017). Indeed, the elegant Analytic-GRIMMER algorithm (Allard 2018) implemented here tests for GRIM-consistency before conducting its own unique tests.
This vignette covers scrutiny’s implementation of the GRIMMER test.
It’s an adapted version of the GRIM vignette because both the tests themselves and their implementations in scrutiny are very similar. If you are familiar with scrutiny’s grim_*() functions, much of the present vignette will seem quite natural to you.
The vignette has the following sections — to get started, though, you only need the first one:

1. The basic grimmer() function and a specialized mapping function, grimmer_map().
2. The audit() method for summarizing grimmer_map()’s results.
3. The visualization function grim_plot(), which also works for GRIMMER.
4. Testing numeric sequences with grimmer_map_seq().
5. Handling unknown group sizes with grimmer_map_total_n().
grimmer()
To test if a reported mean of 7.3 on a granular scale is GRIMMER-consistent with an SD of 2.51 and a sample size of 12, run this:
grimmer(x = "7.3", sd = "2.51", n = 12)
#> 7.3
#> FALSE
Note that x, the reported mean, needs to be a string. The reason is that strings preserve trailing zeros, which can be crucial for GRIMMER-testing. Numeric values don’t, and even converting them to strings won’t help. A workaround for larger numbers of such values, restore_zeros(), is discussed in vignette("wrangling").
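To see why this matters, here is a minimal sketch of the trailing-zero problem; it assumes restore_zeros() with its width argument behaves as documented in vignette("wrangling"):

# Coercing a numeric value to a string silently drops the trailing zero:
as.character(5.30)
#> [1] "5.3"

# restore_zeros() pads the value back out to the desired number of
# decimal places, returning a string:
restore_zeros(5.30, width = 2)
#> [1] "5.30"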
grimmer() has some further parameters, but all of them can be used from within grimmer_map(). The other parameters will be discussed in that context because grimmer_map() is often the more useful function in practice. Furthermore, although grimmer() is vectorized, grimmer_map() is safer and more convenient for testing multiple combinations of means, SDs, and sample sizes.
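As a quick illustration of the vectorized interface, a call like the following is possible (just a sketch with made-up values; grimmer_map() remains the recommended route):

# grimmer() accepts vectors of equal length, returning one TRUE/FALSE
# result per value set:
grimmer(x = c("7.3", "5.17"), sd = c("2.51", "2.04"), n = c(12, 30))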
grimmer_map()
If you want to GRIMMER-test more than a handful of cases, the recommended way is to enter them into a data frame and to run grimmer_map() on the data frame. Two different ways to do that are discussed in vignette("wrangling"), but here, I will only describe an easily accessible solution for a single table. Copy summary data from a PDF file and paste them into tibble::tribble(), which is available via scrutiny:
flying_pigs1 <- tribble(
  ~x,     ~sd,
  "8.9",  "2.81",
  "2.6",  "2.05",
  "7.2",  "2.89",
  "3.6",  "3.11",
  "9.2",  "7.13",
  "10.4", "2.53",
  "7.3",  "3.14"
) %>%
  mutate(n = 25)
Use RStudio’s multiple cursors to draw quotation marks around all the x and sd values, and to set commas at the end. See vignette("wrangling"), section With copy and paste, if you are not sure how to do that.
Now, simply run grimmer_map() on that data frame:
grimmer_map(flying_pigs1)
#> # A tibble: 7 × 5
#> x sd n consistency reason
#> <chr> <chr> <dbl> <lgl> <chr>
#> 1 8.9 2.81 25 FALSE GRIMMER inconsistent (test 3)
#> 2 2.6 2.05 25 TRUE Passed all
#> 3 7.2 2.89 25 TRUE Passed all
#> 4 3.6 3.11 25 FALSE GRIMMER inconsistent (test 3)
#> 5 9.2 7.13 25 TRUE Passed all
#> 6 10.4 2.53 25 TRUE Passed all
#> 7 7.3 3.14 25 TRUE Passed all
The x, sd, and n columns are the same as in the input. By default, the number of items composing the mean is assumed to be 1. The main result, consistency, is the GRIMMER consistency of the former three columns.
The reason column says why a set of values was inconsistent. To be GRIMMER-consistent, a value set needs to pass four separate tests: the three GRIMMER tests by Allard (2018) and the more basic GRIM test. Here, the two inconsistent value sets passed GRIM as well as the first two GRIMMER tests, but failed the third one. All consistent value sets are marked "Passed all" in the reason column.
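To zoom in on the flagged cases, the output can be filtered like any other tibble. A short sketch, assuming dplyr is installed:

# Keep only the GRIMMER-inconsistent value sets:
flying_pigs1 %>%
  grimmer_map() %>%
  dplyr::filter(!consistency)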
If a mean is composed of multiple items, set the items parameter to that number. Below are hypothetical means of a three-item scale. With the single-item default, half of these are wrongly flagged as inconsistent:
jpap_1 <- tribble(
  ~x,     ~sd,
  "5.90", "2.19",
  "5.71", "1.42",
  "3.50", "1.81",
  "3.82", "2.43",
  "4.61", "1.92",
  "5.24", "2.51"
) %>%
  mutate(n = 40)
jpap_1 %>%
  grimmer_map()  # default is wrong here!
#> # A tibble: 6 × 5
#> x sd n consistency reason
#> <chr> <chr> <dbl> <lgl> <chr>
#> 1 5.90 2.19 40 TRUE Passed all
#> 2 5.71 1.42 40 FALSE GRIM inconsistent
#> 3 3.50 1.81 40 TRUE Passed all
#> 4 3.82 2.43 40 TRUE Passed all
#> 5 4.61 1.92 40 FALSE GRIM inconsistent
#> 6 5.24 2.51 40 FALSE GRIM inconsistent
Yet, all of them are consistent if the correct number of items is stated:
jpap_1 %>%
  grimmer_map(items = 3)
#> # A tibble: 6 × 5
#> x sd n consistency reason
#> <chr> <chr> <dbl> <lgl> <chr>
#> 1 5.90 2.19 120 TRUE Passed all
#> 2 5.71 1.42 120 TRUE Passed all
#> 3 3.50 1.81 120 TRUE Passed all
#> 4 3.82 2.43 120 TRUE Passed all
#> 5 4.61 1.92 120 TRUE Passed all
#> 6 5.24 2.51 120 TRUE Passed all
It is also possible to include an items column in the data frame instead:
jpap_2 <- tribble(
  ~x,     ~sd,    ~items,
  "6.92", "2.19", 1,
  "3.48", "1.42", 1,
  "1.59", "1.81", 2,
  "2.61", "2.43", 2,
  "4.04", "1.92", 3,
  "4.50", "2.51", 3
) %>%
  mutate(n = 30)
jpap_2 %>%
  grimmer_map()
#> # A tibble: 6 × 5
#> x sd n consistency reason
#> <chr> <chr> <dbl> <lgl> <chr>
#> 1 6.92 2.19 30 FALSE GRIM inconsistent
#> 2 3.48 1.42 30 FALSE GRIM inconsistent
#> 3 1.59 1.81 60 FALSE GRIM inconsistent
#> 4 2.61 2.43 60 FALSE GRIM inconsistent
#> 5 4.04 1.92 90 TRUE Passed all
#> 6 4.50 2.51 90 TRUE Passed all
The scrutiny package provides infrastructure for reconstructing rounded numbers. All of that can be commanded from within grimmer() and grimmer_map(). Several parameters allow for stating the precise way in which the original numbers have supposedly been rounded.
First and foremost is rounding. It takes a string with the rounding procedure’s name, which leads to the number being rounded in either of these ways:

1. Rounded "up" or "down" from 5. Note that SAS, SPSS, Stata, Matlab, and Excel round "up" from 5, whereas Python used to round "down" from 5.
2. Rounded to "even" using base R’s own round().
3. Rounded "up_from" or "down_from" some number, which then needs to be specified via the threshold parameter.
4. Rounded with "ceiling" or "floor" at the respective decimal place.
5. Rounded towards zero with "trunc" or away from zero with "anti_trunc".

The default, "up_or_down", allows for numbers rounded either "up" or "down" from 5 when GRIMMER-testing; and likewise for "up_from_or_down_from" and "ceiling_or_floor". For more about these procedures, see the documentation for round(), round_up(), and round_ceiling(). These include all of the above ways of rounding.
Points 3 to 5 above list some quite obscure options that were only included to cover a wide spectrum of possible rounding procedures. The same is true for the threshold and symmetric parameters, so these aren’t discussed here any further. Learn more about scrutiny’s infrastructure for rounding at vignette("rounding").
By default, grimmer() and grimmer_map() accept values rounded either up or down from 5. If you have reason to impose stricter assumptions on the way x and sd were rounded, specify rounding accordingly.

It might still be important to account for the different ways in which numbers can be rounded, if only to demonstrate that some given results are robust to those variable decisions. To err on the side of caution, the default for rounding is the permissive "up_or_down".
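For example, to test under the stricter assumption that the original values were rounded up from 5 only, a sketch might look like this (results may well differ from those under the default):

# Assume strict rounding up from 5, as in SPSS or Excel:
flying_pigs1 %>%
  grimmer_map(rounding = "up")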
audit()
Following up on a call to grimmer_map(), the generic function audit() summarizes GRIMMER test results:
flying_pigs1 %>%
  grimmer_map() %>%
  audit()
#> # A tibble: 1 × 7
#> incons_cases all_cases incons_rate fail_grim fail_test1 fail_test2 fail_test3
#> <int> <int> <dbl> <int> <int> <int> <int>
#> 1 2 7 0.286 0 0 0 2
These columns are —

- incons_cases: number of GRIMMER-inconsistent value sets.
- all_cases: total number of value sets.
- incons_rate: proportion of GRIMMER-inconsistent value sets.
- fail_grim, fail_test1, fail_test2, fail_test3: number of value sets failing the GRIM test or one of the three GRIMMER tests, respectively.
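Since the summary is an ordinary tibble, single values can be extracted directly; for instance, the inconsistency rate. A sketch, assuming dplyr is installed:

# Pull out the proportion of inconsistent value sets (2 out of 7):
flying_pigs1 %>%
  grimmer_map() %>%
  audit() %>%
  dplyr::pull(incons_rate)
#> [1] 0.2857143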
grim_plot()
GRIMMER does not currently have a dedicated visualization function in scrutiny. However, grim_plot() will accept the output of grimmer_map() just as well as that from grim_map():
jpap_5 <- tribble(
  ~x,     ~sd,    ~n,
  "7.19", "1.19", 54,
  "4.56", "2.56", 66,
  "0.42", "1.29", 59,
  "1.31", "3.50", 57,
  "3.48", "3.65", 66,
  "4.27", "2.86", 61,
  "6.21", "2.15", 62,
  "3.11", "3.17", 50,
  "5.39", "2.37", 68,
  "5.66", "1.11", 44
)
jpap_5 %>%
  grimmer_map() %>%
  grim_plot()
#> → Also visualizing 3 GRIMMER inconsistencies.
grim_plot() will fail, however, with any object not returned by either of these two functions:
grim_plot(mtcars)
#> Error in `grim_plot()`:
#> ! `data` is not `grim_map()` or `grimmer_map()` output.
#> ✖ `grim_plot()` needs GRIM or GRIMMER test results.
#> ℹ The only exception is an "empty" plot that shows the background raster but no
#> empirical test results. Create such a plot by setting `show_data` to `FALSE`.
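The show_data option mentioned in the error message can also be combined with valid test results to display only the background raster. A sketch, assuming show_data works as described above:

# Plot only the background raster, without the empirical results:
jpap_5 %>%
  grimmer_map() %>%
  grim_plot(show_data = FALSE)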
See the GRIM vignette section on grim_plot() for more information.
grimmer_map_seq()
GRIMMER analysts might be interested in a mean or percentage value’s numeric neighborhood. Suppose you found multiple GRIMMER inconsistencies as in our example pigs5 data. You might wonder whether they are due to small reporting or computing errors.
Use grimmer_map_seq() to GRIMMER-test the values surrounding the reported means, SDs, and sample sizes:
out_seq1 <- grimmer_map_seq(pigs5)

out_seq1
#> # A tibble: 180 × 7
#> x sd n consistency reason case var
#> <chr> <chr> <dbl> <lgl> <chr> <int> <chr>
#> 1 7.17 5.30 38 FALSE GRIM inconsistent 1 x
#> 2 7.18 5.30 38 TRUE Passed all 1 x
#> 3 7.19 5.30 38 FALSE GRIM inconsistent 1 x
#> 4 7.20 5.30 38 FALSE GRIM inconsistent 1 x
#> 5 7.21 5.30 38 TRUE Passed all 1 x
#> 6 7.23 5.30 38 FALSE GRIM inconsistent 1 x
#> 7 7.24 5.30 38 TRUE Passed all 1 x
#> 8 7.25 5.30 38 FALSE GRIM inconsistent 1 x
#> 9 7.26 5.30 38 TRUE Passed all 1 x
#> 10 7.27 5.30 38 FALSE GRIM inconsistent 1 x
#> # … with 170 more rows
#> # ℹ Use `print(n = ...)` to see more rows
audit_seq()
As this output is a little unwieldy, run audit_seq() on the results:
audit_seq(out_seq1)
#> # A tibble: 6 × 17
#> x sd n consi…¹ hits_…² hits_x hits_sd hits_n diff_x diff_…³ diff_…⁴
#> <chr> <chr> <dbl> <lgl> <int> <int> <int> <int> <dbl> <dbl> <dbl>
#> 1 7.22 5.30 38 FALSE 8 4 0 4 1 2 -1
#> 2 5.23 2.55 35 FALSE 16 2 10 4 3 3 -3
#> 3 2.57 2.57 30 FALSE 11 1 8 2 3 3 NA
#> 4 6.77 2.18 33 FALSE 6 4 0 2 1 2 -1
#> 5 7.01 6.68 35 FALSE 4 4 0 0 1 2 -1
#> 6 3.14 5.32 33 FALSE 9 4 0 5 1 1 -2
#> # … with 6 more variables: diff_sd <dbl>, diff_sd_up <dbl>, diff_sd_down <dbl>,
#> # diff_n <dbl>, diff_n_up <dbl>, diff_n_down <dbl>, and abbreviated variable
#> # names ¹consistency, ²hits_total, ³diff_x_up, ⁴diff_x_down
#> # ℹ Use `colnames()` to see all variable names
Here is what the output columns mean:

- x, sd, and n are the original inputs, tested for consistency here.
- hits_total is the number of GRIMMER-consistent value combinations found within the specified dispersion range; hits_x, hits_sd, and hits_n split this number by the variable that was varied.
- diff_x reports the absolute difference between x and the next consistent dispersed value (in dispersion steps, not the actual numeric difference). diff_x_up and diff_x_down report the difference to the next higher or lower consistent value, respectively.
- diff_sd, diff_sd_up, and diff_sd_down do the same for sd, as do diff_n, diff_n_up, and diff_n_down for n.
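One possible follow-up is to look for cases that a small change in a single variable would render consistent. A sketch, assuming dplyr is installed:

# Cases that become consistent when only x is varied:
audit_seq(out_seq1) %>%
  dplyr::filter(hits_x > 0)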
The default for dispersion is 1:5, for five steps up and down. When the dispersion sequence gets longer, the number of hits tends to increase:
out_seq2 <- grimmer_map_seq(pigs5, dispersion = 1:10)

audit_seq(out_seq2)
#> # A tibble: 6 × 17
#> x sd n consi…¹ hits_…² hits_x hits_sd hits_n diff_x diff_…³ diff_…⁴
#> <chr> <chr> <dbl> <lgl> <int> <int> <int> <int> <dbl> <dbl> <dbl>
#> 1 7.22 5.30 38 FALSE 15 8 0 7 1 2 -1
#> 2 5.23 2.55 35 FALSE 32 6 19 7 3 3 -3
#> 3 2.57 2.57 30 FALSE 24 3 16 5 3 3 NA
#> 4 6.77 2.18 33 FALSE 11 7 0 4 1 2 -1
#> 5 7.01 6.68 35 FALSE 8 8 0 0 1 2 -1
#> 6 3.14 5.32 33 FALSE 14 7 0 7 1 1 -2
#> # … with 6 more variables: diff_sd <dbl>, diff_sd_up <dbl>, diff_sd_down <dbl>,
#> # diff_n <dbl>, diff_n_up <dbl>, diff_n_down <dbl>, and abbreviated variable
#> # names ¹consistency, ²hits_total, ³diff_x_up, ⁴diff_x_down
#> # ℹ Use `colnames()` to see all variable names
It’s curious what happens when we plot the output of grimmer_map_seq(). Like regular GRIM or GRIMMER plots, however, it does give us a sense of how many tested values are consistent:
grim_plot(out_seq1)
#> → Also visualizing 4 GRIMMER inconsistencies.
The crosses appear because grimmer_map_seq() creates sequences around both x and n. Restrict this process to any one of these with the var argument:
out_seq1_only_x <- grimmer_map_seq(pigs5, var = "x")
out_seq1_only_n <- grimmer_map_seq(pigs5, var = "n")

grim_plot(out_seq1_only_x)
#> → Also visualizing 1 GRIMMER inconsistency.
grim_plot(out_seq1_only_n)
#> → Also visualizing 1 GRIMMER inconsistency.
grimmer_map_total_n()
Unfortunately, some studies that report group averages don’t report the corresponding group sizes — only a total sample size. This makes any direct GRIMMER-testing impossible because only x and sd values are known, not n values. All that is feasible here in terms of GRIMMER is to take a number around half the total sample size, go up and down from it, and check which hypothetical group sizes are consistent with the reported group means and SDs. grimmer_map_total_n() semi-automates this process, motivated by a recent GRIM analysis (Bauer and Francis 2021).

Here is an example:
jpap_6 <- tibble::tribble(
  ~x1,    ~x2,    ~sd1,   ~sd2,   ~n,
  "3.43", "5.28", "1.09", "2.12", 70,
  "2.97", "4.42", "0.43", "1.65", 65
)
out_total_n <- grimmer_map_total_n(jpap_6)

out_total_n
#> # A tibble: 48 × 9
#> x sd n n_change consistency both_consistent reason case dir
#> <chr> <chr> <dbl> <dbl> <lgl> <lgl> <chr> <int> <chr>
#> 1 3.43 1.09 35 0 FALSE FALSE GRIMMER i… 1 forth
#> 2 5.28 2.12 35 0 FALSE FALSE GRIM inco… 1 forth
#> 3 3.43 1.09 34 -1 FALSE FALSE GRIM inco… 1 forth
#> 4 5.28 2.12 36 1 FALSE FALSE GRIMMER i… 1 forth
#> 5 3.43 1.09 33 -2 FALSE FALSE GRIM inco… 1 forth
#> 6 5.28 2.12 37 2 FALSE FALSE GRIM inco… 1 forth
#> 7 3.43 1.09 32 -3 FALSE FALSE GRIM inco… 1 forth
#> 8 5.28 2.12 38 3 FALSE FALSE GRIM inco… 1 forth
#> 9 3.43 1.09 31 -4 FALSE FALSE GRIM inco… 1 forth
#> 10 5.28 2.12 39 4 FALSE FALSE GRIMMER i… 1 forth
#> # … with 38 more rows
#> # ℹ Use `print(n = ...)` to see more rows
audit_total_n(out_total_n)
#> # A tibble: 2 × 10
#> x1 x2 sd1 sd2 n hits_total hits_forth hits_back scenar…¹ hit_r…²
#> <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 3.43 5.28 1.09 2.12 70 1 1 0 12 0.0833
#> 2 2.97 4.42 0.43 1.65 65 1 0 1 12 0.0833
#> # … with abbreviated variable names ¹scenarios_total, ²hit_rate
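As with audit_seq(), this summary can be filtered further; for instance, to keep only cases with at least one consistent group-size scenario. A sketch, assuming dplyr is installed:

# Cases where some plausible split of the total sample size is consistent:
audit_total_n(out_total_n) %>%
  dplyr::filter(hits_total > 0)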
See the GRIM vignette, section Handling unknown group sizes with grim_map_total_n(), for a more comprehensive case study. It uses grim_map_total_n(), which works like grimmer_map_total_n() but applies GRIM only.