Latin square experiment

Model

We consider a latin square experiment, with a 9x9 latin square design, following the procedure of Fuglstad et al. (2020). We assume we have the following model: \[ y_{i,j} = \alpha + \beta \cdot k[i,j] + a_i + b_j + c_{k[i,j]} + \varepsilon_{i,j}, \quad i,j = 1, \dots, 9, \] where

\(a\) is an intercept with \(\alpha \sim \mathcal{N}(0, 1000^2)\),
\(b \cdot k[i,j]\) is a linear effect of treatment and \(\beta \sim \mathcal{N}(0, 1000^2)\),
\(\mathbf{a} = (a_1, \dots, a_9)^\top \sim \mathcal{N}_9(\mathbf{0}, \sigma_{a}^2 \mathbf{I}_9)\) is a row effect,
\(\mathbf{b} = (b_1, \dots, b_9)^\top \sim \mathcal{N}_9(\mathbf{0}, \sigma_{b}^2 \mathbf{I}_9)\) is a column effect,
\(\mathbf{\varepsilon} = (\varepsilon_{1,1},\varepsilon_{1,2} \dots, \varepsilon_{9,9})^\top \sim \mathcal{N}_{81}(\mathbf{0}, \sigma_{\varepsilon}^2 \mathbf{I}_{81})\) is residual noise,
the treatment effect consists of:
- a smooth signal \(\mathbf{c}^{(1)} = (c_1^{(1)}, \dots, c_9^{(1)}) \sim (\mathbf{0}, \sigma_{c^{(1)}}^2 \mathbf{Q}_{\mathrm{RW2}}^{-1})\) where \(\sigma_{c^{(1)}}^2\) is the variance and \(\mathbf{Q}_{\mathrm{RW2}}^{-1}\) is the covariance matrix describing the intrinsic second-order random walk, and
- random noise \(\mathbf{c}^{(2)} = (c_1^{(2)}, \dots, c_9^{(2)}) \sim \mathcal{N}_9(\mathbf{0}, \sigma_{c^{(2)}}^2 \mathbf{I}_9)\).

We remove implicit intercept and linear effect by requiring \(\sum_{i=1}^9 c_i^{(1)} = 0\) and \(\sum_{i=1}^9 i c_i^{(1)} = 0\).

The model is specified as:

formula <- y ~ lin + mc(row) + mc(col) + mc(iid, constr = TRUE) +
  mc(rw2, model = "rw2", constr = TRUE, lin_constr = TRUE)

We use the dataset latin_square in makemyprior, and present three priors. We do not carry out inference, as it takes time and will slow down the compilation of the vignettes by a lot, but include code so the user can carry out the inference themselves.

Prior 1

We want to avoid overfitting of the model, and use a prior with shrinkage towards the residuals in the top split with median of \(0.25\). We do not have any preference for the attribution of the row, column and treatment effects, and use an ignorant Dirichlet prior for the middle split. In the bottom split we again we want to avoid overfitting, and use a prior with shrinkage towards the unstructured treatment effect and a median corresponding to 75% unstructured treatment effect. We do not want to say anything about the scale of the total variance, and use the default Jeffreys’ prior.

prior1 <- make_prior(
  formula, latin_data,
  prior = list(tree = "s1 = (iid, rw2); s2 = (row, col, s1); s3 = (s2, eps)",
               w = list(s1 = list(prior = "pc1", param = 0.75),
                        s2 = list(prior = "dirichlet"),
                        s3 = list(prior = "pc0", param = 0.25))))

prior1
#> Model: y ~ lin + mc(row) + mc(col) + mc(iid, constr = TRUE) + mc(rw2, 
#>     model = "rw2", constr = TRUE, lin_constr = TRUE)
#> Tree structure: iid_rw2 = (iid,rw2); row_col_iid_rw2 = (row,col,iid_rw2); eps_row_col_iid_rw2 = (eps,row_col_iid_rw2)
#> 
#> Weight priors:
#>  w[iid/iid_rw2] ~ PC1(0.75)
#>  (w[row/row_col_iid_rw2], w[col/row_col_iid_rw2]) ~ Dirichlet(3)
#>  w[eps/eps_row_col_iid_rw2] ~ PC1(0.75)
#> Total variance priors:
#>  V[eps_row_col_iid_rw2] ~ Jeffreys'

plot_prior(prior1) # or plot(prior)

plot_tree_structure(prior1)

Inference can be carried out by running:

posterior1 <- inference_stan(prior1, iter = 15000, warmup = 5000,
                            seed = 1, init = "0", chains = 1)
plot_posterior_stan(posterior1, param = "prior", prior = TRUE)

For inference with INLA we need to include the linear constraint in a different way:

formula_inla <- y ~ lin + mc(row) + mc(col) + mc(iid, constr = TRUE) +
  mc(rw2, model = "rw2", constr = TRUE, extraconstr = list(A = matrix(1:9, 1, 9), e = matrix(0, 1, 1)))
prior1_inla <- make_prior(
  formula_inla, latin_data,
  prior = list(tree = "s1 = (iid, rw2); s2 = (row, col, s1); s3 = (s2, eps)",
               w = list(s1 = list(prior = "pc1", param = 0.75),
                        s2 = list(prior = "dirichlet"),
                        s3 = list(prior = "pc0", param = 0.25))))

posterior1_inla <- inference_inla(prior1_inla)
plot_posterior_stdev(posterior1_inla)

Prior 2

We can imagine we do not not want to include the residual effect in the tree with the row, column and treatment effects, and assume we have prior knowledge that the latent variance \(\sigma_{a}^2 + \sigma_{b}^2 + \sigma_{c^{(1)}}^2 + \sigma_{c^{(2)}}^2\) is not not larger than \(0.25\), and use a PC prior for variance that fulfills \(\text{P}(\text{total st.dev.} > sqrt(0.2)) = 0.05\). We assume we have knowledge saying that the same is true for the residual variance. We assume the latent variance is distributed in the same way as in Prior 1.

prior2 <- make_prior(
  formula, latin_data,
  prior = list(tree = "s1 = (iid, rw2); s2 = (row, col, s1); (eps)",
               w = list(s1 = list(prior = "pc1", param = 0.75),
                        s2 = list(prior = "dirichlet")),
               V = list(s2 = list(prior = "pc", param = c(sqrt(0.2), 0.05)),
                        eps = list(prior = "pc", param = c(sqrt(0.2), 0.05)))))

prior2
#> Model: y ~ lin + mc(row) + mc(col) + mc(iid, constr = TRUE) + mc(rw2, 
#>     model = "rw2", constr = TRUE, lin_constr = TRUE)
#> Tree structure: iid_rw2 = (iid,rw2); row_col_iid_rw2 = (row,col,iid_rw2); (eps)
#> 
#> Weight priors:
#>  w[iid/iid_rw2] ~ PC1(0.75)
#>  (w[row/row_col_iid_rw2], w[col/row_col_iid_rw2]) ~ Dirichlet(3)
#> Total variance priors:
#>  sqrt(V)[row_col_iid_rw2] ~ PC0(0.447213595499958, 0.05)
#> Independent variance priors:
#>  sigma[eps] ~ PC0(0.447213595499958, 0.05)

plot_prior(prior2) # or plot(prior2)

plot_tree_structure(prior2)

Inference can be carried out by running:

posterior2 <- inference_stan(prior2, iter = 15000, warmup = 5000,
                            seed = 1, init = "0", chains = 1)
plot_posterior_stan(posterior2)

Prior 3

In the last prior, we use component-wise priors on row, column, treatment and residual variance, but use a PC prior with shrinkage towards the unstructured treatment effect to avoid overfitting. We use a strong prior, and say that that the all variances are not much larger than \(0.1\).

prior3 <- make_prior(
  formula, latin_data,
  prior = list(tree = "(row); (col); (iid); (rw2); (eps)",
               V = list(
                 row = list(prior = "pc", param = c(sqrt(0.1), 0.05)),
                 col = list(prior = "pc", param = c(sqrt(0.1), 0.05)),
                 iid = list(prior = "pc", param = c(sqrt(0.1), 0.05)),
                 rw2 = list(prior = "pc", param = c(sqrt(0.1), 0.05)),
                 eps = list(prior = "pc", param = c(sqrt(0.1), 0.05))
               )))

prior3
#> Model: y ~ lin + mc(row) + mc(col) + mc(iid, constr = TRUE) + mc(rw2, 
#>     model = "rw2", constr = TRUE, lin_constr = TRUE)
#> Tree structure: (row); (col); (iid); (rw2); (eps)
#> 
#> Independent variance priors:
#>  sigma[row] ~ PC0(0.316227766016838, 0.05)
#>  sigma[col] ~ PC0(0.316227766016838, 0.05)
#>  sigma[iid] ~ PC0(0.316227766016838, 0.05)
#>  sigma[rw2] ~ PC0(0.316227766016838, 0.05)
#>  sigma[eps] ~ PC0(0.316227766016838, 0.05)

plot_prior(prior3) # or plot(prior3)

plot_tree_structure(prior3)

Inference can be carried out by running:

posterior3 <- inference_stan(prior3, iter = 15000, warmup = 5000,
                            seed = 1, init = "0", chains = 1)
plot_posterior_stan(posterior3)

sessionInfo()
#> R version 4.1.0 (2021-05-18)
#> Platform: aarch64-apple-darwin20 (64-bit)
#> Running under: macOS Big Sur 11.3.1
#> 
#> Matrix products: default
#> BLAS:   /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libRblas.dylib
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libRlapack.dylib
#> 
#> locale:
#> [1] C/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] makemyprior_1.1.0
#> 
#> loaded via a namespace (and not attached):
#>  [1] tidyselect_1.1.1  xfun_0.30         bslib_0.3.1       shinyjs_2.1.0    
#>  [5] purrr_0.3.4       splines_4.1.0     lattice_0.20-44   colorspace_2.0-3 
#>  [9] vctrs_0.3.8       generics_0.1.1    htmltools_0.5.2   yaml_2.3.5       
#> [13] utf8_1.2.2        rlang_1.0.2       jquerylib_0.1.4   pillar_1.7.0     
#> [17] later_1.3.0       glue_1.6.2        lifecycle_1.0.1   stringr_1.4.0    
#> [21] munsell_0.5.0     gtable_0.3.0      htmlwidgets_1.5.4 visNetwork_2.1.0 
#> [25] evaluate_0.15     labeling_0.4.2    knitr_1.37        fastmap_1.1.0    
#> [29] httpuv_1.6.5      fansi_1.0.2       highr_0.9         Rcpp_1.0.8.2     
#> [33] xtable_1.8-4      scales_1.1.1      promises_1.2.0.1  jsonlite_1.8.0   
#> [37] farver_2.1.0      mime_0.12         ggplot2_3.3.5     digest_0.6.29    
#> [41] stringi_1.7.6     dplyr_1.0.7       shiny_1.7.1       grid_4.1.0       
#> [45] cli_3.2.0         tools_4.1.0       magrittr_2.0.2    sass_0.4.0       
#> [49] tibble_3.1.6      crayon_1.5.0      pkgconfig_2.0.3   ellipsis_0.3.2   
#> [53] MASS_7.3-54       Matrix_1.3-3      rmarkdown_2.13    rstudioapi_0.13  
#> [57] R6_2.5.1          compiler_4.1.0

Latin square experiment

Ingeborg Gullikstad Hem (ingeborg.hem@ntnu.no)

Model

Prior 1

Prior 2

Prior 3

References