library(makemyprior)
We use a model from Hem et al. (2021); a genomic wheat breeding model with three genetic effects: additive, dominance and epistasis The model is: \[ y_i = \mu + a_i + d_i + x_i + \varepsilon_i, \ i = 1, \dots, 100, \]
where * \(\mu\) is an intercept with a \(\mathcal{N}(0, 1000^2)\) prior, * \(\mathbf{a} = (a_1, \dots, a_{100}) \sim \mathcal{N}_{100}(\mathbf{0}, \sigma_{\mathrm{a}}^2 \mathbf{A})\) is the additive effect, * \(\mathbf{d} = (d_1, \dots, d_{100}) \sim \mathcal{N}_{100}(\mathbf{0}, \sigma_{\mathrm{a}}^2 \mathbf{D})\) is the dominance effect, * \(\mathbf{x} = (x_1, \dots, x_{100}) \sim \mathcal{N}_{100}(\mathbf{0}, \sigma_{\mathrm{a}}^2 \mathbf{X})\) is the epistasis effect, and * \(\varepsilon_i\) is the residual effect with variance \(\sigma_{\varepsilon}^2\).
The covariance matrices \(\mathbf{A}\), \(\mathbf{D}\) and \(\mathbf{X}\) are computed from the single nucleotide polymorphism (SNP) matrix with thousands of genetic markers.
We use the dataset wheat_breeding
in makemyprior
, which consists of indexes for the effects and precision matrices for each of the genetic effects, and present three priors.
We scale the precision matrices so the corresponding covariance matrices have typical variance equal to 1 with scale_precmat
.
<- wheat_data
wheat_data_scaled $Q_a <- scale_precmat(wheat_data$Q_a)
wheat_data_scaled$Q_d <- scale_precmat(wheat_data$Q_d)
wheat_data_scaled$Q_x <- scale_precmat(wheat_data$Q_x)
wheat_data_scaled
<- y ~
formula mc(a, model = "generic0", Cmatrix = Q_a, constr = TRUE) +
mc(d, model = "generic0", Cmatrix = Q_d, constr = TRUE) +
mc(x, model = "generic0", Cmatrix = Q_x, constr = TRUE)
We do not carry out inference, as it takes time and will slow down the compilation of the vignettes by a lot, but include code so the user can run the inference themselves.
We do not want to say anything about the total (also called phenotypic) variance. We have expert knowledge saying that we want shrinkage towards the residual effect to avoid overfitting, and the heritability, which is the amount of phenotypic variance attributed to the genetic effects, is around 0.25. The additive, dominance and epistasis effects get about (85, 10, 5)% of the genetic variance according to the expert. The expert is quite sure about these choices, and we use a concentration parameter \(c\), saying how much of the density mass of the prior such that \(\mathrm{Prob}(\mathrm{logit}(1/4) < \mathrm{logit}(\omega_*) - \mathrm{logit}(m) < \mathrm{logit}(3/4)) = c\) for a variance proportion \(\omega_*\).
This information can be implemented as a prior as follows:
<- make_prior(formula, wheat_data_scaled, prior = list(
prior1 tree = "s1 = (d, x); s2 = (a, s1); s3 = (s2, eps)",
w = list(s1 = list(prior = "pcM", param = c(0.67, 0.8)),
s2 = list(prior = "pcM", param = c(0.85, 0.8)),
s3 = list(prior = "pc0", param = 0.25))))
prior1#> Model: y ~ mc(a, model = "generic0", Cmatrix = Q_a, constr = TRUE) +
#> mc(d, model = "generic0", Cmatrix = Q_d, constr = TRUE) +
#> mc(x, model = "generic0", Cmatrix = Q_x, constr = TRUE)
#> Tree structure: d_x = (d,x); a_d_x = (a,d_x); eps_a_d_x = (eps,a_d_x)
#>
#> Weight priors:
#> w[d/d_x] ~ PCM(0.67, 0.8)
#> w[a/a_d_x] ~ PCM(0.85, 0.8)
#> w[eps/eps_a_d_x] ~ PC1(0.75)
#> Total variance priors:
#> V[eps_a_d_x] ~ Jeffreys'
plot_prior(prior1) # or plot(prior1)
plot_tree_structure(prior1)
Note that we do not fit the model in this vignette, as it takes some time.
<- inference_stan(prior1, iter = 15000, warmup = 5000,
posterior1 chains = 1, seed = 1)
plot_posterior_stan(posterior1, param = "prior", prior = TRUE)
For inference with INLA:
<- inference_inla(prior1)
posterior1_inla plot_posterior_stdev(posterior1_inla)
We have a good intuition on the absolute magnitude of the residual and genetic effects, but since the genetic effects are confounded, we want to use a prior saying that about 80% of the genetic variance is additive effect, and shrinkage towards the additive, and and be ignorant about the division of dominance and epistasis. The residual and genetic effects are both assumed to not be much larger than 3.
<- make_prior(formula, wheat_data_scaled, prior = list(
prior2 tree = "s1 = (d, x); s2 = (a, s1); (eps)",
w = list(s1 = list(prior = "dirichlet"),
s2 = list(prior = "pc1", param = c(0.8))),
V = list(s2 = list(prior = "pc", param = c(3, 0.05)),
eps = list(prior = "pc", param = c(3, 0.05)))))
prior2#> Model: y ~ mc(a, model = "generic0", Cmatrix = Q_a, constr = TRUE) +
#> mc(d, model = "generic0", Cmatrix = Q_d, constr = TRUE) +
#> mc(x, model = "generic0", Cmatrix = Q_x, constr = TRUE)
#> Tree structure: d_x = (d,x); a_d_x = (a,d_x); (eps)
#>
#> Weight priors:
#> w[d/d_x] ~ Dirichlet(2)
#> w[a/a_d_x] ~ PC1(0.8)
#> Total variance priors:
#> sqrt(V)[a_d_x] ~ PC0(3, 0.05)
#> Independent variance priors:
#> sigma[eps] ~ PC0(3, 0.05)
plot_prior(prior2)
plot_tree_structure(prior2)
Again we do not fit the model in this vignette, as it takes some time.
<- inference_stan(prior2, iter = 15000, warmup = 5000,
posterior2 chains = 1, seed = 1)
plot_posterior_stan(posterior2, param = "prior", prior = TRUE)
sessionInfo()
#> R version 4.1.0 (2021-05-18)
#> Platform: aarch64-apple-darwin20 (64-bit)
#> Running under: macOS Big Sur 11.3.1
#>
#> Matrix products: default
#> BLAS: /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libRblas.dylib
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libRlapack.dylib
#>
#> locale:
#> [1] C/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] makemyprior_1.1.0
#>
#> loaded via a namespace (and not attached):
#> [1] tidyselect_1.1.1 xfun_0.30 bslib_0.3.1 shinyjs_2.1.0
#> [5] purrr_0.3.4 splines_4.1.0 lattice_0.20-44 colorspace_2.0-3
#> [9] vctrs_0.3.8 generics_0.1.1 htmltools_0.5.2 yaml_2.3.5
#> [13] utf8_1.2.2 rlang_1.0.2 jquerylib_0.1.4 pillar_1.7.0
#> [17] later_1.3.0 glue_1.6.2 lifecycle_1.0.1 stringr_1.4.0
#> [21] munsell_0.5.0 gtable_0.3.0 htmlwidgets_1.5.4 visNetwork_2.1.0
#> [25] evaluate_0.15 labeling_0.4.2 knitr_1.37 fastmap_1.1.0
#> [29] httpuv_1.6.5 fansi_1.0.2 highr_0.9 Rcpp_1.0.8.2
#> [33] xtable_1.8-4 scales_1.1.1 promises_1.2.0.1 jsonlite_1.8.0
#> [37] farver_2.1.0 mime_0.12 ggplot2_3.3.5 digest_0.6.29
#> [41] stringi_1.7.6 dplyr_1.0.7 shiny_1.7.1 grid_4.1.0
#> [45] cli_3.2.0 tools_4.1.0 magrittr_2.0.2 sass_0.4.0
#> [49] tibble_3.1.6 crayon_1.5.0 pkgconfig_2.0.3 ellipsis_0.3.2
#> [53] MASS_7.3-54 Matrix_1.3-3 rmarkdown_2.13 rstudioapi_0.13
#> [57] R6_2.5.1 compiler_4.1.0