Here we try some simulations that do not match the canonical covariance matrices, to illustrate how the data-driven matrices can help.
The function simple_sims_2
simulates data in five conditions with just two types of effect:
effects shared only in the first two conditions, and
effects shared only in the last three conditions.
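The simdata object used below is assumed to come from a call along these lines; this is only a sketch, and the exact arguments of simple_sims_2 (here taken to be the number of effects and the residual standard deviation) may differ:
library(mashr)
set.seed(1)
simdata = simple_sims_2(1000, 1)  # assumed arguments: number of effects, error s.d.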
First run the condition-by-condition (1by1) analysis to identify the strong signals, then use those strong signals to set up the data-driven (PCA and ED) covariance matrices.
# Set up the data and identify strong signals using the 1by1 analysis.
data   = mash_set_data(simdata$Bhat, simdata$Shat)
m.1by1 = mash_1by1(data)
strong = get_significant_results(m.1by1)

# Set up the canonical covariance matrices, and the data-driven
# covariance matrices: PCA-based matrices estimated from the strong
# signals, then refined by extreme deconvolution (ED) initialized from PCA.
U.c   = cov_canonical(data)
U.pca = cov_pca(data, 5, strong)
U.ed  = cov_ed(data, U.pca, strong)
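The comparison below also fits mash with the true covariance matrices, m.true, which requires U.true. This object is not created by the code above; a minimal sketch of how it could be constructed from the simulation description is shown here (the matrices actually used inside simple_sims_2 may be scaled differently):
u1 = c(1, 1, 0, 0, 0)  # effects shared only in the first two conditions
u2 = c(0, 0, 1, 1, 1)  # effects shared only in the last three conditions
U.true = list(U1 = tcrossprod(u1), U2 = tcrossprod(u2))  # assumed true sharing patterns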
# Fit mash with canonical covariances, data-driven (ED) covariances,
# both combined, and the true covariances used in the simulation.
m.c    = mash(data, U.c)
m.ed   = mash(data, U.ed)
m.c.ed = mash(data, c(U.c, U.ed))
m.true = mash(data, U.true)
# Compare the fits by their log-likelihoods.
print(get_loglik(m.c),    digits = 10)
print(get_loglik(m.ed),   digits = 10)
print(get_loglik(m.c.ed), digits = 10)
print(get_loglik(m.true), digits = 10)
The log-likelihood is much better with the data-driven covariances than with the canonical ones. This is good! Indeed, here the data-driven fit is very slightly better than the fit with the true matrices, but only very slightly.
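One way to make this comparison more direct is to look at the log-likelihood differences relative to the canonical-only fit; for example:
# Log-likelihood improvement of each fit over the canonical-only fit.
print(c(ed   = get_loglik(m.ed)   - get_loglik(m.c),
        c.ed = get_loglik(m.c.ed) - get_loglik(m.c),
        true = get_loglik(m.true) - get_loglik(m.c)))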