This is a simple example of simulation, inference, and prediction with the aphylo
R package.
library(aphylo)
## Loading required package: ape
# Parameter estimates
<- c(.05, .025)
psi <- c(.2, .1)
mu_d <- c(.1, .05)
mu_s <- .5 Pi
set.seed(223)
<- raphylo(n = 200, psi = psi, mu_d = mu_d, mu_s = mu_s, Pi = Pi)
x plot(x)
The simulation function generates an aphylo
class object which is simply a wrapper containing:
phylo
tree (from the ape package), andIf needed, we can export the data as follows:
# Edgelist describing parent->offspring relations
write.csv(x$tree, file = "tree.tree", row.names = FALSE)
# Tip annotations
<- with(x, rbind(tip.annotation, node.annotation))
ann write.csv(ann, file = "annotations.csv", row.names = FALSE)
# Event types
<- with(x, cbind(c(tip.type*NA, node.type)))
events rownames(events) <- 1:nrow(events)
write.csv(events, file = "events.csv", row.names = FALSE)
To fit the data, we can use MCMC as follows:
<- aphylo_mcmc(x ~ psi + mu_d + mu_s + Pi) ans
## Warning: While using multiple chains, a single initial point has been passed via
## `initial`: c(0.1, 0.05, 0.9, 0.5, 0.1, 0.05, 0.5). The values will be recycled.
## Ideally you would want to start each chain from different locations.
## Convergence has been reached with 10000 steps. Gelman-Rubin's R: 1.0184. (500 final count of samples).
ans
##
## ESTIMATION OF ANNOTATED PHYLOGENETIC TREE
##
## Call: aphylo_mcmc(model = x ~ psi + mu_d + mu_s + Pi)
## LogLik: -108.9842
## Method used: mcmc (10000 steps)
## # of Leafs: 200
## # of Functions 1
## # of Trees: 1
##
## Estimate Std. Err.
## psi0 0.0740 0.0572
## psi1 0.0552 0.0393
## mu_d0 0.2205 0.1191
## mu_d1 0.1382 0.1033
## mu_s0 0.1312 0.0443
## mu_s1 0.0629 0.0328
## Pi 0.3477 0.2595
For goodness-of-fit analysis, we have a couple of tools. We can compare the predicted values with the observed values:
plot(ans)
We can also take a look at the surface of the posterior function
plot_logLik(ans)
And we can also take a look at the prediction scores
<- prediction_score(ans)
ps # Printing ps
## Prediction score (H0: Observed = Random)
##
## N obs. : 399
## alpha(0, 1) : 0.40, 0.59
## Observed : 0.71 ***
## Random : 0.52
## P(<t) : 0.0000
## --------------------------------------------------------------------------------
## Values scaled to range between 0 and 1, 1 being best.
##
## Significance levels: *** p < .01, ** p < .05, * p < .10
## AUC 0.85.
## MAE 0.29.
plot(ps) # and plotting