library(arenar)
apartments <- DALEX::apartments
head(apartments)
Let's compare three models: a GLM and two GBMs with 100 and 500 trees. For each model we create an explainer with the DALEX package.
library(gbm)
library(DALEX)
library(dplyr)
model_gbm100 <- gbm(m2.price ~ ., data = apartments, n.trees = 100)
expl_gbm100 <- explain(
model_gbm100,
data = apartments,
y = apartments$m2.price,
label = "gbm [100 trees]"
)
model_gbm500 <- gbm(m2.price ~ ., data = apartments, n.trees = 500)
expl_gbm500 <- explain(
model_gbm500,
data = apartments,
y = apartments$m2.price,
label = "gbm [500 trees]"
)
model_glm <- glm(m2.price ~ ., data = apartments)
expl_glm <- explain(model_glm, data = apartments, y = apartments$m2.price)
Plots for a static Arena are pre-calculated, which costs both computation time and file size, so it pays to limit the number of observations. As an example, we take only apartments built in 2009 or later. A random sample would work as well.
observations <- apartments %>% filter(construction.year >= 2009)
# Observations' names are taken from rownames
rownames(observations) <- paste0(
observations$district,
" ",
observations$surface,
"m2 "
)
arena <- create_arena() %>%
# Push an explainer for each model
push_model(expl_gbm100) %>%
push_model(expl_gbm500) %>%
push_model(expl_glm) %>%
# Push dataframe of observations
push_observations(observations) %>%
# Upload calculated arena files to Gist and open Arena in browser
upload_arena()
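As mentioned earlier, a random sample can stand in for the year filter. The following is only a minimal sketch of that idea in base R: the seed, the sample size of 20, and the name observations_sample are arbitrary choices for illustration and do not come from arenar. The row names are left as the original row indices; descriptive names can be assigned in the same way as for the filtered set.

```r
# Take the data set straight from DALEX, as at the top of this document
apartments <- DALEX::apartments

# Arbitrary seed so that the same rows are drawn on every run
set.seed(42)

# Hypothetical sample size; smaller samples mean fewer pre-calculated plots
sample_size <- 20

# Draw row indices without replacement and subset the data frame
sampled_rows <- sample(nrow(apartments), sample_size)
observations_sample <- apartments[sampled_rows, ]

nrow(observations_sample)
```

The resulting data frame could then be passed to push_observations() in place of the filtered observations.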
There are two ways to add new observations or new models without recalculating the plots that were already generated. Let's add apartments built in 2008. The procedure for models is analogous.
observations2 <- apartments %>% filter(construction.year == 2008)
# Observations' names are taken from rownames
rownames(observations2) <- paste0(
observations2$district,
" ",
observations2$surface,
"m2 "
)
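Because observation names come from row names, and R refuses to assign duplicated row names to a data frame, building names from district and surface can fail when two apartments share both values. Below is a small sketch of one way to guard against that. The helper label_observations and the toy data frame are hypothetical and not part of arenar; make.unique is base R.

```r
# Hypothetical helper (not part of arenar): builds "<district> <surface>m2"
# labels and makes repeated labels unique before using them as row names
label_observations <- function(df) {
  labels <- paste0(df$district, " ", df$surface, "m2")
  # make.unique appends " #1", " #2", ... to repeated labels
  rownames(df) <- make.unique(labels, sep = " #")
  df
}

# Toy data with one duplicated district/surface pair
toy <- data.frame(
  district = c("Ochota", "Ochota", "Wola"),
  surface = c(50, 50, 72),
  stringsAsFactors = FALSE
)

labelled <- label_observations(toy)
rownames(labelled)
```

With the toy data, the second "Ochota 50m2" label receives a " #1" suffix, so the assignment succeeds instead of raising a duplicate row names error.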
We can add observations to the already existing arena object and call upload_arena() again.
arena %>%
push_observations(observations2) %>%
upload_arena()
Sometimes we don't want to close the Arena session and only want to add data. The append_data argument of the upload_arena function does that. Remember to start from a new arena object and to push all models and all observations required for the plots you want to append.
create_arena() %>%
push_observations(observations2) %>%
push_model(expl_glm) %>%
push_model(expl_gbm100) %>%
push_model(expl_gbm500) %>%
upload_arena(append_data = TRUE)