Let’s compare three models: a GLM and two GBMs, with 100 and 500 trees. For each of them we create an explainer from the DALEX package.
library(gbm)
library(DALEX)
library(dplyr)

# GBM with 100 trees; the distribution is stated explicitly so gbm() does not have to guess it
model_gbm100 <- gbm(m2.price ~ ., data = apartments, distribution = "gaussian", n.trees = 100)
expl_gbm100 <- explain(
  model_gbm100,
  data = apartments,
  y = apartments$m2.price,
  label = "gbm [100 trees]"
)

# GBM with 500 trees
model_gbm500 <- gbm(m2.price ~ ., data = apartments, distribution = "gaussian", n.trees = 500)
expl_gbm500 <- explain(
  model_gbm500,
  data = apartments,
  y = apartments$m2.price,
  label = "gbm [500 trees]"
)

# GLM with the default (gaussian) family
model_glm <- glm(m2.price ~ ., data = apartments)
expl_glm <- explain(
  model_glm,
  data = apartments,
  y = apartments$m2.price
)
Plots for a static Arena are pre-calculated, which costs both computation time and file size, so it pays to limit the number of observations. In this example we take only apartments built in 2009 or later. A random sample would work as well.
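The subsetting step can be sketched as follows. This is only an illustration: the names observations and observations_sample, the sample size of 50 and the seed are assumptions made here, not values taken from the original text.

```r
library(DALEX)   # provides the apartments data set
library(dplyr)

# Keep only apartments built in 2009 or later
observations <- apartments %>% filter(construction.year >= 2009)

# Alternative: a random sample of rows (size and seed are arbitrary choices)
set.seed(1)
observations_sample <- apartments %>% slice_sample(n = 50)
```

Either data frame can then serve as the set of observations for which plots are pre-calculated. The fewer rows it has, the faster the plots are generated and the smaller the resulting file.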
There are two ways to add new observations or new models without recalculating the plots that were already generated. Let’s add apartments built in 2008. The procedure for models is analogous.
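observations2 <- apartments %>% filter(construction.year == 2008)
# Observations' names are taken from rownames.
# make.unique() guards against two apartments sharing district and surface,
# because duplicated row names are not allowed.
rownames(observations2) <- make.unique(paste0(
  observations2$district,
  " ",
  observations2$surface,
  "m2"
))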
One way is to add the observations to the already existing arena object and call upload_arena() again.
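A minimal sketch of this first approach is shown below. It assumes the arenar functions create_arena(), push_model(), push_observations() and upload_arena(). The tiny three-row subsets and the row names are illustrative choices that keep the example quick, and the upload is guarded because it needs network access.

```r
library(arenar)
library(DALEX)
library(dplyr)

# Small illustrative setup: one model and two tiny sets of observations
expl_glm <- explain(
  glm(m2.price ~ ., data = apartments),
  data = apartments,
  y = apartments$m2.price,
  verbose = FALSE
)
observations <- apartments %>% filter(construction.year >= 2009) %>% head(3)
rownames(observations) <- paste0("from2009_", seq_len(nrow(observations)))
observations2 <- apartments %>% filter(construction.year == 2008) %>% head(3)
rownames(observations2) <- paste0("year2008_", seq_len(nrow(observations2)))

# Existing arena: plots for `observations` are generated here
arena <- create_arena() %>%
  push_model(expl_glm) %>%
  push_observations(observations)

# Pushing more observations to the same object adds plots for the new rows
arena <- arena %>% push_observations(observations2)

# Upload the extended arena (interactive sessions only)
if (interactive()) upload_arena(arena)
```

Row names are kept distinct between the two sets because they are used as the observations' names.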
Sometimes we don’t want to close the Arena session and just want to add data to it.
The upload_arena function has an argument for that.
Remember to create a new arena object for the appended data and to push to it all the models and all the
observations that are required for the plots you want to append.
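The sketch below illustrates this second approach. The argument name append_data is an assumption about the upload function's signature and should be checked against the arenar documentation. The three-row subset and the row names are illustrative, and the upload is guarded because it needs network access and a running Arena session.

```r
library(arenar)
library(DALEX)
library(dplyr)

# Model whose plots should be appended for the new observations
expl_glm <- explain(
  glm(m2.price ~ ., data = apartments),
  data = apartments,
  y = apartments$m2.price,
  verbose = FALSE
)
observations2 <- apartments %>% filter(construction.year == 2008) %>% head(3)
rownames(observations2) <- paste0("year2008_", seq_len(nrow(observations2)))

# New arena object holding only what the appended plots need
arena_append <- create_arena() %>%
  push_model(expl_glm) %>%
  push_observations(observations2)

# append_data = TRUE (assumed name) adds the data to the open session
if (interactive()) upload_arena(arena_append, append_data = TRUE)
```

Compared with extending the original arena object, this keeps the upload small, since only the newly generated plots are sent.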