shapper is an R package which ports the shap Python library to R. For details and examples see the shapper repository on GitHub and the shapper website.
SHAP (SHapley Additive exPlanations) is a method to explain predictions of any machine learning model. For more details about this method see the shap repository on GitHub.
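Before running the examples below, the package and its Python backend need to be set up. A minimal setup sketch, assuming the CRAN release is sufficient (install_shap() installs the underlying Python shap library):
# install shapper from CRAN, then its Python shap dependency
install.packages("shapper")
shapper::install_shap()
library("shapper")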
The example usage is presented on the titanic dataset from the R package DALEX.
library("DALEX")
titanic_train <- titanic[,c("survived", "class", "gender", "age", "sibsp", "parch", "fare", "embarked")]
titanic_train$survived <- factor(titanic_train$survived)
titanic_train$gender <- factor(titanic_train$gender)
titanic_train$embarked <- factor(titanic_train$embarked)
titanic_train <- na.omit(titanic_train)
head(titanic_train)
library("randomForest")
set.seed(123)
model_rf <- randomForest(survived ~ . , data = titanic_train)
model_rf
Let’s assume that we want to explain the prediction for a particular observation: a male, 8 years old, travelling in the 1st class, embarked at Cherbourg, with no parents or siblings aboard.
new_passanger <- data.frame(
class = factor("1st", levels = c("1st", "2nd", "3rd", "deck crew", "engineering crew", "restaurant staff", "victualling crew")),
gender = factor("male", levels = c("female", "male")),
age = 8,
sibsp = 0,
parch = 0,
fare = 72,
embarked = factor("Cherbourg", levels = c("Belfast", "Cherbourg", "Queenstown", "Southampton"))
)
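Before computing attributions, it may help to look at the prediction we are about to explain. A quick check, assuming randomForest’s predict() with type = "prob" returns class probabilities:
# predicted survival probabilities for the new observation
predict(model_rf, new_passanger, type = "prob")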
To use the shap() function (an alias for individual_variable_effect()) we need four elements: a model, a data set, a predict function, and a new observation to explain. The shap() function can be used directly with these four arguments, but for simplicity here we are using the DALEX package with preimplemented predict functions.
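For reference, the direct call might look roughly like this. This is only a sketch: the predict function p_fun is our own illustration, and we assume individual_variable_effect() accepts the model, data, predict function and new observation as named arguments. Below we take the DALEX route instead.
library("shapper")
# an illustrative predict function returning class probabilities
p_fun <- function(model, data) predict(model, newdata = data, type = "prob")
ive_direct <- individual_variable_effect(model_rf,
                                         data = titanic_train[, -1],
                                         predict_function = p_fun,
                                         new_observation = new_passanger)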
library("DALEX")
exp_rf <- explain(model_rf, data = titanic_train[,-1], y = as.numeric(titanic_train[,1])-1)
The explainer is an object that wraps up a model and meta-data. The meta-data consists of, at least, the data set used to fit the model and the observations to explain.
And now it is enough to generate SHAP attributions with the explainer for the random forest model.
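A minimal sketch of that step, assuming shap() accepts a DALEX explainer together with a new_observation argument, as in the shapper documentation:
library("shapper")
# compute SHAP attributions for the new observation
ive_rf <- shap(exp_rf, new_observation = new_passanger)
ive_rf
# visualise the attributions
plot(ive_rf)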