--- title: "How to use shapper for classification" author: "Alicja Gosiewska" date: "`r Sys.Date()`" output: html_document: toc: true toc_float: true number_sections: true vignette: > %\VignetteEngine{knitr::knitr} %\VignetteIndexEntry{How to use shapper for classification} %\usepackage[UTF-8]{inputenc} --- ```{r setup, include=FALSE} knitr::opts_chunk$set(echo = TRUE, message = FALSE, warning = FALSE, eval = FALSE) ``` # Introduction The `shapper` is an R package which ports the `shap` python library in R. For details and examples see [shapper repository on github](https://github.com/ModelOriented/shapper) and [shapper website](https://modeloriented.github.io/shapper/). SHAP (SHapley Additive exPlanations) is a method to explain predictions of any machine learning model. For more details about this method see [shap repository on github](https://github.com/slundberg/shap). ## Python library shap To run shapper python library shap is required. It can be installed both by python or R. To install it throught R, you an use function `install_shap` from the `shapper` package. ```{r eval = FALSE} shapper::install_shap() ``` # Load data sets The example usage is presented on the `HR` dataset from the R package `DALEX`. For more details see [DALEX github repository](https://github.com/ModelOriented/DALEX). ```{r} library("DALEX") Y_train <- HR$status x_train <- HR[ , -6] ``` # Let's build models ```{r} library("randomForest") set.seed(123) model_rf <- randomForest(x = x_train, y = Y_train) library(rpart) model_tree <- rpart(status~. , data = HR) ``` # Here shapper starts First step is to create an explainer for each model. The explainer is an object that wraps up a model and meta-data. ```{r} library(shapper) p_function <- function(model, data) predict(model, newdata = data, type = "prob") ive_rf <- individual_variable_effect(model_rf, data = x_train, predict_function = p_function, new_observation = x_train[1:2,], nsamples = 50) ive_tree <- individual_variable_effect(model_tree, data = x_train, predict_function = p_function, new_observation = x_train[1:2,], nsamples = 50) ``` ```{r} ive_rf ``` # Plotting results ```{r} plot(ive_rf, bar_width = 4) ``` [](https://modeloriented.github.io/shapper/articles/shapper_classification_files/figure-html/plot1-1.png) To see only attributions use option `show_predicted = FALSE`. ```{r} plot(ive_rf, show_predicted = FALSE, bar_width = 4) ``` [](https://modeloriented.github.io/shapper/articles/shapper_classification_files/figure-html/unnamed-chunk-6-1.png) We can show many models on one grid. ```{r} plot(ive_rf, ive_tree, show_predicted = FALSE, bar_width = 4) ``` [](https://modeloriented.github.io/shapper/articles/shapper_classification_files/figure-html/unnamed-chunk-7-1.png) ## Let's filter data for plot ```{r} ive_rf_filtered <- ive_rf[ive_rf$`_ylevel_` =="fired", ] shapper:::plot.individual_variable_effect(ive_rf_filtered) ``` [](https://modeloriented.github.io/shapper/articles/shapper_classification_files/figure-html/unnamed-chunk-8-1.png)