Title: | LIME-Based Explanations with Interpretable Inputs Based on Ceteris Paribus Profiles |
---|---|
Description: | Local explanations of machine learning models describe how features contributed to a single prediction. This package implements an explanation method based on LIME (Local Interpretable Model-agnostic Explanations, see Tulio Ribeiro, Singh, Guestrin (2016) <doi:10.1145/2939672.2939778>) in which interpretable inputs are created based on local rather than global behaviour of each original feature. |
Authors: | Przemyslaw Biecek [aut, cre], Mateusz Staniak [aut], Krystian Igras [ctb], Alicja Gosiewska [ctb], Harel Lustiger [ctb], Willy Tadema [ctb] |
Maintainer: | Przemyslaw Biecek <[email protected]> |
License: | GPL |
Version: | 0.5 |
Built: | 2024-11-08 02:56:59 UTC |
Source: | https://github.com/modeloriented/localmodel |
Since only binary features are used, the weight associated with an observation is simply exp(-{number of features that were changed compared to the original observation}). Kernels are meant to be used as the kernel argument to the individual_surrogate_model function. Custom functions can also be used; such a function takes two vectors and returns a single number.
gaussian_kernel(explained_instance, simulated_instance)
explained_instance |
explained instance |
simulated_instance |
new observation |
numeric
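A custom kernel only needs to follow the contract described above: two vectors in, a single numeric weight out. The sketch below (the function name is illustrative, not part of the package) shows a Gaussian-style kernel; on binary interpretable inputs the squared Euclidean distance equals the number of changed features, so the weight reduces to exp(-{number of changed features}).

```r
# Sketch of a Gaussian-style kernel following the documented contract:
# takes two numeric vectors, returns a single numeric weight.
# The name gaussian_kernel_sketch is illustrative only.
gaussian_kernel_sketch <- function(explained_instance, simulated_instance) {
  exp(-sum((explained_instance - simulated_instance)^2))
}

# With binary features, flipping one feature gives weight exp(-1):
gaussian_kernel_sketch(c(1, 0, 1), c(1, 1, 1))
```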
library(DALEX)
library(randomForest)
library(localModel)
data('apartments')
mrf <- randomForest(m2.price ~ ., data = apartments, ntree = 50)
explainer <- explain(model = mrf, data = apartments[, -1])
model_lok <- individual_surrogate_model(explainer, apartments[5, -1],
                                        size = 500, seed = 17,
                                        kernel = gaussian_kernel)
# In this case each simulated observation has a weight
# that is small when the distance from the original observation is large,
# so closer observations have more weight.
model_lok
plot(model_lok)
Kernels are meant to be used as the kernel argument to the individual_surrogate_model function. Custom functions can also be used; such a function takes two vectors and returns a single number.
identity_kernel(explained_instance, simulated_instance)
explained_instance |
explained instance |
simulated_instance |
new observation |
numeric
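Following the same contract, an identity-style kernel simply assigns every simulated observation the same weight, so the downstream LASSO fit is effectively unweighted. A minimal sketch (illustrative name, not the package's implementation):

```r
# Sketch of an identity-style kernel: every simulated observation
# receives the same weight regardless of distance, so the explanation
# model (LASSO) is fitted without distance-based weighting.
identity_kernel_sketch <- function(explained_instance, simulated_instance) {
  1
}

identity_kernel_sketch(c(1, 0, 1), c(0, 1, 0))
```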
library(DALEX)
library(randomForest)
library(localModel)
data('apartments')
mrf <- randomForest(m2.price ~ ., data = apartments, ntree = 50)
explainer <- explain(model = mrf, data = apartments[, -1])
model_lok <- individual_surrogate_model(explainer, apartments[5, -1],
                                        size = 500, seed = 17,
                                        kernel = identity_kernel)
# In this case each simulated observation has equal weight
# when the explanation model (LASSO) is fitted.
model_lok
plot(model_lok)
This function fits a LIME-type explanation of a single prediction. Interpretable binary features that describe the local impact of features on the prediction are created based on Ceteris Paribus Profiles. Then, a new dataset of similar observations is created, black box model predictions (scores in the case of classification) are calculated for this dataset, and a LASSO regression model is fitted to them. This way, explanations are simplified and include only the most important features. More details about the methodology can be found in the vignettes.
individual_surrogate_model( x, new_observation, size, seed = NULL, kernel = identity_kernel, sampling = "uniform", ... )
x |
an explainer created with the function DALEX::explain(). |
new_observation |
an observation to be explained. Columns should correspond to columns in the data argument of x. |
size |
number of similar observations to be sampled. |
seed |
If not NULL, seed will be set to this value for reproducibility. |
kernel |
Kernel function which will be used to weight simulated observations. |
sampling |
Parameter that controls sampling while creating new observations. |
... |
Additional arguments that will be passed to ingredients::ceteris_paribus. |
data.frame of class local_surrogate_explainer
# Example based on apartments data from DALEX package.
library(DALEX)
library(randomForest)
library(localModel)
data('apartments')
mrf <- randomForest(m2.price ~ ., data = apartments, ntree = 50)
explainer <- explain(model = mrf, data = apartments[, -1])
model_lok <- individual_surrogate_model(explainer, apartments[5, -1],
                                        size = 500, seed = 17)
model_lok
plot(model_lok)
This package implements a LIME-like explanation method (see Tulio Ribeiro, Singh, Guestrin (2016) <doi:10.1145/2939672.2939778>) in which interpretable inputs are created based on local rather than global behaviour of each original feature.
individual_surrogate_model
generates an explanation for a single prediction with
interpretable features based on Ceteris Paribus profiles.
plot.local_surrogate_explainer
plots the explanation.
Plot Ceteris Paribus Profile and discretization
plot_interpretable_feature(x, variable)
x |
local_surrogate_explainer object |
variable |
character; name of the variable to be plotted |
ggplot2 object
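Unlike the other functions on this page, no example is given above, so the following hedged sketch mirrors the pattern of the other examples: fit a local surrogate explainer on the apartments data, then plot the Ceteris Paribus profile and discretization for one feature (here "surface", a column of the apartments data).

```r
# Hypothetical usage, following the pattern of the other examples.
library(DALEX)
library(randomForest)
library(localModel)
data('apartments')
mrf <- randomForest(m2.price ~ ., data = apartments, ntree = 50)
explainer <- explain(model = mrf, data = apartments[, -1])
model_lok <- individual_surrogate_model(explainer, apartments[5, -1],
                                        size = 500, seed = 17)
# Plot the Ceteris Paribus profile and its discretization for one variable:
plot_interpretable_feature(model_lok, "surface")
```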
Generic plot function for local surrogate explainers
## S3 method for class 'local_surrogate_explainer' plot(x, ..., geom = "bar")
x |
object of class local_surrogate_explainer |
... |
other objects of class local_surrogate_explainer. If provided, models will be plotted in rows, response levels in columns. |
geom |
If "point", lines with points at the end will be plotted; if "bar", bars; if "arrow", arrows. |
# Example based on apartments data from DALEX package.
library(DALEX)
library(randomForest)
library(localModel)
data('apartments')
mrf <- randomForest(m2.price ~ ., data = apartments, ntree = 50)
explainer <- explain(model = mrf, data = apartments[, -1])
model_lok <- individual_surrogate_model(explainer, apartments[5, -1],
                                        size = 500, seed = 17)
model_lok
plot(model_lok)
Generic print function for local surrogate explainers
## S3 method for class 'local_surrogate_explainer' print(x, ...)
x |
object of class local_surrogate_explainer |
... |
currently ignored |
# Example based on apartments data from DALEX package.
library(DALEX)
library(randomForest)
library(localModel)
data('apartments')
mrf <- randomForest(m2.price ~ ., data = apartments, ntree = 50)
explainer <- explain(model = mrf, data = apartments[, -1])
model_lok <- individual_surrogate_model(explainer, apartments[5, -1],
                                        size = 500, seed = 17)
plot(model_lok)
model_lok