The `localModel` package is a variant of LIME. As in LIME, the analysis consists of:

1. Creating a new dataset X′ of m observations that are similar, in some sense, to the observation for which we explain the prediction. This dataset is usually built in terms of new, interpretable features rather than the original features. The size m is a parameter of the `individual_surrogate_model` function.
2. Fitting the black box model to the new dataset. This step requires a transformation back to the original input space and is non-trivial for numerical data.
3. Fitting an explanation model (a glass box) to the outputs of the black box model. Usually, LASSO regression is used to make sure that explanations are simple enough.
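For orientation, a minimal usage sketch (the dataset, model, and specific argument values are illustrative; the pattern follows the package README, with m corresponding to the `size` argument):

```r
library(DALEX)
library(randomForest)
library(localModel)

data("apartments", package = "DALEX")

# A black box model and a DALEX explainer wrapping it
mrf <- randomForest(m2.price ~ ., data = apartments, ntree = 50)
explainer <- explain(mrf, data = apartments[, -1], y = apartments$m2.price)

# size (= m) controls how many new observations are sampled
model_lok <- individual_surrogate_model(explainer, apartments[5, -1],
                                        size = 500, seed = 17)
```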
In the next few paragraphs, we briefly describe how each of these steps is performed in `localModel`. All of them are implemented by the `individual_surrogate_model` function.
Interpretable features are created in a way that depends on the type of the predictor:

- For categorical predictors, the original dataset is used to obtain black box predictions. Then, the marginal relationship between the feature and the response is modeled via a decision tree with a single split to dichotomize the feature.
- For numerical predictors, a Ceteris Paribus profile is calculated and this marginal relationship is again dichotomized by a decision tree, this time with maximum depth equal to 2.

For both types of predictors, the interpretable input is an indicator variable equal to 1 when the value of the feature falls into the group of levels or the interval chosen by the decision tree. All other values of the predictor are treated as a single level, the baseline.
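To make the dichotomization step concrete, here is a small sketch of the idea for a numerical feature (illustrative only: the profile values are made up and `rpart` stands in for the package's internal tree):

```r
library(rpart)

# Assume `profile` holds a Ceteris Paribus profile for one feature:
# the feature value is varied while the others are fixed, and the
# black box prediction is recorded (values below are made up).
grid <- seq(20, 150, by = 5)
profile <- data.frame(surface = grid,
                      prediction = ifelse(grid < 60, 3500, 2800))

# A decision tree with maximum depth 2 approximates this marginal
# relationship; its splits define the interval for the indicator feature.
tree <- rpart(prediction ~ surface, data = profile,
              control = rpart.control(maxdepth = 2, minsplit = 5))
tree
```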
Sampling new observations is done by:

1. Creating m copies of the explained observation.
2. Iterating through these copies and switching interpretable inputs to the baseline at random. When the `individual_surrogate_model` function is called with the argument `sampling = "uniform"`, each value is switched to the baseline with the same probability; when `sampling = "non-uniform"`, a value is switched with probability equal to the proportion of baseline values in the original dataset.
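The sampling step can be illustrated in the interpretable (binary) space as follows (a sketch assuming 1 encodes the explained value and 0 the baseline; this is not the package's internal code):

```r
# For the explained observation, every interpretable input equals 1.
# Each copy then has inputs switched to the baseline (0) at random.
sample_interpretable <- function(m, p_switch) {
  # p_switch: per-feature switching probabilities; equal for
  # sampling = "uniform", proportions of baseline values in the
  # original dataset for sampling = "non-uniform".
  n <- length(p_switch)
  switched <- matrix(rbinom(m * n, 1, rep(p_switch, each = m)), nrow = m)
  1 - switched
}

set.seed(17)
sample_interpretable(m = 5, p_switch = c(0.5, 0.5, 0.8))
```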
Fitting the black box model to the new observations requires transforming them into the original feature space. In `localModel`, this is done in the following way. The original dataset is transformed into the interpretable feature space. Based on this transformation, we know which values of each feature are categorized as the baseline and which as the explained value. Then, for each simulated observation and for each feature, we pick a random value of that feature from the original dataset that corresponds either to the baseline group or to the explained value. The black box model is fitted to these observations.
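A sketch of this back-transformation for a single numerical feature (illustrative; the names and the chosen interval are assumptions):

```r
# Indicator 1 means "draw from the group chosen by the tree",
# indicator 0 means "draw from the baseline group".
to_original <- function(indicator, original_values, in_chosen_group) {
  sapply(indicator, function(z) {
    pool <- original_values[in_chosen_group == (z == 1)]
    pool[sample.int(length(pool), 1)]
  })
}

surface  <- c(25, 40, 55, 80, 120, 140)  # feature values in the original data
in_group <- surface >= 60                # e.g. the tree chose [60, Inf)
set.seed(17)
to_original(c(1, 0, 1), surface, in_group)
```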
A LASSO model with the penalty chosen via cross-validation is used in the `localModel` package. Optionally, observations can be weighted according to their distance from the explained observation in the space of interpretable features. Weighting is controlled via the `kernel` parameter of `individual_surrogate_model`.
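The final fitting step corresponds to a LASSO fit with a cross-validated penalty; a sketch with `glmnet` (shown only to illustrate the idea, not the package's exact implementation):

```r
library(glmnet)

set.seed(17)
# Simulated interpretable inputs and black box predictions for them
x <- matrix(rbinom(500 * 3, 1, 0.5), ncol = 3,
            dimnames = list(NULL, c("f1", "f2", "f3")))
y <- drop(x %*% c(2, -1, 0)) + rnorm(500, sd = 0.1)

# alpha = 1 gives the LASSO; cv.glmnet picks the penalty by cross-validation
fit <- cv.glmnet(x, y, alpha = 1)
coef(fit, s = "lambda.min")
```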
The resulting model can be plotted using the generic `plot` function. Models can be compared by passing several explainer objects to `plot`.