Title: | SHAP Visualizations |
---|---|
Description: | Visualizations for SHAP (SHapley Additive exPlanations), such as waterfall plots, force plots, various types of importance plots, dependence plots, and interaction plots. These plots act on a 'shapviz' object created from a matrix of SHAP values and a corresponding feature dataset. Wrappers for the R packages 'xgboost', 'lightgbm', 'fastshap', 'shapr', 'h2o', 'treeshap', 'DALEX', and 'kernelshap' are added for convenience. By separating visualization and computation, it is possible to display factor variables in graphs, even if the SHAP values are calculated by a model that requires numerical features. The plots are inspired by those provided by the 'shap' package in Python, but there is no dependency on it. |
Authors: | Michael Mayer [aut, cre], Adrian Stando [ctb] |
Maintainer: | Michael Mayer <[email protected]> |
License: | GPL (>= 2) |
Version: | 0.9.7 |
Built: | 2025-01-19 20:26:43 UTC |
Source: | https://github.com/modeloriented/shapviz |
Maintainer: Michael Mayer [email protected]
Other contributors:
Adrian Stando [email protected] [contributor]
Useful links:
Report bugs at https://github.com/ModelOriented/shapviz/issues
Use standard square bracket subsetting to select rows and/or columns of SHAP values, feature values, and SHAP interaction values of a "shapviz" object.
## S3 method for class 'shapviz'
x[i, j, ...]
x | An object of class "shapviz". |
i | Row subsetting. |
j | Column subsetting. |
... | Currently unused. |
A new object of class "shapviz".
S <- matrix(c(1, -1, -1, 1), ncol = 2, dimnames = list(NULL, c("x", "y")))
X <- data.frame(x = c("a", "b"), y = c(100, 10))
x <- shapviz(S, X, baseline = 4)
x[1, "x"]
x[1]
x[c(FALSE, TRUE), ]
x[, "x"]
Rowbinds two "shapviz" objects using +.
## S3 method for class 'shapviz'
e1 + e2
## S3 method for class 'mshapviz'
e1 + e2
e1 | The first object of class "shapviz". |
e2 | The second object of class "shapviz". |
A new object of class "shapviz".
S <- matrix(c(1, -1, -1, 1), ncol = 2, dimnames = list(NULL, c("x", "y")))
X <- data.frame(x = c("a", "b"), y = c(100, 10))
s1 <- shapviz(S, X, baseline = 4)[1]
s2 <- shapviz(S, X, baseline = 4)[2]
s <- s1 + s2
s
# mshapviz
S <- matrix(c(1, -1, -1, 1), ncol = 2, dimnames = list(NULL, c("x", "y")))
X <- data.frame(x = c("a", "b"), y = c(100, 10))
s1 <- shapviz(S, X, baseline = 4)[1L]
s2 <- shapviz(S, X, baseline = 4)[2L]
s <- mshapviz(c(shp1 = s1, shp2 = s2))
s + s
This function combines two or more (usually named) "shapviz" objects to an object of class "mshapviz".
## S3 method for class 'shapviz'
c(...)
... | Any number of (optionally named) "shapviz" objects. |
A "mshapviz" object.
S <- matrix(c(1, -1, -1, 1), ncol = 2, dimnames = list(NULL, c("x", "y")))
X <- data.frame(x = c("a", "b"), y = c(100, 10))
s1 <- shapviz(S, X, baseline = 4)[1]
s2 <- shapviz(S, X, baseline = 4)[2]
s <- c(shp1 = s1, shp2 = s2)
s
This function sums up SHAP values (or SHAP interaction values) of feature groups. Typical application: SHAP values have been generated by a model with one or multiple one-hot encoded variables, but the explanations should be done using the original factor.
collapse_shap(S, collapse = NULL, ...)
S | Either a (n x p) matrix of SHAP values or a (n x p x p) array of SHAP interaction values. |
collapse | A named list of character vectors. Each vector specifies the feature names whose SHAP values need to be summed up. The names determine the resulting collapsed column/dimension names. |
... | Currently unused. |
A matrix of SHAP values, or an array of SHAP interaction values.
S <- cbind(
  x = c(0.1, 0.1, 0.1),
  `age low` = c(0.2, -0.1, 0.1),
  `age mid` = c(0, 0.2, -0.2),
  `age high` = c(1, -1, 0)
)
collapse <- list(age = c("age low", "age mid", "age high"))
collapse_shap(S, collapse)
# Arrays (as with SHAP interactions)
S_inter <- array(1, dim = c(2, 4, 4), dimnames = list(NULL, letters[1:4], letters[1:4]))
collapse_shap(S_inter, collapse = list(cd = c("c", "d"), ab = c("a", "b")))
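To make the idea of collapsing concrete, here is a minimal base R sketch for the matrix case only. It is not the package's implementation; the helper name collapse_sketch() is hypothetical, and the only assumption is that collapsing means summing the listed columns row by row while leaving the other columns untouched.

```r
# Toy SHAP matrix with three one-hot encoded "age" columns
S <- cbind(
  x = c(0.1, 0.1, 0.1),
  `age low` = c(0.2, -0.1, 0.1),
  `age mid` = c(0, 0.2, -0.2),
  `age high` = c(1, -1, 0)
)
collapse <- list(age = c("age low", "age mid", "age high"))

# Hypothetical helper: sum the columns of each group, keep the rest as is
collapse_sketch <- function(S, collapse) {
  grouped <- unlist(collapse, use.names = FALSE)
  keep <- S[, !(colnames(S) %in% grouped), drop = FALSE]
  summed <- do.call(
    cbind,
    lapply(collapse, function(cols) rowSums(S[, cols, drop = FALSE]))
  )
  cbind(keep, summed)
}

out <- collapse_sketch(S, collapse)
print(out)
```

Since only sums are taken, the row totals of the collapsed matrix equal those of the original matrix, so the additivity of the SHAP values is preserved.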
Dimensions of "shapviz" Object
## S3 method for class 'shapviz'
dim(x)
x | An object of class "shapviz". |
A numeric vector of length two providing the number of rows and columns of the SHAP matrix stored in x.
S <- matrix(c(1, -1, -1, 1), ncol = 2, dimnames = list(NULL, c("x", "y")))
X <- data.frame(x = c("a", "b"), y = c(100, 10))
x <- shapviz(S, X)
dim(x)
nrow(x)
ncol(x)
Defining this method implies that colnames(x) can be used to get the column names of the SHAP and feature matrices (and of the optional SHAP interaction values).
## S3 method for class 'shapviz'
dimnames(x)
x | An object of class "shapviz". |
Dimnames of the SHAP matrix.
S <- matrix(c(1, -1, -1, 1), ncol = 2, dimnames = list(NULL, c("x", "y")))
X <- data.frame(x = c("a", "b"), y = c(100, 10))
x <- shapviz(S, X, baseline = 4)
dimnames(x)
colnames(x)
Defining this replacement method implies that colnames(x) <- ... works as well.
## S3 replacement method for class 'shapviz'
dimnames(x) <- value
x | An object of class "shapviz". |
value | A list with row names and column names compatible with the SHAP matrix. |
Like x, but with replaced dimnames.
S <- matrix(c(1, -1, -1, 1), ncol = 2, dimnames = list(NULL, c("x", "y")))
X <- data.frame(x = c("a", "b"), y = c(100, 10))
x <- shapviz(S, X, baseline = 4)
dimnames(x) <- list(1:2, c("a", "b"))
dimnames(x)
colnames(x) <- c("x", "y")
colnames(x)
Functions to extract SHAP values, feature values, the baseline, or SHAP interactions from a "(m)shapviz" object.
get_shap_values(object, ...)
## S3 method for class 'shapviz'
get_shap_values(object, ...)
## S3 method for class 'mshapviz'
get_shap_values(object, ...)
## Default S3 method:
get_shap_values(object, ...)
get_feature_values(object, ...)
## S3 method for class 'shapviz'
get_feature_values(object, ...)
## S3 method for class 'mshapviz'
get_feature_values(object, ...)
## Default S3 method:
get_feature_values(object, ...)
get_baseline(object, ...)
## S3 method for class 'shapviz'
get_baseline(object, ...)
## S3 method for class 'mshapviz'
get_baseline(object, ...)
## Default S3 method:
get_baseline(object, ...)
get_shap_interactions(object, ...)
## S3 method for class 'shapviz'
get_shap_interactions(object, ...)
## S3 method for class 'mshapviz'
get_shap_interactions(object, ...)
## Default S3 method:
get_shap_interactions(object, ...)
object | Object to extract something from. |
... | Currently unused. |
get_shap_values() returns the matrix of SHAP values, get_feature_values() the data.frame of feature values, get_baseline() the numeric baseline value, and get_shap_interactions() the SHAP interactions of the input.
For objects of class "mshapviz", these functions return lists of those elements.
S <- matrix(c(1, -1, -1, 1), ncol = 2, dimnames = list(NULL, c("x", "y")))
X <- data.frame(x = c("a", "b"), y = c(100, 10))
shp <- shapviz(S, X, baseline = 4)
get_shap_values(shp)
Formats a numeric vector in a way that its largest absolute value determines the number of digits after the decimal separator. This function is helpful in perfectly aligning numbers on plots. Does not use scientific formatting.
format_max(x, digits = 4L, ...)
x | A numeric vector to be formatted. |
digits | Number of significant digits of the largest absolute value. |
... | Further arguments passed to |
A character vector of formatted numbers.
x <- c(100, 1, 0.1)
format_max(x)
y <- c(100, 1.01)
format_max(y)
format_max(y, digits = 5)
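The rule described above can be approximated in a few lines of base R. The following is a sketch under assumptions, not the package's code: the helper name format_max_sketch() is hypothetical, and the use of nsmall to force a common number of decimals is a choice made for this illustration.

```r
# Hypothetical helper: the largest absolute value gets `digits` significant
# digits, which fixes the number of decimals used for all elements
format_max_sketch <- function(x, digits = 4L) {
  # Number of digits before the decimal separator of the largest value
  int_digits <- trunc(log10(max(abs(x), na.rm = TRUE))) + 1
  decimals <- max(0, digits - int_digits)
  format(
    round(x, decimals),
    nsmall = decimals,     # same number of decimals for every element
    scientific = FALSE,    # no scientific notation
    trim = TRUE            # no padding with leading blanks
  )
}

out <- format_max_sketch(c(100, 1, 0.1))
print(out)
```

With the largest value being 100 (three digits before the separator) and digits = 4, one decimal remains for every element.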
Is object of class "mshapviz"?
is.mshapviz(object)
object | An R object. |
Returns TRUE if object has "mshapviz" among its classes, and FALSE otherwise.
S <- matrix(c(1, -1, -1, 1), ncol = 2, dimnames = list(NULL, c("x", "y")))
X <- data.frame(x = c("a", "b"), y = c(100, 10))
s1 <- shapviz(S, X, baseline = 4)[1]
s2 <- shapviz(S, X, baseline = 4)
x <- c(s1 = s1, s2 = s2)
is.mshapviz(x)
is.mshapviz(s1)
Is object of class "shapviz"?
is.shapviz(object)
object | An R object. |
Returns TRUE if object has "shapviz" among its classes, and FALSE otherwise.
S <- matrix(c(1, -1, -1, 1), ncol = 2, dimnames = list(NULL, c("x", "y")))
X <- data.frame(x = c("a", "b"), y = c(100, 10))
shp <- shapviz(S, X)
is.shapviz(shp)
is.shapviz("a")
The dataset contains information on 13,932 single-family homes sold in Miami-Dade County in 2016. Besides publicly available information, the dataset creator Steven C. Bourassa has added distance variables, aviation noise as well as latitude and longitude.
More information can be found open-access on https://www.mdpi.com/1595920.
The dataset can also be downloaded via miami <- OpenML::getOMLDataSet(43093)$data.
miami
A data frame with 13,932 rows and 17 columns:
unique identifier for each property. About 1% appear multiple times.
sale price ($)
land area (square feet)
floor area (square feet)
value of special features (e.g., swimming pools) ($)
distance to the nearest rail line (an indicator of noise) (feet)
distance to the ocean (feet)
distance to the nearest body of water (feet)
distance to the Miami central business district (feet)
distance to the nearest subcenter (feet)
distance to the nearest highway (an indicator of noise) (feet)
age of the structure
dummy variable for airplane noise exceeding an acceptable level
quality of the structure
sale month in 2016 (1 = jan)
Coordinates
This function combines a list of compatible "shapviz" objects to an object of class "mshapviz". The elements can be named.
mshapviz(object, ...)
object | List of "shapviz" objects to be concatenated. |
... | Not used. |
A "mshapviz" object.
S <- matrix(c(1, -1, -1, 1), ncol = 2, dimnames = list(NULL, c("x", "y")))
X <- data.frame(x = c("a", "b"), y = c(100, 10))
s1 <- shapviz(S, X, baseline = 4)[1L]
s2 <- shapviz(S, X, baseline = 4)[2L]
s <- mshapviz(c(shp1 = s1, shp2 = s2))
s
Returns a vector of interaction strengths between variable v and all other variables, see Details.
potential_interactions(
  obj,
  v,
  nbins = NULL,
  color_num = TRUE,
  scale = FALSE,
  adjusted = FALSE
)
obj | An object of class "shapviz". |
v | Variable name to calculate potential SHAP interactions for. |
nbins | Into how many quantile bins should a numeric |
color_num | Should other ("color") features |
scale | Should adjusted R-squared be multiplied with the sample variance of within-bin SHAP values? If |
adjusted | Should adjusted R-squared be used? Default is |
If SHAP interaction values are available, the interaction strength between feature v and another feature v' is measured by twice their mean absolute SHAP interaction values.
Otherwise, we use a heuristic calculated as follows:
1. If v is numeric, it is binned into nbins bins.
2. Per bin, the SHAP values of v are regressed onto v', and the R-squared is calculated. Rows with missing v' are discarded.
3. The R-squared values are averaged over bins, weighted by the number of non-missing v' values.
This measures how much variability in the SHAP values of v is explained by v', after accounting for v.
Set scale = TRUE to multiply the R-squared by the within-bin variance of the SHAP values. This will put higher weight on bins with larger scatter.
Set color_num = FALSE to not turn the values of the "color" feature v' to numeric.
Finally, set adjusted = TRUE to use adjusted R-squared.
The algorithm does not consider observations with missing v' values.
A named vector of decreasing interaction strengths.
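The heuristic from the Details can be sketched in base R as follows. This is an illustration under assumptions, not the package's code: the data are simulated, the names heuristic(), shap_v, z1, and z2 are hypothetical, and the options scale, color_num, and adjusted are left out.

```r
set.seed(1)
n <- 400
v <- runif(n)    # feature of interest
z1 <- runif(n)   # candidate feature that interacts with v
z2 <- runif(n)   # candidate feature without any effect
# Simulated SHAP values of v: main effect plus an interaction with z1
shap_v <- v + 2 * v * z1 + rnorm(n, sd = 0.05)

heuristic <- function(shap_v, v, other, nbins = 4) {
  # Step 1: bin v into quantile bins
  breaks <- unique(quantile(v, probs = seq(0, 1, length.out = nbins + 1)))
  bin <- cut(v, breaks = breaks, include.lowest = TRUE)
  ok <- !is.na(other)   # rows with missing candidate values are discarded
  r2 <- w <- numeric(nlevels(bin))
  for (k in seq_len(nlevels(bin))) {
    idx <- ok & as.integer(bin) == k
    w[k] <- sum(idx)
    # Step 2: per bin, regress the SHAP values of v onto the candidate
    r2[k] <- if (w[k] > 2) {
      summary(lm(shap_v[idx] ~ other[idx]))$r.squared
    } else {
      0
    }
  }
  # Step 3: average the R-squared values, weighted by bin size
  weighted.mean(r2, w)
}

strengths <- c(z1 = heuristic(shap_v, v, z1), z2 = heuristic(shap_v, v, z2))
print(sort(strengths, decreasing = TRUE))
```

In this simulation, z1 should rank clearly above z2, because within each bin of v the remaining scatter in shap_v is driven mostly by z1.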
Prints "mshapviz" Object
## S3 method for class 'mshapviz'
print(x, ...)
x | An object of class "mshapviz". |
... | Further arguments passed from other methods. |
Invisibly, the input is returned.
S <- matrix(c(1, -1, -1, 1), ncol = 2, dimnames = list(NULL, c("x", "y")))
X <- data.frame(x = c("a", "b"), y = c(100, 10))
s1 <- shapviz(S, X, baseline = 4)[1]
s2 <- shapviz(S, X, baseline = 4)
x <- c(s1 = s1, s2 = s2)
x
Prints "shapviz" Object
## S3 method for class 'shapviz'
print(x, ...)
x | An object of class "shapviz". |
... | Further arguments passed from other methods. |
Invisibly, the input is returned.
S <- matrix(c(1, -1, -1, 1), ncol = 2, dimnames = list(NULL, c("x", "y")))
X <- data.frame(x = c("a", "b"), y = c(100, 10))
x <- shapviz(S, X, baseline = 4)
x
Rowbinds multiple "shapviz" objects based on the + operator.
## S3 method for class 'shapviz'
rbind(...)
## S3 method for class 'mshapviz'
rbind(...)
... | Any number of "shapviz" or "mshapviz" objects. |
A new object of class "shapviz" or "mshapviz".
S <- matrix(c(1, -1, -1, 1), ncol = 2, dimnames = list(NULL, c("x", "y")))
X <- data.frame(x = c("a", "b"), y = c(100, 10))
s1 <- shapviz(S, X, baseline = 4)[1]
s2 <- shapviz(S, X, baseline = 4)[2]
s <- rbind(s1, s2)
s
# mshapviz
S <- matrix(c(1, -1, -1, 1), ncol = 2, dimnames = list(NULL, c("x", "y")))
X <- data.frame(x = c("a", "b"), y = c(100, 10))
s1 <- shapviz(S, X, baseline = 4)[1L]
s2 <- shapviz(S, X, baseline = 4)[2L]
s <- mshapviz(c(shp1 = s1, shp2 = s2))
rbind(s, s)
This function creates an object of class "shapviz" from a matrix of SHAP values, or from a fitted model of type XGBoost, LightGBM, or H2O.
Furthermore, shapviz() can digest the results of fastshap::explain(), shapr::explain(), treeshap::treeshap(), DALEX::predict_parts(), kernelshap::kernelshap(), kernelshap::permshap(), and kernelshap::additive_shap(). Check the vignettes for examples.
shapviz(object, ...)
## Default S3 method:
shapviz(object, ...)
## S3 method for class 'matrix'
shapviz(object, X, baseline = 0, collapse = NULL, S_inter = NULL, ...)
## S3 method for class 'xgb.Booster'
shapviz(
  object,
  X_pred,
  X = X_pred,
  which_class = NULL,
  collapse = NULL,
  interactions = FALSE,
  ...
)
## S3 method for class 'lgb.Booster'
shapviz(object, X_pred, X = X_pred, which_class = NULL, collapse = NULL, ...)
## S3 method for class 'explain'
shapviz(object, X = NULL, baseline = NULL, collapse = NULL, ...)
## S3 method for class 'treeshap'
shapviz(
  object,
  X = object[["observations"]],
  baseline = 0,
  collapse = NULL,
  ...
)
## S3 method for class 'predict_parts'
shapviz(object, ...)
## S3 method for class 'shapr'
shapviz(
  object,
  X = as.data.frame(object$internal$data$x_explain),
  collapse = NULL,
  ...
)
## S3 method for class 'kernelshap'
shapviz(object, X = object[["X"]], which_class = NULL, collapse = NULL, ...)
## S3 method for class 'H2OModel'
shapviz(
  object,
  X_pred,
  X = as.data.frame(X_pred),
  collapse = NULL,
  background_frame = NULL,
  output_space = FALSE,
  output_per_reference = FALSE,
  ...
)
object | For XGBoost, LightGBM, and H2O, this is the fitted model used to calculate SHAP values from |
... | Parameters passed to other methods (currently only used by the |
X | Matrix or data.frame of feature values used for visualization. Must contain at least the same column names as the SHAP matrix represented by |
baseline | Optional baseline value, representing the average response at the scale of the SHAP values. It will be used for plot methods that explain single predictions. |
collapse | A named list of character vectors. Each vector specifies the feature names whose SHAP values need to be summed up. The names determine the resulting collapsed column/dimension names. |
S_inter | Optional 3D array of SHAP interaction values. If |
X_pred | Data set as expected by the |
which_class | In case of a multiclass or multioutput setting, which class/output (>= 1) to explain. Currently relevant for XGBoost, LightGBM, kernelshap, and permshap. |
interactions | Should SHAP interactions be calculated (default is |
background_frame | Background dataset for baseline SHAP or marginal SHAP. Only for H2O models. |
output_space | If model has link function, this argument controls whether the SHAP values should be linearly (= approximately) transformed to the original scale (if |
output_per_reference | Switches between different algorithms, see |
Together with the main input, a data set X of feature values is required, used only for visualization. It can therefore contain character or factor variables, even if the SHAP values were calculated from a purely numerical feature matrix. In addition, to improve visualization, it can sometimes be useful to truncate gross outliers, logarithmize certain columns, or replace missing values with an explicit value.
SHAP values of dummy variables can be combined using the convenient collapse argument.
Multi-output models created from XGBoost, LightGBM, "kernelshap", or "permshap" return a "mshapviz" object, containing a "shapviz" object per output.
An object of class "shapviz" with the following elements:
S: Numeric matrix of SHAP values.
X: data.frame containing the feature values corresponding to S.
baseline: Baseline value, representing the average prediction at the scale of the SHAP values.
S_inter: Numeric array of SHAP interaction values (or NULL).
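The elements S and baseline are linked by the additivity property of SHAP values: per row, the baseline plus the sum of the SHAP values gives the prediction on the scale of the SHAP values. A minimal base R sketch with the toy matrix used in the examples (the variable name pred is chosen for this illustration):

```r
# Toy SHAP matrix (2 rows, 2 features) and baseline
S <- matrix(c(1, -1, -1, 1), ncol = 2, dimnames = list(NULL, c("x", "y")))
baseline <- 4

# Additivity: prediction = baseline + sum of the SHAP values of that row
pred <- baseline + rowSums(S)
print(pred)
```

This is the decomposition that plots explaining single predictions display row by row.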
shapviz(default): Default method to initialize a "shapviz" object.
shapviz(matrix): Creates a "shapviz" object from a matrix of SHAP values.
shapviz(xgb.Booster): Creates a "shapviz" object from an XGBoost model.
shapviz(lgb.Booster): Creates a "shapviz" object from a LightGBM model.
shapviz(explain): Creates a "shapviz" object from fastshap::explain().
shapviz(treeshap): Creates a "shapviz" object from treeshap::treeshap().
shapviz(predict_parts): Creates a "shapviz" object from DALEX::predict_parts().
shapviz(shapr): Creates a "shapviz" object from shapr::explain().
shapviz(kernelshap): Creates a "shapviz" object from an object of class 'kernelshap'. This includes results of kernelshap(), permshap(), and additive_shap().
shapviz(H2OModel): Creates a "shapviz" object from an H2O model.
sv_importance(), sv_dependence(), sv_dependence2D(), sv_interaction(), sv_waterfall(), sv_force(), collapse_shap()
S <- matrix(c(1, -1, -1, 1), ncol = 2, dimnames = list(NULL, c("x", "y")))
X <- data.frame(x = c("a", "b"), y = c(100, 10))
shapviz(S, X, baseline = 4)
# XGBoost models
X_pred <- data.matrix(iris[, -1])
dtrain <- xgboost::xgb.DMatrix(X_pred, label = iris[, 1], nthread = 1)
fit <- xgboost::xgb.train(list(nthread = 1), data = dtrain, nrounds = 10)
# Will use numeric matrix "X_pred" as feature matrix
x <- shapviz(fit, X_pred = X_pred)
x
sv_dependence(x, "Species")
# Will use original values as feature matrix
x <- shapviz(fit, X_pred = X_pred, X = iris)
sv_dependence(x, "Species")
# "X_pred" can also be passed as xgb.DMatrix, but only if X is passed as well!
x <- shapviz(fit, X_pred = dtrain, X = iris)
# Multiclass setting
params <- list(objective = "multi:softprob", num_class = 3, nthread = 1)
X_pred <- data.matrix(iris[, -5])
dtrain <- xgboost::xgb.DMatrix(
  X_pred, label = as.integer(iris[, 5]) - 1, nthread = 1
)
fit <- xgboost::xgb.train(params = params, data = dtrain, nrounds = 10)
# Select specific class
x <- shapviz(fit, X_pred = X_pred, which_class = 3)
x
# Or combine all classes to "mshapviz" object
x <- shapviz(fit, X_pred = X_pred)
x
# What if we would have one-hot-encoded values and want to explain the original column?
X_pred <- stats::model.matrix(~ . -1, iris[, -1])
dtrain <- xgboost::xgb.DMatrix(X_pred, label = as.integer(iris[, 1]), nthread = 1)
fit <- xgboost::xgb.train(list(nthread = 1), data = dtrain, nrounds = 10)
x <- shapviz(
  fit,
  X_pred = X_pred,
  X = iris,
  collapse = list(Species = c("Speciessetosa", "Speciesversicolor", "Speciesvirginica"))
)
summary(x)
# Similarly with LightGBM
if (requireNamespace("lightgbm", quietly = TRUE)) {
  fit <- lightgbm::lgb.train(
    params = list(objective = "regression", num_thread = 1),
    data = lightgbm::lgb.Dataset(X_pred, label = iris[, 1]),
    nrounds = 10,
    verbose = -2
  )
  x <- shapviz(fit, X_pred = X_pred)
  x
  # Multiclass
  params <- list(objective = "multiclass", num_class = 3, num_thread = 1)
  X_pred <- data.matrix(iris[, -5])
  dtrain <- lightgbm::lgb.Dataset(X_pred, label = as.integer(iris[, 5]) - 1)
  fit <- lightgbm::lgb.train(params = params, data = dtrain, nrounds = 10)
  # Select specific class
  x <- shapviz(fit, X_pred = X_pred, which_class = 3)
  x
  # Or combine all classes to a "mshapviz" object
  mx <- shapviz(fit, X_pred = X_pred)
  mx
  all.equal(mx[[3]], x)
}
Splits "shapviz" object along a vector f into an object of class "mshapviz".
## S3 method for class 'shapviz'
split(x, f, ...)
x | Object of class "shapviz". |
f | Vector used to split feature values and SHAP (interaction) values. Empty factor levels are dropped. |
... | Arguments passed to |
A "mshapviz" object.
## Not run:
dtrain <- xgboost::xgb.DMatrix(data.matrix(iris[, -1]), label = iris[, 1])
fit <- xgboost::xgb.train(data = dtrain, nrounds = 10, nthread = 1)
sv <- shapviz(fit, X_pred = dtrain, X = iris)
mx <- split(sv, f = iris$Species)
sv_dependence(mx, "Petal.Length")
## End(Not run)
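Conceptually, splitting along f means partitioning the row indices by f and subsetting the SHAP matrix and the feature data with the same indices. The following base R sketch shows that idea on toy data. It is not the package's implementation, and the names idx and parts are hypothetical.

```r
# Toy SHAP matrix and feature data with three rows
S <- matrix(
  c(1, -1, 0.5, -1, 1, 0.2),
  ncol = 2,
  dimnames = list(NULL, c("x", "y"))
)
X <- data.frame(x = c("a", "b", "a"), y = c(100, 10, 50))
# Splitting vector with an empty level "g3"
f <- factor(c("g1", "g2", "g1"), levels = c("g1", "g2", "g3"))

# Partition the row indices; drop = TRUE removes the empty level
idx <- split(seq_len(nrow(S)), f, drop = TRUE)

# Subset SHAP values and feature values in parallel
parts <- lapply(
  idx,
  function(i) list(S = S[i, , drop = FALSE], X = X[i, , drop = FALSE])
)
print(names(parts))
```

Each list element plays the role of one "shapviz" object inside the resulting "mshapviz" object.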
Summarizes "shapviz" Object
## S3 method for class 'shapviz'
summary(object, n = 2L, ...)
object | An object of class "shapviz". |
n | Maximum number of rows of SHAP values and feature values to show. |
... | Further arguments passed from other methods. |
Invisibly, the input is returned.
S <- matrix(c(1, -1, -1, 1), ncol = 2, dimnames = list(NULL, c("x", "y")))
X <- data.frame(x = c("a", "b"), y = c(100, 10))
object <- shapviz(S, X, baseline = 4)
summary(object)
Scatterplot of the SHAP values of a feature against its feature values. If SHAP interaction values are available, setting interactions = TRUE lets you focus on pure interaction effects (multiplied by two) or on pure main effects. By default, the feature on the color scale is selected via SHAP interactions (if available) or an interaction heuristic, see potential_interactions().
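The relation between SHAP values, main effects, and interaction effects can be illustrated with a small array in base R. The sketch assumes the usual convention that the array of SHAP interaction values is symmetric in its last two dimensions and that summing over the last dimension recovers the SHAP values. The numbers are made up.

```r
# Toy SHAP interaction array: 2 rows, features "x" and "y"
S_inter <- array(
  c(0.5, 0.2, 0.1, -0.3, 0.1, -0.3, 1.0, 0.4),
  dim = c(2, 2, 2),
  dimnames = list(NULL, c("x", "y"), c("x", "y"))
)

# Pure main effect of x: the diagonal element
main_x <- S_inter[, "x", "x"]

# Pure interaction effect between x and y, multiplied by two because
# the effect is split evenly between [x, y] and [y, x]
inter_xy <- 2 * S_inter[, "x", "y"]

# SHAP values are obtained by summing over the last dimension
S <- apply(S_inter, 1:2, sum)

print(cbind(main_x, inter_xy, shap_x = S[, "x"]))
```

With two features, the SHAP value of x equals its main effect plus half of the doubled interaction effect.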
sv_dependence(object, ...)
## Default S3 method:
sv_dependence(object, ...)
## S3 method for class 'shapviz'
sv_dependence(
  object,
  v,
  color_var = "auto",
  color = "#3b528b",
  viridis_args = getOption("shapviz.viridis_args"),
  jitter_width = NULL,
  interactions = FALSE,
  ih_nbins = NULL,
  ih_color_num = TRUE,
  ih_scale = FALSE,
  ih_adjusted = FALSE,
  ...
)
## S3 method for class 'mshapviz'
sv_dependence(
  object,
  v,
  color_var = "auto",
  color = "#3b528b",
  viridis_args = getOption("shapviz.viridis_args"),
  jitter_width = NULL,
  interactions = FALSE,
  ih_nbins = NULL,
  ih_color_num = TRUE,
  ih_scale = FALSE,
  ih_adjusted = FALSE,
  ...
)
object |
An object of class "(m)shapviz". |
... |
Arguments passed to |
v |
Column name of feature to be plotted. Can be a vector/list if |
color_var |
Feature name to be used on the color scale to investigate
interactions. The default ("auto") uses SHAP interaction values (if available),
or a heuristic to select the strongest interacting feature. Set to |
color |
Color to be used if |
viridis_args |
List of viridis color scale arguments, see
|
jitter_width |
The amount of horizontal jitter. The default ( |
interactions |
Should SHAP interaction values be plotted? Default is |
ih_nbins, ih_color_num, ih_scale, ih_adjusted |
Interaction heuristic (ih) parameters used to select the color variable, see |
An object of class "ggplot" (or "patchwork") representing a dependence plot.
sv_dependence(default): Default method.
sv_dependence(shapviz): SHAP dependence plot for "shapviz" object.
sv_dependence(mshapviz): SHAP dependence plot for "mshapviz" object.
dtrain <- xgboost::xgb.DMatrix(
  data.matrix(iris[, -1]),
  label = iris[, 1], nthread = 1
)
fit <- xgboost::xgb.train(data = dtrain, nrounds = 10, nthread = 1)
x <- shapviz(fit, X_pred = dtrain, X = iris)
sv_dependence(x, "Petal.Length")
sv_dependence(x, "Petal.Length", color_var = "Species")
sv_dependence(x, "Petal.Length", color_var = NULL)
sv_dependence(x, c("Species", "Petal.Length"))
sv_dependence(x, "Petal.Width", color_var = c("Species", "Petal.Length"))

# SHAP interaction values/main effects
x2 <- shapviz(fit, X_pred = dtrain, X = iris, interactions = TRUE)
sv_dependence(x2, "Petal.Length", interactions = TRUE)
sv_dependence(
  x2, c("Petal.Length", "Species"),
  color_var = NULL, interactions = TRUE
)
Scatterplot of two features, showing the sum of their SHAP values on the color scale.
This makes it possible to visualize the combined effect of two features, including
interactions. A typical application is a model with latitude and longitude as features
(plus maybe other regional features that can be passed via add_vars).
If SHAP interaction values are available, setting interactions = TRUE
makes it possible to focus on pure interaction effects (multiplied by two).
In this case, add_vars has no effect.
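The quantity shown on the color scale can be sketched in base R without calling shapviz. The feature names `lat`, `lon`, and `region` and all SHAP values below are hypothetical. The sketch only shows the summation described above, not the plotting.

```r
# Toy SHAP matrix: 2 observations, 3 features (made-up values)
S <- matrix(
  c(1, -1,      # lat
    0.5, 0.2,   # lon
    0.1, -0.3), # region
  ncol = 3,
  dimnames = list(NULL, c("lat", "lon", "region"))
)

# Color value for x = "lat", y = "lon": sum of the two SHAP columns
color_xy <- rowSums(S[, c("lat", "lon"), drop = FALSE])

# Same, with an additional feature added in the spirit of add_vars = "region"
color_all <- rowSums(S[, c("lat", "lon", "region"), drop = FALSE])

print(color_xy)
print(color_all)
```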
sv_dependence2D(object, ...)

## Default S3 method:
sv_dependence2D(object, ...)

## S3 method for class 'shapviz'
sv_dependence2D(
  object,
  x,
  y,
  viridis_args = getOption("shapviz.viridis_args"),
  jitter_width = NULL,
  jitter_height = NULL,
  interactions = FALSE,
  add_vars = NULL,
  ...
)

## S3 method for class 'mshapviz'
sv_dependence2D(
  object,
  x,
  y,
  viridis_args = getOption("shapviz.viridis_args"),
  jitter_width = NULL,
  jitter_height = NULL,
  interactions = FALSE,
  add_vars = NULL,
  ...
)
object |
An object of class "(m)shapviz". |
... |
Arguments passed to |
x |
Feature name for x axis. Can be a vector/list if |
y |
Feature name for y axis. Can be a vector/list if |
viridis_args |
List of viridis color scale arguments, see
|
jitter_width |
The amount of horizontal jitter. The default ( |
jitter_height |
Similar to |
interactions |
Should SHAP interaction values be plotted? The default ( |
add_vars |
Optional vector of feature names, whose SHAP values should be added
to the sum of the SHAP values of |
An object of class "ggplot" (or "patchwork") representing a dependence plot.
sv_dependence2D(default): Default method.
sv_dependence2D(shapviz): 2D SHAP dependence plot for "shapviz" object.
sv_dependence2D(mshapviz): 2D SHAP dependence plot for "mshapviz" object.
dtrain <- xgboost::xgb.DMatrix(
  data.matrix(iris[, -1]),
  label = iris[, 1], nthread = 1
)
fit <- xgboost::xgb.train(data = dtrain, nrounds = 10, nthread = 1)
sv <- shapviz(fit, X_pred = dtrain, X = iris)
sv_dependence2D(sv, x = "Petal.Length", y = "Species")
sv_dependence2D(sv, x = c("Petal.Length", "Species"), y = "Sepal.Width")

# SHAP interaction values
sv2 <- shapviz(fit, X_pred = dtrain, X = iris, interactions = TRUE)
sv_dependence2D(sv2, x = "Petal.Length", y = "Species", interactions = TRUE)
sv_dependence2D(
  sv2, x = "Petal.Length", y = c("Species", "Petal.Width"),
  interactions = TRUE
)

# mshapviz object
mx <- split(sv, f = iris$Species)
sv_dependence2D(mx, x = "Petal.Length", y = "Sepal.Width")
Creates a force plot of SHAP values of one observation. If multiple observations are selected, their SHAP values and predictions are averaged.
sv_force(object, ...)

## Default S3 method:
sv_force(object, ...)

## S3 method for class 'shapviz'
sv_force(
  object,
  row_id = 1L,
  max_display = 6L,
  fill_colors = c("#f7d13d", "#a52c60"),
  format_shap = getOption("shapviz.format_shap"),
  format_feat = getOption("shapviz.format_feat"),
  contrast = TRUE,
  bar_label_size = 3.2,
  show_annotation = TRUE,
  annotation_size = 3.2,
  ...
)

## S3 method for class 'mshapviz'
sv_force(
  object,
  row_id = 1L,
  max_display = 6L,
  fill_colors = c("#f7d13d", "#a52c60"),
  format_shap = getOption("shapviz.format_shap"),
  format_feat = getOption("shapviz.format_feat"),
  contrast = TRUE,
  bar_label_size = 3.2,
  show_annotation = TRUE,
  annotation_size = 3.2,
  ...
)
object |
An object of class "(m)shapviz". |
... |
Arguments passed to |
row_id |
Subset of observations to plot, typically a single row number. If more than one row is selected, SHAP values are averaged, and feature values are shown only when they are unique. |
max_display |
Maximum number of features (those with the largest absolute SHAP values)
to plot. If there are more features, the remaining ones are collapsed into one
feature. Set to |
fill_colors |
A vector of exactly two fill colors: the first for positive SHAP values, the other for negative ones. |
format_shap |
Function used to format SHAP values. The default uses the
global option |
format_feat |
Function used to format numeric feature values. The default uses
the global option |
contrast |
Logical flag that determines whether to use white text in dark arrows.
Default is |
bar_label_size |
Size of text used to describe bars
(via |
show_annotation |
Should "f(x)" and "E(f(x))" be plotted? Default is |
annotation_size |
Size of the annotation text (f(x)=... and E(f(x))=...). |
f(x) denotes the prediction on the SHAP scale, while E(f(x)) refers to the baseline SHAP value.
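The relation between f(x), E(f(x)), and the SHAP values, as well as the averaging over several selected rows, can be sketched in base R. This is an illustration with made-up numbers that relies only on the additivity of SHAP values (prediction = baseline + sum of SHAP values). It does not reproduce the internals of sv_force().

```r
# Toy SHAP matrix: 3 observations, 2 features (made-up values)
S <- matrix(
  c(1, -1, 0.5,   # x
    2, 0.5, -1),  # y
  ncol = 2,
  dimnames = list(NULL, c("x", "y"))
)
baseline <- 4  # E(f(x)), the baseline SHAP value

# f(x) per observation, on the SHAP scale
pred <- baseline + rowSums(S)

# Selecting several rows: SHAP values are averaged per feature ...
rows <- c(1, 3)
s_avg <- colMeans(S[rows, , drop = FALSE])

# ... and the resulting f(x) equals the average prediction of those rows
f_avg <- baseline + sum(s_avg)

print(pred)
print(s_avg)
print(f_avg)
```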
An object of class "ggplot" (or "patchwork") representing a force plot.
sv_force(default): Default method.
sv_force(shapviz): SHAP force plot for object of class "shapviz".
sv_force(mshapviz): SHAP force plot for object of class "mshapviz".
dtrain <- xgboost::xgb.DMatrix(
  data.matrix(iris[, -1]),
  label = iris[, 1], nthread = 1
)
fit <- xgboost::xgb.train(data = dtrain, nrounds = 20, nthread = 1)
x <- shapviz(fit, X_pred = dtrain, X = iris[, -1])
sv_force(x)
sv_force(x, row_id = 65, max_display = 3, size = 9, fill_colors = 4:5)

# Aggregate over all observations with Petal.Length == 1.4
sv_force(x, row_id = x$X$Petal.Length == 1.4)
This function provides two types of SHAP importance plots: a bar plot and a beeswarm plot (sometimes called "SHAP summary plot"). The two types of plots can also be combined.
sv_importance(object, ...)

## Default S3 method:
sv_importance(object, ...)

## S3 method for class 'shapviz'
sv_importance(
  object,
  kind = c("bar", "beeswarm", "both", "no"),
  max_display = 15L,
  fill = "#fca50a",
  bar_width = 2/3,
  bee_width = 0.4,
  bee_adjust = 0.5,
  viridis_args = getOption("shapviz.viridis_args"),
  color_bar_title = "Feature value",
  show_numbers = FALSE,
  format_fun = format_max,
  number_size = 3.2,
  sort_features = TRUE,
  ...
)

## S3 method for class 'mshapviz'
sv_importance(
  object,
  kind = c("bar", "beeswarm", "both", "no"),
  max_display = 15L,
  fill = "#fca50a",
  bar_width = 2/3,
  bar_type = c("dodge", "stack", "facets", "separate"),
  bee_width = 0.4,
  bee_adjust = 0.5,
  viridis_args = getOption("shapviz.viridis_args"),
  color_bar_title = "Feature value",
  show_numbers = FALSE,
  format_fun = format_max,
  number_size = 3.2,
  sort_features = TRUE,
  ...
)
object |
An object of class "(m)shapviz". |
... |
Arguments passed to |
kind |
Should a "bar" plot (the default), a "beeswarm" plot, or "both" be shown? Set to "no" in order to suppress plotting. In that case, the sorted SHAP feature importances of all variables are returned. |
max_display |
How many features should be plotted?
Set to |
fill |
Color used to fill the bars (only used if bars are shown). |
bar_width |
Relative width of the bars (only used if bars are shown). |
bee_width |
Relative width of the beeswarms. |
bee_adjust |
Relative bandwidth adjustment factor used in estimating the density of the beeswarms. |
viridis_args |
List of viridis color scale arguments. The default points to the
global option |
color_bar_title |
Title of color bar of the beeswarm plot. Set to |
show_numbers |
Should SHAP feature importances be printed? Default is |
format_fun |
Function used to format SHAP feature importances
(only if |
number_size |
Text size of the numbers (if |
sort_features |
Should features be sorted or not? The default is |
bar_type |
For "mshapviz" objects with |
The bar plot shows SHAP feature importances, calculated as the average absolute SHAP
value per feature. The beeswarm plot displays SHAP values per feature, using min-max
scaled feature values on the color axis. Non-numeric features are transformed
to numeric by calling data.matrix() first. For both types of plots, the features
are sorted in decreasing order of importance.
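Both calculations mentioned above are easy to reproduce in base R. The following sketch uses made-up SHAP and feature values. The helper `minmax()` is a hypothetical name introduced here for illustration and is not claimed to be the function used internally.

```r
# Toy SHAP matrix and matching feature data (made-up values)
S <- matrix(
  c(1, -3, 2,       # a
    0.5, -0.5, 0),  # b
  ncol = 2,
  dimnames = list(NULL, c("a", "b"))
)
X <- data.frame(a = c(10, 20, 30), b = factor(c("u", "v", "u")))

# SHAP feature importance: average absolute SHAP value per feature,
# sorted in decreasing order
imp <- sort(colMeans(abs(S)), decreasing = TRUE)

# Color values for a beeswarm plot: convert to numeric via data.matrix(),
# then min-max scale each column to [0, 1]
minmax <- function(z) (z - min(z)) / (max(z) - min(z))
X_scaled <- apply(data.matrix(X), 2, minmax)

print(imp)
print(X_scaled)
```

Note that this naive `minmax()` returns NaN for constant columns, so a real implementation would need to handle that case separately.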
A "ggplot" (or "patchwork") object representing an importance plot, or - if
kind = "no" - a named numeric vector of sorted SHAP feature importances
(or a matrix in case of an object of class "mshapviz").
sv_importance(default): Default method.
sv_importance(shapviz): SHAP importance plot for an object of class "shapviz".
sv_importance(mshapviz): SHAP importance plot for an object of class "mshapviz".
X_train <- data.matrix(iris[, -1])
dtrain <- xgboost::xgb.DMatrix(X_train, label = iris[, 1], nthread = 1)
fit <- xgboost::xgb.train(data = dtrain, nrounds = 10, nthread = 1)
x <- shapviz(fit, X_pred = X_train)
sv_importance(x)
sv_importance(x, kind = "no")
sv_importance(x, kind = "beeswarm", show_numbers = TRUE)
Plots a beeswarm plot for each feature pair. Diagonals represent the main effects,
while off-diagonals show interactions (multiplied by two due to symmetry).
The colors on the beeswarm plots represent min-max scaled feature values.
Non-numeric features are transformed to numeric by calling data.matrix() first.
The features are sorted in decreasing order of usual SHAP importance.
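The matrix of average absolute SHAP interactions described here can be sketched in base R. The sketch assumes a symmetric n x p x p array of SHAP interaction values (named `S_inter` for illustration) whose sums over the last dimension are the usual SHAP values. The numbers are made up, and the code is not taken from the package.

```r
# Toy SHAP interaction array: 2 observations, 2 features (made-up values)
S_inter <- array(
  c(0.5, -0.2,   # [, "x", "x"]
    0.1,  0.3,   # [, "y", "x"]
    0.1,  0.3,   # [, "x", "y"]
    -0.4, 0.6),  # [, "y", "y"]
  dim = c(2, 2, 2),
  dimnames = list(NULL, c("x", "y"), c("x", "y"))
)

# Average absolute interaction value per feature pair
avg <- apply(abs(S_inter), 2:3, mean)

# Off-diagonals are doubled, since each interaction appears twice
off_diag <- row(avg) != col(avg)
avg[off_diag] <- 2 * avg[off_diag]

# Sort rows and columns by the usual SHAP importance
S <- apply(S_inter, 1:2, sum)
ord <- names(sort(colMeans(abs(S)), decreasing = TRUE))
avg <- avg[ord, ord]

print(avg)
```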
sv_interaction(object, ...)

## Default S3 method:
sv_interaction(object, ...)

## S3 method for class 'shapviz'
sv_interaction(
  object,
  kind = c("beeswarm", "no"),
  max_display = 7L,
  alpha = 0.3,
  bee_width = 0.3,
  bee_adjust = 0.5,
  viridis_args = getOption("shapviz.viridis_args"),
  color_bar_title = "Row feature value",
  sort_features = TRUE,
  ...
)

## S3 method for class 'mshapviz'
sv_interaction(
  object,
  kind = c("beeswarm", "no"),
  max_display = 7L,
  alpha = 0.3,
  bee_width = 0.3,
  bee_adjust = 0.5,
  viridis_args = getOption("shapviz.viridis_args"),
  color_bar_title = "Row feature value",
  sort_features = TRUE,
  ...
)
object |
An object of class "(m)shapviz" containing element |
... |
Arguments passed to |
kind |
Set to "no" to return the matrix of average absolute SHAP interactions (or a list of such matrices in case of object of class "mshapviz"). Due to symmetry, off-diagonals are multiplied by two. The default is "beeswarm". |
max_display |
How many features should be plotted?
Set to |
alpha |
Transparency of the beeswarm dots. Defaults to 0.3. |
bee_width |
Relative width of the beeswarms. |
bee_adjust |
Relative bandwidth adjustment factor used in estimating the density of the beeswarms. |
viridis_args |
List of viridis color scale arguments. The default points to the
global option |
color_bar_title |
Title of color bar of the beeswarm plot. Set to |
sort_features |
Should features be sorted or not? The default is |
A "ggplot" (or "patchwork") object, or - if kind = "no" - a named
numeric matrix of average absolute SHAP interactions sorted by the average
absolute SHAP values (or a list of such matrices in case of "mshapviz" object).
sv_interaction(default): Default method.
sv_interaction(shapviz): SHAP interaction plot for an object of class "shapviz".
sv_interaction(mshapviz): SHAP interaction plot for an object of class "mshapviz".
dtrain <- xgboost::xgb.DMatrix(
  data.matrix(iris[, -1]),
  label = iris[, 1], nthread = 1
)
fit <- xgboost::xgb.train(data = dtrain, nrounds = 10, nthread = 1)
x <- shapviz(fit, X_pred = dtrain, X = iris, interactions = TRUE)
sv_interaction(x, kind = "no")
sv_interaction(x, max_display = 2, size = 3)
Creates a waterfall plot of SHAP values of one observation. If multiple observations are selected, their SHAP values and predictions are averaged.
sv_waterfall(object, ...)

## Default S3 method:
sv_waterfall(object, ...)

## S3 method for class 'shapviz'
sv_waterfall(
  object,
  row_id = 1L,
  max_display = 10L,
  order_fun = function(s) order(abs(s)),
  fill_colors = c("#f7d13d", "#a52c60"),
  format_shap = getOption("shapviz.format_shap"),
  format_feat = getOption("shapviz.format_feat"),
  contrast = TRUE,
  show_connection = TRUE,
  show_annotation = TRUE,
  annotation_size = 3.2,
  ...
)

## S3 method for class 'mshapviz'
sv_waterfall(
  object,
  row_id = 1L,
  max_display = 10L,
  order_fun = function(s) order(abs(s)),
  fill_colors = c("#f7d13d", "#a52c60"),
  format_shap = getOption("shapviz.format_shap"),
  format_feat = getOption("shapviz.format_feat"),
  contrast = TRUE,
  show_connection = TRUE,
  show_annotation = TRUE,
  annotation_size = 3.2,
  ...
)
object |
An object of class "(m)shapviz". |
... |
Arguments passed to |
row_id |
Subset of observations to plot, typically a single row number. If more than one row is selected, SHAP values are averaged, and feature values are shown only when they are unique. |
max_display |
Maximum number of features (those with the largest absolute SHAP values)
to plot. If there are more features, the remaining ones are collapsed into one
feature. Set to |
order_fun |
Function specifying the order of the variables/SHAP values.
It maps the vector |
fill_colors |
A vector of exactly two fill colors: the first for positive SHAP values, the other for negative ones. |
format_shap |
Function used to format SHAP values. The default uses the
global option |
format_feat |
Function used to format numeric feature values. The default uses
the global option |
contrast |
Logical flag that determines whether to use white text in dark arrows.
Default is |
show_connection |
Should connecting lines be shown? Default is |
show_annotation |
Should "f(x)" and "E(f(x))" be plotted? Default is |
annotation_size |
Size of the annotation text (f(x)=... and E(f(x))=...). |
f(x) denotes the prediction on the SHAP scale, while E(f(x)) refers to the baseline SHAP value.
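How an ordering function such as the default `function(s) order(abs(s))` interacts with the baseline can be sketched in base R. The sketch below uses made-up SHAP values for one observation and computes where each bar would start and end if contributions are stacked cumulatively from E(f(x)) to f(x). The variable names `from` and `to` are illustrative assumptions, and the drawing direction used by the actual plot is not asserted here.

```r
# SHAP values of one observation (made-up) and the baseline E(f(x))
s <- c(a = 0.5, b = -2, c = 1)
baseline <- 4

# Default ordering from the usage section: increasing absolute SHAP value
order_fun <- function(s) order(abs(s))
s_ord <- s[order_fun(s)]

# Cumulative bar positions: each bar starts where the previous one ended
to <- baseline + cumsum(s_ord)
from <- c(baseline, head(to, -1))

# The last bar ends at the prediction f(x) = baseline + sum of SHAP values
f_x <- baseline + sum(s)

print(data.frame(feature = names(s_ord), shap = s_ord, from = from, to = to))
print(f_x)
```

Any other permutation returned by `order_fun` changes the intermediate positions but not the end point, because the sum of the SHAP values is the same.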
An object of class "ggplot" (or "patchwork") representing a waterfall plot.
sv_waterfall(default): Default method.
sv_waterfall(shapviz): SHAP waterfall plot for an object of class "shapviz".
sv_waterfall(mshapviz): SHAP waterfall plot for an object of class "mshapviz".
dtrain <- xgboost::xgb.DMatrix(
  data.matrix(iris[, -1]),
  label = iris[, 1], nthread = 1
)
fit <- xgboost::xgb.train(data = dtrain, nrounds = 20, nthread = 1)
x <- shapviz(fit, X_pred = dtrain, X = iris[, -1])
sv_waterfall(x)
sv_waterfall(x, row_id = 123, max_display = 2, size = 9, fill_colors = 4:5)

# Ordered by colnames(x), combined with max_display
sv_waterfall(
  x[, sort(colnames(x))],
  order_fun = function(s) length(s):1, max_display = 3
)

# Aggregate over all observations with Petal.Length == 1.4
sv_waterfall(x, row_id = x$X$Petal.Length == 1.4)