Title: | Model Audit - Verification, Validation, and Error Analysis |
---|---|
Description: | Provides an easy to use unified interface for creating validation plots for any model. The 'auditor' helps to avoid repetitive work consisting of writing code needed to create residual plots. These visualizations make it possible to assess and compare the goodness of fit, performance, and similarity of models. |
Authors: | Alicja Gosiewska [aut, cre] , Przemyslaw Biecek [aut, ths] , Hubert Baniecki [aut] , Tomasz Mikołajczyk [aut], Michal Burdukiewicz [ctb], Szymon Maksymiuk [ctb] |
Maintainer: | Alicja Gosiewska <[email protected]> |
License: | GPL |
Version: | 1.3.5 |
Built: | 2025-01-22 03:41:54 UTC |
Source: | https://github.com/modeloriented/auditor |
The audit() function is deprecated; use explain from the DALEX package instead.
audit(
  object,
  data = NULL,
  y = NULL,
  predict.function = NULL,
  residual.function = NULL,
  label = NULL,
  predict_function = NULL,
  residual_function = NULL
)
object |
An object containing a model or object of class explainer (see |
data |
Data.frame or matrix - data that will be used by further validation functions. If not provided, will be extracted from the model. |
y |
Response vector that will be used by further validation functions. Some functions may require an integer vector containing binary labels with values 0,1. If not provided, will be extracted from the model. |
predict.function |
Function that takes two arguments: model and data. It should return a numeric vector with predictions. |
residual.function |
Function that takes three arguments: model, data and response vector. It should return a numeric vector with model residuals for given data. If not provided, response residuals ( |
label |
Character - the name of the model. By default it's extracted from the 'class' attribute of the model. |
predict_function |
Function that takes two arguments: model and data. It should return a numeric vector with predictions. |
residual_function |
Function that takes three arguments: model, data and response vector. It should return a numeric vector with model residuals for given data. If not provided, response residuals ( |
An object of class explainer.
data(titanic_imputed, package = "DALEX")

model_glm <- glm(survived ~ ., family = binomial, data = titanic_imputed)
audit_glm <- audit(model_glm,
                   data = titanic_imputed,
                   y = titanic_imputed$survived)

# custom predict function returning predictions on the link scale
p_fun <- function(model, data) {
  predict(model, data, type = "link")
}
audit_glm_newpred <- audit(model_glm,
                           data = titanic_imputed,
                           y = titanic_imputed$survived,
                           predict.function = p_fun)

library(randomForest)
model_rf <- randomForest(Species ~ ., data = iris)
audit_rf <- audit(model_rf)
auditorData is an artificial data set consisting of 2000 observations. The first four simulated variables are treated as continuous, while the fifth one is categorical.
data(auditorData)
a data frame with 2000 rows and 5 columns
data("auditorData", package = "auditor")
head(auditorData)
Currently three tests are performed:
- for outliers in residuals
- for autocorrelation in target variable or in residuals
- for trend in residuals as a function of target variable (detection of bias)
check_residuals(object, ...)
object |
An object of class 'explainer' created with function |
... |
other parameters that will be passed to further functions. |
list with statistics for particular checks
dragons <- DALEX::dragons[1:100, ]
lm_model <- lm(life_length ~ ., data = dragons)
lm_audit <- audit(lm_model, data = dragons, y = dragons$life_length)
check_residuals(lm_audit)

## Not run:
library("randomForest")
rf_model <- randomForest(life_length ~ ., data = dragons)
rf_audit <- audit(rf_model, data = dragons, y = dragons$life_length)
check_residuals(rf_audit)
## End(Not run)
Checks for autocorrelation in target variable or in residuals
check_residuals_autocorrelation(object, method = "pearson")
object |
An object of class 'explainer' created with function |
method |
Method passed to the cor.test function. |
Autocorrelation of the target variable and of the residuals.
dragons <- DALEX::dragons[1:100, ]
lm_model <- lm(life_length ~ ., data = dragons)
lm_audit <- audit(lm_model, data = dragons, y = dragons$life_length)
check_residuals_autocorrelation(lm_audit)
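As a rough illustration of what an autocorrelation check of this kind involves, the sketch below correlates each residual with its successor using base R's cor.test. It is a simplified stand-in, not the code used by check_residuals_autocorrelation; the simulated data and the name lag1_cor are hypothetical.

```r
# Hypothetical illustration: lag-1 autocorrelation of residuals via cor.test().
set.seed(1)
n <- 200
x <- seq_len(n)
# AR(1) noise gives the residuals a visible serial dependence
noise <- as.numeric(stats::filter(rnorm(n), filter = 0.7, method = "recursive"))
y <- 2 + 0.5 * x + noise

fit <- lm(y ~ x)
res <- residuals(fit)

# Correlate residual i with residual i + 1
lag1_cor <- cor.test(res[-n], res[-1], method = "pearson")
lag1_cor$estimate   # sample lag-1 correlation
lag1_cor$p.value    # small values suggest autocorrelation
```

The method argument of cor.test accepts "pearson", "kendall", or "spearman", which is why a method parameter is a natural thing to pass through.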
Outlier checks
check_residuals_outliers(object, n = 5)
object |
An object of class 'explainer' created with function |
n |
number of lowest and highest standardized residuals to be presented |
indexes of lowest and highest standardized residuals
dragons <- DALEX::dragons[1:100, ]
lm_model <- lm(life_length ~ ., data = dragons)
lm_audit <- audit(lm_model, data = dragons, y = dragons$life_length)
check_residuals_outliers(lm_audit)
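To make the returned value more concrete, the following base R sketch finds the indexes of the n lowest and n highest standardized residuals of a linear model. It assumes rstandard() as the standardization, which may differ from what check_residuals_outliers does internally; the data and variable names are made up.

```r
# Hypothetical illustration: indexes of the n lowest and highest standardized residuals.
set.seed(2)
d <- data.frame(x = rnorm(100))
d$y <- 1 + 2 * d$x + rnorm(100)
d$y[17] <- d$y[17] + 10   # plant one large positive outlier

fit <- lm(y ~ x, data = d)
std_res <- rstandard(fit)          # standardized residuals for lm objects

n <- 5
ord <- order(std_res)
lowest  <- head(ord, n)            # indexes of the n smallest residuals
highest <- rev(tail(ord, n))       # indexes of the n largest, biggest first
lowest
highest
```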
Calculates a loess fit for residuals and then extracts statistics that show how far this fit is from one without trend.
check_residuals_trend(object, B = 20)
object |
An object of class 'explainer' created with function |
B |
number of samplings |
standardized loess fit for residuals
library(DALEX)
dragons <- DALEX::dragons[1:100, ]
lm_model <- lm(life_length ~ ., data = dragons)
lm_exp <- explain(lm_model, data = dragons, y = dragons$life_length)

library(auditor)
check_residuals_trend(lm_exp)
Calculates Cook's distances for each observation.
Please note that it works only for models with a specified update method.
model_cooksdistance(object)

observationInfluence(object)
object |
An object of class |
An object of the class auditor_model_cooksdistance.
Cook, R. Dennis (1977). "Detection of Influential Observations in Linear Regression". doi:10.2307/1268249.
data(titanic_imputed, package = "DALEX")

# fit a model
model_glm <- glm(survived ~ ., family = binomial, data = titanic_imputed)

# use DALEX package to wrap up a model into explainer
glm_audit <- audit(model_glm,
                   data = titanic_imputed,
                   y = titanic_imputed$survived)

# validate a model with auditor
mc <- model_cooksdistance(glm_audit)
mc

plot(mc)
Creates explanation of classification model.
Returns, among others, true positive rate (tpr), false positive rate (fpr), rate of positive prediction (rpp), and true positives (tp).
The created object of class auditor_model_evaluation can be used to plot the Receiver Operating Characteristic (ROC) curve (plot_roc) and the LIFT curve (plot_lift).
model_evaluation(object)

modelEvaluation(object)
object |
An object of class |
An object of the class auditor_model_evaluation.
data(titanic_imputed, package = "DALEX")

# fit a model
model_glm <- glm(survived ~ ., family = binomial, data = titanic_imputed)

glm_audit <- audit(model_glm,
                   data = titanic_imputed,
                   y = titanic_imputed$survived)

# validate a model with auditor
me <- model_evaluation(glm_audit)
me

plot(me)
Creates an auditor_model_halfnormal object that can be used for plotting a half-normal plot.
model_halfnormal(object, quant = FALSE, ...)

modelFit(object, quant = FALSE, ...)
object |
An object of class |
quant |
if TRUE values on axis are on quantile scale. |
... |
other parameters passed to |
An object of the class auditor_model_halfnormal.
Moral, R., Hinde, J., & Demétrio, C. (2017). Half-Normal Plots and Overdispersed Models in R: The hnp Package. doi:10.18637/jss.v081.i10
data(titanic_imputed, package = "DALEX")

# fit a model
model_glm <- glm(survived ~ ., family = binomial, data = titanic_imputed)

glm_audit <- audit(model_glm,
                   data = titanic_imputed,
                   y = titanic_imputed$survived)

# validate a model with auditor
mh <- model_halfnormal(glm_audit)
mh

plot(mh)
Creates an auditor_model_performance object that can be used to plot a radar with a ranking of models.
model_performance(
  object,
  score = c("mae", "mse", "rec", "rroc"),
  new_score = NULL,
  data = NULL,
  ...
)

modelPerformance(
  object,
  score = c("mae", "mse", "rec", "rroc"),
  new_score = NULL
)
object |
An object of class |
score |
Vector of score names to be calculated. Possible values: |
new_score |
A named list of functions that take one argument: object of class 'explainer' and return a numeric value. The measure calculated by the function should have the property that lower score value indicates better model. |
data |
New data that will be used to calculate scores. Pass |
... |
Other arguments dependent on the score list. |
An object of the class auditor_model_performance.
score_acc, score_auc, score_cooksdistance, score_dw,
score_f1, score_gini,
score_halfnormal, score_mae, score_mse,
score_peak, score_precision, score_r2,
score_rec, score_recall, score_rmse,
score_rroc, score_runs, score_specificity,
score_one_minus_acc, score_one_minus_auc, score_one_minus_f1,
score_one_minus_precision, score_one_minus_gini,
score_one_minus_recall, score_one_minus_specificity
data(titanic_imputed, package = "DALEX")

# fit a model
model_glm <- glm(survived ~ ., family = binomial, data = titanic_imputed)

# use DALEX package to wrap up a model into explainer
glm_audit <- audit(model_glm,
                   data = titanic_imputed,
                   y = titanic_imputed$survived)

# validate a model with auditor
library(auditor)
mp <- model_performance(glm_audit)
mp

plot(mp)
Creates an auditor_model_residual object that contains sorted residuals.
The object can be further used to generate plots.
For the list of possible plots, see the See Also section.
model_residual(object, ...)

modelResiduals(object, ...)
object |
An object of class |
... |
other parameters |
An object of the class auditor_model_residual.
plot_acf, plot_autocorrelation, plot_residual, plot_residual_boxplot,
plot_pca, plot_correlation, plot_prediction, plot_rec, plot_residual_density,
plot_residual, plot_rroc, plot_scalelocation, plot_tsecdf
library(DALEX)

# fit a model
model_glm <- glm(m2.price ~ ., data = apartments)

glm_audit <- explain(model_glm,
                     data = apartments,
                     y = apartments$m2.price)

# validate a model with auditor
mr <- model_residual(glm_audit)
mr

plot(mr)
Plot Autocorrelation Function of models' residuals.
plot_acf(object, ..., variable = NULL, alpha = 0.95)

plotACF(object, ..., variable = NULL, alpha = 0.95)
object |
An object of class |
... |
Other |
variable |
Name of variable to order residuals on a plot.
If |
alpha |
Confidence level of the interval. |
A ggplot object.
dragons <- DALEX::dragons[1:100, ]

# fit a model
model_lm <- lm(life_length ~ ., data = dragons)

lm_audit <- audit(model_lm, data = dragons, y = dragons$life_length)

# validate a model with auditor
mr_lm <- model_residual(lm_audit)

# plot results
plot(mr_lm, type = "acf")
plot_acf(mr_lm)

library(randomForest)
model_rf <- randomForest(life_length ~ ., data = dragons)
rf_audit <- audit(model_rf, data = dragons, y = dragons$life_length)
mr_rf <- model_residual(rf_audit)
plot_acf(mr_lm, mr_rf)
plot(mr_lm, mr_rf, type = "acf")
This function provides several diagnostic plots for regression and classification models.
Provide an object created with one of auditor's computational functions: model_residual, model_cooksdistance, model_evaluation, model_performance, or model_halfnormal.
plot_auditor(x, ..., type = "residual", ask = TRUE, grid = TRUE)

## S3 method for class 'auditor_model_residual'
plot(x, ..., type = "residual", ask = TRUE, grid = TRUE)

## S3 method for class 'auditor_model_performance'
plot(x, ..., type = "residual", ask = TRUE, grid = TRUE)

## S3 method for class 'auditor_model_halfnormal'
plot(x, ..., type = "residual", ask = TRUE, grid = TRUE)

## S3 method for class 'auditor_model_evaluation'
plot(x, ..., type = "residual", ask = TRUE, grid = TRUE)

## S3 method for class 'auditor_model_cooksdistance'
plot(x, ..., type = "residual", ask = TRUE, grid = TRUE)
x |
object of class |
... |
other arguments dependent on the type of plot or additional objects of classes |
type |
the type of plot. Character or vector of characters. Possible values: |
ask |
logical; if |
grid |
logical; if |
A ggplot object.
plot_acf, plot_autocorrelation, plot_cooksdistance,
plot_halfnormal, plot_residual_boxplot, plot_lift, plot_pca,
plot_radar, plot_correlation,
plot_prediction, plot_rec, plot_residual_density, plot_residual, plot_roc,
plot_rroc, plot_scalelocation, plot_tsecdf
dragons <- DALEX::dragons[1:100, ]

# fit a model
model_lm <- lm(life_length ~ ., data = dragons)

lm_audit <- audit(model_lm, data = dragons, y = dragons$life_length)

# validate a model with auditor
mr_lm <- model_residual(lm_audit)

# plot results
plot(mr_lm)
plot(mr_lm, type = "prediction")

hn_lm <- model_halfnormal(lm_audit)
plot(hn_lm)

library(randomForest)
model_rf <- randomForest(life_length ~ ., data = dragons)
rf_audit <- audit(model_rf, data = dragons, y = dragons$life_length)

mp_rf <- model_performance(rf_audit)
mp_lm <- model_performance(lm_audit)
plot(mp_lm, mp_rf)
Plot of i-th residual vs i+1-th residual.
plot_autocorrelation(object, ..., variable = "_y_hat_", smooth = FALSE)

plotAutocorrelation(object, ..., variable, smooth = FALSE)
object |
An object of class |
... |
Other |
variable |
Name of variable to order residuals on a plot.
If |
smooth |
Logical, if TRUE smooth line will be added. |
A ggplot object.
dragons <- DALEX::dragons[1:100, ]

# fit a model
model_lm <- lm(life_length ~ ., data = dragons)

lm_audit <- audit(model_lm, data = dragons, y = dragons$life_length)

# validate a model with auditor
mr_lm <- model_residual(lm_audit)

# plot results
plot_autocorrelation(mr_lm)
plot(mr_lm, type = "autocorrelation")
plot_autocorrelation(mr_lm, smooth = TRUE)
plot(mr_lm, type = "autocorrelation", smooth = TRUE)
Plot of Cook’s distances, used to estimate the influence of a single observation.
plot_cooksdistance(object, ..., nlabel = 3)

plotCooksDistance(object, ..., nlabel = 3)
object |
An object of class |
... |
Other objects of class |
nlabel |
Number of observations with the biggest Cook's distances to be labeled. |
Cook’s distance is a tool for identifying observations that may negatively affect the model. It may also be used to indicate regions of the design space where it would be good to obtain more observations. Data points indicated by Cook’s distances are worth checking for validity.
Cook’s distances are calculated by removing the i-th observation from the data and recalculating the model. Each distance shows how much all the fitted values in the model change when the i-th observation is removed.
For model classes other than lm and glm the distances are computed directly from the definition.
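The leave-one-out definition described above can be sketched in base R as follows. This is an illustrative sketch for a linear model, not auditor's own code; it relies on update() to refit the model, and the data and the name cooks_manual are hypothetical. For lm the result can be compared with the closed-form values from stats::cooks.distance().

```r
# Hypothetical illustration: Cook's distance by leave-one-out refitting.
set.seed(3)
d <- data.frame(x = rnorm(50))
d$y <- 1 + 2 * d$x + rnorm(50)

fit <- lm(y ~ x, data = d)
p   <- length(coef(fit))                 # number of estimated coefficients
s2  <- summary(fit)$sigma^2              # residual variance estimate
y_hat <- fitted(fit)

cooks_manual <- vapply(seq_len(nrow(d)), function(i) {
  fit_i <- update(fit, data = d[-i, ])   # refit without observation i
  y_hat_i <- predict(fit_i, newdata = d) # predictions for all observations
  sum((y_hat - y_hat_i)^2) / (p * s2)
}, numeric(1))

# For lm the closed-form values from stats::cooks.distance() should agree
all.equal(unname(cooks_manual), unname(cooks.distance(fit)))
```

Refitting once per observation is slow for large data sets, which is why closed-form shortcuts are preferable where they exist.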
A ggplot object.
Cook, R. Dennis (1977). "Detection of Influential Observations in Linear Regression". doi:10.2307/1268249.
dragons <- DALEX::dragons[1:100, ]

# fit a model
model_lm <- lm(life_length ~ ., data = dragons)

lm_audit <- audit(model_lm, data = dragons, y = dragons$life_length)

# validate a model with auditor
library(auditor)
cd_lm <- model_cooksdistance(lm_audit)

# plot results
plot_cooksdistance(cd_lm)
plot(cd_lm, type = "cooksdistance")
Matrix of plots. The lower-left triangle consists of plots of fitted values (alternatively residuals), the diagonal contains density plots of fitted values (alternatively residuals), and the upper-right triangle shows correlations between fitted values (alternatively residuals).
plot_correlation(object, ..., values = "fit")

plotModelCorrelation(object, ..., values = "fit")
object |
An object of class |
... |
Other |
values |
"fit" for model fitted values or "res" for residual values. |
Invisibly returns a gtable object.
dragons <- DALEX::dragons[1:100, ]

# fit a model
model_lm <- lm(life_length ~ ., data = dragons)

# use DALEX package to wrap up a model into explainer
lm_audit <- audit(model_lm, data = dragons, y = dragons$life_length)

# validate a model with auditor
mr_lm <- model_residual(lm_audit)

library(randomForest)
model_rf <- randomForest(life_length ~ ., data = dragons)
rf_audit <- audit(model_rf, data = dragons, y = dragons$life_length)
mr_rf <- model_residual(rf_audit)

# plot results
plot_correlation(mr_lm, mr_rf)
plot(mr_lm, mr_rf, type = "correlation")
The half-normal plot is one of the tools designed to evaluate the goodness of fit of a statistical model. It is a graphical method for comparing two probability distributions by plotting their quantiles against each other. Points on the plot correspond to ordered absolute values of a model diagnostic (e.g. standardized residuals) plotted against theoretical order statistics from a half-normal distribution.
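The coordinates of such a plot can be sketched in base R. The example below is an illustration only: it uses rstandard() as the diagnostic and one common approximation of the expected half-normal order statistics, both of which are assumptions, and it omits the simulated envelope. The data and the name hn are hypothetical.

```r
# Hypothetical illustration: coordinates of a half-normal plot for an lm fit.
set.seed(4)
d <- data.frame(x = rnorm(80))
d$y <- 1 + 2 * d$x + rnorm(80)
fit <- lm(y ~ x, data = d)

n <- nrow(d)
abs_res <- sort(abs(rstandard(fit)))   # ordered absolute diagnostics
# Approximate expected half-normal order statistics
# (these plotting positions are one common choice, assumed here)
theoretical <- qnorm((seq_len(n) + n - 1/8) / (2 * n + 1/2))

hn <- data.frame(theoretical = theoretical, observed = abs_res)
head(hn)
# plot(hn$theoretical, hn$observed) would draw the points of the plot
```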
plot_halfnormal(object, ..., quantiles = FALSE, sim = 99)

plotHalfNormal(object, ..., quantiles = FALSE, sim = 99)
object |
An object of class |
... |
Other |
quantiles |
If TRUE values on axis are on quantile scale. |
sim |
Number of residuals to simulate. |
A ggplot object.
dragons <- DALEX::dragons[1:100, ]

# fit a model
model_lm <- lm(life_length ~ ., data = dragons)

lm_audit <- audit(model_lm, data = dragons, y = dragons$life_length)

# validate a model with auditor
hn_lm <- model_halfnormal(lm_audit)

# plot results
plot_halfnormal(hn_lm)
plot(hn_lm)
LIFT is a plot of the rate of positive prediction against the true positive rate for different thresholds. It is useful for measuring and comparing the accuracy of classifiers.
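The two quantities named in this description can be computed by hand. The base R sketch below sorts cases by predicted probability and, for every possible threshold, records the rate of positive prediction (rpp) and the true positive rate (tpr). It follows the description above rather than auditor's internals, and the simulated data and the name lift_points are hypothetical.

```r
# Hypothetical illustration: points of an rpp vs tpr curve over all thresholds.
set.seed(5)
n <- 500
x <- rnorm(n)
y <- rbinom(n, 1, plogis(1.5 * x))
fit <- glm(y ~ x, family = binomial)
score <- predict(fit, type = "response")

ord <- order(score, decreasing = TRUE)   # most confident positives first
y_sorted <- y[ord]
k <- seq_len(n)                          # number of cases flagged as positive

rpp <- k / n                             # rate of positive predictions
tpr <- cumsum(y_sorted) / sum(y_sorted)  # true positive rate
lift_points <- data.frame(rpp = rpp, tpr = tpr)
head(lift_points)
```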
plot_lift(object, ..., zeros = TRUE)

plotLIFT(object, ...)
object |
An object of class |
... |
Other |
zeros |
Logical. It makes the lines start from the |
A ggplot object.
data(titanic_imputed, package = "DALEX")

# fit a model
model_glm <- glm(survived ~ ., family = binomial, data = titanic_imputed)

glm_audit <- audit(model_glm,
                   data = titanic_imputed,
                   y = titanic_imputed$survived)

# validate a model with auditor
eva_glm <- model_evaluation(glm_audit)

# plot results
plot_lift(eva_glm)
plot(eva_glm, type = "lift")

model_glm_2 <- glm(survived ~ . - age, family = binomial, data = titanic_imputed)
glm_audit_2 <- audit(model_glm_2,
                     data = titanic_imputed,
                     y = titanic_imputed$survived,
                     label = "glm2")
eva_glm_2 <- model_evaluation(glm_audit_2)

plot_lift(eva_glm, eva_glm_2)
plot(eva_glm, eva_glm_2, type = "lift")
Principal Component Analysis of models' residuals. PCA can be used to assess the similarity of the models.
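A minimal base R sketch of the idea: collect the residuals of several models as columns of a matrix and run prcomp on it. Models whose loadings point in similar directions make similar errors. This is an illustration under stated assumptions, not the code behind plot_pca; the three models and the simulated data are hypothetical.

```r
# Hypothetical illustration: PCA on a matrix whose columns are model residuals.
set.seed(6)
d <- data.frame(x1 = rnorm(100), x2 = rnorm(100))
d$y <- 1 + 2 * d$x1 - d$x2^2 + rnorm(100)

fit_a <- lm(y ~ x1, data = d)
fit_b <- lm(y ~ x1 + x2, data = d)
fit_c <- lm(y ~ x1 + I(x2^2), data = d)

res_mat <- cbind(a = residuals(fit_a),
                 b = residuals(fit_b),
                 c = residuals(fit_c))

pca <- prcomp(res_mat, scale. = TRUE)   # scale residuals before the analysis
pca$rotation                             # loadings: one row per model
summary(pca)$importance                  # variance explained by each component
```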
plot_pca(object, ..., scale = TRUE, arrow_size = 2)

plotModelPCA(object, ..., scale = TRUE)
object |
An object of class |
... |
Other |
scale |
A logical value indicating whether the models residuals should be scaled before the analysis. |
arrow_size |
Width of the arrows. |
A ggplot object.
dragons <- DALEX::dragons[1:100, ]

# fit a model
model_lm <- lm(life_length ~ ., data = dragons)

lm_audit <- audit(model_lm, data = dragons, y = dragons$life_length)

# validate a model with auditor
mr_lm <- model_residual(lm_audit)

library(randomForest)
model_rf <- randomForest(life_length ~ ., data = dragons)
rf_audit <- audit(model_rf, data = dragons, y = dragons$life_length)
mr_rf <- model_residual(rf_audit)

# plot results
plot_pca(mr_lm, mr_rf)
The Precision-Recall Curve summarizes the trade-off between the true positive rate and the positive predictive value for a model. It is useful for measuring performance and comparing classifiers.
The Receiver Operating Characteristic Curve is a plot of the true positive rate (TPR) against the false positive rate (FPR) for different thresholds. It is useful for measuring and comparing the accuracy of classifiers.
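Both curves are built from the same threshold sweep, which can be sketched in base R. The example below computes TPR, FPR, and precision for every cutoff and a trapezoidal area under the ROC curve. It is an illustration only, assuming no tied scores; the simulated data and variable names are hypothetical and not part of auditor.

```r
# Hypothetical illustration: ROC and precision-recall points over all thresholds.
set.seed(7)
n <- 500
x <- rnorm(n)
y <- rbinom(n, 1, plogis(1.5 * x))
fit <- glm(y ~ x, family = binomial)
score <- predict(fit, type = "response")

ord <- order(score, decreasing = TRUE)
y_sorted <- y[ord]
tp <- cumsum(y_sorted)            # true positives when the top k are flagged
fp <- cumsum(1 - y_sorted)        # false positives when the top k are flagged

tpr <- tp / sum(y)                # true positive rate (recall)
fpr <- fp / sum(1 - y)            # false positive rate
precision <- tp / (tp + fp)       # positive predictive value

# Area under the ROC curve by the trapezoidal rule, starting from (0, 0)
auc <- sum(diff(c(0, fpr)) * (c(0, tpr[-n]) + tpr) / 2)
auc
```

Plotting tpr against fpr gives the ROC curve, while plotting precision against tpr gives the precision-recall curve.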
plot_prc(object, ..., nlabel = NULL)

plot_roc(object, ..., nlabel = NULL)

plotROC(object, ..., nlabel = NULL)
object |
An object of class |
... |
Other |
nlabel |
Number of cutoff points to show on the plot. Default is |
A ggplot object.
library(DALEX)

# fit a model
model_glm <- glm(survived ~ ., family = binomial, data = titanic_imputed)

glm_audit <- audit(model_glm,
                   data = titanic_imputed,
                   y = titanic_imputed$survived)

# validate a model with auditor
eva_glm <- model_evaluation(glm_audit)

# plot results
plot_prc(eva_glm)
plot(eva_glm)

# add second model
model_glm_2 <- glm(survived ~ . - age, family = binomial, data = titanic_imputed)
glm_audit_2 <- audit(model_glm_2,
                     data = titanic_imputed,
                     y = titanic_imputed$survived,
                     label = "glm2")
eva_glm_2 <- model_evaluation(glm_audit_2)

plot_prc(eva_glm, eva_glm_2)
plot(eva_glm, eva_glm_2)

data(titanic_imputed, package = "DALEX")

# fit a model
model_glm <- glm(survived ~ ., family = binomial, data = titanic_imputed)

# use DALEX package to wrap up a model into explainer
glm_audit <- audit(model_glm,
                   data = titanic_imputed,
                   y = titanic_imputed$survived)

# validate a model with auditor
eva_glm <- model_evaluation(glm_audit)

# plot results
plot_roc(eva_glm)
plot(eva_glm)

# add second model
model_glm_2 <- glm(survived ~ . - age, family = binomial, data = titanic_imputed)
glm_audit_2 <- audit(model_glm_2,
                     data = titanic_imputed,
                     y = titanic_imputed$survived,
                     label = "glm2")
eva_glm_2 <- model_evaluation(glm_audit_2)

plot_roc(eva_glm, eva_glm_2)
plot(eva_glm, eva_glm_2)
Plot of predicted response vs observed values or values of a selected variable.
plot_prediction(object, ..., variable = "_y_", smooth = FALSE, abline = FALSE)

plotPrediction(object, ..., variable = NULL, smooth = FALSE, abline = FALSE)
object |
An object of class |
... |
Other |
variable |
Name of variable to order residuals on a plot.
If |
smooth |
Logical, indicates whether a smooth line should be added. |
abline |
Logical, indicates whether function |
A ggplot object.
dragons <- DALEX::dragons[1:100, ]

# fit a model
model_lm <- lm(life_length ~ ., data = dragons)

lm_audit <- audit(model_lm, data = dragons, y = dragons$life_length)

# validate a model with auditor
mr_lm <- model_residual(lm_audit)

# plot results
plot_prediction(mr_lm, abline = TRUE)
plot_prediction(mr_lm, variable = "height", smooth = TRUE)
plot(mr_lm, type = "prediction", abline = TRUE)

library(randomForest)
model_rf <- randomForest(life_length ~ ., data = dragons)
rf_audit <- audit(model_rf, data = dragons, y = dragons$life_length)
mr_rf <- model_residual(rf_audit)
plot_prediction(mr_lm, mr_rf, variable = "height", smooth = TRUE)
Radar plot with model scores. Scores are scaled to [0,1]: each score is inverted and divided by the maximum score value.
plot_radar(object, ..., verbose = TRUE)
plotModelRanking(object, ..., verbose = TRUE)
object |
An object of class |
... |
Other |
verbose |
Logical, indicates whether values of scores should be printed. |
A ggplot object.
dragons <- DALEX::dragons[1:100, ]
# fit a model
model_lm <- lm(life_length ~ ., data = dragons)
lm_audit <- audit(model_lm, data = dragons, y = dragons$life_length)
# validate a model with auditor
mp_lm <- model_performance(lm_audit)
library(randomForest)
model_rf <- randomForest(life_length ~ ., data = dragons)
rf_audit <- audit(model_rf, data = dragons, y = dragons$life_length)
mp_rf <- model_performance(rf_audit)
# plot results
plot_radar(mp_lm, mp_rf)
Regression Error Characteristic (REC) curves are a generalization of ROC curves. The x axis shows the error tolerance and the y axis shows the percentage of observations predicted within the given tolerance.
plot_rec(object, ...)
plotREC(object, ...)
object |
An object of class |
... |
Other |
The REC curve estimates the Cumulative Distribution Function (CDF) of the error.
The Area Over the REC Curve (AOC) is a biased estimate of the expected error.
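The two statements above can be checked by hand. The following base-R sketch does not use auditor itself: it builds an REC curve as the empirical CDF of the absolute errors and computes the area over it. The `cars` data set, the `lm` fit, and all variable names are assumptions chosen only for illustration.

```r
# Minimal sketch: REC curve as the empirical CDF of absolute errors.
# The `cars` data set and the lm() fit are illustrative assumptions.
fit <- lm(dist ~ speed, data = cars)
abs_err <- abs(residuals(fit))

# x axis: error tolerance; y axis: share of observations within the tolerance
tolerance <- sort(unique(c(0, abs_err)))
accuracy <- ecdf(abs_err)(tolerance)
rec <- data.frame(tolerance = tolerance, accuracy = accuracy)

# Area over the step curve: width of each tolerance interval
# times the share of observations not yet within tolerance
aoc <- sum(diff(tolerance) * (1 - accuracy[-length(accuracy)]))
```

For an empirical step curve this area coincides with the sample mean absolute error, which is the sense in which the area over the curve estimates the expected error.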
A ggplot object.
Bi J., Bennett K.P. (2003). Regression error characteristic curves, in: Twentieth International Conference on Machine Learning (ICML-2003), Washington, DC.
dragons <- DALEX::dragons[1:100, ]
# fit a model
model_lm <- lm(life_length ~ ., data = dragons)
# use DALEX package to wrap up a model into explainer
lm_audit <- audit(model_lm, data = dragons, y = dragons$life_length)
# validate a model with auditor
mr_lm <- model_residual(lm_audit)
plot_rec(mr_lm)
plot(mr_lm, type = "rec")
library(randomForest)
model_rf <- randomForest(life_length ~ ., data = dragons)
rf_audit <- audit(model_rf, data = dragons, y = dragons$life_length)
mr_rf <- model_residual(rf_audit)
plot_rec(mr_lm, mr_rf)
plot(mr_lm, mr_rf, type = "rec")
A plot of residuals against fitted values, observed values or any variable.
plot_residual(object, ..., variable = "_y_", smooth = FALSE, std_residuals = FALSE, nlabel = 0)
plotResidual(object, ..., variable = NULL, smooth = FALSE, std_residuals = FALSE, nlabel = 0)
object |
An object of class |
... |
Other |
variable |
Name of variable to order residuals on a plot.
If |
smooth |
Logical, indicates whether smoothed lines should be added. By default it's |
std_residuals |
Logical, indicates whether standardized residuals should be used. |
nlabel |
Number of observations with the biggest absolute values of residuals to be labeled. |
dragons <- DALEX::dragons[1:100, ]
# fit a model
model_lm <- lm(life_length ~ ., data = dragons)
lm_audit <- audit(model_lm, data = dragons, y = dragons$life_length)
# validate a model with auditor
mr_lm <- model_residual(lm_audit)
# plot results
plot_residual(mr_lm)
plot(mr_lm, type = "residual")
library(randomForest)
model_rf <- randomForest(life_length ~ ., data = dragons)
rf_audit <- audit(model_rf, data = dragons, y = dragons$life_length)
mr_rf <- model_residual(rf_audit)
plot_residual(mr_lm, mr_rf)
plot(mr_lm, mr_rf, type = "residual")
A boxplot of residuals.
plot_residual_boxplot(object, ...)
plotResidualBoxplot(object, ...)
object |
An object of class |
... |
Other |
A ggplot object.
dragons <- DALEX::dragons[1:100, ]
# fit a model
model_lm <- lm(life_length ~ ., data = dragons)
lm_audit <- audit(model_lm, data = dragons, y = dragons$life_length)
# validate a model with auditor
mr_lm <- model_residual(lm_audit)
# plot results
plot_residual_boxplot(mr_lm)
plot(mr_lm, type = "residual_boxplot")
library(randomForest)
model_rf <- randomForest(life_length ~ ., data = dragons)
rf_audit <- audit(model_rf, data = dragons, y = dragons$life_length)
mr_rf <- model_residual(rf_audit)
plot_residual_boxplot(mr_lm, mr_rf)
plot(mr_lm, mr_rf, type = "residual_boxplot")
Density of model residuals.
plot_residual_density(object, ..., variable = "", show_rugs = TRUE)
plotResidualDensity(object, ..., variable = NULL)
object |
An object of class |
... |
Other |
variable |
Split plot by variable's factor level or median.
If |
show_rugs |
Logical, indicates whether a rug layer should be added to the plot. By default it's TRUE. |
A ggplot object.
dragons <- DALEX::dragons[1:100, ]
# fit a model
model_lm <- lm(life_length ~ ., data = dragons)
lm_audit <- audit(model_lm, data = dragons, y = dragons$life_length)
# validate a model with auditor
mr_lm <- model_residual(lm_audit)
# plot results
plot_residual_density(mr_lm)
plot(mr_lm, type = "residual_density")
library(randomForest)
model_rf <- randomForest(life_length ~ ., data = dragons)
rf_audit <- audit(model_rf, data = dragons, y = dragons$life_length)
mr_rf <- model_residual(rf_audit)
plot_residual_density(mr_lm, mr_rf)
plot(mr_lm, mr_rf, type = "residual_density")
The basic idea of the ROC curves for regression is to show model asymmetry. The RROC is a plot where on the x-axis we depict total over-estimation and on the y-axis total under-estimation.
plot_rroc(object, ...)
plotRROC(object, ...)
object |
An object of class |
... |
Other |
For RROC curves we use a shift, which is equivalent to the threshold for ROC curves.
For each observation we calculate a new prediction: $\hat{y}'_i = \hat{y}_i + s$, where $s$ is the shift.
Therefore, there are different error values for each shift: $e_i = \hat{y}'_i - y_i$.
Over-estimation is calculated as: $OVER = \sum (e_i \mid e_i > 0)$.
Under-estimation is calculated as: $UNDER = \sum (e_i \mid e_i < 0)$.
The shift equal to 0 is represented by a dot.
The Area Over the RROC Curve (AOC) equals the variance of the errors multiplied by $\frac{n^2}{2}$.
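A minimal base-R sketch of the construction described above, assuming the shifted error is the prediction plus the shift minus the observed value. It does not call auditor; the `cars` data, the `lm` fit, and the helper `rroc_point` are hypothetical names used only for illustration.

```r
# Minimal sketch of RROC points; `cars` and lm() are illustrative assumptions.
fit <- lm(dist ~ speed, data = cars)
y <- cars$dist
y_hat <- unname(fitted(fit))

# For a shift s the error of each observation is e = (y_hat + s) - y;
# over-estimation sums the positive errors, under-estimation the negative ones
rroc_point <- function(s) {
  e <- y_hat + s - y
  c(shift = s, over = sum(e[e > 0]), under = sum(e[e < 0]))
}

# Shifts at which one error changes sign, plus the unshifted model (s = 0)
shifts <- sort(c(0, -(y_hat - y)))
rroc <- as.data.frame(t(sapply(shifts, rroc_point)))

# The point for shift 0, drawn as a dot on the plot
rroc_point(0)
```

Plotting `rroc$over` against `rroc$under` traces the curve from the point with no over-estimation to the point with no under-estimation.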
A ggplot object.
Hernández-Orallo, José. 2013. "ROC Curves for Regression". Pattern Recognition 46 (12): 3395–3411.
dragons <- DALEX::dragons[1:100, ]
# fit a model
model_lm <- lm(life_length ~ ., data = dragons)
lm_audit <- audit(model_lm, data = dragons, y = dragons$life_length)
# validate a model with auditor
mr_lm <- model_residual(lm_audit)
# plot results
plot_rroc(mr_lm)
plot(mr_lm, type = "rroc")
library(randomForest)
model_rf <- randomForest(life_length ~ ., data = dragons)
rf_audit <- audit(model_rf, data = dragons, y = dragons$life_length)
mr_rf <- model_residual(rf_audit)
plot_rroc(mr_lm, mr_rf)
plot(mr_lm, mr_rf, type = "rroc")
Variable values vs square root of the absolute value of the residuals. A vertical line corresponds to median.
plot_scalelocation(object, ..., variable = "_y_", smooth = FALSE, peaks = FALSE)
plotScaleLocation(object, ..., variable = NULL, smooth = FALSE, peaks = FALSE)
object |
An object of class |
... |
Other |
variable |
Name of variable to order residuals on a plot.
If |
smooth |
Logical, indicates whether smoothed lines should be added. By default it's |
peaks |
A logical value. If |
A ggplot object.
dragons <- DALEX::dragons[1:100, ]
# fit a model
model_lm <- lm(life_length ~ ., data = dragons)
lm_audit <- audit(model_lm, data = dragons, y = dragons$life_length)
# validate a model with auditor
mr_lm <- model_residual(lm_audit)
# plot results
plot_scalelocation(mr_lm)
plot(mr_lm, type = "scalelocation")
Cumulative Distribution Function for positive and negative residuals.
plot_tsecdf(object, ..., scale_error = TRUE, outliers = NA, residuals = TRUE, reverse_y = FALSE)
plotTwoSidedECDF(object, ..., scale_error = TRUE, outliers = NA, residuals = TRUE, reverse_y = FALSE)
object |
An object of class 'auditor_model_residual' created with |
... |
Other 'auditor_model_residual' objects to be plotted together. |
scale_error |
A logical value indicating whether the ECDF should be scaled by the proportions of positive and negative residuals. |
outliers |
Number of outliers to be marked. |
residuals |
A logical value indicating whether residuals should be marked. |
reverse_y |
A logical value indicating whether values on y axis should be reversed. |
A ggplot object.
dragons <- DALEX::dragons[1:100, ]
# fit a model
model_lm <- lm(life_length ~ ., data = dragons)
lm_audit <- audit(model_lm, data = dragons, y = dragons$life_length)
# validate a model with auditor
mr_lm <- model_residual(lm_audit)
plot_tsecdf(mr_lm)
plot(mr_lm, type = "tsecdf")
library(randomForest)
model_rf <- randomForest(life_length ~ ., data = dragons)
rf_audit <- audit(model_rf, data = dragons, y = dragons$life_length)
mr_rf <- model_residual(rf_audit)
plot_tsecdf(mr_lm, mr_rf, reverse_y = TRUE)
This function provides several diagnostic plots for regression and classification models. Provide an object created with one of auditor's computational functions: model_residual, model_cooksdistance, model_evaluation, model_performance, or model_halfnormal.
plotD3(x, ...)
plotD3_auditor(x, ..., type = "residual")
## S3 method for class 'auditor_model_residual'
plotD3(x, ..., type = "residual")
## S3 method for class 'auditor_model_halfnormal'
plotD3(x, ..., type = "residual")
## S3 method for class 'auditor_model_evaluation'
plotD3(x, ..., type = "residual")
## S3 method for class 'auditor_model_cooksdistance'
plotD3(x, ..., type = "residual")
x |
object of class |
... |
other arguments dependent on the type of plot or additional objects of classes |
type |
the type of plot. Single character. Possible values:
|
plotD3_acf, plotD3_autocorrelation, plotD3_cooksdistance,
plotD3_halfnormal, plotD3_residual, plotD3_lift,
plotD3_prediction, plotD3_rec, plotD3_roc,
plotD3_rroc, plotD3_scalelocation
dragons <- DALEX::dragons[1:100, ]
# fit a model
model_lm <- lm(life_length ~ ., data = dragons)
lm_audit <- audit(model_lm, data = dragons, y = dragons$life_length)
# validate a model with auditor
mr_lm <- model_residual(lm_audit)
# plot results
plotD3(mr_lm)
plotD3(mr_lm, type = "prediction")
hn_lm <- model_halfnormal(lm_audit)
plotD3(hn_lm)
Plot Autocorrelation Function of models' residuals.
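To make the plotted quantity concrete, here is a base-R sketch that computes the autocorrelation function of residuals with `stats::acf` and the usual white-noise confidence band. It is not the package's implementation; the `cars` data, the `lm` fit, and the ordering by `speed` are assumptions made for illustration, and `alpha` merely mirrors the confidence-level argument documented below.

```r
# Minimal sketch: autocorrelation function of model residuals.
# `cars`, the lm() fit and the ordering variable are illustrative assumptions.
fit <- lm(dist ~ speed, data = cars)

# Order residuals by a chosen variable before computing autocorrelations
res <- residuals(fit)[order(cars$speed)]

acf_res <- acf(res, plot = FALSE)
acf_tab <- data.frame(lag = as.vector(acf_res$lag),
                      acf = as.vector(acf_res$acf))

# Approximate confidence band under the white-noise assumption
alpha <- 0.95
band <- qnorm((1 + alpha) / 2) / sqrt(length(res))

# Lags whose autocorrelation falls outside the band (lag 0 excluded)
acf_tab[acf_tab$lag > 0 & abs(acf_tab$acf) > band, ]
```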
plotD3_acf(object, ..., variable = NULL, alpha = 0.95, scale_plot = FALSE)
plotD3ACF(object, ..., variable = NULL, alpha = 0.95, scale_plot = FALSE)
object |
An object of class 'auditor_model_residual' created with |
... |
Other 'auditor_model_residual' objects to be plotted together. |
variable |
Name of variable to order residuals on a plot.
If |
alpha |
Confidence level of the interval. |
scale_plot |
Logical, indicates whether the plot should scale with height. By default it's FALSE. |
An 'r2d3' object.
dragons <- DALEX::dragons[1:100, ]
# fit a model
model_lm <- lm(life_length ~ ., data = dragons)
lm_audit <- audit(model_lm, data = dragons, y = dragons$life_length)
# validate a model with auditor
mr_lm <- model_residual(lm_audit)
# plot results
plotD3_acf(mr_lm)
library(randomForest)
model_rf <- randomForest(life_length ~ ., data = dragons)
rf_audit <- audit(model_rf, data = dragons, y = dragons$life_length)
mr_rf <- model_residual(rf_audit)
plotD3_acf(mr_lm, mr_rf)
Plot of i-th residual vs i+1-th residual.
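The pairs shown on such a plot can be built in a couple of lines of base R. This is only a sketch under assumed inputs (the `cars` data and an `lm` fit, with hypothetical column names), not the package's code.

```r
# Minimal sketch: pair each residual with its successor.
# `cars` and the lm() fit are illustrative assumptions.
fit <- lm(dist ~ speed, data = cars)
res <- unname(residuals(fit))
n <- length(res)

# i-th residual on the x axis, (i+1)-th residual on the y axis
lagged <- data.frame(res_i = res[-n], res_next = res[-1])

# Correlation of the pairs; values far from 0 hint at autocorrelated residuals
lag1_cor <- cor(lagged$res_i, lagged$res_next)
```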
plotD3_autocorrelation(object, ..., variable = NULL, points = TRUE, smooth = FALSE, point_count = NULL, single_plot = TRUE, scale_plot = FALSE, background = FALSE)
plotD3Autocorrelation(object, ..., variable = NULL, points = TRUE, smooth = FALSE, point_count = NULL, single_plot = TRUE, scale_plot = FALSE, background = FALSE)
object |
An object of class 'auditor_model_residual' created with |
... |
Other 'auditor_model_residual' objects to be plotted together. |
variable |
Name of variable to order residuals on a plot.
If |
points |
Logical, indicates whether observations should be added as points. By default it's TRUE. |
smooth |
Logical, indicates whether smoothed lines should be added. By default it's FALSE. |
point_count |
Number of points to be plotted per model. Points will be chosen randomly. By default all of them are plotted. |
single_plot |
Logical, indicates whether a single plot or facets should be plotted. By default it's TRUE. |
scale_plot |
Logical, indicates whether the plot should scale with height. By default it's FALSE. |
background |
Logical, available only if single_plot = FALSE. Indicates whether background plots should be plotted. By default it's FALSE. |
An r2d3 object.
dragons <- DALEX::dragons[1:100, ]
# fit a model
model_lm <- lm(life_length ~ ., data = dragons)
lm_audit <- audit(model_lm, data = dragons, y = dragons$life_length)
# validate a model with auditor
mr_lm <- model_residual(lm_audit)
# plot results
plotD3_autocorrelation(mr_lm)
plotD3_autocorrelation(mr_lm, smooth = TRUE)
Plot of Cook’s distances, used to estimate the influence of a single observation.
plotD3_cooksdistance(object, ..., nlabel = 3, single_plot = FALSE, scale_plot = FALSE, background = FALSE)
plotD3CooksDistance(object, ..., nlabel = 3, single_plot = FALSE, scale_plot = FALSE, background = FALSE)
object |
An object of class 'auditor_model_cooksdistance' created with |
... |
Other objects of class 'auditor_model_cooksdistance'. |
nlabel |
Number of observations with the biggest Cook's distances to be labeled. |
single_plot |
Logical, indicates whether a single plot or facets should be plotted. By default it's FALSE. |
scale_plot |
Logical, indicates whether the plot should scale with height. By default it's FALSE. |
background |
Logical, available only if single_plot = FALSE. Indicates whether background plots should be plotted. By default it's FALSE. |
Cook’s distance is a tool for identifying observations that may negatively affect the model. They may be also used for indicating regions of the design space where it would be good to obtain more observations. Data points indicated by Cook’s distances are worth checking for validity.
Cook’s Distances are calculated by removing the i-th observation from the data and recalculating the model. It shows how much all the values in the model change when the i-th observation is removed.
For model classes other than lm and glm the distances are computed directly from the definition.
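As an illustration of computing the distances directly from the definition, the base-R sketch below refits a linear model without each observation in turn and measures how much all fitted values move. It is a sketch under assumed inputs (the `cars` data and an `lm` fit), not the package's internal code; for `lm` the result can be compared with `stats::cooks.distance`.

```r
# Minimal sketch: Cook's distances by leave-one-out refitting.
# `cars` and the lm() fit are illustrative assumptions.
fit <- lm(dist ~ speed, data = cars)
n <- nrow(cars)
p <- length(coef(fit))                   # number of estimated coefficients
s2 <- sum(residuals(fit)^2) / (n - p)    # residual mean square of the full model
y_hat <- fitted(fit)

cooks_manual <- sapply(seq_len(n), function(i) {
  fit_i <- lm(dist ~ speed, data = cars[-i, ])   # refit without observation i
  y_hat_i <- predict(fit_i, newdata = cars)      # predictions for all observations
  sum((y_hat - y_hat_i)^2) / (p * s2)            # scaled change in fitted values
})

# Observations with the largest influence
head(order(cooks_manual, decreasing = TRUE), 3)
```

The refitting loop costs one model fit per observation, which is why closed-form shortcuts are preferred for `lm` and `glm` when they are available.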
An r2d3 object.
Cook, R. Dennis (1977). "Detection of Influential Observations in Linear Regression". doi:10.2307/1268249.
dragons <- DALEX::dragons[1:100, ]
# fit a model
model_lm <- lm(life_length ~ ., data = dragons)
lm_audit <- audit(model_lm, data = dragons, y = dragons$life_length)
# validate a model with auditor
cd_lm <- model_cooksdistance(lm_audit)
# plot results
plotD3_cooksdistance(cd_lm, nlabel = 5)
The half-normal plot is one of the tools designed to evaluate the goodness of fit of statistical models. It is a graphical method for comparing two probability distributions by plotting their quantiles against each other. Points on the plot correspond to ordered absolute values of a model diagnostic (e.g. standardized residuals) plotted against theoretical order statistics from a half-normal distribution.
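The coordinates of such a plot can be sketched in base R as follows. The plotting positions from `ppoints` and the use of `rstandard` are assumptions of this sketch (the package may use different positions and adds a simulated envelope), as are the `cars` data and the `lm` fit.

```r
# Minimal sketch: coordinates of a half-normal plot.
# `cars`, the lm() fit and the ppoints() plotting positions are assumptions.
fit <- lm(dist ~ speed, data = cars)
n <- nrow(cars)

# Ordered absolute values of the diagnostic (standardized residuals here)
observed <- sort(abs(unname(rstandard(fit))))

# If Z ~ N(0, 1), the p-quantile of |Z| is qnorm((1 + p) / 2)
theoretical <- qnorm((1 + ppoints(n)) / 2)

hn <- data.frame(theoretical = theoretical, observed = observed)
```

Points lying close to a straight line suggest that the diagnostic behaves like the absolute value of a normal variable.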
plotD3_halfnormal(object, ..., quantiles = FALSE, sim = 99, scale_plot = FALSE)
plotD3HalfNormal(object, ..., quantiles = FALSE, sim = 99, scale_plot = FALSE)
object |
An object of class 'auditor_model_halfnormal' created with |
... |
Other 'auditor_model_halfnormal' objects. |
quantiles |
If TRUE values on axis are on quantile scale. |
sim |
Number of residuals to simulate. |
scale_plot |
Logical, indicates whether the plot should scale with height. By default it's FALSE. |
An r2d3 object.
score_halfnormal, plot_halfnormal
dragons <- DALEX::dragons[1:100, ]
# fit a model
model_lm <- lm(life_length ~ ., data = dragons)
lm_audit <- audit(model_lm, data = dragons, y = dragons$life_length)
# validate a model with auditor
hn_lm <- model_halfnormal(lm_audit)
# plot results
plotD3_halfnormal(hn_lm)
LIFT is a plot of the rate of positive prediction against the true positive rate for different thresholds. It is useful for measuring and comparing the accuracy of classifiers.
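The two quantities named above can be computed for every threshold with a few lines of base R. The labels and scores below are invented toy data and the column names are hypothetical; this is a sketch of the definition, not the package's code.

```r
# Minimal sketch: rate of positive predictions and true positive rate per threshold.
# Labels and scores are invented toy data.
y <- c(1, 1, 0, 1, 0, 0, 1, 0)
score <- c(0.9, 0.8, 0.7, 0.6, 0.4, 0.35, 0.3, 0.1)

thresholds <- sort(unique(score), decreasing = TRUE)
lift <- as.data.frame(t(sapply(thresholds, function(th) {
  predicted_pos <- score >= th
  c(threshold = th,
    rpp = mean(predicted_pos),                        # rate of positive predictions
    tpr = sum(predicted_pos & y == 1) / sum(y == 1))  # true positive rate
})))
```

A model that ranks positives first reaches a high `tpr` while `rpp` is still small, so its curve rises steeply at the left of the plot.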
plotD3_lift(object, ..., scale_plot = FALSE, zeros = TRUE)
plotD3LIFT(object, ..., scale_plot = FALSE)
object |
An object of class 'auditor_model_evaluation' created with |
... |
Other 'auditor_model_evaluation' objects to be plotted together. |
scale_plot |
Logical, indicates whether the plot should scale with height. By default it's FALSE. |
zeros |
Logical. It makes the lines start from the |
An r2d3 object.
data(titanic_imputed, package = "DALEX")
# fit a model
model_glm <- glm(survived ~ ., family = binomial, data = titanic_imputed)
glm_audit <- audit(model_glm, data = titanic_imputed, y = titanic_imputed$survived)
# validate a model with auditor
eva_glm <- model_evaluation(glm_audit)
# plot results
plot_roc(eva_glm)
plot(eva_glm)
# add second model
model_glm_2 <- glm(survived ~ . - age, family = binomial, data = titanic_imputed)
glm_audit_2 <- audit(model_glm_2, data = titanic_imputed, y = titanic_imputed$survived, label = "glm2")
eva_glm_2 <- model_evaluation(glm_audit_2)
plotD3_lift(eva_glm, eva_glm_2)
Function plotD3_prediction plots predicted values vs observed or variable values in the model.
plotD3_prediction(object, ..., variable = "_y_", points = TRUE, smooth = FALSE, abline = FALSE, point_count = NULL, single_plot = TRUE, scale_plot = FALSE, background = FALSE)
plotD3Prediction(object, ..., variable = NULL, points = TRUE, smooth = FALSE, abline = FALSE, point_count = NULL, single_plot = TRUE, scale_plot = FALSE, background = FALSE)
object |
An object of class 'auditor_model_residual'. |
... |
Other 'auditor_model_residual' objects to be plotted together. |
variable |
Name of variable to order residuals on a plot.
If |
points |
Logical, indicates whether observations should be added as points. By default it's TRUE. |
smooth |
Logical, indicates whether smoothed lines should be added. By default it's FALSE. |
abline |
Logical, indicates whether the function y = x should be added. Works only
with |
point_count |
Number of points to be plotted per model. Points will be chosen randomly. By default plot all of them. |
single_plot |
Logical, indicates whether a single plot or facets should be plotted. By default it's TRUE. |
scale_plot |
Logical, indicates whether the plot should scale with height. By default it's FALSE. |
background |
Logical, available only if single_plot = FALSE. Indicates whether background plots should be plotted. By default it's FALSE. |
An r2d3 object.
dragons <- DALEX::dragons[1:100, ]
# fit a model
model_lm <- lm(life_length ~ ., data = dragons)
lm_audit <- audit(model_lm, data = dragons, y = dragons$life_length)
# validate a model with auditor
mr_lm <- model_residual(lm_audit)
# plot results
plotD3_prediction(mr_lm, abline = TRUE)
plotD3_prediction(mr_lm, variable = "height", smooth = TRUE)
library(randomForest)
model_rf <- randomForest(life_length ~ ., data = dragons)
rf_audit <- audit(model_rf, data = dragons, y = dragons$life_length)
mr_rf <- model_residual(rf_audit)
plotD3_prediction(mr_lm, mr_rf, variable = "weight", smooth = TRUE)
Regression Error Characteristic (REC) curves are a generalization of ROC curves. The x axis shows the error tolerance and the y axis shows the percentage of observations predicted within the given tolerance.
plotD3_rec(object, ..., scale_plot = FALSE)
plotD3REC(object, ..., scale_plot = FALSE)
object |
An object of class 'auditor_model_residual' created with |
... |
Other 'auditor_model_residual' objects to be plotted together. |
scale_plot |
Logical, indicates whether the plot should scale with height. By default it's FALSE. |
The REC curve estimates the Cumulative Distribution Function (CDF) of the error.
The Area Over the REC Curve (AOC) is a biased estimate of the expected error.
An r2d3 object.
Bi J., Bennett K.P. (2003). Regression error characteristic curves, in: Twentieth International Conference on Machine Learning (ICML-2003), Washington, DC.
dragons <- DALEX::dragons[1:100, ]
# fit a model
model_lm <- lm(life_length ~ ., data = dragons)
lm_audit <- audit(model_lm, data = dragons, y = dragons$life_length)
# validate a model with auditor
mr_lm <- model_residual(lm_audit)
plotD3_rec(mr_lm)
library(randomForest)
model_rf <- randomForest(life_length ~ ., data = dragons)
rf_audit <- audit(model_rf, data = dragons, y = dragons$life_length)
mr_rf <- model_residual(rf_audit)
plotD3_rec(mr_lm, mr_rf)
Function plotD3_residual plots residual values vs fitted, observed or variable values in the model.
plotD3_residual(object, ..., variable = "_y_", points = TRUE, smooth = FALSE, std_residuals = FALSE, nlabel = 0, point_count = NULL, single_plot = TRUE, scale_plot = FALSE, background = FALSE)
plotD3Residual(object, ..., variable = NULL, points = TRUE, smooth = FALSE, std_residuals = FALSE, point_count = NULL, single_plot = TRUE, scale_plot = FALSE, background = FALSE)
object |
An object of class 'auditor_model_residual' created with |
... |
Other 'auditor_model_residual' objects to be plotted together. |
variable |
Name of variable to order residuals on a plot.
If |
points |
Logical, indicates whether observations should be added as points. By default it's TRUE. |
smooth |
Logical, indicates whether smoothed lines should be added. By default it's FALSE. |
std_residuals |
Logical, indicates whether standardized residuals should be used. By default it's FALSE. |
nlabel |
Number of observations with the biggest residuals to be labeled. |
point_count |
Number of points to be plotted per model. Points will be chosen randomly. By default plot all of them. |
single_plot |
Logical, indicates whenever single or facets should be plotted. By default it's TRUE. |
scale_plot |
Logical, indicates whenever the plot should scale with height. By default it's FALSE. |
background |
Logical, available only if single_plot = FALSE. Indicates whenever background plots should be plotted. By default it's FALSE. |
a r2d3
object
dragons <- DALEX::dragons[1:100, ] # fit a model model_lm <- lm(life_length ~ ., data = dragons) # use DALEX package to wrap up a model into explainer lm_audit <- audit(model_lm, data = dragons, y = dragons$life_length) # validate a model with auditor mr_lm <- model_residual(lm_audit) # plot results plotD3_residual(mr_lm) library(randomForest) model_rf <- randomForest(life_length~., data = dragons) rf_audit <- audit(model_rf, data = dragons, y = dragons$life_length) mr_rf <- model_residual(rf_audit) plotD3_residual(mr_lm, mr_rf)
dragons <- DALEX::dragons[1:100, ] # fit a model model_lm <- lm(life_length ~ ., data = dragons) # use DALEX package to wrap up a model into explainer lm_audit <- audit(model_lm, data = dragons, y = dragons$life_length) # validate a model with auditor mr_lm <- model_residual(lm_audit) # plot results plotD3_residual(mr_lm) library(randomForest) model_rf <- randomForest(life_length~., data = dragons) rf_audit <- audit(model_rf, data = dragons, y = dragons$life_length) mr_rf <- model_residual(rf_audit) plotD3_residual(mr_lm, mr_rf)
The Receiver Operating Characteristic (ROC) curve is a plot of the true positive rate (TPR) against the false positive rate (FPR) for different thresholds. It is useful for measuring and comparing the accuracy of classifiers.
plotD3_roc(object, ..., nlabel = NULL, scale_plot = FALSE)
object |
An object of class |
... |
Other |
nlabel |
Number of cutoff points to show on the plot. Default is |
scale_plot |
Logical, indicates whether the plot should scale with height. By default it's |
An r2d3 object.
data(titanic_imputed, package = "DALEX")
# fit a model
model_glm <- glm(survived ~ ., family = binomial, data = titanic_imputed)
# use DALEX package to wrap up a model into explainer
glm_audit <- audit(model_glm, data = titanic_imputed, y = titanic_imputed$survived)
# validate a model with auditor
eva_glm <- model_evaluation(glm_audit)
# plot results
plot_roc(eva_glm)
plot(eva_glm)
#add second model
model_glm_2 <- glm(survived ~ .-age, family = binomial, data = titanic_imputed)
glm_audit_2 <- audit(model_glm_2, data = titanic_imputed, y = titanic_imputed$survived, label = "glm2")
eva_glm_2 <- model_evaluation(glm_audit_2)
plotD3_roc(eva_glm, eva_glm_2)
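The points of a ROC curve can be sketched in base R without the package. This is a minimal illustration on made-up labels and scores (the vectors `y` and `y_hat` are hypothetical), assuming an observation is classified as positive when its score is at or above the threshold; it is not the package's implementation.

```r
# Toy binary labels and predicted scores (hypothetical data)
y     <- c(1, 1, 0, 1, 0, 0, 1, 0)
y_hat <- c(0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1)

# Candidate thresholds: every distinct score, plus Inf so the curve starts at (0, 0)
thresholds <- c(Inf, sort(unique(y_hat), decreasing = TRUE))

# For each threshold compute the false positive rate and the true positive rate
roc_points <- t(sapply(thresholds, function(th) {
  pred <- as.integer(y_hat >= th)
  c(threshold = th,
    fpr = sum(pred == 1 & y == 0) / sum(y == 0),
    tpr = sum(pred == 1 & y == 1) / sum(y == 1))
}))
roc_df <- as.data.frame(roc_points)
print(roc_df)
```

Plotting `tpr` against `fpr` from `roc_df` gives the curve; each row corresponds to one cutoff point.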
The basic idea of the ROC curves for regression is to show model asymmetry. The RROC is a plot where on the x-axis we depict total over-estimation and on the y-axis total under-estimation.
plotD3_rroc(object, ..., scale_plot = FALSE)
object |
An object of class 'auditor_model_residual' created with |
... |
Other 'auditor_model_residual' objects to be plotted together. |
scale_plot |
Logical, indicates whether the plot should scale with height. By default it's FALSE. |
For RROC curves we use a shift, which is an equivalent to the threshold for ROC curves.
For each observation we calculate a new prediction: $\hat{y}' = \hat{y} + s$, where $s$ is the shift.
Therefore, there are different error values for each shift: $e_i = \hat{y}_i' - y_i$.
Over-estimation is calculated as: $OVER = \sum (e_i \mid e_i > 0)$.
Under-estimation is calculated as: $UNDER = \sum (e_i \mid e_i < 0)$.
The shift equal to 0 is represented by a dot.
The Area Over the RROC Curve (AOC) equals the variance of the errors multiplied by $\frac{n^2}{2}$.
a 'r2d3' object
Hernández-Orallo, José. 2013. "ROC Curves for Regression". Pattern Recognition 46 (12): 3395–3411.
dragons <- DALEX::dragons[1:100, ]
# fit a model
model_lm <- lm(life_length ~ ., data = dragons)
# use DALEX package to wrap up a model into explainer
lm_audit <- audit(model_lm, data = dragons, y = dragons$life_length)
# validate a model with auditor
mr_lm <- model_residual(lm_audit)
# plot results
plotD3_rroc(mr_lm)
library(randomForest)
model_rf <- randomForest(life_length~., data = dragons)
rf_audit <- audit(model_rf, data = dragons, y = dragons$life_length)
mr_rf <- model_residual(rf_audit)
plotD3_rroc(mr_lm, mr_rf)
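The shift construction behind the RROC curve can be checked numerically in base R. The sketch below uses a small hypothetical error vector `e` (prediction minus observed value), assumes over-estimation is the sum of positive shifted errors and under-estimation the sum of negative ones, and assumes the AOC factor is n^2 / 2 applied to the population variance, as stated in the cited Hernández-Orallo (2013) paper. It is an illustration, not the package's code.

```r
# Hypothetical errors e_i = y_hat_i - y_i
e <- c(-3, -1, 0.5, 2, 4)
n <- length(e)

# The curve only bends at shifts s = -e_i, so these breakpoints describe it exactly
shifts <- sort(-e)
over   <- sapply(shifts, function(s) sum(pmax(e + s, 0)))  # total over-estimation
under  <- sapply(shifts, function(s) sum(pmin(e + s, 0)))  # total under-estimation

# Area over the curve: trapezoids between the curve and the line UNDER = 0
mid_under <- (head(under, -1) + tail(under, -1)) / 2
aoc <- sum(diff(over) * -mid_under)

# Population variance of the errors (divides by n, not n - 1)
var_pop <- mean((e - mean(e))^2)

print(c(aoc = aoc, expected = n^2 / 2 * var_pop))
```

Both printed values agree, which is a quick sanity check of the variance identity for this toy vector.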
Function plotD3_scalelocation plots the square root of the absolute value of the residuals vs target, observed or variable values in the model. A vertical line corresponds to the median.
plotD3_scalelocation( object, ..., variable = NULL, smooth = FALSE, peaks = FALSE, point_count = NULL, single_plot = TRUE, scale_plot = FALSE, background = FALSE )
plotD3ScaleLocation( object, ..., variable = NULL, smooth = FALSE, peaks = FALSE, point_count = NULL, single_plot = TRUE, scale_plot = FALSE, background = FALSE )
object |
An object of class |
... |
Other |
variable |
Name of variable to order residuals on a plot.
If |
smooth |
Logical, indicates whether smoothed lines should be added. By default it's |
peaks |
Logical, indicates whether peak observations should be highlighted. By default it's |
point_count |
Number of points to be plotted per model. Points will be chosen randomly. By default plot all of them. |
single_plot |
Logical, indicates whether a single plot or facets should be plotted. By default it's |
scale_plot |
Logical, indicates whether the plot should scale with height. By default it's |
background |
Logical, available only if single_plot = FALSE. Indicates whether background plots should be plotted. By default it's FALSE. |
An r2d3 object.
dragons <- DALEX::dragons[1:100, ]
# fit a model
model_lm <- lm(life_length ~ ., data = dragons)
lm_audit <- audit(model_lm, data = dragons, y = dragons$life_length)
# validate a model with auditor
mr_lm <- model_residual(lm_audit)
# plot results
plotD3_scalelocation(mr_lm, peaks = TRUE)
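The quantity shown on the y-axis of a scale-location plot can be reproduced in base R. This sketch uses the built-in `mtcars` data as a stand-in and assumes raw (not standardized) residuals ordered by fitted values; the package may preprocess residuals differently.

```r
# Fit a simple linear model on a built-in dataset (stand-in for the dragons data)
model <- lm(mpg ~ wt + hp, data = mtcars)

# Square root of the absolute residuals, the y-axis of a scale-location plot
sqrt_abs_res <- sqrt(abs(residuals(model)))

# Order observations by fitted values, the assumed x-axis here
ord <- order(fitted(model))
scale_location <- data.frame(
  fitted            = fitted(model)[ord],
  sqrt_abs_residual = sqrt_abs_res[ord]
)
head(scale_location)
```

A roughly flat trend in `sqrt_abs_residual` across `fitted` suggests constant residual variance.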
Prints Model Cook's Distances Summary
## S3 method for class 'auditor_model_cooksdistance'
print(x, ...)
x |
an object |
... |
other parameters |
dragons <- DALEX::dragons[1:100, ]
# fit a model
model_lm <- lm(life_length ~ ., data = dragons)
# create an explainer
lm_audit <- audit(model_lm, data = dragons, y = dragons$life_length)
# calculate score
model_cooksdistance(lm_audit)
Prints Model Evaluation Summary
## S3 method for class 'auditor_model_evaluation'
print(x, ...)
x |
an object |
... |
other parameters |
data(titanic_imputed, package = "DALEX")
# fit a model
model_glm <- glm(survived ~ ., family = binomial, data = titanic_imputed)
glm_audit <- audit(model_glm, data = titanic_imputed, y = titanic_imputed$survived)
# validate a model with auditor
model_evaluation(glm_audit)
Prints Model Halfnormal Summary
## S3 method for class 'auditor_model_halfnormal'
print(x, ...)
x |
an object |
... |
other parameters |
data(titanic_imputed, package = "DALEX")
# fit a model
model_glm <- glm(survived ~ ., family = binomial, data = titanic_imputed)
glm_audit <- audit(model_glm, data = titanic_imputed, y = titanic_imputed$survived)
# validate a model with auditor
model_halfnormal(glm_audit)
Prints Model Performance Summary
## S3 method for class 'auditor_model_performance'
print(x, ...)
x |
an object |
... |
other parameters |
data(titanic_imputed, package = "DALEX")
# fit a model
model_glm <- glm(survived ~ ., family = binomial, data = titanic_imputed)
glm_audit <- audit(model_glm, data = titanic_imputed, y = titanic_imputed$survived)
# validate a model with auditor
model_performance(glm_audit)
Prints Model Residual Summary
## S3 method for class 'auditor_model_residual'
print(x, ...)
x |
an object |
... |
other parameters |
data(titanic_imputed, package = "DALEX")
# fit a model
model_glm <- glm(survived ~ ., family = binomial, data = titanic_imputed)
glm_audit <- audit(model_glm, data = titanic_imputed, y = titanic_imputed$survived)
# validate a model with auditor
model_residual(glm_audit)
Prints Model Scores
## S3 method for class 'auditor_score'
print(x, ...)
x |
an object |
... |
other parameters |
data(titanic_imputed, package = "DALEX")
# fit a model
model_glm <- glm(survived ~ ., family = binomial, data = titanic_imputed)
glm_audit <- audit(model_glm, data = titanic_imputed, y = titanic_imputed$survived)
# calculate score
score(glm_audit, type = "auc")
This function provides several scores for model validation and performance assessment. Scores can be also used to compare models.
score(object, type = "mse", data = NULL, ...)
object |
An object of class |
type |
The score to be calculated. Possible values: |
data |
New data that will be used to calculate the score. Pass |
... |
Other arguments dependent on the type of score. |
An object of class auditor_score, except for Cook's distance, where a numeric vector is returned.
score_acc, score_auc, score_cooksdistance, score_dw, score_f1, score_gini, score_halfnormal, score_mae, score_mse, score_peak, score_precision, score_r2, score_rec, score_recall, score_rmse, score_rroc, score_runs, score_specificity, score_one_minus_acc, score_one_minus_auc, score_one_minus_f1, score_one_minus_gini, score_one_minus_precision, score_one_minus_recall, score_one_minus_specificity
dragons <- DALEX::dragons[1:100, ]
# fit a model
model_lm <- lm(life_length ~ ., data = dragons)
lm_audit <- audit(model_lm, data = dragons, y = dragons$life_length)
# calculate score
score(lm_audit, type = 'mae')
Accuracy
score_acc(object, cutoff = 0.5, data = NULL, y = NULL, ...)
object |
An object of class |
cutoff |
Threshold value, which divides model predicted values (y_hat) to calculate confusion matrix.
By default it's |
data |
New data that will be used to calculate the score.
Pass |
y |
New y parameter will be used to calculate score. |
... |
Other arguments dependent on the type of score. |
An object of class auditor_score.
data(titanic_imputed, package = "DALEX")
# fit a model
model_glm <- glm(survived ~ ., family = binomial, data = titanic_imputed)
glm_audit <- audit(model_glm, data = titanic_imputed, y = titanic_imputed$survived)
# calculate score
score_acc(glm_audit)
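To make the role of `cutoff` concrete, here is a small base-R sketch that turns predicted probabilities into a confusion matrix and computes accuracy from it. The vectors are made up, and treating predictions at or above the cutoff as positive is an assumption for illustration, not a statement about the package internals.

```r
# Hypothetical observed labels and predicted probabilities
y      <- c(1, 0, 1, 1, 0, 0, 1, 0)
y_hat  <- c(0.9, 0.2, 0.4, 0.7, 0.6, 0.1, 0.8, 0.3)
cutoff <- 0.5

# Predictions at or above the cutoff are treated as the positive class (assumption)
pred <- as.integer(y_hat >= cutoff)

# Cells of the confusion matrix
tp <- sum(pred == 1 & y == 1)
tn <- sum(pred == 0 & y == 0)
fp <- sum(pred == 1 & y == 0)
fn <- sum(pred == 0 & y == 1)

# Accuracy: share of correctly classified observations
accuracy <- (tp + tn) / (tp + tn + fp + fn)
print(accuracy)
```

Changing `cutoff` moves observations between the cells, which is why the threshold-based scores all take this argument.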
Area Under Curve (AUC) for Receiver Operating Characteristic.
score_auc(object, data = NULL, y = NULL, ...)
scoreROC(object)
object |
An object of class |
data |
New data that will be used to calculate the score.
Pass |
y |
New y parameter will be used to calculate score. |
... |
Other arguments dependent on the type of score. |
An object of class auditor_score.
data(titanic_imputed, package = "DALEX")
# fit a model
model_glm <- glm(survived ~ ., family = binomial, data = titanic_imputed)
glm_audit <- audit(model_glm, data = titanic_imputed, y = titanic_imputed$survived)
# calculate score
score_auc(glm_audit)
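The area under the ROC curve can also be obtained without drawing the curve, through the rank-sum (Mann-Whitney) identity. The following base-R sketch uses hypothetical labels and scores and is shown only to clarify what the number means; the package may compute it differently.

```r
# Hypothetical binary labels and predicted scores
y     <- c(1, 1, 0, 1, 0, 0, 1, 0)
y_hat <- c(0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1)

# Ranks of the scores (ties would receive average ranks)
r     <- rank(y_hat)
n_pos <- sum(y == 1)
n_neg <- sum(y == 0)

# AUC = probability that a random positive is scored above a random negative
auc <- (sum(r[y == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
print(auc)
```

Here 12 of the 16 positive-negative pairs are ordered correctly, so the result is 0.75.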
Area under precision-recall (AUPRC) curve.
score_auprc(object, data = NULL, y = NULL, ...)
object |
An object of class |
data |
New data that will be used to calculate the score.
Pass |
y |
New y parameter will be used to calculate score. |
... |
Other arguments dependent on the type of score. |
An object of class auditor_score.
data(titanic_imputed, package = "DALEX")
# fit a model
model_glm <- glm(survived ~ ., family = binomial, data = titanic_imputed)
# create an explainer
glm_audit <- audit(model_glm, data = titanic_imputed, y = titanic_imputed$survived)
# calculate score
score_auprc(glm_audit)
Cook’s distances are used to estimate the influence of a single observation.
score_cooksdistance(object, verbose = TRUE, ...)
scoreCooksDistance(object, verbose = TRUE)
object |
An object of class |
verbose |
If |
... |
Other arguments dependent on the type of score. |
Cook’s distance is a tool for identifying observations that may negatively affect the model. They may be also used for indicating regions of the design space where it would be good to obtain more observations. Data points indicated by Cook’s distances are worth checking for validity.
Cook’s Distances are calculated by removing the i-th observation from the data and recalculating the model. It shows how much all the values in the model change when the i-th observation is removed.
For models of classes other than lm and glm, the distances are computed directly from the definition, so this may take a while.
A vector of Cook's distances for each observation.
numeric vector
dragons <- DALEX::dragons[1:100, ]
# fit a model
model_lm <- lm(life_length ~ ., data = dragons)
# create an explainer
lm_audit <- audit(model_lm, data = dragons, y = dragons$life_length)
# calculate score
score_cooksdistance(lm_audit)
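The leave-one-out definition described above can be sketched in base R and compared with `stats::cooks.distance()`. The example uses the built-in `mtcars` data and a linear model as stand-ins; it illustrates the definition only and is not the package's implementation for other model classes.

```r
# Reference model fitted on all observations
model <- lm(mpg ~ wt + hp, data = mtcars)
n  <- nrow(mtcars)
p  <- length(coef(model))              # number of estimated coefficients
s2 <- sum(residuals(model)^2) / (n - p) # residual variance estimate
fitted_full <- fitted(model)

# Refit without the i-th observation and measure how all fitted values move
cooks_manual <- sapply(seq_len(n), function(i) {
  model_i  <- lm(mpg ~ wt + hp, data = mtcars[-i, ])
  fitted_i <- predict(model_i, newdata = mtcars)
  sum((fitted_full - fitted_i)^2) / (p * s2)
})

# Compare with the closed-form values available for lm objects
print(all.equal(cooks_manual, unname(cooks.distance(model))))
```

The loop refits the model n times, which shows why computing the distances from the definition can be slow for larger data or more expensive models.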
Score based on the Durbin-Watson test statistic. The score value is helpful in comparing models. Note that test results such as the p-value make sense only when the test assumptions are satisfied. Otherwise the test statistic may be considered as a score.
score_dw(object, variable = NULL, data = NULL, y = NULL, ...)
scoreDW(object, variable = NULL)
object |
An object of class |
variable |
Name of model variable to order residuals. |
data |
New data that will be used to calculate the score.
Pass |
y |
New y parameter will be used to calculate score. |
... |
Other arguments dependent on the type of score. |
An object of class auditor_score.
dragons <- DALEX::dragons[1:100, ]
# fit a model
model_lm <- lm(life_length ~ ., data = dragons)
lm_audit <- audit(model_lm, data = dragons, y = dragons$life_length)
# calculate score
score_dw(lm_audit)
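For reference, the Durbin-Watson statistic itself is simple to compute by hand. The base-R sketch below uses `mtcars` as a stand-in and orders residuals by one model variable, mirroring the role of the `variable` argument; the ordering variable chosen here is an arbitrary assumption.

```r
# Stand-in linear model on a built-in dataset
model <- lm(mpg ~ wt + hp, data = mtcars)

# Order residuals by a chosen variable (here: wt), since the statistic depends on order
ord <- order(mtcars$wt)
r   <- residuals(model)[ord]

# Durbin-Watson statistic: squared successive differences over squared residuals
dw <- sum(diff(r)^2) / sum(r^2)
print(dw)
```

Values near 2 indicate little first-order autocorrelation in the ordered residuals; the statistic always lies between 0 and 4.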
F1 Score
score_f1(object, cutoff = 0.5, data = NULL, y = NULL, ...)
object |
An object of class |
cutoff |
Threshold value, which divides model predicted values (y_hat) to calculate confusion matrix.
By default it's |
data |
New data that will be used to calculate the score. Pass
|
y |
New y parameter will be used to calculate score. |
... |
Other arguments dependent on the type of score. |
An object of class auditor_score.
data(titanic_imputed, package = "DALEX")
# fit a model
model_glm <- glm(survived ~ ., family = binomial, data = titanic_imputed)
glm_audit <- audit(model_glm, data = titanic_imputed, y = titanic_imputed$survived)
# calculate score
score_f1(glm_audit)
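The F1 score is the harmonic mean of precision and recall. A minimal base-R sketch on hypothetical labels and probabilities follows, again assuming that predictions at or above the cutoff count as positive.

```r
# Hypothetical observed labels and predicted probabilities
y      <- c(1, 1, 1, 0, 0, 0, 0, 1)
y_hat  <- c(0.9, 0.6, 0.3, 0.8, 0.2, 0.1, 0.55, 0.7)
cutoff <- 0.5
pred   <- as.integer(y_hat >= cutoff)

# Confusion matrix cells needed for precision and recall
tp <- sum(pred == 1 & y == 1)
fp <- sum(pred == 1 & y == 0)
fn <- sum(pred == 0 & y == 1)

precision <- tp / (tp + fp)  # share of predicted positives that are correct
recall    <- tp / (tp + fn)  # share of actual positives that are found
f1        <- 2 * precision * recall / (precision + recall)
print(c(precision = precision, recall = recall, f1 = f1))
```

The same value follows from the equivalent form `2 * tp / (2 * tp + fp + fn)`.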
The Gini coefficient measures the inequality among values of a frequency distribution. A Gini coefficient equal to 0 means perfect equality, where all values are the same. A Gini coefficient equal to 100% expresses maximal inequality of values.
score_gini(object, data = NULL, y = NULL, ...)
object |
An object of class |
data |
New data that will be used to calculate the score.
Pass |
y |
New y parameter will be used to calculate score. |
... |
Other arguments dependent on the type of score. |
An object of class auditor_score.
library(DALEX)
# fit a model
model_glm <- glm(survived ~ ., family = binomial, data = titanic_imputed)
# create an explainer
exp_glm <- explain(model_glm, data = titanic_imputed, y = titanic_imputed$survived)
# calculate score
score_gini(exp_glm)
Score is approximately: $\sum \#[res_i \leq simres_{i,j}] - n$, with the distinction that each element of the sum is also scaled to take values from [0,1].
$res_i$ is the residual for the i-th observation, $simres_{i,j}$ is the residual of the j-th simulation for the i-th observation, and $n$ is the number of simulations for each observation.
Scores are calculated on the basis of simulated data, so they may differ between function calls.
score_halfnormal(object, ...)
scoreHalfNormal(object, ...)
object |
An object of class |
... |
... |
An object of class auditor_score.
dragons <- DALEX::dragons[1:100, ]
# fit a model
model_lm <- lm(life_length ~ ., data = dragons)
# create an explainer
lm_audit <- audit(model_lm, data = dragons, y = dragons$life_length)
# calculate score
score_halfnormal(lm_audit)
Mean Absolute Error.
score_mae(object, data = NULL, y = NULL, ...)
scoreMAE(object)
object |
An object of class |
data |
New data that will be used to calculate the score.
Pass |
y |
New y parameter will be used to calculate score. |
... |
Other arguments dependent on the type of score. |
An object of class auditor_score.
dragons <- DALEX::dragons[1:100, ]
# fit a model
model_lm <- lm(life_length ~ ., data = dragons)
# create an explainer
lm_audit <- audit(model_lm, data = dragons, y = dragons$life_length)
# calculate score
score_mae(lm_audit)
Mean Square Error.
score_mse(object, data = NULL, y = NULL, ...)
scoreMSE(object)
object |
An object of class |
data |
New data that will be used to calculate the score.
Pass |
y |
New y parameter will be used to calculate score. |
... |
Other arguments dependent on the type of score. |
An object of class auditor_score.
dragons <- DALEX::dragons[1:100, ]
# fit a model
model_lm <- lm(life_length ~ ., data = dragons)
lm_audit <- audit(model_lm, data = dragons, y = dragons$life_length)
# calculate score
score_mse(lm_audit)
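The two regression error scores, mean absolute error and mean squared error, reduce to one-line formulas on the residuals. A base-R sketch with made-up observed and predicted values:

```r
# Hypothetical observed values and predictions
y     <- c(3, 5, 2, 7)
y_hat <- c(2.5, 5, 4, 8)

# Residuals: observed minus predicted
res <- y - y_hat

mae  <- mean(abs(res))  # mean absolute error
mse  <- mean(res^2)     # mean squared error
rmse <- sqrt(mse)       # root of MSE, on the scale of the response
print(c(mae = mae, mse = mse, rmse = rmse))
```

MSE penalizes the single large miss (the residual of -2) more heavily than MAE does, which is the usual reason to report both.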
One minus accuracy
score_one_minus_acc(object, cutoff = 0.5, data = NULL, y = NULL, ...)
object |
An object of class |
cutoff |
Threshold value, which divides model predicted values to calculate confusion matrix.
By default it's |
data |
New data that will be used to calculate the score.
Pass |
y |
New y parameter will be used to calculate score. |
... |
Other arguments dependent on the type of score. |
An object of class auditor_score.
data(titanic_imputed, package = "DALEX")
# fit a model
model_glm <- glm(survived ~ ., family = binomial, data = titanic_imputed)
# create an explainer
glm_audit <- audit(model_glm, data = titanic_imputed, y = titanic_imputed$survived)
# calculate score
score_one_minus_acc(glm_audit)
One minus Area Under Curve (AUC) for Receiver Operating Characteristic.
score_one_minus_auc(object, data = NULL, y = NULL, ...)
object |
An object of class |
data |
New data that will be used to calculate the score.
Pass |
y |
New y parameter will be used to calculate score. |
... |
Other arguments dependent on the type of score. |
An object of class auditor_score.
data(titanic_imputed, package = "DALEX")
# fit a model
model_glm <- glm(survived ~ ., family = binomial, data = titanic_imputed)
glm_audit <- audit(model_glm, data = titanic_imputed, y = titanic_imputed$survived)
# calculate score
score_one_minus_auc(glm_audit)
One Minus Area under precision-recall (AUPRC) curve.
score_one_minus_auprc(object, data = NULL, y = NULL, ...)
object |
An object of class |
data |
New data that will be used to calculate the score.
Pass |
y |
New y parameter will be used to calculate score. |
... |
Other arguments dependent on the type of score. |
An object of class auditor_score.
data(titanic_imputed, package = "DALEX")
# fit a model
model_glm <- glm(survived ~ ., family = binomial, data = titanic_imputed)
# create an explainer
glm_audit <- audit(model_glm, data = titanic_imputed, y = titanic_imputed$survived)
# calculate score
score_one_minus_auprc(glm_audit)
One Minus F1 Score
score_one_minus_f1(object, cutoff = 0.5, data = NULL, y = NULL, ...)
object |
An object of class |
cutoff |
Threshold value, which divides model predicted values (y_hat) to calculate confusion matrix.
By default it's |
data |
New data that will be used to calculate the score.
Pass |
y |
New y parameter will be used to calculate score. |
... |
Other arguments dependent on the type of score. |
An object of class auditor_score.
data(titanic_imputed, package = "DALEX")
# fit a model
model_glm <- glm(survived ~ ., family = binomial, data = titanic_imputed)
glm_audit <- audit(model_glm, data = titanic_imputed, y = titanic_imputed$survived)
# calculate score
score_one_minus_f1(glm_audit)
One minus the Gini coefficient. A value of 100% corresponds to perfect equality (a Gini coefficient of 0); a value of 0 expresses maximal inequality of values.
score_one_minus_gini(object, data = NULL, y = NULL, ...)
object |
An object of class |
data |
New data that will be used to calculate the score.
Pass |
y |
New y parameter will be used to calculate score. |
... |
Other arguments dependent on the type of score. |
An object of class auditor_score.
data(titanic_imputed, package = "DALEX")
# fit a model
model_glm <- glm(survived ~ ., family = binomial, data = titanic_imputed)
glm_audit <- audit(model_glm, data = titanic_imputed, y = titanic_imputed$survived)
# calculate score
score_one_minus_gini(glm_audit)
One Minus Precision
score_one_minus_precision(object, cutoff = 0.5, data = NULL, y = NULL, ...)
object |
An object of class |
cutoff |
Threshold value, which divides model predicted values (y_hat) to calculate confusion matrix.
By default it's |
data |
New data that will be used to calculate the score.
Pass |
y |
New y parameter will be used to calculate score. |
... |
Other arguments dependent on the type of score. |
An object of class auditor_score.
library(DALEX)
# fit a model
model_glm <- glm(survived ~ ., family = binomial, data = titanic_imputed)
# create an explainer
exp_glm <- explain(model_glm, data = titanic_imputed, y = titanic_imputed$survived)
# calculate score
score_one_minus_precision(exp_glm)
One minus recall
score_one_minus_recall(object, cutoff = 0.5, data = NULL, y = NULL, ...)
object |
An object of class |
cutoff |
Threshold value, which divides model predicted values (y_hat) to calculate confusion matrix.
By default it's |
data |
New data that will be used to calculate the score.
Pass |
y |
New y parameter will be used to calculate score. |
... |
Other arguments dependent on the type of score. |
An object of class auditor_score.
library(DALEX)
# fit a model
model_glm <- glm(survived ~ ., family = binomial, data = titanic_imputed)
# create an explainer
exp_glm <- explain(model_glm, data = titanic_imputed, y = titanic_imputed$survived)
# calculate score
score_one_minus_recall(exp_glm)
One minus specificity
score_one_minus_specificity(object, cutoff = 0.5, data = NULL, y = NULL, ...)
object |
An object of class |
cutoff |
Threshold value, which divides model predicted values (y_hat) to calculate confusion matrix.
By default it's |
data |
New data that will be used to calculate the score.
Pass |
y |
New y parameter will be used to calculate score. |
... |
Other arguments dependent on the type of score. |
An object of class auditor_score.
data(titanic_imputed, package = "DALEX")
# fit a model
model_glm <- glm(survived ~ ., family = binomial, data = titanic_imputed)
# create an explainer
glm_audit <- audit(model_glm, data = titanic_imputed, y = titanic_imputed$survived)
# calculate score
score_one_minus_specificity(glm_audit)
This score is calculated on the basis of the peak test, which is used to check for homoscedasticity of residuals in regression analyses.
score_peak(object, variable = NULL, data = NULL, y = NULL, ...)
scorePeak(object)
object |
An object of class |
variable |
Name of model variable to order residuals. |
data |
New data that will be used to calculate the score.
Pass |
y |
New y parameter will be used to calculate score. |
... |
Other arguments dependent on the type of score. |
An object of class auditor_score.
dragons <- DALEX::dragons[1:100, ]
# fit a model
model_lm <- lm(life_length ~ ., data = dragons)
# create an explainer
lm_audit <- audit(model_lm, data = dragons, y = dragons$life_length)
# calculate score
score_peak(lm_audit)
Precision
score_precision(object, cutoff = 0.5, data = NULL, y = NULL, ...)
object |
An object of class |
cutoff |
Threshold value used to split the model's predicted values (y_hat) into classes when calculating the confusion matrix.
By default it's |
data |
New data that will be used to calculate the score.
Pass |
y |
New y parameter will be used to calculate score. |
... |
Other arguments dependent on the type of score. |
An object of class auditor_score.
data(titanic_imputed, package = "DALEX")
# fit a model
model_glm <- glm(survived ~ ., family = binomial, data = titanic_imputed)
glm_audit <- audit(model_glm, data = titanic_imputed, y = titanic_imputed$survived)
# calculate score
score_precision(glm_audit)
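As a rough illustration of precision, the share of predicted positives that are truly positive can be computed by hand. The helper `precision_at()` below is hypothetical (it is not part of auditor), the data are made up, and the strict `>` comparison against the cutoff is an assumption.

```r
# Hypothetical helper, not part of auditor: precision at a given cutoff
precision_at <- function(y, y_hat, cutoff = 0.5) {
  # Assumption: predictions strictly above the cutoff are labelled as class 1
  predicted <- as.integer(y_hat > cutoff)
  tp <- sum(predicted == 1 & y == 1)  # true positives
  fp <- sum(predicted == 1 & y == 0)  # false positives
  tp / (tp + fp)
}

# Toy data (hypothetical)
y     <- c(0, 0, 0, 0, 1, 1, 1, 0)
y_hat <- c(0.1, 0.6, 0.3, 0.7, 0.8, 0.4, 0.9, 0.2)

precision_value <- precision_at(y, y_hat, cutoff = 0.5)
precision_value
```

Four observations are predicted as positive and two of them are actual positives, so the precision is 0.5.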
R2 is the coefficient of determination. An R2 of 0 means that the model explains none of the variability of the response; an R2 of 1 means that the model explains all of it.
score_r2(object, data = NULL, y = NULL, ...)
object |
An object of class |
data |
New data that will be used to calculate the score.
Pass |
y |
New y parameter will be used to calculate score. |
... |
Other arguments dependent on the type of score. |
An object of class auditor_score.
dragons <- DALEX::dragons[1:100, ]
# fit a model
model_lm <- lm(life_length ~ ., data = dragons)
# use DALEX package to wrap up a model into explainer
lm_audit <- audit(model_lm, data = dragons, y = dragons$life_length)
# calculate score with auditor
score_r2(lm_audit)
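The coefficient of determination can be written out as one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below uses the built-in mtcars data and an arbitrarily chosen model, purely for illustration; it shows the textbook formula and is not taken from auditor's source.

```r
# Built-in dataset; the model and variables are chosen only for illustration
model <- lm(mpg ~ wt + hp, data = mtcars)
y     <- mtcars$mpg
y_hat <- unname(predict(model, newdata = mtcars))

ss_res <- sum((y - y_hat)^2)    # residual sum of squares
ss_tot <- sum((y - mean(y))^2)  # total sum of squares

r2 <- 1 - ss_res / ss_tot
r2
```

For a linear model with an intercept evaluated on its own training data, this value agrees with `summary(model)$r.squared`.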
The area over the Regression Error Characteristic curve is a measure of the expected error for the regression model.
score_rec(object, data = NULL, y = NULL, ...)
scoreREC(object)
object |
An object of class |
data |
New data that will be used to calculate the score.
Pass |
y |
New y parameter will be used to calculate score. |
... |
Other arguments dependent on the type of score. |
An object of class auditor_score.
J. Bi and K. P. Bennett, "Regression error characteristic curves," in Proc. 20th Int. Conf. Machine Learning, Washington DC, 2003, pp. 43-50.
dragons <- DALEX::dragons[1:100, ]
# fit a model
lm_model <- lm(life_length ~ ., data = dragons)
# create an explainer
lm_audit <- audit(lm_model, data = dragons, y = dragons$life_length)
# calculate score
score_rec(lm_audit)
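To see why the area over the REC curve can be read as an expected error, consider a sketch that assumes absolute deviation as the error measure and treats the REC curve as the empirical distribution function of the errors. Both choices are assumptions made for illustration; auditor's error measure and integration method may differ. The data are hypothetical.

```r
# Toy data (hypothetical): observed values and predictions
y     <- c(3, 5, 7, 9, 11)
y_hat <- c(2.5, 5, 8, 7, 11.5)

# Assumption: the error measure is the absolute deviation
abs_err <- sort(abs(y - y_hat))
n <- length(abs_err)

# Between consecutive sorted errors the REC curve is flat, so the area over it
# is a sum of rectangles: width of the tolerance step times the share of
# observations whose error is still above that tolerance
widths      <- diff(c(0, abs_err))
share_above <- 1 - (seq_len(n) - 1) / n

area_over_rec <- sum(widths * share_above)
area_over_rec
```

Under these assumptions the area over the curve equals the mean absolute error, `mean(abs(y - y_hat))`, which is 0.8 for this toy data.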
Recall
score_recall(object, cutoff = 0.5, data = NULL, y = NULL, ...)
object |
An object of class |
cutoff |
Threshold value used to split the model's predicted values (y_hat) into classes when calculating the confusion matrix.
By default it's |
data |
New data that will be used to calculate the score.
Pass |
y |
New y parameter will be used to calculate score. |
... |
Other arguments dependent on the type of score. |
An object of class auditor_score.
data(titanic_imputed, package = "DALEX")
# fit a model
model_glm <- glm(survived ~ ., family = binomial, data = titanic_imputed)
# create an explainer
glm_audit <- audit(model_glm, data = titanic_imputed, y = titanic_imputed$survived)
# calculate score
score_recall(glm_audit)
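Because recall depends on the cutoff, it may help to see how the value changes when the threshold moves. The following sketch uses a hypothetical helper `recall_at()` (not part of auditor), made-up data, and assumes a strict `>` comparison against the cutoff.

```r
# Hypothetical helper, not part of auditor: recall at a given cutoff
recall_at <- function(y, y_hat, cutoff = 0.5) {
  # Assumption: predictions strictly above the cutoff are labelled as class 1
  predicted <- as.integer(y_hat > cutoff)
  tp <- sum(predicted == 1 & y == 1)  # true positives
  fn <- sum(predicted == 0 & y == 1)  # false negatives
  tp / (tp + fn)
}

# Toy data (hypothetical)
y     <- c(0, 0, 0, 0, 1, 1, 1, 0)
y_hat <- c(0.1, 0.6, 0.3, 0.7, 0.8, 0.4, 0.9, 0.2)

recall_default <- recall_at(y, y_hat, cutoff = 0.5)  # one positive is missed
recall_lower   <- recall_at(y, y_hat, cutoff = 0.3)  # all positives are found
c(recall_default, recall_lower)
```

Lowering the cutoff from 0.5 to 0.3 raises recall from 2/3 to 1 on this toy data, at the price of more false positives.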
Root Mean Square Error.
score_rmse(object, data = NULL, y = NULL, ...)
scoreRMSE(object)
object |
An object of class |
data |
New data that will be used to calculate the score.
Pass |
y |
New y parameter will be used to calculate score. |
... |
Other arguments dependent on the type of score. |
An object of class auditor_score.
dragons <- DALEX::dragons[1:100, ]
# fit a model
model_lm <- lm(life_length ~ ., data = dragons)
# create an explainer
lm_audit <- audit(model_lm, data = dragons, y = dragons$life_length)
# calculate score
score_rmse(lm_audit)
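For reference, the root mean square error is the square root of the average squared residual. A minimal sketch on hypothetical numbers:

```r
# Toy data (hypothetical): observed values and predictions
y     <- c(3, 5, 7, 9)
y_hat <- c(2, 5, 8, 11)

residuals_toy <- y - y_hat             # 1, 0, -1, -2
rmse <- sqrt(mean(residuals_toy^2))    # square root of the mean squared residual
rmse
```

The squared residuals are 1, 0, 1 and 4, so the mean is 1.5 and the RMSE is `sqrt(1.5)`.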
The area over the Regression Receiver Operating Characteristic (RROC) curve.
score_rroc(object, data = NULL, y = NULL, ...)
scoreRROC(object)
object |
An object of class |
data |
New data that will be used to calculate the score.
Pass |
y |
New y parameter will be used to calculate score. |
... |
Other arguments dependent on the type of score. |
An object of class auditor_score.
Hernández-Orallo, José. 2013. "ROC Curves for Regression". Pattern Recognition 46 (12): 3395–3411.
dragons <- DALEX::dragons[1:100, ]
# fit a model
model_lm <- lm(life_length ~ ., data = dragons)
# create an explainer
lm_audit <- audit(model_lm, data = dragons, y = dragons$life_length)
# calculate score
score_rroc(lm_audit)
Score based on the Runs test statistic. Note that this test is not very powerful, because it uses only the signs of the residuals. The score value is helpful for comparing models. Test results such as p-values are meaningful only when the test assumptions are satisfied; otherwise the test statistic may be treated simply as a score.
score_runs(object, variable = NULL, data = NULL, y = NULL, ...)
scoreRuns(object, variable = NULL)
object |
An object of class |
variable |
Name of model variable to order residuals. |
data |
New data that will be used to calculate the score. Pass
|
y |
New y parameter will be used to calculate score. |
... |
Other arguments dependent on the type of score. |
An object of class auditor_score.
dragons <- DALEX::dragons[1:100, ]
# fit a model
model_lm <- lm(life_length ~ ., data = dragons)
# create an explainer
lm_audit <- audit(model_lm, data = dragons, y = dragons$life_length)
# calculate score
score_runs(lm_audit)
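The idea that the Runs test uses only the signs of the residuals can be sketched with the standard Wald-Wolfowitz statistic in base R. This is an illustration on hypothetical residuals with no exact zeros, assumed to be already ordered (for example by a model variable); it is not necessarily the exact computation auditor performs.

```r
# Hypothetical residuals, assumed already ordered and free of exact zeros
res   <- c(1.2, 0.4, -0.3, -1.1, 0.8, 0.5, -0.2, 0.9, -0.7, -0.4)
signs <- sign(res)

# A run is a maximal stretch of consecutive residuals with the same sign
n_runs <- length(rle(signs)$lengths)

n_pos <- sum(signs > 0)
n_neg <- sum(signs < 0)
n     <- n_pos + n_neg

# Mean and variance of the number of runs under randomness of the signs
expected_runs <- 2 * n_pos * n_neg / n + 1
variance_runs <- 2 * n_pos * n_neg * (2 * n_pos * n_neg - n) / (n^2 * (n - 1))

# Standardised statistic: values far from 0 suggest non-random sign patterns
z <- (n_runs - expected_runs) / sqrt(variance_runs)
z
```

The toy residuals contain 6 runs, which matches the expected number for 5 positive and 5 negative signs, so the statistic is 0. Only the sign pattern enters the computation, which is why the test can miss structure in the magnitudes of the residuals.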
Specificity
score_specificity(object, cutoff = 0.5, data = NULL, y = NULL, ...)
object |
An object of class |
cutoff |
Threshold value used to split the model's predicted values (y_hat) into classes when calculating the confusion matrix.
By default it's |
data |
New data that will be used to calculate the score.
Pass |
y |
New y parameter will be used to calculate score. |
... |
Other arguments dependent on the type of score. |
An object of class auditor_score.
data(titanic_imputed, package = "DALEX")
# fit a model
model_glm <- glm(survived ~ ., family = binomial, data = titanic_imputed)
exp_glm <- audit(model_glm, data = titanic_imputed, y = titanic_imputed$survived)
# calculate score
score_specificity(exp_glm)