Skip to contents

Create a detailed model with user defined smoother

Usage

sculpt_detailed_generic(
  rs,
  smoother_fit,
  smoother_predict,
  missings = NULL,
  verbose = 0,
  allow_par = FALSE
)

Arguments

rs

Rough model, i.e. object of classes rough and sculpture.

smoother_fit

Smoother fitting function.

smoother_predict

Smoother prediction function.

missings

(NULL) or single value or a named vector. Specifies the value(-s) that stand for the missing values. If NULL, then no missing value handling is carried out. If single value, then it is assumed that this value is used for flagging missing values across all continuous variables. If named vector, then the names are used to refer to continuous variables and the values for flagging missing values in that variable.

verbose

(integer) 0 for silent run, > 0 for messages.

allow_par

(logical) Allow parallel computation? Defaults to FALSE.

Value

Object of classes detailed and sculpture.

Details

For parallel computation, use parallel_set() and set allow_par to TRUE. Note that parallel computation may fail if the model is too big and there is not enough memory.

Custom smoothers

If none of the predefined smoothers (sculpt_detailed_gam(), sculpt_detailed_lm()) suits your needs, you can define your own smoothers. You need to define 2 functions: smoother_fit and smoother_predict:

smoother_fit takes 5 arguments ("x", "y", "is_discrete", "column_name", "na_ind") and returns a model fit. "x" are the feature values, "y" are the PDP values, "is_discrete" flags a discrete feature, "column_name" holds the feature name, and "na_ind" passes the NA value from missings (or NULL by default).

smoother_predict takes also 5 arguments ("smoother", "new_x", "is_discrete", "column_name", "na_ind") and returns predictions as a vector. "smoother" is the model fit returned from smoother_fit, "new_x" are the feature values that we want to predict, "is_discrete", "column_name", and "na_ind" have the same purpose as in smoother_fit. See also Examples.

Examples

df <- mtcars
df$vs <- as.factor(df$vs)
model <- rpart::rpart(
  hp ~ mpg + carb + vs,
  data = df,
  control = rpart::rpart.control(minsplit = 10)
)
model_predict <- function(x) predict(model, newdata = x)
covariates <- c("mpg", "carb", "vs")
pm <- sample_marginals(df[covariates], n = 50, seed = 5)

rs <- sculpt_rough(
  dat = pm,
  model_predict_fun = model_predict,
  n_ice = 10,
  seed = 1,
  verbose = 0
)

# define custom smoother
# - gam with 3 knots for variable "mpg"
# - gam with 5 knots for variable "carb"
# - lm for any discrete variable
library(mgcv)
#> Loading required package: nlme
#> This is mgcv 1.9-1. For overview type 'help("mgcv-package")'.
my_smoother <- function(x, y, is_discrete, column_name, na_ind = NULL) {
  if (column_name == "mpg") {
    gam(y ~ s(x, k = 3))
  } else if (column_name == "carb") {
    gam(y ~ s(x, k = 5))
  } else if (is_discrete) {
    lm(y ~ x)
  } else {
    stop("Undefined smoother")
  }
}

# define appropriate predict function
# - predict.gam returns an array, we need to convert it to vector
# - if-else branch for illustration purposes
my_smoother_predict <- function(smoother, new_x, is_discrete, column_name, na_ind = NULL) {
  if (inherits(smoother, "gam")) {
    # as.numeric: convert array to vector
    as.numeric(predict(smoother, newdata = data.frame(x = new_x)))
  } else {
    predict(smoother, newdata = data.frame(x = new_x))
  }
}

ds <- sculpt_detailed_generic(
  rs = rs,
  smoother_fit = my_smoother,
  smoother_predict = my_smoother_predict
)
class(ds)
#> [1] "detailed"  "sculpture" "list"     
if (FALSE) {
# see components
g_component(ds)$continuous
}


# another example with constrained gam (cgam) package
if (FALSE) {
library(cgam)

cgam_smoother <- function(x, y, is_discrete, column_name, na_ind = NULL) {
  if (column_name == "carb") {
    cgam(y ~ s.incr(x, numknots = 3))
  } else if (column_name == "mpg") {
    cgam(y ~ s.decr(x, numknots = 3))
  } else {
    cgam(y ~ x)
  }
}

cgam_predict <- function(smoother, new_x, is_discrete, column_name, na_ind = NULL) {
  predict(smoother, newData = data.frame(x = new_x))$fit
}

ds2 <- sculpt_detailed_generic(
  rs = rs,
  smoother_fit = cgam_smoother,
  smoother_predict = cgam_predict
)

# see components
g_component(ds2)$continuous
}