Create a detailed model with user defined smoother
sculpt_detailed_generic.Rd
Create a detailed model with user defined smoother
Usage
sculpt_detailed_generic(
rs,
smoother_fit,
smoother_predict,
missings = NULL,
verbose = 0,
allow_par = FALSE
)
Arguments
- rs
Rough model, i.e. object of classes
rough
andsculpture
.- smoother_fit
Smoother fitting function.
- smoother_predict
Smoother prediction function.
- missings
(
NULL
) or single value or a named vector. Specifies the value(-s) that stand for the missing values. IfNULL
, then no missing value handling is carried out. If single value, then it is assumed that this value is used for flagging missing values across all continuous variables. If named vector, then the names are used to refer to continuous variables and the values for flagging missing values in that variable.- verbose
(
integer
) 0 for silent run, > 0 for messages.- allow_par
(
logical
) Allow parallel computation? Defaults toFALSE
.
Details
For parallel computation, use parallel_set()
and set allow_par
to TRUE
.
Note that parallel computation may fail if the model is too big and there is not enough memory.
Custom smoothers
If none of the predefined smoothers (sculpt_detailed_gam()
, sculpt_detailed_lm()
)
suits your needs, you can define your own smoothers.
You need to define 2 functions: smoother_fit
and smoother_predict
:
smoother_fit
takes 5 arguments ("x", "y", "is_discrete", "column_name", "na_ind") and
returns a model fit. "x" are the feature values, "y" are the PDP values,
"is_discrete" flags a discrete feature, "column_name" holds the feature name,
and "na_ind" passes the NA value from missings
(or NULL by default).
smoother_predict
takes also 5 arguments ("smoother", "new_x", "is_discrete", "column_name",
"na_ind") and returns predictions as a vector. "smoother" is the model fit returned from
smoother_fit
, "new_x" are the feature values that we want to predict, "is_discrete",
"column_name", and "na_ind" have the same purpose as in smoother_fit
.
See also Examples.
Examples
df <- mtcars
df$vs <- as.factor(df$vs)
model <- rpart::rpart(
hp ~ mpg + carb + vs,
data = df,
control = rpart::rpart.control(minsplit = 10)
)
model_predict <- function(x) predict(model, newdata = x)
covariates <- c("mpg", "carb", "vs")
pm <- sample_marginals(df[covariates], n = 50, seed = 5)
rs <- sculpt_rough(
dat = pm,
model_predict_fun = model_predict,
n_ice = 10,
seed = 1,
verbose = 0
)
# define custom smoother
# - gam with 3 knots for variable "mpg"
# - gam with 5 knots for variable "carb"
# - lm for any discrete variable
library(mgcv)
#> Loading required package: nlme
#> This is mgcv 1.9-1. For overview type 'help("mgcv-package")'.
my_smoother <- function(x, y, is_discrete, column_name, na_ind = NULL) {
if (column_name == "mpg") {
gam(y ~ s(x, k = 3))
} else if (column_name == "carb") {
gam(y ~ s(x, k = 5))
} else if (is_discrete) {
lm(y ~ x)
} else {
stop("Undefined smoother")
}
}
# define appropriate predict function
# - predict.gam returns an array, we need to convert it to vector
# - if-else branch for illustration purposes
my_smoother_predict <- function(smoother, new_x, is_discrete, column_name, na_ind = NULL) {
if (inherits(smoother, "gam")) {
# as.numeric: convert array to vector
as.numeric(predict(smoother, newdata = data.frame(x = new_x)))
} else {
predict(smoother, newdata = data.frame(x = new_x))
}
}
ds <- sculpt_detailed_generic(
rs = rs,
smoother_fit = my_smoother,
smoother_predict = my_smoother_predict
)
class(ds)
#> [1] "detailed" "sculpture" "list"
if (FALSE) {
# see components
g_component(ds)$continuous
}
# another example with constrained gam (cgam) package
if (FALSE) {
library(cgam)
cgam_smoother <- function(x, y, is_discrete, column_name, na_ind = NULL) {
if (column_name == "carb") {
cgam(y ~ s.incr(x, numknots = 3))
} else if (column_name == "mpg") {
cgam(y ~ s.decr(x, numknots = 3))
} else {
cgam(y ~ x)
}
}
cgam_predict <- function(smoother, new_x, is_discrete, column_name, na_ind = NULL) {
predict(smoother, newData = data.frame(x = new_x))$fit
}
ds2 <- sculpt_detailed_generic(
rs = rs,
smoother_fit = cgam_smoother,
smoother_predict = cgam_predict
)
# see components
g_component(ds2)$continuous
}