
Simulate covariates from a Gaussian mixture model
Source: R/simulate_X_mixture.R
Generates covariates where categorical variables define mixture components and continuous variables are drawn from component-specific multivariate normal distributions. Each combination of categorical levels has its own probability and its own distribution for the continuous covariates.
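The two-step sampling scheme described above can be sketched in base R. This is only an illustration under assumed settings, not the package's implementation: the names `levels_grid`, `comp_prob`, `comp_para`, and `comp` are hypothetical, and the multivariate normal draw uses a Cholesky factor, which may differ from what `simulate_X_mixture()` does internally.

```r
set.seed(1)
n <- 200

# Hypothetical setup: one binary categorical covariate, so two mixture components
levels_grid <- expand.grid(c1 = c(0, 1))
comp_prob <- c(0.4, 0.6)
comp_para <- list(
  list(mean = c(0, 0), sigma = diag(2)),
  list(mean = c(2, 2), sigma = diag(2))
)

# Step 1: draw a component (a combination of categorical levels) for each unit
comp <- sample.int(nrow(levels_grid), size = n, replace = TRUE, prob = comp_prob)

# Step 2: draw the continuous covariates from that component's multivariate normal.
# chol() returns upper-triangular R with t(R) %*% R == sigma, so
# mean + t(R) %*% z has covariance sigma when z is standard normal.
X_cont <- t(vapply(comp, function(k) {
  R <- chol(comp_para[[k]]$sigma)
  as.numeric(comp_para[[k]]$mean + crossprod(R, rnorm(2)))
}, numeric(2)))

# Combine the categorical level and the continuous draws for each unit
X <- data.frame(c1 = levels_grid$c1[comp], X_cont)
```

With these assumed parameters, units with `c1 == 1` should have continuous covariates centered near 2, and units with `c1 == 0` near 0.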
Arguments
- n
Positive integer. Number of units to simulate.
- p_cat
Non-negative integer. Number of categorical covariates.
- p_cont
Non-negative integer. Number of continuous covariates. At least one of `p_cat` or `p_cont` must be positive.
- cat_level_list
List of length `p_cat`. Each element is a vector of possible levels for that categorical variable. The total number of combinations is `prod(lengths(cat_level_list))`.
- cat_comb_prob
Numeric vector of probabilities, one per combination of categorical levels (in the order produced by `expand.grid()`). Must sum to 1.
- cont_para_list
List of parameter lists for the continuous covariates. When `p_cat > 0`, it must have one element per combination of categorical levels; each element is a list with `mean` (numeric vector of length `p_cont`) and `sigma` (`p_cont x p_cont` matrix). When `p_cat == 0`, it must be a single-element list whose element has the same form.
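Because `cat_comb_prob` and `cont_para_list` are both indexed by combination, it can help to print the combinations before filling them in. The sketch below uses base R only and assumes the function applies `expand.grid()` directly to `cat_level_list`; the two-variable `cat_level_list` and the probabilities are made up for illustration.

```r
# Hypothetical inputs: a binary variable and a three-level variable
cat_level_list <- list(c(0, 1), c("a", "b", "c"))

# Total number of combinations: 2 * 3 = 6
n_comb <- prod(lengths(cat_level_list))

# expand.grid() varies the first variable fastest
comb <- expand.grid(cat_level_list, stringsAsFactors = FALSE)
print(comb)
#   Var1 Var2
# 1    0    a
# 2    1    a
# 3    0    b
# 4    1    b
# 5    0    c
# 6    1    c

# Element i of cat_comb_prob (and of cont_para_list) refers to row i of comb
cat_comb_prob <- c(0.1, 0.2, 0.1, 0.2, 0.2, 0.2)
stopifnot(
  length(cat_comb_prob) == n_comb,
  isTRUE(all.equal(sum(cat_comb_prob), 1))
)
```

Comparing with `all.equal()` rather than `==` avoids spurious failures from floating-point rounding when the probabilities are summed.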
Examples
# Continuous only
X <- simulate_X_mixture(
  n = 100, p_cat = 0, p_cont = 2,
  cat_level_list = list(),
  cat_comb_prob = c(),
  cont_para_list = list(list(mean = c(0, 0), sigma = diag(2)))
)
# Mixed categorical and continuous
X <- simulate_X_mixture(
  n = 100, p_cat = 1, p_cont = 2,
  cat_level_list = list(c(0, 1)),
  cat_comb_prob = c(0.4, 0.6),
  cont_para_list = list(
    list(mean = c(0, 0), sigma = diag(2)),
    list(mean = c(2, 2), sigma = diag(2))
  )
)