prepareMiDAS transforms HLA allele calls and KIR calls according to the selected experiments, creating a MiDAS object.

prepareMiDAS(
  hla_calls = NULL,
  kir_calls = NULL,
  colData,
  experiment = c("hla_alleles", "hla_aa", "hla_g_groups", "hla_supertypes",
    "hla_NK_ligands", "kir_genes", "kir_haplotypes", "hla_kir_interactions",
    "hla_divergence", "hla_het", "hla_custom", "kir_custom"),
  placeholder = "term",
  lower_frequency_cutoff = NULL,
  upper_frequency_cutoff = NULL,
  indels = TRUE,
  unkchar = FALSE,
  hla_divergence_aa_selection = "binding_groove",
  hla_het_resolution = 8,
  hla_dictionary = NULL,
  kir_dictionary = NULL
)

Arguments

hla_calls

HLA calls data frame, as returned by readHlaCalls function.

kir_calls

KIR calls data frame, as returned by readKirCalls function.

colData

Data frame holding additional variables such as phenotypic observations or covariates. It must contain an 'ID' column holding sample identifiers corresponding to the identifiers in hla_calls and kir_calls. Importantly, rows of hla_calls and kir_calls without a corresponding phenotype are discarded.

experiment

Character vector indicating the analysis types for which data should be prepared. Valid choices are "hla_alleles", "hla_aa", "hla_g_groups", "hla_supertypes", "hla_NK_ligands", "kir_genes", "kir_haplotypes", "hla_kir_interactions", "hla_divergence", "hla_het", "hla_custom", "kir_custom". See details for further explanations.

placeholder

String giving the name of the dummy variable inserted into colData. This variable can then be used to define the base statistical model used by runMiDAS.

lower_frequency_cutoff

Number giving lower frequency threshold. Numbers greater than 1 are interpreted as the number of feature occurrences, numbers between 0 and 1 as fractions.

upper_frequency_cutoff

Number giving upper frequency threshold. Numbers greater than 1 are interpreted as the number of feature occurrences, numbers between 0 and 1 as fractions.

indels

Logical indicating whether indels should be considered when checking amino acid variability in 'hla_aa' experiment.

unkchar

Logical indicating whether unknown characters in the alignment should be considered when checking amino acid variability in 'hla_aa' experiment.

hla_divergence_aa_selection

String specifying the variable region in the peptide binding groove which should be considered for Grantham distance calculation. Valid choices include: "binding_groove", "B_pocket", "F_pocket". See details for more information.

hla_het_resolution

Number specifying the HLA allele resolution used to calculate heterozygosity in the "hla_het" experiment.

hla_dictionary

Data frame giving HLA allele dictionary used in 'hla_custom' experiment. See hlaToVariable for more details.

kir_dictionary

Data frame giving KIR genes dictionary used in 'kir_custom' experiment. See countsToVariables for more details.
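The convention described for lower_frequency_cutoff and upper_frequency_cutoff can be sketched in base R as follows. This is only an illustration and not the package's implementation: the helper name cutoff_to_count, the toy counts matrix, the use of 2 x number of samples as the denominator for fractions, and the inclusive comparison are all assumptions made for the example.

```r
# Toy counts matrix: rows are samples, columns are features (copies per sample).
counts <- matrix(
  c(1, 0, 2,
    1, 0, 0,
    0, 1, 1,
    2, 0, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("S", 1:4), c("A*01:01", "A*02:01", "A*03:01"))
)

# Hypothetical helper: express a cutoff as a number of occurrences.
# Values below 1 are treated as fractions of the total number of chromosomes
# (assumed here to be 2 * number of samples); values above 1 are taken
# directly as occurrence counts.
cutoff_to_count <- function(cutoff, n_samples) {
  if (cutoff < 1) cutoff * 2 * n_samples else cutoff
}

occurrences <- colSums(counts)                # total copies of each feature
lower <- cutoff_to_count(0.25, nrow(counts))  # fraction converted to a count
upper <- cutoff_to_count(3, nrow(counts))     # already a count

# Keep features whose occurrences lie within [lower, upper]
# (inclusive bounds are an assumption of this sketch).
kept <- names(occurrences)[occurrences >= lower & occurrences <= upper]
print(kept)
```

In this toy data only one feature lies between the two thresholds; the most common and the rarest features are both dropped.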

Value

Object of class MiDAS

Details

experiment specifies the analysis types for which hla_calls and kir_calls should be prepared.

"hla_alleles"

hla_calls are transformed to counts matrix describing number of allele occurrences for each sample. This experiment is used to test associations on HLA alleles level.

"hla_aa"

hla_calls are transformed to a matrix of variable amino acid positions. See hlaToAAVariation for more details. This experiment is used to test associations on amino acid level.

"hla_g_groups"

hla_calls are translated into HLA G groups and transformed to matrix describing number of G group occurrences for each sample. See hlaToVariable for more details. This experiment is used to test associations on HLA G groups level.

"hla_supertypes"

hla_calls are translated into HLA supertypes and transformed to matrix describing number of supertype occurrences for each sample. See hlaToVariable for more details. This experiment is used to test associations on HLA supertypes level.

"hla_NK_ligands"

hla_calls are translated into NK ligands, which include HLA Bw4/Bw6 and HLA C1/C2 groups, and transformed to matrix describing number of their occurrences for each sample. See hlaToVariable for more details. This experiment is used to test associations on HLA NK ligands level.

"kir_genes"

kir_calls are transformed to counts matrix describing number of KIR gene occurrences for each sample. This experiment is used to test associations on KIR genes level.

"hla_kir_interactions"

hla_calls and kir_calls are translated to HLA - KIR interactions as defined in Pende et al., 2019. See getHlaKirInteractions for more details. This experiment is used to test associations on HLA - KIR interactions level.

"hla_divergence"

Grantham distance for class I HLA alleles is calculated based on hla_calls using the original formula by Grantham R., 1974. See hlaCallsGranthamDistance for more details. This experiment is used to test associations on HLA divergence level measured by Grantham distance.

"hla_het"

hla_calls are transformed to heterozygosity status, where 1 designates a heterozygote and 0 a homozygote. Heterozygosity status is calculated only for classical HLA genes (A, B, C, DQA1, DQB1, DRA, DRB1, DPA1, DPB1). This experiment is used to test associations on HLA divergence level measured by heterozygosity.
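To make the "hla_alleles" and "hla_het" transformations more concrete, here is a minimal base R sketch on toy data. It does not call the package and is not its implementation: the data frame layout (an ID column plus A_1 and A_2), the object names, and the comparison of alleles at the resolution given are assumptions for illustration only.

```r
# Toy HLA calls: one row per sample, two allele columns for gene A
# (hypothetical layout used only for this sketch).
hla_calls <- data.frame(
  ID  = c("S1", "S2", "S3"),
  A_1 = c("A*01:01", "A*02:01", "A*01:01"),
  A_2 = c("A*02:01", "A*02:01", "A*03:01"),
  stringsAsFactors = FALSE
)

allele_cols <- c("A_1", "A_2")
alleles <- sort(unique(unlist(hla_calls[allele_cols])))

# "hla_alleles"-style counts: number of copies of each allele per sample.
allele_counts <- t(apply(hla_calls[allele_cols], 1, function(calls) {
  vapply(alleles, function(a) sum(calls == a), numeric(1))
}))
rownames(allele_counts) <- hla_calls$ID

# "hla_het"-style status: 1 if the two alleles differ, 0 if they are identical.
A_het <- as.integer(hla_calls$A_1 != hla_calls$A_2)
names(A_het) <- hla_calls$ID

print(allele_counts)
print(A_het)
```

Each row of the counts matrix sums to 2 because every sample carries two copies of the gene; a homozygous sample shows a single count of 2 and a heterozygosity status of 0.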

Examples

midas <- prepareMiDAS(hla_calls = MiDAS_tut_HLA,
                      kir_calls = MiDAS_tut_KIR,
                      colData = MiDAS_tut_pheno,
                      experiment = "hla_alleles")