SIGnature.models.scfoundation

class SIGnature.models.scfoundation.SCFoundationWrapper(*args, **kwargs)

Bases: Module

A class to load and use the scFoundation (DOI: https://doi.org/10.1038/s41592-024-02305-7) model for embedding and attribution.

Parameters:
  • model_path (str)

  • use_gpu (bool)

calculate_attributions(X, method='ig', batch_size=1, multiply_by_inputs=True, disable_tqdm=False, target_sum=1000.0, npz_path=None)

Calculates gene attributions for the scFoundation model using a specified method.

Parameters:
  • X (torch.Tensor | numpy.ndarray | scipy.sparse.csr_matrix) – The input data matrix (e.g., log-normalized gene expression).

  • method (str) – The attribution method to use. Options are “ig” (Integrated Gradients), “dl” (DeepLift), or “ixg” (Saliency gradients, multiplied by the inputs when multiply_by_inputs is True).

  • batch_size (int) – The number of samples to process in each batch.

  • multiply_by_inputs (bool) – Whether to multiply attributions by input values. For Integrated Gradients and DeepLift, the flag is passed to the Captum constructor; for Saliency, the multiplication is applied manually after the gradients are computed.

  • disable_tqdm (bool) – Whether to disable the progress bar.

  • target_sum (float) – The desired sum for each row after normalization.

  • npz_path (str | None) – Path to save the resulting sparse attribution matrix.
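The effect of multiply_by_inputs can be illustrated without the model itself. The sketch below is a NumPy approximation of Integrated Gradients on a toy function, not the wrapper's or Captum's code. The helper integrated_gradients, the callable grad_fn, the zero baseline, and the toy weights are all assumptions made for illustration; the baseline the wrapper actually uses is not documented here. With multiply_by_inputs=True the path-averaged gradient is scaled by (input - baseline); with False the averaged gradient is returned unscaled.

```python
import numpy as np


def integrated_gradients(grad_fn, x, baseline=None, n_steps=50, multiply_by_inputs=True):
    """Approximate Integrated Gradients with a midpoint Riemann sum.

    grad_fn is a hypothetical callable returning d(output)/d(input) at a point.
    """
    x = np.asarray(x, dtype=float)
    # Assumption: an all-zero baseline when none is given.
    baseline = np.zeros_like(x) if baseline is None else np.asarray(baseline, dtype=float)
    # Midpoints of n_steps equal sub-intervals of [0, 1].
    alphas = (np.arange(n_steps) + 0.5) / n_steps
    # Average the gradient along the straight path from baseline to x.
    avg_grad = np.mean([grad_fn(baseline + a * (x - baseline)) for a in alphas], axis=0)
    # True: scale by (input - baseline). False: return the averaged gradient alone.
    return avg_grad * (x - baseline) if multiply_by_inputs else avg_grad


# Toy model f(v) = sum(w * v**2), whose gradient is 2 * w * v.
w = np.array([1.0, 2.0, 3.0])
x = np.array([1.0, 0.5, 2.0])


def f(v):
    return float(np.sum(w * v**2))


def grad_f(v):
    return 2.0 * w * v


attr_scaled = integrated_gradients(grad_f, x, multiply_by_inputs=True)
attr_raw = integrated_gradients(grad_f, x, multiply_by_inputs=False)
```

In the scaled case the attributions sum to f(x) - f(baseline), the completeness property of Integrated Gradients; the unscaled values do not carry that guarantee.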

Returns:

A scipy.sparse.csr_matrix containing the calculated attributions.

Return type:

csr_matrix
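Because the result is a SciPy CSR matrix, it can be written to disk and read back with SciPy's own sparse I/O. The sketch below assumes the file written to npz_path is in the format produced by scipy.sparse.save_npz, which the source does not state explicitly; the small matrix and the file name are made-up stand-ins for real attributions.

```python
import os
import tempfile

import numpy as np
from scipy import sparse

# Stand-in for a returned attribution matrix (rows: cells, columns: genes).
attributions = sparse.csr_matrix(np.array([[0.0, 0.2, 0.0], [1.5, 0.0, -0.3]]))

with tempfile.TemporaryDirectory() as tmp_dir:
    npz_path = os.path.join(tmp_dir, "attributions.npz")  # hypothetical file name
    sparse.save_npz(npz_path, attributions)  # write the sparse matrix to disk
    reloaded = sparse.load_npz(npz_path)     # read it back in a later session

# The round trip preserves the sparse format, the shape, and the values.
same = np.array_equal(reloaded.toarray(), attributions.toarray())
```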

preprocess_adata(adata, gene_overlap_threshold=500)

Preprocesses an AnnData object for use with the scFoundation model.

Parameters:
  • adata (anndata.AnnData)

  • gene_overlap_threshold (int)

Return type:

anndata.AnnData