SIGnature.SIGnature

class SIGnature.SIGnature.SIGnature(gene_order, model=None, attribution_tiledb_uri=None, use_gpu=False)

Bases: object

A class for working with single-cell gene attributions.

Parameters:
  • gene_order (list) – The gene order for the model.

  • model (nn.Module, default: None) – A PyTorch model.

  • attribution_tiledb_uri (str | None, default: None) – The URI to the attributions TileDB.

  • use_gpu (bool, default: False) – Use GPU instead of CPU.

Examples

>>> sig = SIGnature(model=scim.model, gene_order=scim.gene_order)

calculate_attributions(X, buffer_size=1000, target_sum=1000.0, disable_tqdm=False, npz_path=None)

Calculate gene attributions from a log normalized expression matrix.

Parameters:
  • X (torch.Tensor, numpy.ndarray, scipy.sparse.csr_matrix) – Log normalized expression matrix.

  • buffer_size (int, default: 1000) – Buffer size for batches.

  • target_sum (float, default: 1000) – Target sum for attribution normalization.

  • disable_tqdm (bool, default: False) – Disable the tqdm progress bar.

  • npz_path (Optional[str], default: None) – Filename for storing the attribution matrix.

Returns:

A sparse matrix of normalized gene attributions.

Return type:

scipy.sparse.csr_matrix

Examples

>>> attr = sig.calculate_attributions(X=adata.X, buffer_size=500)
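
The roles of buffer_size and target_sum can be illustrated with a small stand-alone sketch. It assumes that rows (cells) are processed in chunks of buffer_size and that each cell's attributions are rescaled so they sum to target_sum; the library's actual normalization is not specified above, so treat this as one plausible reading. The helpers iter_batches and normalize_rows and the toy values are hypothetical and not part of SIGnature.

```python
from typing import Iterator, List, Tuple


def iter_batches(n_rows: int, buffer_size: int) -> Iterator[Tuple[int, int]]:
    """Yield (start, stop) row ranges covering n_rows in chunks of buffer_size."""
    for start in range(0, n_rows, buffer_size):
        yield start, min(start + buffer_size, n_rows)


def normalize_rows(rows: List[List[float]], target_sum: float = 1000.0) -> List[List[float]]:
    """Scale each row so its entries sum to target_sum.

    Assumption: rows whose total is zero stay all-zero instead of dividing by zero.
    """
    out = []
    for row in rows:
        total = sum(row)
        scale = target_sum / total if total != 0 else 0.0
        out.append([value * scale for value in row])
    return out


# Hypothetical toy attributions for 3 cells x 4 genes (not real model output).
toy = [
    [1.0, 3.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [2.0, 2.0, 4.0, 2.0],
]

# Process the cells two at a time, as a small buffer_size would.
normalized = []
for start, stop in iter_batches(len(toy), buffer_size=2):
    normalized.extend(normalize_rows(toy[start:stop], target_sum=1000.0))
```

A smaller buffer_size lowers peak memory per batch at the cost of more iterations; the normalized result does not depend on it in this sketch.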

check_genes(gene_list, print_missing=True)

Check a list of genes and return the ones usable in a query, i.e. those present in the model’s gene order.

Parameters:
  • gene_list (list) – A list of genes of interest.

  • print_missing (bool, default: True) – Print the genes not in the model’s gene order.

Returns:

A list of usable genes.

Return type:

list

Examples

>>> gene_list = sig.check_genes(gene_list)
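
Conceptually, the check is a membership filter against the model's gene order. The sketch below is a hypothetical stand-alone helper (filter_known_genes is not part of SIGnature) that assumes exact, case-sensitive symbol matching; whether the library also handles aliases or case differences is not stated above.

```python
from typing import List, Sequence


def filter_known_genes(
    gene_list: Sequence[str],
    gene_order: Sequence[str],
    print_missing: bool = True,
) -> List[str]:
    """Return the genes from gene_list that appear in gene_order, preserving input order."""
    known = set(gene_order)  # set lookup keeps the filter linear in len(gene_list)
    usable = [gene for gene in gene_list if gene in known]
    missing = [gene for gene in gene_list if gene not in known]
    if print_missing and missing:
        print("Genes not in gene order:", ", ".join(missing))
    return usable


# Hypothetical gene order and query list, for illustration only.
gene_order = ["CD3E", "CD4", "CD8A", "MS4A1"]
usable = filter_known_genes(["CD4", "FOXP3", "MS4A1"], gene_order)
```

Running the filter before a query avoids passing symbols the model was never trained on.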

create_tiledb(npz_path, batch_size=25000, attribution_tiledb_uri=None, overwrite=False)

Create a sparse TileDB array from a stored attribution matrix.

Parameters:
  • npz_path (str) – Filename for the stored attribution matrix.

  • batch_size (int, default: 25000) – Batch size for the tiles.

  • attribution_tiledb_uri (str | None, default: None) – The URI to the attributions TileDB.

  • overwrite (bool, default: False) – Overwrite the existing TileDB.

Examples

>>> sig.create_tiledb(npz_path="/opt/npz_attribution_matrices/data.npz")

query_attributions(gene_list, cell_indices=None, attribution_tiledb_uri=None, return_aggregate=True, aggregate_type='mean', weights=None)

Get attributions for a list of genes from the sparse TileDB array.

Parameters:
  • gene_list (List[str]) – List of gene symbols.

  • cell_indices (Optional[List[int]], default: None) – List of cell indices.

  • attribution_tiledb_uri (str | None, default: None) – The URI to the attributions TileDB.

  • return_aggregate (bool, default: True) – Return an aggregate of attributions for each cell.

  • aggregate_type (str, default: "mean") – Type of aggregation, one of {"mean", "sum"}.

  • weights (Optional[List[float]], default: None) – Weights for each gene when calculating weighted averages or sums. Must have the same length as gene_list.

Examples

>>> cleaned_genes = sig.check_genes(genes)
>>> sig.query_attributions(gene_list=cleaned_genes, attribution_tiledb_uri=tiledb_path)
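
The aggregate_type and weights options can be pictured with a small per-cell sketch. It assumes that a weighted mean is the weighted sum divided by the sum of the weights, and that omitting weights is equivalent to weighting every gene by 1; the library's exact formula is not given above. The helper aggregate_cell and the toy values are hypothetical and not part of SIGnature.

```python
from typing import Optional, Sequence


def aggregate_cell(
    values: Sequence[float],
    aggregate_type: str = "mean",
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Aggregate one cell's attributions over the queried genes."""
    if not values:
        raise ValueError("values must not be empty")
    if weights is None:
        weights = [1.0] * len(values)  # unweighted case
    if len(weights) != len(values):
        raise ValueError("weights must have the same length as gene_list")
    weighted_sum = sum(w * v for w, v in zip(weights, values))
    if aggregate_type == "sum":
        return weighted_sum
    if aggregate_type == "mean":
        # Assumed definition: weighted average = sum(w * v) / sum(w).
        return weighted_sum / sum(weights)
    raise ValueError('aggregate_type must be "mean" or "sum"')


# Hypothetical attributions for 2 cells over 3 queried genes.
per_cell = [
    [10.0, 20.0, 30.0],
    [0.0, 5.0, 1.0],
]

means = [aggregate_cell(row) for row in per_cell]
sums = [aggregate_cell(row, aggregate_type="sum") for row in per_cell]
weighted_means = [aggregate_cell(row, weights=[2.0, 1.0, 1.0]) for row in per_cell]
```

With return_aggregate=True, each cell is reduced to a single score in this way, so the output has one value per cell instead of one value per cell and gene.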