scimilarity.cell_embedding

class scimilarity.cell_embedding.CellEmbedding(model_path, use_gpu=False)

Bases: object

A class that embeds cell gene expression data using an ML model.

Parameters:
  • model_path (str) – Path to the trained model.

  • use_gpu (bool, default: False) – Whether to run the model on a GPU.

get_embeddings(X, num_cells=-1, buffer_size=10000)

Calculate embeddings for a log-normalized gene expression matrix.

Parameters:
  • X (scipy.sparse.csr_matrix, scipy.sparse.csc_matrix, numpy.ndarray) – Gene expression matrix, aligned to the model's gene space and log-normalized (tp10k).

  • num_cells (int, default: -1) – The number of cells to embed, starting from index 0. A value of -1 will embed all cells.

  • buffer_size (int, default: 10000) – The number of cells to embed in one batch.

Returns:

A 2D numpy array of embeddings [num_cells x latent_space_dimensions].

Return type:

numpy.ndarray

Examples

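>>> from scimilarity.cell_embedding import CellEmbedding
>>> from scimilarity.utils import align_dataset, lognorm_counts
>>> ce = CellEmbedding(model_path="/opt/data/model")
>>> # data is assumed to be an existing AnnData object holding raw counts
>>> data = align_dataset(data, ce.gene_order)
>>> data = lognorm_counts(data)
>>> embeddings = ce.get_embeddings(data.X)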
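The interaction between num_cells and buffer_size can be illustrated with a small, library-independent sketch. The helper batch_bounds below is hypothetical and is not part of scimilarity. It only shows how rows might be split into batches under the documented semantics: -1 selects all cells, and cells are taken from index 0. Capping num_cells at the number of available rows is an assumption made for this sketch.

```python
def batch_bounds(n_rows, num_cells=-1, buffer_size=10000):
    """Return (start, end) row ranges covering the cells to embed.

    Hypothetical helper for illustration only; not scimilarity's code.
    """
    if buffer_size <= 0:
        raise ValueError("buffer_size must be positive")
    # -1 means "all rows"; otherwise cap at the rows available (assumption).
    total = n_rows if num_cells == -1 else min(num_cells, n_rows)
    # Each range is half-open: rows start..end-1 form one batch.
    return [
        (start, min(start + buffer_size, total))
        for start in range(0, total, buffer_size)
    ]


# All 25,000 cells with the default buffer size: two full batches and a remainder.
bounds = batch_bounds(n_rows=25000)

# Only the first 12,000 cells, in batches of at most 5,000.
small = batch_bounds(n_rows=25000, num_cells=12000, buffer_size=5000)
```

Under this sketch, a smaller buffer_size lowers peak memory per batch at the cost of more batches, while the final array would still have one row per embedded cell.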