Installation and Setup#

Installing the SCimilarity API#

The latest API release can be installed quickly using pip in the usual manner:

pip install scimilarity

The SCimilarity API is under activate development. The latest development API can be downloaded from GitHub and installed as follows:

git clone https://github.com/genentech/scimilarity.git
cd scimilarity
pip install -e .

Warning

To enable rapid searches across tens of millions of cells, SCimilarity has very high memory requirements. To make queries, you will need at least 64 GB of system RAM.

Warning

If your environment has sufficient memory but loading the model or making kNN queries crashes, that may be due to older versions of dependencies such as hnswlib or numpy. We recommend using either using the Docker container or Conda environment described below.

Note

A GPU is not necessary for most applications, but model training will require GPU resources.

Downloading the pretrained models#

You can download the following pretrained models for use with SCimilarity from Zenodo: https://zenodo.org/records/10685499

Conda environment setup#

To install the SCimilarity API in a [Conda](https://docs.conda.io) environment we recommend this environment setup:

Download environment file

name: scimilarity
channels:
  - nvidia
  - pytorch
  - bioconda
  - conda-forge
dependencies:
  - python=3.10
  - ipykernel
  - ipython
  - ipywidgets
  - leidenalg
  - pip

Followed by installing the scimilarity package via pip, as above.

Using the SCimilarity Docker container#

A Docker container that includes the SCimilarity API is available from the GitHub Container Registry, which can be pulled via:

docker pull ghcr.io/genentech/scimilarity:latest

Models are not included in the Docker container and must be downloaded separately.

There are four preset bind points in the container:

  • /models

  • /data

  • /workspace

  • /scratch

We require binding /models to your local path storing SCimilarity models, /data to your repository of scRNA-seq data, and /workspace to your notebook path.

You can initiate a Jupyter Notebook session rooted in /workspace using the start-notebook command as follows:

docker run -it --platform linux/amd64 -p 8888:8888 \
    -v /path/to/workspace:/workspace \
    -v /path/to/data:/data \
    -v /path/to/models:/models \
    ghcr.io/genentech/scimilarity:latest start-notebook