{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Attribution and Motifs Detection with Decima" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This documentation demonstrates how to use Decima's attribution analysis capabilities to identify important regulatory regions in genomic sequences and discover transcription factor binding motifs within those regions. Attribution analysis helps reveal which parts of the DNA sequence most strongly influence gene expression predictions, while **motif scanning** can identify specific transcription factor binding sites in these regions of interest." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "## CLI API" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's look at a simple example using Decima's CLI API to analyze the SPI1 and BRD3 genes. SPI1 is a key transcription factor in myeloid cell development. We'll examine its regulation across different monocyte and macrophage cell types where it is known to be important." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "execution": { "iopub.execute_input": "2025-11-21T06:24:58.683436Z", "iopub.status.busy": "2025-11-21T06:24:58.683299Z", "iopub.status.idle": "2025-11-21T06:25:56.147216Z", "shell.execute_reply": "2025-11-21T06:25:56.146610Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "/home/celikm5/miniforge3/envs/decima2/lib/python3.11/site-packages/pydantic/_internal/_generate_schema.py:2249: UnsupportedFieldAttributeWarning: The 'repr' attribute with value False was provided to the `Field()` function, which has no effect in the context it was used. 'repr' is field-specific metadata, and can only be attached to a model field using `Annotated` metadata or by assignment. 
This may have happened because an `Annotated` type alias using the `type` statement was used, or if the `Field()` function was attached to a single member of a union type.\r\n", " warnings.warn(\r\n", "/home/celikm5/miniforge3/envs/decima2/lib/python3.11/site-packages/pydantic/_internal/_generate_schema.py:2249: UnsupportedFieldAttributeWarning: The 'frozen' attribute with value True was provided to the `Field()` function, which has no effect in the context it was used. 'frozen' is field-specific metadata, and can only be attached to a model field using `Annotated` metadata or by assignment. This may have happened because an `Annotated` type alias using the `type` statement was used, or if the `Field()` function was attached to a single member of a union type.\r\n", " warnings.warn(\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Usage: decima attributions [OPTIONS]\r\n", "\r\n", " Generate and save attribution analysis results for a gene or a set of\r\n", " sequences and perform seqlet calling on the attributions.\r\n", "\r\n", " Output files:\r\n", "\r\n", " ├── {output_prefix}.attributions.h5 # Raw attribution score matrix\r\n", " per gene.\r\n", "\r\n", " ├── {output_prefix}.attributions.bigwig # Genome browser track of\r\n", " attribution as bigwig file.\r\n", "\r\n", " ├── {output_prefix}.seqlets.bed # List of attribution peaks in\r\n", " BED format.\r\n", "\r\n", " ├── {output_prefix}.motifs.tsv # Detected motifs in peak\r\n", " regions.\r\n", "\r\n", " └── {output_prefix}.warnings.qc.log # QC warnings about prediction\r\n", " reliability.\r\n", "\r\n", " Examples:\r\n", "\r\n", " >>> decima attributions -o output_prefix -g SPI1\r\n", "\r\n", " >>> decima attributions -o output_prefix -g SPI1,CD68 --tasks \"cell_type\r\n", " == 'classical monocyte'\" --device 0\r\n", "\r\n", " >>> decima attributions -o output_prefix --seqs tests/data/seqs.fasta\r\n", " --tasks \"cell_type == 'classical monocyte'\" --device 0\r\n", "\r\n", "Options:\r\n", " 
-o, --output-prefix TEXT Prefix path to the output files [required]\r\n", " -g, --genes TEXT Comma-separated list of gene symbols or IDs\r\n", " to analyze.\r\n", " --seqs TEXT Path to a file containing sequences to\r\n", " analyze\r\n", " --tasks TEXT Query string to filter cell types to analyze\r\n", " attributions for (e.g. 'cell_type ==\r\n", " 'classical monocyte'')\r\n", " --off-tasks TEXT Optional query string to filter cell types\r\n", " to contrast against.\r\n", " --model TEXT Model to use for attribution analysis either\r\n", " replicate number or path to the model.\r\n", " [default: ensemble]\r\n", " --metadata TEXT Path to the metadata anndata file or name of\r\n", " the model. If not provided, the compabilite\r\n", " metadata for the model will be used.\r\n", " --method TEXT Method to use for attribution analysis.\r\n", " --transform [specificity|aggregate]\r\n", " Transform to use for attribution analysis.\r\n", " --num-workers INTEGER Number of workers for attribution analysis.\r\n", " --tss-distance INTEGER TSS distance for attribution analysis.\r\n", " --batch-size INTEGER Batch size for attribution analysis.\r\n", " --top-n-markers INTEGER Top n markers to predict. If not provided,\r\n", " all markers will be predicted.\r\n", " --threshold FLOAT Threshold for attribution analysis.\r\n", " --min-seqlet-len INTEGER Minimum length for seqlet calling.\r\n", " --max-seqlet-len INTEGER Maximum length for seqlet calling.\r\n", " --additional-flanks INTEGER Additional flanks for seqlet calling.\r\n", " --pattern-type [both|pos|neg] Type of pattern to call.\r\n", " --meme-motif-db TEXT Path to the MEME motif database. [default:\r\n", " hocomoco_v13]\r\n", " --device TEXT Device to use for attribution analysis.\r\n", " --genome TEXT Genome name or path to the genome fasta\r\n", " file. [default: hg38]\r\n", " --help Show this message and exit.\r\n", "\u001b[0m" ] } ], "source": [ "! 
decima attributions --help" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This decima command computes attributions for the selected genes: `--genes \"SPI1,BRD3\"` restricts the analysis to SPI1 and BRD3; `--tasks \"cell_type == 'classical monocyte'\"` restricts it to classical monocytes; and `--output-prefix output_classical_monoctypes/` sets the prefix path for the output files. You can also pass `--off-tasks`, a query selecting the cell types used as the contrast group when analyzing cell type specificity, that is, the cell types you want to compare against. If you do not pass the `--tasks` argument, all available cells will be used for attribution calculation." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "execution": { "iopub.execute_input": "2025-11-21T06:25:56.149231Z", "iopub.status.busy": "2025-11-21T06:25:56.149075Z", "iopub.status.idle": "2025-11-21T06:26:54.297385Z", "shell.execute_reply": "2025-11-21T06:26:54.296708Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "/home/celikm5/miniforge3/envs/decima2/lib/python3.11/site-packages/pydantic/_internal/_generate_schema.py:2249: UnsupportedFieldAttributeWarning: The 'repr' attribute with value False was provided to the `Field()` function, which has no effect in the context it was used. 'repr' is field-specific metadata, and can only be attached to a model field using `Annotated` metadata or by assignment. This may have happened because an `Annotated` type alias using the `type` statement was used, or if the `Field()` function was attached to a single member of a union type.\r\n", " warnings.warn(\r\n", "/home/celikm5/miniforge3/envs/decima2/lib/python3.11/site-packages/pydantic/_internal/_generate_schema.py:2249: UnsupportedFieldAttributeWarning: The 'frozen' attribute with value True was provided to the `Field()` function, which has no effect in the context it was used. 
'frozen' is field-specific metadata, and can only be attached to a model field using `Annotated` metadata or by assignment. This may have happened because an `Annotated` type alias using the `type` statement was used, or if the `Field()` function was attached to a single member of a union type.\r\n", " warnings.warn(\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "decima - INFO - Using device: 0\r\n", "decima - INFO - Loading model v1_rep0 and metadata to compute attributions...\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[34m\u001b[1mwandb\u001b[0m: Currently logged in as: \u001b[33mmhcelik\u001b[0m (\u001b[33mmhcw\u001b[0m) to \u001b[32mhttps://api.wandb.ai\u001b[0m. Use \u001b[1m`wandb login --relogin`\u001b[0m to force relogin\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[34m\u001b[1mwandb\u001b[0m: Downloading large artifact 'rep0:latest', 720.03MB. 1 files...\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[34m\u001b[1mwandb\u001b[0m: 1 of 1 files downloaded. \r\n", "Done. 00:00:01.6 (445.8MB/s)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[34m\u001b[1mwandb\u001b[0m: Downloading large artifact 'metadata:latest', 3122.32MB. 1 files...\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[34m\u001b[1mwandb\u001b[0m: 1 of 1 files downloaded. \r\n", "Done. 00:00:07.1 (437.5MB/s)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "/home/celikm5/Projects/decima/src/decima/interpret/attributer.py:66: UserWarning: `off_tasks` is not provided. 
Using all other tasks as off_tasks.\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\r", "Computing attributions...: 0%| | 0/2 [00:00" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from IPython.display import Image\n", "\n", "Image(\"example/output_classical_monoctypes_plots/SPI1_seqlogos/SPI1@267.png\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Querying Cells" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To obtain attributions, cells of interest must be selected using the query API. We support Pandas' query API functionality on the cell metadata DataFrame. Here are examples of how to write queries:" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "execution": { "iopub.execute_input": "2025-11-21T06:28:53.788330Z", "iopub.status.busy": "2025-11-21T06:28:53.788190Z", "iopub.status.idle": "2025-11-21T06:29:05.164957Z", "shell.execute_reply": "2025-11-21T06:29:05.164317Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "/home/celikm5/miniforge3/envs/decima2/lib/python3.11/site-packages/pydantic/_internal/_generate_schema.py:2249: UnsupportedFieldAttributeWarning: The 'repr' attribute with value False was provided to the `Field()` function, which has no effect in the context it was used. 'repr' is field-specific metadata, and can only be attached to a model field using `Annotated` metadata or by assignment. This may have happened because an `Annotated` type alias using the `type` statement was used, or if the `Field()` function was attached to a single member of a union type.\r\n", " warnings.warn(\r\n", "/home/celikm5/miniforge3/envs/decima2/lib/python3.11/site-packages/pydantic/_internal/_generate_schema.py:2249: UnsupportedFieldAttributeWarning: The 'frozen' attribute with value True was provided to the `Field()` function, which has no effect in the context it was used. 
'frozen' is field-specific metadata, and can only be attached to a model field using `Annotated` metadata or by assignment. This may have happened because an `Annotated` type alias using the `type` statement was used, or if the `Field()` function was attached to a single member of a union type.\r\n", " warnings.warn(\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Usage: decima query-cell [OPTIONS] [QUERY]\r\n", "\r\n", " Query a cell using query string\r\n", "\r\n", " Examples:\r\n", "\r\n", " >>> decima query-cell 'cell_type == \"classical monocyte\"' ...\r\n", "\r\n", " >>> decima query-cell 'cell_type == \"classical monocyte\" and disease ==\r\n", " \"healthy\" and tissue == \"blood\"' ...\r\n", "\r\n", " >>> decima query-cell 'cell_type.str.contains(\"monocyte\") and disease ==\r\n", " \"healthy\"' ...\r\n", "\r\n", "Options:\r\n", " --metadata TEXT Path to the metadata anndata file or name of the model.\r\n", " Default: ensemble.\r\n", " --help Show this message and exit.\r\n", "\u001b[0m" ] } ], "source": [ "! decima query-cell --help" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Query cells of type \"classical monocyte\" using Pandas query syntax: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.query.html" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "execution": { "iopub.execute_input": "2025-11-21T06:29:05.166830Z", "iopub.status.busy": "2025-11-21T06:29:05.166668Z", "iopub.status.idle": "2025-11-21T06:29:20.089279Z", "shell.execute_reply": "2025-11-21T06:29:20.088698Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "/home/celikm5/miniforge3/envs/decima2/lib/python3.11/site-packages/pydantic/_internal/_generate_schema.py:2249: UnsupportedFieldAttributeWarning: The 'repr' attribute with value False was provided to the `Field()` function, which has no effect in the context it was used. 
'repr' is field-specific metadata, and can only be attached to a model field using `Annotated` metadata or by assignment. This may have happened because an `Annotated` type alias using the `type` statement was used, or if the `Field()` function was attached to a single member of a union type.\r\n", " warnings.warn(\r\n", "/home/celikm5/miniforge3/envs/decima2/lib/python3.11/site-packages/pydantic/_internal/_generate_schema.py:2249: UnsupportedFieldAttributeWarning: The 'frozen' attribute with value True was provided to the `Field()` function, which has no effect in the context it was used. 'frozen' is field-specific metadata, and can only be attached to a model field using `Annotated` metadata or by assignment. This may have happened because an `Annotated` type alias using the `type` statement was used, or if the `Field()` function was attached to a single member of a union type.\r\n", " warnings.warn(\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[34m\u001b[1mwandb\u001b[0m: Currently logged in as: \u001b[33mmhcelik\u001b[0m (\u001b[33mmhcw\u001b[0m) to \u001b[32mhttps://api.wandb.ai\u001b[0m. Use \u001b[1m`wandb login --relogin`\u001b[0m to force relogin\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[34m\u001b[1mwandb\u001b[0m: Downloading large artifact 'metadata:latest', 3122.32MB. 1 files...\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[34m\u001b[1mwandb\u001b[0m: 1 of 1 files downloaded. \r\n", "Done. 
00:00:01.8 (1700.8MB/s)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " cell_type tissue organ disease study dataset region subregion celltype_coarse n_cells total_counts n_genes size_factor train_pearson val_pearson test_pearson\r\n", "agg_4705 classical monocyte alveolar system lung COVID-19 GSE155249 scimilarity nan nan 7244 26544273.0 15325 34749.092791034054 0.946616874183219 0.8437000068912937 0.8506571540216992\r\n", "agg_4706 classical monocyte alveolar system lung healthy GSE155249 scimilarity nan nan 72 218105.0 9142 30484.31888978114 0.9102228263646758 0.8083487523192785 0.8047828694155461\r\n", "agg_4707 classical monocyte ampulla of uterine tube fallopian tube healthy fc77d2ae-247d-44d7-aa24-3f4859254c2c scimilarity nan nan 78 550950.0 9639 30719.377971431015 0.9077670011915634 0.8045070167513724 0.7896845423359651\r\n", "agg_4708 classical monocyte aorta vasculature Abdominal Aortic Aneurysm GSE166676 scimilarity nan nan 432 1091075.0 11192 32981.443348717905 0.9389265854768138 0.8357299205241656 0.830575965756882\r\n", "agg_4709 classical monocyte aorta vasculature healthy GSE166676 scimilarity nan nan 25 162858.0 8859 31216.275954364824 0.8819013257206973 0.7821403055329706 0.7646999711802146\r\n", "agg_4710 classical monocyte apex of heart heart healthy b52eb423-5d0d-4645-b217-e1c6d38b2e72 scimilarity nan nan 397 1226515.0 12369 32022.563851814968 0.9469178617442242 0.8326145310572417 0.8365506153530168\r\n", "agg_4711 classical monocyte blood blood COVID-19 03f821b4-87be-4ff4-b65a-b5fc00061da7 scimilarity nan nan 17462 78882609.0 15711 33080.17541357136 0.9536210517623883 0.8539379752673611 0.8485800714004562\r\n", "agg_4712 classical monocyte blood blood COVID-19 7d7cabfd-1d1f-40af-96b7-26a0825a306d scimilarity nan nan 141914 659177004.0 15175 32282.29230923367 0.9575085257562032 0.8570056960758388 0.8507770895392532\r\n", "agg_4713 classical monocyte blood blood COVID-19 GSE154567 scimilarity nan nan 8613 40239000.0 16023 
34450.628692057784 0.9605287026438375 0.8551842525202262 0.8491670381176235\r\n", "agg_4714 classical monocyte blood blood COVID-19 GSE158034 scimilarity nan nan 35 91390.0 7372 27446.425385592618 0.8476138307738649 0.7606096001026369 0.7256993661246048\r\n", "agg_4715 classical monocyte blood blood COVID-19 GSE161918 scimilarity nan nan 163244 1023475761.0 15929 31151.84947891148 0.9361102289470363 0.8261601916328626 0.8168421752771801\r\n", "agg_4716 classical monocyte blood blood COVID-19 GSE163668 scimilarity nan nan 8399 55036800.0 15792 33644.10626885235 0.9571638472088531 0.8529462339847145 0.8506441761860545\r\n", "agg_4717 classical monocyte blood blood COVID-19 GSE166992 scimilarity nan nan 2238 12283507.0 14186 32596.636302802952 0.9567710210039843 0.8531485173368566 0.8416173697367906\r\n", "agg_4718 classical monocyte blood blood COVID-19 ddfad306-714d-4cc0-9985-d9072820c530 scimilarity nan nan 61002 230056884.0 16484 32520.14418628346 0.9487335053479237 0.8533686239486711 0.8454541123444707\r\n", "agg_4719 classical monocyte blood blood COVID-19 eb735cc9-d0a7-48fa-b255-db726bf365af scimilarity nan nan 19777 105875381.0 15812 32330.088619084574 0.9558882745902155 0.8545238316898663 0.8468877639468763\r\n", "agg_4720 classical monocyte blood blood HIV enteropathy GSE157829 scimilarity nan nan 491 1449812.0 12290 33110.90004135926 0.9412108394642186 0.8352699509238034 0.8345507070277177\r\n", "agg_4721 classical monocyte blood blood Myelofibrosis GSE117824 scimilarity nan nan 357 1492491.0 11548 32726.985198452294 0.9446223529088382 0.8417521390049872 0.8328218073658378\r\n", "agg_4722 classical monocyte blood blood NA GSE132950 scimilarity nan nan 146 784054.0 10913 30417.15641845661 0.9276395863920666 0.8264978172767997 0.8176327551177259\r\n", "agg_4723 classical monocyte blood blood NA GSE135325 scimilarity nan nan 232 633533.0 11129 31159.105128910356 0.9369963391148282 0.8254811186623798 0.8207578599532835\r\n", "agg_4724 classical monocyte blood 
blood NA GSE150233 scimilarity nan nan 1141 2453545.0 12228 32204.245569759012 0.9354773292749718 0.8333534679658088 0.8202743631285762\r\n", "agg_4725 classical monocyte blood blood NA GSE151310 scimilarity nan nan 48 151358.0 8028 27001.118740317568 0.8873812787091045 0.7886356061906991 0.766461694552445\r\n", "agg_4726 classical monocyte blood blood NA GSE164378 scimilarity nan nan 54305 476237982.0 17463 34023.11682209347 0.9636663701487779 0.856267291847072 0.8496477594095655\r\n", "agg_4727 classical monocyte blood blood NA GSE164402 scimilarity nan nan 6577 33889420.0 14992 33855.14311643263 0.9502216042319906 0.846017695872854 0.8447747394204608\r\n", "agg_4728 classical monocyte blood blood Sezary's disease GSE122703 scimilarity nan nan 35 148650.0 8487 29592.979037498706 0.8928094999389883 0.7911806688728295 0.7911936593448785\r\n", "agg_4729 classical monocyte blood blood dengue disease GSE145307 scimilarity nan nan 785 7639702.0 13722 33610.52078618725 0.9561427618691068 0.8544883780028308 0.8514781068765508\r\n", "agg_4730 classical monocyte blood blood dengue disease GSE154386 scimilarity nan nan 19173 143929741.0 16877 34242.50262506596 0.9586193824399223 0.8509705295166231 0.8546685528097621\r\n", "agg_4731 classical monocyte blood blood drug hypersensitivity syndrome GSE132802 scimilarity nan nan 1269 7314697.0 13270 32574.34811388645 0.9570929839341253 0.8466339050741839 0.8442788242172967\r\n", "agg_4732 classical monocyte blood blood fibrosis GSE136103 scimilarity nan nan 1774 5003888.0 13389 31155.271000486402 0.9562933985421416 0.8435982250231042 0.8386367834560556\r\n", "agg_4733 classical monocyte blood blood healthy 03f821b4-87be-4ff4-b65a-b5fc00061da7 scimilarity nan nan 32464 109280914.0 16158 33646.02843110038 0.9568728712803031 0.8545324533535094 0.8487445540580735\r\n", "agg_4734 classical monocyte blood blood healthy 436154da-bcf1-4130-9c8b-120ff9a888f2 scimilarity nan nan 76800 206490628.0 16683 30736.453324546856 0.955313650235467 
0.8494127267867799 0.84210567312908\r\n", "agg_4735 classical monocyte blood blood healthy 5d445965-6f1a-4b68-ba3a-b8f765155d3a scimilarity nan nan 1044 3638976.0 12306 30542.8977364245 0.9465384578921433 0.851991795617166 0.8308191715256182\r\n", "agg_4736 classical monocyte blood blood healthy DS000010023 scimilarity nan nan 243 362606.0 8414 25201.261024165953 0.8865446516098515 0.7621391376670988 0.7681818769796727\r\n", "agg_4737 classical monocyte blood blood healthy GSE122703 scimilarity nan nan 18 83417.0 7546 28194.725173315186 0.859225612396475 0.7640890699253299 0.7443681539461423\r\n", "agg_4738 classical monocyte blood blood healthy GSE130117 scimilarity nan nan 2017 7130588.0 13535 33078.53692191542 0.9553673450160365 0.851402109626239 0.8385936100758409\r\n", "agg_4739 classical monocyte blood blood healthy GSE132802 scimilarity nan nan 1601 9955248.0 13132 32063.630951743195 0.9478882791739611 0.8391025143866828 0.8303465877530952\r\n", "agg_4740 classical monocyte blood blood healthy GSE139324 scimilarity nan nan 2333 8331045.0 13985 31135.881287246768 0.9608208780142045 0.8473885992448625 0.8432790193723467\r\n", "agg_4741 classical monocyte blood blood healthy GSE145809 scimilarity nan nan 69 245221.0 8962 29135.67197629852 0.8825701041728526 0.7811799267734735 0.7818647625179129\r\n", "agg_4742 classical monocyte blood blood healthy GSE149313 scimilarity nan nan 2420 6974751.0 13143 29560.496854576566 0.9574598613513423 0.8505290963248237 0.8379199735887167\r\n", "agg_4743 classical monocyte blood blood healthy GSE153421 scimilarity nan nan 3691 15561725.0 14569 34377.465875728165 0.9636686704566925 0.8576434473562725 0.8511814190737197\r\n", "agg_4744 classical monocyte blood blood healthy GSE156989 scimilarity nan nan 13554 160011485.0 16915 34135.439844737564 0.9640667421350761 0.8577967800377495 0.8517975138366085\r\n", "agg_4745 classical monocyte blood blood healthy GSE157829 scimilarity nan nan 1619 6957811.0 13507 30199.39288988673 
0.9484019976492215 0.8436979316400604 0.8347196616710685\r\n", "agg_4746 classical monocyte blood blood healthy GSE159113 scimilarity nan nan 1025 6298250.0 12083 27477.50809897617 0.9078020151513733 0.8121457150205226 0.7980372877810575\r\n", "agg_4747 classical monocyte blood blood healthy GSE161329 scimilarity nan nan 5654 25653579.0 14349 28848.0539929647 0.9549801428956252 0.8450430950674043 0.8406188789518544\r\n", "agg_4748 classical monocyte blood blood healthy GSE161738 scimilarity nan nan 2676 13801473.0 12825 33337.477050230416 0.9541962906717452 0.8512846409758499 0.8485408028961247\r\n", "agg_4749 classical monocyte blood blood healthy GSE163668 scimilarity nan nan 2644 10486314.0 14049 33786.96584264489 0.9597801578342394 0.8560775485935677 0.8512149509551471\r\n", "agg_4750 classical monocyte blood blood healthy GSE166992 scimilarity nan nan 7501 28033216.0 15079 33455.367364577316 0.9622273594219685 0.8558958139235102 0.8495571689751152\r\n", "agg_4751 classical monocyte blood blood healthy GSE167363 scimilarity nan nan 3135 14722635.0 14375 29977.24002819913 0.942417448875388 0.8368071803109702 0.8258536430202982\r\n", "agg_4752 classical monocyte blood blood healthy GSE168710 scimilarity nan nan 16484 104881872.0 16223 34107.336261357574 0.9398282119039322 0.8424821834537695 0.8372971004604842\r\n", "agg_4753 classical monocyte blood blood healthy GSE168732 scimilarity nan nan 770 2548822.0 12508 33411.30103713399 0.9552513581030765 0.8508279875038706 0.847461536110767\r\n", "agg_4754 classical monocyte blood blood healthy b0cf0afa-ec40-4d65-b570-ed4ceacc6813 scimilarity nan nan 40975 300555227.0 15784 35938.85772500803 0.9622425892039956 0.853424173800979 0.8508714303589978\r\n", "agg_4755 classical monocyte blood blood healthy ddfad306-714d-4cc0-9985-d9072820c530 scimilarity nan nan 8827 36073928.0 15131 33208.591584008376 0.9546118779961532 0.8543086616569785 0.8462739374830107\r\n", "agg_4756 classical monocyte blood blood intracranial 
hypotension GSE138266 scimilarity nan nan 2503 9675804.0 14485 30160.767605621222 0.9452052724479383 0.8423537848756032 0.8326629487875993\r\n", "agg_4757 classical monocyte blood blood mucocutaneous lymph node syndrome GSE168732 scimilarity nan nan 5745 25930751.0 14822 33366.18751424575 0.9564515409367231 0.8556431530577528 0.8540185868162636\r\n", "agg_4758 classical monocyte blood blood multiple sclerosis GSE138266 scimilarity nan nan 3988 13926825.0 14991 31442.03464388843 0.9522779953120408 0.847799219646348 0.8382058078578654\r\n", "agg_4759 classical monocyte blood blood non-alcoholic fatty liver disease GSE136103 scimilarity nan nan 8306 29424841.0 15410 32004.200489375227 0.9619190264492873 0.8478709124980346 0.8451242436776344\r\n", "agg_4760 classical monocyte blood blood rheumatoid arthritis GSE159117 scimilarity nan nan 834 4637566.0 12079 31364.847230552205 0.9356058277176598 0.8232134520999813 0.8176333921128414\r\n", "agg_4761 classical monocyte blood blood septic shock GSE167363 scimilarity nan nan 3860 51041813.0 15830 31688.79561595612 0.948652055959824 0.8541736693569211 0.8427375237296424\r\n", "agg_4762 classical monocyte blood blood systemic lupus erythematosus 436154da-bcf1-4130-9c8b-120ff9a888f2 scimilarity nan nan 200468 516575809.0 16896 30011.373010792136 0.9562030923644677 0.8480520393236465 0.844052374052952\r\n", "agg_4763 classical monocyte blood blood systemic lupus erythematosus GSE142016 scimilarity nan nan 8268 22146620.0 14873 30889.72528098081 0.9588937962174496 0.8480448150326806 0.8395981302528143\r\n", "agg_4764 classical monocyte blood blood systemic lupus erythematosus GSE153765 scimilarity nan nan 42 109982.0 7500 27335.812367044335 0.8566470004710053 0.7665719714665945 0.7374607536624445\r\n", "agg_4765 classical monocyte blood blood systemic lupus erythematosus GSE156989 scimilarity nan nan 30367 310637290.0 17082 33485.308563356346 0.9623402060903008 0.8532487075078466 0.8473526649094757\r\n", "agg_4766 classical 
monocyte blood blood thrombocytopenia GSE149313 scimilarity nan nan 2724 15059814.0 14328 30599.80301260898 0.9543473550386421 0.8520722945129096 0.8417995728182829\r\n", "agg_4767 classical monocyte bone bone Langerhans Cell Histiocytosis GSE133704 scimilarity nan nan 439 1404680.0 11388 30817.807833507268 0.9358504157466769 0.830348562008033 0.826566269904344\r\n", "agg_4769 classical monocyte bone marrow bone marrow NA GSE162692 scimilarity nan nan 1234 4757721.0 13466 31707.380952189662 0.953852620063789 0.8503857428588029 0.8377674131460707\r\n", "agg_4770 classical monocyte bone marrow bone marrow essential thrombocythemia GSE117824 scimilarity nan nan 1649 7825780.0 13487 32454.468003620656 0.9503614540027875 0.8479601234457582 0.838408163377457\r\n", "agg_4772 classical monocyte bone marrow bone marrow healthy GSE132509 scimilarity nan nan 610 2315570.0 12950 31768.06513427212 0.95159369508558 0.8517118261701931 0.836658919433696\r\n", "agg_4773 classical monocyte bone marrow bone marrow healthy GSE154109 scimilarity nan nan 531 1431388.0 11793 31377.450948003392 0.9490546933955852 0.8431566630120637 0.8370883160295727\r\n", "agg_4774 classical monocyte bone marrow bone marrow healthy GSE163278 scimilarity nan nan 1119 3970394.0 13361 32081.93302956569 0.9620394897163868 0.8531148861215617 0.8426785397396367\r\n", "agg_4775 classical monocyte bone marrow bone marrow healthy e5f58829-1a66-40b5-a624-9046778e74f5 scimilarity nan nan 151 8025584.0 12883 29444.78352863075 0.8440608178227607 0.740574910750325 0.7417328577454956\r\n", "agg_4776 classical monocyte bone marrow bone marrow monoclonal gammopathy GSE163278 scimilarity nan nan 1010 3124102.0 12959 30344.72094719757 0.9581906874137958 0.8503948562041261 0.8391672246132192\r\n", "agg_4777 classical monocyte breast breast healthy GSE164898 scimilarity nan nan 136 641471.0 12971 34463.52724138501 0.9163324788498406 0.8116274576633968 0.7978555908123931\r\n", "agg_4778 classical monocyte breast breast 
healthy c9706a92-0e5f-46c1-96d8-20e42467f287 scimilarity nan nan 98 1444245.0 13491 30678.263421880285 0.9165520953567395 0.8162053142576849 0.7994301225229256\r\n", "agg_4779 classical monocyte bronchus airway COVID-19 03f821b4-87be-4ff4-b65a-b5fc00061da7 scimilarity nan nan 104 270108.0 8816 27933.501359354133 0.8884582873892427 0.7928474093439662 0.786626157931558\r\n", "agg_4780 classical monocyte bronchus airway COVID-19 GSE168215 scimilarity nan nan 90 217928.0 8444 27417.845704249117 0.880738805029126 0.7855823736726739 0.7798557380738542\r\n", "agg_4782 classical monocyte bronchus airway healthy GSE158127 scimilarity nan nan 158 1158198.0 12643 34764.50196701077 0.9364512338084163 0.8259291909369686 0.8266638555276521\r\n", "agg_4783 classical monocyte cardiac muscle of left ventricle heart healthy GSE156703 scimilarity nan nan 13 116181.0 9463 35695.66320276271 0.8542740960069863 0.7515621053395214 0.7561639038477878\r\n", "agg_4784 classical monocyte carotid artery segment vasculature atherosclerosis GSE155512 scimilarity nan nan 58 515211.0 10839 32837.84505237503 0.9343565353773022 0.8190650931322585 0.8202426969221358\r\n", "agg_4785 classical monocyte caudate lobe of liver liver healthy 44531dd9-1388-4416-a117-af0a99de2294 scimilarity nan nan 238 730016.0 11505 31342.386314731422 0.9217674983890346 0.8140551552218395 0.8040417787954989\r\n", "agg_4786 classical monocyte cortex of kidney kidney healthy 120e86b4-1195-48c5-845b-b98054105eec scimilarity nan nan 79 323010.0 10939 32378.76683324232 0.9028856137251035 0.7978822439778066 0.7839454035009307\r\n", "agg_4787 classical monocyte cortex of kidney kidney healthy a98b828a-622a-483a-80e0-15703678befd scimilarity nan nan 91 477355.0 10898 32358.068865763344 0.9328436291917394 0.8237810319569842 0.8195391931798526\r\n", "agg_4789 classical monocyte digestive tract gut healthy DS000011665 scimilarity nan nan 347 1679116.0 12155 33347.55517047197 0.9422556928648441 0.8417267634096297 
0.84018452536733\r\n", "agg_4790 classical monocyte exocrine pancreas pancreas healthy e5f58829-1a66-40b5-a624-9046778e74f5 scimilarity nan nan 821 7824069.0 14847 36135.64587109593 0.9493709998172055 0.8440837716099078 0.8410246939313819\r\n", "agg_4791 classical monocyte fallopian tube fallopian tube healthy fc77d2ae-247d-44d7-aa24-3f4859254c2c scimilarity nan nan 131 734093.0 11240 33115.640434504094 0.9359103734457376 0.8339026306142181 0.8225901799509813\r\n", "agg_4792 classical monocyte fimbria of uterine tube fallopian tube healthy fc77d2ae-247d-44d7-aa24-3f4859254c2c scimilarity nan nan 34 209362.0 7663 27635.733860508328 0.8560416684135254 0.7382997749370328 0.7459235366949488\r\n", "agg_4794 classical monocyte gingiva mouth periodontitis GSE152042 scimilarity nan nan 198 879477.0 11312 32107.813302914532 0.9416262876541264 0.8333723697695014 0.8279530215775117\r\n", "agg_4795 classical monocyte head of femur bone healthy GSE169396 scimilarity nan nan 450 3669304.0 13216 33082.323604222154 0.9529417082022753 0.8522346343107771 0.8359032081996703\r\n", "agg_4797 classical monocyte heart left ventricle heart NA ENCODE scimilarity nan nan 50 138407.30523254164 11428 41105.63651890687 0.8614790128015358 0.7843765107548858 0.7889308929671582\r\n", "agg_4798 classical monocyte heart left ventricle heart healthy b52eb423-5d0d-4645-b217-e1c6d38b2e72 scimilarity nan nan 192 585001.0 11159 31422.874036870588 0.9363985001217598 0.8226438123601741 0.8283173244446851\r\n", "agg_4799 classical monocyte heart right ventricle heart healthy b52eb423-5d0d-4645-b217-e1c6d38b2e72 scimilarity nan nan 316 936263.0 11904 32002.67227624691 0.9425900348990802 0.8306128730459813 0.8308116461021977\r\n", "agg_4800 classical monocyte ileum gut Crohn's disease 17481d16-ee44-49e5-bcf0-28c0780d8c4a scimilarity nan nan 76 311515.0 9984 29687.611679190355 0.9103310804624617 0.8063080284939284 0.7916226068478351\r\n", "agg_4801 classical monocyte ileum gut Crohn's disease DS000011665 
scimilarity nan nan 119 298206.0 8021 26013.286557459236 0.880272438867354 0.7572099232100128 0.7459713937143965\r\n", "agg_4802 classical monocyte inferior nasal concha bone chronic rhinosinusitis with nasal polyps GSE156285 scimilarity nan nan 241 1048981.0 12463 35082.083353928334 0.9475981193848912 0.8330375982148592 0.833114773003936\r\n", "agg_4803 classical monocyte interventricular septum heart healthy b52eb423-5d0d-4645-b217-e1c6d38b2e72 scimilarity nan nan 442 1322725.0 12418 32235.5434681197 0.94751399102473 0.8340623939411483 0.8353858365852226\r\n", "agg_4804 classical monocyte isthmus of fallopian tube fallopian tube healthy fc77d2ae-247d-44d7-aa24-3f4859254c2c scimilarity nan nan 62 318330.0 8642 29768.198512126502 0.8846027131590791 0.784220371073871 0.7739277341448829\r\n", "agg_4805 classical monocyte kidney kidney NA GSE145927 scimilarity nan nan 1789 6957949.0 15145 34853.400215205314 0.9586386429540185 0.8579622487738222 0.8524123833700197\r\n", "agg_4806 classical monocyte kidney kidney acute kidney failure bcb61471-2a44-4d00-a0af-ff085512674c scimilarity nan nan 587 1589224.0 12335 32471.78147434854 0.9513927341618255 0.84278469020191 0.8402011848101985\r\n", "agg_4807 classical monocyte kidney kidney chronic kidney disease bcb61471-2a44-4d00-a0af-ff085512674c scimilarity nan nan 134 440788.0 10831 32410.84600407974 0.9323603662663799 0.8153954443953603 0.8190416636069936\r\n", "agg_4808 classical monocyte kidney kidney healthy 120e86b4-1195-48c5-845b-b98054105eec scimilarity nan nan 762 4034828.0 14946 34015.00816823295 0.9520640694425091 0.848205266473767 0.836085027208869\r\n", "agg_4809 classical monocyte kidney kidney healthy DS000010415 scimilarity nan nan 55 127079.0 8055 27206.419037355434 0.8216238135756493 0.7570847030479543 0.72524174726152\r\n", "agg_4810 classical monocyte kidney kidney healthy GSE140989 scimilarity nan nan 174 563438.0 11016 29887.593299155575 0.914390252807459 0.8069762795735104 0.8086759072557722\r\n", 
"agg_4811 classical monocyte left cardiac atrium heart NA ENCODE scimilarity nan nan 59 225070.96128814947 12831 43727.214042795575 0.8938217272345973 0.8048974803645641 0.8168195119621123\r\n", "agg_4812 classical monocyte left cardiac atrium heart healthy b52eb423-5d0d-4645-b217-e1c6d38b2e72 scimilarity nan nan 450 1446734.0 12669 32433.20393285361 0.9467532324494885 0.8378571513042986 0.8360845068688169\r\n", "agg_4813 classical monocyte left lung lung NA ENCODE scimilarity nan nan 16 40636.89964582212 7786 32627.503591080356 0.7956982592153874 0.7257891637293196 0.6945729831399207\r\n", "agg_4814 classical monocyte liver liver Alagille syndrome GSE163650 scimilarity nan nan 92 188745.0 7678 24676.271768005652 0.8778490663838523 0.7727787177978541 0.7475297818459176\r\n", "agg_4815 classical monocyte liver liver Biliary atresia GSE163650 scimilarity nan nan 367 1615410.0 11423 31280.031237098276 0.9308940943983487 0.8275363405345841 0.8098394949074608\r\n", "agg_4816 classical monocyte liver liver fibrosis GSE136103 scimilarity nan nan 1053 5824229.0 14559 31950.801484792315 0.9632028296773183 0.8486875622591808 0.8450295232730788\r\n", "agg_4817 classical monocyte liver liver healthy GSE136103 scimilarity nan nan 2036 7668787.0 14818 32446.71686472893 0.9614468191909097 0.8455504517344443 0.8464065028454753\r\n", "agg_4818 classical monocyte liver liver healthy GSE159977 scimilarity nan nan 584 4703990.0 13644 34306.87421529999 0.9580119653448788 0.8463006395755477 0.8456137609888349\r\n", "agg_4819 classical monocyte liver liver healthy GSE163650 scimilarity nan nan 440 4840312.0 12272 31180.407161439263 0.9312379663603724 0.8198899526213368 0.8071334502842269\r\n", "agg_4820 classical monocyte liver liver non-alcoholic fatty liver disease GSE136103 scimilarity nan nan 675 3625081.0 13858 31875.772311414476 0.9607644852684971 0.8451153856892568 0.8434410362324897\r\n", "agg_4821 classical monocyte liver liver non-alcoholic steatohepatitis GSE159977 scimilarity 
nan nan 818 5328417.0 13712 34244.79288663241 0.9625204736588413 0.8495046961098656 0.843851264103893\r\n", "agg_4822 classical monocyte lower lobe of left lung lung NA ENCODE scimilarity nan nan 119 332235.9213328175 13992 45607.18635453728 0.9075609223976155 0.8257092626936027 0.8254709531577699\r\n", "agg_4823 classical monocyte lower lobe of lung lung healthy GSE169471 scimilarity nan nan 305 1224338.0 11342 28255.985922767635 0.9404343350984603 0.8261785132237449 0.8150341534919611\r\n", "agg_4824 classical monocyte lung lung COVID-19 GSE145926 scimilarity nan nan 6755 29326462.0 15670 32143.238185602037 0.9347325303698273 0.8382075353537076 0.8350131792189084\r\n", "agg_4825 classical monocyte lung lung COVID-19 GSE149878 scimilarity nan nan 1388 17477477.0 15453 32118.37391645824 0.9547396360778304 0.8488321261303191 0.8334226404565429\r\n", "agg_4826 classical monocyte lung lung COVID-19 covid scimilarity nan nan 87 182436.0 8979 30922.571470240666 0.8944321462094855 0.7906545756360502 0.7941388516891991\r\n", "agg_4827 classical monocyte lung lung Idiopathic pulmonary arterial hypertension GSE169471 scimilarity nan nan 338 1099281.0 11441 29048.65706098467 0.9394268881459132 0.8205211245968264 0.8008915371926529\r\n", "agg_4828 classical monocyte lung lung NA GSE122960 scimilarity nan nan 2035 4747594.0 13592 31170.75054081323 0.947029489389696 0.8239671657460403 0.8229589179837169\r\n", "agg_4829 classical monocyte lung lung NA GSE150708 scimilarity nan nan 1711 18922764.0 15768 34651.58457127426 0.9197817700449655 0.8258478966731367 0.8332402246613281\r\n", "agg_4830 classical monocyte lung lung NA GSE159354 scimilarity nan nan 804 1319717.0 12267 30466.67987009986 0.9289888078126158 0.8179900347657229 0.8046761734394422\r\n", "agg_4831 classical monocyte lung lung chronic obstructive pulmonary disease DS000011735 scimilarity nan nan 1757 5736362.0 16385 37750.32029026267 0.8922650938609307 0.8174310943962452 0.7970204631208431\r\n", "agg_4832 classical 
monocyte lung lung healthy 5d445965-6f1a-4b68-ba3a-b8f765155d3a scimilarity nan nan 1254 5397217.0 13298 31013.01576531177 0.9490555294982109 0.8457060773457411 0.8329330004620054\r\n", "agg_4833 classical monocyte lung lung healthy DS000011735 scimilarity nan nan 4653 16523051.0 17066 37593.98867222708 0.8985708908575675 0.8260014412964971 0.8051199142820423\r\n", "agg_4834 classical monocyte lung lung healthy GSE128033 scimilarity nan nan 1047 3646581.0 13185 29331.2035341724 0.9513045959002788 0.837770557527153 0.8238987539695043\r\n", "agg_4835 classical monocyte lung lung healthy GSE128169 scimilarity nan nan 1732 11798577.0 15051 32941.755873862814 0.9636814596401634 0.8539834251349044 0.8457626672394015\r\n", "agg_4836 classical monocyte lung lung healthy GSE132771 scimilarity nan nan 1601 4614408.0 13275 29761.03818692572 0.9531291409651284 0.8457644131717956 0.8310980214000753\r\n", "agg_4837 classical monocyte lung lung healthy GSE169471 scimilarity nan nan 498 1613976.0 11886 28956.213107123967 0.9433400854993297 0.8271067022019086 0.815595680192867\r\n", "agg_4838 classical monocyte lung lung hypersensitivity pneumonitis GSE122960 scimilarity nan nan 374 1513589.0 11850 30594.494180377842 0.9436379625667726 0.8236248274374875 0.8201281668004226\r\n", "agg_4839 classical monocyte lung lung idiopathic pulmonary fibrosis DS000011735 scimilarity nan nan 3273 11098539.0 16692 36983.80245044498 0.9002060376489591 0.825628591643822 0.8057309447193657\r\n", "agg_4840 classical monocyte lung lung idiopathic pulmonary fibrosis GSE122960 scimilarity nan nan 795 2302741.0 12763 31758.949309599942 0.9481965424789537 0.8315974088368188 0.8291678346888611\r\n", "agg_4841 classical monocyte lung lung idiopathic pulmonary fibrosis GSE128033 scimilarity nan nan 264 892053.0 10997 28549.410927787198 0.9388857876621541 0.8259212982973368 0.8088642639807162\r\n", "agg_4842 classical monocyte lung lung idiopathic pulmonary fibrosis GSE132771 scimilarity nan nan 562 1301612.0 
12354 30446.385456748263 0.9469963680992495 0.8353213795556867 0.820474733820443\r\n", "agg_4844 classical monocyte lung lung idiopathic pulmonary fibrosis GSE143706 scimilarity nan nan 28 77859.0 5999 21933.076720558005 0.8162248508692449 0.6872290766118224 0.6675933221673609\r\n", "agg_4845 classical monocyte lung lung idiopathic pulmonary fibrosis GSE146981 scimilarity nan nan 28 77859.0 5999 21933.076720558005 0.8151661654118738 0.6848887765520302 0.667713152235204\r\n", "agg_4846 classical monocyte lung lung idiopathic pulmonary fibrosis GSE159354 scimilarity nan nan 963 1825354.0 12518 29731.366588446697 0.9431677835768482 0.8335606700375219 0.8147956336542009\r\n", "agg_4847 classical monocyte lung lung interstitial lung disease GSE122960 scimilarity nan nan 255 622277.0 10480 29149.028142322823 0.9283467350849584 0.8054016349869099 0.8007862785744999\r\n", "agg_4848 classical monocyte lung lung interstitial lung disease GSE128169 scimilarity nan nan 697 1972432.0 12243 29254.839846468705 0.9423878335093786 0.8300974626358707 0.8196994274197087\r\n", "agg_4849 classical monocyte lung lung scleroderma GSE128169 scimilarity nan nan 108 906362.0 11850 32908.696557354284 0.9438117150692724 0.8371885386238901 0.8225520820106291\r\n", "agg_4850 classical monocyte lung lung scleroderma GSE132771 scimilarity nan nan 98 335776.0 9515 28212.056049440183 0.9149834322044056 0.8124364794406282 0.7889831369493125\r\n", "agg_4851 classical monocyte lung lung systemic scleroderma;interstitial lung disease GSE159354 scimilarity nan nan 680 1244200.0 11364 27669.832360293723 0.9311681193836778 0.8218066897336025 0.801632654842017\r\n", "agg_4852 classical monocyte lung parenchyma lung COVID-19 GSE158127 scimilarity nan nan 1028 2949423.0 13468 33486.58561312063 0.9573331305866764 0.8476148647425674 0.8402420746316328\r\n", "agg_4853 classical monocyte lung parenchyma lung healthy GSE158127 scimilarity nan nan 791 2735456.0 13260 33646.87553319058 0.9544351950657847 
0.8399981618864122 0.8327646420552953\r\n", "agg_4854 classical monocyte lymph node lymph node Langerhans Cell Histiocytosis GSE133704 scimilarity nan nan 41 112531.0 7250 25404.282603262254 0.8424914182716917 0.7490029862443883 0.7315420690462492\r\n", "agg_4855 classical monocyte mesenteric artery vasculature healthy GSE156341 scimilarity nan nan 49 408553.0 10083 30979.99239432764 0.9337200372314851 0.8169964046619824 0.8163109060481625\r\n", "agg_4856 classical monocyte mesenteric artery vasculature type II diabetes mellitus GSE156341 scimilarity nan nan 107 869426.0 11343 33124.102127533 0.9473048055491107 0.8341783608252021 0.8308049817701406\r\n", "agg_4857 classical monocyte mesenteric lymph node lymph node healthy 7681c7d7-0168-4892-a547-6f02a6430ace scimilarity nan nan 23 211416.0 9219 31018.02041830142 0.9058403141644794 0.7914556298280883 0.7867496166249129\r\n", "agg_4858 classical monocyte muscle tissue muscle healthy e5f58829-1a66-40b5-a624-9046778e74f5 scimilarity nan nan 1800 50754027.0 16141 33323.26969602708 0.9316672745867468 0.8348295016474899 0.8219918807835366\r\n", "agg_4859 classical monocyte nasal cavity airway COVID-19 03f821b4-87be-4ff4-b65a-b5fc00061da7 scimilarity nan nan 2907 12130592.0 15268 34541.15937505348 0.9398365208964624 0.8399826209015534 0.8406896100498221\r\n", "agg_4860 classical monocyte nasal cavity airway healthy 03f821b4-87be-4ff4-b65a-b5fc00061da7 scimilarity nan nan 129 319231.0 10961 33948.695520725145 0.908031594986554 0.802885512146251 0.7998873890770253\r\n", "agg_4861 classical monocyte nasopharynx airway nasopharyngeal neoplasm GSE150825 scimilarity nan nan 248 919710.0 11652 32725.826204558824 0.9483756316738621 0.8416477499125612 0.8434913191320811\r\n", "agg_4862 classical monocyte nose airway chronic rhinosinusitis with nasal polyps GSE156285 scimilarity nan nan 89 407982.0 10658 32874.53627471163 0.9356986809715955 0.8282644612568757 0.8193450383260489\r\n", "agg_4863 classical monocyte olfactory 
epithelium airway NA GSE139522 scimilarity nan nan 152 645745.0 11496 32760.770927681508 0.9344670047519139 0.8326335387583638 0.8215291009295241\r\n", "agg_4864 classical monocyte omental fat pad peritoneum obesity GSE163830 scimilarity nan nan 248 603440.0 11376 33014.67391159742 0.9276969765366693 0.8135481953870003 0.8105166731324462\r\n", "agg_4865 classical monocyte omentum peritoneum NA GSE151889 scimilarity nan nan 106 233037.0 9833 30451.216970905818 0.9023265636700606 0.794410906944691 0.7787417127899898\r\n", "agg_4868 classical monocyte peritoneum peritoneum NA GSE130888 scimilarity nan nan 20547 75515682.0 16606 31611.601467579523 0.9640573553126023 0.8513803623213743 0.8467189186177884\r\n", "agg_4869 classical monocyte peritoneum peritoneum healthy GSE130888 scimilarity nan nan 297 509237.0 11456 32213.169493243313 0.9218188146119148 0.8055575625334659 0.8039145027939639\r\n", "agg_4870 classical monocyte prostate gland prostate healthy GSE145843 scimilarity nan nan 24 87555.0 5997 21432.99958526179 0.816337811891654 0.7246769991610698 0.6943045232082502\r\n", "agg_4871 classical monocyte prostate gland prostate healthy e5f58829-1a66-40b5-a624-9046778e74f5 scimilarity nan nan 220 2445216.0 12460 33911.259081133416 0.9301914834403244 0.8259560877683964 0.824440772955729\r\n", "agg_4872 classical monocyte renal medulla kidney healthy 120e86b4-1195-48c5-845b-b98054105eec scimilarity nan nan 21 101089.0 7354 26191.96315382653 0.8491245141516279 0.751292403843078 0.7310820608427115\r\n", "agg_4873 classical monocyte respiratory airway airway COVID-19 29f92179-ca10-4309-a32b-d383d80347c1 scimilarity nan nan 24222 187246624.0 17810 38621.12130270373 0.911673853186318 0.8025805020768422 0.8054824859649656\r\n", "agg_4874 classical monocyte respiratory tract epithelium airway NA GSE139522 scimilarity nan nan 69 371203.0 11152 33714.18718588416 0.9189481336748234 0.8044925205508522 0.8036843863420214\r\n", "agg_4875 classical monocyte right cardiac atrium 
heart healthy b52eb423-5d0d-4645-b217-e1c6d38b2e72 scimilarity nan nan 311 977934.0 11891 31570.89048372305 0.9357294628028995 0.8260631626226992 0.8224053236502149\r\n", "agg_4876 classical monocyte sigmoid colon gut ulcerative colitis DS000010618 scimilarity nan nan 56 157830.0 7772 25795.237651990203 0.8795503731261843 0.7579451181338609 0.7495331251552053\r\n", "agg_4877 classical monocyte spleen spleen HIV infection GSE148796 scimilarity nan nan 48 118392.0 7120 25206.082214526155 0.8589535988735265 0.7723626781751426 0.7544543625246387\r\n", "agg_4878 classical monocyte spleen spleen healthy 4d74781b-8186-4c9a-b659-ff4dc4601d91 scimilarity nan nan 2166 7905128.0 13952 30832.98149016589 0.957626764427953 0.8498277489734691 0.8368540073560422\r\n", "agg_4879 classical monocyte spleen spleen healthy GSE148796 scimilarity nan nan 49 99684.0 6785 24492.260723886982 0.8477188947186437 0.7504685014947085 0.7295165069095441\r\n", "agg_4880 classical monocyte spleen spleen healthy e5f58829-1a66-40b5-a624-9046778e74f5 scimilarity nan nan 3483 86078540.0 17775 36266.36759486 0.9100465322095738 0.8097019968114763 0.8081945289287479\r\n", "agg_4881 classical monocyte subcutaneous adipose tissue adipose healthy e5f58829-1a66-40b5-a624-9046778e74f5 scimilarity nan nan 2019 27914261.0 15739 35142.992419516115 0.9421666034710219 0.8357406603138717 0.8249091266671545\r\n", "agg_4882 classical monocyte synovial fluid synovial joint juvenile idiopathic arthritis GSE160097 scimilarity nan nan 68 430194.0 9771 30748.449183442222 0.9095023395695137 0.8036624580038914 0.8039477211085259\r\n", "agg_4883 classical monocyte synovial fluid synovial joint psoriatic arthritis GSE161500 scimilarity nan nan 675 4107697.0 12980 33508.9642503863 0.9516464026609945 0.8477381221507095 0.8460202124356817\r\n", "agg_4884 classical monocyte tertiary ovarian follicle ovary NA GSE146512 scimilarity nan nan 100 296748.0 10411 33084.00694991347 0.9139378500397654 0.8064064097295065 
0.8023402394768804\r\n", "agg_4885 classical monocyte testis testis NA GSE153819 scimilarity nan nan 17 97625.0 7958 29072.297180853668 0.855345538109863 0.764463114789657 0.7520891122089112\r\n", "agg_4886 classical monocyte thoracic lymph node lymph node healthy 62ef75e4-cbea-454e-a0ce-998ec40223d3 scimilarity nan nan 20194 147502356.0 17156 34278.43204322054 0.9584542376154844 0.8568164088302802 0.8468420651789338\r\n", "agg_4887 classical monocyte thymus thymus healthy 62ef75e4-cbea-454e-a0ce-998ec40223d3 scimilarity nan nan 487 2692133.0 12969 33549.45213075861 0.9497232200872254 0.8549259958290932 0.843613914494647\r\n", "agg_4888 classical monocyte thymus thymus healthy 83ed3be8-4cb9-43e6-9aaa-3fbbf5d1bd3a scimilarity nan nan 27 80042.0 6527 23800.833698344715 0.8448423801602987 0.7347769941739797 0.7181319535525903\r\n", "agg_4889 classical monocyte thymus thymus healthy de13e3e2-23b6-40ed-a413-e9e12d7d3910 scimilarity nan nan 52 298983.0 9582 30002.684399867492 0.9175531872094376 0.8185046437128072 0.8176606503789746\r\n", "agg_4890 classical monocyte tonsil tonsil healthy GSE119506 scimilarity nan nan 321 1114339.0 11546 30516.473940957327 0.936749794981687 0.8307356996245999 0.8215841775914924\r\n", "agg_4893 classical monocyte trachea airway healthy 03f821b4-87be-4ff4-b65a-b5fc00061da7 scimilarity nan nan 126 457580.0 11135 33734.327765812595 0.9299687996532033 0.8279876048002456 0.8230768581762299\r\n", "agg_4894 classical monocyte trachea airway healthy e5f58829-1a66-40b5-a624-9046778e74f5 scimilarity nan nan 130 8245622.0 13663 32584.555010387805 0.859406194907081 0.7522520630806427 0.7550758397192211\r\n", "agg_4895 classical monocyte transition zone of prostate prostate prostatic hypertrophy 4b54248f-2165-477c-a027-dd55082e8818 scimilarity nan nan 520 2949099.0 13618 29095.077520604846 0.9205242051339535 0.807648434373933 0.7862293702463532\r\n", "agg_4896 classical monocyte transverse colon gut healthy 62ef75e4-cbea-454e-a0ce-998ec40223d3 
scimilarity nan nan 503 2932999.0 12868 33265.22762476361 0.9476555210791636 0.8481166846094836 0.8392740144102075\r\n", "agg_4897 classical monocyte tympanic membrane ear NA GSE128892 scimilarity nan nan 33 153723.0 7971 26217.848855873086 0.8515891731868822 0.7526405790618411 0.7328434865867175\r\n", "agg_4899 classical monocyte upper lobe of lung lung healthy GSE169471 scimilarity nan nan 180 594059.0 10222 27881.011904781462 0.9248153838159024 0.8106326147324702 0.8003438801026445\r\n", "agg_4900 classical monocyte urine urinary healthy GSE165396 scimilarity nan nan 20 109197.0 7299 25505.26771828214 0.8530560359297166 0.7505269381795711 0.7350625083451162\r\n", "agg_4901 classical monocyte uterus uterus healthy 32f2fd23-ec74-486f-9544-e5b2f41725f5 scimilarity nan nan 18 189472.0 9397 30891.07905591756 0.8870309576225303 0.7796319609750195 0.7827462282311286\r\n", "agg_4902 classical monocyte vasculature vasculature healthy e5f58829-1a66-40b5-a624-9046778e74f5 scimilarity nan nan 14537 224206261.0 17922 36651.90273209977 0.9438232475974557 0.8425086108952623 0.8343079266995497\r\n", "agg_4903 classical monocyte visceral fat adipose obesity GSE128518 scimilarity nan nan 74 196657.0 9100 28836.082756573305 0.8890453654508667 0.7810993233599736 0.7788838086708725\r\n" ] } ], "source": [ "! 
decima query-cell 'cell_type == \"classical monocyte\"' | column -t -s $'\\t'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Query cells that:\n", "- have \"monocyte\" in their cell type name (`cell_type.str.contains(\"monocyte\")`)\n", "- are from healthy donors (`disease == \"healthy\"`)" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "execution": { "iopub.execute_input": "2025-11-21T06:29:20.091032Z", "iopub.status.busy": "2025-11-21T06:29:20.090868Z", "iopub.status.idle": "2025-11-21T06:29:35.058170Z", "shell.execute_reply": "2025-11-21T06:29:35.057422Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "/home/celikm5/miniforge3/envs/decima2/lib/python3.11/site-packages/pydantic/_internal/_generate_schema.py:2249: UnsupportedFieldAttributeWarning: The 'repr' attribute with value False was provided to the `Field()` function, which has no effect in the context it was used. 'repr' is field-specific metadata, and can only be attached to a model field using `Annotated` metadata or by assignment. This may have happened because an `Annotated` type alias using the `type` statement was used, or if the `Field()` function was attached to a single member of a union type.\r\n", " warnings.warn(\r\n", "/home/celikm5/miniforge3/envs/decima2/lib/python3.11/site-packages/pydantic/_internal/_generate_schema.py:2249: UnsupportedFieldAttributeWarning: The 'frozen' attribute with value True was provided to the `Field()` function, which has no effect in the context it was used. 'frozen' is field-specific metadata, and can only be attached to a model field using `Annotated` metadata or by assignment. 
This may have happened because an `Annotated` type alias using the `type` statement was used, or if the `Field()` function was attached to a single member of a union type.\r\n", " warnings.warn(\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[34m\u001b[1mwandb\u001b[0m: Currently logged in as: \u001b[33mmhcelik\u001b[0m (\u001b[33mmhcw\u001b[0m) to \u001b[32mhttps://api.wandb.ai\u001b[0m. Use \u001b[1m`wandb login --relogin`\u001b[0m to force relogin\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[34m\u001b[1mwandb\u001b[0m: Downloading large artifact 'metadata:latest', 3122.32MB. 1 files...\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[34m\u001b[1mwandb\u001b[0m: 1 of 1 files downloaded. \r\n", "Done. 00:00:01.9 (1657.8MB/s)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " cell_type tissue organ disease study dataset region subregion celltype_coarse n_cells total_counts n_genes size_factor train_pearson val_pearson test_pearson\r\n", "agg_4706 classical monocyte alveolar system lung healthy GSE155249 scimilarity nan nan 72 218105.0 9142 30484.31888978114 0.9102228263646758 0.8083487523192785 0.8047828694155461\r\n", "agg_4707 classical monocyte ampulla of uterine tube fallopian tube healthy fc77d2ae-247d-44d7-aa24-3f4859254c2c scimilarity nan nan 78 550950.0 9639 30719.377971431015 0.9077670011915634 0.8045070167513724 0.7896845423359651\r\n", "agg_4709 classical monocyte aorta vasculature healthy GSE166676 scimilarity nan nan 25 162858.0 8859 31216.275954364824 0.8819013257206973 0.7821403055329706 0.7646999711802146\r\n", "agg_4710 classical monocyte apex of heart heart healthy b52eb423-5d0d-4645-b217-e1c6d38b2e72 scimilarity nan nan 397 1226515.0 12369 32022.563851814968 0.9469178617442242 0.8326145310572417 0.8365506153530168\r\n", "agg_4733 classical monocyte blood blood healthy 03f821b4-87be-4ff4-b65a-b5fc00061da7 scimilarity nan nan 32464 109280914.0 16158 
33646.02843110038 0.9568728712803031 0.8545324533535094 0.8487445540580735\r\n", "agg_4734 classical monocyte blood blood healthy 436154da-bcf1-4130-9c8b-120ff9a888f2 scimilarity nan nan 76800 206490628.0 16683 30736.453324546856 0.955313650235467 0.8494127267867799 0.84210567312908\r\n", "agg_4735 classical monocyte blood blood healthy 5d445965-6f1a-4b68-ba3a-b8f765155d3a scimilarity nan nan 1044 3638976.0 12306 30542.8977364245 0.9465384578921433 0.851991795617166 0.8308191715256182\r\n", "agg_4736 classical monocyte blood blood healthy DS000010023 scimilarity nan nan 243 362606.0 8414 25201.261024165953 0.8865446516098515 0.7621391376670988 0.7681818769796727\r\n", "agg_4737 classical monocyte blood blood healthy GSE122703 scimilarity nan nan 18 83417.0 7546 28194.725173315186 0.859225612396475 0.7640890699253299 0.7443681539461423\r\n", "agg_4738 classical monocyte blood blood healthy GSE130117 scimilarity nan nan 2017 7130588.0 13535 33078.53692191542 0.9553673450160365 0.851402109626239 0.8385936100758409\r\n", "agg_4739 classical monocyte blood blood healthy GSE132802 scimilarity nan nan 1601 9955248.0 13132 32063.630951743195 0.9478882791739611 0.8391025143866828 0.8303465877530952\r\n", "agg_4740 classical monocyte blood blood healthy GSE139324 scimilarity nan nan 2333 8331045.0 13985 31135.881287246768 0.9608208780142045 0.8473885992448625 0.8432790193723467\r\n", "agg_4741 classical monocyte blood blood healthy GSE145809 scimilarity nan nan 69 245221.0 8962 29135.67197629852 0.8825701041728526 0.7811799267734735 0.7818647625179129\r\n", "agg_4742 classical monocyte blood blood healthy GSE149313 scimilarity nan nan 2420 6974751.0 13143 29560.496854576566 0.9574598613513423 0.8505290963248237 0.8379199735887167\r\n", "agg_4743 classical monocyte blood blood healthy GSE153421 scimilarity nan nan 3691 15561725.0 14569 34377.465875728165 0.9636686704566925 0.8576434473562725 0.8511814190737197\r\n", "agg_4744 classical monocyte blood blood healthy GSE156989 
scimilarity nan nan 13554 160011485.0 16915 34135.439844737564 0.9640667421350761 0.8577967800377495 0.8517975138366085\r\n", "agg_4745 classical monocyte blood blood healthy GSE157829 scimilarity nan nan 1619 6957811.0 13507 30199.39288988673 0.9484019976492215 0.8436979316400604 0.8347196616710685\r\n", "agg_4746 classical monocyte blood blood healthy GSE159113 scimilarity nan nan 1025 6298250.0 12083 27477.50809897617 0.9078020151513733 0.8121457150205226 0.7980372877810575\r\n", "agg_4747 classical monocyte blood blood healthy GSE161329 scimilarity nan nan 5654 25653579.0 14349 28848.0539929647 0.9549801428956252 0.8450430950674043 0.8406188789518544\r\n", "agg_4748 classical monocyte blood blood healthy GSE161738 scimilarity nan nan 2676 13801473.0 12825 33337.477050230416 0.9541962906717452 0.8512846409758499 0.8485408028961247\r\n", "agg_4749 classical monocyte blood blood healthy GSE163668 scimilarity nan nan 2644 10486314.0 14049 33786.96584264489 0.9597801578342394 0.8560775485935677 0.8512149509551471\r\n", "agg_4750 classical monocyte blood blood healthy GSE166992 scimilarity nan nan 7501 28033216.0 15079 33455.367364577316 0.9622273594219685 0.8558958139235102 0.8495571689751152\r\n", "agg_4751 classical monocyte blood blood healthy GSE167363 scimilarity nan nan 3135 14722635.0 14375 29977.24002819913 0.942417448875388 0.8368071803109702 0.8258536430202982\r\n", "agg_4752 classical monocyte blood blood healthy GSE168710 scimilarity nan nan 16484 104881872.0 16223 34107.336261357574 0.9398282119039322 0.8424821834537695 0.8372971004604842\r\n", "agg_4753 classical monocyte blood blood healthy GSE168732 scimilarity nan nan 770 2548822.0 12508 33411.30103713399 0.9552513581030765 0.8508279875038706 0.847461536110767\r\n", "agg_4754 classical monocyte blood blood healthy b0cf0afa-ec40-4d65-b570-ed4ceacc6813 scimilarity nan nan 40975 300555227.0 15784 35938.85772500803 0.9622425892039956 0.853424173800979 0.8508714303589978\r\n", "agg_4755 classical 
monocyte blood blood healthy ddfad306-714d-4cc0-9985-d9072820c530 scimilarity nan nan 8827 36073928.0 15131 33208.591584008376 0.9546118779961532 0.8543086616569785 0.8462739374830107\r\n", "agg_4772 classical monocyte bone marrow bone marrow healthy GSE132509 scimilarity nan nan 610 2315570.0 12950 31768.06513427212 0.95159369508558 0.8517118261701931 0.836658919433696\r\n", "agg_4773 classical monocyte bone marrow bone marrow healthy GSE154109 scimilarity nan nan 531 1431388.0 11793 31377.450948003392 0.9490546933955852 0.8431566630120637 0.8370883160295727\r\n", "agg_4774 classical monocyte bone marrow bone marrow healthy GSE163278 scimilarity nan nan 1119 3970394.0 13361 32081.93302956569 0.9620394897163868 0.8531148861215617 0.8426785397396367\r\n", "agg_4775 classical monocyte bone marrow bone marrow healthy e5f58829-1a66-40b5-a624-9046778e74f5 scimilarity nan nan 151 8025584.0 12883 29444.78352863075 0.8440608178227607 0.740574910750325 0.7417328577454956\r\n", "agg_4777 classical monocyte breast breast healthy GSE164898 scimilarity nan nan 136 641471.0 12971 34463.52724138501 0.9163324788498406 0.8116274576633968 0.7978555908123931\r\n", "agg_4778 classical monocyte breast breast healthy c9706a92-0e5f-46c1-96d8-20e42467f287 scimilarity nan nan 98 1444245.0 13491 30678.263421880285 0.9165520953567395 0.8162053142576849 0.7994301225229256\r\n", "agg_4782 classical monocyte bronchus airway healthy GSE158127 scimilarity nan nan 158 1158198.0 12643 34764.50196701077 0.9364512338084163 0.8259291909369686 0.8266638555276521\r\n", "agg_4783 classical monocyte cardiac muscle of left ventricle heart healthy GSE156703 scimilarity nan nan 13 116181.0 9463 35695.66320276271 0.8542740960069863 0.7515621053395214 0.7561639038477878\r\n", "agg_4785 classical monocyte caudate lobe of liver liver healthy 44531dd9-1388-4416-a117-af0a99de2294 scimilarity nan nan 238 730016.0 11505 31342.386314731422 0.9217674983890346 0.8140551552218395 0.8040417787954989\r\n", "agg_4786 
classical monocyte cortex of kidney kidney healthy 120e86b4-1195-48c5-845b-b98054105eec scimilarity nan nan 79 323010.0 10939 32378.76683324232 0.9028856137251035 0.7978822439778066 0.7839454035009307\r\n", "agg_4787 classical monocyte cortex of kidney kidney healthy a98b828a-622a-483a-80e0-15703678befd scimilarity nan nan 91 477355.0 10898 32358.068865763344 0.9328436291917394 0.8237810319569842 0.8195391931798526\r\n", "agg_4789 classical monocyte digestive tract gut healthy DS000011665 scimilarity nan nan 347 1679116.0 12155 33347.55517047197 0.9422556928648441 0.8417267634096297 0.84018452536733\r\n", "agg_4790 classical monocyte exocrine pancreas pancreas healthy e5f58829-1a66-40b5-a624-9046778e74f5 scimilarity nan nan 821 7824069.0 14847 36135.64587109593 0.9493709998172055 0.8440837716099078 0.8410246939313819\r\n", "agg_4791 classical monocyte fallopian tube fallopian tube healthy fc77d2ae-247d-44d7-aa24-3f4859254c2c scimilarity nan nan 131 734093.0 11240 33115.640434504094 0.9359103734457376 0.8339026306142181 0.8225901799509813\r\n", "agg_4792 classical monocyte fimbria of uterine tube fallopian tube healthy fc77d2ae-247d-44d7-aa24-3f4859254c2c scimilarity nan nan 34 209362.0 7663 27635.733860508328 0.8560416684135254 0.7382997749370328 0.7459235366949488\r\n", "agg_4795 classical monocyte head of femur bone healthy GSE169396 scimilarity nan nan 450 3669304.0 13216 33082.323604222154 0.9529417082022753 0.8522346343107771 0.8359032081996703\r\n", "agg_4798 classical monocyte heart left ventricle heart healthy b52eb423-5d0d-4645-b217-e1c6d38b2e72 scimilarity nan nan 192 585001.0 11159 31422.874036870588 0.9363985001217598 0.8226438123601741 0.8283173244446851\r\n", "agg_4799 classical monocyte heart right ventricle heart healthy b52eb423-5d0d-4645-b217-e1c6d38b2e72 scimilarity nan nan 316 936263.0 11904 32002.67227624691 0.9425900348990802 0.8306128730459813 0.8308116461021977\r\n", "agg_4803 classical monocyte interventricular septum heart healthy 
b52eb423-5d0d-4645-b217-e1c6d38b2e72 scimilarity nan nan 442 1322725.0 12418 32235.5434681197 0.94751399102473 0.8340623939411483 0.8353858365852226\r\n", "agg_4804 classical monocyte isthmus of fallopian tube fallopian tube healthy fc77d2ae-247d-44d7-aa24-3f4859254c2c scimilarity nan nan 62 318330.0 8642 29768.198512126502 0.8846027131590791 0.784220371073871 0.7739277341448829\r\n", "agg_4808 classical monocyte kidney kidney healthy 120e86b4-1195-48c5-845b-b98054105eec scimilarity nan nan 762 4034828.0 14946 34015.00816823295 0.9520640694425091 0.848205266473767 0.836085027208869\r\n", "agg_4809 classical monocyte kidney kidney healthy DS000010415 scimilarity nan nan 55 127079.0 8055 27206.419037355434 0.8216238135756493 0.7570847030479543 0.72524174726152\r\n", "agg_4810 classical monocyte kidney kidney healthy GSE140989 scimilarity nan nan 174 563438.0 11016 29887.593299155575 0.914390252807459 0.8069762795735104 0.8086759072557722\r\n", "agg_4812 classical monocyte left cardiac atrium heart healthy b52eb423-5d0d-4645-b217-e1c6d38b2e72 scimilarity nan nan 450 1446734.0 12669 32433.20393285361 0.9467532324494885 0.8378571513042986 0.8360845068688169\r\n", "agg_4817 classical monocyte liver liver healthy GSE136103 scimilarity nan nan 2036 7668787.0 14818 32446.71686472893 0.9614468191909097 0.8455504517344443 0.8464065028454753\r\n", "agg_4818 classical monocyte liver liver healthy GSE159977 scimilarity nan nan 584 4703990.0 13644 34306.87421529999 0.9580119653448788 0.8463006395755477 0.8456137609888349\r\n", "agg_4819 classical monocyte liver liver healthy GSE163650 scimilarity nan nan 440 4840312.0 12272 31180.407161439263 0.9312379663603724 0.8198899526213368 0.8071334502842269\r\n", "agg_4823 classical monocyte lower lobe of lung lung healthy GSE169471 scimilarity nan nan 305 1224338.0 11342 28255.985922767635 0.9404343350984603 0.8261785132237449 0.8150341534919611\r\n", "agg_4832 classical monocyte lung lung healthy 5d445965-6f1a-4b68-ba3a-b8f765155d3a 
scimilarity nan nan 1254 5397217.0 13298 31013.01576531177 0.9490555294982109 0.8457060773457411 0.8329330004620054\r\n", "agg_4833 classical monocyte lung lung healthy DS000011735 scimilarity nan nan 4653 16523051.0 17066 37593.98867222708 0.8985708908575675 0.8260014412964971 0.8051199142820423\r\n", "agg_4834 classical monocyte lung lung healthy GSE128033 scimilarity nan nan 1047 3646581.0 13185 29331.2035341724 0.9513045959002788 0.837770557527153 0.8238987539695043\r\n", "agg_4835 classical monocyte lung lung healthy GSE128169 scimilarity nan nan 1732 11798577.0 15051 32941.755873862814 0.9636814596401634 0.8539834251349044 0.8457626672394015\r\n", "agg_4836 classical monocyte lung lung healthy GSE132771 scimilarity nan nan 1601 4614408.0 13275 29761.03818692572 0.9531291409651284 0.8457644131717956 0.8310980214000753\r\n", "agg_4837 classical monocyte lung lung healthy GSE169471 scimilarity nan nan 498 1613976.0 11886 28956.213107123967 0.9433400854993297 0.8271067022019086 0.815595680192867\r\n", "agg_4853 classical monocyte lung parenchyma lung healthy GSE158127 scimilarity nan nan 791 2735456.0 13260 33646.87553319058 0.9544351950657847 0.8399981618864122 0.8327646420552953\r\n", "agg_4855 classical monocyte mesenteric artery vasculature healthy GSE156341 scimilarity nan nan 49 408553.0 10083 30979.99239432764 0.9337200372314851 0.8169964046619824 0.8163109060481625\r\n", "agg_4857 classical monocyte mesenteric lymph node lymph node healthy 7681c7d7-0168-4892-a547-6f02a6430ace scimilarity nan nan 23 211416.0 9219 31018.02041830142 0.9058403141644794 0.7914556298280883 0.7867496166249129\r\n", "agg_4858 classical monocyte muscle tissue muscle healthy e5f58829-1a66-40b5-a624-9046778e74f5 scimilarity nan nan 1800 50754027.0 16141 33323.26969602708 0.9316672745867468 0.8348295016474899 0.8219918807835366\r\n", "agg_4860 classical monocyte nasal cavity airway healthy 03f821b4-87be-4ff4-b65a-b5fc00061da7 scimilarity nan nan 129 319231.0 10961 33948.695520725145 
0.908031594986554 0.802885512146251 0.7998873890770253\r\n", "agg_4869 classical monocyte peritoneum peritoneum healthy GSE130888 scimilarity nan nan 297 509237.0 11456 32213.169493243313 0.9218188146119148 0.8055575625334659 0.8039145027939639\r\n", "agg_4870 classical monocyte prostate gland prostate healthy GSE145843 scimilarity nan nan 24 87555.0 5997 21432.99958526179 0.816337811891654 0.7246769991610698 0.6943045232082502\r\n", "agg_4871 classical monocyte prostate gland prostate healthy e5f58829-1a66-40b5-a624-9046778e74f5 scimilarity nan nan 220 2445216.0 12460 33911.259081133416 0.9301914834403244 0.8259560877683964 0.824440772955729\r\n", "agg_4872 classical monocyte renal medulla kidney healthy 120e86b4-1195-48c5-845b-b98054105eec scimilarity nan nan 21 101089.0 7354 26191.96315382653 0.8491245141516279 0.751292403843078 0.7310820608427115\r\n", "agg_4875 classical monocyte right cardiac atrium heart healthy b52eb423-5d0d-4645-b217-e1c6d38b2e72 scimilarity nan nan 311 977934.0 11891 31570.89048372305 0.9357294628028995 0.8260631626226992 0.8224053236502149\r\n", "agg_4878 classical monocyte spleen spleen healthy 4d74781b-8186-4c9a-b659-ff4dc4601d91 scimilarity nan nan 2166 7905128.0 13952 30832.98149016589 0.957626764427953 0.8498277489734691 0.8368540073560422\r\n", "agg_4879 classical monocyte spleen spleen healthy GSE148796 scimilarity nan nan 49 99684.0 6785 24492.260723886982 0.8477188947186437 0.7504685014947085 0.7295165069095441\r\n", "agg_4880 classical monocyte spleen spleen healthy e5f58829-1a66-40b5-a624-9046778e74f5 scimilarity nan nan 3483 86078540.0 17775 36266.36759486 0.9100465322095738 0.8097019968114763 0.8081945289287479\r\n", "agg_4881 classical monocyte subcutaneous adipose tissue adipose healthy e5f58829-1a66-40b5-a624-9046778e74f5 scimilarity nan nan 2019 27914261.0 15739 35142.992419516115 0.9421666034710219 0.8357406603138717 0.8249091266671545\r\n", "agg_4886 classical monocyte thoracic lymph node lymph node healthy 
62ef75e4-cbea-454e-a0ce-998ec40223d3 scimilarity nan nan 20194 147502356.0 17156 34278.43204322054 0.9584542376154844 0.8568164088302802 0.8468420651789338\r\n", "agg_4887 classical monocyte thymus thymus healthy 62ef75e4-cbea-454e-a0ce-998ec40223d3 scimilarity nan nan 487 2692133.0 12969 33549.45213075861 0.9497232200872254 0.8549259958290932 0.843613914494647\r\n", "agg_4888 classical monocyte thymus thymus healthy 83ed3be8-4cb9-43e6-9aaa-3fbbf5d1bd3a scimilarity nan nan 27 80042.0 6527 23800.833698344715 0.8448423801602987 0.7347769941739797 0.7181319535525903\r\n", "agg_4889 classical monocyte thymus thymus healthy de13e3e2-23b6-40ed-a413-e9e12d7d3910 scimilarity nan nan 52 298983.0 9582 30002.684399867492 0.9175531872094376 0.8185046437128072 0.8176606503789746\r\n", "agg_4890 classical monocyte tonsil tonsil healthy GSE119506 scimilarity nan nan 321 1114339.0 11546 30516.473940957327 0.936749794981687 0.8307356996245999 0.8215841775914924\r\n", "agg_4893 classical monocyte trachea airway healthy 03f821b4-87be-4ff4-b65a-b5fc00061da7 scimilarity nan nan 126 457580.0 11135 33734.327765812595 0.9299687996532033 0.8279876048002456 0.8230768581762299\r\n", "agg_4894 classical monocyte trachea airway healthy e5f58829-1a66-40b5-a624-9046778e74f5 scimilarity nan nan 130 8245622.0 13663 32584.555010387805 0.859406194907081 0.7522520630806427 0.7550758397192211\r\n", "agg_4896 classical monocyte transverse colon gut healthy 62ef75e4-cbea-454e-a0ce-998ec40223d3 scimilarity nan nan 503 2932999.0 12868 33265.22762476361 0.9476555210791636 0.8481166846094836 0.8392740144102075\r\n", "agg_4899 classical monocyte upper lobe of lung lung healthy GSE169471 scimilarity nan nan 180 594059.0 10222 27881.011904781462 0.9248153838159024 0.8106326147324702 0.8003438801026445\r\n", "agg_4900 classical monocyte urine urinary healthy GSE165396 scimilarity nan nan 20 109197.0 7299 25505.26771828214 0.8530560359297166 0.7505269381795711 0.7350625083451162\r\n", "agg_4901 classical 
monocyte uterus uterus healthy 32f2fd23-ec74-486f-9544-e5b2f41725f5 scimilarity nan nan 18 189472.0 9397 30891.07905591756 0.8870309576225303 0.7796319609750195 0.7827462282311286\r\n", "agg_4902 classical monocyte vasculature vasculature healthy e5f58829-1a66-40b5-a624-9046778e74f5 scimilarity nan nan 14537 224206261.0 17922 36651.90273209977 0.9438232475974557 0.8425086108952623 0.8343079266995497\r\n", "agg_6287 intermediate monocyte head of femur bone healthy GSE169396 scimilarity nan nan 102 191075.0 7853 26179.035956297153 0.8330771518439726 0.7503209876663273 0.7113081302663875\r\n", "agg_6289 intermediate monocyte lung lung healthy 5d445965-6f1a-4b68-ba3a-b8f765155d3a scimilarity nan nan 178 1172582.0 11314 30515.569680815937 0.9409051040470379 0.8435279441582394 0.8229749449946738\r\n", "agg_6290 intermediate monocyte spleen spleen healthy e5f58829-1a66-40b5-a624-9046778e74f5 scimilarity nan nan 60 252220.0 8236 26163.955486425446 0.836498746238646 0.7439980946911184 0.7168040239671246\r\n", "agg_6291 intermediate monocyte thymus thymus healthy GSE159745 scimilarity nan nan 29 82115.0 5987 22540.234815420707 0.7817665678679913 0.7043553158094606 0.6760717657015229\r\n", "agg_6292 intermediate monocyte vasculature vasculature healthy e5f58829-1a66-40b5-a624-9046778e74f5 scimilarity nan nan 162 1525515.0 11980 32186.1781582846 0.9269571292216415 0.8224395619954739 0.8055668264019443\r\n", "agg_7919 non-classical monocyte apex of heart heart healthy b52eb423-5d0d-4645-b217-e1c6d38b2e72 scimilarity nan nan 158 675731.0 11222 30876.899222310585 0.9301440448035162 0.8258961291954683 0.824159827752181\r\n", "agg_7939 non-classical monocyte blood blood healthy 03f821b4-87be-4ff4-b65a-b5fc00061da7 scimilarity nan nan 7359 36418526.0 14851 33258.733090217436 0.9519276590597959 0.8501060197631037 0.850960653386464\r\n", "agg_7940 non-classical monocyte blood blood healthy 436154da-bcf1-4130-9c8b-120ff9a888f2 scimilarity nan nan 14619 54479703.0 15493 
30191.499157063707 0.9490211472741342 0.8448211022633122 0.8418061408505518\r\n", "agg_7941 non-classical monocyte blood blood healthy 5d445965-6f1a-4b68-ba3a-b8f765155d3a scimilarity nan nan 195 1239240.0 11002 30223.31815928311 0.9379359624101378 0.8397815574875395 0.8299059656859994\r\n", "agg_7942 non-classical monocyte blood blood healthy GSE130117 scimilarity nan nan 322 1200290.0 11182 31409.27333671809 0.9418345510352945 0.8405863822794143 0.8304108431330093\r\n", "agg_7943 non-classical monocyte blood blood healthy GSE132802 scimilarity nan nan 102 726593.0 10232 30435.311954214158 0.9262718659348109 0.8186907385315048 0.8147462064075649\r\n", "agg_7944 non-classical monocyte blood blood healthy GSE134004 scimilarity nan nan 21 123360.0 6990 24135.62187154626 0.8660944307658819 0.7608357820845066 0.7535994530765779\r\n", "agg_7945 non-classical monocyte blood blood healthy GSE139324 scimilarity nan nan 435 2395489.0 12428 30987.28210955269 0.9487843521961865 0.8347185640893492 0.8386196076774184\r\n", "agg_7946 non-classical monocyte blood blood healthy GSE149313 scimilarity nan nan 567 2445950.0 11737 29144.914765151065 0.9480753865144765 0.8398428245307203 0.8354975068540325\r\n", "agg_7947 non-classical monocyte blood blood healthy GSE153421 scimilarity nan nan 441 2114891.0 11966 32647.54233438482 0.9524227157990666 0.8453840087915955 0.8445318485082767\r\n", "agg_7948 non-classical monocyte blood blood healthy GSE156989 scimilarity nan nan 3151 40420221.0 15662 32233.769948686153 0.9558775883405131 0.8527879494870606 0.8461332941066178\r\n", "agg_7949 non-classical monocyte blood blood healthy GSE157829 scimilarity nan nan 144 890675.0 10657 29172.377435644317 0.9302786443476869 0.8283014100006509 0.8220722486967346\r\n", "agg_7950 non-classical monocyte blood blood healthy GSE161329 scimilarity nan nan 1118 7175719.0 12865 29244.857112487658 0.9476682128749678 0.8393859845018986 0.8392730294908749\r\n", "agg_7951 non-classical monocyte blood blood 
healthy GSE161738 scimilarity nan nan 1497 12143757.0 12362 32632.110821778042 0.9476207444575895 0.8463853459866758 0.8490136875251815\r\n", "agg_7952 non-classical monocyte blood blood healthy GSE163668 scimilarity nan nan 323 1716760.0 11769 32812.39505818172 0.9472316303194352 0.8438698542111769 0.8424970169398833\r\n", "agg_7953 non-classical monocyte blood blood healthy GSE166992 scimilarity nan nan 1613 7143035.0 13383 32605.389353410996 0.9532209129181096 0.8486599938147087 0.8451557766892072\r\n", "agg_7954 non-classical monocyte blood blood healthy GSE167363 scimilarity nan nan 458 3094035.0 12228 29678.03739631962 0.9435654504659038 0.8379948419319221 0.8241961683725367\r\n", "agg_7955 non-classical monocyte blood blood healthy GSE168710 scimilarity nan nan 75 701113.0 10776 32559.49721301115 0.9233871475920297 0.8250528447177075 0.8185333840139741\r\n", "agg_7956 non-classical monocyte blood blood healthy GSE168732 scimilarity nan nan 229 1242404.0 11269 31965.299177754878 0.9416742339845935 0.8377482237588458 0.8415500681607002\r\n", "agg_7957 non-classical monocyte blood blood healthy b0cf0afa-ec40-4d65-b570-ed4ceacc6813 scimilarity nan nan 5897 43180935.0 14970 35595.18649267558 0.9543288699936718 0.8496535044071925 0.8536782531653531\r\n", "agg_7970 non-classical monocyte bone marrow bone marrow healthy GSE132509 scimilarity nan nan 28 86280.0 7085 25870.53624683356 0.8392355979135352 0.755387271124993 0.7303317003908475\r\n", "agg_7971 non-classical monocyte bone marrow bone marrow healthy GSE154109 scimilarity nan nan 50 289907.0 9147 28526.85661530531 0.9070952567396775 0.808361820498357 0.7964598438037417\r\n", "agg_7972 non-classical monocyte bone marrow bone marrow healthy GSE163278 scimilarity nan nan 127 682328.0 10897 31202.529821716787 0.9358619814006104 0.8294675777784289 0.8230468472234642\r\n", "agg_7974 non-classical monocyte breast breast healthy GSE164898 scimilarity nan nan 54 120275.0 7652 26423.010551181265 0.8534343500588628 
0.755424451856796 0.7230125883881353\r\n", "agg_7976 non-classical monocyte cortex of kidney kidney healthy 120e86b4-1195-48c5-845b-b98054105eec scimilarity nan nan 63 401864.0 11464 32990.96311584531 0.9014527631813947 0.7931942866954178 0.7865930763896118\r\n", "agg_7977 non-classical monocyte cortex of kidney kidney healthy a98b828a-622a-483a-80e0-15703678befd scimilarity nan nan 161 772141.0 11062 31056.660350613587 0.9346806312815913 0.8317520409566359 0.8313984606487961\r\n", "agg_7979 non-classical monocyte fallopian tube fallopian tube healthy fc77d2ae-247d-44d7-aa24-3f4859254c2c scimilarity nan nan 16 222500.0 7755 27898.660349927846 0.8621134765305163 0.7601890611687369 0.744729424431635\r\n", "agg_7980 non-classical monocyte fimbria of uterine tube fallopian tube healthy fc77d2ae-247d-44d7-aa24-3f4859254c2c scimilarity nan nan 28 244266.0 7735 27678.814270356754 0.8734809587140066 0.7648153554084077 0.76608428730196\r\n", "agg_7982 non-classical monocyte heart left ventricle heart healthy b52eb423-5d0d-4645-b217-e1c6d38b2e72 scimilarity nan nan 57 213565.0 9322 29851.355460106315 0.8978296307325292 0.7939956162210045 0.776120799756924\r\n", "agg_7983 non-classical monocyte heart right ventricle heart healthy b52eb423-5d0d-4645-b217-e1c6d38b2e72 scimilarity nan nan 124 613752.0 10880 30311.360727579897 0.9311389328985725 0.8253171421863371 0.8219320702682233\r\n", "agg_7985 non-classical monocyte interventricular septum heart healthy b52eb423-5d0d-4645-b217-e1c6d38b2e72 scimilarity nan nan 144 524658.0 10853 31170.857886799495 0.9241302459076691 0.8138209035301205 0.8224136632163046\r\n", "agg_7986 non-classical monocyte isthmus of fallopian tube fallopian tube healthy fc77d2ae-247d-44d7-aa24-3f4859254c2c scimilarity nan nan 12 86198.0 5668 23558.81110396413 0.793224614700874 0.7114672551137675 0.6825985710920001\r\n", "agg_7990 non-classical monocyte kidney kidney healthy 120e86b4-1195-48c5-845b-b98054105eec scimilarity nan nan 214 1788808.0 13749 
33794.71179753717 0.9324162382479619 0.8250902105825786 0.818301334304562\r\n", "agg_7991 non-classical monocyte kidney kidney healthy GSE140989 scimilarity nan nan 473 1769375.0 13008 31797.13462972748 0.9190676320157254 0.8182361375222441 0.8172110487458264\r\n", "agg_7992 non-classical monocyte left cardiac atrium heart healthy b52eb423-5d0d-4645-b217-e1c6d38b2e72 scimilarity nan nan 82 357018.0 10073 29995.2881818953 0.916928150799629 0.8074046233049013 0.808753490409063\r\n", "agg_7995 non-classical monocyte liver liver healthy GSE136103 scimilarity nan nan 423 2383574.0 13122 31645.435149503457 0.9524231389227664 0.8376748793878739 0.8415142758287522\r\n", "agg_7996 non-classical monocyte liver liver healthy GSE159977 scimilarity nan nan 473 4877555.0 13370 33200.636271185205 0.9498326621251537 0.8392029926705806 0.8469259328198118\r\n", "agg_7997 non-classical monocyte liver liver healthy GSE163650 scimilarity nan nan 10 96148.0 6782 24958.605923030595 0.844996457443227 0.7380858347163884 0.7209649786125333\r\n", "agg_8007 non-classical monocyte lung lung healthy 5d445965-6f1a-4b68-ba3a-b8f765155d3a scimilarity nan nan 576 3313836.0 12566 30969.255628748215 0.9438879246302309 0.8444989044016609 0.8358111974870654\r\n", "agg_8008 non-classical monocyte lung lung healthy DS000011735 scimilarity nan nan 169 779577.0 12885 36247.76234710614 0.8808886302153663 0.8102555656654346 0.7948721237235893\r\n", "agg_8009 non-classical monocyte lung lung healthy GSE128033 scimilarity nan nan 79 343546.0 9330 27406.22701580364 0.9035509606166174 0.7968173004217675 0.7863212826693502\r\n", "agg_8010 non-classical monocyte lung lung healthy GSE128169 scimilarity nan nan 276 2433151.0 12769 32027.902610765417 0.9547437308289124 0.8449880167209973 0.8438649522788478\r\n", "agg_8011 non-classical monocyte lung lung healthy GSE132771 scimilarity nan nan 37 151922.0 7860 26295.743163049112 0.8838770944570796 0.790431305730734 0.7607028661311058\r\n", "agg_8012 non-classical 
monocyte lung lung healthy GSE169471 scimilarity nan nan 27 103204.0 6854 23947.65597317191 0.8348392917103126 0.7488344663849291 0.7237935010019484\r\n", "agg_8024 non-classical monocyte lung parenchyma lung healthy GSE158127 scimilarity nan nan 309 1788828.0 12136 32078.56380429425 0.9418105467635777 0.8295808338281161 0.831837004521098\r\n", "agg_8026 non-classical monocyte muscle tissue muscle healthy e5f58829-1a66-40b5-a624-9046778e74f5 scimilarity nan nan 321 23778338.0 13169 27977.08919973388 0.8814221140322411 0.7869360982600341 0.7677337716492793\r\n", "agg_8034 non-classical monocyte right cardiac atrium heart healthy b52eb423-5d0d-4645-b217-e1c6d38b2e72 scimilarity nan nan 70 408965.0 10146 29746.121058681525 0.9252128588702657 0.8186384457806745 0.8103901461358153\r\n", "agg_8036 non-classical monocyte spleen spleen healthy 4d74781b-8186-4c9a-b659-ff4dc4601d91 scimilarity nan nan 336 1586973.0 11934 30580.798338873254 0.9471354298985436 0.8378586626071394 0.8322814736104384\r\n", "agg_8039 non-classical monocyte thoracic lymph node lymph node healthy 62ef75e4-cbea-454e-a0ce-998ec40223d3 scimilarity nan nan 1950 18888557.0 15221 33581.76389036113 0.9559331607078543 0.8537562341521224 0.8469325123405803\r\n", "agg_8040 non-classical monocyte thymus thymus healthy 62ef75e4-cbea-454e-a0ce-998ec40223d3 scimilarity nan nan 68 441502.0 10301 31247.452676451114 0.917407487628822 0.8223254514108821 0.8197270854172879\r\n", "agg_8041 non-classical monocyte trachea airway healthy 03f821b4-87be-4ff4-b65a-b5fc00061da7 scimilarity nan nan 12 97840.0 7863 29204.083428972655 0.8702163087134754 0.7685799000235117 0.7716054237714325\r\n", "agg_8042 non-classical monocyte trachea airway healthy e5f58829-1a66-40b5-a624-9046778e74f5 scimilarity nan nan 16 534304.0 8970 24807.157880956805 0.8232311272852844 0.7180336962121132 0.7003798955542161\r\n", "agg_8044 non-classical monocyte transverse colon gut healthy 62ef75e4-cbea-454e-a0ce-998ec40223d3 scimilarity nan nan 135 
908452.0 11380 32804.296806012215 0.933818206129888 0.8353788654625934 0.8345447172426862\r\n", "agg_8045 non-classical monocyte upper lobe of lung lung healthy GSE169471 scimilarity nan nan 32 143341.0 7269 24395.316476390286 0.8673226900308589 0.7523857428733156 0.7403920706432944\r\n", "agg_8046 non-classical monocyte vasculature vasculature healthy e5f58829-1a66-40b5-a624-9046778e74f5 scimilarity nan nan 987 10267209.0 15019 35019.47682838501 0.9379668399645168 0.8304128458333673 0.8283221722893935\r\n" ] } ], "source": [ "! decima query-cell 'cell_type.str.contains(\"monocyte\") and disease == \"healthy\"' | column -t -s $'\\t'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The query above matches every monocyte subtype in every tissue. The query below narrows the selection to cells that are:\n", "- classical monocytes (`cell_type == \"classical monocyte\"`)\n", "- from healthy donors (`disease == \"healthy\"`)\n", "- from blood tissue (`tissue == \"blood\"`)" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "execution": { "iopub.execute_input": "2025-11-21T06:29:35.059952Z", "iopub.status.busy": "2025-11-21T06:29:35.059766Z", "iopub.status.idle": "2025-11-21T06:29:49.968177Z", "shell.execute_reply": "2025-11-21T06:29:49.967575Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "/home/celikm5/miniforge3/envs/decima2/lib/python3.11/site-packages/pydantic/_internal/_generate_schema.py:2249: UnsupportedFieldAttributeWarning: The 'repr' attribute with value False was provided to the `Field()` function, which has no effect in the context it was used. 'repr' is field-specific metadata, and can only be attached to a model field using `Annotated` metadata or by assignment. 
This may have happened because an `Annotated` type alias using the `type` statement was used, or if the `Field()` function was attached to a single member of a union type.\r\n", " warnings.warn(\r\n", "/home/celikm5/miniforge3/envs/decima2/lib/python3.11/site-packages/pydantic/_internal/_generate_schema.py:2249: UnsupportedFieldAttributeWarning: The 'frozen' attribute with value True was provided to the `Field()` function, which has no effect in the context it was used. 'frozen' is field-specific metadata, and can only be attached to a model field using `Annotated` metadata or by assignment. This may have happened because an `Annotated` type alias using the `type` statement was used, or if the `Field()` function was attached to a single member of a union type.\r\n", " warnings.warn(\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[34m\u001b[1mwandb\u001b[0m: Currently logged in as: \u001b[33mmhcelik\u001b[0m (\u001b[33mmhcw\u001b[0m) to \u001b[32mhttps://api.wandb.ai\u001b[0m. Use \u001b[1m`wandb login --relogin`\u001b[0m to force relogin\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[34m\u001b[1mwandb\u001b[0m: Downloading large artifact 'metadata:latest', 3122.32MB. 1 files...\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[34m\u001b[1mwandb\u001b[0m: 1 of 1 files downloaded. \r\n", "Done. 
00:00:01.9 (1663.1MB/s)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " cell_type tissue organ disease study dataset region subregion celltype_coarse n_cells total_counts n_genes size_factor train_pearson val_pearson test_pearson\r\n", "agg_4733 classical monocyte blood blood healthy 03f821b4-87be-4ff4-b65a-b5fc00061da7 scimilarity nan nan 32464 109280914.0 16158 33646.02843110038 0.9568728712803031 0.8545324533535094 0.8487445540580735\r\n", "agg_4734 classical monocyte blood blood healthy 436154da-bcf1-4130-9c8b-120ff9a888f2 scimilarity nan nan 76800 206490628.0 16683 30736.453324546856 0.955313650235467 0.8494127267867799 0.84210567312908\r\n", "agg_4735 classical monocyte blood blood healthy 5d445965-6f1a-4b68-ba3a-b8f765155d3a scimilarity nan nan 1044 3638976.0 12306 30542.8977364245 0.9465384578921433 0.851991795617166 0.8308191715256182\r\n", "agg_4736 classical monocyte blood blood healthy DS000010023 scimilarity nan nan 243 362606.0 8414 25201.261024165953 0.8865446516098515 0.7621391376670988 0.7681818769796727\r\n", "agg_4737 classical monocyte blood blood healthy GSE122703 scimilarity nan nan 18 83417.0 7546 28194.725173315186 0.859225612396475 0.7640890699253299 0.7443681539461423\r\n", "agg_4738 classical monocyte blood blood healthy GSE130117 scimilarity nan nan 2017 7130588.0 13535 33078.53692191542 0.9553673450160365 0.851402109626239 0.8385936100758409\r\n", "agg_4739 classical monocyte blood blood healthy GSE132802 scimilarity nan nan 1601 9955248.0 13132 32063.630951743195 0.9478882791739611 0.8391025143866828 0.8303465877530952\r\n", "agg_4740 classical monocyte blood blood healthy GSE139324 scimilarity nan nan 2333 8331045.0 13985 31135.881287246768 0.9608208780142045 0.8473885992448625 0.8432790193723467\r\n", "agg_4741 classical monocyte blood blood healthy GSE145809 scimilarity nan nan 69 245221.0 8962 29135.67197629852 0.8825701041728526 0.7811799267734735 0.7818647625179129\r\n", "agg_4742 classical monocyte blood blood 
healthy GSE149313 scimilarity nan nan 2420 6974751.0 13143 29560.496854576566 0.9574598613513423 0.8505290963248237 0.8379199735887167\r\n", "agg_4743 classical monocyte blood blood healthy GSE153421 scimilarity nan nan 3691 15561725.0 14569 34377.465875728165 0.9636686704566925 0.8576434473562725 0.8511814190737197\r\n", "agg_4744 classical monocyte blood blood healthy GSE156989 scimilarity nan nan 13554 160011485.0 16915 34135.439844737564 0.9640667421350761 0.8577967800377495 0.8517975138366085\r\n", "agg_4745 classical monocyte blood blood healthy GSE157829 scimilarity nan nan 1619 6957811.0 13507 30199.39288988673 0.9484019976492215 0.8436979316400604 0.8347196616710685\r\n", "agg_4746 classical monocyte blood blood healthy GSE159113 scimilarity nan nan 1025 6298250.0 12083 27477.50809897617 0.9078020151513733 0.8121457150205226 0.7980372877810575\r\n", "agg_4747 classical monocyte blood blood healthy GSE161329 scimilarity nan nan 5654 25653579.0 14349 28848.0539929647 0.9549801428956252 0.8450430950674043 0.8406188789518544\r\n", "agg_4748 classical monocyte blood blood healthy GSE161738 scimilarity nan nan 2676 13801473.0 12825 33337.477050230416 0.9541962906717452 0.8512846409758499 0.8485408028961247\r\n", "agg_4749 classical monocyte blood blood healthy GSE163668 scimilarity nan nan 2644 10486314.0 14049 33786.96584264489 0.9597801578342394 0.8560775485935677 0.8512149509551471\r\n", "agg_4750 classical monocyte blood blood healthy GSE166992 scimilarity nan nan 7501 28033216.0 15079 33455.367364577316 0.9622273594219685 0.8558958139235102 0.8495571689751152\r\n", "agg_4751 classical monocyte blood blood healthy GSE167363 scimilarity nan nan 3135 14722635.0 14375 29977.24002819913 0.942417448875388 0.8368071803109702 0.8258536430202982\r\n", "agg_4752 classical monocyte blood blood healthy GSE168710 scimilarity nan nan 16484 104881872.0 16223 34107.336261357574 0.9398282119039322 0.8424821834537695 0.8372971004604842\r\n", "agg_4753 classical monocyte 
blood blood healthy GSE168732 scimilarity nan nan 770 2548822.0 12508 33411.30103713399 0.9552513581030765 0.8508279875038706 0.847461536110767\r\n", "agg_4754 classical monocyte blood blood healthy b0cf0afa-ec40-4d65-b570-ed4ceacc6813 scimilarity nan nan 40975 300555227.0 15784 35938.85772500803 0.9622425892039956 0.853424173800979 0.8508714303589978\r\n", "agg_4755 classical monocyte blood blood healthy ddfad306-714d-4cc0-9985-d9072820c530 scimilarity nan nan 8827 36073928.0 15131 33208.591584008376 0.9546118779961532 0.8543086616569785 0.8462739374830107\r\n" ] } ], "source": [ "! decima query-cell 'cell_type == \"classical monocyte\" and disease == \"healthy\" and tissue == \"blood\"' | column -t -s $'\\t'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Attribution calling with custom genes and sequences" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This section demonstrates how to call attributions on custom sequences with the Decima command-line interface.\n", "You can provide your own FASTA file containing sequences of interest and run attribution analysis\n", "for any set of genes or genomic regions.\n", "The following examples show how to inspect your FASTA file, run attributions, and explore the output files.\n", "\n", "The FASTA header line for each sequence contains the gene name and the coordinates of the masked region used for attribution analysis.\n", "For example, in the header\n", "\n", "    CD68|gene_mask_start=163840|gene_mask_end=166460\n", "\n", "`CD68` is the gene name, and `gene_mask_start` and `gene_mask_end` specify the start and end positions (relative to the input sequence) of the region that was masked and analyzed for attributions." 
] }, { "cell_type": "code", "execution_count": 17, "metadata": { "execution": { "iopub.execute_input": "2025-11-21T06:29:49.970038Z", "iopub.status.busy": "2025-11-21T06:29:49.969871Z", "iopub.status.idle": "2025-11-21T06:29:50.102142Z", "shell.execute_reply": "2025-11-21T06:29:50.101542Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "cat: ../tests/data/seqs.fasta: No such file or directory\r\n" ] } ], "source": [ "! cat ../tests/data/seqs.fasta | cut -c 1-200" ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "execution": { "iopub.execute_input": "2025-11-21T06:29:50.103740Z", "iopub.status.busy": "2025-11-21T06:29:50.103580Z", "iopub.status.idle": "2025-11-21T06:30:36.906607Z", "shell.execute_reply": "2025-11-21T06:30:36.905975Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "/home/celikm5/miniforge3/envs/decima2/lib/python3.11/site-packages/pydantic/_internal/_generate_schema.py:2249: UnsupportedFieldAttributeWarning: The 'repr' attribute with value False was provided to the `Field()` function, which has no effect in the context it was used. 'repr' is field-specific metadata, and can only be attached to a model field using `Annotated` metadata or by assignment. This may have happened because an `Annotated` type alias using the `type` statement was used, or if the `Field()` function was attached to a single member of a union type.\r\n", " warnings.warn(\r\n", "/home/celikm5/miniforge3/envs/decima2/lib/python3.11/site-packages/pydantic/_internal/_generate_schema.py:2249: UnsupportedFieldAttributeWarning: The 'frozen' attribute with value True was provided to the `Field()` function, which has no effect in the context it was used. 'frozen' is field-specific metadata, and can only be attached to a model field using `Annotated` metadata or by assignment. 
This may have happened because an `Annotated` type alias using the `type` statement was used, or if the `Field()` function was attached to a single member of a union type.\r\n", " warnings.warn(\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "decima - INFO - Using device: 0\r\n", "decima - INFO - Loading model v1_rep0 and metadata to compute attributions...\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[34m\u001b[1mwandb\u001b[0m: Currently logged in as: \u001b[33mmhcelik\u001b[0m (\u001b[33mmhcw\u001b[0m) to \u001b[32mhttps://api.wandb.ai\u001b[0m. Use \u001b[1m`wandb login --relogin`\u001b[0m to force relogin\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[34m\u001b[1mwandb\u001b[0m: Downloading large artifact 'rep0:latest', 720.03MB. 1 files...\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[34m\u001b[1mwandb\u001b[0m: 1 of 1 files downloaded. \r\n", "Done. 00:00:00.9 (837.1MB/s)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[34m\u001b[1mwandb\u001b[0m: Downloading large artifact 'metadata:latest', 3122.32MB. 1 files...\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[34m\u001b[1mwandb\u001b[0m: 1 of 1 files downloaded. \r\n", "Done. 00:00:02.0 (1562.9MB/s)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "/home/celikm5/Projects/decima/src/decima/interpret/attributer.py:66: UserWarning: `off_tasks` is not provided. 
Using all other tasks as off_tasks.\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\r", "Computing attributions...: 0%| | 0/2 [00:00" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "import matplotlib.pyplot as plt\n", "\n", "attribution.plot_seqlogo(relative_loc=291)\n", "plt.show()" ] }, { "cell_type": "code", "execution_count": 24, "metadata": { "execution": { "iopub.execute_input": "2025-11-21T06:30:56.744352Z", "iopub.status.busy": "2025-11-21T06:30:56.744216Z", "iopub.status.idle": "2025-11-21T06:31:03.814731Z", "shell.execute_reply": "2025-11-21T06:31:03.814125Z" } }, "outputs": [ { "data": { "image/png": "" }, "metadata": { "image/png": { "height": 200, "width": 1000 } }, "output_type": "display_data" } ], "source": [ "attribution.plot_peaks()" ] }, { "cell_type": "code", "execution_count": 25, "metadata": { "execution": { "iopub.execute_input": "2025-11-21T06:31:03.816125Z", "iopub.status.busy": "2025-11-21T06:31:03.815965Z", "iopub.status.idle": "2025-11-21T06:31:03.905305Z", "shell.execute_reply": "2025-11-21T06:31:03.904884Z" } }, "outputs": [], "source": [ "import torch\n", "from decima import predict_attributions_seqlet_calling\n", "\n", "device = \"cuda\" if torch.cuda.is_available() else \"cpu\"\n", "\n", "%matplotlib inline" ] }, { "cell_type": "code", "execution_count": 26, "metadata": { "execution": { "iopub.execute_input": "2025-11-21T06:31:03.906863Z", "iopub.status.busy": "2025-11-21T06:31:03.906719Z", "iopub.status.idle": "2025-11-21T06:31:03.908718Z", "shell.execute_reply": "2025-11-21T06:31:03.908328Z" } }, "outputs": [], "source": [ "spi1_cell_types = [\n", " \"classical monocyte\",\n", " \"intermediate monocyte\",\n", " \"non-classical monocyte\",\n", " \"alveolar macrophage\",\n", " \"macrophage\",\n", "]" ] }, { "cell_type": "code", "execution_count": 27, "metadata": { "execution": { "iopub.execute_input": "2025-11-21T06:31:03.910006Z", "iopub.status.busy": "2025-11-21T06:31:03.909877Z", 
"iopub.status.idle": "2025-11-21T06:32:07.465527Z", "shell.execute_reply": "2025-11-21T06:32:07.464802Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "\u001b[34m\u001b[1mwandb\u001b[0m: Downloading large artifact 'rep0:latest', 720.03MB. 1 files...\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "\u001b[34m\u001b[1mwandb\u001b[0m: 1 of 1 files downloaded. \n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "Done. 00:00:00.6 (1180.1MB/s)\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "\u001b[34m\u001b[1mwandb\u001b[0m: Downloading large artifact 'metadata:latest', 3122.32MB. 1 files...\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "\u001b[34m\u001b[1mwandb\u001b[0m: 1 of 1 files downloaded. \n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "Done. 00:00:01.8 (1694.4MB/s)\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "/home/celikm5/Projects/decima/src/decima/interpret/attributer.py:66: UserWarning: `off_tasks` is not provided. 
Using all other tasks as off_tasks.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "\r", "Computing attributions...: 0%| | 0/1 [00:00 \u001b[39m\u001b[32m3\u001b[39m df_seqs = \u001b[43mpd\u001b[49m\u001b[43m.\u001b[49m\u001b[43mread_csv\u001b[49m\u001b[43m(\u001b[49m\u001b[33;43m\"\u001b[39;49m\u001b[33;43m../tests/data/seqs.csv\u001b[39;49m\u001b[33;43m\"\u001b[39;49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mindex_col\u001b[49m\u001b[43m=\u001b[49m\u001b[32;43m0\u001b[39;49m\u001b[43m)\u001b[49m\n\u001b[32m 4\u001b[39m df_seqs\n", "\u001b[36mFile \u001b[39m\u001b[32m~/miniforge3/envs/decima2/lib/python3.11/site-packages/pandas/io/parsers/readers.py:1026\u001b[39m, in \u001b[36mread_csv\u001b[39m\u001b[34m(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, skipfooter, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, date_format, dayfirst, cache_dates, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, doublequote, escapechar, comment, encoding, encoding_errors, dialect, on_bad_lines, delim_whitespace, low_memory, memory_map, float_precision, storage_options, dtype_backend)\u001b[39m\n\u001b[32m 1013\u001b[39m kwds_defaults = _refine_defaults_read(\n\u001b[32m 1014\u001b[39m dialect,\n\u001b[32m 1015\u001b[39m delimiter,\n\u001b[32m (...)\u001b[39m\u001b[32m 1022\u001b[39m dtype_backend=dtype_backend,\n\u001b[32m 1023\u001b[39m )\n\u001b[32m 1024\u001b[39m kwds.update(kwds_defaults)\n\u001b[32m-> \u001b[39m\u001b[32m1026\u001b[39m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[43m_read\u001b[49m\u001b[43m(\u001b[49m\u001b[43mfilepath_or_buffer\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mkwds\u001b[49m\u001b[43m)\u001b[49m\n", "\u001b[36mFile 
\u001b[39m\u001b[32m~/miniforge3/envs/decima2/lib/python3.11/site-packages/pandas/io/parsers/readers.py:620\u001b[39m, in \u001b[36m_read\u001b[39m\u001b[34m(filepath_or_buffer, kwds)\u001b[39m\n\u001b[32m 617\u001b[39m _validate_names(kwds.get(\u001b[33m\"\u001b[39m\u001b[33mnames\u001b[39m\u001b[33m\"\u001b[39m, \u001b[38;5;28;01mNone\u001b[39;00m))\n\u001b[32m 619\u001b[39m \u001b[38;5;66;03m# Create the parser.\u001b[39;00m\n\u001b[32m--> \u001b[39m\u001b[32m620\u001b[39m parser = \u001b[43mTextFileReader\u001b[49m\u001b[43m(\u001b[49m\u001b[43mfilepath_or_buffer\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43m*\u001b[49m\u001b[43m*\u001b[49m\u001b[43mkwds\u001b[49m\u001b[43m)\u001b[49m\n\u001b[32m 622\u001b[39m \u001b[38;5;28;01mif\u001b[39;00m chunksize \u001b[38;5;129;01mor\u001b[39;00m iterator:\n\u001b[32m 623\u001b[39m \u001b[38;5;28;01mreturn\u001b[39;00m parser\n", "\u001b[36mFile \u001b[39m\u001b[32m~/miniforge3/envs/decima2/lib/python3.11/site-packages/pandas/io/parsers/readers.py:1620\u001b[39m, in \u001b[36mTextFileReader.__init__\u001b[39m\u001b[34m(self, f, engine, **kwds)\u001b[39m\n\u001b[32m 1617\u001b[39m \u001b[38;5;28mself\u001b[39m.options[\u001b[33m\"\u001b[39m\u001b[33mhas_index_names\u001b[39m\u001b[33m\"\u001b[39m] = kwds[\u001b[33m\"\u001b[39m\u001b[33mhas_index_names\u001b[39m\u001b[33m\"\u001b[39m]\n\u001b[32m 1619\u001b[39m \u001b[38;5;28mself\u001b[39m.handles: IOHandles | \u001b[38;5;28;01mNone\u001b[39;00m = \u001b[38;5;28;01mNone\u001b[39;00m\n\u001b[32m-> \u001b[39m\u001b[32m1620\u001b[39m \u001b[38;5;28mself\u001b[39m._engine = \u001b[38;5;28;43mself\u001b[39;49m\u001b[43m.\u001b[49m\u001b[43m_make_engine\u001b[49m\u001b[43m(\u001b[49m\u001b[43mf\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[43m.\u001b[49m\u001b[43mengine\u001b[49m\u001b[43m)\u001b[49m\n", "\u001b[36mFile 
\u001b[39m\u001b[32m~/miniforge3/envs/decima2/lib/python3.11/site-packages/pandas/io/parsers/readers.py:1880\u001b[39m, in \u001b[36mTextFileReader._make_engine\u001b[39m\u001b[34m(self, f, engine)\u001b[39m\n\u001b[32m 1878\u001b[39m \u001b[38;5;28;01mif\u001b[39;00m \u001b[33m\"\u001b[39m\u001b[33mb\u001b[39m\u001b[33m\"\u001b[39m \u001b[38;5;129;01mnot\u001b[39;00m \u001b[38;5;129;01min\u001b[39;00m mode:\n\u001b[32m 1879\u001b[39m mode += \u001b[33m\"\u001b[39m\u001b[33mb\u001b[39m\u001b[33m\"\u001b[39m\n\u001b[32m-> \u001b[39m\u001b[32m1880\u001b[39m \u001b[38;5;28mself\u001b[39m.handles = \u001b[43mget_handle\u001b[49m\u001b[43m(\u001b[49m\n\u001b[32m 1881\u001b[39m \u001b[43m \u001b[49m\u001b[43mf\u001b[49m\u001b[43m,\u001b[49m\n\u001b[32m 1882\u001b[39m \u001b[43m \u001b[49m\u001b[43mmode\u001b[49m\u001b[43m,\u001b[49m\n\u001b[32m 1883\u001b[39m \u001b[43m \u001b[49m\u001b[43mencoding\u001b[49m\u001b[43m=\u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[43m.\u001b[49m\u001b[43moptions\u001b[49m\u001b[43m.\u001b[49m\u001b[43mget\u001b[49m\u001b[43m(\u001b[49m\u001b[33;43m\"\u001b[39;49m\u001b[33;43mencoding\u001b[39;49m\u001b[33;43m\"\u001b[39;49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;28;43;01mNone\u001b[39;49;00m\u001b[43m)\u001b[49m\u001b[43m,\u001b[49m\n\u001b[32m 1884\u001b[39m \u001b[43m \u001b[49m\u001b[43mcompression\u001b[49m\u001b[43m=\u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[43m.\u001b[49m\u001b[43moptions\u001b[49m\u001b[43m.\u001b[49m\u001b[43mget\u001b[49m\u001b[43m(\u001b[49m\u001b[33;43m\"\u001b[39;49m\u001b[33;43mcompression\u001b[39;49m\u001b[33;43m\"\u001b[39;49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;28;43;01mNone\u001b[39;49;00m\u001b[43m)\u001b[49m\u001b[43m,\u001b[49m\n\u001b[32m 1885\u001b[39m \u001b[43m 
\u001b[49m\u001b[43mmemory_map\u001b[49m\u001b[43m=\u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[43m.\u001b[49m\u001b[43moptions\u001b[49m\u001b[43m.\u001b[49m\u001b[43mget\u001b[49m\u001b[43m(\u001b[49m\u001b[33;43m\"\u001b[39;49m\u001b[33;43mmemory_map\u001b[39;49m\u001b[33;43m\"\u001b[39;49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;28;43;01mFalse\u001b[39;49;00m\u001b[43m)\u001b[49m\u001b[43m,\u001b[49m\n\u001b[32m 1886\u001b[39m \u001b[43m \u001b[49m\u001b[43mis_text\u001b[49m\u001b[43m=\u001b[49m\u001b[43mis_text\u001b[49m\u001b[43m,\u001b[49m\n\u001b[32m 1887\u001b[39m \u001b[43m \u001b[49m\u001b[43merrors\u001b[49m\u001b[43m=\u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[43m.\u001b[49m\u001b[43moptions\u001b[49m\u001b[43m.\u001b[49m\u001b[43mget\u001b[49m\u001b[43m(\u001b[49m\u001b[33;43m\"\u001b[39;49m\u001b[33;43mencoding_errors\u001b[39;49m\u001b[33;43m\"\u001b[39;49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[33;43m\"\u001b[39;49m\u001b[33;43mstrict\u001b[39;49m\u001b[33;43m\"\u001b[39;49m\u001b[43m)\u001b[49m\u001b[43m,\u001b[49m\n\u001b[32m 1888\u001b[39m \u001b[43m \u001b[49m\u001b[43mstorage_options\u001b[49m\u001b[43m=\u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[43m.\u001b[49m\u001b[43moptions\u001b[49m\u001b[43m.\u001b[49m\u001b[43mget\u001b[49m\u001b[43m(\u001b[49m\u001b[33;43m\"\u001b[39;49m\u001b[33;43mstorage_options\u001b[39;49m\u001b[33;43m\"\u001b[39;49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;28;43;01mNone\u001b[39;49;00m\u001b[43m)\u001b[49m\u001b[43m,\u001b[49m\n\u001b[32m 1889\u001b[39m \u001b[43m\u001b[49m\u001b[43m)\u001b[49m\n\u001b[32m 1890\u001b[39m \u001b[38;5;28;01massert\u001b[39;00m \u001b[38;5;28mself\u001b[39m.handles \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;129;01mnot\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m\n\u001b[32m 1891\u001b[39m f = \u001b[38;5;28mself\u001b[39m.handles.handle\n", "\u001b[36mFile 
\u001b[39m\u001b[32m~/miniforge3/envs/decima2/lib/python3.11/site-packages/pandas/io/common.py:873\u001b[39m, in \u001b[36mget_handle\u001b[39m\u001b[34m(path_or_buf, mode, encoding, compression, memory_map, is_text, errors, storage_options)\u001b[39m\n\u001b[32m 868\u001b[39m \u001b[38;5;28;01melif\u001b[39;00m \u001b[38;5;28misinstance\u001b[39m(handle, \u001b[38;5;28mstr\u001b[39m):\n\u001b[32m 869\u001b[39m \u001b[38;5;66;03m# Check whether the filename is to be opened in binary mode.\u001b[39;00m\n\u001b[32m 870\u001b[39m \u001b[38;5;66;03m# Binary mode does not support 'encoding' and 'newline'.\u001b[39;00m\n\u001b[32m 871\u001b[39m \u001b[38;5;28;01mif\u001b[39;00m ioargs.encoding \u001b[38;5;129;01mand\u001b[39;00m \u001b[33m\"\u001b[39m\u001b[33mb\u001b[39m\u001b[33m\"\u001b[39m \u001b[38;5;129;01mnot\u001b[39;00m \u001b[38;5;129;01min\u001b[39;00m ioargs.mode:\n\u001b[32m 872\u001b[39m \u001b[38;5;66;03m# Encoding\u001b[39;00m\n\u001b[32m--> \u001b[39m\u001b[32m873\u001b[39m handle = \u001b[38;5;28;43mopen\u001b[39;49m\u001b[43m(\u001b[49m\n\u001b[32m 874\u001b[39m \u001b[43m \u001b[49m\u001b[43mhandle\u001b[49m\u001b[43m,\u001b[49m\n\u001b[32m 875\u001b[39m \u001b[43m \u001b[49m\u001b[43mioargs\u001b[49m\u001b[43m.\u001b[49m\u001b[43mmode\u001b[49m\u001b[43m,\u001b[49m\n\u001b[32m 876\u001b[39m \u001b[43m \u001b[49m\u001b[43mencoding\u001b[49m\u001b[43m=\u001b[49m\u001b[43mioargs\u001b[49m\u001b[43m.\u001b[49m\u001b[43mencoding\u001b[49m\u001b[43m,\u001b[49m\n\u001b[32m 877\u001b[39m \u001b[43m \u001b[49m\u001b[43merrors\u001b[49m\u001b[43m=\u001b[49m\u001b[43merrors\u001b[49m\u001b[43m,\u001b[49m\n\u001b[32m 878\u001b[39m \u001b[43m \u001b[49m\u001b[43mnewline\u001b[49m\u001b[43m=\u001b[49m\u001b[33;43m\"\u001b[39;49m\u001b[33;43m\"\u001b[39;49m\u001b[43m,\u001b[49m\n\u001b[32m 879\u001b[39m \u001b[43m \u001b[49m\u001b[43m)\u001b[49m\n\u001b[32m 880\u001b[39m \u001b[38;5;28;01melse\u001b[39;00m:\n\u001b[32m 881\u001b[39m \u001b[38;5;66;03m# Binary 
mode\u001b[39;00m\n\u001b[32m 882\u001b[39m handle = \u001b[38;5;28mopen\u001b[39m(handle, ioargs.mode)\n", "\u001b[31mFileNotFoundError\u001b[39m: [Errno 2] No such file or directory: '../tests/data/seqs.csv'" ] } ], "source": [ "import pandas as pd\n", "\n", "df_seqs = pd.read_csv(\"../tests/data/seqs.csv\", index_col=0)\n", "df_seqs" ] }, { "cell_type": "code", "execution_count": 29, "metadata": { "execution": { "iopub.execute_input": "2025-11-21T06:32:07.905316Z", "iopub.status.busy": "2025-11-21T06:32:07.905178Z", "iopub.status.idle": "2025-11-21T06:32:07.926685Z", "shell.execute_reply": "2025-11-21T06:32:07.926250Z" } }, "outputs": [ { "ename": "NameError", "evalue": "name 'df_seqs' is not defined", "output_type": "error", "traceback": [ "\u001b[31m---------------------------------------------------------------------------\u001b[39m", "\u001b[31mNameError\u001b[39m Traceback (most recent call last)", "\u001b[36mCell\u001b[39m\u001b[36m \u001b[39m\u001b[32mIn[29]\u001b[39m\u001b[32m, line 3\u001b[39m\n\u001b[32m 1\u001b[39m predict_attributions_seqlet_calling(\n\u001b[32m 2\u001b[39m output_prefix=\u001b[33m\"\u001b[39m\u001b[33mexample/attrs_custom_seqs_monoctypes\u001b[39m\u001b[33m\"\u001b[39m,\n\u001b[32m----> \u001b[39m\u001b[32m3\u001b[39m seqs=\u001b[43mdf_seqs\u001b[49m, \u001b[38;5;66;03m# <-- custom sequences\u001b[39;00m\n\u001b[32m 4\u001b[39m tasks=\u001b[33mf\u001b[39m\u001b[33m\"\u001b[39m\u001b[33mcell_type in \u001b[39m\u001b[38;5;132;01m{\u001b[39;00mspi1_cell_types\u001b[38;5;132;01m}\u001b[39;00m\u001b[33m\"\u001b[39m,\n\u001b[32m 5\u001b[39m device=device,\n\u001b[32m 6\u001b[39m )\n\u001b[32m 7\u001b[39m get_ipython().system(\u001b[33m'\u001b[39m\u001b[33m ls attrs_custom_seqs_monoctypes\u001b[39m\u001b[33m'\u001b[39m)\n", "\u001b[31mNameError\u001b[39m: name 'df_seqs' is not defined" ] } ], "source": [ "predict_attributions_seqlet_calling(\n", " output_prefix=\"example/attrs_custom_seqs_monoctypes\",\n", " seqs=df_seqs, # <-- custom 
sequences\n", " tasks=f\"cell_type in {spi1_cell_types}\",\n", " device=device,\n", ")\n", "! ls example/attrs_custom_seqs_monoctypes*" ] }, { "cell_type": "code", "execution_count": 30, "metadata": { "execution": { "iopub.execute_input": "2025-11-21T06:32:07.927982Z", "iopub.status.busy": "2025-11-21T06:32:07.927858Z", "iopub.status.idle": "2025-11-21T06:32:07.931232Z", "shell.execute_reply": "2025-11-21T06:32:07.930844Z" } }, "outputs": [ { "data": { "text/plain": [ "524288" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import random\n", "import torch\n", "from grelu.sequence.format import strings_to_one_hot\n", "from decima.constants import DECIMA_CONTEXT_SIZE\n", "\n", "DECIMA_CONTEXT_SIZE" ] }, { "cell_type": "code", "execution_count": 31, "metadata": { "execution": { "iopub.execute_input": "2025-11-21T06:32:07.932508Z", "iopub.status.busy": "2025-11-21T06:32:07.932376Z", "iopub.status.idle": "2025-11-21T06:32:08.108325Z", "shell.execute_reply": "2025-11-21T06:32:08.104954Z" } }, "outputs": [ { "data": { "text/plain": [ "torch.Size([1, 5, 524288])" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "seqs = torch.cat(\n", " [\n", " strings_to_one_hot(\n", " [\"\".join(random.choice([\"A\", \"T\", \"C\", \"G\"]) for _ in range(DECIMA_CONTEXT_SIZE))]\n", " ), # one-hot encoded sequence\n", " torch.ones(1, 1, DECIMA_CONTEXT_SIZE), # binary mask for the gene\n", " ],\n", " dim=1,\n", ")\n", "seqs.shape" ] }, { "cell_type": "code", "execution_count": 32, "metadata": { "execution": { "iopub.execute_input": "2025-11-21T06:32:08.109954Z", "iopub.status.busy": "2025-11-21T06:32:08.109816Z", "iopub.status.idle": "2025-11-21T06:32:34.000117Z", "shell.execute_reply": "2025-11-21T06:32:33.999519Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "\u001b[34m\u001b[1mwandb\u001b[0m: Downloading large artifact 'rep0:latest', 720.03MB. 
1 files...\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "\u001b[34m\u001b[1mwandb\u001b[0m: 1 of 1 files downloaded. \n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "Done. 00:00:00.6 (1145.4MB/s)\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "\u001b[34m\u001b[1mwandb\u001b[0m: Downloading large artifact 'metadata:latest', 3122.32MB. 1 files...\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "\u001b[34m\u001b[1mwandb\u001b[0m: 1 of 1 files downloaded. \n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "Done. 00:00:01.8 (1748.9MB/s)\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "/home/celikm5/Projects/decima/src/decima/interpret/attributer.py:66: UserWarning: `off_tasks` is not provided. Using all other tasks as off_tasks.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "\r", "Computing attributions...: 0%| | 0/1 [00:00\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", 
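"<table border=\"1\" class=\"dataframe\">\n",
"<thead>\n",
"<tr style=\"text-align: right;\"><th></th><th>cell_type</th><th>tissue</th><th>organ</th><th>disease</th><th>study</th><th>dataset</th><th>region</th><th>subregion</th><th>celltype_coarse</th><th>n_cells</th><th>total_counts</th><th>n_genes</th><th>size_factor</th><th>train_pearson</th><th>val_pearson</th><th>test_pearson</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>agg_4063</th><td>alveolar macrophage</td><td>alveolar system</td><td>lung</td><td>COVID-19</td><td>GSE155249</td><td>scimilarity</td><td>nan</td><td>nan</td><td>NaN</td><td>1453</td><td>8.001524e+06</td><td>14711</td><td>36293.472025</td><td>0.943059</td><td>0.837210</td><td>0.849998</td></tr>\n",
"<tr><th>agg_4064</th><td>alveolar macrophage</td><td>alveolar system</td><td>lung</td><td>healthy</td><td>GSE155249</td><td>scimilarity</td><td>nan</td><td>nan</td><td>NaN</td><td>1279</td><td>7.598244e+06</td><td>13673</td><td>34158.514496</td><td>0.932819</td><td>0.831024</td><td>0.843684</td></tr>\n",
"<tr><th>agg_4065</th><td>alveolar macrophage</td><td>left lung</td><td>lung</td><td>NA</td><td>ENCODE</td><td>scimilarity</td><td>nan</td><td>nan</td><td>NaN</td><td>405</td><td>3.000961e+06</td><td>16595</td><td>46501.375857</td><td>0.936081</td><td>0.847924</td><td>0.845485</td></tr>\n",
"<tr><th>agg_4066</th><td>alveolar macrophage</td><td>lingula of left lung</td><td>lung</td><td>healthy</td><td>a3ffde6c-7ad2-498a-903c-d58e732f7470</td><td>scimilarity</td><td>nan</td><td>nan</td><td>NaN</td><td>854</td><td>1.713753e+06</td><td>15110</td><td>42773.009735</td><td>0.893927</td><td>0.806000</td><td>0.804835</td></tr>\n",
"<tr><th>agg_4067</th><td>alveolar macrophage</td><td>lower lobe of left lung</td><td>lung</td><td>NA</td><td>ENCODE</td><td>scimilarity</td><td>nan</td><td>nan</td><td>NaN</td><td>763</td><td>1.344798e+07</td><td>17973</td><td>49020.804487</td><td>0.940586</td><td>0.854680</td><td>0.863014</td></tr>\n",
"<tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n",
"<tr><th>agg_6644</th><td>macrophage</td><td>uterus</td><td>uterus</td><td>healthy</td><td>32f2fd23-ec74-486f-9544-e5b2f41725f5</td><td>scimilarity</td><td>nan</td><td>nan</td><td>NaN</td><td>425</td><td>4.340830e+06</td><td>15233</td><td>36624.136739</td><td>0.954753</td><td>0.850247</td><td>0.843175</td></tr>\n",
"<tr><th>agg_6645</th><td>macrophage</td><td>uterus</td><td>uterus</td><td>healthy</td><td>e5f58829-1a66-40b5-a624-9046778e74f5</td><td>scimilarity</td><td>nan</td><td>nan</td><td>NaN</td><td>231</td><td>3.007554e+07</td><td>14787</td><td>27615.762157</td><td>0.839476</td><td>0.730554</td><td>0.719085</td></tr>\n",
"<tr><th>agg_6646</th><td>macrophage</td><td>vasculature</td><td>vasculature</td><td>healthy</td><td>e5f58829-1a66-40b5-a624-9046778e74f5</td><td>scimilarity</td><td>nan</td><td>nan</td><td>NaN</td><td>12497</td><td>4.040685e+08</td><td>18199</td><td>36829.498964</td><td>0.938862</td><td>0.836819</td><td>0.833474</td></tr>\n",
"<tr><th>agg_6647</th><td>macrophage</td><td>visceral fat</td><td>adipose</td><td>obesity</td><td>GSE128518</td><td>scimilarity</td><td>nan</td><td>nan</td><td>NaN</td><td>729</td><td>2.078431e+06</td><td>13760</td><td>34188.716187</td><td>0.941596</td><td>0.827360</td><td>0.823912</td></tr>\n",
"<tr><th>agg_6648</th><td>macrophage</td><td>white adipose tissue</td><td>adipose</td><td>NA</td><td>GSE128890</td><td>scimilarity</td><td>nan</td><td>nan</td><td>NaN</td><td>45</td><td>1.381560e+05</td><td>8257</td><td>27604.748095</td><td>0.859386</td><td>0.745328</td><td>0.745539</td></tr>\n",
"</tbody>\n",
"</table>\n",
"<p>325 rows × 16 columns</p>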
\n", "" ], "text/plain": [ " cell_type tissue organ disease \\\n", "agg_4063 alveolar macrophage alveolar system lung COVID-19 \n", "agg_4064 alveolar macrophage alveolar system lung healthy \n", "agg_4065 alveolar macrophage left lung lung NA \n", "agg_4066 alveolar macrophage lingula of left lung lung healthy \n", "agg_4067 alveolar macrophage lower lobe of left lung lung NA \n", "... ... ... ... ... \n", "agg_6644 macrophage uterus uterus healthy \n", "agg_6645 macrophage uterus uterus healthy \n", "agg_6646 macrophage vasculature vasculature healthy \n", "agg_6647 macrophage visceral fat adipose obesity \n", "agg_6648 macrophage white adipose tissue adipose NA \n", "\n", " study dataset region subregion \\\n", "agg_4063 GSE155249 scimilarity nan nan \n", "agg_4064 GSE155249 scimilarity nan nan \n", "agg_4065 ENCODE scimilarity nan nan \n", "agg_4066 a3ffde6c-7ad2-498a-903c-d58e732f7470 scimilarity nan nan \n", "agg_4067 ENCODE scimilarity nan nan \n", "... ... ... ... ... \n", "agg_6644 32f2fd23-ec74-486f-9544-e5b2f41725f5 scimilarity nan nan \n", "agg_6645 e5f58829-1a66-40b5-a624-9046778e74f5 scimilarity nan nan \n", "agg_6646 e5f58829-1a66-40b5-a624-9046778e74f5 scimilarity nan nan \n", "agg_6647 GSE128518 scimilarity nan nan \n", "agg_6648 GSE128890 scimilarity nan nan \n", "\n", " celltype_coarse n_cells total_counts n_genes size_factor \\\n", "agg_4063 NaN 1453 8.001524e+06 14711 36293.472025 \n", "agg_4064 NaN 1279 7.598244e+06 13673 34158.514496 \n", "agg_4065 NaN 405 3.000961e+06 16595 46501.375857 \n", "agg_4066 NaN 854 1.713753e+06 15110 42773.009735 \n", "agg_4067 NaN 763 1.344798e+07 17973 49020.804487 \n", "... ... ... ... ... ... 
\n", "agg_6644 NaN 425 4.340830e+06 15233 36624.136739 \n", "agg_6645 NaN 231 3.007554e+07 14787 27615.762157 \n", "agg_6646 NaN 12497 4.040685e+08 18199 36829.498964 \n", "agg_6647 NaN 729 2.078431e+06 13760 34188.716187 \n", "agg_6648 NaN 45 1.381560e+05 8257 27604.748095 \n", "\n", " train_pearson val_pearson test_pearson \n", "agg_4063 0.943059 0.837210 0.849998 \n", "agg_4064 0.932819 0.831024 0.843684 \n", "agg_4065 0.936081 0.847924 0.845485 \n", "agg_4066 0.893927 0.806000 0.804835 \n", "agg_4067 0.940586 0.854680 0.863014 \n", "... ... ... ... \n", "agg_6644 0.954753 0.850247 0.843175 \n", "agg_6645 0.839476 0.730554 0.719085 \n", "agg_6646 0.938862 0.836819 0.833474 \n", "agg_6647 0.941596 0.827360 0.823912 \n", "agg_6648 0.859386 0.745328 0.745539 \n", "\n", "[325 rows x 16 columns]" ] }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" } ], "source": [ "result.cell_metadata.query(\"cell_type.str.endswith('macrophage')\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The results and metadata are stored in AnnData format, which you can access directly if needed, but most operations are supported by the `DecimaResult` object." 
] }, { "cell_type": "code", "execution_count": 36, "metadata": { "execution": { "iopub.execute_input": "2025-11-21T06:32:37.883264Z", "iopub.status.busy": "2025-11-21T06:32:37.883127Z", "iopub.status.idle": "2025-11-21T06:32:37.885853Z", "shell.execute_reply": "2025-11-21T06:32:37.885430Z" } }, "outputs": [ { "data": { "text/plain": [ "AnnData object with n_obs × n_vars = 8856 × 18457\n", " obs: 'cell_type', 'tissue', 'organ', 'disease', 'study', 'dataset', 'region', 'subregion', 'celltype_coarse', 'n_cells', 'total_counts', 'n_genes', 'size_factor', 'train_pearson', 'val_pearson', 'test_pearson'\n", " var: 'chrom', 'start', 'end', 'strand', 'gene_type', 'frac_nan', 'mean_counts', 'n_tracks', 'gene_start', 'gene_end', 'gene_length', 'gene_mask_start', 'gene_mask_end', 'frac_N', 'fold', 'dataset', 'gene_id', 'pearson', 'size_factor_pearson', 'ensembl_canonical_tss'\n", " layers: 'preds', 'v1_rep0', 'v1_rep1', 'v1_rep2', 'v1_rep3'" ] }, "execution_count": 36, "metadata": {}, "output_type": "execute_result" } ], "source": [ "result.anndata" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "These are the cell metadata contained in the Decima object." ] }, { "cell_type": "code", "execution_count": 37, "metadata": { "execution": { "iopub.execute_input": "2025-11-21T06:32:37.887047Z", "iopub.status.busy": "2025-11-21T06:32:37.886918Z", "iopub.status.idle": "2025-11-21T06:32:37.897015Z", "shell.execute_reply": "2025-11-21T06:32:37.896599Z" } }, "outputs": [ { "data": { "text/html": [ "
\n",
"<table border=\"1\" class=\"dataframe\">\n",
"<thead>\n",
"<tr style=\"text-align: right;\"><th></th><th>cell_type</th><th>tissue</th><th>organ</th><th>disease</th><th>study</th><th>dataset</th><th>region</th><th>subregion</th><th>celltype_coarse</th><th>n_cells</th><th>total_counts</th><th>n_genes</th><th>size_factor</th><th>train_pearson</th><th>val_pearson</th><th>test_pearson</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>agg_0</th><td>Amygdala excitatory</td><td>Amygdala_Amygdala</td><td>CNS</td><td>healthy</td><td>jhpce#tran2021</td><td>brain_atlas</td><td>Amygdala</td><td>Amygdala</td><td>NaN</td><td>331</td><td>1.592883e+07</td><td>17000</td><td>41431.465186</td><td>0.942459</td><td>0.841377</td><td>0.865640</td></tr>\n",
"<tr><th>agg_1</th><td>Amygdala excitatory</td><td>Amygdala_Basolateral nuclear group (BLN) - lat...</td><td>CNS</td><td>healthy</td><td>SCR_016152</td><td>brain_atlas</td><td>Amygdala</td><td>Basolateral nuclear group (BLN) - lateral nucl...</td><td>NaN</td><td>11369</td><td>2.952133e+08</td><td>18080</td><td>40765.341481</td><td>0.943098</td><td>0.838936</td><td>0.861092</td></tr>\n",
"<tr><th>agg_2</th><td>Amygdala excitatory</td><td>Amygdala_Bed nucleus of stria terminalis and n...</td><td>CNS</td><td>healthy</td><td>SCR_016152</td><td>brain_atlas</td><td>Amygdala</td><td>Bed nucleus of stria terminalis and nearby - BNST</td><td>NaN</td><td>139</td><td>2.593231e+06</td><td>15418</td><td>42556.387020</td><td>0.952170</td><td>0.854544</td><td>0.866654</td></tr>\n",
"<tr><th>agg_3</th><td>Amygdala excitatory</td><td>Amygdala_Central nuclear group - CEN</td><td>CNS</td><td>healthy</td><td>SCR_016152</td><td>brain_atlas</td><td>Amygdala</td><td>Central nuclear group - CEN</td><td>NaN</td><td>3892</td><td>9.946371e+07</td><td>17959</td><td>42884.641430</td><td>0.959744</td><td>0.863585</td><td>0.881554</td></tr>\n",
"<tr><th>agg_4</th><td>Amygdala excitatory</td><td>Amygdala_Corticomedial nuclear group (CMN) - a...</td><td>CNS</td><td>healthy</td><td>SCR_016152</td><td>brain_atlas</td><td>Amygdala</td><td>Corticomedial nuclear group (CMN) - anterior c...</td><td>NaN</td><td>2945</td><td>1.281619e+08</td><td>17885</td><td>41816.741933</td><td>0.951365</td><td>0.854304</td><td>0.868902</td></tr>\n",
"<tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n",
"<tr><th>agg_9533</th><td>vascular associated smooth muscle cell</td><td>upper lobe of right lung</td><td>lung</td><td>NA</td><td>ENCODE</td><td>scimilarity</td><td>nan</td><td>nan</td><td>NaN</td><td>21</td><td>3.483375e+04</td><td>8515</td><td>35404.911768</td><td>0.735213</td><td>0.665647</td><td>0.654491</td></tr>\n",
"<tr><th>agg_9535</th><td>vascular associated smooth muscle cell</td><td>urinary bladder</td><td>urinary</td><td>healthy</td><td>GSE129845</td><td>scimilarity</td><td>nan</td><td>nan</td><td>NaN</td><td>24</td><td>8.498500e+04</td><td>7337</td><td>26189.415789</td><td>0.809852</td><td>0.690022</td><td>0.656160</td></tr>\n",
"<tr><th>agg_9536</th><td>vascular associated smooth muscle cell</td><td>uterus</td><td>uterus</td><td>NA</td><td>ENCODE</td><td>scimilarity</td><td>nan</td><td>nan</td><td>NaN</td><td>272</td><td>5.700762e+05</td><td>14769</td><td>44938.403867</td><td>0.915329</td><td>0.808941</td><td>0.839993</td></tr>\n",
"<tr><th>agg_9537</th><td>vascular associated smooth muscle cell</td><td>uterus</td><td>uterus</td><td>healthy</td><td>e5f58829-1a66-40b5-a624-9046778e74f5</td><td>scimilarity</td><td>nan</td><td>nan</td><td>NaN</td><td>472</td><td>1.089170e+07</td><td>14514</td><td>30145.422152</td><td>0.852339</td><td>0.717682</td><td>0.727469</td></tr>\n",
"<tr><th>agg_9538</th><td>vascular associated smooth muscle cell</td><td>vasculature</td><td>vasculature</td><td>healthy</td><td>e5f58829-1a66-40b5-a624-9046778e74f5</td><td>scimilarity</td><td>nan</td><td>nan</td><td>NaN</td><td>1853</td><td>5.992697e+07</td><td>16764</td><td>36464.273371</td><td>0.909855</td><td>0.780413</td><td>0.796351</td></tr>\n",
"</tbody>\n",
"</table>\n",
"<p>8856 rows × 16 columns</p>
\n", "
" ], "text/plain": [ " cell_type \\\n", "agg_0 Amygdala excitatory \n", "agg_1 Amygdala excitatory \n", "agg_2 Amygdala excitatory \n", "agg_3 Amygdala excitatory \n", "agg_4 Amygdala excitatory \n", "... ... \n", "agg_9533 vascular associated smooth muscle cell \n", "agg_9535 vascular associated smooth muscle cell \n", "agg_9536 vascular associated smooth muscle cell \n", "agg_9537 vascular associated smooth muscle cell \n", "agg_9538 vascular associated smooth muscle cell \n", "\n", " tissue organ \\\n", "agg_0 Amygdala_Amygdala CNS \n", "agg_1 Amygdala_Basolateral nuclear group (BLN) - lat... CNS \n", "agg_2 Amygdala_Bed nucleus of stria terminalis and n... CNS \n", "agg_3 Amygdala_Central nuclear group - CEN CNS \n", "agg_4 Amygdala_Corticomedial nuclear group (CMN) - a... CNS \n", "... ... ... \n", "agg_9533 upper lobe of right lung lung \n", "agg_9535 urinary bladder urinary \n", "agg_9536 uterus uterus \n", "agg_9537 uterus uterus \n", "agg_9538 vasculature vasculature \n", "\n", " disease study dataset \\\n", "agg_0 healthy jhpce#tran2021 brain_atlas \n", "agg_1 healthy SCR_016152 brain_atlas \n", "agg_2 healthy SCR_016152 brain_atlas \n", "agg_3 healthy SCR_016152 brain_atlas \n", "agg_4 healthy SCR_016152 brain_atlas \n", "... ... ... ... \n", "agg_9533 NA ENCODE scimilarity \n", "agg_9535 healthy GSE129845 scimilarity \n", "agg_9536 NA ENCODE scimilarity \n", "agg_9537 healthy e5f58829-1a66-40b5-a624-9046778e74f5 scimilarity \n", "agg_9538 healthy e5f58829-1a66-40b5-a624-9046778e74f5 scimilarity \n", "\n", " region subregion \\\n", "agg_0 Amygdala Amygdala \n", "agg_1 Amygdala Basolateral nuclear group (BLN) - lateral nucl... \n", "agg_2 Amygdala Bed nucleus of stria terminalis and nearby - BNST \n", "agg_3 Amygdala Central nuclear group - CEN \n", "agg_4 Amygdala Corticomedial nuclear group (CMN) - anterior c... \n", "... ... ... 
\n", "agg_9533 nan nan \n", "agg_9535 nan nan \n", "agg_9536 nan nan \n", "agg_9537 nan nan \n", "agg_9538 nan nan \n", "\n", " celltype_coarse n_cells total_counts n_genes size_factor \\\n", "agg_0 NaN 331 1.592883e+07 17000 41431.465186 \n", "agg_1 NaN 11369 2.952133e+08 18080 40765.341481 \n", "agg_2 NaN 139 2.593231e+06 15418 42556.387020 \n", "agg_3 NaN 3892 9.946371e+07 17959 42884.641430 \n", "agg_4 NaN 2945 1.281619e+08 17885 41816.741933 \n", "... ... ... ... ... ... \n", "agg_9533 NaN 21 3.483375e+04 8515 35404.911768 \n", "agg_9535 NaN 24 8.498500e+04 7337 26189.415789 \n", "agg_9536 NaN 272 5.700762e+05 14769 44938.403867 \n", "agg_9537 NaN 472 1.089170e+07 14514 30145.422152 \n", "agg_9538 NaN 1853 5.992697e+07 16764 36464.273371 \n", "\n", " train_pearson val_pearson test_pearson \n", "agg_0 0.942459 0.841377 0.865640 \n", "agg_1 0.943098 0.838936 0.861092 \n", "agg_2 0.952170 0.854544 0.866654 \n", "agg_3 0.959744 0.863585 0.881554 \n", "agg_4 0.951365 0.854304 0.868902 \n", "... ... ... ... \n", "agg_9533 0.735213 0.665647 0.654491 \n", "agg_9535 0.809852 0.690022 0.656160 \n", "agg_9536 0.915329 0.808941 0.839993 \n", "agg_9537 0.852339 0.717682 0.727469 \n", "agg_9538 0.909855 0.780413 0.796351 \n", "\n", "[8856 rows x 16 columns]" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "result.cell_metadata" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Similarly, these are the gene metadata contained in the Decima object." ] }, { "cell_type": "code", "execution_count": 38, "metadata": { "execution": { "iopub.execute_input": "2025-11-21T06:32:37.898318Z", "iopub.status.busy": "2025-11-21T06:32:37.898189Z", "iopub.status.idle": "2025-11-21T06:32:37.908345Z", "shell.execute_reply": "2025-11-21T06:32:37.907862Z" } }, "outputs": [ { "data": { "text/html": [ "
\n",
"<table border=\"1\" class=\"dataframe\">\n",
"<thead>\n",
"<tr style=\"text-align: right;\"><th></th><th>chrom</th><th>start</th><th>end</th><th>strand</th><th>gene_type</th><th>frac_nan</th><th>mean_counts</th><th>n_tracks</th><th>gene_start</th><th>gene_end</th><th>gene_length</th><th>gene_mask_start</th><th>gene_mask_end</th><th>frac_N</th><th>fold</th><th>dataset</th><th>gene_id</th><th>pearson</th><th>size_factor_pearson</th><th>ensembl_canonical_tss</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>STRADA</th><td>chr17</td><td>63381538</td><td>63905826</td><td>-</td><td>protein_coding</td><td>0.000000</td><td>2.208074</td><td>7616</td><td>63682336</td><td>63741986</td><td>59650</td><td>163840</td><td>223490</td><td>0.000000</td><td>['fold1']</td><td>train</td><td>ENSG00000266173</td><td>0.469923</td><td>0.476627</td><td>63741799.0</td></tr>\n",
"<tr><th>ETV4</th><td>chr17</td><td>43219172</td><td>43743460</td><td>-</td><td>protein_coding</td><td>0.030873</td><td>0.925863</td><td>5004</td><td>43527844</td><td>43579620</td><td>51776</td><td>163840</td><td>215616</td><td>0.000000</td><td>['fold1']</td><td>train</td><td>ENSG00000175832</td><td>0.738092</td><td>0.613281</td><td>43546340.0</td></tr>\n",
"<tr><th>USP25</th><td>chr21</td><td>15566185</td><td>16090473</td><td>+</td><td>protein_coding</td><td>0.000000</td><td>3.650355</td><td>8604</td><td>15730025</td><td>15880069</td><td>150044</td><td>163840</td><td>313884</td><td>0.000000</td><td>['fold6']</td><td>train</td><td>ENSG00000155313</td><td>0.905222</td><td>0.784446</td><td>15729982.0</td></tr>\n",
"<tr><th>ZSWIM5</th><td>chr1</td><td>44945761</td><td>45470049</td><td>-</td><td>protein_coding</td><td>0.000620</td><td>2.190115</td><td>6123</td><td>45016399</td><td>45306209</td><td>289810</td><td>163840</td><td>453650</td><td>0.000000</td><td>['fold5']</td><td>train</td><td>ENSG00000162415</td><td>0.961772</td><td>0.795131</td><td>45206605.0</td></tr>\n",
"<tr><th>C21orf58</th><td>chr21</td><td>45963427</td><td>46487715</td><td>-</td><td>protein_coding</td><td>0.000791</td><td>1.650467</td><td>7354</td><td>46300181</td><td>46323875</td><td>23694</td><td>163840</td><td>187534</td><td>0.000000</td><td>['fold6']</td><td>train</td><td>ENSG00000160298</td><td>0.645268</td><td>0.412368</td><td>46323870.0</td></tr>\n",
"<tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n",
"<tr><th>NPDC1</th><td>chr9</td><td>136685731</td><td>137210019</td><td>-</td><td>protein_coding</td><td>0.000000</td><td>2.625285</td><td>7852</td><td>137039463</td><td>137046179</td><td>6716</td><td>163840</td><td>170556</td><td>0.000000</td><td>['fold3']</td><td>test</td><td>ENSG00000107281</td><td>0.316322</td><td>0.178204</td><td>137046177.0</td></tr>\n",
"<tr><th>ZNF425</th><td>chr7</td><td>148765876</td><td>149290164</td><td>-</td><td>protein_coding</td><td>0.001048</td><td>1.292957</td><td>6511</td><td>149102784</td><td>149126324</td><td>23540</td><td>163840</td><td>187380</td><td>0.000000</td><td>['fold7']</td><td>train</td><td>ENSG00000204947</td><td>0.821292</td><td>0.737081</td><td>149126324.0</td></tr>\n",
"<tr><th>COL5A1</th><td>chr9</td><td>134477934</td><td>135002222</td><td>+</td><td>protein_coding</td><td>0.002159</td><td>1.492664</td><td>6209</td><td>134641774</td><td>134844843</td><td>203069</td><td>163840</td><td>366909</td><td>0.000000</td><td>['fold3']</td><td>test</td><td>ENSG00000130635</td><td>0.766624</td><td>0.456999</td><td>134641803.0</td></tr>\n",
"<tr><th>BRD3</th><td>chr9</td><td>133708087</td><td>134232375</td><td>-</td><td>protein_coding</td><td>0.000000</td><td>3.190450</td><td>8675</td><td>134030305</td><td>134068535</td><td>38230</td><td>163840</td><td>202070</td><td>0.004662</td><td>['fold3']</td><td>test</td><td>ENSG00000169925</td><td>0.344062</td><td>0.280283</td><td>134068026.0</td></tr>\n",
"<tr><th>EVI5L</th><td>chr19</td><td>7666393</td><td>8190681</td><td>+</td><td>protein_coding</td><td>0.000000</td><td>1.959605</td><td>7570</td><td>7830233</td><td>7864976</td><td>34743</td><td>163840</td><td>198583</td><td>0.000000</td><td>['fold3']</td><td>test</td><td>ENSG00000142459</td><td>0.810152</td><td>0.704828</td><td>7830218.0</td></tr>\n",
"</tbody>\n",
"</table>\n",
"<p>18457 rows × 20 columns</p>
\n", "
" ], "text/plain": [ " chrom start end strand gene_type frac_nan \\\n", "STRADA chr17 63381538 63905826 - protein_coding 0.000000 \n", "ETV4 chr17 43219172 43743460 - protein_coding 0.030873 \n", "USP25 chr21 15566185 16090473 + protein_coding 0.000000 \n", "ZSWIM5 chr1 44945761 45470049 - protein_coding 0.000620 \n", "C21orf58 chr21 45963427 46487715 - protein_coding 0.000791 \n", "... ... ... ... ... ... ... \n", "NPDC1 chr9 136685731 137210019 - protein_coding 0.000000 \n", "ZNF425 chr7 148765876 149290164 - protein_coding 0.001048 \n", "COL5A1 chr9 134477934 135002222 + protein_coding 0.002159 \n", "BRD3 chr9 133708087 134232375 - protein_coding 0.000000 \n", "EVI5L chr19 7666393 8190681 + protein_coding 0.000000 \n", "\n", " mean_counts n_tracks gene_start gene_end gene_length \\\n", "STRADA 2.208074 7616 63682336 63741986 59650 \n", "ETV4 0.925863 5004 43527844 43579620 51776 \n", "USP25 3.650355 8604 15730025 15880069 150044 \n", "ZSWIM5 2.190115 6123 45016399 45306209 289810 \n", "C21orf58 1.650467 7354 46300181 46323875 23694 \n", "... ... ... ... ... ... \n", "NPDC1 2.625285 7852 137039463 137046179 6716 \n", "ZNF425 1.292957 6511 149102784 149126324 23540 \n", "COL5A1 1.492664 6209 134641774 134844843 203069 \n", "BRD3 3.190450 8675 134030305 134068535 38230 \n", "EVI5L 1.959605 7570 7830233 7864976 34743 \n", "\n", " gene_mask_start gene_mask_end frac_N fold dataset \\\n", "STRADA 163840 223490 0.000000 ['fold1'] train \n", "ETV4 163840 215616 0.000000 ['fold1'] train \n", "USP25 163840 313884 0.000000 ['fold6'] train \n", "ZSWIM5 163840 453650 0.000000 ['fold5'] train \n", "C21orf58 163840 187534 0.000000 ['fold6'] train \n", "... ... ... ... ... ... 
\n", "NPDC1 163840 170556 0.000000 ['fold3'] test \n", "ZNF425 163840 187380 0.000000 ['fold7'] train \n", "COL5A1 163840 366909 0.000000 ['fold3'] test \n", "BRD3 163840 202070 0.004662 ['fold3'] test \n", "EVI5L 163840 198583 0.000000 ['fold3'] test \n", "\n", " gene_id pearson size_factor_pearson \\\n", "STRADA ENSG00000266173 0.469923 0.476627 \n", "ETV4 ENSG00000175832 0.738092 0.613281 \n", "USP25 ENSG00000155313 0.905222 0.784446 \n", "ZSWIM5 ENSG00000162415 0.961772 0.795131 \n", "C21orf58 ENSG00000160298 0.645268 0.412368 \n", "... ... ... ... \n", "NPDC1 ENSG00000107281 0.316322 0.178204 \n", "ZNF425 ENSG00000204947 0.821292 0.737081 \n", "COL5A1 ENSG00000130635 0.766624 0.456999 \n", "BRD3 ENSG00000169925 0.344062 0.280283 \n", "EVI5L ENSG00000142459 0.810152 0.704828 \n", "\n", " ensembl_canonical_tss \n", "STRADA 63741799.0 \n", "ETV4 43546340.0 \n", "USP25 15729982.0 \n", "ZSWIM5 45206605.0 \n", "C21orf58 46323870.0 \n", "... ... \n", "NPDC1 137046177.0 \n", "ZNF425 149126324.0 \n", "COL5A1 134641803.0 \n", "BRD3 134068026.0 \n", "EVI5L 7830218.0 \n", "\n", "[18457 rows x 20 columns]" ] }, "execution_count": 38, "metadata": {}, "output_type": "execute_result" } ], "source": [ "result.gene_metadata" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can also access the genes and cells:" ] }, { "cell_type": "code", "execution_count": 39, "metadata": { "execution": { "iopub.execute_input": "2025-11-21T06:32:37.909461Z", "iopub.status.busy": "2025-11-21T06:32:37.909331Z", "iopub.status.idle": "2025-11-21T06:32:37.912200Z", "shell.execute_reply": "2025-11-21T06:32:37.911806Z" } }, "outputs": [ { "data": { "text/plain": [ "Index(['STRADA', 'ETV4', 'USP25', 'ZSWIM5', 'C21orf58', 'MIR497HG', 'CFAP74',\n", " 'GSE1', 'LPP', 'CLK1',\n", " ...\n", " 'STRIP2', 'TNFRSF1A', 'RBM14-RBM4', 'C1orf21', 'LINC00639', 'NPDC1',\n", " 'ZNF425', 'COL5A1', 'BRD3', 'EVI5L'],\n", " dtype='object', length=18457)" ] }, "execution_count": 39, "metadata": {}, 
"output_type": "execute_result" } ], "source": [ "result.genes" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The cell index can be accessed the same way:" ] }, { "cell_type": "code", "execution_count": 40, "metadata": { "execution": { "iopub.execute_input": "2025-11-21T06:32:37.913527Z", "iopub.status.busy": "2025-11-21T06:32:37.913396Z", "iopub.status.idle": "2025-11-21T06:32:37.916112Z", "shell.execute_reply": "2025-11-21T06:32:37.915683Z" } }, "outputs": [ { "data": { "text/plain": [ "Index(['agg_0', 'agg_1', 'agg_2', 'agg_3', 'agg_4', 'agg_5', 'agg_6', 'agg_7',\n", " 'agg_8', 'agg_9',\n", " ...\n", " 'agg_9528', 'agg_9529', 'agg_9530', 'agg_9531', 'agg_9532', 'agg_9533',\n", " 'agg_9535', 'agg_9536', 'agg_9537', 'agg_9538'],\n", " dtype='object', length=8856)" ] }, "execution_count": 40, "metadata": {}, "output_type": "execute_result" } ], "source": [ "result.cells" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The predicted expression of a specific gene can be accessed with `predicted_expression_matrix`:" ] }, { "cell_type": "code", "execution_count": 41, "metadata": { "execution": { "iopub.execute_input": "2025-11-21T06:32:37.917242Z", "iopub.status.busy": "2025-11-21T06:32:37.917117Z", "iopub.status.idle": "2025-11-21T06:32:37.931953Z", "shell.execute_reply": "2025-11-21T06:32:37.931543Z" } }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " SPI1\n", "agg_0 0.256442\n", "agg_1 0.221014\n", "agg_2 0.179371\n", "agg_3 0.219646\n", "agg_4 0.217516\n", "... ...\n", "agg_9533 0.493780\n", "agg_9535 0.292091\n", "agg_9536 0.370765\n", "agg_9537 0.168036\n", "agg_9538 0.239733\n", "\n", "[8856 rows x 1 columns]" ] }, "execution_count": 41, "metadata": {}, "output_type": "execute_result" } ], "source": [ "result.predicted_expression_matrix(genes=[\"SPI1\"])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Or for all the genes:" ] }, { "cell_type": "code", "execution_count": 42, "metadata": { "execution": { "iopub.execute_input": "2025-11-21T06:32:37.933257Z", "iopub.status.busy": "2025-11-21T06:32:37.933128Z", "iopub.status.idle": "2025-11-21T06:32:37.946479Z", "shell.execute_reply": "2025-11-21T06:32:37.946096Z" } }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " STRADA ETV4 USP25 ZSWIM5 C21orf58 MIR497HG \\\n", "agg_0 2.973438 1.845565 4.592531 5.099802 1.774879 0.356812 \n", "agg_1 2.954213 1.896726 4.688557 5.510440 1.666929 0.352725 \n", "agg_2 2.938851 2.197247 4.861410 5.617520 1.773381 0.380867 \n", "agg_3 3.045972 2.138573 4.863791 5.273604 1.760097 0.463555 \n", "agg_4 3.025518 2.019096 4.602948 5.257001 1.755338 0.382190 \n", "... ... ... ... ... ... ... \n", "agg_9533 2.333562 0.633322 4.675825 2.793023 0.752030 0.692083 \n", "agg_9535 0.835037 0.358773 1.964896 0.307449 0.337240 0.834196 \n", "agg_9536 3.008039 1.209324 4.798392 3.931870 1.401328 1.638555 \n", "agg_9537 1.241936 0.455059 2.919995 0.571672 0.486448 1.175586 \n", "agg_9538 1.715507 0.700955 3.044732 0.858696 0.903406 1.763168 \n", "\n", " CFAP74 GSE1 LPP CLK1 ... STRIP2 TNFRSF1A \\\n", "agg_0 2.590836 4.629774 4.897171 3.326940 ... 2.836060 0.297015 \n", "agg_1 2.292625 4.459535 4.915286 3.192858 ... 3.125704 0.242543 \n", "agg_2 2.394917 4.415038 4.836399 3.390717 ... 3.082098 0.263285 \n", "agg_3 2.391702 3.940975 4.857763 3.410926 ... 2.882890 0.290327 \n", "agg_4 2.432810 4.392480 4.959488 3.250500 ... 3.082296 0.258540 \n", "... ... ... ... ... ... ... ... \n", "agg_9533 0.503531 4.327948 6.903193 3.695593 ... 0.549795 2.270181 \n", "agg_9535 0.093885 1.853794 3.700790 4.467302 ... 0.176885 1.370898 \n", "agg_9536 0.969720 4.779201 6.631931 4.127797 ... 1.174298 1.870530 \n", "agg_9537 0.145397 2.412148 4.759118 4.913945 ... 0.371035 1.361073 \n", "agg_9538 0.215304 2.604478 4.549708 4.839124 ... 0.594310 1.801298 \n", "\n", " RBM14-RBM4 C1orf21 LINC00639 NPDC1 ZNF425 COL5A1 \\\n", "agg_0 1.883849 4.293593 1.463565 3.183534 2.340202 2.374942 \n", "agg_1 1.908177 4.439424 1.236739 3.494824 2.425672 2.054568 \n", "agg_2 2.006456 4.383455 1.208590 4.013819 2.408381 2.297343 \n", "agg_3 1.922963 4.550189 1.430520 3.693118 2.297103 2.121887 \n", "agg_4 2.038277 4.464807 1.249043 3.665800 2.400820 2.255862 \n", "... ... ... 
... ... ... ... \n", "agg_9533 1.563218 4.395422 0.550088 1.330252 1.044471 3.759369 \n", "agg_9535 1.022708 3.400267 0.052162 1.908870 0.253417 1.448111 \n", "agg_9536 2.506874 5.151776 0.967644 1.809947 2.205356 4.244005 \n", "agg_9537 1.668085 4.005738 0.078611 1.571750 0.508187 2.067150 \n", "agg_9538 2.075996 3.933860 0.165590 1.970268 0.993521 2.232347 \n", "\n", " BRD3 EVI5L \n", "agg_0 2.911916 3.230072 \n", "agg_1 2.713408 3.491463 \n", "agg_2 2.892222 3.695785 \n", "agg_3 2.626117 3.223912 \n", "agg_4 2.925619 3.471005 \n", "... ... ... \n", "agg_9533 2.491346 1.872717 \n", "agg_9535 1.622033 1.064292 \n", "agg_9536 2.974467 2.659873 \n", "agg_9537 2.323764 1.429850 \n", "agg_9538 2.473388 1.902884 \n", "\n", "[8856 rows x 18457 columns]" ] }, "execution_count": 42, "metadata": {}, "output_type": "execute_result" } ], "source": [ "result.predicted_expression_matrix()" ] }, { "cell_type": "code", "execution_count": 43, "metadata": { "execution": { "iopub.execute_input": "2025-11-21T06:32:37.947703Z", "iopub.status.busy": "2025-11-21T06:32:37.947568Z", "iopub.status.idle": "2025-11-21T06:33:00.525878Z", "shell.execute_reply": "2025-11-21T06:33:00.525331Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "\u001b[34m\u001b[1mwandb\u001b[0m: \u001b[33mWARNING\u001b[0m A graphql request initiated by the public wandb API timed out (timeout=19 sec). Create a new API with an integer timeout larger than 19, e.g., `api = wandb.Api(timeout=29)` to increase the graphql timeout.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "\u001b[34m\u001b[1mwandb\u001b[0m: Downloading large artifact 'rep0:latest', 720.03MB. 1 files...\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "\u001b[34m\u001b[1mwandb\u001b[0m: 1 of 1 files downloaded. \n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "Done. 
00:00:00.7 (1008.5MB/s)\n" ] }, { "data": { "text/plain": [ "DecimaResult(anndata=AnnData object with n_obs × n_vars = 8856 × 18457\n", " obs: 'cell_type', 'tissue', 'organ', 'disease', 'study', 'dataset', 'region', 'subregion', 'celltype_coarse', 'n_cells', 'total_counts', 'n_genes', 'size_factor', 'train_pearson', 'val_pearson', 'test_pearson'\n", " var: 'chrom', 'start', 'end', 'strand', 'gene_type', 'frac_nan', 'mean_counts', 'n_tracks', 'gene_start', 'gene_end', 'gene_length', 'gene_mask_start', 'gene_mask_end', 'frac_N', 'fold', 'dataset', 'gene_id', 'pearson', 'size_factor_pearson', 'ensembl_canonical_tss'\n", " layers: 'preds', 'v1_rep0', 'v1_rep1', 'v1_rep2', 'v1_rep3')" ] }, "execution_count": 43, "metadata": {}, "output_type": "execute_result" } ], "source": [ "result.load_model(device=device)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next, compute attributions for the SPI1 gene.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Calling attributions takes about 10 seconds on a GPU and about 5 minutes on a CPU. 
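
The attribution calls in this notebook contrast `tasks` with `off_tasks`, and a later cell names the method `inputxgradient` with a `specificity` transform. As a rough intuition only, the sketch below applies input x gradient to a toy linear model in plain Python. It assumes the specificity score is the mean prediction over the on-tasks minus the mean over the off-tasks; that definition, the toy weights, and the helper names are illustrative assumptions, not Decima's implementation.

```python
# Toy sketch (not Decima's implementation): input x gradient on a linear model.
BASES = 'ACGT'


def one_hot(seq):
    # Rows are A, C, G, T; columns are sequence positions.
    return [[1.0 if base == b else 0.0 for base in seq] for b in BASES]


def specificity_gradient(weights, on_tasks, off_tasks):
    # For a linear model the gradient of the (assumed) specificity score is
    # mean(on-task weights) minus mean(off-task weights).
    n_rows = len(BASES)
    n_cols = len(weights[on_tasks[0]][0])
    grad = [[0.0] * n_cols for _ in range(n_rows)]
    for task_ids, sign in ((on_tasks, 1.0), (off_tasks, -1.0)):
        for t in task_ids:
            for i in range(n_rows):
                for j in range(n_cols):
                    grad[i][j] += sign * weights[t][i][j] / len(task_ids)
    return grad


def input_x_gradient(x, grad):
    # Element-wise product of the one-hot input and the gradient.
    return [[xv * gv for xv, gv in zip(xr, gr)] for xr, gr in zip(x, grad)]


# Hypothetical weights: 3 tasks, each a 4 x 3 matrix (bases x positions).
weights = [
    [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0], [0.0, 0.0, 0.0]],
    [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 0.0, 0.0]],
    [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.0, 0.0, 0.0]],
]
x = one_hot('ACG')
grad = specificity_gradient(weights, on_tasks=[0], off_tasks=[1, 2])
toy_attrs = input_x_gradient(x, grad)
```

Because the input is one-hot, only the observed base at each position can receive a non-zero attribution, which is the same pattern visible in the attribution tensor printed at the end of this notebook.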
" ] }, { "cell_type": "code", "execution_count": 44, "metadata": { "execution": { "iopub.execute_input": "2025-11-21T06:33:00.527547Z", "iopub.status.busy": "2025-11-21T06:33:00.527394Z", "iopub.status.idle": "2025-11-21T06:33:01.199621Z", "shell.execute_reply": "2025-11-21T06:33:01.198754Z" } }, "outputs": [], "source": [ "attrs = result.attributions(\n", " gene=\"SPI1\",\n", " tasks=result.query_cells(f\"cell_type in {spi1_cell_types}\"),\n", " off_tasks=result.query_cells(f'organ == \"blood\" and cell_type not in {spi1_cell_types}'),\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The returned attributions object can be used to inspect, export, and visualize the attribution peaks:" ] }, { "cell_type": "code", "execution_count": 45, "metadata": { "execution": { "iopub.execute_input": "2025-11-21T06:33:01.200991Z", "iopub.status.busy": "2025-11-21T06:33:01.200846Z", "iopub.status.idle": "2025-11-21T06:33:04.240855Z", "shell.execute_reply": "2025-11-21T06:33:04.240273Z" } }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " peak start end attribution p-value from_tss\n", "0 pos.SPI1@37 163877 163902 12.817252 2.186883e-11 37\n", "1 pos.SPI1@-121 163719 163744 5.595659 1.899081e-05 -121\n", "2 pos.SPI1@-57 163783 163803 9.307484 3.054640e-05 -57\n", "3 pos.SPI1@62 163902 163909 1.281183 3.068997e-05 62\n", "4 pos.SPI1@-79 163761 163765 0.833269 6.109865e-05 -79\n", ".. ... ... ... ... ... ...\n", "72 neg.SPI1@443 164283 164293 -0.717349 4.916059e-04 443\n", "73 neg.SPI1@23600 187440 187445 -0.267438 4.916059e-04 23600\n", "74 neg.SPI1@32783 196623 196630 -0.461813 4.918151e-04 32783\n", "75 neg.SPI1@1735 165575 165592 -1.437498 4.918151e-04 1735\n", "76 neg.SPI1@31668 195508 195512 -0.213403 4.918151e-04 31668\n", "\n", "[135 rows x 6 columns]" ] }, "execution_count": 45, "metadata": {}, "output_type": "execute_result" } ], "source": [ "attrs.peaks" ] }, { "cell_type": "code", "execution_count": 46, "metadata": { "execution": { "iopub.execute_input": "2025-11-21T06:33:04.242160Z", "iopub.status.busy": "2025-11-21T06:33:04.242016Z", "iopub.status.idle": "2025-11-21T06:33:04.251058Z", "shell.execute_reply": "2025-11-21T06:33:04.250687Z" } }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " chrom start end name score strand attribution\n", "38 chr11 47216350 47216357 pos.SPI1@162219 3.33494 . 0.543797\n", "49 chr11 47257597 47257605 pos.SPI1@120971 3.31931 . 0.680714\n", "65 chr11 47257633 47257637 neg.SPI1@120939 3.31455 . -0.221530\n", "63 chr11 47257734 47257739 neg.SPI1@120837 3.32086 . -0.273840\n", "43 chr11 47345731 47345736 neg.SPI1@32840 3.35317 . -0.298483\n", ".. ... ... ... ... ... ... ...\n", "69 chr11 47395483 47395492 neg.SPI1@-16916 3.31094 . -0.567760\n", "39 chr11 47400211 47400221 neg.SPI1@-21645 3.35527 . -0.900000\n", "37 chr11 47400225 47400235 neg.SPI1@-21659 3.35844 . -0.729126\n", "68 chr11 47400376 47400382 neg.SPI1@-21806 3.31094 . -0.329538\n", "58 chr11 47400703 47400709 neg.SPI1@-22133 3.33067 . -0.325769\n", "\n", "[135 rows x 7 columns]" ] }, "execution_count": 46, "metadata": {}, "output_type": "execute_result" } ], "source": [ "attrs.peaks_to_bed()" ] }, { "cell_type": "code", "execution_count": 47, "metadata": { "execution": { "iopub.execute_input": "2025-11-21T06:33:04.252450Z", "iopub.status.busy": "2025-11-21T06:33:04.252317Z", "iopub.status.idle": "2025-11-21T06:33:05.951332Z", "shell.execute_reply": "2025-11-21T06:33:05.950859Z" } }, "outputs": [ { "data": { "image/png": "" }, "metadata": { "image/png": { "height": 200, "width": 1000 } }, "output_type": "display_data" } ], "source": [ "attrs.plot_peaks()" ] }, { "cell_type": "code", "execution_count": 48, "metadata": { "execution": { "iopub.execute_input": "2025-11-21T06:33:05.952850Z", "iopub.status.busy": "2025-11-21T06:33:05.952701Z", "iopub.status.idle": "2025-11-21T06:33:06.731536Z", "shell.execute_reply": "2025-11-21T06:33:06.731101Z" } }, "outputs": [ { "data": { "image/png": "", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "import matplotlib.pyplot as plt\n", "\n", "attrs.plot_seqlogo(relative_loc=-45)\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This command takes about 1 minute and detects motifs in the attribution peaks using FIMO. In the output below, matches appear in order of increasing p-value, and each match carries site-level and motif-level attribution scores:" ] }, { "cell_type": "code", "execution_count": 49, "metadata": { "execution": { "iopub.execute_input": "2025-11-21T06:33:06.732828Z", "iopub.status.busy": "2025-11-21T06:33:06.732690Z", "iopub.status.idle": "2025-11-21T06:33:10.008101Z", "shell.execute_reply": "2025-11-21T06:33:10.007545Z" } }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " motif peak start end strand score \\\n", "3874 ZNF746.H13CORE.0.PSG.A neg.SPI1@1917 165744 165770 - 26.355452 \n", "3453 ZN263.H13CORE.1.P.B neg.SPI1@1898 165732 165753 + 24.008722 \n", "781 ZN479.H13CORE.0.P.C neg.SPI1@-174 163668 163686 - 22.937369 \n", "1036 ZNF746.H13CORE.0.PSG.A neg.SPI1@-191 163639 163665 + 24.462995 \n", "3545 ZNF746.H13CORE.0.PSG.A neg.SPI1@1898 165732 165758 - 23.523391 \n", "... ... ... ... ... ... ... \n", "1088 CREB3.H13CORE.0.SM.B neg.SPI1@-21 163819 163833 + 1.754682 \n", "2067 RXRB.H13CORE.2.PS.A neg.SPI1@1182 165019 165030 - 12.213856 \n", "2913 KLF7.H13CORE.0.P.B neg.SPI1@1813 165662 165672 + 15.217368 \n", "2986 KLF7.H13CORE.0.P.B neg.SPI1@1832 165662 165672 + 15.217368 \n", "7451 KLF7.H13CORE.0.P.B pos.SPI1@1820 165662 165672 + 15.217368 \n", "\n", " p-value matched_seq site_attr_score \\\n", "3874 7.435330e-10 AGGGAGGAGGGAGGAAGGTGGGAGGA -0.010775 \n", "3453 1.311946e-09 GGGGAGGAGGACAGGGAGGAG -0.006567 \n", "781 2.837623e-09 GCCCCCAAAGTCATCCCT -0.007155 \n", "1036 3.833248e-09 TCTCCCTCCCATCCTCCCTCCCCAGC -0.002449 \n", "3545 7.853286e-09 GGGGAGGAGGACAGGGAGGAGGGAGG -0.005327 \n", "... ... ... ... \n", "1088 4.999340e-04 GCGGTGATGTCACC -0.206348 \n", "2067 NaN CCATGACCTCT -0.008323 \n", "2913 NaN GGGGGCGGGG 0.008973 \n", "2986 NaN GGGGGCGGGG 0.008973 \n", "7451 NaN GGGGGCGGGG 0.008973 \n", "\n", " motif_attr_score from_tss \n", "3874 -0.016253 1904 \n", "3453 -0.016637 1892 \n", "781 -0.013835 -172 \n", "1036 -0.001297 -201 \n", "3545 -0.010747 1892 \n", "... ... ... 
\n", "1088 -0.585193 -21 \n", "2067 -0.024233 1179 \n", "2913 0.025625 1822 \n", "2986 0.025625 1822 \n", "7451 0.025625 1822 \n", "\n", "[8556 rows x 11 columns]" ] }, "execution_count": 49, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_motifs = attrs.scan_motifs()\n", "df_motifs" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you only need the attribution tensor for a one-hot encoded input sequence, prepare the input yourself and call a `DecimaAttributer` directly:" ] }, { "cell_type": "code", "execution_count": 50, "metadata": { "execution": { "iopub.execute_input": "2025-11-21T06:33:10.009453Z", "iopub.status.busy": "2025-11-21T06:33:10.009308Z", "iopub.status.idle": "2025-11-21T06:33:10.050722Z", "shell.execute_reply": "2025-11-21T06:33:10.050157Z" } }, "outputs": [ { "data": { "text/plain": [ "torch.Size([1, 5, 524288])" ] }, "execution_count": 50, "metadata": {}, "output_type": "execute_result" } ], "source": [ "one_hot_seq, gene_mask = result.prepare_one_hot(\"SPI1\")\n", "inputs = torch.vstack([one_hot_seq, gene_mask]).unsqueeze(0)\n", "inputs.shape # (batch_size, 5, seq_len)" ] }, { "cell_type": "code", "execution_count": 51, "metadata": { "execution": { "iopub.execute_input": "2025-11-21T06:33:10.052042Z", "iopub.status.busy": "2025-11-21T06:33:10.051892Z", "iopub.status.idle": "2025-11-21T06:33:11.111093Z", "shell.execute_reply": "2025-11-21T06:33:11.110606Z" } }, "outputs": [ { "data": { "text/plain": [ "tensor([[[-0.0000e+00, 0.0000e+00, -0.0000e+00, ..., -0.0000e+00,\n", " 0.0000e+00, 0.0000e+00],\n", " [-0.0000e+00, -0.0000e+00, -2.6888e-05, ..., 0.0000e+00,\n", " 0.0000e+00, 0.0000e+00],\n", " [-0.0000e+00, 1.2651e-04, 0.0000e+00, ..., 3.7016e-05,\n", " -0.0000e+00, 1.5136e-05],\n", " [ 1.7333e-04, -0.0000e+00, 0.0000e+00, ..., -0.0000e+00,\n", " 1.2473e-05, -0.0000e+00]]], device='cuda:0')" ] }, "execution_count": 51, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from decima.interpret.attributer import 
DecimaAttributer\n", "\n", "attributer = DecimaAttributer(\n", " model=result.model,\n", " tasks=result.query_cells(f\"cell_type in {spi1_cell_types}\"),\n", " off_tasks=result.query_cells(f'organ == \"blood\" and cell_type not in {spi1_cell_types}'),\n", " transform=\"specificity\",\n", " method=\"inputxgradient\",\n", ")\n", "attrs = attributer.attribute(inputs=inputs)\n", "\n", "attrs # (batch_size, 4, seq_len) gene mask is removed from final attributions" ] } ], "metadata": { "kernelspec": { "display_name": "decima", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.14" } }, "nbformat": 4, "nbformat_minor": 2 }