decima package
Subpackages
- decima.cli package
- decima.core package
- Submodules
- decima.core.metadata module
CellMetadata
CellMetadata.name
CellMetadata.cell_type
CellMetadata.tissue
CellMetadata.organ
CellMetadata.disease
CellMetadata.study
CellMetadata.dataset
CellMetadata.region
CellMetadata.subregion
CellMetadata.celltype_coarse
CellMetadata.n_cells
CellMetadata.total_counts
CellMetadata.n_genes
CellMetadata.size_factor
CellMetadata.train_pearson
CellMetadata.val_pearson
CellMetadata.test_pearson
CellMetadata.__annotations__
CellMetadata.__dataclass_fields__
CellMetadata.__dataclass_params__
CellMetadata.__eq__()
CellMetadata.__hash__
CellMetadata.__init__()
CellMetadata.__match_args__
CellMetadata.__repr__()
CellMetadata.from_series()
GeneMetadata
GeneMetadata.name
GeneMetadata.chrom
GeneMetadata.start
GeneMetadata.end
GeneMetadata.strand
GeneMetadata.gene_type
GeneMetadata.frac_nan
GeneMetadata.mean_counts
GeneMetadata.n_tracks
GeneMetadata.gene_start
GeneMetadata.gene_end
GeneMetadata.gene_length
GeneMetadata.gene_mask_start
GeneMetadata.gene_mask_end
GeneMetadata.frac_N
GeneMetadata.fold
GeneMetadata.dataset
GeneMetadata.gene_id
GeneMetadata.pearson
GeneMetadata.size_factor_pearson
GeneMetadata.__annotations__
GeneMetadata.__dataclass_fields__
GeneMetadata.__dataclass_params__
GeneMetadata.__eq__()
GeneMetadata.__hash__
GeneMetadata.__init__()
GeneMetadata.__match_args__
GeneMetadata.__repr__()
GeneMetadata.from_series()
- decima.core.result module
DecimaResult
DecimaResult.__annotations__
DecimaResult.__init__()
DecimaResult.__repr__()
DecimaResult.attributions()
DecimaResult.cell_metadata
DecimaResult.cells
DecimaResult.gene_metadata
DecimaResult.gene_sequence()
DecimaResult.genes
DecimaResult.get_cell_metadata()
DecimaResult.get_gene_metadata()
DecimaResult.load()
DecimaResult.load_model()
DecimaResult.model
DecimaResult.predicted_expression_matrix()
DecimaResult.prepare_one_hot()
DecimaResult.query_cells()
DecimaResult.query_tasks()
DecimaResult.shape
- Module contents
- decima.data package
- decima.dataloaders package
- decima.hub package
- decima.interpret package
- Submodules
- decima.interpret.attributions module
Attribution
Attribution.__init__()
Attribution.__repr__()
Attribution.chrom
Attribution.end
Attribution.fasta_str()
Attribution.find_peaks()
Attribution.from_seq()
Attribution.gene_end
Attribution.gene_start
Attribution.peaks_to_bed()
Attribution.plot_peaks()
Attribution.plot_seqlogo()
Attribution.save_bigwig()
Attribution.save_fasta()
Attribution.save_peaks()
Attribution.scan_motifs()
Attribution.start
Attribution.strand
attributions()
get_attribution_method()
- decima.interpret.ism module
- decima.interpret.save_attributions module
- Module contents
- decima.model package
- Submodules
- decima.model.decima_model module
- decima.model.lightning module
LightningModel
LightningModel.__annotations__
LightningModel.__init__()
LightningModel.add_transform()
LightningModel.configure_optimizers()
LightningModel.count_params()
LightningModel.format_input()
LightningModel.forward()
LightningModel.get_task_idxs()
LightningModel.make_predict_loader()
LightningModel.make_test_loader()
LightningModel.make_train_loader()
LightningModel.on_save_checkpoint()
LightningModel.on_test_epoch_end()
LightningModel.on_validation_epoch_end()
LightningModel.parse_logger()
LightningModel.predict_on_dataset()
LightningModel.predict_step()
LightningModel.reset_transform()
LightningModel.test_step()
LightningModel.train_on_dataset()
LightningModel.training_step()
LightningModel.validation_step()
- decima.model.loss module
- decima.model.metrics module
- Module contents
- decima.plot package
- decima.tools package
- decima.train package
- decima.utils package
- decima.vep package
Submodules
decima.constants module
Module contents
- class decima.DecimaResult(anndata)
Bases: object
Container for Decima results and model predictions.
This class provides a unified interface for loading pre-trained Decima models and associated metadata, making predictions, and performing attribution analyses.
- The DecimaResult object contains:
An AnnData object with gene expression and metadata
A trained model for making predictions
Methods for attribution analysis and interpretation
- Parameters:
anndata – AnnData object containing gene expression data and metadata
Examples
>>> # Load default pre-trained model and metadata
>>> result = DecimaResult.load()
>>> result.load_model(model=0)
>>> # Perform attribution analysis
>>> attributions = result.attributions(
...     gene="SPI1",
...     tasks='cell_type == "classical monocyte"',
... )
- Properties:
model: Decima model
genes: List of gene names
cells: List of cell names
cell_metadata: Cell metadata
gene_metadata: Gene metadata
shape: Shape of the expression matrix
attributions: Attributions for a gene
- attributions(gene, tasks=None, off_tasks=None, transform='specificity', method='inputxgradient', threshold=0.0005, min_seqlet_len=4, max_seqlet_len=25, additional_flanks=0)
Get attributions for a specific gene.
- Parameters:
gene (str) – Gene name
tasks (Optional[List[str]]) – List of cells to use as on task
off_tasks (Optional[List[str]]) – List of cells to use as off task
transform (str) – Attribution transform method
method (str) – Attribution method
n_peaks – Number of peaks to find
min_dist – Minimum distance between peaks
- Returns:
Container with inputs, predictions, attribution scores and TSS position
- Return type:
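The default method='inputxgradient' names a standard attribution technique: each input value is multiplied by the gradient of the model output with respect to that value. The sketch below illustrates the idea on a toy linear scorer over a one-hot DNA sequence, where the gradient is simply the weight matrix. The helper names (one_hot, linear_score, input_x_gradient) and the weights are hypothetical; this is not Decima's model or its attribution code.

```python
# Toy illustration of input-x-gradient attribution; not Decima's model or code.
BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as a list of 4-element one-hot rows (A, C, G, T)."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq]


def linear_score(x, weights):
    """Toy 'model': sum of weight * input over every position and base."""
    return sum(
        w * v
        for w_row, x_row in zip(weights, x)
        for w, v in zip(w_row, x_row)
    )


def input_x_gradient(x, weights):
    """For a linear model the gradient w.r.t. the input equals the weights,
    so the attribution is the element-wise product input * weight."""
    return [
        [v * w for v, w in zip(x_row, w_row)]
        for x_row, w_row in zip(x, weights)
    ]


seq = "ACGT"
# Made-up weights: one row per position, one column per base (A, C, G, T).
weights = [
    [0.5, -1.0, 0.0, 0.0],
    [0.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, -0.25, 1.0],
    [1.0, 0.0, 0.0, 0.75],
]

x = one_hot(seq)
attr = input_x_gradient(x, weights)

# With one-hot input only the observed base contributes at each position.
per_position = [sum(row) for row in attr]
print(per_position)
```

For a linear scorer the per-position attributions add up to the model score, which makes the toy case easy to check by hand; a deep model generally does not satisfy this exactly.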
- classmethod load(anndata_path=None)
Load a DecimaResult object from an AnnData object or a path to an anndata file.
- Parameters:
anndata_path (Union[str, AnnData, None]) – Path to anndata file or anndata object
- Returns:
DecimaResult object
Examples
>>> result = DecimaResult.load()  # Load default decima metadata
>>> result = DecimaResult.load(
...     "path/to/anndata.h5ad"
... )  # Load custom anndata object from file
- load_model(model=0, device='cpu')
Load the trained model from a checkpoint path.
- Parameters:
- Returns:
self
Examples
>>> result = DecimaResult.load()
>>> result.load_model()  # Load default model (rep0)
>>> result.load_model(model="path/to/checkpoint.ckpt")
>>> result.load_model(model=2)
- property model
Decima model.
- predicted_expression_matrix(genes=None)
Get predicted expression matrix for all or specific genes.
- decima.predict_save_attributions(output_dir, genes=None, seqs=None, tasks=None, off_tasks=None, model=0, metadata_anndata=None, method='inputxgradient', device=None, plot_peaks=True, plot_seqlogo=False, seqlogo_window=50, dpi=100)
Generate and save attribution analysis results for a gene. This function performs attribution analysis for a given gene and cell types, saving the following output files to the specified directory:
output_dir/
├── peaks.bed                  # List of attribution peaks in BED format
├── peaks.png                  # Plot showing peak locations
├── qc.log                     # QC warnings about prediction reliability
├── motifs.tsv                 # Detected motifs in peak regions
├── attributions.h5            # Raw attribution score matrix
├── attributions.bigwig        # Genome browser track of attribution scores
└── attributions_seq_logos/    # Directory containing attribution plots
    └── {peak}.png             # Attribution plot for each peak region
- Parameters:
output_dir (str) – Directory to save output files
genes – Gene symbols or IDs to analyze
tasks (Optional[List[str]]) – List of cell types to analyze attributions for
off_tasks (Optional[List[str]]) – Optional list of cell types to contrast against
model (Union[str, int, None]) – Optional model to use for attribution analysis
method (str) – Method to use for attribution analysis
device (Optional[str]) – Device to use for attribution analysis
dpi (int) – DPI for attribution plots.
- Raises:
FileExistsError – If output directory already exists.
Examples
>>> predict_save_attributions(
...     output_dir="output_dir",
...     genes=[
...         "SPI1",
...         "CD68",
...     ],
...     tasks="cell_type == 'classical monocyte'",
... )
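Because predict_save_attributions raises FileExistsError when the output directory already exists and writes a fixed set of files, a small wrapper can check both conditions around a run. The sketch below uses only pathlib; ensure_new_dir and missing_outputs are hypothetical helpers, not part of decima, and the file names are copied from the output layout listed above. The demonstration creates placeholder files in a temporary directory rather than calling decima.

```python
import tempfile
from pathlib import Path

# File names copied from the documented output layout. The per-peak images in
# attributions_seq_logos/ are left out because their names depend on the peaks
# found, and peaks.png may plausibly be absent when plot_peaks=False (assumption).
EXPECTED_FILES = (
    "peaks.bed",
    "peaks.png",
    "qc.log",
    "motifs.tsv",
    "attributions.h5",
    "attributions.bigwig",
)


def ensure_new_dir(output_dir):
    """Hypothetical pre-flight check: fail early if the directory exists."""
    path = Path(output_dir)
    if path.exists():
        raise FileExistsError(f"Output directory already exists: {path}")
    return path


def missing_outputs(output_dir):
    """Hypothetical post-run check: list expected files that are absent."""
    path = Path(output_dir)
    return [name for name in EXPECTED_FILES if not (path / name).is_file()]


# Demonstration with placeholder files instead of a real attribution run.
with tempfile.TemporaryDirectory() as tmp:
    run_dir = ensure_new_dir(Path(tmp) / "attrs_SPI1")
    run_dir.mkdir()
    (run_dir / "peaks.bed").touch()
    (run_dir / "qc.log").touch()
    still_missing = missing_outputs(run_dir)

    # A second pre-flight check on the same path now fails.
    try:
        ensure_new_dir(run_dir)
        already_exists = False
    except FileExistsError:
        already_exists = True

print(still_missing, already_exists)
```

Running the pre-flight check before a long attribution job avoids discovering the directory clash only after the model has been loaded.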
- decima.predict_variant_effect(df_variant, output_pq=None, tasks=None, model=0, metadata_anndata=None, chunksize=10000, batch_size=8, num_workers=16, device=None, include_cols=None, gene_col=None, distance_type='tss', min_distance=0, max_distance=inf, genome='hg38')
Predict variant effect and save to parquet.
- Parameters:
df_variant (pd.DataFrame) – DataFrame with variant information
output_pq (str, optional) – Path to save the parquet file. Defaults to None.
tasks (str, optional) – Tasks to predict. Defaults to None.
model (int, optional) – Model to use. Defaults to 0.
metadata_anndata (str, optional) – Path to anndata file. Defaults to None.
chunksize (int, optional) – Number of variants to predict in each chunk. Defaults to 10_000.
batch_size (int, optional) – Batch size. Defaults to 8.
num_workers (int, optional) – Number of workers. Defaults to 16.
device (str, optional) – Device to use. Defaults to None.
include_cols (list, optional) – Columns to include in the output. Defaults to None.
gene_col (str, optional) – Column name for gene names. Defaults to None.
distance_type (str, optional) – Type of distance. Defaults to “tss”.
min_distance (float, optional) – Minimum distance from the end of the gene. Defaults to 0 (inclusive).
max_distance (float, optional) – Maximum distance from the TSS. Defaults to inf (exclusive).
genome (str, optional) – Genome build. Defaults to “hg38”.
- Return type:
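The parameter descriptions above give min_distance as an inclusive lower bound and max_distance as an exclusive upper bound, and chunksize as the number of variants predicted per chunk. A minimal sketch of both ideas in plain Python follows. within_window and chunked are hypothetical helpers, the distances are made-up non-negative values, and none of this is taken from Decima's implementation.

```python
import math


def within_window(distance, min_distance=0, max_distance=math.inf):
    """Half-open check: min_distance is inclusive, max_distance is exclusive."""
    return min_distance <= distance < max_distance


def chunked(items, chunksize):
    """Yield consecutive slices holding at most `chunksize` items each."""
    for start in range(0, len(items), chunksize):
        yield items[start:start + chunksize]


# Hypothetical variant-to-TSS distances in base pairs (illustrative only).
distances = [0, 150, 999, 1000, 25000]

# 0 is kept (inclusive lower bound); 1000 is dropped (exclusive upper bound).
kept = [d for d in distances if within_window(d, min_distance=0, max_distance=1000)]

# Process the surviving variants two at a time.
chunks = list(chunked(kept, chunksize=2))
print(kept, chunks)
```

With the default max_distance of inf, the upper bound never excludes a finite distance, so only min_distance filters anything.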