grelu.design

grelu.design contains methods to design novel DNA sequences using trained sequence-to-function deep learning models.

Functions

`evolve`(→ pandas.DataFrame)	Sequence design by greedy directed evolution
`ledidi`(seq, model[, prediction_transform, max_iter, ...])	Sequence design with Ledidi

Module Contents

grelu.design.evolve(seqs: List[str] | pandas.DataFrame, model: grelu.lightning.LightningModel, method: str = 'ism', patterns: List[str] | None = None, prediction_transform: torch.nn.Module | None = None, seq_transform: torch.nn.Module | None = None, max_iter: int = 10, positions: List[int] = None, devices: str | int | List[int] = 'cpu', num_workers: int = 1, batch_size: int = 64, genome: str | None = None, for_each: bool = True, return_seqs: str = 'all', return_preds: bool = True, verbose: bool = True) → pandas.DataFrame

Sequence design by greedy directed evolution

Parameters:

seqs – A set of starting DNA sequences, given either as a list of strings or as a dataframe of genomic intervals
model – LightningModel object containing a trained deep learning model
method – Either “ism” or “pattern”.
patterns – A list of subsequences to try inserting into the starting sequence.
prediction_transform – A module to transform the model output
seq_transform – A module to assign scores to sequences
max_iter – Number of iterations
positions – Positions to mutate. If None, all positions will be mutated
devices – Device(s) for inference
num_workers – Number of workers for inference
batch_size – Batch size for inference
genome – Genome to use if genomic intervals are provided as starting sequences
for_each – If multiple start sequences are provided, perform directed evolution independently from each one
return_seqs – Which sequences to return: “all”, “best” or “none”.
return_preds – If True, return all the individual model predictions in addition to the model prediction score.
verbose – Print status after each iteration

Returns:

A dataframe containing directed evolution results
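
To illustrate the general idea behind greedy directed evolution with single-base substitutions (the “ism” method), here is a minimal, self-contained sketch in plain Python. It does not call gReLU and is not the library's implementation. The names `ism_variants`, `greedy_evolve` and `g_count` are hypothetical, the scoring function is a toy stand-in for a model prediction, and stopping early when no variant improves the score is an assumption made only for this example.

```python
from typing import Callable, List, Optional, Sequence, Tuple

BASES = "ACGT"


def ism_variants(seq: str, positions: Optional[Sequence[int]] = None) -> List[str]:
    """Return every single-base substitution of `seq` at the given positions.

    If `positions` is None, all positions are mutated (0-based indices,
    an assumption of this sketch).
    """
    if positions is None:
        positions = range(len(seq))
    variants = []
    for pos in positions:
        for base in BASES:
            if base != seq[pos]:
                variants.append(seq[:pos] + base + seq[pos + 1:])
    return variants


def greedy_evolve(
    seq: str,
    score_fn: Callable[[str], float],
    max_iter: int = 10,
    positions: Optional[Sequence[int]] = None,
) -> List[Tuple[str, float]]:
    """Greedy search: in each iteration, score all variants and keep the best.

    Returns the list of (sequence, score) pairs selected so far,
    beginning with the starting sequence.
    """
    history = [(seq, score_fn(seq))]
    for _ in range(max_iter):
        current, current_score = history[-1]
        scored = [(score_fn(v), v) for v in ism_variants(current, positions)]
        if not scored:
            break
        best_score, best_seq = max(scored, key=lambda pair: pair[0])
        if best_score <= current_score:
            # Assumption of this sketch: stop once no variant improves the score.
            break
        history.append((best_seq, best_score))
    return history


def g_count(seq: str) -> float:
    """Toy score standing in for a model prediction: the number of G bases."""
    return float(seq.count("G"))


history = greedy_evolve("AAAA", g_count, max_iter=10)
print(history[-1])  # final entry: ('GGGG', 4.0)
```

In the real function, the score comes from the trained model (optionally passed through `prediction_transform`), and the results are returned as a dataframe rather than a list of tuples.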

grelu.design.ledidi(seq: str, model: Callable, prediction_transform: torch.nn.Module | None = None, max_iter: int = 20000, positions: List[int] | None = None, devices: str | int = 'cpu', num_workers: int = 1, **kwargs)

Sequence design with Ledidi

Parameters:

seq – an initial DNA sequence as a string.
model – A trained LightningModel object
prediction_transform – A module to transform the model output
max_iter – Number of iterations
positions – Positions to mutate. If None, all positions will be mutated
targets – List of targets for each loss function
devices – Index of device to use for inference
num_workers – Number of workers for inference
**kwargs – Other arguments to pass on to Ledidi

Returns:

Output DNA sequence(s) as strings.
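
As a hedged usage sketch, the snippet below restricts editing to a central window of the input by building a `positions` list and passing it to `ledidi`. The helpers `central_positions` and `design_central_window` are hypothetical and not part of gReLU. The sketch assumes that positions are 0-based indices into the sequence and that `model` is an already trained LightningModel. gReLU is imported lazily inside the wrapper so that the helper can be loaded without the library installed.

```python
from typing import List


def central_positions(seq: str, width: int) -> List[int]:
    """Return the indices of the central `width` bases of `seq`.

    Indices are 0-based, which is an assumption of this sketch.
    """
    if not 0 < width <= len(seq):
        raise ValueError("width must be between 1 and len(seq)")
    start = (len(seq) - width) // 2
    return list(range(start, start + width))


def design_central_window(seq: str, model, width: int = 20, max_iter: int = 1000):
    """Hypothetical wrapper: run Ledidi while editing only the central window."""
    # Lazy import: requires gReLU to be installed when the function is called.
    import grelu.design

    return grelu.design.ledidi(
        seq,
        model,
        positions=central_positions(seq, width),
        max_iter=max_iter,
        devices="cpu",
    )


print(central_positions("ACGTACGTAC", 4))  # [3, 4, 5, 6]
```

Any further optimizer settings would be supplied through `**kwargs`, which the function passes on to Ledidi.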
