grelu.design

Functions

evolve(seqs, model[, method, patterns, ...]) → pandas.DataFrame

Sequence design by greedy directed evolution

ledidi(seq, model[, prediction_transform, max_iter, ...])

Sequence design with Ledidi

Module Contents

grelu.design.evolve(seqs: List[str] | pandas.DataFrame, model: grelu.lightning.LightningModel, method: str = 'ism', patterns: List[str] | None = None, prediction_transform: torch.nn.Module | None = None, seq_transform: torch.nn.Module | None = None, max_iter: int = 10, positions: List[int] = None, devices: str | int | List[int] = 'cpu', num_workers: int = 1, batch_size: int = 64, genome: str | None = None, for_each: bool = True, return_seqs: str = 'all', return_preds: bool = True, verbose: bool = True) → pandas.DataFrame

Sequence design by greedy directed evolution

Parameters:
  • seqs – Starting DNA sequences, given as a list of strings or as a dataframe of genomic intervals

  • model – LightningModel object containing a trained deep learning model

  • method – Either “ism” (in silico mutagenesis) or “pattern” (insertion of the subsequences given in patterns).

  • patterns – A list of subsequences to try inserting into the starting sequence.

  • prediction_transform – A module to transform the model output

  • seq_transform – A module to assign scores to sequences

  • max_iter – Number of iterations

  • positions – Positions to mutate. If None, all positions will be mutated

  • devices – Device(s) for inference

  • num_workers – Number of workers for inference

  • batch_size – Batch size for inference

  • genome – Genome to use if genomic intervals are provided as starting sequences

  • for_each – If multiple start sequences are provided, perform directed evolution independently from each one

  • return_seqs – “all”, “best” or “none”.

  • return_preds – If True, return all the individual model predictions in addition to the model prediction score.

  • verbose – Print status after each iteration

Returns:

A dataframe containing directed evolution results
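The greedy loop described above can be illustrated with a minimal, self-contained sketch. It is not gReLU's implementation: the names greedy_evolve and gc_fraction are hypothetical, a toy scoring function stands in for the trained model, positions are assumed to be 0-based, and stopping early when no candidate improves the score is a choice made for this sketch only.

```python
from typing import Callable, List, Optional, Tuple

BASES = "ACGT"


def greedy_evolve(
    seq: str,
    score_fn: Callable[[str], float],
    max_iter: int = 10,
    positions: Optional[List[int]] = None,
) -> List[Tuple[int, str, float]]:
    """Toy greedy directed evolution using single-base substitutions.

    Hypothetical helper for illustration; score_fn stands in for a model.
    Returns one (iteration, sequence, score) tuple per accepted step.
    """
    if positions is None:
        # Mirror the documented default: mutate every position.
        positions = list(range(len(seq)))
    best_seq, best_score = seq, score_fn(seq)
    history = [(0, best_seq, best_score)]
    for iteration in range(1, max_iter + 1):
        # Enumerate every single-base substitution at the allowed positions.
        candidates = [
            best_seq[:pos] + base + best_seq[pos + 1:]
            for pos in positions
            for base in BASES
            if base != best_seq[pos]
        ]
        if not candidates:
            break
        # Greedy step: keep only the highest-scoring candidate.
        top = max(candidates, key=score_fn)
        top_score = score_fn(top)
        if top_score <= best_score:
            break  # assumption of this sketch: stop when nothing improves
        best_seq, best_score = top, top_score
        history.append((iteration, best_seq, best_score))
    return history


def gc_fraction(seq: str) -> float:
    """Toy objective: fraction of G or C bases in the sequence."""
    return sum(base in "GC" for base in seq) / len(seq)


# Evolve only the first three bases toward higher GC content.
history = greedy_evolve("ATATAT", gc_fraction, max_iter=3, positions=[0, 1, 2])
for iteration, seq, score in history:
    print(iteration, seq, round(score, 3))
```

Restricting positions leaves the rest of the sequence untouched, which is useful when only part of a regulatory element should be edited.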

grelu.design.ledidi(seq: str, model: Callable, prediction_transform: torch.nn.Module | None = None, max_iter: int = 20000, positions: List[int] | None = None, devices: str | int = 'cpu', num_workers: int = 1, **kwargs)

Sequence design with Ledidi

Parameters:
  • seq – an initial DNA sequence as a string.

  • model – A trained LightningModel object

  • prediction_transform – A module to transform the model output

  • max_iter – Number of iterations

  • positions – Positions to mutate. If None, all positions will be mutated

  • targets – List of targets for each loss function

  • devices – Device to use for inference, either 'cpu' or the index of a device

  • num_workers – Number of workers for inference

  • **kwargs – Other arguments to pass on to Ledidi

Returns:

Output DNA sequence(s) as strings.
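Since a run can last up to max_iter = 20000 iterations by default, it may be worth validating the inputs before calling ledidi. The sketch below is one possible wrapper, not part of gReLU: check_design_inputs and design_with_ledidi are hypothetical names, positions are assumed to be 0-based, and the sequence is assumed to contain only A, C, G and T. The call itself uses only keyword arguments from the signature documented above, and gReLU is imported lazily so the validation helper can be used without it.

```python
from typing import Any, List, Optional


def check_design_inputs(seq: str, positions: Optional[List[int]] = None) -> None:
    """Raise ValueError if seq or positions look unsuitable for design.

    Hypothetical helper; assumes an A/C/G/T alphabet and 0-based positions.
    """
    if not seq:
        raise ValueError("seq must be a non-empty string")
    invalid = set(seq.upper()) - set("ACGT")
    if invalid:
        raise ValueError(f"seq contains non-DNA characters: {sorted(invalid)}")
    if positions is not None:
        out_of_range = [p for p in positions if not 0 <= p < len(seq)]
        if out_of_range:
            raise ValueError(f"positions outside the sequence: {out_of_range}")


def design_with_ledidi(
    seq: str,
    model: Any,
    positions: Optional[List[int]] = None,
    max_iter: int = 20000,
    devices: Any = "cpu",
) -> Any:
    """Validate the inputs, then hand them to grelu.design.ledidi.

    Hypothetical wrapper; requires gReLU to be installed when called.
    """
    check_design_inputs(seq, positions)
    # Imported here so that the validation helper works without gReLU.
    from grelu.design import ledidi

    return ledidi(
        seq=seq,
        model=model,
        max_iter=max_iter,
        positions=positions,
        devices=devices,
    )
```

A call would then look like design_with_ledidi(seq, model, positions=[10, 11, 12]), where model is a trained LightningModel; any further Ledidi options would still need to be passed to ledidi directly through **kwargs.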