grelu.model.models

Some general purpose model architectures.

Classes

BaseModel

Base model class

ConvModel

A fully convolutional model that optionally includes pooling, residual connections, batch normalization, or dilated convolutions.

DilatedConvModel

A model architecture based on dilated convolutional layers with residual connections.

ConvGRUModel

A model consisting of a convolutional tower followed by a bidirectional GRU layer and optional pooling.

ConvTransformerModel

A model consisting of a convolutional tower followed by a transformer encoder layer and optional pooling.

ConvMLPModel

A convolutional tower followed by a multilayer perceptron (MLP).

BorzoiModel

Model consisting of Borzoi conv and transformer layers followed by U-net upsampling and optional pooling.

BorzoiPretrainedModel

Borzoi model with published weights (ported from Keras).

ExplaiNNModel

The ExplaiNN model architecture.

EnformerModel

Enformer model architecture.

EnformerPretrainedModel

Enformer model with published weights.

Module Contents

class grelu.model.models.BaseModel(embedding: torch.nn.Module, head: torch.nn.Module)

Bases: torch.nn.Module

Base model class

embedding
head
forward(x: torch.Tensor) → torch.Tensor

Forward pass

Parameters:

x – Input tensor of shape (N, C, L)

Returns:

Output tensor
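The embedding/head split can be sketched with plain PyTorch modules. This is a minimal illustration that assumes the forward pass applies `embedding` first and `head` second; `TinyBaseModel` and the toy layers are hypothetical stand-ins, not gReLU code.

```python
import torch
from torch import nn


class TinyBaseModel(nn.Module):
    """Hypothetical stand-in mirroring the documented embedding + head layout."""

    def __init__(self, embedding: nn.Module, head: nn.Module) -> None:
        super().__init__()
        self.embedding = embedding
        self.head = head

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Assumed order: embed the sequence first, then apply the task head.
        return self.head(self.embedding(x))


# Toy layers chosen only for illustration; they are not gReLU layers.
embedding = nn.Conv1d(in_channels=4, out_channels=8, kernel_size=5, padding=2)
head = nn.Conv1d(in_channels=8, out_channels=3, kernel_size=1)
model = TinyBaseModel(embedding, head)

x = torch.randn(2, 4, 100)  # (N, C, L): 2 sequences, 4 channels, length 100
y = model(x)                # 3 output channels; length kept by the padding
```

Keeping the two parts as separate attributes means either one can be swapped or frozen on its own, for example when reusing a trained embedding with a new head.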

class grelu.model.models.ConvModel(n_tasks: int, stem_channels: int = 64, stem_kernel_size: int = 15, n_conv: int = 2, channel_init: int = 64, channel_mult: float = 1, kernel_size: int = 5, dilation_init: int = 1, dilation_mult: float = 1, act_func: str = 'relu', norm: bool = False, pool_func: str | None = None, pool_size: int | None = None, residual: bool = False, dropout: float = 0.0, crop_len: int = 0, final_pool_func: str = 'avg', dtype=None, device=None)

Bases: BaseModel

A fully convolutional model that optionally includes pooling, residual connections, batch normalization, or dilated convolutions.

Parameters:
  • n_tasks – Number of channels in the output

  • stem_channels – Number of channels in the stem

  • stem_kernel_size – Kernel width for the stem

  • n_conv – Number of convolutional blocks, not including the stem

  • kernel_size – Convolutional kernel width

  • channel_init – Initial number of channels

  • channel_mult – Factor by which to multiply the number of channels in each block

  • dilation_init – Initial dilation

  • dilation_mult – Factor by which to multiply the dilation in each block

  • act_func – Name of the activation function

  • pool_func – Name of the pooling function

  • pool_size – Width of the pooling layers

  • dropout – Dropout probability

  • norm – If True, apply batch norm

  • residual – If True, apply residual connection

  • crop_len – Number of positions to crop at either end of the output

  • final_pool_func – Name of the pooling function to apply to the final output. If None, no pooling will be applied at the end.

  • dtype – Data type for the layers.

  • device – Device for the layers.
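The interplay of `channel_init`/`channel_mult` and `dilation_init`/`dilation_mult` can be pictured as a geometric schedule over the blocks. The sketch below assumes the first block uses the initial value unchanged and that results are rounded to the nearest integer; neither detail is stated on this page, and `geometric_schedule` is a hypothetical helper.

```python
from typing import List


def geometric_schedule(init: float, mult: float, n_blocks: int) -> List[int]:
    """Return one value per block, multiplying by `mult` after each block."""
    values = []
    current = float(init)
    for _ in range(n_blocks):
        # Rounding to the nearest integer is an assumption of this sketch.
        values.append(int(round(current)))
        current *= mult
    return values


# Illustrative settings, not library defaults.
channels = geometric_schedule(init=64, mult=1.5, n_blocks=3)
dilations = geometric_schedule(init=1, mult=2, n_blocks=3)
print(channels, dilations)
```

With the default multipliers of 1, every block keeps the initial channel count and dilation.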

class grelu.model.models.DilatedConvModel(n_tasks: int, channels: int = 64, stem_kernel_size: int = 21, kernel_size: int = 3, dilation_mult: float = 2, act_func: str = 'relu', n_conv: int = 8, crop_len: str | int = 'auto', final_pool_func: str = 'avg', dtype=None, device=None)

Bases: BaseModel

A model architecture based on dilated convolutional layers with residual connections. Inspired by the ChromBPnet model architecture.

Parameters:
  • n_tasks – Number of channels in the output

  • channels – Number of channels for all convolutional layers

  • stem_kernel_size – Kernel width for the stem

  • n_conv – Number of convolutional blocks, not including the stem

  • kernel_size – Convolutional kernel width

  • dilation_mult – Factor by which to multiply the dilation in each block

  • act_func – Name of the activation function

  • crop_len – Number of positions to crop at either end of the output

  • final_pool_func – Name of the pooling function to apply to the final output. If None, no pooling will be applied at the end.

  • dtype – Data type for the layers.

  • device – Device for the layers.
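Stacked dilated convolutions widen the receptive field quickly, which is the main reason to choose this architecture. The sketch below applies the standard receptive-field formula for stride-1 convolutions. The dilation of the first block is not documented here, so `first_dilation` is a hypothetical parameter and both plausible starting values are shown.

```python
from typing import List


def dilation_schedule(first_dilation: int, dilation_mult: int, n_conv: int) -> List[int]:
    """Dilation of each block, growing by `dilation_mult` per block."""
    return [first_dilation * dilation_mult**i for i in range(n_conv)]


def receptive_field(stem_kernel_size: int, kernel_size: int, dilations: List[int]) -> int:
    """Receptive field of a stride-1 stem followed by dilated convolutions."""
    rf = stem_kernel_size  # the stem alone sees `stem_kernel_size` positions
    for d in dilations:
        rf += (kernel_size - 1) * d  # each block adds (k - 1) * dilation
    return rf


# Signature defaults: stem_kernel_size=21, kernel_size=3, dilation_mult=2, n_conv=8.
rf_from_1 = receptive_field(21, 3, dilation_schedule(1, 2, 8))
rf_from_2 = receptive_field(21, 3, dilation_schedule(2, 2, 8))
print(rf_from_1, rf_from_2)
```

How `crop_len='auto'` is resolved is not described on this page. For reference, `(rf - 1) // 2` is the number of positions at each end whose receptive field extends beyond the input.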

class grelu.model.models.ConvGRUModel(n_tasks: int, stem_channels: int = 16, stem_kernel_size: int = 15, n_conv: int = 2, channel_init: int = 16, channel_mult: float = 1, kernel_size: int = 5, act_func: str = 'relu', conv_norm: bool = False, pool_func: str | None = None, pool_size: int | None = None, residual: bool = False, crop_len: int = 0, n_gru: int = 1, dropout: float = 0.0, gru_norm: bool = False, final_pool_func: str = 'avg', dtype=None, device=None)

Bases: BaseModel

A model consisting of a convolutional tower followed by a bidirectional GRU layer and optional pooling.

Parameters:
  • n_tasks – Number of channels in the output

  • stem_channels – Number of channels in the stem

  • stem_kernel_size – Kernel width for the stem

  • n_conv – Number of convolutional blocks, not including the stem

  • kernel_size – Convolutional kernel width

  • channel_init – Initial number of channels

  • channel_mult – Factor by which to multiply the number of channels in each block

  • act_func – Name of the activation function

  • pool_func – Name of the pooling function

  • pool_size – Width of the pooling layers

  • conv_norm – If True, apply batch normalization in the convolutional layers.

  • residual – If True, apply residual connections in the convolutional layers.

  • crop_len – Number of positions to crop at either end of the output

  • n_gru – Number of GRU layers

  • dropout – Dropout for GRU and feed-forward layers

  • gru_norm – If True, include layer normalization in the feed-forward network.

  • final_pool_func – Name of the pooling function to apply to the final output. If None, no pooling will be applied at the end.

  • dtype – Data type for the layers.

  • device – Device for the layers.

class grelu.model.models.ConvTransformerModel(n_tasks: int, stem_channels: int = 16, stem_kernel_size: int = 15, n_conv: int = 2, channel_init: int = 16, channel_mult: float = 1, kernel_size: int = 5, act_func: str = 'relu', norm: bool = False, pool_func: str | None = None, pool_size: int | None = None, residual: bool = False, crop_len: int = 0, n_transformers=1, key_len: int = 8, value_len: int = 8, n_heads: int = 1, n_pos_features: int = 4, pos_dropout: float = 0.0, attn_dropout: float = 0.0, ff_dropout: float = 0.0, final_pool_func: str = 'avg', dtype=None, device=None)

Bases: BaseModel

A model consisting of a convolutional tower followed by a transformer encoder layer and optional pooling.

Parameters:
  • n_tasks – Number of channels in the output

  • stem_channels – Number of channels in the stem

  • stem_kernel_size – Kernel width for the stem

  • n_conv – Number of convolutional blocks, not including the stem

  • kernel_size – Convolutional kernel width

  • channel_init – Initial number of channels

  • channel_mult – Factor by which to multiply the number of channels in each block

  • act_func – Name of the activation function

  • pool_func – Name of the pooling function

  • pool_size – Width of the pooling layers

  • norm – If True, apply batch normalization in the convolutional layers.

  • residual – If True, apply residual connections in the convolutional layers.

  • crop_len – Number of positions to crop at either end of the output

  • n_transformers – Number of transformer encoder layers

  • n_heads – Number of heads in each multi-head attention layer

  • n_pos_features – Number of positional embedding features

  • key_len – Length of the key vectors

  • value_len – Length of the value vectors.

  • pos_dropout – Dropout probability in the positional embeddings

  • attn_dropout – Dropout probability in the attention layer

  • ff_dropout – Dropout probability in the linear feed-forward layers

  • final_pool_func – Name of the pooling function to apply to the final output. If None, no pooling will be applied at the end.

  • dtype – Data type for the layers.

  • device – Device for the layers.
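To see how `n_heads`, `key_len` and `value_len` set the size of the attention projections, the following sketch assumes the common multi-head layout in which every head has its own key and value vectors, so the projection widths are products of the per-head length and the head count. `attention_widths` is a hypothetical helper, not part of the library.

```python
from typing import Dict


def attention_widths(n_heads: int, key_len: int, value_len: int) -> Dict[str, int]:
    """Total projection widths, assuming one key/value vector per head."""
    return {
        "query_key_width": n_heads * key_len,  # queries and keys share a width
        "value_width": n_heads * value_len,    # concatenated values of all heads
    }


# ConvTransformerModel defaults: n_heads=1, key_len=8, value_len=8.
small = attention_widths(n_heads=1, key_len=8, value_len=8)

# BorzoiModel defaults from this page: n_heads=8, key_len=64, value_len=192.
large = attention_widths(n_heads=8, key_len=64, value_len=192)
print(small, large)
```

Under this assumption, the Borzoi defaults give a concatenated value width of 1536, the same as the default `channels` of that model.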

class grelu.model.models.ConvMLPModel(seq_len: int, n_tasks: int, stem_channels: int = 16, stem_kernel_size: int = 15, n_conv: int = 2, channel_init: int = 16, channel_mult: float = 1, kernel_size: int = 5, act_func: str = 'relu', conv_norm: bool = False, pool_func: str | None = None, pool_size: int | None = None, residual: bool = True, mlp_norm: bool = False, mlp_act_func: str | None = 'relu', mlp_hidden_size: List[int] = [8], dropout: float = 0.0, dtype=None, device=None)

Bases: BaseModel

A convolutional tower followed by a multilayer perceptron (MLP).

Parameters:
  • n_tasks – Number of channels in the output

  • seq_len – Input length

  • stem_channels – Number of channels in the stem

  • stem_kernel_size – Kernel width for the stem

  • n_conv – Number of convolutional blocks, not including the stem

  • kernel_size – Convolutional kernel width

  • channel_init – Initial number of channels

  • channel_mult – Factor by which to multiply the number of channels in each block

  • act_func – Name of the activation function

  • pool_func – Name of the pooling function

  • pool_size – Width of the pooling layers

  • conv_norm – If True, apply batch norm in the convolutional layers

  • residual – If True, apply residual connection

  • mlp_norm – If True, apply layer norm in the MLP layers

  • mlp_hidden_size – A list containing the dimensions for each hidden layer of the MLP.

  • dropout – Dropout probability for the MLP layers.

  • dtype – Data type for the layers.

  • device – Device for the layers.
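Unlike the other convolutional models, this class takes `seq_len`, presumably because a fully connected layer needs a fixed input size. The sketch below estimates that size under several assumptions that this page does not confirm: convolutions preserve length, each pooled block divides the length by `pool_size`, and the MLP receives the flattened channels × positions tensor. The helper names and `n_pooled_blocks` are hypothetical.

```python
from typing import Optional


def tower_output_length(seq_len: int, n_pooled_blocks: int, pool_size: Optional[int]) -> int:
    """Length after the tower, assuming only pooling changes the length."""
    length = seq_len
    if pool_size:  # None means no pooling at all
        for _ in range(n_pooled_blocks):
            length //= pool_size
    return length


def flattened_features(
    seq_len: int, channels: int, n_pooled_blocks: int, pool_size: Optional[int]
) -> int:
    """Number of inputs to the first MLP layer, assuming a flatten step."""
    return channels * tower_output_length(seq_len, n_pooled_blocks, pool_size)


# Illustrative values: 200 bp input, 16 channels, two pooled blocks.
with_pooling = flattened_features(200, 16, n_pooled_blocks=2, pool_size=2)
without_pooling = flattened_features(200, 16, n_pooled_blocks=2, pool_size=None)
print(with_pooling, without_pooling)
```

The estimate shows why pooling matters for this architecture: without it, the first MLP layer grows linearly with the input length.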

class grelu.model.models.BorzoiModel(n_tasks: int, stem_channels: int = 512, stem_kernel_size: int = 15, init_channels: int = 608, channels: int = 1536, n_conv: int = 7, kernel_size: int = 5, n_transformers: int = 8, key_len: int = 64, value_len: int = 192, pos_dropout: float = 0.0, attn_dropout: float = 0.0, n_heads: int = 8, n_pos_features: int = 32, crop_len: int = 16, final_act_func: str | None = None, final_pool_func: str | None = 'avg', flash_attn=False, dtype=None, device=None)

Bases: BaseModel

Model consisting of Borzoi conv and transformer layers followed by U-net upsampling and optional pooling.

Parameters:
  • stem_channels – Number of channels in the first (stem) convolutional layer

  • stem_kernel_size – Width of the convolutional kernel in the first (stem) convolutional layer

  • init_channels – Number of channels in the first convolutional block after the stem

  • channels – Number of channels in the output of the convolutional tower

  • kernel_size – Width of the convolutional kernel

  • n_conv – Number of convolutional/pooling blocks

  • n_transformers – Number of stacked transformer blocks

  • n_pos_features – Number of features in the positional embeddings

  • n_heads – Number of attention heads

  • key_len – Length of the key vectors

  • value_len – Length of the value vectors.

  • pos_dropout – Dropout probability in the positional embeddings

  • attn_dropout – Dropout probability in the attention layer

  • crop_len – Number of positions to crop at either end of the output

  • final_act_func – Name of the activation function to use in the final layer

  • final_pool_func – Name of the pooling function to apply to the final output. If None, no pooling will be applied at the end.

  • flash_attn – If True, use Flash Attention with rotary position embeddings. In this case, key_len, value_len, pos_dropout and n_pos_features are ignored.

  • dtype – Data type for the layers.

  • device – Device for the layers.
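The combined effect of the pooling blocks, U-net upsampling and `crop_len` on the output length can be worked through with simple arithmetic. This sketch assumes that each of the `n_conv` blocks halves the length, that upsampling doubles it twice (as in the published Borzoi architecture), and that cropping is counted in output bins. `n_upsample` and `borzoi_output_bins` are hypothetical names introduced for illustration.

```python
def borzoi_output_bins(
    seq_len: int, n_conv: int = 7, n_upsample: int = 2, crop_len: int = 0
) -> int:
    """Output length in bins before any final pooling, under the stated assumptions."""
    pooled = seq_len // (2**n_conv)        # each conv/pooling block halves the length
    upsampled = pooled * (2**n_upsample)   # each U-net stage doubles it again
    return upsampled - 2 * crop_len        # crop_len bins removed at either end


seq_len = 524_288  # input length used by the published Borzoi model
uncropped = borzoi_output_bins(seq_len)
default_crop = borzoi_output_bins(seq_len, crop_len=16)      # signature default
published_crop = borzoi_output_bins(seq_len, crop_len=5120)  # published setting
print(uncropped, default_crop, published_crop)
```

Under these assumptions each output bin covers 32 bp, and the published crop of 5120 bins per side leaves the 6144 bins reported for Borzoi.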

class grelu.model.models.BorzoiPretrainedModel(n_tasks: int, fold: int = 0, n_transformers: int = 8, crop_len=0, final_pool_func='avg', dtype=None, device=None)

Bases: BaseModel

Borzoi model with published weights (ported from Keras).

class grelu.model.models.ExplaiNNModel(n_tasks: int, in_len: int, channels=300, kernel_size=19, dtype=None, device=None)

Bases: torch.nn.Module

The ExplaiNN model architecture.

Parameters:
  • n_tasks (int) – number of outputs

  • in_len (int) – length of the input sequences

  • channels (int) – number of independent CNN units (default=300)

  • kernel_size (int) – size of each unit’s conv. filter (default=19)

  • dtype – Data type for the layers.

  • device – Device for the layers.

class grelu.model.models.EnformerModel(n_tasks: int, n_conv: int = 7, channels: int = 1536, n_transformers: int = 11, n_heads: int = 8, key_len: int = 64, attn_dropout: float = 0.05, pos_dropout: float = 0.01, ff_dropout: float = 0.4, crop_len: int = 0, final_act_func: str | None = None, final_pool_func: str | None = 'avg', dtype=None, device=None)

Bases: BaseModel

Enformer model architecture.

Parameters:
  • n_tasks – Number of tasks for the model to predict

  • n_conv – Number of convolutional/pooling blocks

  • channels – Number of output channels for the convolutional tower

  • n_transformers – Number of stacked transformer blocks

  • n_heads – Number of attention heads

  • key_len – Length of the key vectors

  • value_len – Length of the value vectors.

  • pos_dropout – Dropout probability in the positional embeddings

  • attn_dropout – Dropout probability in the attention layer

  • ff_dropout – Dropout probability in the linear feed-forward layers

  • crop_len – Number of positions to crop at either end of the output

  • final_act_func – Name of the activation function to use in the final layer

  • final_pool_func – Name of the pooling function to apply to the final output. If None, no pooling will be applied at the end.

  • dtype – Data type for the layers.

  • device – Device for the layers.

class grelu.model.models.EnformerPretrainedModel(n_tasks: int, n_transformers: int = 11, crop_len=0, final_pool_func='avg', dtype=None, device=None)

Bases: BaseModel

Enformer model with published weights.