grelu.model.models#
Some general-purpose model architectures.
Classes#

| Class | Description |
| --- | --- |
| BaseModel | Base model class |
| ConvModel | A fully convolutional model that optionally includes pooling, residual connections, batch normalization, or dilated convolutions |
| DilatedConvModel | A model architecture based on dilated convolutional layers with residual connections |
| ConvGRUModel | A model consisting of a convolutional tower followed by a bidirectional GRU layer and optional pooling |
| ConvTransformerModel | A model consisting of a convolutional tower followed by a transformer encoder layer and optional pooling |
| ConvMLPModel | A convolutional tower followed by a multilayer perceptron (MLP) |
| BorzoiModel | Model consisting of Borzoi conv and transformer layers followed by U-net upsampling and optional pooling |
| BorzoiPretrainedModel | Borzoi model with published weights (ported from Keras) |
| ExplaiNNModel | The ExplaiNN model architecture |
| EnformerModel | Enformer model architecture |
Module Contents#
- class grelu.model.models.BaseModel(embedding: torch.nn.Module, head: torch.nn.Module)[source]#
Bases: torch.nn.Module

Base model class, consisting of an embedding module followed by a head module.
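A minimal sketch of composing a model from the two constructor arguments. The stand-in Conv1d submodules, and the assumption that the forward pass simply applies the embedding and then the head, are illustrative rather than taken from the library documentation.

```python
import torch
from torch import nn
from grelu.model.models import BaseModel

# Illustrative stand-ins: a conv embedding over one-hot DNA (4 input
# channels) and a pointwise conv head producing one output track.
embedding = nn.Conv1d(4, 64, kernel_size=15, padding=7)
head = nn.Conv1d(64, 1, kernel_size=1)
model = BaseModel(embedding=embedding, head=head)

x = torch.randn(2, 4, 100)  # stand-in for a one-hot batch: (batch, 4, length)
y = model(x)                # assumed to compute head(embedding(x))
```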
- class grelu.model.models.ConvModel(n_tasks: int, stem_channels: int = 64, stem_kernel_size: int = 15, n_conv: int = 2, channel_init: int = 64, channel_mult: float = 1, kernel_size: int = 5, dilation_init: int = 1, dilation_mult: float = 1, act_func: str = 'relu', norm: bool = False, pool_func: str | None = None, pool_size: int | None = None, residual: bool = False, dropout: float = 0.0, crop_len: int = 0, final_pool_func: str = 'avg', dtype=None, device=None)[source]#
Bases: BaseModel
A fully convolutional model that optionally includes pooling, residual connections, batch normalization, or dilated convolutions.
- Parameters:
n_tasks – Number of channels in the output
stem_channels – Number of channels in the stem
stem_kernel_size – Kernel width for the stem
n_conv – Number of convolutional blocks, not including the stem
kernel_size – Convolutional kernel width
channel_init – Initial number of channels
channel_mult – Factor by which to multiply the number of channels in each block
dilation_init – Initial dilation
dilation_mult – Factor by which to multiply the dilation in each block
act_func – Name of the activation function
pool_func – Name of the pooling function
pool_size – Width of the pooling layers
dropout – Dropout probability
norm – If True, apply batch norm
residual – If True, apply residual connection
crop_len – Number of positions to crop at either end of the output
final_pool_func – Name of the pooling function to apply to the final output. If None, no pooling will be applied at the end.
dtype – Data type for the layers.
device – Device for the layers.
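A minimal usage sketch for ConvModel. The (batch, 4, length) one-hot input layout and the pooled output shape are assumptions based on the usual convention for sequence models, not guarantees from this page.

```python
import torch
from grelu.model.models import ConvModel

model = ConvModel(
    n_tasks=2,         # two output channels
    n_conv=4,          # four conv blocks after the stem
    channel_init=64,
    channel_mult=2,    # double the channels in each block
    pool_func="max",
    pool_size=2,
)
x = torch.randn(8, 4, 256)  # stand-in for one-hot DNA: (batch, 4, length)
y = model(x)
# With the default final_pool_func="avg", the length axis should be pooled
# away, leaving roughly (8, 2, 1).
```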
- class grelu.model.models.DilatedConvModel(n_tasks: int, channels: int = 64, stem_kernel_size: int = 21, kernel_size: int = 3, dilation_mult: float = 2, act_func: str = 'relu', n_conv: int = 8, crop_len: str | int = 'auto', final_pool_func: str = 'avg', dtype=None, device=None)[source]#
Bases: BaseModel
A model architecture based on dilated convolutional layers with residual connections. Inspired by the ChromBPnet model architecture.
- Parameters:
n_tasks – Number of channels in the output
channels – Number of channels for all convolutional layers
stem_kernel_size – Kernel width for the stem
n_conv – Number of convolutional blocks, not including the stem
kernel_size – Convolutional kernel width
dilation_mult – Factor by which to multiply the dilation in each block
act_func – Name of the activation function
crop_len – Number of positions to crop at either end of the output
final_pool_func – Name of the pooling function to apply to the final output. If None, no pooling will be applied at the end.
dtype – Data type for the layers.
device – Device for the layers.
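A sketch of a ChromBPnet-style dilated tower. The note on receptive field growth is standard dilated-convolution reasoning rather than a statement about this class's internals.

```python
import torch
from grelu.model.models import DilatedConvModel

# Eight residual blocks with dilation doubling each time (dilation_mult=2)
# give a receptive field that grows exponentially with depth; the default
# crop_len="auto" trims the poorly-covered edges of the output.
model = DilatedConvModel(n_tasks=1, channels=64, n_conv=8)
y = model(torch.randn(2, 4, 2000))  # stand-in one-hot batch: (batch, 4, length)
```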
- class grelu.model.models.ConvGRUModel(n_tasks: int, stem_channels: int = 16, stem_kernel_size: int = 15, n_conv: int = 2, channel_init: int = 16, channel_mult: float = 1, kernel_size: int = 5, act_func: str = 'relu', conv_norm: bool = False, pool_func: str | None = None, pool_size: int | None = None, residual: bool = False, crop_len: int = 0, n_gru: int = 1, dropout: float = 0.0, gru_norm: bool = False, final_pool_func: str = 'avg', dtype=None, device=None)[source]#
Bases: BaseModel
A model consisting of a convolutional tower followed by a bidirectional GRU layer and optional pooling.
- Parameters:
n_tasks – Number of channels in the output
stem_channels – Number of channels in the stem
stem_kernel_size – Kernel width for the stem
n_conv – Number of convolutional blocks, not including the stem
kernel_size – Convolutional kernel width
channel_init – Initial number of channels
channel_mult – Factor by which to multiply the number of channels in each block
act_func – Name of the activation function
pool_func – Name of the pooling function
pool_size – Width of the pooling layers
conv_norm – If True, apply batch normalization in the convolutional layers.
residual – If True, apply residual connections in the convolutional layers.
crop_len – Number of positions to crop at either end of the output
n_gru – Number of GRU layers
dropout – Dropout probability for the GRU and feed-forward layers
gru_norm – If True, apply layer normalization in the feed-forward network
final_pool_func – Name of the pooling function to apply to the final output. If None, no pooling will be applied at the end.
dtype – Data type for the layers.
device – Device for the layers.
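A minimal sketch for ConvGRUModel; the input follows the same assumed (batch, 4, length) one-hot convention.

```python
import torch
from grelu.model.models import ConvGRUModel

model = ConvGRUModel(
    n_tasks=1,
    n_conv=2,
    pool_func="max",
    pool_size=2,
    n_gru=2,       # two stacked bidirectional GRU layers
    dropout=0.1,
)
y = model(torch.randn(4, 4, 128))
```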
- class grelu.model.models.ConvTransformerModel(n_tasks: int, stem_channels: int = 16, stem_kernel_size: int = 15, n_conv: int = 2, channel_init: int = 16, channel_mult: float = 1, kernel_size: int = 5, act_func: str = 'relu', norm: bool = False, pool_func: str | None = None, pool_size: int | None = None, residual: bool = False, crop_len: int = 0, n_transformers=1, key_len: int = 8, value_len: int = 8, n_heads: int = 1, n_pos_features: int = 4, pos_dropout: float = 0.0, attn_dropout: float = 0.0, ff_dropout: float = 0.0, final_pool_func: str = 'avg', dtype=None, device=None)[source]#
Bases: BaseModel
A model consisting of a convolutional tower followed by a transformer encoder layer and optional pooling.
- Parameters:
n_tasks – Number of channels in the output
stem_channels – Number of channels in the stem
stem_kernel_size – Kernel width for the stem
n_conv – Number of convolutional blocks, not including the stem
kernel_size – Convolutional kernel width
channel_init – Initial number of channels
channel_mult – Factor by which to multiply the number of channels in each block
act_func – Name of the activation function
pool_func – Name of the pooling function
pool_size – Width of the pooling layers
norm – If True, apply batch normalization in the convolutional layers.
residual – If True, apply residual connections in the convolutional layers.
crop_len – Number of positions to crop at either end of the output
n_transformers – Number of transformer encoder layers
n_heads – Number of heads in each multi-head attention layer
n_pos_features – Number of positional embedding features
key_len – Length of the key vectors
value_len – Length of the value vectors
pos_dropout – Dropout probability in the positional embeddings
attn_dropout – Dropout probability in the attention layers
ff_dropout – Dropout probability in the linear feed-forward layers
final_pool_func – Name of the pooling function to apply to the final output. If None, no pooling will be applied at the end.
dtype – Data type for the layers.
device – Device for the layers.
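A minimal sketch for ConvTransformerModel, keeping the attention hyperparameters at their small defaults (key_len=8, value_len=8).

```python
import torch
from grelu.model.models import ConvTransformerModel

model = ConvTransformerModel(
    n_tasks=1,
    n_transformers=2,  # two stacked transformer encoder layers
    n_heads=2,
    attn_dropout=0.1,
)
y = model(torch.randn(4, 4, 128))
```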
- class grelu.model.models.ConvMLPModel(seq_len: int, n_tasks: int, stem_channels: int = 16, stem_kernel_size: int = 15, n_conv: int = 2, channel_init: int = 16, channel_mult: float = 1, kernel_size: int = 5, act_func: str = 'relu', conv_norm: bool = False, pool_func: str | None = None, pool_size: int | None = None, residual: bool = True, mlp_norm: bool = False, mlp_act_func: str | None = 'relu', mlp_hidden_size: List[int] = [8], dropout: float = 0.0, dtype=None, device=None)[source]#
Bases: BaseModel
A convolutional tower followed by a multilayer perceptron (MLP).
- Parameters:
n_tasks – Number of channels in the output
seq_len – Input length
stem_channels – Number of channels in the stem
stem_kernel_size – Kernel width for the stem
n_conv – Number of convolutional blocks, not including the stem
kernel_size – Convolutional kernel width
channel_init – Initial number of channels
channel_mult – Factor by which to multiply the number of channels in each block
act_func – Name of the activation function
pool_func – Name of the pooling function
pool_size – Width of the pooling layers
conv_norm – If True, apply batch norm in the convolutional layers
residual – If True, apply residual connection
mlp_norm – If True, apply layer norm in the MLP layers
mlp_act_func – Name of the activation function for the MLP layers
mlp_hidden_size – A list containing the dimensions for each hidden layer of the MLP
dropout – Dropout probability for the MLP layers.
dtype – Data type for the layers.
device – Device for the layers.
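A minimal sketch for ConvMLPModel. Unlike the other towers, seq_len is required up front, presumably because the MLP acts on a flattened, fixed-size representation of the convolutional output; that rationale is an inference from the signature.

```python
import torch
from grelu.model.models import ConvMLPModel

model = ConvMLPModel(
    seq_len=200,              # input length is fixed at construction
    n_tasks=1,
    mlp_hidden_size=[32, 8],  # two hidden MLP layers
)
y = model(torch.randn(4, 4, 200))  # input length must match seq_len
```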
- class grelu.model.models.BorzoiModel(n_tasks: int, stem_channels: int = 512, stem_kernel_size: int = 15, init_channels: int = 608, channels: int = 1536, n_conv: int = 7, kernel_size: int = 5, n_transformers: int = 8, key_len: int = 64, value_len: int = 192, pos_dropout: float = 0.0, attn_dropout: float = 0.0, n_heads: int = 8, n_pos_features: int = 32, crop_len: int = 16, final_act_func: str | None = None, final_pool_func: str | None = 'avg', flash_attn=False, dtype=None, device=None)[source]#
Bases: BaseModel
Model consisting of Borzoi conv and transformer layers followed by U-net upsampling and optional pooling.
- Parameters:
n_tasks – Number of channels in the output
stem_channels – Number of channels in the first (stem) convolutional layer
stem_kernel_size – Width of the convolutional kernel in the first (stem) convolutional layer
init_channels – Number of channels in the first convolutional block after the stem
channels – Number of channels in the output of the convolutional tower
kernel_size – Width of the convolutional kernel
n_conv – Number of convolutional/pooling blocks
n_transformers – Number of stacked transformer blocks
n_pos_features – Number of features in the positional embeddings
n_heads – Number of attention heads
key_len – Length of the key vectors
value_len – Length of the value vectors.
pos_dropout – Dropout probability in the positional embeddings
attn_dropout – Dropout probability in the attention layer
crop_len – Number of positions to crop at either end of the output
final_act_func – Name of the activation function to use in the final layer
final_pool_func – Name of the pooling function to apply to the final output. If None, no pooling will be applied at the end.
flash_attn – If True, use FlashAttention with rotary position embeddings; key_len, value_len, pos_dropout and n_pos_features are ignored.
dtype – Data type for the layers.
device – Device for the layers.
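The defaults in the signature describe the full-scale architecture, which is expensive to construct. Below is a scaled-down sketch with values chosen purely for illustration; setting value_len to channels / n_heads is an assumed constraint, not one stated on this page.

```python
from grelu.model.models import BorzoiModel

model = BorzoiModel(
    n_tasks=4,
    stem_channels=64,
    init_channels=64,
    channels=128,
    n_transformers=2,
    n_heads=4,
    key_len=32,
    value_len=32,   # channels / n_heads (assumed)
    crop_len=16,
)
```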
- class grelu.model.models.BorzoiPretrainedModel(n_tasks: int, fold: int = 0, n_transformers: int = 8, crop_len=0, final_pool_func='avg', dtype=None, device=None)[source]#
Bases: BaseModel
Borzoi model with published weights (ported from Keras).
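A sketch only: based on the class description, construction should load the published weights, which presumably requires fetching a checkpoint on first use; whether n_tasks must match the published head size is not stated here.

```python
from grelu.model.models import BorzoiPretrainedModel

# fold selects one of the published training folds (assumed semantics).
model = BorzoiPretrainedModel(n_tasks=1, fold=0)
```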
- class grelu.model.models.ExplaiNNModel(n_tasks: int, in_len: int, channels=300, kernel_size=19, dtype=None, device=None)[source]#
Bases: torch.nn.Module
The ExplaiNN model architecture.
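A minimal sketch for ExplaiNNModel. In the published ExplaiNN architecture, the network combines many independent single-filter CNN units; channels presumably sets the number of units and in_len the fixed input length.

```python
import torch
from grelu.model.models import ExplaiNNModel

model = ExplaiNNModel(n_tasks=1, in_len=200, channels=100)
y = model(torch.randn(4, 4, 200))  # input length must equal in_len
```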
- class grelu.model.models.EnformerModel(n_tasks: int, n_conv: int = 7, channels: int = 1536, n_transformers: int = 11, n_heads: int = 8, key_len: int = 64, attn_dropout: float = 0.05, pos_dropout: float = 0.01, ff_dropout: float = 0.4, crop_len: int = 0, final_act_func: str | None = None, final_pool_func: str | None = 'avg', dtype=None, device=None)[source]#
Bases: BaseModel
Enformer model architecture.
- Parameters:
n_tasks – Number of tasks for the model to predict
n_conv – Number of convolutional/pooling blocks
channels – Number of output channels for the convolutional tower
n_transformers – Number of stacked transformer blocks
n_heads – Number of attention heads
key_len – Length of the key vectors
pos_dropout – Dropout probability in the positional embeddings
attn_dropout – Dropout probability in the attention layers
ff_dropout – Dropout probability in the linear feed-forward layers
crop_len – Number of positions to crop at either end of the output
final_act_func – Name of the activation function to use in the final layer
final_pool_func – Name of the pooling function to apply to the final output. If None, no pooling will be applied at the end.
dtype – Data type for the layers.
device – Device for the layers.
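As with Borzoi, the defaults describe the full-scale published architecture. A scaled-down sketch with illustrative values; keeping channels divisible by n_heads is an assumed constraint.

```python
from grelu.model.models import EnformerModel

model = EnformerModel(
    n_tasks=2,
    channels=768,      # the full model uses 1536
    n_transformers=2,  # the full model uses 11
    n_heads=4,
)
```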