grelu.model.blocks#

grelu.model.blocks defines larger blocks that form part of the architecture of sequence-to-function deep learning models. Each such block is composed of multiple layers.

Classes#

`LinearBlock`	Linear layer followed by optional normalization,
`ConvBlock`	Convolutional layer along with optional normalization,
`ChannelTransformBlock`	Convolutional layer with kernel size=1 along with optional normalization, activation
`Stem`	Convolutional layer followed by optional activation and pooling.
`SeparableConv`	Equivalent class to tf.keras.layers.SeparableConv1D
`ConvTower`	A module that consists of multiple convolutional blocks and takes a one-hot encoded
`FeedForwardBlock`	2-layer feed-forward network. Can be used to follow layers such as GRU and attention.
`GRUBlock`	Stacked bidirectional GRU layers followed by a feed-forward network.
`TransformerBlock`	A block containing a multi-head attention layer followed by a feed-forward
`TransformerTower`	Multiple stacked transformer encoder layers.
`UnetBlock`	Upsampling U-net block
`UnetTower`	Upsampling U-net tower for the Borzoi model

Module Contents#

class grelu.model.blocks.LinearBlock(in_len: int, out_len: int, act_func: str = 'relu', dropout: float = 0.0, norm: bool = False, norm_kwargs: dict | None = None, bias: bool = True, dtype=None, device=None)[source]#

Bases: torch.nn.Module

Linear layer followed by optional normalization, activation and dropout.

Parameters:

in_len – Length of input
out_len – Length of output
act_func – Name of activation function
dropout – Dropout probability
norm – If True, apply layer normalization
norm_kwargs – Optional dictionary of keyword arguments to pass to the normalization layer
bias – If True, include bias term.
dtype – Data type of the weights
device – Device on which to store the weights

norm[source]#

linear[source]#

dropout[source]#

act[source]#

forward(x: torch.Tensor) → torch.Tensor[source]#

Forward pass

Parameters:: x – Input tensor of shape (N, C, L)
Returns:: Output tensor

class grelu.model.blocks.ConvBlock(in_channels: int, out_channels: int, kernel_size: int, dilation: int = 1, act_func: str = 'relu', pool_func: str | None = None, pool_size: str | None = None, dropout: float = 0.0, norm: bool = True, norm_type='batch', norm_kwargs: dict | None = None, residual: bool = False, order: str = 'CDNRA', bias: bool = True, return_pre_pool: bool = False, dtype=None, device=None, **kwargs)[source]#

Bases: torch.nn.Module

Convolutional layer along with optional normalization, activation, dilation, dropout, residual connection, and pooling. The order of these operations can be specified, except for pooling, which always comes last.

Parameters:

in_channels – Number of channels in the input
out_channels – Number of channels in the output
kernel_size – Convolutional kernel width
dilation – Dilation
act_func – Name of the activation function
pool_func – Name of the pooling function
pool_size – Pooling width
dropout – Dropout probability
norm – If True, apply normalization layer
norm_type – Type of normalization to apply: ‘batch’, ‘syncbatch’, ‘layer’, ‘instance’ or None
norm_kwargs – Additional arguments to be passed to the normalization layer
residual – If True, apply residual connection
order – A string representing the order in which operations are to be performed on the input. For example, “CDNRA” means that the operations will be performed in the order: convolution, dropout, batch norm, residual addition, activation. Pooling is not included as it is always performed last.
return_pre_pool – If this is True and pool_func is not None, the final output will be a tuple (output after pooling, output_before_pooling). This is useful if the output before pooling is required by a later layer.
dtype – Data type of the weights
device – Device on which to store the weights
**kwargs – Additional arguments to be passed to nn.Conv1d

order = 'CDNRA'[source]#

conv[source]#

act[source]#

pool[source]#

dropout[source]#

residual = False[source]#

return_pre_pool = False[source]#

forward(x: torch.Tensor) → torch.Tensor[source]#

Parameters:: x – Input data.

class grelu.model.blocks.ChannelTransformBlock(in_channels: int, out_channels: int, norm: bool = False, act_func: str = 'relu', dropout: float = 0.0, order: str = 'CDNA', norm_type='batch', norm_kwargs: dict | None = None, if_equal: bool = False, dtype=None, device=None)[source]#

Bases: torch.nn.Module

Convolutional layer with kernel size=1 along with optional normalization, activation and dropout

Parameters:

in_channels – Number of channels in the input
out_channels – Number of channels in the output
act_func – Name of the activation function
dropout – Dropout probability
norm_type – Type of normalization to apply: ‘batch’, ‘syncbatch’, ‘layer’, ‘instance’ or None
norm_kwargs – Optional dictionary of keyword arguments to pass to the normalization layers
order – A string representing the order in which operations are to be performed on the input. For example, “CDNA” means that the operations will be performed in the order: convolution, dropout, batch norm, activation.
if_equal – If True, create a layer even if the input and output channels are equal.
device – Device on which to store the weights
dtype – Data type of the weights

order = 'CDNA'[source]#

conv[source]#

act[source]#

dropout[source]#

forward(x: torch.Tensor) → torch.Tensor[source]#

Forward pass

Parameters:: x – Input tensor of shape (N, C, L)
Returns:: Output tensor

class grelu.model.blocks.Stem(out_channels: int, kernel_size: int, act_func: str = 'relu', pool_func: str | None = None, pool_size: str | None = None, dtype=None, device=None)[source]#

Bases: torch.nn.Module

Convolutional layer followed by optional activation and pooling. Meant to take one-hot encoded DNA sequence as input

Parameters:

out_channels – Number of channels in the output
kernel_size – Convolutional kernel width
act_func – Name of the activation function
pool_func – Name of the pooling function
pool_size – Width of pooling layer
dtype – Data type of the weights
device – Device on which to store the weights

conv[source]#

act[source]#

pool[source]#

forward(x: torch.Tensor) → torch.Tensor[source]#

Forward pass

Parameters:: x – Input tensor of shape (N, C, L)
Returns:: Output tensor

class grelu.model.blocks.SeparableConv(in_channels: int, kernel_size: int, dtype=None, device=None)[source]#

Bases: torch.nn.Module

Equivalent class to tf.keras.layers.SeparableConv1D

Parameters:

in_channels – Number of channels in the input
kernel_size – Convolutional kernel width
dtype – Data type of the weights
device – Device on which to store the weights

depthwise[source]#

pointwise[source]#

forward(x: torch.Tensor) → torch.Tensor[source]#

Forward pass

Parameters:: x – Input tensor of shape (N, C, L)
Returns:: Output tensor

class grelu.model.blocks.ConvTower(stem_channels: int, stem_kernel_size: int, n_blocks: int = 2, channel_init: int = 16, channel_mult: float = 1, kernel_size: int = 5, dilation_init: int = 1, dilation_mult: float = 1, act_func: str = 'relu', norm: bool = False, norm_kwargs: dict | None = None, pool_func: str | None = None, pool_size: int | None = None, residual: bool = False, dropout: float = 0.0, order: str = 'CDNRA', crop_len: int | str = 0, dtype=None, device=None)[source]#

Bases: torch.nn.Module

A module that consists of multiple convolutional blocks and takes a one-hot encoded DNA sequence as input.

Parameters:

n_blocks – Number of convolutional blocks, including the stem
stem_channels – Number of channels in the stem,
stem_kernel_size – Kernel width for the stem
kernel_size – Convolutional kernel width
channel_init – Initial number of channels,
channel_mult – Factor by which to multiply the number of channels in each block
dilation_init – Initial dilation
dilation_mult – Factor by which to multiply the dilation in each block
act_func – Name of the activation function
pool_func – Name of the pooling function
pool_size – Width of the pooling layers
dropout – Dropout probability
norm – If True, apply batch norm
norm_kwargs – Optional dictionary of keyword arguments to pass to the normalization layers
residual – If True, apply residual connection
order – A string representing the order in which operations are to be performed on the input. For example, “CDNRA” means that the operations will be performed in the order: convolution, dropout, batch norm, residual addition, activation. Pooling is not included as it is always performed last.
crop_len – Number of positions to crop at either end of the output
dtype – Data type of the weights
device – Device on which to store

blocks[source]#

receptive_field[source]#

pool_factor = 1[source]#

out_channels[source]#

crop[source]#

forward(x: torch.Tensor) → torch.Tensor[source]#

Forward pass

Parameters:: x – Input tensor of shape (N, C, L)
Returns:: Output tensor

class grelu.model.blocks.FeedForwardBlock(in_len: int, dropout: float = 0.0, act_func: str = 'relu', norm_kwargs: dict | None = None, **kwargs)[source]#

Bases: torch.nn.Module

2-layer feed-forward network. Can be used to follow layers such as GRU and attention.

Parameters:

in_len – Length of the input tensor
dropout – Dropout probability
act_func – Name of the activation function
norm_kwargs – Optional dictionary of keyword arguments to pass to the normalization layers
**kwargs – Additional arguments to be passed to the linear layers

dense1[source]#

dense2[source]#

forward(x: torch.Tensor) → torch.Tensor[source]#

Forward pass

Parameters:: x – Input tensor of shape (N, C, L)
Returns:: Output tensor

class grelu.model.blocks.GRUBlock(in_channels: int, n_layers: int = 1, dropout: float = 0.0, act_func: str = 'relu', norm: bool = False, dtype=None, device=None)[source]#

Bases: torch.nn.Module

Stacked bidirectional GRU layers followed by a feed-forward network.

Parameters:

in_channels – The number of channels in the input
n_layers – The number of GRU layers
gru_hidden_size – Number of hidden elements in GRU layers
dropout – Dropout probability
act_func – Name of the activation function for feed-forward network
norm – If True, include layer normalization in feed-forward network.
dtype – Data type of the weights
device – Device on which to store the weights

gru[source]#

ffn[source]#

forward(x: torch.Tensor) → torch.Tensor[source]#

Forward pass

Parameters:: x – Input tensor of shape (N, C, L)
Returns:: Output tensor

class grelu.model.blocks.TransformerBlock(in_len: int, n_heads: int, attn_dropout: float, ff_dropout: float, flash_attn: bool, n_pos_features: int | None = None, key_len: int | None = None, value_len: int | None = None, pos_dropout: float | None = None, norm_kwargs: dict | None = None, dtype=None, device=None)[source]#

Bases: torch.nn.Module

A block containing a multi-head attention layer followed by a feed-forward network and residual connections.

Parameters:

in_len – Length of the input
n_heads – Number of attention heads
attn_dropout – Dropout probability in the output layer
ff_droppout – Dropout probability in the linear feed-forward layers
flash_attn – If True, uses Flash Attention with Rotational Position Embeddings. key_len, value_len, pos_dropout and n_pos_features are ignored.
n_pos_features – Number of positional embedding features
key_len – Length of the key vectors
value_len – Length of the value vectors.
pos_dropout – Dropout probability in the positional embeddings
norm_kwargs – Optional dictionary of keyword arguments to pass to the normalization layers
dtype – Data type of the weights
device – Device on which to store the weights

flash_attn_warn = False[source]#

norm[source]#

dropout[source]#

ffn[source]#

forward(x: torch.Tensor) → torch.Tensor[source]#

Forward pass

Parameters:: x – Input tensor of shape (N, C, L)
Returns:: Output tensor

class grelu.model.blocks.TransformerTower(in_channels: int, n_blocks: int = 1, n_heads: int = 1, n_pos_features: int = 32, key_len: int = 64, value_len: int = 64, pos_dropout: float = 0.0, attn_dropout: float = 0.0, ff_dropout: float = 0.0, norm_kwargs: dict | None = None, flash_attn: bool = False, dtype=None, device=None)[source]#

Bases: torch.nn.Module

Multiple stacked transformer encoder layers.

Parameters:

in_channels – Number of channels in the input
n_blocks – Number of stacked transformer blocks
n_heads – Number of attention heads
n_pos_features – Number of positional embedding features
key_len – Length of the key vectors
value_len – Length of the value vectors.
pos_dropout – Dropout probability in the positional embeddings
attn_dropout – Dropout probability in the attention layer
ff_dropout – Dropout probability in the feed-forward layers
norm_kwargs – Optional dictionary of keyword arguments to pass to the normalization layers
flash_attn – If True, uses Flash Attention with Rotational Position Embeddings
dtype – Data type of the weights
device – Device on which to store the weights

blocks[source]#

forward(x: torch.Tensor) → torch.Tensor[source]#

Forward pass

Parameters:: x – Input tensor of shape (N, C, L)
Returns:: Output tensor

class grelu.model.blocks.UnetBlock(in_channels: int, y_in_channels: int, norm_type='batch', norm_kwargs: dict | None = None, act_func='gelu_borzoi', dtype=None, device=None)[source]#

Bases: torch.nn.Module

Upsampling U-net block

Parameters:

in_channels – Number of channels in the input
y_in_channels – Number of channels in the higher-resolution representation.
norm_type – Type of normalization to apply: ‘batch’, ‘syncbatch’, ‘layer’, ‘instance’ or None
norm_kwargs – Optional dictionary of keyword arguments to pass to the normalization layers
act_func – Name of the activation function. Defaults to ‘gelu_borzoi’ which uses tanh approximation (different from PyTorch’s default GELU implementation).
dtype – Data type of the weights
device – Device on which to store the weights

conv[source]#

upsample[source]#

channel_transform[source]#

sconv[source]#

forward(x: torch.Tensor, y: torch.Tensor) → torch.Tensor[source]#

Forward pass

Parameters:: x – Input tensor of shape (N, C, L)
Returns:: Output tensor

class grelu.model.blocks.UnetTower(in_channels: int, y_in_channels: List[int], n_blocks: int, act_func: str = 'gelu_borzoi', **kwargs)[source]#

Bases: torch.nn.Module

Upsampling U-net tower for the Borzoi model

Parameters:

in_channels – Number of channels in the input
y_in_channels – Number of channels in the higher-resolution representations.
n_blocks – Number of U-net blocks
act_func – Name of the activation function. Defaults to ‘gelu_borzoi’ which uses tanh approximation (different from PyTorch’s default GELU implementation).
kwargs – Additional arguments to be passed to the U-net blocks

blocks[source]#

forward(x: torch.Tensor, ys: List[torch.Tensor]) → torch.Tensor[source]#

Forward pass

Parameters:

x – Input tensor of shape (N, C, L)
ys – Higher-resolution representations

Returns:

Output tensor

grelu.model.blocks#

Classes#

Module Contents#

This Page