
Blocks composed of multiple layers.



  • LinearBlock – Linear layer followed by optional normalization, activation and dropout.

  • ConvBlock – Convolutional layer along with optional normalization, activation, dilation, dropout, residual connection, and pooling.

  • ChannelTransformBlock – Convolutional layer with kernel size 1 along with optional normalization, activation and dropout.

  • Stem – Convolutional layer followed by optional activation and pooling.

  • SeparableConv – Equivalent class to tf.keras.layers.SeparableConv1D.

  • ConvTower – A module that consists of multiple convolutional blocks and takes a one-hot encoded DNA sequence as input.

  • FeedForwardBlock – 2-layer feed-forward network. Can be used to follow layers such as GRU and attention.

  • GRUBlock – Stacked bidirectional GRU layers followed by a feed-forward network.

  • TransformerBlock – A block containing a multi-head attention layer followed by a feed-forward network and residual connections.

  • TransformerTower – Multiple stacked transformer encoder layers.

  • UnetBlock – Upsampling U-net block.

  • UnetTower – Upsampling U-net tower for the Borzoi model.

Module Contents

class grelu.model.blocks.LinearBlock(in_len: int, out_len: int, act_func: str = 'relu', dropout: float = 0.0, norm: bool = False, bias: bool = True)

Bases: torch.nn.Module

Linear layer followed by optional normalization, activation and dropout.

  • in_len – Length of input

  • out_len – Length of output

  • act_func – Name of activation function

  • dropout – Dropout probability

  • norm – If True, apply layer normalization

  • bias – If True, include bias term.

forward(x: torch.Tensor) → torch.Tensor

Forward pass

  • x – Input tensor of shape (N, C, L)

Returns: Output tensor

class grelu.model.blocks.ConvBlock(in_channels: int, out_channels: int, kernel_size: int, dilation: int = 1, act_func: str = 'relu', pool_func: str | None = None, pool_size: str | None = None, dropout: float = 0.0, norm: bool = True, residual: bool = False, order: str = 'CDNRA', bias: bool = True, return_pre_pool: bool = False, **kwargs)

Bases: torch.nn.Module

Convolutional layer along with optional normalization, activation, dilation, dropout, residual connection, and pooling. The order of these operations can be specified, except for pooling, which always comes last.

  • in_channels – Number of channels in the input

  • out_channels – Number of channels in the output

  • kernel_size – Convolutional kernel width

  • dilation – Dilation of the convolutional kernel

  • act_func – Name of the activation function

  • pool_func – Name of the pooling function

  • pool_size – Pooling width

  • dropout – Dropout probability

  • norm – If True, apply batch norm

  • residual – If True, apply residual connection

  • order – A string representing the order in which operations are to be performed on the input. For example, “CDNRA” means that the operations will be performed in the order: convolution, dropout, batch norm, residual addition, activation. Pooling is not included as it is always performed last.

  • return_pre_pool – If this is True and pool_func is not None, the final output will be a tuple (output after pooling, output_before_pooling). This is useful if the output before pooling is required by a later layer.

  • **kwargs – Additional arguments to be passed to nn.Conv1d

forward(x: torch.Tensor) → torch.Tensor

  • x – Input data.
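The `order` string described above can be read as a small program: each letter selects one operation, applied left to right. The following is a minimal sketch of that idea in plain PyTorch. `OrderedConvSketch` is a hypothetical class written for illustration and is not the gReLU implementation; it assumes equal input and output channels and "same" padding so that the residual addition is shape-compatible, and it omits pooling.

```python
import torch
from torch import nn


class OrderedConvSketch(nn.Module):
    """Hypothetical stand-in for illustration; not the gReLU implementation."""

    def __init__(self, channels: int, kernel_size: int, order: str = "CDNRA",
                 dropout: float = 0.0):
        super().__init__()
        unknown = set(order) - set("CDNRA")
        if unknown:
            raise ValueError(f"Unknown operation codes: {sorted(unknown)}")
        self.order = order
        # Assumption: "same" padding and equal in/out channels, so that the
        # residual addition ("R") matches the shape of the block input.
        self.conv = nn.Conv1d(channels, channels, kernel_size, padding="same")
        self.drop = nn.Dropout(dropout)
        self.norm = nn.BatchNorm1d(channels)
        self.act = nn.ReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        block_input = x
        for code in self.order:
            if code == "C":
                x = self.conv(x)
            elif code == "D":
                x = self.drop(x)
            elif code == "N":
                x = self.norm(x)
            elif code == "R":
                x = x + block_input  # residual addition
            elif code == "A":
                x = self.act(x)
        return x


block = OrderedConvSketch(channels=4, kernel_size=3, order="CDNRA").eval()
out = block(torch.randn(2, 4, 10))
print(out.shape)  # the (N, C, L) shape of the input is preserved
```

Changing `order` to, say, "CNAD" in this sketch would run batch norm and activation before dropout and skip the residual addition entirely.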

class grelu.model.blocks.ChannelTransformBlock(in_channels: int, out_channels: int, norm: bool = False, act_func: str = 'relu', dropout: float = 0.0, order: str = 'CDNA', if_equal: bool = False)

Bases: torch.nn.Module

Convolutional layer with kernel size 1 along with optional normalization, activation and dropout.

  • in_channels – Number of channels in the input

  • out_channels – Number of channels in the output

  • act_func – Name of the activation function

  • dropout – Dropout probability

  • norm – If True, apply batch norm

  • order – A string representing the order in which operations are to be performed on the input. For example, “CDNA” means that the operations will be performed in the order: convolution, dropout, batch norm, activation.

  • if_equal – If True, create a layer even if the input and output channels are equal.

forward(x: torch.Tensor) → torch.Tensor

Forward pass

  • x – Input tensor of shape (N, C, L)

Returns: Output tensor
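A convolution with kernel size 1 applies the same linear map to the channel vector at every position, which is why it changes the number of channels without touching the sequence length. The sketch below checks this with plain `torch.nn.Conv1d`; it does not use gReLU, and the channel counts are arbitrary values chosen for illustration.

```python
import torch
from torch import nn

conv = nn.Conv1d(in_channels=4, out_channels=6, kernel_size=1)
x = torch.randn(2, 4, 10)  # (N, C, L)
y = conv(x)                # channels change from 4 to 6, length stays 10

# The same result computed by hand: multiply the (6, 4) weight matrix with
# the channel vector at each position, then add the bias.
weight = conv.weight.squeeze(-1)  # (out_channels, in_channels)
y_manual = torch.einsum("oc,ncl->nol", weight, x) + conv.bias[None, :, None]

print(y.shape)
print(torch.allclose(y, y_manual, atol=1e-5))
```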

class grelu.model.blocks.Stem(out_channels: int, kernel_size: int, act_func: str = 'relu', pool_func: str | None = None, pool_size: str | None = None)

Bases: torch.nn.Module

Convolutional layer followed by optional activation and pooling. Meant to take a one-hot encoded DNA sequence as input.

  • out_channels – Number of channels in the output

  • kernel_size – Convolutional kernel width

  • act_func – Name of the activation function

  • pool_func – Name of the pooling function

  • pool_size – Width of pooling layer

forward(x: torch.Tensor) → torch.Tensor

Forward pass

  • x – Input tensor of shape (N, C, L)

Returns: Output tensor
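For readers unfamiliar with the input format, here is a minimal sketch of one-hot encoding a DNA sequence into a channels-by-length layout. The helper `one_hot_encode`, the channel order A, C, G, T, and the choice to encode any other character (such as N) as all zeros are assumptions made for this illustration, not taken from gReLU.

```python
from typing import List

BASES = "ACGT"  # assumed channel order


def one_hot_encode(seq: str) -> List[List[float]]:
    """Return a 4 x L nested list: one row per base channel, one column per position."""
    seq = seq.upper()
    return [[1.0 if base == channel else 0.0 for base in seq] for channel in BASES]


encoded = one_hot_encode("ACGTN")
for channel, row in zip(BASES, encoded):
    print(channel, row)
```

Stacking N such matrices along a leading batch axis gives a tensor with the (N, C, L) layout that `forward` expects, with C = 4 under the assumption above.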

class grelu.model.blocks.SeparableConv(in_channels: int, kernel_size: int)

Bases: torch.nn.Module

Equivalent class to tf.keras.layers.SeparableConv1D

  • in_channels – Number of channels in the input

  • kernel_size – Convolutional kernel width

forward(x: torch.Tensor) → torch.Tensor

Forward pass

  • x – Input tensor of shape (N, C, L)

Returns: Output tensor
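A separable convolution in the Keras sense is usually built as a depthwise convolution (one filter per input channel) followed by a pointwise convolution with kernel size 1 that mixes channels. The sketch below shows that standard construction in plain PyTorch. `SeparableConvSketch` is a hypothetical class; the "same" padding, the bias placement, and the equal input and output channel counts are assumptions, and the actual gReLU class may differ in these details.

```python
import torch
from torch import nn


class SeparableConvSketch(nn.Module):
    """Hypothetical depthwise + pointwise construction; not the gReLU implementation."""

    def __init__(self, in_channels: int, kernel_size: int):
        super().__init__()
        # Depthwise: groups=in_channels gives each channel its own filter.
        self.depthwise = nn.Conv1d(in_channels, in_channels, kernel_size,
                                   groups=in_channels, padding="same", bias=False)
        # Pointwise: kernel size 1 mixes information across channels.
        self.pointwise = nn.Conv1d(in_channels, in_channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.pointwise(self.depthwise(x))


layer = SeparableConvSketch(in_channels=8, kernel_size=5)
out = layer(torch.randn(2, 8, 20))
n_params = sum(p.numel() for p in layer.parameters())
print(out.shape, n_params)
```

In this sketch the layer has 8·5 depthwise weights plus 8·8 pointwise weights and 8 biases, far fewer than the 8·8·5 weights of a dense convolution with the same kernel width.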

class grelu.model.blocks.ConvTower(stem_channels: int, stem_kernel_size: int, n_blocks: int = 2, channel_init: int = 16, channel_mult: float = 1, kernel_size: int = 5, dilation_init: int = 1, dilation_mult: float = 1, act_func: str = 'relu', norm: bool = False, pool_func: str | None = None, pool_size: int | None = None, residual: bool = False, dropout: float = 0.0, order: str = 'CDNRA', crop_len: int | str = 0)

Bases: torch.nn.Module

A module that consists of multiple convolutional blocks and takes a one-hot encoded DNA sequence as input.

  • n_blocks – Number of convolutional blocks, including the stem

  • stem_channels – Number of channels in the stem

  • stem_kernel_size – Kernel width for the stem

  • kernel_size – Convolutional kernel width

  • channel_init – Initial number of channels

  • channel_mult – Factor by which to multiply the number of channels in each block

  • dilation_init – Initial dilation

  • dilation_mult – Factor by which to multiply the dilation in each block

  • act_func – Name of the activation function

  • pool_func – Name of the pooling function

  • pool_size – Width of the pooling layers

  • dropout – Dropout probability

  • norm – If True, apply batch norm

  • residual – If True, apply residual connection

  • order – A string representing the order in which operations are to be performed on the input. For example, “CDNRA” means that the operations will be performed in the order: convolution, dropout, batch norm, residual addition, activation. Pooling is not included as it is always performed last.

  • crop_len – Number of positions to crop at either end of the output

forward(x: torch.Tensor) → torch.Tensor

Forward pass

  • x – Input tensor of shape (N, C, L)

Returns: Output tensor
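To make the interplay of `n_blocks`, `channel_init`, `channel_mult`, `dilation_init` and `dilation_mult` concrete, the following sketch computes a per-block schedule of output channels and dilations. The helper `tower_schedule` is hypothetical. It assumes that the stem counts as the first block, that the first block after the stem uses `channel_init` and `dilation_init` unchanged, and that products are truncated with `int()`; the library may round or order these steps differently.

```python
from typing import List, Tuple


def tower_schedule(
    n_blocks: int,
    stem_channels: int,
    channel_init: int,
    channel_mult: float,
    dilation_init: int,
    dilation_mult: float,
) -> List[Tuple[int, int]]:
    """Return (out_channels, dilation) for each block, stem first."""
    schedule = [(stem_channels, 1)]  # assumption: the stem is not dilated
    channels, dilation = channel_init, dilation_init
    for _ in range(1, n_blocks):
        schedule.append((channels, dilation))
        # Scale both quantities for the next block.
        channels = int(channels * channel_mult)
        dilation = int(dilation * dilation_mult)
    return schedule


for out_channels, dilation in tower_schedule(
    n_blocks=4, stem_channels=64, channel_init=16,
    channel_mult=2, dilation_init=1, dilation_mult=2,
):
    print(out_channels, dilation)
```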

class grelu.model.blocks.FeedForwardBlock(in_len: int, dropout: float = 0.0, act_func: str = 'relu')

Bases: torch.nn.Module

2-layer feed-forward network. Can be used to follow layers such as GRU and attention.

  • in_len – Length of the input tensor

  • dropout – Dropout probability

  • act_func – Name of the activation function

forward(x: torch.Tensor) → torch.Tensor

Forward pass

  • x – Input tensor of shape (N, C, L)

Returns: Output tensor

class grelu.model.blocks.GRUBlock(in_channels: int, n_layers: int = 1, dropout: float = 0.0, act_func: str = 'relu', norm: bool = False)

Bases: torch.nn.Module

Stacked bidirectional GRU layers followed by a feed-forward network.

  • in_channels – The number of channels in the input

  • n_layers – The number of GRU layers

  • gru_hidden_size – Number of hidden elements in GRU layers

  • dropout – Dropout probability

  • act_func – Name of the activation function for feed-forward network

  • norm – If True, include layer normalization in feed-forward network.

forward(x: torch.Tensor) → torch.Tensor

Forward pass

  • x – Input tensor of shape (N, C, L)

Returns: Output tensor
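One way to build stacked bidirectional GRU layers followed by a feed-forward network while keeping the (N, C, L) layout is sketched below. `BiGRUSketch` is a hypothetical class, not the gReLU implementation. It assumes the GRU hidden size equals `in_channels`, that the forward and backward directions are combined by summation, and that the feed-forward network is two linear layers with a ReLU in between.

```python
import torch
from torch import nn


class BiGRUSketch(nn.Module):
    """Hypothetical bidirectional GRU + feed-forward block for (N, C, L) inputs."""

    def __init__(self, in_channels: int, n_layers: int = 1):
        super().__init__()
        self.gru = nn.GRU(input_size=in_channels, hidden_size=in_channels,
                          num_layers=n_layers, bidirectional=True,
                          batch_first=True)
        self.ffn = nn.Sequential(
            nn.Linear(in_channels, in_channels),
            nn.ReLU(),
            nn.Linear(in_channels, in_channels),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x.transpose(1, 2)    # (N, C, L) -> (N, L, C), as nn.GRU expects
        out, _ = self.gru(x)     # (N, L, 2 * C): both directions concatenated
        hidden = self.gru.hidden_size
        out = out[..., :hidden] + out[..., hidden:]  # assumption: sum directions
        out = self.ffn(out)
        return out.transpose(1, 2)  # back to (N, C, L)


gru_block = BiGRUSketch(in_channels=4, n_layers=2).eval()
gru_out = gru_block(torch.randn(2, 4, 10))
print(gru_out.shape)
```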

class grelu.model.blocks.TransformerBlock(in_len: int, n_heads: int, n_pos_features: int, key_len: int, value_len: int, pos_dropout: float, attn_dropout: float, ff_dropout: float)

Bases: torch.nn.Module

A block containing a multi-head attention layer followed by a feed-forward network and residual connections.

  • in_len – Length of the input

  • n_heads – Number of attention heads

  • n_pos_features – Number of positional embedding features

  • key_len – Length of the key vectors

  • value_len – Length of the value vectors.

  • pos_dropout – Dropout probability in the positional embeddings

  • attn_dropout – Dropout probability in the output layer

  • ff_dropout – Dropout probability in the linear feed-forward layers

forward(x: torch.Tensor) → torch.Tensor

Forward pass

  • x – Input tensor of shape (N, C, L)

Returns: Output tensor
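The general pattern of attention followed by a feed-forward network, each wrapped in a residual connection, can be sketched with PyTorch's built-in `nn.MultiheadAttention`. This is a swapped-in simplification: the hypothetical `AttentionBlockSketch` below has no positional embeddings, so `n_pos_features`, `key_len`, `value_len` and `pos_dropout` have no counterpart here. The pre-layer-norm arrangement, the hidden width of twice `in_len`, and the (N, L, in_len) input layout are all assumptions.

```python
import torch
from torch import nn


class AttentionBlockSketch(nn.Module):
    """Hypothetical attention + feed-forward block with residual connections."""

    def __init__(self, in_len: int, n_heads: int, ff_dropout: float = 0.0):
        super().__init__()
        self.norm1 = nn.LayerNorm(in_len)
        self.attn = nn.MultiheadAttention(in_len, n_heads, batch_first=True)
        self.norm2 = nn.LayerNorm(in_len)
        self.ffn = nn.Sequential(
            nn.Linear(in_len, 2 * in_len),
            nn.Dropout(ff_dropout),
            nn.ReLU(),
            nn.Linear(2 * in_len, in_len),
            nn.Dropout(ff_dropout),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x has shape (N, L, in_len) in this sketch.
        h = self.norm1(x)
        attn_out, _ = self.attn(h, h, h, need_weights=False)
        x = x + attn_out                 # residual around attention
        x = x + self.ffn(self.norm2(x))  # residual around feed-forward
        return x


attn_block = AttentionBlockSketch(in_len=8, n_heads=2).eval()
attn_out = attn_block(torch.randn(2, 10, 8))
print(attn_out.shape)
```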

class grelu.model.blocks.TransformerTower(in_channels: int, n_blocks: int = 1, n_heads: int = 1, n_pos_features: int = 32, key_len: int = 64, value_len: int = 64, pos_dropout: float = 0.0, attn_dropout: float = 0.0, ff_dropout: float = 0.0)

Bases: torch.nn.Module

Multiple stacked transformer encoder layers.

  • in_channels – Number of channels in the input

  • n_blocks – Number of stacked transformer blocks

  • n_heads – Number of attention heads

  • n_pos_features – Number of positional embedding features

  • key_len – Length of the key vectors

  • value_len – Length of the value vectors.

  • pos_dropout – Dropout probability in the positional embeddings

  • attn_dropout – Dropout probability in the output layer

  • ff_dropout – Dropout probability in the linear feed-forward layers

forward(x: torch.Tensor) → torch.Tensor

Forward pass

  • x – Input tensor of shape (N, C, L)

Returns: Output tensor

class grelu.model.blocks.UnetBlock(in_channels: int, y_in_channels: int)

Bases: torch.nn.Module

Upsampling U-net block

  • in_channels – Number of channels in the input

  • y_in_channels – Number of channels in the higher-resolution representation.

forward(x: torch.Tensor, y: torch.Tensor) → torch.Tensor

Forward pass

  • x – Input tensor of shape (N, C, L)

  • y – Higher-resolution representation

Returns: Output tensor
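The core idea of an upsampling U-net block is to bring the low-resolution input `x` up to the length of the higher-resolution representation `y` and then merge the two. The sketch below illustrates this with plain PyTorch. `UpsampleMergeSketch` is a hypothetical class; the factor-of-2 nearest-neighbour upsampling, the kernel-size-1 convolution used to match channel counts, and merging by addition are assumptions, and the gReLU block may contain further layers.

```python
import torch
from torch import nn


class UpsampleMergeSketch(nn.Module):
    """Hypothetical upsample-and-merge step; not the gReLU implementation."""

    def __init__(self, in_channels: int, y_in_channels: int):
        super().__init__()
        self.upsample = nn.Upsample(scale_factor=2, mode="nearest")
        # Project y to the same number of channels as x before adding.
        self.match_channels = nn.Conv1d(y_in_channels, in_channels, kernel_size=1)

    def forward(self, x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        x = self.upsample(x)               # (N, C, L) -> (N, C, 2L)
        return x + self.match_channels(y)  # y: (N, C_y, 2L) -> (N, C, 2L)


unet_block = UpsampleMergeSketch(in_channels=8, y_in_channels=4)
x = torch.randn(2, 8, 5)   # low-resolution input
y = torch.randn(2, 4, 10)  # higher-resolution representation
merged = unet_block(x, y)
print(merged.shape)
```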

class grelu.model.blocks.UnetTower(in_channels: int, y_in_channels: List[int], n_blocks: int)

Bases: torch.nn.Module

Upsampling U-net tower for the Borzoi model

  • in_channels – Number of channels in the input

  • y_in_channels – Number of channels in the higher-resolution representations.

  • n_blocks – Number of U-net blocks

forward(x: torch.Tensor, ys: List[torch.Tensor]) → torch.Tensor

Forward pass

  • x – Input tensor of shape (N, C, L)

  • ys – Higher-resolution representations

Returns: Output tensor