textbox.module.layers

Common Layers in text generation

class textbox.module.layers.Highway(num_highway_layers, input_size)[source]

Bases: Module

Highway Layers

Parameters
  • num_highway_layers (int) – number of highway layers.

  • input_size (int) – size (feature dimension) of the highway input.
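The page does not spell out what a highway layer computes. The sketch below assumes the common formulation, in which each layer mixes a transformed input with the untouched input through a learned sigmoid gate. HighwaySketch and its attribute names are hypothetical and are not taken from the TextBox source, whose implementation may differ in detail.

```python
import torch
import torch.nn as nn


class HighwaySketch(nn.Module):
    """Hypothetical stand-in for Highway, NOT the TextBox implementation."""

    def __init__(self, num_highway_layers, input_size):
        super().__init__()
        # One transform and one gate projection per highway layer.
        self.transforms = nn.ModuleList(
            [nn.Linear(input_size, input_size) for _ in range(num_highway_layers)]
        )
        self.gates = nn.ModuleList(
            [nn.Linear(input_size, input_size) for _ in range(num_highway_layers)]
        )

    def forward(self, x):
        for transform, gate in zip(self.transforms, self.gates):
            g = torch.sigmoid(gate(x))     # gate values in (0, 1)
            h = torch.relu(transform(x))   # candidate transformation
            x = g * h + (1.0 - g) * x      # gated mix of transform and carry
        return x


torch.manual_seed(0)
layer = HighwaySketch(num_highway_layers=2, input_size=8)
x = torch.randn(4, 8)
y = layer(x)  # the feature dimension is preserved
```

Because every projection maps input_size to input_size, layers of this kind can be stacked without changing the tensor shape.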

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
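The note above can be illustrated with a plain torch.nn.Linear module, used here only as a stand-in for any Module subclass. The hook function log_hook and the list calls are hypothetical names introduced for this sketch.

```python
import torch
import torch.nn as nn

calls = []


def log_hook(module, inputs, output):
    # Record the class name of the module each time the hook fires.
    calls.append(type(module).__name__)


layer = nn.Linear(4, 4)
handle = layer.register_forward_hook(log_hook)

x = torch.randn(2, 4)
layer(x)          # calling the instance runs forward() and then the hook
layer.forward(x)  # calling forward() directly bypasses the hook

handle.remove()   # detach the hook once it is no longer needed
```

After both calls, calls holds a single entry, because only the call through the module instance triggered the hook.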

training: bool
class textbox.module.layers.TransformerLayer(embedding_size, ffn_size, num_heads, attn_dropout_ratio=0.0, attn_weight_dropout_ratio=0.0, ffn_dropout_ratio=0.0, with_external=False)[source]

Bases: Module

Transformer layer, consisting of a multi-head self-attention sublayer, an external multi-head attention sublayer (only for conditional decoders) and a point-wise feed-forward sublayer.

Parameters
  • self_padding_mask (torch.bool) – the padding mask for the multi head attention sublayer.

  • self_attn_mask (torch.bool) – the attention mask for the multi head attention sublayer.

  • external_states (torch.Tensor) – the external context for decoder, e.g., hidden states from encoder.

  • external_padding_mask (torch.bool) – the padding mask for the external states.

Returns

the output of the point-wise feed-forward sublayer, which is also the output of the transformer layer.

Return type

feedforward_output (torch.Tensor)
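As a rough illustration of the two boolean masks described in the parameters, the sketch below builds a padding mask and a causal attention mask and passes them to torch.nn.MultiheadAttention. That module is used here as a stand-in for the self-attention sublayer, not as the TextBox implementation. The convention that True marks a position to be ignored is the one used by torch.nn.MultiheadAttention; whether TransformerLayer follows the same convention is an assumption that should be checked against the source. The variable lengths is hypothetical example data.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
batch_size, seq_len, embedding_size, num_heads = 2, 5, 16, 4

# Hypothetical sequence lengths: the second sequence has two padded positions.
lengths = torch.tensor([5, 3])
positions = torch.arange(seq_len).unsqueeze(0)            # shape (1, seq_len)
self_padding_mask = positions >= lengths.unsqueeze(1)     # True = padding

# Causal mask: True above the diagonal blocks attention to future positions.
self_attn_mask = torch.triu(
    torch.ones(seq_len, seq_len, dtype=torch.bool), diagonal=1
)

attention = nn.MultiheadAttention(embedding_size, num_heads, batch_first=True)
x = torch.randn(batch_size, seq_len, embedding_size)
output, weights = attention(
    x, x, x,
    key_padding_mask=self_padding_mask,
    attn_mask=self_attn_mask,
)
```

Here weights has shape (batch_size, seq_len, seq_len), and the columns that correspond to padded keys receive zero attention weight.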

forward(x, kv=None, self_padding_mask=None, self_attn_mask=None, external_states=None, external_padding_mask=None)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

gelu(x)[source]
reset_parameters()[source]
training: bool
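The gelu method is listed without a description. Assuming it computes the standard Gaussian Error Linear Unit, the activation can be sketched with the standard library alone. The function names gelu_exact and gelu_tanh are hypothetical, and the page does not say which variant TextBox uses.

```python
import math


def gelu_exact(x):
    # GELU(x) = x * Phi(x), where Phi is the standard normal CDF.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_tanh(x):
    # Widely used tanh approximation of the same function.
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


samples = [-3.0, -1.0, 0.0, 0.5, 2.0]
exact_values = [gelu_exact(v) for v in samples]
approx_values = [gelu_tanh(v) for v in samples]
```

The two variants agree closely over typical activation ranges, so the choice between them mostly affects speed and exact numerical reproducibility.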