Transformer Decoder¶
- class textbox.module.Decoder.transformer_decoder.TransformerDecoder(embedding_size, ffn_size, num_dec_layers, num_heads, attn_dropout_ratio=0.0, attn_weight_dropout_ratio=0.0, ffn_dropout_ratio=0.0, with_external=True)[source]¶
Bases:
Module
The stacked Transformer decoder layers.
- forward(x, kv=None, self_padding_mask=None, self_attn_mask=None, external_states=None, external_padding_mask=None)[source]¶
Implement the decoding process step by step.
- Parameters
x (torch.Tensor) – target sequence embedding, shape: [batch_size, sequence_length, embedding_size].
kv (torch.Tensor) – the cached history latent vector, shape: [batch_size, sequence_length, embedding_size], default: None.
self_padding_mask (torch.Tensor) – padding mask of target sequence, shape: [batch_size, sequence_length], default: None.
self_attn_mask (torch.Tensor) – causal (upper-triangular) attention mask of the target sequence, preventing each position from attending to subsequent positions, shape: [batch_size, sequence_length, sequence_length], default: None.
external_states (torch.Tensor) – output features of the encoder, shape: [batch_size, source_length, feature_size], default: None.
external_padding_mask (torch.Tensor) – padding mask of the source sequence, shape: [batch_size, source_length], default: None.
- Returns
output features, shape: [batch_size, sequence_length, ffn_size].
- Return type
torch.Tensor
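The mask arguments above can be built with plain PyTorch. The sketch below is illustrative, not the library's own code: it constructs a causal self_attn_mask and a self_padding_mask with the documented shapes, assuming the convention that True marks a position to be masked out.

```python
import torch

batch_size, seq_len, embedding_size = 2, 5, 8

# Target embeddings: [batch_size, sequence_length, embedding_size].
x = torch.randn(batch_size, seq_len, embedding_size)

# Causal mask: position i may not attend to positions j > i.
# Shape: [batch_size, sequence_length, sequence_length].
self_attn_mask = (
    torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool), diagonal=1)
    .unsqueeze(0)
    .expand(batch_size, -1, -1)
)

# Padding mask: True where the target token is padding.
# Shape: [batch_size, sequence_length]. Here the second sequence
# has true length 3, so its last two positions are padded.
lengths = torch.tensor([5, 3])
self_padding_mask = torch.arange(seq_len).unsqueeze(0) >= lengths.unsqueeze(1)

print(self_attn_mask.shape)     # torch.Size([2, 5, 5])
print(self_padding_mask.shape)  # torch.Size([2, 5])
```

These tensors would then be passed to forward() along with the encoder output as external_states and the source-side padding mask as external_padding_mask.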
- training: bool¶