textbox.module.Attention¶
Attention Layers¶
- class textbox.module.Attention.attention_mechanism.BahdanauAttention(source_size, target_size)[source]¶
Bases: Module
Bahdanau Attention is proposed in the following paper: Neural Machine Translation by Jointly Learning to Align and Translate.
- forward(hidden_states, encoder_outputs, encoder_masks)[source]¶
Bahdanau attention
- Parameters
hidden_states – shape: [batch_size, tgt_len, target_size]
encoder_outputs – shape: [batch_size, src_len, source_size]
encoder_masks – shape: [batch_size, src_len]
- Returns
context: shape: [batch_size, tgt_len, source_size]
probs: shape: [batch_size, tgt_len, src_len]
- Return type
tuple
- score(hidden_states, encoder_outputs)[source]¶
Calculate the attention scores between encoder outputs and decoder states.
- training: bool¶
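A minimal usage sketch for BahdanauAttention, based only on the constructor and forward signatures documented above; the mask convention (1 marking valid source positions) is an assumption:

import torch
from textbox.module.Attention.attention_mechanism import BahdanauAttention

batch_size, src_len, tgt_len = 2, 7, 5
source_size, target_size = 16, 32

attention = BahdanauAttention(source_size, target_size)

hidden_states = torch.randn(batch_size, tgt_len, target_size)      # decoder hidden states
encoder_outputs = torch.randn(batch_size, src_len, source_size)    # encoder outputs
encoder_masks = torch.ones(batch_size, src_len, dtype=torch.long)  # assumed: 1 = valid source token

context, probs = attention(hidden_states, encoder_outputs, encoder_masks)
# context: [2, 5, 16]  -> [batch_size, tgt_len, source_size]
# probs:   [2, 5, 7]   -> [batch_size, tgt_len, src_len]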
- class textbox.module.Attention.attention_mechanism.LuongAttention(source_size, target_size, alignment_method='concat', is_coverage=False)[source]¶
Bases: Module
Luong Attention is proposed in the following paper: Effective Approaches to Attention-based Neural Machine Translation.
- forward(hidden_states, encoder_outputs, encoder_masks, coverages=None)[source]¶
Luong attention
- Parameters
hidden_states – shape: [batch_size, tgt_len, target_size]
encoder_outputs – shape: [batch_size, src_len, source_size]
encoder_masks – shape: [batch_size, src_len]
- Returns
context: shape: [batch_size, tgt_len, source_size]
probs: shape: [batch_size, tgt_len, src_len]
- Return type
tuple
- score(hidden_states, encoder_outputs, coverages=None)[source]¶
Calculate the attention scores between encoder outputs and decoder states.
- training: bool¶
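A minimal usage sketch for LuongAttention with the default alignment_method='concat'; shapes follow the documentation above, the mask convention (1 marking valid source positions) is an assumption, and coverages is left at its default of None:

import torch
from textbox.module.Attention.attention_mechanism import LuongAttention

batch_size, src_len, tgt_len = 2, 7, 5
source_size, target_size = 16, 32

attention = LuongAttention(source_size, target_size, alignment_method='concat', is_coverage=False)

hidden_states = torch.randn(batch_size, tgt_len, target_size)
encoder_outputs = torch.randn(batch_size, src_len, source_size)
encoder_masks = torch.ones(batch_size, src_len, dtype=torch.long)

context, probs = attention(hidden_states, encoder_outputs, encoder_masks)  # coverages=None
# context: [2, 5, 16], probs: [2, 5, 7]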
- class textbox.module.Attention.attention_mechanism.MonotonicAttention(source_size, target_size, init_r=-4)[source]¶
Bases: Module
Monotonic Attention is proposed in the following paper: Online and Linear-Time Attention by Enforcing Monotonic Alignments.
- hard(hidden_states, encoder_outputs, encoder_masks, previous_probs=None)[source]¶
Hard monotonic attention (used at test/inference time)
- Parameters
hidden_states – shape: [batch_size, tgt_len, target_size]
encoder_outputs – shape: [batch_size, src_len, source_size]
encoder_masks – shape: [batch_size, src_len]
previous_probs – shape: [batch_size, tgt_len, src_len]
- Returns
context: shape: [batch_size, tgt_len, source_size]
probs: shape: [batch_size, tgt_len, src_len]
- Return type
tuple
- score(hidden_states, encoder_outputs)[source]¶
Calculate the attention scores between encoder outputs and decoder states.
- soft(hidden_states, encoder_outputs, encoder_masks, previous_probs=None)[source]¶
Soft monotonic attention (used at training time)
- Parameters
hidden_states – shape: [batch_size, tgt_len, target_size]
encoder_outputs – shape: [batch_size, src_len, source_size]
encoder_masks – shape: [batch_size, src_len]
previous_probs – shape: [batch_size, tgt_len, src_len]
- Returns
context: shape: [batch_size, tgt_len, source_size]
probs: shape: [batch_size, tgt_len, src_len]
- Return type
tuple
- training: bool¶
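A minimal usage sketch for MonotonicAttention: soft() is documented as the training-time variant and hard() as the test-time variant; passing previous_probs=None to denote the first decoding step is an assumption:

import torch
from textbox.module.Attention.attention_mechanism import MonotonicAttention

batch_size, src_len, tgt_len = 2, 7, 1   # one decoding step at a time
source_size, target_size = 16, 32

attention = MonotonicAttention(source_size, target_size, init_r=-4)

hidden_states = torch.randn(batch_size, tgt_len, target_size)
encoder_outputs = torch.randn(batch_size, src_len, source_size)
encoder_masks = torch.ones(batch_size, src_len, dtype=torch.long)  # assumed: 1 = valid source token

# training: soft (expected) monotonic alignment
context, probs = attention.soft(hidden_states, encoder_outputs, encoder_masks, previous_probs=None)

# inference: hard monotonic alignment
context, probs = attention.hard(hidden_states, encoder_outputs, encoder_masks, previous_probs=None)
# context: [2, 1, 16], probs: [2, 1, 7]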
- class textbox.module.Attention.attention_mechanism.MultiHeadAttention(embedding_size, num_heads, attn_weight_dropout_ratio=0.0, return_distribute=False)[source]¶
Bases: Module
Multi-head Attention is proposed in the following paper: Attention Is All You Need.
- forward(query, key, value, key_padding_mask=None, attn_mask=None)[source]¶
Multi-head attention
- Parameters
query – shape: [batch_size, tgt_len, embedding_size]
key and value – shape: [batch_size, src_len, embedding_size]
key_padding_mask – shape: [batch_size, src_len]
attn_mask – shape: [batch_size, tgt_len, src_len]
- Returns
attn_repre: shape: [batch_size, tgt_len, embedding_size]
attn_weights: shape: [batch_size, tgt_len, src_len]
- Return type
tuple
- training: bool¶
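A minimal usage sketch for MultiHeadAttention in a self-attention setting (query, key, and value taken from the same sequence); enabling return_distribute=True so that the attention weights in the returned tuple are populated is an assumption:

import torch
from textbox.module.Attention.attention_mechanism import MultiHeadAttention

batch_size, seq_len = 2, 7
embedding_size, num_heads = 64, 8   # embedding_size is split across the heads

attention = MultiHeadAttention(embedding_size, num_heads,
                               attn_weight_dropout_ratio=0.1,
                               return_distribute=True)  # assumption: required to obtain attn_weights

states = torch.randn(batch_size, seq_len, embedding_size)
attn_repre, attn_weights = attention(query=states, key=states, value=states)
# attn_repre:   [2, 7, 64] -> [batch_size, tgt_len, embedding_size]
# attn_weights: [2, 7, 7]  -> [batch_size, tgt_len, src_len]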
- class textbox.module.Attention.attention_mechanism.SelfAttentionMask(init_size=100)[source]¶
Bases: Module
- forward(size)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
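A minimal sketch for SelfAttentionMask. The entry above only carries the generic Module forward note, so the interpretation that forward(size) builds a square mask over size positions (e.g. to pass as attn_mask in decoder self-attention) and that init_size only sets the pre-built cache size is an assumption:

import torch
from textbox.module.Attention.attention_mechanism import SelfAttentionMask

mask_builder = SelfAttentionMask(init_size=100)
mask = mask_builder(10)
print(mask.shape)  # assumed: a [10, 10] mask over target positions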