textbox.module.Attention

Attention Layers

class textbox.module.Attention.attention_mechanism.BahdanauAttention(source_size, target_size)[source]

Bases: Module

Bahdanau Attention is proposed in the following paper:

Neural Machine Translation by Jointly Learning to Align and Translate.

Reference:

https://arxiv.org/abs/1409.0473
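
As a rough sketch (the exact parameterization inside this module, e.g. bias terms, may differ), the additive score between decoder state s_t and encoder output h_j is

    e_{t,j} = v^T tanh(W_1 s_t + W_2 h_j),        probs_{t,j} = softmax_j(e_{t,j})

and the returned context_t is the probs-weighted sum of the encoder outputs.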

forward(hidden_states, encoder_outputs, encoder_masks)[source]

Bahdanau attention

Parameters
  • hidden_states – shape: [batch_size, tgt_len, target_size]

  • encoder_outputs – shape: [batch_size, src_len, source_size]

  • encoder_masks – shape: [batch_size, src_len]

Returns

  • context: shape: [batch_size, tgt_len, source_size]

  • probs: shape: [batch_size, tgt_len, src_len]

Return type

tuple
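
A minimal usage sketch; the concrete sizes (source_size=16, target_size=32, batch_size=2, src_len=5, tgt_len=3) and the all-ones mask are assumed for illustration, not taken from the library's own examples:

>>> import torch
>>> from textbox.module.Attention.attention_mechanism import BahdanauAttention
>>> attn = BahdanauAttention(source_size=16, target_size=32)
>>> hidden_states = torch.randn(2, 3, 32)    # [batch_size, tgt_len, target_size]
>>> encoder_outputs = torch.randn(2, 5, 16)  # [batch_size, src_len, source_size]
>>> encoder_masks = torch.ones(2, 5)         # [batch_size, src_len], 1 = valid position
>>> context, probs = attn(hidden_states, encoder_outputs, encoder_masks)
>>> # per the shapes documented above: context [2, 3, 16], probs [2, 3, 5]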

score(hidden_states, encoder_outputs)[source]

Calculate the attention scores between encoder outputs and decoder states.

training: bool
class textbox.module.Attention.attention_mechanism.LuongAttention(source_size, target_size, alignment_method='concat', is_coverage=False)[source]

Bases: Module

Luong Attention is proposed in the following paper: Effective Approaches to Attention-based Neural Machine Translation.

Reference:

https://arxiv.org/abs/1508.04025

forward(hidden_states, encoder_outputs, encoder_masks, coverages=None)[source]

Luong attention

Parameters
  • hidden_states – shape: [batch_size, tgt_len, target_size]

  • encoder_outputs – shape: [batch_size, src_len, source_size]

  • encoder_masks – shape: [batch_size, src_len]

Returns

  • context: shape: [batch_size, tgt_len, source_size]

  • probs: shape: [batch_size, tgt_len, src_len]

Return type

tuple
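
A usage sketch analogous to the Bahdanau example above; alignment_method='concat' is simply the constructor default, and passing coverages is presumably only meaningful when is_coverage=True:

>>> import torch
>>> from textbox.module.Attention.attention_mechanism import LuongAttention
>>> attn = LuongAttention(source_size=16, target_size=32, alignment_method='concat')
>>> hidden_states = torch.randn(2, 3, 32)    # [batch_size, tgt_len, target_size]
>>> encoder_outputs = torch.randn(2, 5, 16)  # [batch_size, src_len, source_size]
>>> encoder_masks = torch.ones(2, 5)         # [batch_size, src_len]
>>> context, probs = attn(hidden_states, encoder_outputs, encoder_masks)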

score(hidden_states, encoder_outputs, coverages=None)[source]

Calculate the attention scores between encoder outputs and decoder states.

training: bool
class textbox.module.Attention.attention_mechanism.MonotonicAttention(source_size, target_size, init_r=-4)[source]

Bases: Module

Monotonic Attention is proposed in the following paper:

Online and Linear-Time Attention by Enforcing Monotonic Alignments.

Reference:

https://arxiv.org/abs/1704.00784

exclusive_cumprod(x)[source]

Exclusive cumulative product [a, b, c] => [1, a, a * b]
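
A stand-alone sketch of the same operation in plain PyTorch (the module's own implementation may differ in detail):

>>> import torch
>>> x = torch.tensor([0.5, 0.4, 0.3])
>>> shifted = torch.cat([torch.ones(1), x[:-1]])  # prepend 1, drop the last element
>>> torch.cumprod(shifted, dim=0)                 # [1, a, a*b]
tensor([1.0000, 0.5000, 0.2000])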

gaussian_noise(*size)[source]

Additive Gaussian noise to encourage discreteness.

hard(hidden_states, encoder_outputs, encoder_masks, previous_probs=None)[source]

Hard monotonic attention (used at inference/test time)

Parameters
  • hidden_states – shape: [batch_size, tgt_len, target_size]

  • encoder_outputs – shape: [batch_size, src_len, source_size]

  • encoder_masks – shape: [batch_size, src_len]

  • previous_probs – shape: [batch_size, tgt_len, src_len]

Returns

  • context: shape: [batch_size, tgt_len, source_size]

  • probs: shape: [batch_size, tgt_len, src_len]

Return type

tuple
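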

safe_cumprod(x)[source]

Numerically stable cumulative product, computed via a cumulative sum in log-space.
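
The usual trick, shown here as a rough stand-alone sketch: clamp the inputs away from zero, take logs, cumulatively sum, and exponentiate.

>>> import torch
>>> def safe_cumprod_sketch(x, eps=1e-10):
...     # cumprod(x) == exp(cumsum(log(x))), computed in log-space for stability
...     return torch.exp(torch.cumsum(torch.log(torch.clamp(x, eps, 1.0)), dim=-1))
>>> safe_cumprod_sketch(torch.tensor([0.9, 0.5, 0.1]))
tensor([0.9000, 0.4500, 0.0450])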

score(hidden_states, encoder_outputs)[source]

Calculate the attention scores between encoder outputs and decoder states.

soft(hidden_states, encoder_outputs, encoder_masks, previous_probs=None)[source]

Soft monotonic attention (used during training)

Parameters
  • hidden_states – shape: [batch_size, tgt_len, target_size]

  • encoder_outputs – shape: [batch_size, src_len, source_size]

  • encoder_masks – shape: [batch_size, src_len]

  • previous_probs – shape: [batch_size, tgt_len, src_len]

Returns

  • context: shape: [batch_size, tgt_len, source_size]

  • probs: shape: [batch_size, tgt_len, src_len]

Return type

tuple
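
A sketch of how the two modes are typically driven, one decoder step at a time; the stepwise tgt_len=1 usage, the None initialization, and feeding probs back as previous_probs are assumptions about intended use, not behavior verified against the library:

>>> import torch
>>> from textbox.module.Attention.attention_mechanism import MonotonicAttention
>>> attn = MonotonicAttention(source_size=16, target_size=32)
>>> hidden_states = torch.randn(2, 1, 32)    # one decoder step: [batch_size, 1, target_size]
>>> encoder_outputs = torch.randn(2, 5, 16)  # [batch_size, src_len, source_size]
>>> encoder_masks = torch.ones(2, 5)
>>> # training: differentiable expected alignments via soft()
>>> context, probs = attn.soft(hidden_states, encoder_outputs, encoder_masks, previous_probs=None)
>>> context, probs = attn.soft(hidden_states, encoder_outputs, encoder_masks, previous_probs=probs)
>>> # inference: discrete left-to-right alignments, driven the same way via hard()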

training: bool
class textbox.module.Attention.attention_mechanism.MultiHeadAttention(embedding_size, num_heads, attn_weight_dropout_ratio=0.0, return_distribute=False)[source]

Bases: Module

Multi-head Attention is proposed in the following paper:

Attention Is All You Need.

Reference:

https://arxiv.org/abs/1706.03762
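
The per-head computation is the scaled dot-product attention of the referenced paper; head splitting/merging and the optional dropout on the attention weights are handled inside the module:

    Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V,        d_k = embedding_size / num_heads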

forward(query, key, value, key_padding_mask=None, attn_mask=None)[source]

Multi-head attention

Parameters
  • query – shape: [batch_size, tgt_len, embedding_size]

  • key and value – shape: [batch_size, src_len, embedding_size]

  • key_padding_mask – shape: [batch_size, src_len]

  • attn_mask – shape: [batch_size, tgt_len, src_len]

Returns

  • attn_repre: shape: [batch_size, tgt_len, embedding_size]

  • attn_weights: shape: [batch_size, tgt_len, src_len]

Return type

tuple
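
A shape-level usage sketch for self-attention with a causal mask; the boolean masking convention (True = position is blocked) follows common PyTorch practice and is an assumption, not something verified against this module:

>>> import torch
>>> from textbox.module.Attention.attention_mechanism import MultiHeadAttention
>>> attn = MultiHeadAttention(embedding_size=64, num_heads=8)
>>> x = torch.randn(2, 7, 64)  # [batch_size, seq_len, embedding_size]
>>> causal = torch.triu(torch.ones(2, 7, 7, dtype=torch.bool), diagonal=1)  # [batch_size, tgt_len, src_len]
>>> outputs = attn(x, x, x, attn_mask=causal)  # documented above as (attn_repre, attn_weights)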

reset_parameters()[source]
training: bool
class textbox.module.Attention.attention_mechanism.SelfAttentionMask(init_size=100)[source]

Bases: Module

forward(size)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

static get_mask(size)[source]
training: bool
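
A sketch of the intended use: the mask blocks attention to future positions. The exact dtype and True/False convention of the returned mask are assumptions; the plain-PyTorch expression below shows the upper-triangular pattern being described.

>>> import torch
>>> from textbox.module.Attention.attention_mechanism import SelfAttentionMask
>>> mask_module = SelfAttentionMask(init_size=100)
>>> mask = mask_module(4)  # causal mask for a length-4 sequence
>>> torch.triu(torch.ones(4, 4, dtype=torch.bool), diagonal=1)  # roughly equivalent pattern
tensor([[False,  True,  True,  True],
        [False, False,  True,  True],
        [False, False, False,  True],
        [False, False, False, False]])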