bitorch_engine.layers.qembedding.binary.layer.BinaryEmbeddingBag

class bitorch_engine.layers.qembedding.binary.layer.BinaryEmbeddingBag(*args: int, num_embeddings: int, embedding_dim: int, padding_idx: int | None = None, **kwargs: int)[source]

A binary embedding bag implementation.

This module implements a binarized version of the standard embedding layer, utilizing boolean weights for embeddings. It is specifically designed for scenarios requiring binary weight parameters and includes an experimental optimizer for training.

Note

This module is EXPERIMENTAL and not guaranteed to be error-free or always operational.

Training requires a custom optimizer due to the boolean nature of weight parameters and gradients.

Note on Boolean Weight Representation in PyTorch:

PyTorch represents boolean (bool) type tensors using the Char type, which occupies 8 bits per value. Thus, despite being boolean in nature, the weights in this implementation are not truly 1-bit weights, as each boolean value is stored in an 8-bit format. This is important to consider when evaluating the memory efficiency and computational performance of models using these binary weights.

The boolean weight has shape (num_embeddings, embedding_dim), the same as a standard embedding layer, yielding a 4x memory reduction relative to float32 weights because each value is stored as an 8-bit Char rather than a 32-bit float.
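The storage behavior described above can be checked directly in plain PyTorch; this snippet only inspects tensor element sizes and does not use bitorch_engine:

```python
import torch

num_embeddings, embedding_dim = 1000, 64

# Boolean tensors in PyTorch occupy one byte (8 bits) per value,
# not one bit, as the note above explains.
bool_weight = torch.zeros(num_embeddings, embedding_dim, dtype=torch.bool)
float_weight = torch.zeros(num_embeddings, embedding_dim, dtype=torch.float32)

print(bool_weight.element_size())   # bytes per bool value: 1
print(float_weight.element_size())  # bytes per float32 value: 4

# 4x reduction relative to float32 -- not the 32x a true
# 1-bit packed representation would give.
ratio = float_weight.element_size() / bool_weight.element_size()
print(ratio)
```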

Note on Optimizer Requirements for Boolean Weights:

When both weights and their gradients are of boolean type, the optimizer must employ a specialized update mechanism. Traditional gradient descent methods cannot be directly applied since boolean values do not support the typical arithmetic operations involved in weight updates. Instead, the optimizer should implement logic that decides the binary state of weights based on certain criteria or rules derived from the boolean gradients. This might involve strategies like flipping the state of a weight based on the presence or absence of a gradient, or using a voting system across multiple training steps to determine the change. The development of such an optimizer requires careful consideration to effectively train models with binary weights while adhering to the limitations and characteristics of boolean algebra.
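One way to make the flip-based update concrete is an XOR between the boolean weights and a boolean mask of positions the optimizer has decided to toggle. This is only an illustrative sketch of the strategy described above, not the actual bitorch_engine optimizer; the `flip_mask` selection rule here is a hypothetical placeholder:

```python
import torch

torch.manual_seed(0)

# Hypothetical boolean weight matrix and a boolean "decision" mask:
# True marks a position whose state the optimizer chose to flip
# (the ~10% random rule is purely for illustration).
weight = torch.randint(0, 2, (4, 8), dtype=torch.bool)
flip_mask = torch.rand(4, 8) < 0.1

# Boolean values do not support arithmetic weight updates; XOR with the
# mask toggles exactly the selected positions and leaves the rest intact.
updated = weight ^ flip_mask

# Positions that changed are exactly those the mask selected.
changed = updated != weight
```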

Parameters:
  • num_embeddings (int) – Size of the dictionary of embeddings.

  • embedding_dim (int) – The size of each embedding vector.

  • padding_idx (Optional[int]) – Specifies a padding index. Embeddings at this index will be zeroed out.

Methods

__init__

Initialize internal Module state, shared by both nn.Module and ScriptModule.

forward

Computes the binary embedding bag for the given input indices.

reset_parameters

Resets parameters by zeroing out the padding index if specified.

Attributes

training

__init__(*args: int, num_embeddings: int, embedding_dim: int, padding_idx: int | None = None, **kwargs: int) None[source]

Initialize internal Module state, shared by both nn.Module and ScriptModule.

forward(input: Tensor) Tensor[source]

Computes the binary embedding bag for the given input indices.

Parameters:

input (Tensor) – Tensor of indices to fetch embeddings for.

Returns:

The resulting tensor after applying binary embedding bag logic.

Return type:

Tensor
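Since the module is experimental, its exact forward semantics are not specified here. The following plain-PyTorch sketch only illustrates the general shape of an embedding-bag lookup over boolean weights; the {False, True} to {-1, +1} mapping and the mean reduction are assumptions for illustration, not the module's documented behavior:

```python
import torch

num_embeddings, embedding_dim = 10, 4

# Boolean weight table, shaped like the module's weight parameter.
weight = torch.randint(0, 2, (num_embeddings, embedding_dim), dtype=torch.bool)

# A bag of indices, analogous to the `input` tensor passed to forward().
indices = torch.tensor([1, 3, 7])

# Gather the boolean rows, map {False, True} -> {-1, +1} (an assumed
# binarization convention), then reduce the bag by averaging.
rows = weight[indices].to(torch.float32) * 2 - 1
output = rows.mean(dim=0)

print(output.shape)  # one embedding_dim-sized vector per bag
```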

reset_parameters() None[source]

Resets parameters by zeroing out the padding index if specified.