bitorch_engine.layers.qembedding.binary.layer.BinaryEmbeddingBagForward
- class bitorch_engine.layers.qembedding.binary.layer.BinaryEmbeddingBagForward(*args, **kwargs)[source]
An experimental PyTorch function implementing the forward pass of a binary embedding bag.
This class represents a custom autograd function for a binary embedding bag, designed to work with boolean weight parameters. Specialized optimizers are required for training due to the boolean nature of weights and their gradients.
The forward pass performs an embedding lookup and binarizes the output based on the majority of ones in the sliced embeddings.
Note: This class is experimental and may not be error-free or always functional.
- Parameters:
input (Tensor) – Input tensor containing indices for embedding lookup.
weight (Tensor) – Boolean weight tensor for embeddings.
is_train (bool) – Flag indicating if the forward pass is for training.
- Returns:
The result tensor after applying binary embedding logic.
- Return type:
Tensor
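A minimal usage sketch is shown below; the shapes are illustrative assumptions, and as a custom autograd function the class is invoked via .apply():

import torch
from bitorch_engine.layers.qembedding.binary.layer import BinaryEmbeddingBagForward

# Illustrative shapes: a table of 1000 boolean embeddings of dimension 64,
# looked up with a batch of 32 bags of 8 indices each.
weight = torch.randint(0, 2, (1000, 64), dtype=torch.bool)
indices = torch.randint(0, 1000, (32, 8))

# is_train=True indicates a training-mode forward pass.
output = BinaryEmbeddingBagForward.apply(indices, weight, True)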
Methods

backward(ctx, output_gradient)
Implements the backward pass for the binary embedding bag function.

forward(ctx, input, weight, is_train)
The forward pass performs an embedding lookup and binarizes the output based on the majority of ones in the sliced embeddings.
- static backward(ctx: BackwardCFunction, output_gradient: Tensor) → Tuple[Tensor, ...][source]
Implements the backward pass for the binary embedding bag function.
This method currently passes the output gradient through unchanged as the gradient for the weight tensor, serving as a placeholder for future implementations of gradient calculations for boolean weights.
Note on Optimizer Requirements for Boolean Weights:
When both weights and their gradients are of boolean type, the optimizer must employ a specialized update mechanism. Traditional gradient descent cannot be applied directly, because boolean values do not support the arithmetic operations involved in typical weight updates. Instead, the optimizer should implement logic that decides the binary state of each weight based on criteria or rules derived from the boolean gradients. This might involve flipping the state of a weight based on the presence or absence of a gradient, or using a voting system across multiple training steps to determine the change; a hedged sketch of such an update rule follows this method's parameter list. Developing such an optimizer requires careful consideration to effectively train models with binary weights while respecting the limitations and characteristics of boolean algebra. The “sparse_update_embedding_qweight” method can be used to update qweight in an optimizer.
- Parameters:
ctx (Any) – Autograd context saving input and weight tensors for backward computation.
output_gradient (torch.Tensor) – Gradient of the loss with respect to the output of the forward pass.
- Returns:
A tuple containing gradients for each input argument. Currently, only the entry for the weight tensor is populated; the gradients for input and is_train are None.
- Return type:
Tuple[None, torch.Tensor, None]
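As a hedged illustration of the flip- and voting-based strategies described in the note above, the sketch below accumulates boolean flip signals over a fixed window and flips the weights that receive a strict majority of votes. The class name and update rule are hypothetical illustrations, not the engine's actual “sparse_update_embedding_qweight” implementation:

import torch

class BooleanFlipOptimizer:
    # Hypothetical sketch: flip boolean weights once a flip signal has won a
    # strict majority vote over a window of training steps.
    def __init__(self, weight: torch.Tensor, window: int = 4):
        self.weight = weight  # boolean weight tensor, updated in place
        self.votes = torch.zeros_like(weight, dtype=torch.int32)
        self.window = window
        self.steps = 0

    def step(self, bool_grad: torch.Tensor) -> None:
        self.votes += bool_grad.to(torch.int32)  # tally flip signals
        self.steps += 1
        if self.steps == self.window:
            # XOR flips the selected bits without arithmetic on boolean values.
            self.weight ^= (self.votes * 2 > self.window)
            self.votes.zero_()
            self.steps = 0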
- static forward(ctx, input: Tensor, weight: Tensor, is_train: bool)[source]
The forward pass performs an embedding lookup and binarizes the output based on the majority of ones in the sliced embeddings.
Note: This class is experimental and may not be error-free or always functional.
- Parameters:
input (Tensor) – Input tensor containing indices for embedding lookup.
weight (Tensor) – Boolean weight tensor for embeddings.
is_train (bool) – Flag indicating if the forward pass is for training.
- Returns:
The result tensor after applying binary embedding logic.
- Return type:
Tensor
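The forward logic can be approximated with the rough pure-PyTorch sketch below; the semantics (per-dimension strict majority, ties resolving to False) are assumptions inferred from the description above, not the engine's actual kernel:

import torch

def majority_binarized_bag(indices: torch.Tensor, weight: torch.Tensor) -> torch.Tensor:
    # indices: (batch, bag_size) integer indices into the embedding table.
    # weight:  (num_embeddings, embedding_dim) boolean embedding table.
    sliced = weight[indices]   # (batch, bag_size, embedding_dim), bool
    ones = sliced.sum(dim=1)   # per-dimension count of True bits in each bag
    # Output bit is True where the ones form a strict majority of the bag
    # (tie-breaking is an assumption; the real kernel may differ).
    return ones * 2 > sliced.size(1)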