bitorch_engine.layers.qlinear.binary.cutlass.layer.BinaryMatMulFunction

class bitorch_engine.layers.qlinear.binary.cutlass.layer.BinaryMatMulFunction(*args, **kwargs)

This class implements a custom autograd function for binary matrix multiplication. It relies on hardware-accelerated kernels (e.g., CUTLASS) for efficient binary operations, and works on binary inputs padded to multiples of 128 for better performance.

The forward pass performs the binary matrix multiplication, with additional steps to handle padding. The backward pass computes gradients with respect to the inputs, taking the clipping thresholds x_clip and y_clip into account.
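The padding and the binary product described above can be illustrated with a minimal pure-Python sketch. It is not the CUTLASS kernel: the helper names (`pad_to_multiple`, `pack_signs`, `binary_dot`, `binary_matmul`) are hypothetical, the choice of mapping zero to +1 is an assumption, and so is the row-wise layout of the second operand. It only shows why zero padding leaves an XOR/popcount product unchanged.

```python
def pad_to_multiple(n: int, multiple: int = 128) -> int:
    """Round n up to the next multiple (128 here, as in the description above)."""
    return ((n + multiple - 1) // multiple) * multiple


def pack_signs(values, padded_len: int) -> int:
    """Pack signs into an int: bit i is 1 where values[i] >= 0 (assumed rule).

    Bits from len(values) up to padded_len stay 0 and act as the padding.
    """
    assert len(values) <= padded_len
    bits = 0
    for i, v in enumerate(values):
        if v >= 0:
            bits |= 1 << i
    return bits


def binary_dot(a, b) -> int:
    """Dot product of sign(a) and sign(b) computed with XOR and popcount."""
    assert len(a) == len(b)
    n = len(a)
    padded = pad_to_multiple(n)
    mismatches = bin(pack_signs(a, padded) ^ pack_signs(b, padded)).count("1")
    # Matching positions contribute +1 and mismatching ones -1. Padding bits
    # are 0 in both operands, so they never count as mismatches.
    return n - 2 * mismatches


def binary_matmul(x, w):
    """Multiply each row of x with each row of w, i.e. sign(x) @ sign(w).T."""
    return [[binary_dot(row, w_row) for w_row in w] for row in x]
```

For example, `binary_matmul([[0.5, -2.0, 3.0]], [[1, -1, 1], [-1, 1, -1]])` compares one input row against a row with identical signs and a row with opposite signs.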

Methods

`backward`	Computes the gradients for the backward pass of the binary matrix multiplication.
`forward`	Performs the forward pass of the binary matrix multiplication.

static backward(ctx: BackwardCFunction, output_gradient: Tensor) → Tuple[Tensor, Tensor, Tensor, Tensor]

Computes the gradients for the backward pass of the binary matrix multiplication.

Parameters:

ctx – The context object from which the tensors saved in the forward pass are retrieved.
output_gradient (torch.Tensor) – The gradient of the loss with respect to the output of the forward pass.

Returns:

Gradients with respect to x, y, x_clip, and y_clip, in the same order as the inputs to forward.

Return type:

Tuple[torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor]
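One common way for a binarized layer to take a clipping threshold into account in the backward pass is a clipped straight-through estimator: the incoming gradient passes unchanged where the input lies within the threshold and is zeroed elsewhere. The sketch below illustrates that pattern in plain Python. Whether BinaryMatMulFunction applies exactly this mask, and how it derives the gradients for x_clip and y_clip, is not stated here, so treat it as an assumption. `clipped_ste_grad` is a hypothetical helper, not part of bitorch_engine.

```python
def clipped_ste_grad(values, upstream, clip):
    """Pass upstream[i] through where |values[i]| <= clip, and zero it elsewhere.

    Assumed clipped straight-through estimator, shown on flat lists.
    """
    assert len(values) == len(upstream)
    return [g if abs(v) <= clip else 0.0 for v, g in zip(values, upstream)]


# The second entry lies outside the threshold of 1.0, so its gradient is dropped.
grads = clipped_ste_grad([0.3, -1.5, 1.0], [2.0, 4.0, -1.0], clip=1.0)
```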

static forward(ctx, x, y, x_clip, y_clip) → Tensor

Performs the forward pass of the binary matrix multiplication.

Parameters:

ctx – The context object for storing information for backward computation.
x (torch.Tensor) – The first input tensor.
y (torch.Tensor) – The second input tensor.
x_clip (torch.Tensor) – The clipping value for the first input tensor.
y_clip (torch.Tensor) – The clipping value for the second input tensor.

Returns:

The output tensor resulting from the binary matrix multiplication.

Return type:

torch.Tensor