bitorch_engine.layers.qconv.binary.cutlass.layer.BinaryConv2dCutlass

class bitorch_engine.layers.qconv.binary.cutlass.layer.BinaryConv2dCutlass(*args, **kwargs)[source]

A specialized convolutional layer class that implements binary convolution using the CUTLASS library. This class inherits from BinaryConv2dBase and adds specific initializations and methods for operating with binary weights and possibly quantized activations.

bits_binary_word

Number of bits per binary word; CUTLASS utilizes uint8_t as the container type for binary operations.

bias_a

Bias parameter for activation quantization.

Type:

torch.nn.Parameter

scale_a

Scale parameter for activation quantization.

Type:

torch.nn.Parameter

scale_w

Scale parameter for weight quantization.

Type:

torch.nn.Parameter

Methods

__init__

Initializes the BinaryConv2dCutlass layer with additional parameters specific to CUTLASS implementation.

forward

Defines the forward pass of the binary convolutional layer.

generate_quantized_weight

Performs bit-packing on the 32-bit weights to generate quantized weights.

prepare_params

Prepares and initializes the model parameters for training, specifically converting floating-point weights to int8 format.

set_activation

Adjusts the activation values by initializing scale_a based on the layer's input and adds bias.

set_weight_data

Sets the weight data from the input tensor and re-initializes from pre-trained weights if available.

Attributes

training

__init__(*args, **kwargs)[source]

Initializes the BinaryConv2dCutlass layer with additional parameters specific to the CUTLASS implementation.

Parameters:

*args – Variable-length argument list for the base class.

**kwargs – Arbitrary keyword arguments for the base class.

forward(x: Tensor) → Tensor[source]

Defines the forward pass of the binary convolutional layer.

Parameters:

x – Input tensor of shape (N, C_in, H, W).

Returns:

The output tensor of the convolution operation.
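Under the hood, binary convolution replaces floating-point multiply-accumulates with XNOR and popcount operations. The following pure-Python sketch illustrates the arithmetic only, not the CUTLASS kernel; the convention that a set bit encodes −1 is an assumption made here for illustration:

```python
def binary_dot(a_bits: int, b_bits: int, n: int) -> int:
    """Dot product of two length-n {-1, +1} vectors stored as sign bits.

    With bit=1 encoding -1 and bit=0 encoding +1 (illustrative
    convention), matching positions contribute +1 and differing
    positions contribute -1, so the dot product equals
    n - 2 * popcount(a XOR b).
    """
    differing = bin((a_bits ^ b_bits) & ((1 << n) - 1)).count("1")
    return n - 2 * differing


# [-1, +1, +1] -> bits 0b001, [-1, -1, +1] -> bits 0b011
print(binary_dot(0b001, 0b011, 3))  # (-1)(-1) + (1)(-1) + (1)(1) = 1
```

A full binary convolution applies this reduction over every packed filter/patch pair, which is exactly what the CUTLASS kernel accelerates.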

generate_quantized_weight(qweight_only: bool = False) → None[source]

Performs bit-packing on the 32-bit weights to generate quantized weights.

Parameters:

qweight_only – If True, the original weight tensor is discarded after packing to save memory.
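Since CUTLASS uses uint8_t as the binary container (see bits_binary_word above), packing condenses eight sign bits into one byte. A minimal pure-Python sketch of the idea follows; the bit order and the choice of "negative maps to 1" are illustrative assumptions, and the layer's actual kernel-side layout is an implementation detail:

```python
def pack_signs_uint8(values):
    """Pack the sign bits of floats into uint8 words, 8 values per byte.

    Bit j of each word is set where the value is negative (one possible
    convention; the real layer's encoding may differ).
    """
    assert len(values) % 8 == 0, "pad to a multiple of 8 first"
    words = []
    for i in range(0, len(values), 8):
        word = 0
        for j, v in enumerate(values[i:i + 8]):
            if v < 0:
                word |= 1 << j
        words.append(word)
    return words


print(pack_signs_uint8([-1.0] * 8))              # [255]
print(pack_signs_uint8([1.0] * 4 + [-1.0] * 4))  # [240]
```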

prepare_params() → None[source]

Prepares and initializes the model parameters for training, specifically converting floating-point weights to int8 format.

This method leverages the init_weight function to convert the model’s floating-point weights to int8, achieving a significant reduction in memory usage. It also computes a scale for the weights, which is essential for maintaining the numerical fidelity of the model’s computations in the lower precision format. The conversion to int8 format is particularly beneficial for accelerating training and inference on hardware that supports lower precision arithmetic.

Note

This method MUST be called after model initialization and before training starts to ensure the weights are properly prepared for efficient computation.

The prepare_bie_layers function from project_root.utils.model_helper can be used to call this method on all relevant layers.
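The weight scale mentioned above can be pictured with a small sketch. Using the mean absolute weight value is the classic XNOR-Net-style estimator, shown here purely for illustration; the actual formula inside init_weight is an implementation detail of bitorch_engine:

```python
def weight_scale(weights):
    # Per-tensor scale as the mean absolute weight value -- the classic
    # XNOR-Net-style estimator, used here only to illustrate why a scale
    # preserves magnitude information after weights collapse to {-1, +1}.
    return sum(abs(w) for w in weights) / len(weights)


print(weight_scale([-2.0, 2.0, 4.0, -4.0]))  # 3.0
```

Multiplying the binary convolution output by such a scale restores the magnitude lost when weights are reduced to signs.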

set_activation(x: Tensor) → Tensor[source]

Adjusts the activation values by initializing scale_a based on the layer’s input and adding the bias.

Parameters:

x – The input tensor to the convolutional layer.

Returns:

The adjusted input tensor.
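The behaviour can be sketched as follows. Initialising scale_a from the mean absolute input on the first call is an assumption made for illustration, not necessarily the exact statistic the layer computes:

```python
def set_activation_sketch(x, state):
    """Mimic set_activation: lazily initialise scale_a, then add bias.

    `state` stands in for the layer's parameters (scale_a, bias_a);
    using the mean absolute input for scale_a is an illustrative
    choice only.
    """
    if state["scale_a"] is None:
        state["scale_a"] = sum(abs(v) for v in x) / len(x)
    return [v + state["bias_a"] for v in x]


state = {"scale_a": None, "bias_a": 0.5}
print(set_activation_sketch([-2.0, 2.0], state))  # [-1.5, 2.5]
print(state["scale_a"])                           # 2.0
```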

set_weight_data(x: Tensor) → None[source]

Sets the weight data from the input tensor, re-initializing from pre-trained weights if available.

Parameters:

x – The input tensor to set as the new weight data.