bitorch_engine.layers.qconv.binary.cutlass.layer.BinaryConv2dCutlass

class bitorch_engine.layers.qconv.binary.cutlass.layer.BinaryConv2dCutlass(*args, **kwargs)[source]

A specialized convolutional layer class that implements binary convolution using the CUTLASS library. This class inherits from BinaryConv2dBase and adds specific initializations and methods for operating with binary weights and possibly quantized activations.

bits_binary_word

Number of bits per binary word; CUTLASS utilizes uint8_t as the container type for binary operations.

bias_a

Bias parameter for activation quantization.

Type:

torch.nn.Parameter

scale_a

Scale parameter for activation quantization.

Type:

torch.nn.Parameter

scale_w

Scale parameter for weight quantization.

Type:

torch.nn.Parameter

Methods

__init__

Initializes the BinaryConv2dCutlass layer with additional parameters specific to CUTLASS implementation.

forward

Defines the forward pass of the binary convolutional layer.

generate_quantized_weight

Performs bit-packing on the 32-bit weights to generate quantized weights.

prepare_params

Prepares and initializes the model parameters for training, specifically converting floating-point weights to int8 format.

set_activation

Adjusts the activation values by initializing scale_a based on the layer's input and adds bias.

set_weight_data

Sets the weight data from the input tensor and re-initializes from pre-trained weights if available.

Attributes

training

__init__(*args, **kwargs)[source]

Initializes the BinaryConv2dCutlass layer with additional parameters specific to the CUTLASS implementation.

Parameters:

*args – Variable-length argument list for the base class.

**kwargs – Arbitrary keyword arguments for the base class.

forward(x: Tensor) → Tensor[source]

Defines the forward pass of the binary convolutional layer.

Parameters:

x – Input tensor of shape (N, C_in, H, W).

Returns:

The output tensor of the convolution operation.
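Under the hood, binary convolution replaces floating-point multiply-accumulates with XNOR and popcount operations. The following pure-Python sketch illustrates the arithmetic only, not the CUTLASS kernel; the convention that a set bit encodes −1 is an assumption made here for illustration:

```python
def binary_dot(a_bits: int, b_bits: int, n: int) -> int:
    """Dot product of two length-n {-1, +1} vectors stored as sign bits.

    With bit=1 encoding -1 and bit=0 encoding +1 (illustrative
    convention), matching positions contribute +1 and differing
    positions contribute -1, so the dot product equals
    n - 2 * popcount(a XOR b).
    """
    differing = bin((a_bits ^ b_bits) & ((1 << n) - 1)).count("1")
    return n - 2 * differing


# [-1, +1, +1] -> bits 0b001, [-1, -1, +1] -> bits 0b011
print(binary_dot(0b001, 0b011, 3))  # (-1)(-1) + (1)(-1) + (1)(1) = 1
```

A full binary convolution applies this reduction over every packed filter/patch pair, which is exactly what the CUTLASS kernel accelerates.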

generate_quantized_weight(qweight_only: bool = False) → None[source]

Performs bit-packing on the 32-bit weights to generate quantized weights.

Parameters:

qweight_only – If True, the original weight tensor is discarded after packing to save memory.
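Since CUTLASS uses uint8_t as the binary container (see bits_binary_word above), packing condenses eight sign bits into one byte. A minimal pure-Python sketch of the idea follows; the bit order and the choice of "negative maps to 1" are illustrative assumptions, and the layer's actual kernel-side layout is an implementation detail:

```python
def pack_signs_uint8(values):
    """Pack the sign bits of floats into uint8 words, 8 values per byte.

    Bit j of each word is set where the value is negative (one possible
    convention; the real layer's encoding may differ).
    """
    assert len(values) % 8 == 0, "pad to a multiple of 8 first"
    words = []
    for i in range(0, len(values), 8):
        word = 0
        for j, v in enumerate(values[i:i + 8]):
            if v < 0:
                word |= 1 << j
        words.append(word)
    return words


print(pack_signs_uint8([-1.0] * 8))              # [255]
print(pack_signs_uint8([1.0] * 4 + [-1.0] * 4))  # [240]
```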

prepare_params() → None[source]

Prepares and initializes the model parameters for training, specifically converting floating-point weights to int8 format.

This method leverages the init_weight function to convert the model’s floating-point weights to int8, achieving a significant reduction in memory usage. It also computes a scale for the weights, which is essential for maintaining the numerical fidelity of the model’s computations in the lower precision format. The conversion to int8 format is particularly beneficial for accelerating training and inference on hardware that supports lower precision arithmetic.

Note

This method MUST be called after model initialization and before training starts to ensure the weights are properly prepared for efficient computation.

The prepare_bie_layers function from project_root.utils.model_helper can be used to call this method on all relevant layers.
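The weight scale mentioned above can be pictured with a small sketch. Using the mean absolute weight value is the classic XNOR-Net-style estimator, shown here purely for illustration; the actual formula inside init_weight is an implementation detail of bitorch_engine:

```python
def weight_scale(weights):
    # Per-tensor scale as the mean absolute weight value -- the classic
    # XNOR-Net-style estimator, used here only to illustrate why a scale
    # preserves magnitude information after weights collapse to {-1, +1}.
    return sum(abs(w) for w in weights) / len(weights)


print(weight_scale([-2.0, 2.0, 4.0, -4.0]))  # 3.0
```

Multiplying the binary convolution output by such a scale restores the magnitude lost when weights are reduced to signs.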

set_activation(x: Tensor) → Tensor[source]

Adjusts the activation values by initializing scale_a based on the layer’s input and adding the bias.

Parameters:

x – The input tensor to the convolutional layer.

Returns:

The adjusted input tensor.
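The behaviour can be sketched as follows. Initialising scale_a from the mean absolute input on the first call is an assumption made for illustration, not necessarily the exact statistic the layer computes:

```python
def set_activation_sketch(x, state):
    """Mimic set_activation: lazily initialise scale_a, then add bias.

    `state` stands in for the layer's parameters (scale_a, bias_a);
    using the mean absolute input for scale_a is an illustrative
    choice only.
    """
    if state["scale_a"] is None:
        state["scale_a"] = sum(abs(v) for v in x) / len(x)
    return [v + state["bias_a"] for v in x]


state = {"scale_a": None, "bias_a": 0.5}
print(set_activation_sketch([-2.0, 2.0], state))  # [-1.5, 2.5]
print(state["scale_a"])                           # 2.0
```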

set_weight_data(x: Tensor) → None[source]

Sets the weight data from the input tensor, re-initializing from pre-trained weights if available.

Parameters:

x – The input tensor to set as the new weight data.