bitorch_engine.layers.qconv.binary.cutlass.layer.BinaryConv2dCutlass
- class bitorch_engine.layers.qconv.binary.cutlass.layer.BinaryConv2dCutlass(*args, **kwargs)[source]
A specialized convolutional layer that implements binary convolution using the CUTLASS library. This class inherits from BinaryConv2dBase and adds initializations and methods specific to operating with binary weights and, optionally, quantized activations.
- bits_binary_word
CUTLASS uses uint8_t as the container type for binary operations, so eight binary values are packed into each word (see the packing sketch after this attribute list).
- bias_a
Bias parameter for activation quantization.
- Type:
torch.nn.Parameter
- scale_a
Scale parameter for activation quantization.
- Type:
torch.nn.Parameter
- scale_w
Scale parameter for weight quantization.
- Type:
torch.nn.Parameter
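To make the container size concrete, here is a minimal, illustrative sketch of how eight binary values fit into one uint8_t word. This is not the engine's actual CUTLASS packing kernel, just the idea behind bits_binary_word:

```python
import torch

# Illustrative bit-packing: the signs of 8 float weights collapse into a
# single uint8 word, matching the uint8_t container CUTLASS uses.
w = torch.tensor([0.5, -1.2, 0.3, -0.7, 2.0, -0.1, 0.9, -3.0])
bits = (w >= 0).to(torch.uint8)                  # 1 for non-negative weights
shifts = torch.arange(8, dtype=torch.uint8)      # bit position per weight
packed = (bits << shifts).sum().to(torch.uint8)  # one uint8 holds all 8 signs
print(f"{packed.item():08b}")                    # prints '01010101'
```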
Methods
- Initializes the BinaryConv2dCutlass layer with additional parameters specific to the CUTLASS implementation.
- Defines the forward pass of the binary convolutional layer.
- Performs bit-packing on the 32-bit weights to generate quantized weights.
- Prepares and initializes the model parameters for training, converting floating-point weights to int8 format.
- Adjusts the activation values by initializing scale_a based on the layer's input and adds a bias.
- Sets the weight data from the input tensor and re-initializes from pre-trained weights if available.
Attributes
training
- __init__(*args, **kwargs)[source]
Initializes the BinaryConv2dCutlass layer with additional parameters specific to the CUTLASS implementation.
- Parameters:
  - *args – Variable-length argument list passed to the base class.
  - **kwargs – Arbitrary keyword arguments passed to the base class.
- forward(x: Tensor) → Tensor [source]
Defines the forward pass of the binary convolutional layer.
- Parameters:
  - x – Input tensor with shape (N, C_in, H, W).
- Returns:
The output tensor of the convolution operation.
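A minimal usage sketch follows. The constructor arguments (in_channels, out_channels, kernel_size, stride, padding) are an assumption here, following the standard conv2d convention of the BinaryConv2dBase parent; check the base class for the exact signature. CUTLASS kernels run on CUDA devices.

```python
import torch
from bitorch_engine.layers.qconv.binary.cutlass.layer import BinaryConv2dCutlass

# Assumed constructor arguments (standard conv2d convention).
layer = BinaryConv2dCutlass(in_channels=64, out_channels=128,
                            kernel_size=3, stride=1, padding=1).to("cuda")
layer.prepare_params()                         # required after init, before training
x = torch.randn(8, 64, 32, 32, device="cuda")  # (N, C_in, H, W)
y = layer(x)                                   # binary convolution output
```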
- generate_quantized_weight(qweight_only: bool = False) → None [source]
Performs bit-packing on the 32-bit weights to generate quantized weights.
- Parameters:
  - qweight_only – If True, the original weight tensor is discarded to save memory.
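For example, a deployment-time call might look like the following sketch, continuing with the layer from the forward example above:

```python
# Pre-pack the weights once before deployment. With qweight_only=True the
# original 32-bit weight tensor is discarded to save memory.
layer.generate_quantized_weight(qweight_only=True)
```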
- prepare_params() → None [source]
Prepares and initializes the model parameters for training, specifically converting floating-point weights to int8 format.
This method leverages the init_weight function to convert the model’s floating-point weights to int8, achieving a significant reduction in memory usage. It also computes a scale for the weights, which is essential for maintaining the numerical fidelity of the model’s computations in the lower precision format. The conversion to int8 format is particularly beneficial for accelerating training and inference on hardware that supports lower precision arithmetic.
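As a rough illustration of the idea (not necessarily the exact scheme init_weight implements), a symmetric int8 quantization with a single weight scale looks like this:

```python
import torch

# Generic symmetric int8 quantization sketch; the actual init_weight scheme
# may differ (e.g. per-channel scales or a different rounding rule).
w_fp32 = torch.randn(128, 64, 3, 3)
scale_w = w_fp32.abs().max() / 127.0                     # one scale for all weights
w_int8 = torch.round(w_fp32 / scale_w).clamp(-127, 127).to(torch.int8)
w_restored = w_int8.float() * scale_w                    # dequantized view
print((w_fp32 - w_restored).abs().max())                 # bounded quantization error
```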
Note
This method MUST be called after model initialization and before training starts to ensure the weights are properly prepared for efficient computation.
One can use the prepare_bie_layers method from project_root.utils.model_helper to call this function for all layers of a model at once (see the sketch below).
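A minimal sketch of that call order follows. Both the import path (the note's "project_root" is read here as the bitorch_engine package) and the constructor arguments are assumptions; adjust them to your setup.

```python
import torch.nn as nn
from bitorch_engine.layers.qconv.binary.cutlass.layer import BinaryConv2dCutlass
from bitorch_engine.utils.model_helper import prepare_bie_layers  # assumed path

# Build a model containing bitorch-engine layers, then prepare them all
# at once before training starts.
model = nn.Sequential(
    BinaryConv2dCutlass(in_channels=3, out_channels=64, kernel_size=3),
    BinaryConv2dCutlass(in_channels=64, out_channels=64, kernel_size=3),
)
prepare_bie_layers(model)  # calls prepare_params() on each supported layer
```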