bitorch_engine.layers.qlinear.binary.layer.BinaryLinearBase

class bitorch_engine.layers.qlinear.binary.layer.BinaryLinearBase(input_features: int, out_features: int, device: device | None = None, dtype: dtype = torch.float32, symmetric: bool = True)[source]

Base class for binary linear layers, supporting both floating-point and quantized weights.

This class is designed to facilitate the creation of binary linear layers, where weights can be represented in a quantized format for efficient computation, especially on hardware that supports binary operations. It provides a foundation for implementing various types of binary linear operations, including fully connected layers and convolutional layers with binary weights.
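The binary format keeps only the sign of each floating-point weight, so `bits_binary_word` weights can be packed into a single machine word. The following is a minimal pure-Python sketch of that idea (sign binarization plus 8-bit packing); it is illustrative only, and the engine's actual packing routine and bit convention may differ.

```python
def pack_signs(weights, bits_binary_word=8):
    """Pack the signs of float weights into integers, one bit per weight.

    A weight >= 0 maps to bit 1 and a negative weight to bit 0 (an
    illustrative convention, not necessarily the engine's encoding).
    """
    assert len(weights) % bits_binary_word == 0, "length must be a multiple of the word size"
    words = []
    for i in range(0, len(weights), bits_binary_word):
        word = 0
        for j, w in enumerate(weights[i:i + bits_binary_word]):
            if w >= 0:
                word |= 1 << j
        words.append(word)
    return words

# Eight float weights collapse into one 8-bit word.
packed = pack_signs([0.5, -1.2, 0.3, 0.9, -0.1, -2.0, 1.1, 0.0])
print(packed)  # bits set at positions 0, 2, 3, 6, 7 -> [0b11001101] == [205]
```

This is why `input_features` below is documented as the dimension *after* bit-packing: the packed tensor is `bits_binary_word` times narrower than the floating-point one.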

bits_binary_word (int)

Number of bits in a binary word; defaults to 8.

input_features (int)

Dimension of input features after bit-packing.

output_features (int)

Dimension of output features or hidden states.

weight (nn.Parameter)

Floating-point weights, used for training and initialization.

qweight (nn.Parameter)

Quantized weights, used for inference when training is False.

device (torch.device)

Device on which the layer’s tensors will be allocated.

dtype (torch.dtype)

Data type of the layer’s floating-point weights.

symmetric (bool)

Indicates whether the quantization is symmetric around 0.

reset_parameters()[source]

Initializes or resets the layer’s parameters.

set_weight_data()[source]

Sets the layer’s floating-point weights from an external tensor.

set_quantized_weight_data()[source]

Sets the layer’s quantized weights from an external tensor.

generate_quantized_weight()[source]

Converts the floating-point weights to quantized format.

_check_forward()[source]

Validates the compatibility of input tensor and weights before forward pass.

opt_weight (property)

Returns the optimal weights for the current mode (training or inference).

set_bits_binary_word()[source]

Sets the number of bits used in a binary word for quantization.

Methods

__init__

Initializes the BinaryLinearBase class with specified configurations.

generate_quantized_weight

Converts the floating-point weights to a quantized format through bit-packing.

prepare_params

Prepares and initializes the model parameters for training.

reset_parameters

Initializes or resets the floating-point weight parameters using Kaiming uniform initialization.

set_bits_binary_word

Sets the number of bits used in a binary word for the purpose of quantization.

set_quantized_weight_data

Sets the quantized weight parameter from an external tensor, disabling gradient computation.

set_weight_data

Sets the floating-point weight parameter from an external tensor.

Attributes

opt_weight

Returns the optimal weight parameter for the current mode (training or inference).

training

Boolean flag inherited from torch.nn.Module indicating whether the layer is in training mode.

__init__(input_features: int, out_features: int, device: device | None = None, dtype: dtype = torch.float32, symmetric: bool = True) None[source]

Initializes the BinaryLinearBase class with specified configurations.

Parameters:
  • input_features (int) – Dimension of input features after bit-packing.

  • out_features (int) – Dimension of output features or hidden states.

  • device (torch.device, optional) – Device on which to allocate tensors. Defaults to None.

  • dtype (torch.dtype, optional) – Data type for floating-point weights. Defaults to torch.float32.

  • symmetric (bool, optional) – If True, quantization is symmetric around 0. Defaults to True.

generate_quantized_weight(qweight_only: bool = False) None[source]

Converts the floating-point weights to a quantized format through bit-packing.

Parameters:

qweight_only (bool, optional) – If True, only updates the qweight without modifying the floating-point weight. Defaults to False.

property opt_weight: Parameter

Returns the optimal weight parameter for the current mode (training or inference).

Returns:

The floating-point weights during training or the quantized weights during inference.

Return type:

nn.Parameter

prepare_params() None[source]

Prepares and initializes the model parameters for training.

Note

This method MUST be called after model initialization and before training starts to ensure the weights are properly prepared for efficient computation.

reset_parameters() None[source]

Initializes or resets the floating-point weight parameters using Kaiming uniform initialization.
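Kaiming uniform initialization draws each weight from U(-bound, bound), with the bound derived from the layer's fan-in. A pure-Python sketch of the bound computation, assuming PyTorch's default slope a = sqrt(5) (the value used by nn.Linear; whether this layer uses the same a is an assumption):

```python
import math
import random

def kaiming_uniform_bound(fan_in, a=math.sqrt(5)):
    """Bound of the uniform distribution used by Kaiming initialization.

    gain = sqrt(2 / (1 + a^2)); bound = gain * sqrt(3 / fan_in).
    With a = sqrt(5), this simplifies to 1 / sqrt(fan_in).
    """
    gain = math.sqrt(2.0 / (1.0 + a ** 2))
    return gain * math.sqrt(3.0 / fan_in)

fan_in = 256
bound = kaiming_uniform_bound(fan_in)
weights = [random.uniform(-bound, bound) for _ in range(fan_in)]
print(round(bound, 4))  # 1 / sqrt(256) = 0.0625
assert all(-bound <= w <= bound for w in weights)
```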

set_bits_binary_word(num_bit: int) None[source]

Sets the number of bits used in a binary word for the purpose of quantization.

Parameters:

num_bit (int) – The number of bits to use in a binary word.

set_quantized_weight_data(x: Tensor) None[source]

Sets the quantized weight parameter from an external tensor, disabling gradient computation.

Parameters:

x (torch.Tensor) – A tensor containing the new quantized weight data.

set_weight_data(x: Tensor) None[source]

Sets the floating-point weight parameter from an external tensor.

Parameters:

x (torch.Tensor) – A tensor containing the new weight data.