bitorch_engine.layers.qlinear.binary.layer.BinaryLinearBase
- class bitorch_engine.layers.qlinear.binary.layer.BinaryLinearBase(input_features: int, out_features: int, device: device | None = None, dtype: dtype = torch.float32, symmetric: bool = True)[source]
Base class for binary linear layers, supporting both floating-point and quantized weights.
This class is designed to facilitate the creation of binary linear layers, where weights can be represented in a quantized format for efficient computation, especially on hardware that supports binary operations. It provides a foundation for implementing various types of binary linear operations, including fully connected layers and convolutional layers with binary weights.
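The computational payoff behind binary layers is that a dot product of two ±1 vectors reduces to an XNOR-popcount: positions where the signs agree contribute +1, the rest −1. A simplified, pure-Python illustration of this identity (not the library's actual kernel, which operates on bit-packed words):

```python
def binary_dot(a, b):
    # Dot product of two ±1 vectors via the XNOR-popcount identity:
    # dot(a, b) = 2 * (#positions where signs agree) - n
    n = len(a)
    same = sum(1 for x, y in zip(a, b) if x == y)
    return 2 * same - n

a = [1, -1, 1, 1]
b = [1, 1, -1, 1]
assert binary_dot(a, b) == sum(x * y for x, y in zip(a, b))  # both give 0
```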
- bits_binary_word
Number of bits in a binary word, default is 8.
- Type:
int
- input_features
Dimension of input features after bit-packing.
- Type:
int
- output_features
Dimension of output features or hidden states.
- Type:
int
- weight
Floating-point weights, used for training and initialization.
- Type:
nn.Parameter
- qweight
Quantized weights, used for inference when training is False.
- Type:
nn.Parameter
- device
Device on which the layer’s tensors will be allocated.
- Type:
torch.device
- dtype
Data type of the layer’s floating-point weights.
- Type:
torch.dtype
- symmetric
Indicates if the quantization should be symmetric around 0.
- Type:
bool
- _check_forward()[source]
Validates the compatibility of input tensor and weights before forward pass.
- opt_weight (property)
Returns the optimal weights for the current mode (training or inference).
Methods
- __init__
Initializes the BinaryLinearBase class with specified configurations.
- generate_quantized_weight
Converts the floating-point weights to a quantized format through bit-packing.
- prepare_params
Prepares and initializes the model parameters for training.
- reset_parameters
Initializes or resets the floating-point weight parameters using Kaiming uniform initialization.
- set_bits_binary_word
Sets the number of bits used in a binary word for the purpose of quantization.
- set_quantized_weight_data
Sets the quantized weight parameter from an external tensor, disabling gradient computation.
- set_weight_data
Sets the floating-point weight parameter from an external tensor.
Attributes
- opt_weight
Returns the optimal weight parameter for the current mode (training or inference).
- training
- __init__(input_features: int, out_features: int, device: device | None = None, dtype: dtype = torch.float32, symmetric: bool = True) None [source]
Initializes the BinaryLinearBase class with specified configurations.
- Parameters:
input_features (int) – Dimension of input features after bit-packing.
out_features (int) – Dimension of output features or hidden states.
device (torch.device, optional) – Device on which to allocate tensors. Defaults to None.
dtype (torch.dtype, optional) – Data type for floating-point weights. Defaults to torch.float32.
symmetric (bool, optional) – If True, quantization is symmetric around 0. Defaults to True.
- generate_quantized_weight(qweight_only: bool = False) None [source]
Converts the floating-point weights to a quantized format through bit-packing.
- Parameters:
qweight_only (bool, optional) – If True, only updates the qweight without modifying the floating-point weight. Defaults to False.
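One plausible packing scheme can be sketched as follows (`pack_signs` is a hypothetical helper; the library's actual bit layout and sign convention may differ): each weight contributes one sign bit, and `bits_binary_word` consecutive bits are packed into a single integer word.

```python
def pack_signs(weights, bits_binary_word=8):
    # Hypothetical sketch of bit-packing: map each float weight to a
    # sign bit (1 if >= 0, else 0) and pack bits_binary_word bits per word.
    assert len(weights) % bits_binary_word == 0
    words = []
    for i in range(0, len(weights), bits_binary_word):
        word = 0
        for j, w in enumerate(weights[i:i + bits_binary_word]):
            if w >= 0:
                word |= 1 << j
        words.append(word)
    return words

pack_signs([0.5, -1.2, 0.3, -0.1, 2.0, -3.0, 0.0, 1.1])  # -> [213]
```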
- property opt_weight: Parameter
Returns the optimal weight parameter for the current mode (training or inference).
- Returns:
The floating-point weights during training or the quantized weights during inference.
- Return type:
nn.Parameter
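The mode switch behind this property can be illustrated with a toy stand-in (hypothetical; the real property returns nn.Parameter objects and keys off the module's training flag):

```python
class TinyBinaryLayer:
    # Toy illustration of the opt_weight pattern; not bitorch_engine code.
    def __init__(self):
        self.training = True
        self.weight = "float_weights"    # stands in for the fp nn.Parameter
        self.qweight = "packed_weights"  # stands in for the packed qweight

    @property
    def opt_weight(self):
        # Training uses float weights; inference uses the quantized ones.
        return self.weight if self.training else self.qweight

layer = TinyBinaryLayer()
assert layer.opt_weight == "float_weights"   # training mode
layer.training = False
assert layer.opt_weight == "packed_weights"  # inference mode
```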
- prepare_params() None [source]
Prepares and initializes the model parameters for training.
Note
This method MUST be called after model initialization and before training starts to ensure the weights are properly prepared for efficient computation.
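The required ordering — construct, prepare, then train — can be sketched with a toy guard (illustrative only, not bitorch_engine code; the real method transforms and allocates weight buffers rather than setting a flag):

```python
class ToyLayer:
    # Hypothetical stand-in showing the prepare_params() contract:
    # computation refuses to run until parameters are prepared.
    def __init__(self):
        self._prepared = False

    def prepare_params(self):
        # In the real class this prepares weights for efficient
        # computation; here it just flips a flag.
        self._prepared = True

    def forward(self, x):
        if not self._prepared:
            raise RuntimeError("call prepare_params() before training")
        return x

layer = ToyLayer()
layer.prepare_params()       # required after init, before training starts
layer.forward([1.0, 2.0])    # now permitted
```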
- reset_parameters() None [source]
Initializes or resets the floating-point weight parameters using Kaiming uniform initialization.
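Kaiming uniform initialization draws each weight from U(−bound, bound) with bound = gain · sqrt(3 / fan_in) and gain = sqrt(2 / (1 + a²)); with negative slope a = 0 this gives bound = sqrt(6 / fan_in). A self-contained sketch of the sampling rule (the actual method would rely on PyTorch's `torch.nn.init.kaiming_uniform_`-style initializer):

```python
import math
import random

def kaiming_uniform(fan_in, n, a=0.0):
    # Kaiming uniform: sample n values from U(-bound, bound), where
    # bound = gain * sqrt(3 / fan_in) and gain = sqrt(2 / (1 + a^2)).
    gain = math.sqrt(2.0 / (1.0 + a * a))
    bound = gain * math.sqrt(3.0 / fan_in)
    return [random.uniform(-bound, bound) for _ in range(n)]

# With fan_in=256 and a=0, bound = sqrt(6/256) ≈ 0.153.
w = kaiming_uniform(fan_in=256, n=10)
```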
- set_bits_binary_word(num_bit: int) None [source]
Sets the number of bits used in a binary word for the purpose of quantization.
- Parameters:
num_bit (int) – The number of bits to use in a binary word.
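The word size fixes the ratio between the raw (unpacked) feature count and the packed `input_features` dimension. Assuming one bit per feature, the mapping is simple integer division:

```python
def packed_dim(raw_features, bits_binary_word):
    # Each binary word stores bits_binary_word one-bit features,
    # so the packed dimension shrinks by that factor.
    assert raw_features % bits_binary_word == 0
    return raw_features // bits_binary_word

packed_dim(256, 8)   # -> 32 packed words for 256 raw features
packed_dim(256, 32)  # -> 8 packed words with 32-bit words
```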