bitorch_engine.layers.qlinear.nbit.layer
Classes
| Class | Description |
| --- | --- |
| `MPQLinearBase` | Base class for mixed precision quantized (MPQ) linear layers, designed to support the computational needs of large language models (LLMs) with mixed precision quantization, such as 16-bit activations and 4-bit weights for efficient inference. |
| `MPQWeightParameter` | A custom parameter class for quantized weights, extending `torch.nn.Parameter`, with additional attributes specific to quantization. |
| `nBitLinearBase` | A base class for n-bit Quantization-Aware Training (QAT) linear layers. |
| `nBitLinearParameter` | A custom parameter class for n-bit linear layers, extending `torch.nn.Parameter`. |