bitorch_engine.layers.qlinear.nbit.cuda.utils
Functions
- Creates a mapping of quantization groups for handling irregular group sizes in quantized models.
- Packs the fp16 weight into a quantized weight format using the attributes defined in the QweightParameter.
- Reconstructs the fp16 weight tensor from the input quantized weight parameter.
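To make the pack/unpack round trip concrete, the sketch below shows a simplified, self-contained version of the idea: fp16 weights are quantized per group to 4 bits, eight 4-bit values are packed into each int32 word, and the fp16 tensor is later reconstructed from the packed representation. The helper names (`pack_fp16_to_int4`, `unpack_int4_to_fp16`), the fixed group size, and the packing layout are illustrative assumptions and do not reproduce this module's actual CUDA kernels, QweightParameter attributes, or its group-map handling of irregular group sizes.

```python
import torch


def pack_fp16_to_int4(weight: torch.Tensor, group_size: int = 128):
    """Toy per-group 4-bit quantization: returns packed int32 words, scales, zeros."""
    out_features, in_features = weight.shape
    w = weight.float().reshape(out_features, in_features // group_size, group_size)

    # Per-group asymmetric quantization to the integer range [0, 15].
    w_min = w.amin(dim=-1, keepdim=True)
    w_max = w.amax(dim=-1, keepdim=True)
    scale = (w_max - w_min).clamp(min=1e-8) / 15.0
    q = torch.clamp(torch.round((w - w_min) / scale), 0, 15).to(torch.int32)
    q = q.reshape(out_features, in_features)

    # Pack eight 4-bit values into each int32 word along the input dimension.
    q = q.reshape(out_features, in_features // 8, 8)
    shifts = torch.arange(8, dtype=torch.int32) * 4
    packed = (q << shifts).sum(dim=-1, dtype=torch.int32)
    return packed, scale.squeeze(-1), w_min.squeeze(-1)


def unpack_int4_to_fp16(packed: torch.Tensor, scale: torch.Tensor,
                        zero: torch.Tensor, group_size: int = 128):
    """Reverse of pack_fp16_to_int4: unpack the nibbles and dequantize to fp16."""
    out_features = packed.shape[0]
    shifts = torch.arange(8, dtype=torch.int32) * 4

    # Extract the eight 4-bit fields from each int32 word.
    q = (packed.unsqueeze(-1) >> shifts) & 0xF
    q = q.reshape(out_features, -1, group_size).float()

    # Dequantize with the per-group scale and zero point.
    w = q * scale.unsqueeze(-1) + zero.unsqueeze(-1)
    return w.reshape(out_features, -1).to(torch.float16)


# Round trip: the reconstruction matches the original up to quantization error.
w = torch.randn(256, 512, dtype=torch.float16)
packed, scale, zero = pack_fp16_to_int4(w)
w_rec = unpack_int4_to_fp16(packed, scale, zero)
print((w.float() - w_rec.float()).abs().max())
```

The production utilities differ mainly in that the quantization metadata lives on the quantized weight parameter object and that a group map allows groups of unequal size; the sketch keeps a single fixed group size for clarity.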