bitorch_engine.layers.qlinear.nbit.cuda.utils.unpack_qweight
- bitorch_engine.layers.qlinear.nbit.cuda.utils.unpack_qweight(qweight: MPQWeightParameter) → Tensor [source]
Reconstructs the fp16 weight tensor from the input quantized weight parameter.
- Parameters:
qweight (MPQWeightParameter) – The quantized weight parameter object containing all necessary quantization information.
- Returns:
The reconstructed weight tensor in fp16 format.
- Return type:
torch.Tensor
- Raises:
ValueError – If essential attributes are missing in the input qweight parameter.
NotImplementedError – For quantization types that are not yet supported.
- Supported quantization styles:
GPTQ style with g_idx (group index).
GPTQ style without g_idx.
Mixed-bit quantization.
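To illustrate the principle behind such an unpacking step, the sketch below dequantizes 4-bit values that were packed eight-per-32-bit word, applying a scale and zero point. This is a minimal, self-contained illustration using assumed names (`unpack_4bit`, `scale`, `zero`) and an assumed little-endian nibble layout; it is not bitorch_engine's actual storage format or implementation, which lives in CUDA and operates on `MPQWeightParameter` attributes.

```python
# Hypothetical sketch of nibble-level unpacking and dequantization.
# Layout, names, and the (q - zero) * scale formula are assumptions,
# not bitorch_engine's real format.

def unpack_4bit(packed: list[int], scale: float, zero: int) -> list[float]:
    """Unpack eight 4-bit values from each 32-bit word, then dequantize."""
    out = []
    for word in packed:
        for i in range(8):
            q = (word >> (4 * i)) & 0xF      # extract the i-th nibble
            out.append((q - zero) * scale)   # map back to a float weight
    return out
```

The real function performs the analogous reconstruction on the GPU and additionally handles group indices (GPTQ's `g_idx`) and per-group parameters stored on the `MPQWeightParameter` object.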