bitorch_engine.functions.cuda.functions.q4_unpack_tensor

bitorch_engine.functions.cuda.functions.q4_unpack_tensor(input: Tensor, is_transpose: bool = False) → Tensor

Unpacks a tensor that has been previously packed using 4-bit quantization into its original format.

This function is designed to work with tensors that have been quantized and packed, reducing their bit representation from a standard format (int32) down to 4-bit values, with two values packed into a single int8. This unpacking function reverses that process, reconstructing the original quantized values as a new tensor.
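The pack/unpack round trip can be sketched in plain Python. Note that the nibble order (low nibble first) and the unsigned 0–15 value range used here are assumptions for illustration; the actual q4_pack/q4_unpack CUDA kernels may order nibbles differently or use a signed encoding.

```python
def q4_pack(values):
    """Pack pairs of 4-bit values (0..15) into single bytes.

    Illustrative sketch only -- the real CUDA kernel may use a
    different nibble order or a signed 4-bit encoding.
    """
    assert len(values) % 2 == 0, "need an even number of 4-bit values"
    packed = []
    for lo, hi in zip(values[0::2], values[1::2]):
        # Store one value in the low nibble, the other in the high nibble.
        packed.append(((hi & 0x0F) << 4) | (lo & 0x0F))
    return packed


def q4_unpack(packed):
    """Reverse q4_pack: split each byte back into two 4-bit values."""
    values = []
    for b in packed:
        values.append(b & 0x0F)         # low nibble
        values.append((b >> 4) & 0x0F)  # high nibble
    return values


# Round trip: unpacking recovers the original quantized values.
vals = [3, 12, 0, 15, 7, 1]
assert q4_unpack(q4_pack(vals)) == vals
```

The storage saving is the point of the scheme: six 4-bit values above occupy three int8 bytes instead of six int32 words.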

Parameters:
  • input (torch.Tensor) – The input tensor that contains packed 4-bit quantized values. It must be of dtype int8.

  • is_transpose (bool, optional) – Indicates whether the unpacked tensor should be transposed. The default is False, meaning no transposition will occur.

Returns:

A tensor containing the unpacked quantized values. The dtype of this tensor will depend on the implementation of the q4_unpack function in the functions_cuda module, typically returning values in a format suitable for further processing or analysis.

Return type:

torch.Tensor

Raises:

AssertionError – Raised if the input tensor’s dtype is not int8, ensuring the unpacking process is applied only to a correctly formatted tensor.