bitorch_engine.layers.qlinear.nbit.cuda.utils.make_group_map

bitorch_engine.layers.qlinear.nbit.cuda.utils.make_group_map(q_groups: Tensor, num_qrows: int) → Tensor

Creates a mapping of quantization groups for handling irregular group sizes in quantized models.

This function generates a tensor that maps quantized rows to their groups. Groups may differ in size as a result of the quantization process, so the mapping is used to organize or access quantized weights or parameters by group assignment.

Parameters:

q_groups (torch.Tensor) – A tensor describing the quantization groups. It is expected to hold one pair of values per group: the group's ‘bits’ and its ‘start index’.
num_qrows (int) – The total number of quantization rows, representing the overall size of the quantization dimension.

Returns:

A tensor of short integers representing the group mapping. Each group is represented by its index followed by the inverse row index within the group.

Return type:

torch.Tensor

Example

Given a q_groups tensor of (‘bits’, ‘start index’) pairs and num_qrows, the total number of quantization rows, this function computes the group mapping used to access or organize the quantized parameters.
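
The description above can be made concrete with a small sketch. This is not the library's implementation: make_group_map_sketch is a hypothetical stand-in that works on plain Python lists instead of torch tensors, and it rests on several assumptions this page does not confirm. It assumes q_groups is laid out flat as [bits_0, start_0, bits_1, start_1, ...], that start indices count packed 32-bit rows (so a group spanning qrows packed rows holds qrows * 32 // bits unpacked rows), and that the "inverse row index" counts down from the group's row count to 1.

```python
from typing import List


def make_group_map_sketch(q_groups: List[int], num_qrows: int) -> List[int]:
    """Hypothetical, list-based illustration of a group map.

    ASSUMPTIONS (not confirmed by the documentation):
      * q_groups is flat: [bits_0, start_0, bits_1, start_1, ...]
      * start indices count packed 32-bit rows
      * the inverse row index counts down from the row count to 1
    """
    num_groups = len(q_groups) // 2
    group_map: List[int] = []
    for g in range(num_groups):
        bits = q_groups[2 * g]
        start = q_groups[2 * g + 1]
        # A group ends where the next one starts; the last ends at num_qrows.
        end = q_groups[2 * g + 3] if g < num_groups - 1 else num_qrows
        # Assumed conversion from packed 32-bit rows to unpacked rows.
        rows = (end - start) * 32 // bits
        for j in range(rows):
            group_map.append(g)         # group index
            group_map.append(rows - j)  # rows remaining in this group
    return group_map


# Two groups: 4-bit starting at packed row 0, 8-bit starting at packed row 1.
example = make_group_map_sketch([4, 0, 8, 1], num_qrows=3)
```

Under these assumptions each unpacked row contributes two entries, so the result has twice as many elements as there are rows. The real function takes and returns torch tensors, with the result stored as short integers, so the values would need to fit in 16 bits.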