bitorch_engine.layers.qlinear.nbit.cuda.utils.make_group_map
- bitorch_engine.layers.qlinear.nbit.cuda.utils.make_group_map(q_groups: Tensor, num_qrows: int) → Tensor
Creates a mapping of quantization groups for handling irregular group sizes in quantized models.
This function generates a tensor representing the mapping of groups, where each group might have a different size due to the quantization process. The mapping is used to organize or access quantized weights or parameters based on their group assignment.
- Parameters:
q_groups (torch.Tensor) – A tensor containing information about the quantization groups. It is expected to hold pairs of values, where each pair consists of ‘bits’ and ‘start index’ for each group.
num_qrows (int) – The total number of quantization rows, representing the overall size of the quantization dimension.
- Returns:
A tensor of short integers (torch.short) representing the group mapping. Each row contributes a pair of values: the index of the group it belongs to, followed by the row's inverse index within that group.
- Return type:
torch.Tensor
Example
Given a q_groups tensor holding the ‘bits’ and ‘start index’ of each group, and num_qrows giving the total number of quantization rows, this function computes the group mapping needed to access or organize the quantized parameters.
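The description above can be illustrated with a minimal, dependency-free sketch. It is not the library's implementation: the name make_group_map_sketch is hypothetical, plain Python lists stand in for the torch tensors, and two details are assumptions not stated in this documentation: that each quantization row packs 32 bits of data (so a group spanning n packed rows at b bits covers n * 32 // b unpacked rows), and that the inverse row index counts down from the group's row count to 1.

```python
def make_group_map_sketch(q_groups, num_qrows):
    """Illustrative stand-in for make_group_map using plain lists.

    q_groups is a flat list of (bits, start index) pairs, one pair per group.
    Returns a flat list of (group index, inverse row index) pairs, one per row.
    """
    num_groups = len(q_groups) // 2
    group_map = []
    for i in range(num_groups):
        bits = q_groups[2 * i]
        start = q_groups[2 * i + 1]
        # A group ends where the next one starts; the last group ends at num_qrows.
        end = q_groups[2 * i + 3] if i < num_groups - 1 else num_qrows
        # ASSUMPTION: each packed row stores 32 bits, so it holds 32 // bits values.
        rows = (end - start) * 32 // bits
        for j in range(rows):
            # ASSUMPTION: the inverse row index counts down from rows to 1.
            group_map.extend([i, rows - j])
    return group_map


# Two groups of different size: a 4-bit group starting at packed row 0
# and a 2-bit group starting at packed row 1, with 2 packed rows in total.
example = make_group_map_sketch([4, 0, 2, 1], num_qrows=2)
print(example[:4])     # first two (group index, inverse row index) pairs
print(example[16:18])  # first pair of the second group
```

Under these assumptions the first group covers 8 rows and the second covers 16, which shows how groups with different bit widths produce irregular group sizes. The real function takes q_groups as a torch.Tensor and returns a tensor of short integers.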