bitorch_engine.utils.model_helper.pack_bie_layers

bitorch_engine.utils.model_helper.pack_bie_layers(model: Module, qweight_only: bool = True, layers=None) → None

Packs the weights of quantization layers in a given model to prepare for efficient storage. This function should be invoked prior to using torch.save() for saving the model, ensuring that the quantized weights are properly compressed.

Parameters:
  • model – The model whose quantization layers’ weights are to be packed. This model should already be trained and contain quantization layers that support weight packing.

  • qweight_only – A boolean flag indicating whether only the quantized weights should be packed. If True, only the weights are packed, excluding other parameters such as biases. Defaults to True.

  • layers – A list of layer classes that should be considered for packing. If not provided, defaults to a predefined list of binary and n-bit quantized convolutional and linear layer base classes. This allows customization of which layers are packed based on the model architecture.

Note

The function iterates through all sub-modules of the provided model, checking whether each module matches one of the types specified in the layers list. For each matching module, it calls that module's generate_quantized_weight method with the qweight_only parameter, which performs the actual weight packing.
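
Example

A minimal usage sketch (hypothetical; it assumes model is an already trained torch.nn.Module containing bitorch_engine quantization layers, and the file path is arbitrary):

    import torch
    from bitorch_engine.utils.model_helper import pack_bie_layers

    # `model` is assumed to be a trained model that contains
    # bitorch_engine quantization layers (binary or n-bit linear/conv layers).
    # Pack the quantized weights in place so they are stored in compressed form.
    pack_bie_layers(model, qweight_only=True)

    # Save the model with the packed weights. A corresponding unpacking step is
    # typically required after loading before the weights can be used again.
    torch.save(model.state_dict(), "quantized_model.pt")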