bitorch_engine.utils.model_helper
Functions
Post-processes the output tensor of a binary matrix multiplication operation.

Flattens a 3D tensor into a 2D tensor by combining the first two dimensions.

Initializes binary parameters using pre-trained weights if available.

Loads a checkpoint into a given model.

Packs the weights of quantization layers in a given model to prepare for efficient storage.

Pads the embedding dimension of a PyTorch tensor "weight" (the embedding matrix) to the smallest multiple of 8 that is greater than or equal to its current size.

Pads the last two dimensions of a PyTorch tensor to the nearest multiple of 128.

Prepares binary and n-bit quantized layers within a given model for training or inference.

Defines how to update quantized weights with quantized gradients.

Saves the state of a quantized PyTorch model in a bit-packed format.

Unflattens a 2D tensor back into a 3D tensor using the original shape, reversing the operation performed by flatten_x.

Updates the zeros attribute of the qweight object based on its layer type.
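
The flattening and padding helpers summarized above can be illustrated with a minimal sketch in plain PyTorch. The function names below (`flatten_3d`, `unflatten_2d`, `pad_last_dims`) are hypothetical stand-ins written for this example, not the signatures of the functions in this module. The sketch also assumes that padding uses zeros and that "nearest multiple" means rounding up; neither is stated in the summaries.

```python
import torch
import torch.nn.functional as F


def flatten_3d(x: torch.Tensor):
    """Hypothetical helper: merge the first two dimensions of a 3D tensor.

    Returns the 2D tensor together with the original shape so the
    operation can be reversed later.
    """
    original_shape = x.shape
    return x.reshape(-1, original_shape[-1]), original_shape


def unflatten_2d(x: torch.Tensor, original_shape) -> torch.Tensor:
    """Hypothetical helper: restore the two leading dimensions.

    The last dimension is inferred, so it may differ from the original
    (for example after a matrix multiplication changed the feature size).
    """
    return x.reshape(original_shape[0], original_shape[1], -1)


def pad_last_dims(x: torch.Tensor, multiple: int, num_dims: int = 1) -> torch.Tensor:
    """Hypothetical helper: pad the trailing `num_dims` dimensions.

    Each padded dimension is rounded up to the next multiple of
    `multiple`. Zero padding is an assumption of this sketch.
    """
    pad = []
    # F.pad expects (left, right) pairs starting from the last dimension.
    for size in reversed(x.shape[-num_dims:]):
        pad.extend([0, (-size) % multiple])
    return F.pad(x, pad)


# Round trip: (2, 3, 4) -> (6, 4) -> (2, 3, 4)
x = torch.arange(24.0).reshape(2, 3, 4)
flat, shape = flatten_3d(x)
restored = unflatten_2d(flat, shape)

# Embedding dimension 10 rounded up to a multiple of 8
emb = pad_last_dims(torch.ones(5, 10), multiple=8)

# Last two dimensions (100, 130) rounded up to multiples of 128
w = pad_last_dims(torch.ones(100, 130), multiple=128, num_dims=2)
```

Returning the original shape from the flattening step keeps the round trip explicit, which mirrors how the unflatten summary describes using "the original shape". A dimension that is already a multiple of the target size receives no padding, since `(-size) % multiple` evaluates to 0 in that case.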