bitorch_engine.layers.qlinear.nbit.cutlass.q4_layer

Classes

Q4LinearCutlass(*args, **kwargs)

A quantized linear layer with 4-bit weights, implemented on top of the CUTLASS library.
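
As a rough illustration of what 4-bit quantization means for a layer's weights, the sketch below rounds floats onto a signed 4-bit grid and packs two values per byte. It is written in plain Python and is not the CUTLASS-backed implementation; the helper names (`quantize_4bit`, `pack_nibbles`, `unpack_nibbles`), the symmetric scheme, the single shared scale, and the low-nibble-first layout are all assumptions made for the example.

```python
def quantize_4bit(values, scale):
    """Map floats to signed 4-bit integers in [-8, 7] using one shared scale."""
    return [max(-8, min(7, round(v / scale))) for v in values]


def pack_nibbles(q):
    """Pack pairs of signed 4-bit integers into bytes, low nibble first."""
    if len(q) % 2:
        q = q + [0]  # pad to an even count so every byte holds two values
    return bytes((q[i] & 0xF) | ((q[i + 1] & 0xF) << 4) for i in range(0, len(q), 2))


def unpack_nibbles(packed, count):
    """Recover the signed 4-bit integers from the packed bytes."""
    out = []
    for b in packed:
        for nib in (b & 0xF, b >> 4):
            out.append(nib - 16 if nib >= 8 else nib)  # restore the sign
    return out[:count]


# Hypothetical weights and scale, chosen only for the example.
weights = [0.5, -1.0, 0.26, 1.9]
scale = 0.25
q = quantize_4bit(weights, scale)      # 1.9 / 0.25 rounds to 8 and is clamped to 7
packed = pack_nibbles(q)               # four weights stored in two bytes
restored = unpack_nibbles(packed, len(q))
dequantized = [v * scale for v in restored]
```

Packing halves the storage relative to 8-bit weights, at the cost of rounding and clamping error (here 1.9 is stored as 1.75).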

Q4LinearFunction(*args, **kwargs)

Implements a custom linear function with quantization for forward and backward passes.
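
The forward and backward computations of a quantized linear function can be sketched in plain Python as follows. This is only an illustration under stated assumptions: it uses fake quantization (round to a 4-bit grid, then map back to floats) and passes the weight gradient straight through the rounding step, neither of which is confirmed to be what Q4LinearFunction does. The function names are hypothetical.

```python
def fake_quantize(w, scale):
    """Round each weight onto a signed 4-bit grid and map it back to a float."""
    return [[max(-8, min(7, round(v / scale))) * scale for v in row] for row in w]


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def linear_forward(x, w, scale):
    """y = x @ Wq^T, with Wq the fake-quantized weight of shape (out, in)."""
    wq = fake_quantize(w, scale)
    return matmul(x, transpose(wq)), wq


def linear_backward(grad_y, x, wq):
    """Gradients of y = x @ Wq^T with respect to x and the weight.

    Assumption: the weight gradient ignores the rounding step
    (a straight-through estimator).
    """
    grad_x = matmul(grad_y, wq)            # shape (batch, in)
    grad_w = matmul(transpose(grad_y), x)  # shape (out, in)
    return grad_x, grad_w


# Hypothetical input, weight, and upstream gradient for the example.
x = [[1.0, 2.0]]
w = [[0.3, -0.6], [1.0, 0.1]]
y, wq = linear_forward(x, w, scale=0.5)
grad_x, grad_w = linear_backward([[1.0, 1.0]], x, wq)
```

In PyTorch the same two steps would typically live in the `forward` and `backward` static methods of a `torch.autograd.Function` subclass.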

Q4MatMul([dtype, device])

A custom PyTorch module that performs 4-bit quantized matrix multiplication.
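
To make the idea of quantized matrix multiplication concrete, the sketch below quantizes both operands to signed 4-bit integers, accumulates the product in integers, and rescales the result once at the end. It is a plain-Python illustration, not the module's GPU kernel; the helper names and the per-tensor symmetric scales are assumptions.

```python
def quantize(m, scale):
    """Map a float matrix to signed 4-bit integers in [-8, 7]."""
    return [[max(-8, min(7, round(v / scale))) for v in row] for row in m]


def int_matmul(a, b):
    """Integer matrix product; the accumulator is wider than 4 bits."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def q4_matmul(a, b, scale_a, scale_b):
    """Approximate a @ b using 4-bit operands and one final rescale."""
    qa, qb = quantize(a, scale_a), quantize(b, scale_b)
    acc = int_matmul(qa, qb)
    return [[v * scale_a * scale_b for v in row] for row in acc]


# Hypothetical operands that happen to lie exactly on the 4-bit grids,
# so the quantized product matches the float product in this case.
a = [[1.0, -0.5], [0.25, 1.5]]
b = [[0.5, 1.0], [1.5, -1.0]]
out = q4_matmul(a, b, scale_a=0.25, scale_b=0.5)
```

For general inputs the result only approximates the float product, with error that depends on how well the chosen scales cover each operand's range.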

Q4MatMulFunction(*args, **kwargs)

A custom autograd function for 4-bit quantized matrix multiplication (MatMul).