bitorch_engine.layers.qlinear.nbit.cutlass.q8_layer

Classes

Q8LinearCutlass(*args, **kwargs)

Implements an 8-bit quantized linear layer using CUTLASS for efficient computation.

Q8LinearFunction(*args, **kwargs)

Implements a quantized linear function using 8-bit quantization for both activations and weights.
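
To make "8-bit quantization for both activations and weights" concrete, the following is a minimal pure-Python sketch of the general technique. It is not the bitorch_engine implementation and does not use CUTLASS. The helper names `quantize_int8` and `q8_linear` are hypothetical, and symmetric per-tensor scaling is an assumption chosen for simplicity; the actual layer may use a different quantization scheme.

```python
# Illustrative sketch only: hypothetical helpers, not the bitorch_engine API.
# Assumes symmetric per-tensor quantization to the range [-127, 127].

def quantize_int8(values):
    """Quantize a flat list of floats to integers in [-127, 127].

    Returns the quantized integers and the scale needed to dequantize them.
    """
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale


def q8_linear(x, weight):
    """Compute x @ weight^T with 8-bit activations and weights.

    x:      list of rows, shape [batch][in_features]
    weight: list of rows, shape [out_features][in_features]
    Returns a list of rows, shape [batch][out_features], as floats.
    """
    n_in = len(weight[0])

    # Quantize activations and weights, each with a single per-tensor scale.
    qx_flat, sx = quantize_int8([v for row in x for v in row])
    qw_flat, sw = quantize_int8([v for row in weight for v in row])
    qx = [qx_flat[i:i + n_in] for i in range(0, len(qx_flat), n_in)]
    qw = [qw_flat[i:i + n_in] for i in range(0, len(qw_flat), n_in)]

    # Accumulate in integers, then rescale once by the product of both scales.
    return [
        [sum(a * b for a, b in zip(x_row, w_row)) * sx * sw for w_row in qw]
        for x_row in qx
    ]


x = [[1.0, -2.0, 0.5]]
weight = [[0.5, 0.25, -1.0], [1.0, 1.0, 1.0]]
y = q8_linear(x, weight)  # approximates the float result [[-0.5, -0.5]]
```

The matrix product is carried out entirely on integers, and the two scales are applied only once at the end. This is the property that lets an optimized integer GEMM kernel do the heavy lifting in practice; the result differs from the float computation by a small quantization error.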