bitorch_engine.utils.convert.quantize_linear_with_q4_linear_cutlass

bitorch_engine.utils.convert.quantize_linear_with_q4_linear_cutlass(module: Module, names_to_replace: Iterable[str], parent_name: str = '')

Replace all layers contained in names_to_replace within the given module with Q4LinearCutlass layers.

:param module: the module which contains Linear layers
:param names_to_replace: the list of layer names to be replaced
:param parent_name: the name of the parent (usually empty when called directly)
:return: the list of names of layers which were replaced
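
The interplay between names_to_replace, parent_name, and the returned list can be illustrated with a small, library-free sketch. This is not the bitorch_engine implementation: ToyModule, ToyLinear, ToyQ4Linear, and replace_named_linears are hypothetical stand-ins, and the assumption that layer names are dotted paths built up from parent_name is a guess based on the parameter description above.

```python
from typing import Dict, Iterable, List


class ToyModule:
    """Hypothetical stand-in for torch.nn.Module: a dict of named children."""

    def __init__(self, **children: "ToyModule") -> None:
        self.children: Dict[str, "ToyModule"] = dict(children)


class ToyLinear(ToyModule):
    """Hypothetical stand-in for torch.nn.Linear."""


class ToyQ4Linear(ToyModule):
    """Hypothetical stand-in for Q4LinearCutlass; keeps the layer it replaced."""

    def __init__(self, source: ToyLinear) -> None:
        super().__init__()
        self.source = source


def replace_named_linears(
    module: ToyModule, names_to_replace: Iterable[str], parent_name: str = ""
) -> List[str]:
    """Swap matching ToyLinear children in place and return their names."""
    wanted = set(names_to_replace)
    replaced: List[str] = []
    # Iterate over a snapshot because children are reassigned during the loop.
    for name, child in list(module.children.items()):
        # Assumption: names are dotted paths relative to the top-level module.
        full_name = f"{parent_name}.{name}" if parent_name else name
        if isinstance(child, ToyLinear) and full_name in wanted:
            module.children[name] = ToyQ4Linear(child)
            replaced.append(full_name)
        else:
            # Recurse, passing the current path down as the new parent name.
            replaced.extend(replace_named_linears(child, wanted, full_name))
    return replaced


model = ToyModule(
    encoder=ToyModule(fc1=ToyLinear(), fc2=ToyLinear()),
    head=ToyLinear(),
)
replaced = replace_named_linears(model, ["encoder.fc1", "head"])
print(replaced)  # ['encoder.fc1', 'head']
```

In this sketch the top-level call leaves parent_name empty, and only the recursive calls fill it in, which matches the note that the argument is usually empty when the function is called directly. Layers not listed in names_to_replace (here encoder.fc2) are left untouched.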