vllm.model_executor.layers.quantization.kernels.scaled_mm.ScaledMMLinearKernel
ScaledMMLinearKernel
Bases: ABC
Source code in vllm/model_executor/layers/quantization/kernels/scaled_mm/ScaledMMLinearKernel.py
__init__
__init__(
    c: ScaledMMLinearLayerConfig,
    w_q_param_name: str,
    w_s_param_name: str,
    i_s_param_name: str,
    i_zp_param_name: str,
    azp_adj_param_name: str,
) -> None
_get_weight_params
_get_weight_params(
    layer: Module,
) -> tuple[
    Tensor,
    Tensor,
    Optional[Tensor],
    Optional[Tensor],
    Optional[Tensor],
]
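The constructor takes five parameter names and _get_weight_params returns a five-element tuple, which suggests a name-based lookup on the layer. The sketch below illustrates that pattern under stated assumptions: the names are stored on the kernel and resolved with getattr, the first two entries are required, and the last three may be absent. ParamLookupSketch, its attribute names, and the sample layer are hypothetical stand-ins, plain Python values replace tensors, and this is not taken from the vLLM source.

```python
from types import SimpleNamespace
from typing import Any, Optional


class ParamLookupSketch:
    """Hypothetical stand-in for the name-based lookup; not the vLLM class."""

    def __init__(
        self,
        w_q_param_name: str,
        w_s_param_name: str,
        i_s_param_name: str,
        i_zp_param_name: str,
        azp_adj_param_name: str,
    ) -> None:
        # Assumption: the kernel only remembers attribute names, not tensors.
        self.w_q_name = w_q_param_name
        self.w_s_name = w_s_param_name
        self.i_s_name = i_s_param_name
        self.i_zp_name = i_zp_param_name
        self.azp_adj_name = azp_adj_param_name

    def get_weight_params(
        self, layer: Any
    ) -> tuple[Any, Any, Optional[Any], Optional[Any], Optional[Any]]:
        # Required entries raise AttributeError if missing;
        # optional entries fall back to None.
        return (
            getattr(layer, self.w_q_name),
            getattr(layer, self.w_s_name),
            getattr(layer, self.i_s_name, None),
            getattr(layer, self.i_zp_name, None),
            getattr(layer, self.azp_adj_name, None),
        )


# A fake "layer" that carries quantized weights and a weight scale only.
layer = SimpleNamespace(weight=[1, 2, 3], weight_scale=0.5)
kernel = ParamLookupSketch(
    "weight", "weight_scale", "input_scale", "input_zero_point", "azp_adj"
)
params = kernel.get_weight_params(layer)
```

Storing names rather than tensors would let one kernel object serve layers whose parameters are replaced after loading, since the lookup happens at call time.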
apply_weights (abstractmethod)
can_implement (abstractmethod, classmethod)
can_implement(
    c: ScaledMMLinearLayerConfig,
) -> tuple[bool, Optional[str]]
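The tuple[bool, Optional[str]] return type reads naturally as "supported?" plus an optional failure reason. The sketch below shows one way a caller might use such a contract to pick the first workable kernel from a list of candidates. Everything here is a hypothetical stand-in: ConfigSketch and its single field, the two kernel classes, and choose_kernel are assumptions for illustration, not vLLM's API or selection logic.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Optional


@dataclass
class ConfigSketch:
    """Hypothetical stand-in for ScaledMMLinearLayerConfig; the field is assumed."""
    is_static_input_scheme: bool


class KernelSketch(ABC):
    """Hypothetical base class mirroring only the can_implement contract."""

    @classmethod
    @abstractmethod
    def can_implement(cls, c: ConfigSketch) -> tuple[bool, Optional[str]]:
        """Return (True, None) if supported, else (False, reason)."""


class StaticOnlyKernel(KernelSketch):
    @classmethod
    def can_implement(cls, c: ConfigSketch) -> tuple[bool, Optional[str]]:
        if not c.is_static_input_scheme:
            return False, "requires static input quantization"
        return True, None


class FallbackKernel(KernelSketch):
    @classmethod
    def can_implement(cls, c: ConfigSketch) -> tuple[bool, Optional[str]]:
        return True, None


def choose_kernel(
    c: ConfigSketch, candidates: list[type[KernelSketch]]
) -> type[KernelSketch]:
    """Return the first candidate that accepts the config, collecting reasons."""
    failures = []
    for kernel_cls in candidates:
        ok, reason = kernel_cls.can_implement(c)
        if ok:
            return kernel_cls
        failures.append(f"{kernel_cls.__name__}: {reason}")
    raise ValueError("no kernel can implement this config: " + "; ".join(failures))


chosen_static = choose_kernel(ConfigSketch(True), [StaticOnlyKernel, FallbackKernel])
chosen_dynamic = choose_kernel(ConfigSketch(False), [StaticOnlyKernel, FallbackKernel])
```

Because can_implement is a classmethod, the check runs before any kernel instance exists, so an unsupported configuration can be rejected without constructing a kernel.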