vllm.model_executor.layers.quantization.utils.machete_utils
check_machete_supports_shape
Source code in vllm/model_executor/layers/quantization/utils/machete_utils.py
query_machete_supported_act_types
query_machete_supported_act_types(
zero_points: bool,
) -> list[ScalarType]
query_machete_supported_group_sizes
query_machete_supported_group_sizes(
act_type: dtype,
) -> list[int]
Queries the supported group sizes for Machete based on the activation type.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
act_type | dtype | The activation data type (torch.float16, torch.bfloat16). | required |
Returns:
Type | Description |
---|---|
list[int] | A list of supported group sizes. The group size must be divisible by …; -1 indicates per-channel quantization. |
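The following is a minimal sketch of how a caller might consume the returned list, in particular the -1 per-channel sentinel described above. It does not import vLLM: `validate_group_size` is a hypothetical helper, not part of vLLM, and `example_supported` is a placeholder standing in for the return value of `query_machete_supported_group_sizes(act_type)`. The actual values depend on the vLLM version and the activation type.

```python
from typing import Sequence

PER_CHANNEL = -1  # sentinel described above: -1 means per-channel quantization


def validate_group_size(group_size: int, supported: Sequence[int]) -> str:
    """Hypothetical helper: check a group size against a supported list.

    Returns a short description of the quantization granularity,
    or raises ValueError if the group size is not in the list.
    """
    if group_size not in supported:
        raise ValueError(
            f"group_size={group_size} not supported; "
            f"choose one of {list(supported)}"
        )
    if group_size == PER_CHANNEL:
        return "per-channel"
    return f"groups of {group_size}"


# Placeholder for query_machete_supported_group_sizes(act_type);
# these values are assumed for illustration only.
example_supported = [-1, 128]

print(validate_group_size(-1, example_supported))
print(validate_group_size(128, example_supported))
```

Checking membership before use keeps an unsupported group size from reaching kernel dispatch, where the failure would likely be harder to diagnose.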
query_machete_supported_quant_types
query_machete_supported_quant_types(
zero_points: bool,
) -> list[ScalarType]