vllm.model_executor.layers.quantization.utils.mxfp4_utils
_can_support_mxfp4
_can_support_mxfp4(
    use_grouped_topk: bool = False,
    topk_group: Optional[int] = None,
    num_expert_group: Optional[int] = None,
    expert_map: Optional[Tensor] = None,
    custom_routing_function: Optional[Callable] = None,
    e_score_correction_bias: Optional[Tensor] = None,
    apply_router_weight_on_input: bool = False,
    scoring_func: str = "softmax",
    activation: str = "swigluoai",
    expert_load_view: Optional[Tensor] = None,
    logical_to_physical_map: Optional[Tensor] = None,
    logical_replica_count: Optional[Tensor] = None,
)
Source code in vllm/model_executor/layers/quantization/utils/mxfp4_utils.py
_dequant_mxfp4
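For orientation, the sketch below shows what dequantizing MXFP4 data means in general terms. It is a NumPy reference written for illustration only and is not the body of _dequant_mxfp4. It assumes the OCP Microscaling layout: 4-bit E2M1 elements, one shared E8M0 scale (power of two, bias 127) per block of 32 elements, and two elements packed per byte. The low-nibble-first packing order and the name dequant_mxfp4_reference are assumptions made for this example.

```python
import numpy as np

# All 16 E2M1 codes: bit 3 is the sign, bits 0-2 select the magnitude.
E2M1_VALUES = np.array(
    [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0,
     -0.0, -0.5, -1.0, -1.5, -2.0, -3.0, -4.0, -6.0],
    dtype=np.float32,
)
BLOCK = 32  # elements sharing one scale in the MX format


def dequant_mxfp4_reference(packed, scales):
    """Hypothetical reference dequantizer (not the vLLM implementation).

    packed: uint8 array of shape (..., K // 2), two 4-bit codes per byte.
    scales: uint8 array of shape (..., K // 32), E8M0 exponents (bias 127).
    Returns a float32 array of shape (..., K).
    """
    packed = np.asarray(packed, dtype=np.uint8)
    scales = np.asarray(scales, dtype=np.uint8)

    # Assumption: the low nibble holds the earlier element of each pair.
    lo = packed & 0x0F
    hi = packed >> 4
    codes = np.stack([lo, hi], axis=-1).reshape(*packed.shape[:-1], -1)

    values = E2M1_VALUES[codes]  # decode each 4-bit code to a float

    # E8M0 scale is 2 ** (exponent - 127); the NaN encoding (255) is not handled.
    scale = np.exp2((scales.astype(np.int32) - 127).astype(np.float32))

    blocks = values.reshape(*values.shape[:-1], -1, BLOCK)
    return (blocks * scale[..., None]).reshape(values.shape)


# One block of 32 elements: every byte 0x72 packs codes 2 (1.0) and 7 (6.0).
packed = np.full(16, 0x72, dtype=np.uint8)
scales = np.array([128], dtype=np.uint8)  # 2 ** (128 - 127) = 2
out = dequant_mxfp4_reference(packed, scales)
```

With a scale exponent of 128 every decoded element is doubled, so the output alternates between 2.0 and 12.0.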
_dequant_mxfp4_fake
_quant_dequant_mxfp4
_quant_dequant_mxfp4_fake
_swizzle_mxfp4
Swizzles weights for MXFP4 MoE layers; used by the OAI MXFP4 kernel.