vllm.model_executor.models.qwen2_rm
Inference-only Qwen2-RM model compatible with HuggingFace weights.
Qwen2ForProcessRewardModel ¶
Bases: Qwen2RewardBaseModel
Source code in vllm/model_executor/models/qwen2_rm.py
__init__ ¶
__init__(*, vllm_config: VllmConfig, prefix: str = '')
Qwen2ForRewardModel ¶
Bases: Qwen2RewardBaseModel
__init__ ¶
__init__(*, vllm_config: VllmConfig, prefix: str = '')
Qwen2RewardBaseModel ¶
Bases: Module, SupportsLoRA, SupportsPP
make_empty_intermediate_tensors instance-attribute ¶
model instance-attribute ¶
model = Qwen2Model(
vllm_config=vllm_config,
prefix=maybe_prefix(prefix, "model"),
)
packed_modules_mapping class-attribute instance-attribute ¶
packed_modules_mapping = {
"qkv_proj": ["q_proj", "k_proj", "v_proj"],
"gate_up_proj": ["gate_proj", "up_proj"],
}
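The mapping above tells the weight loader which separate checkpoint tensors are packed into a single fused parameter at runtime. A hypothetical sketch of that name routing (plain Python, not vLLM's actual loader code; `fused_target` is an illustrative helper):

```python
# Illustrative only: shows how a packed-modules mapping routes separate
# checkpoint weight names (q_proj/k_proj/v_proj) to one fused parameter
# (qkv_proj) plus a shard index for where the tensor lands inside it.
packed_modules_mapping = {
    "qkv_proj": ["q_proj", "k_proj", "v_proj"],
    "gate_up_proj": ["gate_proj", "up_proj"],
}

def fused_target(checkpoint_name: str):
    """Return (fused parameter name, shard index) for a checkpoint weight
    name, or None if the weight is not part of a packed module."""
    for fused, shards in packed_modules_mapping.items():
        for shard_id, shard in enumerate(shards):
            if shard in checkpoint_name:
                return checkpoint_name.replace(shard, fused), shard_id
    return None

print(fused_target("model.layers.0.self_attn.k_proj.weight"))
# -> ('model.layers.0.self_attn.qkv_proj.weight', 1)
```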
score instance-attribute ¶
score = Sequential(
ColumnParallelLinear(
hidden_size,
hidden_size,
quant_config=quant_config,
return_bias=False,
),
ReLU(),
RowParallelLinear(
hidden_size,
num_labels,
quant_config=quant_config,
return_bias=False,
),
)
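Functionally, this score head is a two-layer MLP that maps each token's hidden state to `num_labels` reward logits. A minimal single-GPU numpy sketch of that computation (random weights and shapes are illustrative; the real layers are tensor-parallel and may be quantized):

```python
import numpy as np

# Minimal sketch of the score head: Linear -> ReLU -> Linear.
# hidden_size/num_labels values and weight init are illustrative only.
hidden_size, num_labels = 8, 1
rng = np.random.default_rng(0)
w1 = rng.standard_normal((hidden_size, hidden_size)) * 0.1  # ColumnParallelLinear (single-GPU view)
w2 = rng.standard_normal((hidden_size, num_labels)) * 0.1   # RowParallelLinear (single-GPU view)

def score(hidden_states: np.ndarray) -> np.ndarray:
    """Map per-token hidden states to reward logits."""
    h = hidden_states @ w1
    h = np.maximum(h, 0.0)  # ReLU
    return h @ w2

tokens = rng.standard_normal((4, hidden_size))  # 4 token positions
print(score(tokens).shape)  # (4, 1): one reward logit per token
```

For `Qwen2ForRewardModel` the logits are typically pooled per sequence, while `Qwen2ForProcessRewardModel` keeps per-step scores; the head itself is the same shape of computation.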
__init__ ¶
__init__(*, vllm_config: VllmConfig, prefix: str = '')
forward ¶
forward(
input_ids: Tensor,
positions: Tensor,
intermediate_tensors: Optional[
IntermediateTensors
] = None,
inputs_embeds: Optional[Tensor] = None,
) -> Union[Tensor, IntermediateTensors]
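The return type reflects the SupportsPP contract: on the first pipeline rank `forward` starts from `input_ids` (or `inputs_embeds`), on later ranks it resumes from `intermediate_tensors`, and only the last rank returns final hidden states. A schematic pure-Python sketch of that control flow (all names and the `+ 1.0` layer stand-in are illustrative, not vLLM APIs):

```python
from dataclasses import dataclass

@dataclass
class IntermediateTensors:
    # Stand-in for vLLM's IntermediateTensors container.
    hidden_states: list

def forward(stage: int, num_stages: int,
            input_ids=None, intermediate_tensors=None):
    if intermediate_tensors is None:
        # First pipeline rank: start from token ids (embedding stand-in).
        hidden = [float(t) for t in input_ids]
    else:
        # Later ranks: resume from the previous rank's hidden states.
        hidden = intermediate_tensors.hidden_states
    hidden = [h + 1.0 for h in hidden]  # stand-in for this rank's layers
    if stage < num_stages - 1:
        # Non-final rank: hand hidden states to the next pipeline stage.
        return IntermediateTensors(hidden)
    return hidden  # final rank returns hidden states for the score head

out = forward(0, 2, input_ids=[1, 2, 3])      # rank 0 -> IntermediateTensors
final = forward(1, 2, intermediate_tensors=out)
print(final)  # [3.0, 4.0, 5.0]
```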