vllm.v1.sample.logits_processor.interface
BatchUpdate dataclass
Describes changes to the persistent batch state, for consumption by logits processors.
Source code in vllm/v1/sample/logits_processor/interface.py
__init__
__init__(
batch_size: int,
removed: Sequence[RemovedRequest],
moved: Sequence[MovedRequest],
added: Sequence[AddedRequest],
) -> None
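To make the four constructor fields concrete, here is a minimal stand-in that mirrors the signature above without importing vLLM. The class name `BatchUpdateSketch` is hypothetical, and the element shapes chosen for `RemovedRequest`, `MovedRequest`, and `AddedRequest` are assumptions made for illustration; the real types are defined in vLLM and may differ.

```python
from collections.abc import Sequence
from dataclasses import dataclass
from typing import Any

# Stand-in element types. These shapes are assumptions for illustration,
# not the definitions used by vLLM.
RemovedRequest = int              # assumed: index of a removed batch row
MovedRequest = tuple[int, int]    # assumed: (source row, destination row)
AddedRequest = tuple[int, Any]    # assumed: (row, per-request parameters)


@dataclass
class BatchUpdateSketch:
    """Hypothetical mirror of the BatchUpdate fields listed above."""
    batch_size: int
    removed: Sequence[RemovedRequest]
    moved: Sequence[MovedRequest]
    added: Sequence[AddedRequest]


# One step in which row 1 is removed, row 3 is moved into the freed slot,
# and a new request is placed at row 2.
update = BatchUpdateSketch(
    batch_size=3,
    removed=[1],
    moved=[(3, 1)],
    added=[(2, {"min_tokens": 4})],
)
```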
LogitsProcessor
Bases: ABC
__init__ abstractmethod
__init__(
vllm_config: VllmConfig,
device: device,
is_pin_memory: bool,
) -> None
apply abstractmethod
is_argmax_invariant abstractmethod
is_argmax_invariant() -> bool
Returns True if the logits processor has no impact on the argmax computation in greedy sampling. Note: the return value may differ between instances of the same LogitsProcessor subclass, depending on the subclass implementation.
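As an illustration of what argmax invariance means, the sketch below uses plain Python lists instead of tensors. Dividing every logit by a positive temperature leaves the greedy choice unchanged, while an additive bias can change it. The class `BiasSketch` and the helper `argmax` are hypothetical and are not part of vLLM; the class only shows how the answer could depend on instance state.

```python
def argmax(values: list[float]) -> int:
    """Index of the largest value (first one wins on ties)."""
    return max(range(len(values)), key=values.__getitem__)


logits = [1.0, 3.0, 2.5]

# Dividing by a positive temperature preserves the ordering of the logits.
scaled = [x / 0.7 for x in logits]

# Adding a per-token bias can promote a different token to the top.
biased = [x + b for x, b in zip(logits, [0.0, 0.0, 1.0])]


class BiasSketch:
    """Hypothetical processor whose invariance depends on its configuration."""

    def __init__(self, bias: list[float]) -> None:
        self.bias = bias

    def is_argmax_invariant(self) -> bool:
        # An all-zero bias cannot change which logit is largest.
        return all(b == 0.0 for b in self.bias)


same_choice = argmax(scaled) == argmax(logits)
changed_choice = argmax(biased) != argmax(logits)
```

Two instances of the same hypothetical class, `BiasSketch([0.0, 0.0, 0.0])` and `BiasSketch([0.0, 0.0, 1.0])`, return different values from `is_argmax_invariant()`, which is the situation the note above describes.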
update_state abstractmethod
update_state(batch_update: Optional[BatchUpdate]) -> None
Called when there are new output tokens, prior to each forward pass.
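A minimal sketch of how a subclass might keep per-row state in step with the batch follows. It runs without vLLM: `BatchUpdateSketch`, `LogitsProcessorSketch`, and `PerRowBias` are hypothetical stand-ins, the element shapes of `removed`, `moved`, and `added` are assumptions, and so are the order in which the three lists are applied and the reading of `None` as "batch unchanged".

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class BatchUpdateSketch:
    """Hypothetical stand-in for BatchUpdate."""
    batch_size: int
    removed: list = field(default_factory=list)  # assumed: row indices
    moved: list = field(default_factory=list)    # assumed: (src, dst) pairs
    added: list = field(default_factory=list)    # assumed: (row, bias) pairs


class LogitsProcessorSketch(ABC):
    """Hypothetical stand-in exposing only update_state."""

    @abstractmethod
    def update_state(self, batch_update: Optional[BatchUpdateSketch]) -> None:
        ...


class PerRowBias(LogitsProcessorSketch):
    """Tracks one bias value per batch row."""

    def __init__(self) -> None:
        self.bias_by_row: dict[int, float] = {}

    def update_state(self, batch_update: Optional[BatchUpdateSketch]) -> None:
        if batch_update is None:
            return  # assumed meaning: nothing changed since the last step
        for row in batch_update.removed:
            self.bias_by_row.pop(row, None)
        for src, dst in batch_update.moved:
            if src in self.bias_by_row:
                self.bias_by_row[dst] = self.bias_by_row.pop(src)
        for row, bias in batch_update.added:
            self.bias_by_row[row] = bias


proc = PerRowBias()
proc.update_state(BatchUpdateSketch(batch_size=2, added=[(0, 0.5), (1, -1.0)]))
proc.update_state(None)
proc.update_state(BatchUpdateSketch(batch_size=1, removed=[0], moved=[(1, 0)]))
```

After the three calls, the only remaining entry is the bias that started at row 1 and was moved to row 0.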