vllm.executor.multiproc_worker_utils
ProcessWorkerWrapper ¶
Local process wrapper for vllm.worker.Worker, used for single-node, multi-GPU tensor parallelism.
Source code in vllm/executor/multiproc_worker_utils.py
process instance-attribute ¶
process: BaseProcess = Process(
    target=_run_worker_process,
    name="VllmWorkerProcess",
    kwargs=dict(
        worker_factory=worker_factory,
        task_queue=_task_queue,
        result_queue=result_queue,
        vllm_config=vllm_config,
        rank=rank,
    ),
    daemon=True,
)
__init__ ¶
__init__(
    result_handler: ResultHandler,
    worker_factory: Callable[[VllmConfig, int], Any],
    vllm_config: VllmConfig,
    rank: int,
) -> None
_enqueue_task ¶
execute_method ¶
execute_method_async async ¶
kill_worker ¶
Result dataclass ¶
Result of a task dispatched to a worker.
ResultFuture ¶
Synchronous future for the non-async case.
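A synchronous future of this kind can be sketched with a `threading.Event`: the caller blocks until another thread stores a result, then either returns the value or re-raises the stored exception. The class names `SketchResult` and `SketchFuture`, their fields, and the `get` method are assumptions made for illustration; this page does not list the real attributes.

```python
import threading
import uuid
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class SketchResult:
    """Hypothetical result record: a value or an exception, keyed by task id."""
    task_id: uuid.UUID
    value: Any = None
    exception: Optional[BaseException] = None


class SketchFuture:
    """Hypothetical blocking future resolved from another thread."""

    def __init__(self) -> None:
        self._done = threading.Event()
        self._result: Optional[SketchResult] = None

    def set_result(self, result: SketchResult) -> None:
        # Store the result first, then wake up any thread blocked in get().
        self._result = result
        self._done.set()

    def get(self, timeout: Optional[float] = None) -> Any:
        # Block until set_result() has been called or the timeout expires.
        if not self._done.wait(timeout):
            raise TimeoutError("result not ready")
        assert self._result is not None
        if self._result.exception is not None:
            raise self._result.exception
        return self._result.value
```

Carrying the exception inside the result lets a failure in the worker surface in the thread that asked for the work, rather than being lost in the background.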
ResultHandler ¶
Bases: Thread
Handles results from all workers in a background thread.
__init__ ¶
close ¶
run ¶
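The general pattern behind a result-handling thread can be sketched as follows: one thread drains a shared result queue and hands each result to whoever is waiting on that task id. This is a minimal in-process sketch that uses `queue.Queue` instead of a multiprocessing queue; `SketchResultHandler`, the `pending` mapping, and the `_STOP` sentinel are hypothetical names, not the library's API.

```python
import queue
import threading

_STOP = object()  # hypothetical sentinel that tells the thread to exit


class SketchResultHandler(threading.Thread):
    """Hypothetical handler: routes (task_id, value) pairs to callbacks."""

    def __init__(self) -> None:
        super().__init__(daemon=True)
        self.result_queue: queue.Queue = queue.Queue()
        self.pending = {}  # task_id -> callable that receives the value

    def run(self) -> None:
        while True:
            item = self.result_queue.get()
            if item is _STOP:
                break
            task_id, value = item
            # Remove the entry so each task is resolved exactly once.
            callback = self.pending.pop(task_id)
            callback(value)

    def close(self) -> None:
        # Ask the loop to stop after the items already queued are handled.
        self.result_queue.put(_STOP)
```

Funnelling every worker's results through one queue means a single thread can serve any number of workers.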
WorkerMonitor ¶
Bases: Thread
Monitors worker status in a background thread.
__init__ ¶
__init__(
    workers: List[ProcessWorkerWrapper],
    result_handler: ResultHandler,
)
close ¶
run ¶
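One standard way to watch child processes from a thread is `multiprocessing.connection.wait`, which blocks until any of the given objects becomes ready; a `Process.sentinel` becomes ready when that process exits. Whether `WorkerMonitor` works exactly this way is not stated on this page, so treat the following as a sketch. `SketchMonitor` and `on_exit` are hypothetical names.

```python
import threading
from multiprocessing.connection import wait


class SketchMonitor(threading.Thread):
    """Hypothetical monitor: reports the first waitable that becomes ready."""

    def __init__(self, waitables, on_exit) -> None:
        super().__init__(daemon=True)
        self._waitables = list(waitables)
        self._on_exit = on_exit
        self._closed = False

    def run(self) -> None:
        # Blocks until at least one object is ready, e.g. a process has died.
        ready = wait(self._waitables)
        if not self._closed:
            self._on_exit(ready)

    def close(self) -> None:
        # Only suppresses the callback; it does not interrupt wait().
        self._closed = True
```

With real processes the monitor would be built as `SketchMonitor([p.sentinel for p in processes], on_exit)`. Any object accepted by `wait`, such as a `Connection`, works the same way.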
_run_worker_process ¶
_run_worker_process(
    worker_factory: Callable[[VllmConfig, int], Any],
    task_queue: Queue,
    result_queue: Queue,
    vllm_config: VllmConfig,
    rank: int,
) -> None
Event loop that runs inside each worker process.
_set_future_result ¶
_set_future_result(
    future: Union[ResultFuture, Future], result: Result
)
set_multiprocessing_worker_envs ¶
Sets the environment variables required when workers run in a multiprocessing environment. The parent process should call this before it creates any worker processes.
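The ordering matters because a child process receives a copy of the parent's environment when it is created, so later changes in the parent do not reach it. A minimal sketch of the idea follows, assuming a policy of filling in defaults without overriding values the user already set. The function name and the variables passed to it are hypothetical; this page does not list the variables the real function sets.

```python
import os


def sketch_set_worker_envs(defaults):
    """Apply defaults for variables that are unset; return what was applied."""
    applied = {}
    for name, value in defaults.items():
        if name not in os.environ:
            os.environ[name] = value
            applied[name] = value
    return applied
```

A parent would call something like `sketch_set_worker_envs({"OMP_NUM_THREADS": "1"})` before starting workers; the variable name in that call is illustrative only.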