vllm.v1.engine.mm_input_cache
MultiModalInputCacheClient
Used by P0 (the front-end process) to check whether multi-modal kwargs are already cached in P1 (the engine-core process).
Source code in vllm/v1/engine/mm_input_cache.py
mm_cache instance-attribute

mm_cache = get_lru_cache(
    get_mm_input_cache_gb(), MultiModalCacheItemMetadata
)
__init__

__init__(
    model_config: ModelConfig,
    mm_registry: MultiModalRegistry,
) -> None
get_and_update

get_and_update(
    mm_kwargs: Sequence[MultiModalKwargsItem],
    mm_hashes: list[str],
) -> list[Optional[MultiModalKwargsItem]]
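To make the client's role concrete, here is a hypothetical, simplified sketch of the client-side `get_and_update` logic. The real implementation sizes its LRU cache in GB via `get_lru_cache` and stores `MultiModalCacheItemMetadata`; the dict-based LRU, the `MMItem` alias, and the `capacity` parameter below are illustrative assumptions, not vLLM's API.

```python
from collections import OrderedDict
from typing import Optional, Sequence

# Illustrative stand-in for MultiModalKwargsItem.
MMItem = dict


class ClientCacheSketch:
    """Hypothetical sketch of the P0 (client) side: remember which
    hashes P1 should already hold, and send None in place of those
    items so the full kwargs need not cross the process boundary."""

    def __init__(self, capacity: int = 128) -> None:
        self._seen: OrderedDict[str, None] = OrderedDict()
        self._capacity = capacity

    def get_and_update(
        self,
        mm_kwargs: Sequence[MMItem],
        mm_hashes: list[str],
    ) -> list[Optional[MMItem]]:
        out: list[Optional[MMItem]] = []
        for item, h in zip(mm_kwargs, mm_hashes):
            if h in self._seen:
                self._seen.move_to_end(h)  # LRU touch
                out.append(None)           # server already has this item
            else:
                self._seen[h] = None
                if len(self._seen) > self._capacity:
                    self._seen.popitem(last=False)  # evict least recent
                out.append(item)           # first sighting: ship full kwargs
        return out
```

The first call with a given hash passes the kwargs through; a repeat call with the same hash yields `None`, signalling that the server can serve the item from its own cache.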
MultiModalInputCacheServer
Used by P1 (the engine-core process) to avoid requiring P0 (the front-end process) to re-send multi-modal kwargs it has already transferred.
mm_cache instance-attribute

mm_cache = get_lru_cache(
    get_mm_input_cache_gb(), MultiModalKwargsItem
)
__init__

__init__(
    model_config: ModelConfig,
    mm_registry: MultiModalRegistry,
) -> None
get_and_update

get_and_update(
    mm_kwargs: Sequence[Optional[MultiModalKwargsItem]],
    mm_hashes: list[str],
) -> list[MultiModalKwargsItem]
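A matching hypothetical sketch of the server side: items that arrive as `None` are restored from the cache by their hash, while items that arrive in full are stored so that later requests can omit them. As above, the real code sizes its cache in GB via `get_lru_cache`; the dict-based LRU and `MMItem` alias here are illustrative assumptions.

```python
from collections import OrderedDict
from typing import Optional, Sequence

# Illustrative stand-in for MultiModalKwargsItem.
MMItem = dict


class ServerCacheSketch:
    """Hypothetical sketch of the P1 (server) side: fill in None
    placeholders from the cache and record newly received items."""

    def __init__(self, capacity: int = 128) -> None:
        self._cache: OrderedDict[str, MMItem] = OrderedDict()
        self._capacity = capacity

    def get_and_update(
        self,
        mm_kwargs: Sequence[Optional[MMItem]],
        mm_hashes: list[str],
    ) -> list[MMItem]:
        out: list[MMItem] = []
        for item, h in zip(mm_kwargs, mm_hashes):
            if item is None:
                # Client sent a placeholder: the full kwargs must be cached.
                item = self._cache[h]
                self._cache.move_to_end(h)  # LRU touch
            else:
                self._cache[h] = item
                if len(self._cache) > self._capacity:
                    self._cache.popitem(last=False)  # evict least recent
            out.append(item)
        return out
```

Note the asymmetry with the client: the server caches the full `MultiModalKwargsItem` values, whereas the client only needs per-item metadata to decide whether a hash is already resident on the server.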