vllm.model_executor.pooling_metadata
PoolingMetadata
Metadata for pooling operations in the Pooler layer.
This class records which sequences are pooled together, the pooling parameters that apply to each group, additional per-sequence data, and the length of each prompt.
Attributes:
Name | Type | Description |
---|---|---|
seq_groups | list[tuple[list[int], PoolingParams]] | List of (seq_ids, pooling_params). |
seq_data | dict[int, Any] | A mapping of sequence ID to additional sequence data. |
prompt_lens | list[int] | List of the lengths of each prompt. |
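
The shape of these three attributes can be sketched in plain Python. This is only an illustration, not vLLM code: `StandInPoolingParams`, the `token_ids` payload, and both helper functions are hypothetical, and the sketch assumes `prompt_lens` holds one length per sequence in the order the sequence IDs appear in `seq_groups`.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class StandInPoolingParams:
    """Hypothetical placeholder for PoolingParams; the field is invented."""
    tag: str


# One entry per sequence group: (sequence IDs in the group, its pooling params).
seq_groups: list[tuple[list[int], StandInPoolingParams]] = [
    ([0], StandInPoolingParams("request-a")),
    ([1], StandInPoolingParams("request-b")),
]

# Sequence ID -> additional per-sequence data (here: made-up token IDs).
seq_data: dict[int, Any] = {
    0: {"token_ids": [11, 12, 13]},
    1: {"token_ids": [21, 22, 23, 24, 25]},
}

# Assumed layout: one length per sequence, in seq_groups order.
prompt_lens: list[int] = [3, 5]


def flatten_seq_ids(groups: list[tuple[list[int], Any]]) -> list[int]:
    """Return every sequence ID in the order it appears across groups."""
    return [seq_id for seq_ids, _ in groups for seq_id in seq_ids]


def is_consistent(groups, data, lens) -> bool:
    """Check that every sequence has data and a matching prompt length."""
    ids = flatten_seq_ids(groups)
    return len(ids) == len(lens) and all(
        seq_id in data and len(data[seq_id]["token_ids"]) == n
        for seq_id, n in zip(ids, lens)
    )
```

A check like `is_consistent` is not part of the documented API; it is included only to make the assumed relationship between the three attributes explicit.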
Source code in vllm/model_executor/pooling_metadata.py
__init__
__init__(
    seq_groups: list[tuple[list[int], PoolingParams]],
    seq_data: dict[int, Any],
    prompt_lens: list[int],
    pooling_cursor: Optional[PoolingCursor] = None,
) -> None
build_pooling_cursor
PoolingTensors dataclass
Tensors for pooling.
from_pooling_metadata classmethod
from_pooling_metadata(
    pooling_metadata: PoolingMetadata, device: device
) -> PoolingTensors
Create PoolingTensors from PoolingMetadata.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
pooling_metadata | PoolingMetadata | PoolingMetadata instance to convert. | required |
device | device | Device on which to store the tensors. | required |
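
The alternate-constructor pattern behind `from_pooling_metadata` can be sketched without any tensor library. Everything below is a hypothetical stand-in: `MetadataSketch`, `TensorsSketch`, and `from_metadata` are invented names, a tuple stands in for a tensor, and a string stands in for a device object. Which tensors the real class holds is not stated here; the sketch assumes it carries the prompt lengths.

```python
from dataclasses import dataclass


@dataclass
class MetadataSketch:
    """Hypothetical stand-in for PoolingMetadata (prompt lengths only)."""
    prompt_lens: list[int]


@dataclass
class TensorsSketch:
    """Hypothetical stand-in for PoolingTensors."""
    prompt_lens: tuple[int, ...]  # stands in for a tensor
    device: str                   # stands in for a device object

    @classmethod
    def from_metadata(
        cls, metadata: MetadataSketch, device: str
    ) -> "TensorsSketch":
        """Build the container from metadata, tagged with the target device."""
        return cls(prompt_lens=tuple(metadata.prompt_lens), device=device)


# Usage: convert metadata into the device-bound container.
tensors = TensorsSketch.from_metadata(MetadataSketch([3, 5]), device="cpu")
```

Using a classmethod keeps the conversion logic next to the container it produces, so callers only need the metadata and a target device.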