vllm.transformers_utils.tokenizer_group
TokenizerGroup
A group of tokenizers for a model: a base tokenizer plus, when LoRA is enabled, cached per-adapter tokenizers.
Source code in vllm/transformers_utils/tokenizer_group.py
lora_tokenizers instance-attribute
lora_tokenizers = LRUCache[int, AnyTokenizer](
    capacity=max(max_loras, max_num_seqs) if enable_lora else 0
)
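The cache is sized to hold one tokenizer per LoRA adapter or per scheduled sequence, whichever is larger, and is effectively disabled (capacity 0) when LoRA is off. Below is a minimal, self-contained sketch of the least-recently-used eviction behavior this implies; TinyLRU is a hypothetical stand-in for vLLM's LRUCache, not its actual implementation:

```python
from collections import OrderedDict

class TinyLRU:
    """Illustrative stand-in for vLLM's LRUCache (not the real class)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._data: OrderedDict[int, object] = OrderedDict()

    def put(self, key: int, value: object) -> None:
        if self.capacity <= 0:
            return  # capacity 0 (LoRA disabled): cache nothing
        self._data[key] = value
        self._data.move_to_end(key)          # mark as most recently used
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)   # evict least recently used

    def get(self, key: int):
        if key in self._data:
            self._data.move_to_end(key)      # a hit refreshes recency
        return self._data.get(key)

# Mirrors the capacity expression above, with illustrative values.
enable_lora, max_loras, max_num_seqs = True, 8, 256
cache = TinyLRU(capacity=max(max_loras, max_num_seqs) if enable_lora else 0)
```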
__init__
__init__(
    tokenizer_id: str,
    enable_lora: bool,
    max_num_seqs: int,
    max_input_length: Optional[int],
    **tokenizer_config,
)
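A construction sketch; the trailing keyword arguments are collected into **tokenizer_config and, by all appearances, forwarded to the underlying tokenizer loader. The model name and values below are illustrative:

```python
from vllm.transformers_utils.tokenizer_group import TokenizerGroup

group = TokenizerGroup(
    tokenizer_id="meta-llama/Llama-2-7b-hf",  # illustrative model
    enable_lora=True,
    max_num_seqs=256,
    max_input_length=4096,
    trust_remote_code=False,  # swallowed by **tokenizer_config
)
```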
_raise_if_input_too_long
_raise_if_input_too_long(
    encoded_tokens: list[int],
    lora_request: Optional[LoRARequest] = None,
)
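A private guard that rejects prompts exceeding the configured input-length limit. A standalone approximation of the check, assuming the limit is the max_input_length passed at construction (the real method may also consult the LoRA request):

```python
from typing import Optional

def raise_if_input_too_long(
    encoded_tokens: list[int],
    max_input_length: Optional[int],
) -> None:
    # A None limit disables the check entirely.
    if max_input_length is not None and len(encoded_tokens) > max_input_length:
        raise ValueError(
            f"Input is {len(encoded_tokens)} tokens; "
            f"the maximum allowed is {max_input_length}."
        )
```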
encode
encode(
    prompt: str,
    max_length: Optional[int] = None,
    truncation: Optional[bool] = None,
    lora_request: Optional[LoRARequest] = None,
    add_special_tokens: Optional[bool] = None,
) -> list[int]
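Encodes a prompt into token IDs with the tokenizer selected for the given LoRA request (the base tokenizer when lora_request is None). A usage sketch, reusing the group constructed above; the printed IDs are illustrative:

```python
token_ids = group.encode(
    "Hello, world!",
    max_length=None,
    truncation=None,
    add_special_tokens=True,
)
print(token_ids)  # e.g. [1, 15043, 29892, 3186, 29991] for a Llama tokenizer
```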
encode_async async
encode_async(
    prompt: str,
    max_length: Optional[int] = None,
    truncation: Optional[bool] = None,
    lora_request: Optional[LoRARequest] = None,
    add_special_tokens: Optional[bool] = None,
) -> list[int]
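The awaitable counterpart of encode, with the same parameters and return type, for use inside async serving code. A minimal sketch:

```python
import asyncio

async def main() -> None:
    token_ids = await group.encode_async("Hello, world!")
    print(len(token_ids))

asyncio.run(main())
```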
get_lora_tokenizer
get_lora_tokenizer(
    lora_request: Optional[LoRARequest] = None,
) -> AnyTokenizer
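Returns the tokenizer for a LoRA request, falling back to the base tokenizer when no request is given (and, presumably, when LoRA is disabled); adapter tokenizers are cached in lora_tokenizers. A sketch assuming LoRARequest takes lora_name, lora_int_id, and lora_path; the adapter name and path are hypothetical:

```python
from vllm.lora.request import LoRARequest

base_tok = group.get_lora_tokenizer(None)  # base model tokenizer

lora_req = LoRARequest(
    lora_name="my-adapter",        # hypothetical adapter name
    lora_int_id=1,
    lora_path="/path/to/adapter",  # hypothetical local path
)
adapter_tok = group.get_lora_tokenizer(lora_req)
```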
get_lora_tokenizer_async async
get_lora_tokenizer_async(
    lora_request: Optional[LoRARequest] = None,
) -> AnyTokenizer
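The awaitable variant of get_lora_tokenizer, with the same fallback and caching semantics. A sketch reusing lora_req from above:

```python
async def fetch_adapter_tokenizer() -> None:
    tok = await group.get_lora_tokenizer_async(lora_req)
    print(type(tok).__name__)
```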
get_max_input_len
get_max_input_len(
    lora_request: Optional[LoRARequest] = None,
) -> Optional[int]
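Returns the input-length limit, or None when no limit is configured; the lora_request parameter suggests the limit can, in principle, vary per adapter. A sketch:

```python
limit = group.get_max_input_len()
if limit is not None:
    print(f"Prompts longer than {limit} tokens will be rejected.")
```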
init_tokenizer_from_configs
init_tokenizer_from_configs(
    model_config: ModelConfig,
    scheduler_config: SchedulerConfig,
    lora_config: Optional[LoRAConfig],
)
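A module-level convenience that builds a TokenizerGroup from vLLM's config objects; LoRA support is presumably enabled when lora_config is not None. A sketch assuming EngineArgs.create_engine_config() yields a config bundle exposing model_config, scheduler_config, and lora_config:

```python
from vllm.engine.arg_utils import EngineArgs
from vllm.transformers_utils.tokenizer_group import init_tokenizer_from_configs

engine_args = EngineArgs(model="meta-llama/Llama-2-7b-hf", enable_lora=True)
vllm_config = engine_args.create_engine_config()  # assumed accessor

tokenizer_group = init_tokenizer_from_configs(
    model_config=vllm_config.model_config,
    scheduler_config=vllm_config.scheduler_config,
    lora_config=vllm_config.lora_config,
)
```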