vllm.transformers_utils.tokenizer_group
TokenizerGroup

A group of tokenizers that can be used for LoRA adapters.

Source code in vllm/transformers_utils/tokenizer_group.py
lora_tokenizers (instance attribute)

An LRU cache of adapter-specific tokenizers, keyed by LoRA integer ID. When LoRA is enabled it is sized to max(max_loras, max_num_seqs); otherwise its capacity is 0 and only the base tokenizer is used.

```python
lora_tokenizers = LRUCache[int, AnyTokenizer](
    capacity=max(max_loras, max_num_seqs) if enable_lora else 0
)
```
__init__

```python
__init__(
    tokenizer_id: str,
    enable_lora: bool,
    max_num_seqs: int,
    max_input_length: Optional[int],
    **tokenizer_config,
)
```
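A minimal construction sketch, not taken from the vLLM docs: the model ID is a placeholder, and it assumes that extra **tokenizer_config keyword arguments are forwarded to the underlying Hugging Face tokenizer loader.

```python
from vllm.transformers_utils.tokenizer_group import TokenizerGroup

tokenizer_group = TokenizerGroup(
    tokenizer_id="facebook/opt-125m",  # placeholder HF repo ID or local path
    enable_lora=False,                 # no per-adapter tokenizers needed
    max_num_seqs=256,
    max_input_length=2048,             # None would disable length checks
)
```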
_raise_if_input_too_long

Internal guard used by encode and encode_async; raises when the encoded prompt exceeds the configured max input length.

```python
_raise_if_input_too_long(
    encoded_tokens: list[int],
    lora_request: Optional[LoRARequest] = None,
)
```
encode

```python
encode(
    prompt: str,
    max_length: Optional[int] = None,
    truncation: Optional[bool] = None,
    lora_request: Optional[LoRARequest] = None,
    add_special_tokens: Optional[bool] = None,
) -> list[int]
```
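A hedged usage sketch following the signature above; tokenizer_group is the instance constructed earlier, and the prompt and per-call bound are illustrative.

```python
token_ids = tokenizer_group.encode(
    prompt="Hello, world!",
    max_length=16,           # illustrative per-call bound
    truncation=True,
    add_special_tokens=True,
)
# token_ids is a list[int]; an over-long prompt raises via
# _raise_if_input_too_long when max_input_length is configured.
print(token_ids)
```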
encode_async (async)

```python
encode_async(
    prompt: str,
    max_length: Optional[int] = None,
    truncation: Optional[bool] = None,
    lora_request: Optional[LoRARequest] = None,
    add_special_tokens: Optional[bool] = None,
) -> list[int]
```
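A sketch of the async variant, assuming the same tokenizer_group instance; the signature mirrors encode() but the call is awaitable, so it fits inside an event loop.

```python
import asyncio

async def tokenize(prompt: str) -> list[int]:
    # Same parameters as encode(), awaited instead of called directly.
    return await tokenizer_group.encode_async(
        prompt=prompt,
        add_special_tokens=True,
    )

token_ids = asyncio.run(tokenize("Hello, world!"))
```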
get_lora_tokenizer

```python
get_lora_tokenizer(
    lora_request: Optional[LoRARequest] = None,
) -> AnyTokenizer
```
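A hedged sketch of resolving an adapter-specific tokenizer. It assumes LoRARequest comes from vllm.lora.request, and the adapter name, ID, and path are placeholders; passing lora_request=None returns the base tokenizer.

```python
from vllm.lora.request import LoRARequest

lora_request = LoRARequest(
    lora_name="my-adapter",        # placeholder
    lora_int_id=1,
    lora_path="/path/to/adapter",  # placeholder
)
# Cached in lora_tokenizers (keyed by lora_int_id) after the first call.
tokenizer = tokenizer_group.get_lora_tokenizer(lora_request)
```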
get_lora_tokenizer_async (async)

```python
get_lora_tokenizer_async(
    lora_request: Optional[LoRARequest] = None,
) -> AnyTokenizer
```
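The awaitable counterpart to get_lora_tokenizer; a minimal sketch reusing the lora_request placeholder above.

```python
import asyncio

async def resolve_tokenizer():
    # Same resolution and caching as get_lora_tokenizer, awaitable form.
    return await tokenizer_group.get_lora_tokenizer_async(lora_request)

tokenizer = asyncio.run(resolve_tokenizer())
```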
get_max_input_len

```python
get_max_input_len(
    lora_request: Optional[LoRARequest] = None,
) -> Optional[int]
```
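A sketch of reading the configured limit up front; the return value is None when no max_input_length was set, so callers should handle that case.

```python
max_len = tokenizer_group.get_max_input_len()
if max_len is not None:
    print(f"Prompts longer than {max_len} tokens will be rejected")
else:
    print("No input-length limit configured")
```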
init_tokenizer_from_configs

```python
init_tokenizer_from_configs(
    model_config: ModelConfig,
    scheduler_config: SchedulerConfig,
    lora_config: Optional[LoRAConfig],
)
```
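A hedged sketch of building the tokenizer group from engine configs, the way the engine does internally. EngineArgs.create_engine_config() is used here only to obtain populated config objects; the exact config plumbing may differ across vLLM versions.

```python
from vllm.engine.arg_utils import EngineArgs
from vllm.transformers_utils.tokenizer_group import init_tokenizer_from_configs

vllm_config = EngineArgs(model="facebook/opt-125m").create_engine_config()
tokenizer_group = init_tokenizer_from_configs(
    model_config=vllm_config.model_config,
    scheduler_config=vllm_config.scheduler_config,
    lora_config=vllm_config.lora_config,  # None when LoRA is disabled
)
```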