vllm.model_executor.guided_decoding.lm_format_enforcer_decoding
_cached_build_vllm_token_enforcer_tokenizer_data (cached)
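The cached label suggests the builder is memoized. A minimal sketch of that wrapper, assuming lm-format-enforcer's build_vllm_token_enforcer_tokenizer_data helper accepts the tokenizer directly; the exact signature here is illustrative:

```python
from functools import lru_cache

from lmformatenforcer.integrations.vllm import (
    build_vllm_token_enforcer_tokenizer_data)


@lru_cache
def _cached_build_vllm_token_enforcer_tokenizer_data(tokenizer):
    # Building tokenizer data walks the whole vocabulary, which is
    # expensive; lru_cache keys on the tokenizer's object identity, so
    # repeated requests with the same tokenizer reuse the result.
    return build_vllm_token_enforcer_tokenizer_data(tokenizer)
```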
_normalize_json_schema_object
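Guided-decoding requests can carry a JSON schema as a raw string, a dict, or a pydantic model, and a normalizer's job is to coerce all three into a plain dict. A hedged sketch of that shape, assuming pydantic v2's model_json_schema; the real branches in vLLM may differ:

```python
import json
from typing import Union

from pydantic import BaseModel


def _normalize_json_schema_object(
        schema: Union[str, dict, BaseModel]) -> dict:
    if isinstance(schema, str):
        # Raw JSON text straight from the request body.
        return json.loads(schema)
    if isinstance(schema, dict):
        # Already a parsed schema object.
        return schema
    if isinstance(schema, BaseModel):
        # Pydantic model instance: export its JSON schema (pydantic v2).
        return schema.model_json_schema()
    raise AssertionError(f"Unsupported schema type {schema}")
```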
get_local_lm_format_enforcer_guided_decoding_logits_processor
```python
get_local_lm_format_enforcer_guided_decoding_logits_processor(
    guided_params: GuidedDecodingParams, tokenizer
) -> Optional[LogitsProcessor]
```
Given an OpenAI-compatible request, check for guided decoding parameters and get the necessary logits processor for the given guide. We cache logits processors by (guide, tokenizer), and on a cache hit we return a shallow copy, so the same underlying compiled FSM is reused while each request keeps its own processor object.
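The cache-then-shallow-copy idiom reads roughly as below. This is a self-contained sketch, not the vLLM internals: a stand-in processor class replaces the real lm-format-enforcer one, and the tokenizer is represented by name so the cache key stays hashable. The point is that copy.copy duplicates the outer object while the compiled FSM attribute stays shared:

```python
import copy
from functools import lru_cache


class _StubLogitsProcessor:
    """Stand-in for the lm-format-enforcer logits processor."""

    def __init__(self, guide: str):
        # Pretend this is the expensive FSM compilation step.
        self.fsm = ("compiled", guide)
        # Per-request decoding state; must not leak across requests.
        self.state = 0


@lru_cache(maxsize=128)
def _cached_processor(guide: str, tokenizer_name: str):
    # Compile once per (guide, tokenizer) pair.
    return _StubLogitsProcessor(guide)


def get_logits_processor(guide: str, tokenizer_name: str):
    cached = _cached_processor(guide, tokenizer_name)
    # Cache hit or miss, hand out a shallow copy: `fsm` is the same
    # shared object, while per-request `state` can diverge safely.
    fresh = copy.copy(cached)
    fresh.state = 0
    return fresh


p1 = get_logits_processor('{"type": "object"}', "llama")
p2 = get_logits_processor('{"type": "object"}', "llama")
assert p1 is not p2 and p1.fsm is p2.fsm  # distinct wrappers, shared FSM
```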