vllm.model_executor.guided_decoding.outlines_logits_processors
BaseLogitsProcessor
_fsm_state (instance-attribute)
_fsm_state: defaultdict[int, Union[int, CFGState]] = (
defaultdict(int)
)
__call__
Use the FSM to bias the logits before sampling the next token.
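To make that contract concrete, here is a toy sketch of an FSM-driven logits processor in the same spirit (illustration only, not the vLLM implementation; the `allowed_tokens` table is a stand-in for what the Outlines `Guide` computes):

```python
from collections import defaultdict

import torch


class ToyFSMProcessor:
    """Illustration: mask logits so only FSM-allowed tokens can be sampled."""

    def __init__(self, allowed_tokens: dict[int, set[int]]):
        # FSM state -> token ids permitted from that state (stand-in for a Guide).
        self._allowed = allowed_tokens
        # Unseen sequences start at state 0; defaultdict(int) gives that for free,
        # the same trick BaseLogitsProcessor._fsm_state relies on.
        self._fsm_state: defaultdict[int, int] = defaultdict(int)

    def __call__(self, input_ids: list[int], scores: torch.Tensor) -> torch.Tensor:
        state = self._fsm_state[hash(tuple(input_ids))]
        # Disallowed tokens get -inf so the sampler can never break the structure.
        mask = torch.full_like(scores, float("-inf"))
        mask[list(self._allowed[state])] = 0.0
        return scores + mask
```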
__init__
__init__(guide: Guide, reasoner: Optional[ReasoningParser])
clone
clone() -> BaseLogitsProcessor
CFGLogitsProcessor
Bases: BaseLogitsProcessor
__init__
__init__(
cfg: str,
tokenizer: PreTrainedTokenizerBase,
reasoner: Optional[ReasoningParser],
)
Compile the FSM that drives context-free grammar generation.
Parameters
----------
cfg
A string that represents a context-free grammar
tokenizer
The model's tokenizer
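A hedged construction sketch (the grammar string, the Lark-style EBNF dialect, and the model name are assumptions for illustration):

```python
from transformers import AutoTokenizer

# Illustrative grammar: accept only "yes" or "no".
ANSWER_GRAMMAR = r"""
?start: "yes" | "no"
"""

tokenizer = AutoTokenizer.from_pretrained("your-model")  # placeholder model name
cfg_processor = CFGLogitsProcessor(ANSWER_GRAMMAR, tokenizer, reasoner=None)
```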
clone
clone() -> CFGLogitsProcessor
JSONLogitsProcessor
Bases: RegexLogitsProcessor
__init__
__init__(
schema: Union[str, dict, BaseModel],
tokenizer: PreTrainedTokenizerBase,
whitespace_pattern: Union[str, None],
reasoner: Optional[ReasoningParser],
)
Compile the FSM that drives the JSON-guided generation.
Parameters
----------
schema
A JSON schema that encodes the structure we want the model to
generate
tokenizer
The model's tokenizer
whitespace_pattern
Pattern to use for JSON syntactic whitespace (doesn't impact
string literals)
Example: allow only a single space or newline with
`whitespace_pattern=r"[\n ]?"`
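A hedged construction sketch, assuming a Pydantic model is used as the schema (the model name is a placeholder):

```python
from pydantic import BaseModel
from transformers import AutoTokenizer


class Person(BaseModel):
    name: str
    age: int


tokenizer = AutoTokenizer.from_pretrained("your-model")  # placeholder model name
json_processor = JSONLogitsProcessor(
    Person,                        # per the signature: str, dict, or pydantic BaseModel
                                   # (passing the model class is assumed to be accepted)
    tokenizer,
    whitespace_pattern=r"[\n ]?",  # only a single space or newline between JSON tokens
    reasoner=None,
)
```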
RegexLogitsProcessor
Bases: BaseLogitsProcessor
__init__
__init__(
regex_string: str,
tokenizer: PreTrainedTokenizerBase,
reasoner: Optional[ReasoningParser],
)
Compile the FSM that drives the regex-structured generation.
Parameters
----------
regex_string
A string that represents a regular expression
tokenizer
The model's tokenizer
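A hedged construction sketch (the regex and model name are illustrative). Processors built this way are plain callables over `(input_ids, scores)`, so they can typically be handed to the engine wherever logits processors are accepted (for example `SamplingParams(logits_processors=[...])` in engines that support it):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("your-model")  # placeholder model name
# Constrain output to an ISO-8601 date such as 2024-05-17.
date_processor = RegexLogitsProcessor(r"\d{4}-\d{2}-\d{2}", tokenizer, reasoner=None)
```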
_adapt_tokenizer (cached)
Adapt vLLM's tokenizer so it can be used to compile the FSM.
The API of Outlines tokenizers is slightly different to that of
`transformers`: the Outlines decoder returns a list, whereas vLLM's
decode returns a str. To sync the vLLM decoder with the Outlines
internal API, the decoder has to be adapted. In addition, we need to
handle the missing spaces in Llama's tokenizer to be able to compile
FSMs for this model.
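A minimal sketch of the adaptation described above, under the assumption that it only needs to (a) make `decode` return a list and (b) restore the leading space that SentencePiece-style Llama tokenizers mark with "▁" (names and details are illustrative, not the exact vLLM code):

```python
import copy
from typing import Callable


def _adapt_tokenizer_sketch(tokenizer):
    adapted = copy.deepcopy(tokenizer)

    # Outlines expects decode() to return a list of strings; vLLM's returns a str.
    original_decode: Callable[[list[int]], str] = adapted.decode

    def list_decode(token_ids: list[int]) -> list[str]:
        return [original_decode(token_ids)]

    adapted.decode = list_decode

    # Llama's SentencePiece tokens mark a leading space with "▁"; restore it when
    # converting a single token back to text so the compiled FSM sees the same
    # strings the model will actually produce.
    def convert_token_to_string(token: str) -> str:
        text = adapted.convert_tokens_to_string([token])
        if token.startswith("▁") or token == "<0x20>":
            return " " + text
        return text

    adapted.convert_token_to_string = convert_token_to_string
    return adapted
```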