vllm.v1.outputs
EMPTY_MODEL_RUNNER_OUTPUT
module-attribute
EMPTY_MODEL_RUNNER_OUTPUT = ModelRunnerOutput(
    req_ids=[],
    req_id_to_index={},
    sampled_token_ids=[],
    spec_token_ids=None,
    logprobs=None,
    prompt_logprobs_dict={},
    pooler_output=[],
    finished_sending=None,
    finished_recving=None,
    num_nans_in_logits=None,
)
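The constant above is a ModelRunnerOutput in which every collection is empty and every optional field is None. A minimal sketch of how a shared empty value like this can be used follows. It is written against a stand-in dataclass (`RunnerOutputSketch`, hypothetical, carrying only a few of the fields listed above) so that it runs without vLLM installed. The `run_step` and `has_results` helpers are also hypothetical; how vLLM itself uses the constant is not shown on this page.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RunnerOutputSketch:
    """Stand-in for ModelRunnerOutput with a subset of its fields (hypothetical)."""

    req_ids: list[str]
    req_id_to_index: dict[str, int]
    sampled_token_ids: list[list[int]]
    finished_sending: Optional[set[str]] = None
    finished_recving: Optional[set[str]] = None


# Module-level empty value, analogous in shape to EMPTY_MODEL_RUNNER_OUTPUT.
EMPTY_OUTPUT_SKETCH = RunnerOutputSketch(
    req_ids=[],
    req_id_to_index={},
    sampled_token_ids=[],
)


def has_results(output: RunnerOutputSketch) -> bool:
    """Return True when the output carries at least one request."""
    return len(output.req_ids) > 0


def run_step(scheduled: list[str]) -> RunnerOutputSketch:
    """Hypothetical step function: return the shared empty value when idle."""
    if not scheduled:
        return EMPTY_OUTPUT_SKETCH
    return RunnerOutputSketch(
        req_ids=list(scheduled),
        req_id_to_index={rid: i for i, rid in enumerate(scheduled)},
        sampled_token_ids=[[0] for _ in scheduled],  # placeholder token ids
    )
```

Because the empty value is a single shared object with mutable fields, callers in a sketch like this should treat it as read-only.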
LogprobsLists
Bases: NamedTuple
Source code in vllm/v1/outputs.py
LogprobsTensors
Bases: NamedTuple
empty_cpu
staticmethod
empty_cpu(
    num_positions: int, num_tokens_per_position: int
) -> LogprobsTensors
Create empty LogprobsTensors on CPU.
ModelRunnerOutput
dataclass
num_nans_in_logits
class-attribute
instance-attribute
num_nans_in_logits: Optional[dict[str, int]] = None
prompt_logprobs_dict
instance-attribute
prompt_logprobs_dict: dict[str, Optional[LogprobsTensors]]
__init__
__init__(
    req_ids: list[str],
    req_id_to_index: dict[str, int],
    sampled_token_ids: list[list[int]],
    spec_token_ids: Optional[list[list[int]]],
    logprobs: Optional[LogprobsLists],
    prompt_logprobs_dict: dict[
        str, Optional[LogprobsTensors]
    ],
    pooler_output: list[Optional[Tensor]],
    finished_sending: Optional[set[str]] = None,
    finished_recving: Optional[set[str]] = None,
    num_nans_in_logits: Optional[dict[str, int]] = None,
) -> None
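The signature suggests that `req_id_to_index` maps each request ID to its row in the per-request lists such as `sampled_token_ids`. A minimal sketch under that assumption follows, using a stand-in dataclass (`RunnerOutputSketch`, hypothetical, with only a subset of the fields above) and a hypothetical helper `tokens_for`, so that it runs without vLLM installed. The request IDs and token values are made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RunnerOutputSketch:
    """Stand-in for ModelRunnerOutput with a subset of its fields (hypothetical)."""

    req_ids: list[str]
    req_id_to_index: dict[str, int]
    sampled_token_ids: list[list[int]]
    num_nans_in_logits: Optional[dict[str, int]] = None


def tokens_for(output: RunnerOutputSketch, req_id: str) -> list[int]:
    """Look up one request's sampled tokens via the ID-to-row mapping."""
    index = output.req_id_to_index[req_id]  # assumed: row into per-request lists
    return output.sampled_token_ids[index]


# Two illustrative requests; the second produced two tokens in this step.
output = RunnerOutputSketch(
    req_ids=["req-a", "req-b"],
    req_id_to_index={"req-a": 0, "req-b": 1},
    sampled_token_ids=[[11], [42, 43]],
)

tokens_b = tokens_for(output, "req-b")
```

Keeping the mapping next to the lists lets a caller reach a request's row with one dictionary lookup instead of scanning `req_ids`.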
SamplerOutput
dataclass
__init__
__init__(
    sampled_token_ids: Tensor,
    logprobs_tensors: Optional[LogprobsTensors],
) -> None