Skip to content

vllm.model_executor.models.fireredasr2

FireRedASR2AudioInputs

Bases: TensorSchema

Dimensions
  • b: Batch size
  • nmb: Number of mel bins
  • t: Time frames (M)
Source code in vllm/model_executor/models/fireredasr2.py
class FireRedASR2AudioInputs(TensorSchema):
    """
    Dimensions:
        - b: Batch size
        - nmb: Number of mel bins
        - t: Time frames (M)
    """

    input_features: Annotated[
        list[torch.Tensor] | None,
        TensorShape("b", "nmb", "t"),
    ]
    speech_lengths: Annotated[
        list[torch.Tensor] | None,
        TensorShape("b"),
    ]
    fake_token_lengths: Annotated[
        list[torch.Tensor] | None,
        TensorShape("b"),
    ]