vllm.benchmarks

Modules:

Name          Description
datasets      This module defines a framework for sampling benchmark requests from various
latency       Benchmark the latency of processing a single batch of requests.
lib           Benchmark library utilities.
mm_processor  Benchmark multimodal processor latency.
plot          Generate plots for benchmark results.
serve         Benchmark online serving throughput.
startup       Benchmark the cold and warm startup time of vLLM models.
sweep
throughput    Benchmark offline inference throughput.