vllm.platforms.neuron
NeuronPlatform
¶
Bases: Platform
Source code in vllm/platforms/neuron.py
device_control_env_var
class-attribute
instance-attribute
¶
device_control_env_var: str = 'NEURON_RT_VISIBLE_CORES'
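As a rough illustration, this environment variable can be exported before the engine starts to restrict which NeuronCores a process may use, analogous to CUDA_VISIBLE_DEVICES on GPU platforms. The comma-separated value format below is an assumption based on common Neuron runtime conventions, not something defined by this class.

import os

# Restrict the Neuron runtime to cores 0 and 1 before any Neuron work starts.
# The value format "0,1" is assumed; consult the Neuron runtime documentation
# for the exact syntax supported by NEURON_RT_VISIBLE_CORES.
os.environ["NEURON_RT_VISIBLE_CORES"] = "0,1"

# Create the vLLM engine afterwards so worker processes inherit the setting.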
supported_quantization
class-attribute
instance-attribute
¶
check_and_update_config
classmethod
¶
check_and_update_config(vllm_config: VllmConfig) -> None
Source code in vllm/platforms/neuron.py
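A minimal sketch of the hook's contract, using a toy config class rather than the real VllmConfig: the method returns None and mutates the passed config in place. ToyConfig, ToyPlatform, and the field names are illustrative only.

from dataclasses import dataclass, field

@dataclass
class ToyConfig:
    # Stand-in for VllmConfig; the real class carries model, cache,
    # parallel, and scheduler sub-configs.
    additional_config: dict = field(default_factory=dict)

class ToyPlatform:
    @classmethod
    def check_and_update_config(cls, config: ToyConfig) -> None:
        # Platform hooks of this shape return None and patch the config
        # in place, e.g. validating unsupported options or filling in
        # platform-specific defaults.
        config.additional_config.setdefault("worker_cls", "toy.NeuronWorker")

cfg = ToyConfig()
ToyPlatform.check_and_update_config(cfg)
print(cfg.additional_config)  # {'worker_cls': 'toy.NeuronWorker'}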
get_device_name
classmethod
¶
get_neuron_framework_to_use
¶
Return the explicitly specified framework if its installation is available.
If no framework is specified, default to neuronx-distributed-inference; if that is not installed, fall back to transformers-neuronx.
Source code in vllm/platforms/neuron.py
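The selection order described above could be approximated with importlib probes, as in the sketch below. pick_neuron_framework and the package import names are assumptions for illustration; the real method also honors an explicit framework setting and returns a NeuronFramework enum member rather than a string.

from importlib import util
from typing import Optional

def pick_neuron_framework() -> Optional[str]:
    # Prefer neuronx-distributed-inference when installed, otherwise fall
    # back to transformers-neuronx; return None if neither is importable.
    if util.find_spec("neuronx_distributed_inference") is not None:
        return "neuronx-distributed-inference"
    if util.find_spec("transformers_neuronx") is not None:
        return "transformers-neuronx"
    return None

print(pick_neuron_framework())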
is_async_output_supported
classmethod
¶
use_neuronx_distributed
¶
Return True if the framework determined in get_neuron_framework_to_use() is NeuronFramework.NEURONX_DISTRIBUTED_INFERENCE, False otherwise. This is used to select the Neuron model framework and framework-specific configuration to apply during model compilation.
Source code in vllm/platforms/neuron.py
use_transformers_neuronx
¶
Return True if the framework determined in get_neuron_framework_to_use() is NeuronFramework.TRANSFORMERS_NEURONX, False otherwise. This is used to select the Neuron model framework and framework-specific configuration to apply during model compilation.
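Source code in vllm/platforms/neuron.py

A hedged usage sketch for these two helpers, assuming the code runs on a Neuron host so that current_platform resolves to NeuronPlatform; the compile_kwargs dictionary and its keys are purely illustrative.

from vllm.platforms import current_platform

# Branch on the documented helpers to pick framework-specific settings.
if current_platform.use_neuronx_distributed():
    compile_kwargs = {"framework": "neuronx-distributed-inference"}
elif current_platform.use_transformers_neuronx():
    compile_kwargs = {"framework": "transformers-neuronx"}
else:
    raise RuntimeError("No supported Neuron framework is installed.")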