vllm.entrypoints.openai.serving_embedding
EmbeddingMixin
Bases: OpenAIServing
Source code in vllm/entrypoints/openai/serving_embedding.py
_build_response
_build_response(
ctx: ServeContext,
) -> Union[EmbeddingResponse, ErrorResponse]
Source code in vllm/entrypoints/openai/serving_embedding.py
_preprocess
async
_preprocess(ctx: ServeContext) -> Optional[ErrorResponse]
Source code in vllm/entrypoints/openai/serving_embedding.py
OpenAIServingEmbedding
Bases: EmbeddingMixin
Source code in vllm/entrypoints/openai/serving_embedding.py
chat_template_content_format
instance-attribute
chat_template_content_format: Final = (
chat_template_content_format
)
__init__
__init__(
engine_client: EngineClient,
model_config: ModelConfig,
models: OpenAIServingModels,
*,
request_logger: Optional[RequestLogger],
chat_template: Optional[str],
chat_template_content_format: ChatTemplateContentFormatOption,
) -> None
Source code in vllm/entrypoints/openai/serving_embedding.py
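A minimal construction sketch, assuming engine_client (EngineClient), model_config (ModelConfig), and models (OpenAIServingModels) were already created during server startup; the canonical wiring lives in the API server entrypoint, and these names are placeholders:
from vllm.entrypoints.openai.serving_embedding import OpenAIServingEmbedding

serving_embedding = OpenAIServingEmbedding(
    engine_client,   # EngineClient backing the inference engine
    model_config,    # ModelConfig of the loaded model
    models,          # OpenAIServingModels registry of served models
    request_logger=None,   # Optional[RequestLogger]; None disables per-request logging
    chat_template=None,    # None falls back to the model's bundled chat template
    chat_template_content_format="auto",  # ChatTemplateContentFormatOption
)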
_validate_request
_validate_request(
ctx: ServeContext[EmbeddingRequest],
) -> Optional[ErrorResponse]
Source code in vllm/entrypoints/openai/serving_embedding.py
create_embedding
async
create_embedding(
request: EmbeddingRequest,
raw_request: Optional[Request] = None,
) -> Union[EmbeddingResponse, ErrorResponse]
Embedding API that mimics OpenAI's Embedding API.
See https://platform.openai.com/docs/api-reference/embeddings/create for the API specification.
Source code in vllm/entrypoints/openai/serving_embedding.py
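Since this handler backs the /v1/embeddings route, the usual way to exercise it is through an OpenAI-compatible client pointed at a running vLLM server; the base URL and model name below are placeholders:
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
resp = client.embeddings.create(
    model="my-embed-model",  # placeholder; use the served model's name
    input=["vLLM is a fast inference engine"],
    encoding_format="float",  # or "base64"; see _get_embedding below
)
print(len(resp.data[0].embedding))  # embedding dimensionality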
_get_embedding
_get_embedding(
output: EmbeddingOutput,
encoding_format: Literal["float", "base64"],
) -> Union[list[float], str]
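When encoding_format is "float" the embedding is returned as a plain list of floats; for "base64", a client can recover the vector by decoding the payload, assuming it is the raw little-endian float32 buffer (OpenAI's convention, which vLLM's base64 path is understood to follow):
import base64
import numpy as np

def decode_base64_embedding(data: str) -> list[float]:
    # Interpret the decoded bytes as a float32 vector (assumed convention).
    return np.frombuffer(base64.b64decode(data), dtype=np.float32).tolist()

# Round-trip check with a toy vector:
vec = np.array([0.1, -0.2, 0.3], dtype=np.float32)
encoded = base64.b64encode(vec.tobytes()).decode("utf-8")
assert decode_base64_embedding(encoded) == vec.tolist()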