vllm.renderers.embed_utils
safe_load_prompt_embeds_async module-attribute
safe_load_prompt_embeds_async = make_async(safe_load_prompt_embeds)
Async variant of safe_load_prompt_embeds. It runs the decode in a thread-pool executor, so the base64 decoding and torch.load work do not block the asyncio event loop.
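
The wrapping pattern described above can be sketched with the standard library alone. This is a minimal illustration, not vLLM's implementation: `make_async_sketch` and `decode_payload` are hypothetical stand-ins for `make_async` and `safe_load_prompt_embeds`, and the real helper may differ in signature and in which executor it uses.

```python
import asyncio
import base64
from functools import partial
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


def make_async_sketch(func: Callable[..., T]) -> Callable[..., Awaitable[T]]:
    """Hypothetical helper: wrap a blocking function so calls return an awaitable."""

    def _wrapper(*args, **kwargs) -> Awaitable[T]:
        loop = asyncio.get_running_loop()
        # Passing None selects the event loop's default ThreadPoolExecutor.
        return loop.run_in_executor(None, partial(func, *args, **kwargs))

    return _wrapper


def decode_payload(data: str) -> bytes:
    # Assumed stand-in for the blocking work (base64 decode + deserialization).
    return base64.b64decode(data)


decode_payload_async = make_async_sketch(decode_payload)


async def main() -> bytes:
    encoded = base64.b64encode(b"prompt-embeds").decode("ascii")
    # The decode runs in a worker thread; the event loop stays free meanwhile.
    return await decode_payload_async(encoded)


result = asyncio.run(main())
```

The wrapper must be called from inside a running event loop, since it looks up the loop at call time. Awaiting the returned future yields the same value the blocking function would have returned.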