Official reference implementation for ComfyUI integration with vLLM-Omni's DALL-E compatible image generation API.
Custom ComfyUI nodes that enable text-to-image generation and image editing using vLLM-Omni's official diffusion API. This integration allows you to use vLLM-Omni's image capabilities (Qwen-Image, Z-Image-Turbo, etc.) directly within ComfyUI workflows.
Example: "telemark skier in the Adirondacks, 1880s clothing, steampunk goggles, action shot, powder skiing, a portrait by Nick Alm"
- Official vLLM-Omni DALL-E API: Uses vLLM-Omni's OpenAI-compatible image generation endpoint
- Model Presets: Quick-select optimal settings for Qwen-Image, Z-Image-Turbo, and more
- Server Defaults: Use -1 for any parameter to let the server choose optimal values
- Advanced Parameters: true_cfg_scale, VAE slicing/tiling for memory optimization
- Full Parameter Control: Adjust width, height, steps, guidance scale, seed
- Negative Prompts: Guide what NOT to generate
- Batch Generation: Generate multiple images in a single request
- Edit existing images with text prompts
- Auto size calculation from input image aspect ratio
- Batch variations: Generate multiple edited versions
- Mask support for future inpainting capabilities
- Advanced CFG controls with dual guidance scales
- Async HTTP: Non-blocking network calls for better performance
- ComfyUI Native: Integrates seamlessly with ComfyUI's node graph system
- Flexible server configuration: Split base URL and endpoint path for easier setup
The node includes built-in presets for popular vLLM-Omni diffusion models:
| Preset | Inference Steps | Guidance Scale | Best For |
|---|---|---|---|
| Server Default (Recommended) | server default | server default | Let server decide (safest option) |
| Qwen-Image (Quality) | 50 | 4.0 | High quality, detailed images |
| Z-Image-Turbo (Speed) | 9 | 0.0 | Fast generation, good quality |
| Custom | manual | manual | Full manual control |
How it works:
- Select a preset from the dropdown to auto-populate parameters
- Presets only affect parameters still at default (-1 = server default)
- Manual adjustments override preset values
- "Server Default" relies on server-side model configuration (safest, always works)
Example: Select "Z-Image-Turbo" → steps auto-set to 9, guidance to 0.0
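To make that precedence concrete, here is a minimal Python sketch of the "presets only fill in sentinel defaults" rule. The preset table and helper name are illustrative, not the node's actual code:

```python
# Illustrative sketch only: shows the "presets fill in -1 defaults" rule,
# not the node's actual implementation.
PRESETS = {
    "Server Default (Recommended)": {},  # leave everything at -1
    "Qwen-Image (Quality)": {"num_inference_steps": 50, "guidance_scale": 4.0},
    "Z-Image-Turbo (Speed)": {"num_inference_steps": 9, "guidance_scale": 0.0},
}

def apply_preset(params: dict, preset_name: str) -> dict:
    """Fill in preset values only where the user left the sentinel (-1 / -1.0)."""
    preset = PRESETS.get(preset_name, {})
    merged = dict(params)
    for key, value in preset.items():
        if merged.get(key) in (-1, -1.0):  # still at "server default"
            merged[key] = value            # preset fills it in
        # otherwise the manual value wins
    return merged

print(apply_preset({"num_inference_steps": -1, "guidance_scale": 7.5},
                   "Z-Image-Turbo (Speed)"))
# {'num_inference_steps': 9, 'guidance_scale': 7.5}  -> manual guidance kept
```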
- ComfyUI installed and running
- Python 3.9+
- vLLM-Omni with official image generation API support
  - Install: `pip install vllm-omni` (0.6.0+), or build from source: vLLM-Omni GitHub
- Dependencies (most already included with ComfyUI):
  - `aiohttp>=3.8.0`
  - `torch>=2.0.0`
  - `pillow>=9.0.0`
  - `numpy>=1.21.0`
Clone this repository into your ComfyUI custom_nodes directory:
```bash
cd ComfyUI/custom_nodes
git clone https://github.com/yourusername/comfyui-vllm-omni.git
cd comfyui-vllm-omni
pip install -r requirements.txt
```

You need a running vLLM-Omni server with image generation support:
```bash
# Qwen-Image (quality)
vllm serve Qwen/Qwen-Image --omni --port 8000

# Z-Image-Turbo (speed)
vllm serve Tongyi-MAI/Z-Image-Turbo --omni --port 8000
```

Note: The default server URL in the node is http://localhost:8000/v1/images/generations.
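Before restarting ComfyUI, you can optionally confirm the server is reachable. A minimal Python check, assuming the server exposes the /health endpoint referenced in the Troubleshooting section:

```python
# Optional sanity check: is the vLLM-Omni server reachable?
# Assumes a /health endpoint, as mentioned under Troubleshooting.
import urllib.request

SERVER = "http://localhost:8000"

try:
    with urllib.request.urlopen(f"{SERVER}/health", timeout=5) as resp:
        print("Server reachable, HTTP status:", resp.status)
except OSError as exc:
    print("Server not reachable:", exc)
```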
Restart ComfyUI to load the new custom node.
- Add the Node: In ComfyUI, right-click → Add Node → image/generation/vllm-omni → vLLM-Omni Text-to-Image
- Configure Parameters:
- prompt (required): Describe what you want to generate
- negative_prompt (optional): Describe what to avoid
- width / height: Image dimensions (default: 1024x1024)
- num_inference_steps: Denoising steps (default: -1 = server default)
- guidance_scale: CFG scale (default: -1.0 = server default)
- n: Number of images to generate (default: 1)
- seed: Random seed for reproducibility (0 = random)
- server_base_url / endpoint_path: vLLM-Omni server base URL and API endpoint path
- Connect Output: Connect the IMAGE output to other nodes (e.g., SaveImage, PreviewImage)
- Queue Prompt: Generate your images!
Positive: "a majestic dragon flying over snow-capped mountains at sunset, highly detailed, 4k"
Negative: "blurry, low quality, distorted, ugly"
Positive: "a cute robot reading a book in a cozy library, warm lighting, illustration style"
Negative: "dark, scary, realistic"
A ready-to-use workflow example is provided in the examples/ folder. You can drag and drop this JSON file into ComfyUI to get started quickly.
vllm-omni-generate.json - Basic text-to-image generation
- Simple workflow demonstrating the vLLM-Omni Text-to-Image node
- Shows how to connect the node to SaveImage for output
- Demonstrates model preset selection and parameter configuration
- Ready to use with Qwen-Image or Z-Image-Turbo server
- Download or clone this repository
- Open ComfyUI
- Drag and drop `examples/vllm-omni-generate.json` into the ComfyUI window
- Adjust the `server_base_url` if your vLLM-Omni server is not on localhost:8000
- Select your model preset (or use "Server Default (Recommended)")
- Queue the workflow!
| Parameter | Type | Default | Range | Description |
|---|---|---|---|---|
| prompt | STRING | "" | - | Text description of image to generate (required) |
| model_preset | CHOICE | Server Default | - | Quick preset selector for common models |
| negative_prompt | STRING | "" | - | What NOT to generate (optional) |
| width | INT | 1024 | 256-2048 | Image width in pixels (step: 64) |
| height | INT | 1024 | 256-2048 | Image height in pixels (step: 64) |
| num_inference_steps | INT | -1 | -1 to 200 | Number of denoising steps. -1 = use server default |
| guidance_scale | FLOAT | -1.0 | -1.0 to 20.0 | CFG scale (higher = more prompt adherence). -1.0 = use server default |
| true_cfg_scale | FLOAT | -1.0 | -1.0 to 20.0 | Advanced CFG control (model-specific). -1.0 = use server default |
| n | INT | 1 | 1-10 | Number of images to generate |
| seed | INT | 0 | 0-2³¹ | Random seed (0 = random) |
| vae_use_slicing | CHOICE | disabled | disabled/enabled | Enable VAE slicing for memory optimization |
| vae_use_tiling | CHOICE | disabled | disabled/enabled | Enable VAE tiling for very large images |
| server_base_url | STRING | http://localhost:8000 | - | Base URL of vLLM-Omni server |
| endpoint_path | STRING | /v1/images/generations | - | API endpoint path |
This node communicates with vLLM-Omni using the OpenAI DALL-E compatible API format:
```
POST /v1/images/generations
```

Request:

```json
{
  "prompt": "a cat on a laptop",
  "n": 1,
  "size": "1024x1024",
  "response_format": "b64_json",
  "negative_prompt": "",
  "num_inference_steps": 50,
  "guidance_scale": 4.0,
  "true_cfg_scale": 4.0,
  "vae_use_slicing": false,
  "vae_use_tiling": false,
  "seed": 42
}
```

Response:

```json
{
  "created": 1234567890,
  "data": [
    {
      "b64_json": "iVBORw0KGgoAAAANSUhEUgAA..."
    }
  ]
}
```

Notes:
- The node automatically converts ComfyUI's separate width/height parameters to the OpenAI `size` format ("WIDTHxHEIGHT")
- Parameters set to sentinel values (-1 / -1.0) are omitted from the request, allowing the server to use its own defaults
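For reference, a standalone client doing the same thing might look like the sketch below. It uses aiohttp (already in the dependency list); the field names mirror the request shown above, but the size formatting and sentinel-skipping are simplified illustrations, not the node's exact code:

```python
# Minimal sketch of a client for /v1/images/generations.
# Mirrors the request format above; not the node's exact implementation.
import asyncio
import base64

import aiohttp

async def generate(prompt: str, width: int = 1024, height: int = 1024,
                   num_inference_steps: int = -1, guidance_scale: float = -1.0,
                   base_url: str = "http://localhost:8000",
                   endpoint: str = "/v1/images/generations") -> list[bytes]:
    payload = {
        "prompt": prompt,
        "n": 1,
        "size": f"{width}x{height}",  # width/height -> OpenAI "size" string
        "response_format": "b64_json",
    }
    # Sentinel values (-1 / -1.0) are simply omitted so the server uses its defaults.
    if num_inference_steps != -1:
        payload["num_inference_steps"] = num_inference_steps
    if guidance_scale != -1.0:
        payload["guidance_scale"] = guidance_scale

    async with aiohttp.ClientSession() as session:
        async with session.post(base_url + endpoint, json=payload,
                                timeout=aiohttp.ClientTimeout(total=300)) as resp:
            resp.raise_for_status()
            body = await resp.json()

    # Each item in "data" carries a base64-encoded PNG.
    return [base64.b64decode(item["b64_json"]) for item in body["data"]]

if __name__ == "__main__":
    images = asyncio.run(generate("a cat on a laptop"))
    open("out.png", "wb").write(images[0])
```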
Problem: Cannot connect to vLLM-Omni server
Solutions:
- Ensure the vLLM-Omni server is running
- Check the server URL and port in the node parameters
- Verify firewall settings allow connections
- Try `curl http://localhost:8000/health` to test the server
Problem: Generation takes too long (>300s default timeout)
Solutions:
- Reduce `num_inference_steps` (try 30-40 instead of 50)
- Reduce image size (try 512x512 instead of 1024x1024)
- Check server GPU resources (might be OOM or slow)
Problem: No prompt provided
Solution: Enter a text prompt in the prompt field
Problem: Server returned unexpected response format
Solutions:
- Ensure you're using vLLM-Omni's image server (not text server)
- Check server logs for errors
- Verify server is running the correct endpoint
Problem: Tensor format mismatch
Solution: This should not happen with the current implementation, but if it does:
- Check that server is returning valid PNG data
- Verify base64 encoding is correct
- Report as a bug with server/client versions
You can run multiple vLLM-Omni servers with different models and switch between them:
```bash
# Server 1: Qwen-Image on port 8000
python -m vllm_omni.entrypoints.openai.serving_image --model Qwen/Qwen-Image --port 8000

# Server 2: Another model on port 8001
python -m vllm_omni.entrypoints.openai.serving_image --model AnotherModel --port 8001
```

Then in the node, change `server_base_url` to `http://localhost:8001` (keeping the default `endpoint_path` of `/v1/images/generations`).
Set a specific seed value (not 0) to get reproducible results:
seed: 42 → Same prompt + seed = same image
seed: 0 → Random seed each time = different images
Set n to generate multiple variations at once. The output will be a batch of images that you can process individually using ComfyUI's batch processing nodes.
The vLLM-Omni Image Edit node is marked as EXPERIMENTAL because:
- It uses the `/v1/images/edits` endpoint
- This endpoint is not yet part of the official vLLM-Omni API
- It may change or be removed in future releases
Current Status:
- ✅ Works with current experimental vLLM-Omni builds
- ⚠️ Not guaranteed to be stable across versions
- 🔮 May become official in future releases
Recommendation: For production workflows, use the official Text-to-Image node instead.
┌──────────────────┐
│ ComfyUI Node │
│ (This Package) │
└────────┬─────────┘
│ HTTP POST /v1/images/generations
│ (OpenAI DALL-E format)
┌────────▼─────────┐
│ vLLM-Omni │
│ Image Server │
└────────┬─────────┘
│
┌────────▼─────────┐
│ Omni.generate() │
│ Diffusion Model │
└──────────────────┘
Data Flow:
- ComfyUI node collects parameters
- Converts to OpenAI API format (`size` string, etc.)
- Sends HTTP POST to vLLM-Omni server
- Server generates images using diffusion model
- Returns base64-encoded PNGs
- Node decodes to PIL → numpy → torch tensor
- Returns ComfyUI-compatible IMAGE tensor
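A condensed sketch of that last conversion step (PIL, numpy, and torch are all in the dependency list; the helper below is illustrative, not the package's utils.py):

```python
# Illustrative base64-PNG -> ComfyUI IMAGE tensor conversion.
# ComfyUI expects float32 tensors shaped [batch, height, width, channels] in 0..1.
import base64
import io

import numpy as np
import torch
from PIL import Image

def b64_png_to_image_tensor(b64_items: list[str]) -> torch.Tensor:
    tensors = []
    for b64 in b64_items:
        pil = Image.open(io.BytesIO(base64.b64decode(b64))).convert("RGB")
        arr = np.asarray(pil).astype(np.float32) / 255.0  # HxWx3, 0..1
        tensors.append(torch.from_numpy(arr))
    return torch.stack(tensors, dim=0)                    # [n, H, W, 3]
```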
comfyui-vllm-omni/
├── __init__.py # Node registration
├── vllm_omni_node.py # Main ComfyUI node class
├── vllm_api.py # HTTP client for vLLM-Omni API
├── utils.py # Image conversion utilities
├── requirements.txt # Python dependencies
├── pyproject.toml # Package metadata
└── README.md # This file
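For orientation, `__init__.py` exposes ComfyUI's standard registration dictionaries (NODE_CLASS_MAPPINGS / NODE_DISPLAY_NAME_MAPPINGS). A representative sketch, with an illustrative class name rather than the package's actual one:

```python
# Representative __init__.py registration sketch (class/display names illustrative).
from .vllm_omni_node import VLLMOmniTextToImage

NODE_CLASS_MAPPINGS = {
    "VLLMOmniTextToImage": VLLMOmniTextToImage,
}
NODE_DISPLAY_NAME_MAPPINGS = {
    "VLLMOmniTextToImage": "vLLM-Omni Text-to-Image",
}
```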
Currently, testing requires a live vLLM-Omni server. Future versions may include unit tests with mocked API responses.
Contributions are welcome! Please:
- Fork the repository
- Create a feature branch
- Make your changes
- Test with a live vLLM-Omni server
- Submit a pull request
- Image Edit Endpoint: The `/v1/images/edits` endpoint is experimental (see Experimental Features section)
- Async Generation Only: Requires modern ComfyUI with async node support
- Single Server: No automatic load balancing or failover
- No Progress Bar: No real-time progress updates during generation
- Base64 Only: No direct file URL support (would require image hosting)
- No Authentication: Assumes open localhost server
Potential features for future releases:
- Image-to-image generation support
- Inpainting with mask support
- LoRA model selection
- ControlNet integration
- Progress bar during generation
- Connection pooling for better performance
- Model switching without server restart
- Authentication support for remote servers
MIT License - See LICENSE file for details
- vLLM-Omni: For providing the diffusion backend
- ComfyUI: For the excellent node-based UI framework
- Qwen-Image: For the powerful diffusion model
For issues and questions:
- GitHub Issues: Report bugs or request features
- vLLM-Omni Docs: vLLM-Omni documentation
- ComfyUI Docs: ComfyUI custom nodes guide
- Initial release
- Basic text-to-image generation
- OpenAI DALL-E compatible API
- Negative prompt support
- Batch generation support (n parameter)
- Configurable server URL
- Full parameter control (steps, guidance, size, seed)