DALL-E compatible image generation endpoint #292
Merged: hsliuustc0106 merged 3 commits into vllm-project:main from dougbtv:dalle-compat-image-api on Dec 23, 2025 (+1,029 −0)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,249 @@ | ||
| # Image Generation API | ||
|
|
||
| vLLM-Omni provides an OpenAI DALL-E compatible API for text-to-image generation using diffusion models. | ||
|
|
||
| Each server instance runs a single model (specified at startup via `vllm serve <model> --omni`). | ||
|
|
||
| ## Quick Start | ||
|
|
||
| ### Start the Server | ||
|
|
||
| For example, to serve one of the following image models: | ||
|
|
||
| ```bash | ||
| # Qwen-Image | ||
| vllm serve Qwen/Qwen-Image --omni --port 8000 | ||
|
|
||
| # Z-Image Turbo | ||
| vllm serve Tongyi-MAI/Z-Image-Turbo --omni --port 8000 | ||
| ``` | ||
|
|
||
| ### Generate Images | ||
|
|
||
| **Using curl:** | ||
|
|
||
| ```bash | ||
| curl -X POST http://localhost:8000/v1/images/generations \ | ||
| -H "Content-Type: application/json" \ | ||
| -d '{ | ||
| "prompt": "a dragon laying over the spine of the Green Mountains of Vermont", | ||
| "size": "1024x1024", | ||
| "seed": 42 | ||
| }' | jq -r '.data[0].b64_json' | base64 -d > dragon.png | ||
| ``` | ||
|
|
||
| **Using Python:** | ||
|
|
||
| ```python | ||
| import requests | ||
| import base64 | ||
| from PIL import Image | ||
| import io | ||
|
|
||
| response = requests.post( | ||
| "http://localhost:8000/v1/images/generations", | ||
| json={ | ||
| "prompt": "a black and white cat wearing a princess tiara", | ||
| "size": "1024x1024", | ||
| "num_inference_steps": 50, | ||
| "seed": 42, | ||
| } | ||
| ) | ||
|
|
||
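| response.raise_for_status()  # surface 4xx/5xx errors before decoding | ||
| # Decode the base64 payload and save it as a PNG | ||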
| img_data = response.json()["data"][0]["b64_json"] | ||
| img_bytes = base64.b64decode(img_data) | ||
| img = Image.open(io.BytesIO(img_bytes)) | ||
| img.save("cat.png") | ||
| ``` | ||
|
|
||
| **Using OpenAI SDK:** | ||
|
|
||
| ```python | ||
| from openai import OpenAI | ||
|
|
||
| client = OpenAI(base_url="http://localhost:8000/v1", api_key="none") | ||
|
|
||
| response = client.images.generate( | ||
| model="Qwen/Qwen-Image", | ||
| prompt="a horse jumping over a fence nearby a babbling brook", | ||
| n=1, | ||
| size="1024x1024", | ||
| response_format="b64_json" | ||
| ) | ||
|
|
||
| # Note: Extension parameters (seed, steps, cfg) require direct HTTP requests | ||
| ``` | ||
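The OpenAI Python SDK accepts an `extra_body` argument on its request methods and merges those fields into the JSON body, so it may offer a way to send the extension parameters without leaving the SDK. The sketch below has not been verified against this server: `generate_with_extensions` is a hypothetical helper, and whether the endpoint honors fields sent this way is an assumption.

```python
from typing import Any


def generate_with_extensions(client: Any, prompt: str, **extensions: Any) -> Any:
    """Call images.generate, sending extension fields through extra_body.

    `client` is expected to be an openai.OpenAI instance. `extensions` holds
    vLLM-Omni fields such as seed or num_inference_steps (assumption: the
    server reads them from the top level of the JSON body).
    """
    return client.images.generate(
        prompt=prompt,
        response_format="b64_json",
        # extra_body is merged into the request JSON by the SDK.
        extra_body=extensions or None,
    )


# Usage (requires a running server and the openai package):
# client = OpenAI(base_url="http://localhost:8000/v1", api_key="none")
# generate_with_extensions(client, "a red barn", seed=42, num_inference_steps=50)
```

If the server rejects or ignores the extra fields, the direct HTTP request shown earlier remains the documented path.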
|
|
||
| ## API Reference | ||
|
|
||
| ### Endpoint | ||
|
|
||
| ``` | ||
| POST /v1/images/generations | ||
| Content-Type: application/json | ||
| ``` | ||
|
|
||
| ### Request Parameters | ||
|
|
||
| #### OpenAI Standard Parameters | ||
|
|
||
| | Parameter | Type | Default | Description | | ||
| |-----------|------|---------|-------------| | ||
| | `prompt` | string | **required** | Text description of the desired image | | ||
| | `model` | string | server's model | Model to use (optional, should match server if specified) | | ||
| | `n` | integer | 1 | Number of images to generate (1-10) | | ||
| | `size` | string | model defaults | Image dimensions in WxH format (e.g., "1024x1024", "512x512") | | ||
| | `response_format` | string | "b64_json" | Response format (only "b64_json" supported) | | ||
| | `user` | string | null | User identifier for tracking | | ||
|
|
||
| #### vllm-omni Extension Parameters | ||
|
|
||
| | Parameter | Type | Default | Description | | ||
| |-----------|------|---------|-------------| | ||
| | `negative_prompt` | string | null | Text describing what to avoid in the image | | ||
| | `num_inference_steps` | integer | model defaults | Number of diffusion steps | | ||
| | `guidance_scale` | float | model defaults | Classifier-free guidance scale (typically 0.0-20.0) | | ||
| | `true_cfg_scale` | float | model defaults | True CFG scale (model-specific parameter, may be ignored if not supported) | | ||
| | `seed` | integer | null | Random seed for reproducibility | | ||
|
|
||
| ### Response Format | ||
|
|
||
| ```json | ||
| { | ||
| "created": 1701234567, | ||
| "data": [ | ||
| { | ||
| "b64_json": "<base64-encoded PNG>", | ||
| "url": null, | ||
| "revised_prompt": null | ||
| } | ||
| ] | ||
| } | ||
| ``` | ||
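As a sketch of consuming this response shape, the helper below decodes every `b64_json` entry and checks for the PNG file signature. `extract_images` is a hypothetical name, and the signature check is an extra safeguard added here for illustration, not something the API requires.

```python
import base64

# The first eight bytes of every PNG file.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def extract_images(body: dict) -> list[bytes]:
    """Return the decoded bytes of every image in a response body."""
    images = []
    for item in body["data"]:
        raw = base64.b64decode(item["b64_json"])
        if not raw.startswith(PNG_SIGNATURE):
            raise ValueError("decoded payload does not look like a PNG")
        images.append(raw)
    return images
```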
|
|
||
| ## Examples | ||
|
|
||
| ### Multiple Images | ||
|
|
||
| ```bash | ||
| curl -X POST http://localhost:8000/v1/images/generations \ | ||
| -H "Content-Type: application/json" \ | ||
| -d '{ | ||
| "prompt": "a steampunk city set in a valley of the Adirondack mountains", | ||
| "n": 4, | ||
| "size": "1024x1024", | ||
| "seed": 123 | ||
| }' | ||
| ``` | ||
|
|
||
| This generates 4 images in a single request. | ||
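The curl command above prints the raw JSON. A minimal sketch for writing each returned image to its own file follows; `save_images` and the `image_<index>.png` naming scheme are illustrative choices, not part of the API.

```python
import base64
from pathlib import Path


def save_images(body: dict, out_dir: str, stem: str = "image") -> list[Path]:
    """Write each entry of body["data"] to <out_dir>/<stem>_<index>.png."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, item in enumerate(body["data"]):
        path = out / f"{stem}_{index}.png"
        path.write_bytes(base64.b64decode(item["b64_json"]))
        paths.append(path)
    return paths
```

With `"n": 4` this would produce `image_0.png` through `image_3.png` in the chosen directory.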
|
|
||
| ### With Negative Prompt | ||
|
|
||
| ```python | ||
| import requests | ||
|
|
||
| response = requests.post( | ||
| "http://localhost:8000/v1/images/generations", | ||
| json={ | ||
| "prompt": "a portrait of a skier in deep powder snow", | ||
| "negative_prompt": "blurry, low quality, distorted, ugly", | ||
| "num_inference_steps": 100, | ||
| "size": "1024x1024", | ||
| } | ||
| ) | ||
| ``` | ||
|
|
||
| ## Parameter Handling | ||
|
|
||
| The API passes parameters directly to the diffusion pipeline without model-specific transformation: | ||
|
|
||
| - **Default values**: When parameters are not specified, the underlying model uses its own defaults | ||
| - **Pass-through design**: User-provided values are forwarded directly to the diffusion engine | ||
| - **Minimal validation**: Only basic type checking and range validation at the API level | ||
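To make the pass-through idea concrete, here is a hypothetical sketch of how a request could be turned into pipeline arguments: only the range of `n` is checked, and any extension field the caller left unset is omitted so the model's own default applies. This illustrates the design described above and is not the server's actual code; `pipeline_kwargs` is an invented name.

```python
from typing import Any

# Extension fields listed in the parameter table.
EXTENSION_FIELDS = (
    "negative_prompt",
    "num_inference_steps",
    "guidance_scale",
    "true_cfg_scale",
    "seed",
)


def pipeline_kwargs(request: dict[str, Any]) -> dict[str, Any]:
    """Forward user-set values unchanged; leave unset ones to the model."""
    n = request.get("n", 1)
    # Basic type and range validation only (bool is excluded explicitly
    # because it is a subclass of int).
    if isinstance(n, bool) or not isinstance(n, int) or not 1 <= n <= 10:
        raise ValueError("n must be an integer between 1 and 10")
    kwargs: dict[str, Any] = {"prompt": request["prompt"], "n": n}
    for name in EXTENSION_FIELDS:
        value = request.get(name)
        if value is not None:
            kwargs[name] = value  # no model-specific transformation
    return kwargs
```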
|
|
||
| ### Parameter Compatibility | ||
|
|
||
| The API passes parameters directly to the diffusion pipeline without model-specific validation. | ||
|
|
||
| - Unsupported parameters may be silently ignored by the model | ||
| - Incompatible values will result in errors from the underlying pipeline | ||
| - Recommended values vary by model - consult model documentation | ||
|
|
||
| **Best Practice:** Start with the model's recommended parameters, then adjust based on your needs. | ||
|
|
||
| ## Error Responses | ||
|
|
||
| ### 400 Bad Request | ||
|
|
||
| Invalid parameters (e.g., a model mismatch or a malformed `size`): | ||
|
|
||
| ```json | ||
| { | ||
| "detail": "Invalid size format: '1024x'. Expected format: 'WIDTHxHEIGHT' (e.g., '1024x1024')." | ||
| } | ||
| ``` | ||
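A client can catch a malformed `size` before sending the request. The sketch below is a hypothetical client-side check that reuses the wording of the error above; it is not the server's validator, and the server may accept or reject inputs that this function treats differently.

```python
def parse_size(size: str) -> tuple[int, int]:
    """Split a 'WIDTHxHEIGHT' string into positive integers."""
    parts = size.split("x")
    valid = len(parts) == 2 and all(p.isascii() and p.isdigit() for p in parts)
    if not valid or int(parts[0]) <= 0 or int(parts[1]) <= 0:
        raise ValueError(
            f"Invalid size format: '{size}'. "
            "Expected format: 'WIDTHxHEIGHT' (e.g., '1024x1024')."
        )
    return int(parts[0]), int(parts[1])
```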
|
|
||
| ### 422 Unprocessable Entity | ||
|
|
||
| Validation errors (missing required fields): | ||
|
|
||
| ```json | ||
| { | ||
| "detail": [ | ||
| { | ||
| "loc": ["body", "prompt"], | ||
| "msg": "field required", | ||
| "type": "value_error.missing" | ||
| } | ||
| ] | ||
| } | ||
| ``` | ||
|
|
||
| ### 503 Service Unavailable | ||
|
|
||
| Diffusion engine not initialized: | ||
|
|
||
| ```json | ||
| { | ||
| "detail": "Diffusion engine not initialized. Start server with a diffusion model." | ||
| } | ||
| ``` | ||
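In the examples above, `detail` is a plain string for 400 and 503 and a list of entries for 422. A small helper can flatten both shapes into one readable message; `describe_error` is a hypothetical name, and the sketch assumes only the fields shown in these examples.

```python
from typing import Any


def describe_error(status: int, body: dict[str, Any]) -> str:
    """Render an error body with a string or list `detail` as one line."""
    detail = body.get("detail")
    if isinstance(detail, list):
        parts = []
        for entry in detail:
            location = ".".join(str(p) for p in entry.get("loc", []))
            parts.append(f"{location}: {entry.get('msg', 'invalid')}")
        message = "; ".join(parts)
    else:
        message = str(detail)
    return f"HTTP {status}: {message}"
```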
|
|
||
| ## Troubleshooting | ||
|
|
||
| ### Server Not Running | ||
|
|
||
| ```bash | ||
| # Check if server is responding | ||
| curl http://localhost:8000/v1/images/generations \ | ||
| -H "Content-Type: application/json" \ | ||
| -d '{"prompt": "test"}' | ||
| ``` | ||
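The curl check above triggers a full generation. If the goal is only to see whether anything is listening on the port, a plain TCP connection attempt is cheaper. This sketch uses only the standard library; `is_listening` is a hypothetical helper, and a successful connection shows that the port is open, not that the diffusion engine is initialized.

```python
import socket


def is_listening(host: str = "localhost", port: int = 8000, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name-resolution failures.
        return False
```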
|
|
||
| ### Out of Memory | ||
|
|
||
| If you encounter OOM errors: | ||
| 1. Reduce image size: `"size": "512x512"` | ||
| 2. Reduce inference steps: `"num_inference_steps": 25` | ||
| 3. Generate fewer images: `"n": 1` | ||
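The three steps above can be sketched as a generator that yields progressively cheaper request bodies, applying one reduction at a time so a caller can retry after each OOM. `reduced_payloads` is a hypothetical helper; the target values come from the list above, and the sketch assumes the original request was larger than 512x512.

```python
from typing import Iterator


def reduced_payloads(payload: dict) -> Iterator[dict]:
    """Yield copies of payload with size, steps, then n reduced in turn."""
    current = dict(payload)
    # 1. Smaller image (assumes the original size was larger than 512x512).
    if current.get("size") != "512x512":
        current = {**current, "size": "512x512"}
        yield current
    # 2. Fewer steps, only when the caller explicitly asked for more than 25.
    steps = current.get("num_inference_steps")
    if steps is not None and steps > 25:
        current = {**current, "num_inference_steps": 25}
        yield current
    # 3. A single image, only when more than one was requested.
    if current.get("n", 1) > 1:
        current = {**current, "n": 1}
        yield current
```

Unset `num_inference_steps` is left alone because the model's default may already be below 25.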
|
|
||
| ## Testing | ||
|
|
||
| Run the test suite to verify functionality: | ||
|
|
||
| ```bash | ||
| # All image generation tests | ||
| pytest tests/entrypoints/openai/test_image_server.py -v | ||
|
|
||
| # Specific test | ||
| pytest tests/entrypoints/openai/test_image_server.py::test_generate_single_image -v | ||
| ``` | ||
|
|
||
| ## Development | ||
|
|
||
| Enable debug logging to see prompts and generation details: | ||
|
|
||
| ```bash | ||
| vllm serve Qwen/Qwen-Image --omni \ | ||
| --uvicorn-log-level debug | ||
| ``` | ||