
[Doc] Add guides for custom docker image build on NVIDIA CUDA [Skip-CI]#1386

Open
loveysuby wants to merge 7 commits into vllm-project:main from loveysuby:docs/add-custom-docker-build-on-nvidia-cuda

Conversation

@loveysuby
Contributor

@loveysuby loveysuby commented Feb 16, 2026

PLEASE FILL IN THE PR DESCRIPTION HERE ENSURING ALL CHECKLIST ITEMS (AT THE BOTTOM) HAVE BEEN CONSIDERED.

Purpose

Added NVIDIA CUDA build instructions to match the existing AMD ROCm guide.
Documents how to use docker/Dockerfile.cuda for custom builds, enabling source modifications and BASE_IMAGE customization. (added in #1439)

Test Plan

Runtime Environment: NVIDIA A100-SXM4-80GB (CUDA 13.0 / Driver 580.82.07)

  • Verify `docker build --check -f docker/Dockerfile.cuda` with a different `BASE_IMAGE` to specify the vLLM base image:

```bash
DOCKER_BUILDKIT=1 docker build \
  --check \
  -f docker/Dockerfile.cuda \
  --build-arg BASE_IMAGE=vllm/vllm-openai:v0.18.0 \
  -t vllm-omni-cuda .
```

Test Result


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan. Please provide the test scripts & test commands. Please state the reasons if your code doesn't require additional test scripts. For test file guidelines, please check the test style doc.
  • The test results. Please paste a before/after results comparison, or e2e results.
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model. Please run mkdocs serve to preview the documentation changes in ./docs.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft.

BEFORE SUBMITTING, PLEASE READ https://github.com/vllm-project/vllm-omni/blob/main/CONTRIBUTING.md (anything written below this line will be removed by GitHub Actions)

Signed-off-by: Hyoseop Song <crad_on25@naver.com>
Comment on lines +99 to +108
You can use this docker image to serve models the same way you would in vLLM! To do so, make sure you overwrite the default entrypoint (`vllm serve --omni`), which works only for models supported in the vLLM-Omni project.

# --8<-- [end:pre-built-images]

# --8<-- [start:build-docker]

#### Build docker image

```bash
DOCKER_BUILDKIT=1 docker build -f docker/Dockerfile.ci -t vllm-omni-cuda .
```
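A sketch of the entrypoint override mentioned above (dry-run: the command is composed and printed rather than executed, so it can be checked without Docker; the `vllm-omni-cuda` tag follows the build above, and the model name is a placeholder):

```shell
# Dry-run sketch: override the default `vllm serve --omni` entrypoint so the
# image serves a model outside the vLLM-Omni support list via plain `vllm serve`.
IMAGE=vllm-omni-cuda          # tag from the build command above
MODEL=your-org/your-model     # placeholder; substitute any vLLM-supported model
CMD="docker run --runtime nvidia --gpus all --rm --entrypoint vllm $IMAGE serve $MODEL"
echo "$CMD"
```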
Contributor Author

@loveysuby loveysuby Feb 16, 2026


@congw729 Hi, I've written a guide for NVIDIA GPU users, but using the Dockerfile.ci as-is doesn't seem suitable for the purpose.

I have already verified the installation logic on an NVIDIA A100. Should I create a new, dedicated Dockerfile for users and re-test it? Let me know your thoughts, and I'll update the PR accordingly.

Switching to Draft for now.

Collaborator


> @congw729 Hi, I've written a guide for NVIDIA GPU users, but using the Dockerfile.ci as-is doesn't seem suitable for the purpose.
>
> I have already verified the installation logic on an NVIDIA A100. Should I create a new, dedicated Dockerfile for users and re-test it? Let me know your thoughts, and I'll update the PR accordingly.
>
> Switching to Draft for now.

I think it's better to use a different Dockerfile. Dockerfile.ci will install unnecessary packages for users.

Collaborator


Yes, Dockerfile.ci installs vllm-omni in dev mode, which will pull in some unnecessary packages.

@loveysuby loveysuby marked this pull request as draft February 16, 2026 15:18
@loveysuby loveysuby marked this pull request as ready for review February 17, 2026 11:59
Collaborator

@lishunyang12 lishunyang12 left a comment


A few things worth discussing in the CUDA build guide.

Comment thread docs/getting_started/installation/gpu/cuda.inc.md
Comment thread docs/getting_started/installation/gpu/cuda.inc.md
Comment thread docs/getting_started/installation/gpu/cuda.inc.md

```bash
docker run --runtime nvidia --gpus all \
-v ~/.cache/huggingface:/root/.cache/huggingface \
```
Collaborator

@lishunyang12 lishunyang12 Feb 21, 2026


This model needs significant GPU memory ("verified on 2 x H100s" above). Worth noting that, or using `--gpus 2` in the example.
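A hedged sketch of that suggestion (dry-run: the command is printed, not executed, so it can be checked without Docker; the image tag, port, and cache mount are assumptions carried over from the surrounding examples):

```shell
# Dry-run sketch: request exactly two GPUs for a model verified on 2 x H100s.
IMAGE=vllm-omni-cuda
CMD="docker run --runtime nvidia --gpus 2 --rm \
  -v $HOME/.cache/huggingface:/root/.cache/huggingface \
  --ipc=host -p 8091:8091 $IMAGE"
echo "$CMD"
```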

@hsliuustc0106
Collaborator

@vllm-omni-reviewer

@github-actions

🤖 VLLM-Omni PR Review

Code Review: Add guides for custom docker image build on NVIDIA CUDA

1. Overview

This PR adds documentation for building custom Docker images on NVIDIA CUDA, mirroring the existing AMD ROCm guide structure. The changes include:

  • A new tab entry in gpu.md for NVIDIA CUDA build instructions
  • A new build-docker section in cuda.inc.md with build and launch commands

Overall Assessment: Positive - The PR follows the existing documentation patterns and provides useful guidance for users who need custom Docker builds.

2. Code Quality

Strengths

  • Follows the existing documentation structure and include pattern (--8<--)
  • Provides both server and interactive launch modes
  • Shows how to customize the base vLLM version with VLLM_BASE_TAG
  • Uses DOCKER_BUILDKIT=1 for modern build behavior

Minor Issues

  1. Version inconsistency between PR description and documentation:

    • PR description tests with VLLM_BASE_TAG=v0.11.0
    • Documentation example shows VLLM_BASE_TAG=v0.15.0

    Consider aligning these or adding a note about available versions.

  2. Missing --rm flag for interactive container:

    • docs/getting_started/installation/gpu/cuda.inc.md:134
    • Adding --rm would prevent leftover containers after exiting interactive sessions

3. Architecture & Design

  • Good: Follows the established documentation pattern with include files and tab structure
  • Good: Maintains consistency with the ROCm documentation approach
  • Good: Uses the same docker/Dockerfile.ci referenced in the PR description

4. Security & Safety

  • Acceptable: HF_TOKEN is passed via --env which is standard practice
  • Standard: Volume mount for HuggingFace cache follows common patterns
  • Note: Users should be aware that HF_TOKEN will be visible in process listing; this is a known Docker limitation and acceptable for this use case

5. Testing & Documentation

Test Plan

  • ✅ PR includes test commands and environment details
  • ✅ Screenshots provided showing successful build verification
  • ✅ Uses docker build --check for validation

Documentation Completeness

  • ✅ Build instructions are clear
  • ✅ Launch instructions cover both server and interactive modes
  • ✅ Follows existing documentation style

Suggestion

Consider adding a brief note about what modifications users might want to make when building custom images (e.g., "Modify the source code before building to include custom changes").

6. Specific Suggestions

docs/getting_started/installation/gpu/cuda.inc.md

Line 113-117: Consider adding a comment about when to use custom builds:

```bash
# Use this when you need to modify vLLM-Omni source code or use a specific vLLM version
DOCKER_BUILDKIT=1 docker build \
  -f docker/Dockerfile.ci \
  --build-arg VLLM_BASE_TAG=v0.15.0 \
  -t vllm-omni-cuda .
```

Line 134: Add --rm flag for cleaner interactive session management:

```bash
docker run --runtime nvidia --gpus all -it --rm \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=$HF_TOKEN" \
  -p 8091:8091 \
  --ipc=host \
  --entrypoint bash \
  vllm-omni-cuda
```

Line 125: The model name Qwen/Qwen3-Omni-30B-A3B-Instruct - verify this matches the current recommended model in other documentation sections for consistency.

7. Approval Status

LGTM with suggestions

The PR is well-structured and ready to merge. The suggestions above are minor improvements:

  • Optional: Add --rm flag to interactive session example
  • Optional: Align version numbers between PR description and documentation, or clarify that v0.15.0 is an example
  • Optional: Add a brief note about use cases for custom builds

These are non-blocking suggestions that could be addressed in a follow-up PR if preferred. The documentation is clear, follows established patterns, and provides valuable guidance for users.


This review was generated automatically by the VLLM-Omni PR Reviewer Bot
using glm-5.

@loveysuby
Contributor Author

@lishunyang12 @congw729 Thanks for the review feedback.

I've created Dockerfile.cuda in #1439 based on your suggestions.

Once #1439 is merged, I'll update this documentation PR:

  • Use Dockerfile.cuda instead of Dockerfile.ci
  • Change the version example to a different tag, v0.14.0 (instead of the default v0.15.0)
  • Add --gpus 2 and GPU memory note
  • Add --rm flag for interactive sessions
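Rolled together, the planned changes above might yield commands roughly like the following (a dry-run sketch, printed rather than executed so it can be checked without Docker; `Dockerfile.cuda` and the `v0.14.0` example tag come from this thread, while the `BASE_IMAGE` form and the `vllm-omni-cuda` tag are assumptions, not the final doc text):

```shell
# Dry-run sketch combining the four planned changes.
BUILD_CMD="DOCKER_BUILDKIT=1 docker build -f docker/Dockerfile.cuda \
  --build-arg BASE_IMAGE=vllm/vllm-openai:v0.14.0 -t vllm-omni-cuda ."
RUN_CMD="docker run --runtime nvidia --gpus 2 -it --rm --entrypoint bash vllm-omni-cuda"
echo "$BUILD_CMD"
echo "$RUN_CMD"
```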

@hsliuustc0106
Collaborator

@vllm-omni-reviewer

@Gaohan123
Collaborator

Hello, any updates? v0.16.0 has already been released.

@loveysuby
Contributor Author

@Gaohan123 I sent you a message on the vLLM Slack about these updates and #1439. Please take a look (cc: @tzhouam)

@loveysuby
Contributor Author

@Gaohan123 @lishunyang12 PTAL:
I revised the docs to use Dockerfile.cuda instead of the CI-only build. (Dockerfile.cuda was merged into main in #1439, based on this PR's suggestions.)

There was an image build test in the PR body, but since that had already been verified in #1439, I removed it. Please let me know if you have any requests for changes to the document content.

@tzhouam tzhouam changed the title from "[Doc] Add guides for custom docker image build on NVIDIA CUDA" to "[Doc] Add guides for custom docker image build on NVIDIA CUDA [Skip-CI]" on Apr 10, 2026
Comment thread docs/getting_started/installation/gpu/cuda.inc.md

```bash
DOCKER_BUILDKIT=1 docker build \
-f docker/Dockerfile.cuda \
--build-arg BASE_IMAGE=vllm/vllm-openai:v0.18.0 \
```
Collaborator


It should be v0.19.0 now.
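Applying the suggested bump, the documented build command would presumably become (dry-run sketch, printed rather than executed; only the tag changes relative to the snippet above, and the `v0.19.0` value is taken from this comment):

```shell
# Dry-run sketch with the base image tag bumped to v0.19.0 as suggested.
CMD="DOCKER_BUILDKIT=1 docker build -f docker/Dockerfile.cuda \
  --build-arg BASE_IMAGE=vllm/vllm-openai:v0.19.0 -t vllm-omni-cuda ."
echo "$CMD"
```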

Contributor Author


Revised in f74ba81.


6 participants