docs: migrate Speculative Decoding docs to three-tier structure #6001

dagil-nvidia merged 3 commits into main from
Conversation
Copy speculative decoding documentation to docs/features/speculative_decoding/ as a cross-cutting feature with backend support matrix.

Changes:
- Create docs/features/speculative_decoding/README.md with feature overview
- Add deprecation notice to docs/backends/vllm/speculative_decoding.md
- Add cross-reference from vLLM README to new feature location
- Add redirect in conf.py for old path
- Add toctree entry in hidden_toctree.rst

Part of docs hierarchy refactoring effort.

Signed-off-by: Dan Gil <[email protected]>
Co-authored-by: Cursor <[email protected]>
Walkthrough

Adds a new Speculative Decoding feature guide and vLLM-specific page, inserts cross-references from existing vLLM docs, adds a Sphinx redirect, and registers the new pages in the hidden toctree. All changes are documentation-only.
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 inconclusive
/ok to test 3dd4010
- Create docs/features/speculative_decoding/speculative_decoding_vllm.md with vLLM-specific content following the <feature>_<backend>.md convention
- Update README.md to link to the local backend-specific file
- Update the original backends file to redirect to the new location
- Update the conf.py redirect to point to the vLLM-specific file

Signed-off-by: Dan Gil <[email protected]>
Co-authored-by: Cursor <[email protected]>
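The conf.py redirect itself isn't shown in this thread. As a rough sketch only — assuming the sphinx-reredirects extension and illustrative paths, neither of which is confirmed by the PR diff — the update described above might look like:

```python
# Hypothetical fragment of docs/conf.py.
# Assumes the sphinx-reredirects extension; the mapping below is
# illustrative, not copied from the actual PR.
extensions = ["sphinx_reredirects"]

redirects = {
    # old vLLM-backend page -> new backend-specific feature page
    "backends/vllm/speculative_decoding":
        "../../features/speculative_decoding/speculative_decoding_vllm.html",
}
```

With sphinx-reredirects, each key is the old docname and the value is a target relative to the old page, so old bookmarks keep resolving after the move.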
/ok to test 0d6bbcb
@coderabbitai full review
✅ Actions performed: full review triggered.
Actionable comments posted: 1
In `@docs/features/speculative_decoding/speculative_decoding_vllm.md`:
- Around lines 89-105: The example response JSON for the /v1/chat/completions example uses choices[].text. Update it to the OpenAI-compatible chat schema by replacing choices[].text with a choices[].message object containing role (e.g., "assistant") and content fields. Keep id, model, and usage as-is; each choice should carry message.role and message.content instead of text, matching the vLLM/Dynamo integration and the OpenAI API.
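The corrected choice shape can be sketched as below. The field names follow the OpenAI chat-completions schema; the id, model name, content, and token counts are placeholders, not values from the PR.

```python
# Illustrative /v1/chat/completions response body: each choice carries a
# message with role/content, rather than the completions-style text field.
response = {
    "id": "chatcmpl-123",          # placeholder id
    "object": "chat.completion",
    "model": "example-model",      # placeholder model name
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "Hello! How can I help you today?",
            },
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 9, "completion_tokens": 12, "total_tokens": 21},
}
```

A legacy /v1/completions response would instead put the generated string in choices[].text, which is exactly the mismatch the review comment flags.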
🧹 Nitpick comments (1)
docs/features/speculative_decoding/speculative_decoding_vllm.md (1)
51-55: Avoid a fixed approval-time promise. Hugging Face approval latency varies; a softer statement reduces the risk of stale guidance.
📝 Suggested tweak
-Approval usually takes around **5 minutes**.
+Approval time can vary depending on Hugging Face review/traffic.
- Fix response schema: use message.role/content instead of text
- Soften approval time claim (varies vs fixed 5 minutes)

Signed-off-by: Dan Gil <[email protected]>
Co-authored-by: Cursor <[email protected]>
/ok to test a3df9f9
…ynamo#6001) Signed-off-by: Dan Gil <[email protected]> Co-authored-by: Cursor <[email protected]>
Summary
- Create docs/features/speculative_decoding/README.md with feature overview and backend support matrix
- Add deprecation notice to docs/backends/vllm/speculative_decoding.md pointing to new location
- Add redirect in conf.py for backward compatibility
- Add toctree entry in hidden_toctree.rst

Part of the docs hierarchy refactoring effort to organize cross-cutting features.
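The hidden_toctree.rst registration mentioned above isn't reproduced in this thread; a plausible sketch (entry paths are assumptions based on the file layout described, not the actual diff) would be:

```rst
.. toctree::
   :hidden:

   features/speculative_decoding/README
   features/speculative_decoding/speculative_decoding_vllm
```

Registering pages in a hidden toctree lets Sphinx build them without warnings while keeping them out of the visible navigation tree.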
Test plan