docs: restructure KVBM documentation into three-tier format #5905
dagil-nvidia merged 15 commits into main
Walkthrough
This pull request restructures KVBM documentation by consolidating scattered backend-specific guides into unified integration documentation, removing fragmented setup files, and introducing comprehensive design and operational guides. It also updates the navigation structure via toctree changes to reflect the new documentation organization.
Estimated code review effort: 3 (Moderate), ~25 minutes
Pre-merge checks: 3 passed
Actionable comments posted: 7
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
docs/kvbm/kvbm_intro.rst (1)
23-23: ⚠️ Potential issue | 🟡 Minor
Tighten spacing/wording in the memory list item.
“GPU memory(in future) ,” reads as a typo. Consider “GPU memory (future)” and remove the extra space before the comma.
🤖 Fix all issues with AI agents
In `@docs/integrations/flexkv_integration.md`:
- Around line 70-74: The docs reference three missing launch
scripts—agg_flexkv.sh, agg_flexkv_router.sh, and disagg_flexkv.sh—so either add
those scripts to the launch scripts set with the expected
aggregator/router/disaggregator start logic (matching the style and env var
usage of the other launch scripts) or update the documentation to point to the
actual script names or remove the calls; make the change by adding the three
shell scripts with clear usage/help text or editing the doc examples to use the
correct existing script names so the examples run as shown.
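A check of this kind can be automated by comparing the script names a document mentions against the scripts that actually exist. The following is a minimal sketch and not part of this PR: the helper `missing_scripts` is hypothetical, the regex only matches bare `*.sh` basenames, and the names `agg.sh` and `disagg.sh` in the usage are made up for illustration.

```python
import re

# Matches a shell-script basename such as "agg_flexkv.sh" (illustrative pattern).
SCRIPT_PATTERN = re.compile(r"\b[\w-]+\.sh\b")


def missing_scripts(doc_text: str, available_names: set[str]) -> list[str]:
    """Return script basenames mentioned in doc_text that are not in available_names."""
    referenced = sorted(set(SCRIPT_PATTERN.findall(doc_text)))
    return [name for name in referenced if name not in available_names]


# Hypothetical document excerpt and hypothetical set of existing scripts.
example_doc = """
bash launch/agg_flexkv.sh
bash launch/agg_flexkv_router.sh
bash launch/disagg_flexkv.sh
bash launch/agg.sh
"""
example_missing = missing_scripts(example_doc, {"agg.sh", "disagg.sh"})
```

Here `example_missing` lists the three FlexKV scripts named in the review comment, because none of them appears in the set of available names.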
In `@docs/integrations/sglang_hicache.md`:
- Line 1: Remove the trailing whitespace at the end of line 1 in
docs/integrations/sglang_hicache.md (the empty/HTML comment line at the top),
save the file, and commit the change so the pre-commit hook and pipeline no
longer fail.
- Line 445: Update the broken markdown link in
docs/integrations/sglang_hicache.md by replacing the removed target
"../kvbm/kvbm_architecture.md" with the new file "../kvbm/kvbm_design.md" so the
"**[KVBM Architecture](... )**" link points to the existing kvbm_design.md;
ensure the link text ("KVBM Architecture") remains the same and the relative
path is correct.
- Around line 122-124: Update the fenced code block that contains the formula
"Host KV Cache Size = Device KV Cache Size × hicache-ratio" to include a
language specifier (use "text") after the opening triple backticks so the block
is rendered correctly; locate the code block in
docs/integrations/sglang_hicache.md by searching for that exact formula and
change the opening fence from ``` to ```text.
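For reference, the formula in that code block is a single multiplication. A minimal sketch, assuming sizes are expressed in bytes and the ratio is positive; the function name is hypothetical and is not SGLang's API.

```python
def host_kv_cache_size(device_kv_cache_bytes: int, hicache_ratio: float) -> int:
    """Host KV Cache Size = Device KV Cache Size x hicache-ratio."""
    if device_kv_cache_bytes < 0 or hicache_ratio <= 0:
        raise ValueError("size must be non-negative and ratio must be positive")
    return int(device_kv_cache_bytes * hicache_ratio)


GIB = 1024**3
# A 16 GiB device cache with a ratio of 2 gives a 32 GiB host cache.
example_host_bytes = host_kv_cache_size(16 * GIB, 2)
```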
In `@docs/kvbm/kvbm_design.md`:
- Around line 103-105: The fenced code block containing the formula
"block_stride_in_bytes = align_up(num_layers × layer_stride, alignment);" should
include a language specifier (e.g., text) so renderers treat it as plain
text/pseudocode; update the triple-backtick fence to include `text` (or convert
the single-line formula to inline code) around the expression involving
block_stride_in_bytes, align_up, num_layers, layer_stride, and alignment.
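The formula quoted above can be sketched as follows, assuming `align_up` has its usual meaning of rounding a value up to the next multiple of `alignment`. The design document may define it differently, and the numbers in the usage are illustrative only.

```python
def align_up(value: int, alignment: int) -> int:
    """Round value up to the nearest multiple of alignment (assumed semantics)."""
    if alignment <= 0:
        raise ValueError("alignment must be positive")
    return (value + alignment - 1) // alignment * alignment


def block_stride_in_bytes(num_layers: int, layer_stride: int, alignment: int) -> int:
    """block_stride_in_bytes = align_up(num_layers x layer_stride, alignment)."""
    return align_up(num_layers * layer_stride, alignment)


# Illustrative values: 32 layers of 1000 bytes each, aligned to 4096 bytes.
example_stride = block_stride_in_bytes(32, 1000, 4096)
```

With these values the unaligned size is 32000 bytes, so the stride rounds up to 8 × 4096 = 32768 bytes.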
In `@docs/kvbm/kvbm_guide.md`:
- Line 1: Remove the trailing whitespace found on line 1 of the
docs/kvbm/kvbm_guide.md file: edit the file to delete the extra space at the end
of the first line (so it ends with the last visible character, not a space),
save and recommit so the pre-commit hook can pass; no code changes required
beyond trimming that whitespace.
- Around line 46-50: Update the broken link to the KVBM bindings README by
replacing the lowercase filename reference "readme.md" with the actual uppercase
filename "README.md" in the link text that points to the "KVBM bindings README"
(the line containing "To build KVBM from source, see the detailed instructions
in the [KVBM bindings
README](../../lib/bindings/kvbm/readme.md#build-from-source)"). Ensure the
Markdown link now uses "README.md" so it works on case-sensitive filesystems.
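The case-sensitivity problem above can be reproduced with a small link check. This is a minimal sketch rather than a tool used in this PR: `link_resolves_exact_case` and `demo` are hypothetical helpers, and the check compares each path component against the real directory entries so that it behaves the same on case-insensitive filesystems.

```python
import tempfile
from pathlib import Path


def link_resolves_exact_case(doc_path: Path, link: str) -> bool:
    """Return True if a relative Markdown link target exists with exact case."""
    target = link.split("#", 1)[0]  # drop the "#anchor" part, if any
    current = doc_path.resolve().parent
    for part in Path(target).parts:
        if part == "..":
            current = current.parent
            continue
        # Compare against real directory entries so that "readme.md" does not
        # match "README.md", even on a case-insensitive filesystem.
        if not current.is_dir() or part not in {p.name for p in current.iterdir()}:
            return False
        current = current / part
    return True


def demo() -> tuple[bool, bool]:
    """Build a throwaway tree mirroring the paths in the comment and test both links."""
    with tempfile.TemporaryDirectory() as root:
        doc = Path(root, "docs", "kvbm", "kvbm_guide.md")
        readme = Path(root, "lib", "bindings", "kvbm", "README.md")
        for f in (doc, readme):
            f.parent.mkdir(parents=True)
            f.touch()
        base = "../../lib/bindings/kvbm/"
        return (
            link_resolves_exact_case(doc, base + "readme.md#build-from-source"),
            link_resolves_exact_case(doc, base + "README.md#build-from-source"),
        )
```

Calling `demo()` checks the lowercase link first and the corrected uppercase link second.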
1389998 to 69fa1fe
Migrate KVBM documentation to a new three-tier structure:
- Tier 1: README.md (Quick Start) - overview, link to user guide, feature matrix, architecture
- Tier 2: kvbm_guide.md (Guide) - installation, configuration, deployment for all user paths (pip wheel, via trtllm/vllm or via the dynamo integrations with other kv offloading solutions)
- Tier 3: kvbm_design.md (Design) - architecture deep dive, components, data flows, framework integrations
Create integrations folder with:
- flexkv_integration.md - new FlexKV integration guide from PR #5858
- lmcache_integration.md - migrated from backends/vllm/
- sglang_hicache.md - migrated from backends/sglang/
Add AGENTS.md for KVBM component to guide AI agents.
Update docs/index.rst to add "KV Cache Offloading" as first item under User Guides.
Signed-off-by: akshatha-k <akshutk@gmail.com>
Update fern directory with new three-tier KVBM documentation:
- Add fern/pages/kvbm/README.md (Quick Start)
- Add fern/pages/kvbm/kvbm-guide.md (Guide)
- Add fern/pages/kvbm/kvbm-design.md (Design)
Create fern/pages/integrations/ with:
- lmcache-integration.md
- flexkv-integration.md
- sglang-hicache.md
Update fern/versions/next.yml navigation:
- Add "KV Cache Offloading" to User Guides section
- Update KVBM section with new structure
- Add Integrations section
Delete old fern KVBM files that were replaced.
Signed-off-by: akshatha-k <akshutk@gmail.com>
- Remove references to non-existent FlexKV launch scripts (agg_flexkv.sh, agg_flexkv_router.sh, disagg_flexkv.sh) from docs and fern
- Add language specifier 'text' to code blocks showing formulas in sglang_hicache.md and kvbm_design.md
- Fix broken link: kvbm_architecture.md → kvbm_design.md in sglang_hicache.md
- Fix case sensitivity: readme.md → README.md in kvbm_guide.md
Signed-off-by: akshatha-k <akshutk@gmail.com>
Move the Grafana screenshot to the central images directory and update the reference in kvbm_guide.md to use the new path. Signed-off-by: akshatha-k <akshutk@gmail.com>
Update links to reflect new documentation structure:
- kvbm_architecture.md → README.md (KVBM overview)
- LMCache_Integration.md → integrations/lmcache_integration.md
- trtllm-setup.md → kvbm_guide.md#run-kvbm-in-dynamo-with-tensorrt-llm
Affected files:
- README.md
- docs/backends/sglang/README.md
- docs/backends/trtllm/README.md
- docs/backends/vllm/README.md
- docs/backends/vllm/prometheus.md
- fern/pages/backends/sglang/README.md
Signed-off-by: akshatha-k <akshutk@gmail.com>
Update links in fern documentation to match new KVBM structure:
- Update KVBM and LMCache links in fern/pages/backends/
- Fix cross-references in fern/pages/integrations/sglang-hicache.md
Signed-off-by: akshatha-k <akshutk@gmail.com>
69fa1fe to ce8103b
/ok to test ce8103b
This commit reverts all changes made to the fern/ directory, restoring it to its state before this PR's documentation changes. Signed-off-by: akshatha-k <akshutk@gmail.com>
Signed-off-by: akshatha-k <akshutk@gmail.com>
5ef845f to 2d4dfb2
Signed-off-by: akshatha-k <akshutk@gmail.com>
/ok to test a1405da
/ok to test 130ed97
- Remove trailing whitespace in kvbm_guide.md (pre-commit failure)
- Remove deleted kvbm/vllm-setup.md and kvbm/trtllm-setup.md from hidden_toctree.rst
- Remove moved backends/vllm/LMCache_Integration.md from hidden_toctree.rst
- Fix spacing in kvbm_intro.rst: "GPU memory(in future) ," → "GPU memory (future),"
Signed-off-by: Dan Gil <dagil@nvidia.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
/ok to test 9edc4ba
- Remove lib/bindings/kvbm/AGENTS.md from this PR (submit separately)
- Add blank lines before image captions for proper rendering
- Add G4 (Remote Storage) to storage pools section
- Add tier labels (G3, G4) to Host → Disk offload descriptions
Signed-off-by: Dan Gil <dagil@nvidia.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
/ok to test cf7598a
Co-authored-by: Kyle McGill <101670481+nv-kmcgill53@users.noreply.github.com> Signed-off-by: dagil-nvidia <dagil@nvidia.com>
Co-authored-by: Kyle McGill <101670481+nv-kmcgill53@users.noreply.github.com> Signed-off-by: dagil-nvidia <dagil@nvidia.com>
Co-authored-by: Kyle McGill <101670481+nv-kmcgill53@users.noreply.github.com> Signed-off-by: dagil-nvidia <dagil@nvidia.com>
/ok to test e86c9ed
…o#5905) Signed-off-by: akshatha-k <akshutk@gmail.com> Signed-off-by: Dan Gil <dagil@nvidia.com> Signed-off-by: dagil-nvidia <dagil@nvidia.com> Co-authored-by: dagil-nvidia <dagil@nvidia.com> Co-authored-by: Cursor <cursoragent@cursor.com> Co-authored-by: Kyle McGill <101670481+nv-kmcgill53@users.noreply.github.com>
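Summary
Migrate KVBM documentation to a new three-tier structure:
- Tier 1: `README.md` (Quick Start) - overview, link to user guide, feature matrix, architecture
- Tier 2: `kvbm_guide.md` (Guide) - installation, configuration, deployment for all user paths (pip wheel, via trtllm/vllm or via the dynamo integrations with other kv offloading solutions)
- Tier 3: `kvbm_design.md` (Design) - architecture deep dive, components, data flows, framework integrations
Create integrations folder with:
- `flexkv_integration.md` - new FlexKV integration guide from PR #5858 (feat: FlexKV integration in Dynamo)
- `lmcache_integration.md` - migrated from backends/vllm/
- `sglang_hicache.md` - migrated from backends/sglang/
Add `AGENTS.md` for KVBM component to guide AI agents
Update `docs/index.rst` to add "KV Cache Offloading" as first item under User Guides
Test plan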