Conversation
📝 WalkthroughWalkthroughRemoved five legacy gb300-fp4 128k8k YAML recipes and added multiple new 128k8k and 128k8k_mtp configuration YAMLs, a chunked-prefill recipe, and a Bash setup script; changes are configuration-only (no code/API changes). Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 inconclusive)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches🧪 Generate unit tests (beta)
No actionable comments were generated in the recent review. 🎉 🧹 Recent nitpick comments
Tip Issue Planner is now in beta. Read the docs and try it out! Share your feedback on Discord. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 10
🤖 Fix all issues with AI agents
In
`@recipes/gb300-fp4/128k8k_mtp/highthroughput-ctx3_ctx_pp4_gen1_dep8_batch16_eplb0_mtp1.yaml`:
- Around line 1-4: The recipe filename indicates EPLB is off (eplb0) but the
sglang_config enables EPLB via the keys eplb-algorithm and
ep-num-redundant-experts; either rename the recipe to reflect EPLB on (update
name to include eplb1) or disable EPLB by removing or commenting out the
sglang_config keys eplb-algorithm and ep-num-redundant-experts (and any other
EPLB-related params) so the file name and the actual config are consistent;
locate these keys (eplb-algorithm, ep-num-redundant-experts) in the recipe and
apply one of the two options.
- Around line 145-146: The YAML contains a user-specific mount entry under the
extra_mount key (the string "/home/yangminl/sglang:/sgl-workspace/sglang");
remove this personal dev path or replace it with a generic/template value (e.g.,
an env var or placeholder) so CI and other developers won't reference a
non-existent local path; update the extra_mount list in the same file to either
omit the entry or use a parameter like "${LOCAL_SGLANG}:/sgl-workspace/sglang"
and document the expected env var.
In
`@recipes/gb300-fp4/128k8k_mtp/midcurve-ctx5_ctx_pp4_gen1_dep16_batch8_eplb0_mtp2.yaml`:
- Around line 97-99: The recipe filename indicates "eplb0" but the config
enables EPLB (eplb-algorithm: deepseek) and sets ep-num-redundant-experts: 32;
update one of them to be consistent: either rename the recipe file to include
the EPLB variant (e.g., remove "eplb0" or add "eplb-deepseek") or change the
config to disable EPLB (set eplb-algorithm to "none"/"" or remove
ep-num-redundant-experts and set ep-num-redundant-experts: 0) so that
ep-num-redundant-experts and eplb-algorithm in the YAML match the filename.
- Around line 1-4: The config name contains a duplicated token "ctx5_ctx_pp4"
which looks like a copy-paste typo; update the YAML name value (the "name:"
string in this file) to the corrected identifier (e.g.,
"gb300-fp4-128k8k-ctx5_pp4_gen1_dep16_batch8_eplb0_mtp2") and ensure the recipe
filename and any other references that mirror the name use the same corrected
token ("ctx5_pp4") so the identifier is consistent across the file and
references.
- Around line 146-147: The extra_mount entry contains a personal home path
"/home/yangminl/sglang:/sgl-workspace/sglang" which must not be committed;
remove this mapping from the extra_mount list or replace it with a
non-user-specific alternative (e.g., a project-relative path, CI-standard path,
or environment variable placeholder) so that the extra_mount key remains generic
and portable; update the extra_mount value that currently references
"/home/yangminl/sglang:/sgl-workspace/sglang" accordingly.
In
`@recipes/gb300-fp4/128k8k/highthroughput-ctx5_ctx_pp4_gen1_dep16_batch16_eplb0_mtp0.yaml`:
- Around line 21-66: The environment YAML has inconsistent boolean string
casing: MC_TE_METRIC is "true" while SGLANG_MOONCAKE_CUSTOM_MEM_POOL is "True";
update the values for SGLANG_MOONCAKE_CUSTOM_MEM_POOL in both decode_environment
and prefill_environment to the same lowercase "true" (or normalize both to
"True" if your code expects title-case) so string comparisons like MC_TE_METRIC
== "true" behave consistently; ensure you change the
SGLANG_MOONCAKE_CUSTOM_MEM_POOL entries that appear alongside MC_TE_METRIC in
both blocks.
- Around line 132-133: Remove the hardcoded developer path in the extra_mount
entry ("/home/yangminl/sglang:/sgl-workspace/sglang") — either delete that mount
line from the extra_mount list or replace the left-hand side with a configurable
placeholder/env var (e.g., ${SGLANG_LOCAL_PATH}) so users can override it;
ensure the YAML key extra_mount remains valid and document the new variable in
README or defaults so CI and other developers won’t rely on a personal home
directory.
In
`@recipes/gb300-fp4/128k8k/maxthroughput-ctx3_ctx_pp4_gen1_dep8_batch32_eplb0_mtp0.yaml`:
- Around line 132-133: The extra_mount entry currently contains a hardcoded,
user-specific path "/home/yangminl/sglang:/sgl-workspace/sglang"; update the
extra_mount configuration to remove or replace this hardcoded path by either
deleting the mount, substituting a project-relative/shared path (e.g.,
"./sglang:/sgl-workspace/sglang"), or parameterizing it with an environment
variable (e.g., use a placeholder like
"${SGLANG_HOST_PATH}:/sgl-workspace/sglang" and document SGLANG_HOST_PATH). Edit
the extra_mount key in the YAML to implement one of these options and ensure no
username or absolute home path remains.
In
`@recipes/gb300-fp4/128k8k/middlecurve-ctx8_ctx_pp4_gen1_dep32_batch8_eplb0_mtp0.yaml`:
- Around line 132-133: The extra_mount entry contains a hardcoded personal path
"/home/yangminl/sglang" which will break for other users; edit the extra_mount
list (key: extra_mount) to remove this user-specific mount or replace it with a
parameterized/portable value (e.g., use an environment variable placeholder like
${SG_LANG_PATH} or ${HOME}/sglang, or point to a shared well-known path) and
update any README or config docs to describe how to set that variable so the
recipe is portable.
- Around line 90-94: The filename indicates EPLB is disabled (suffix "eplb0")
but the config enables it via ep-num-redundant-experts and eplb-algorithm;
reconcile by either updating the filename to reflect EPLB enabled or disabling
EPLB in the config: change eplb-related fields (eplb-algorithm,
ep-num-redundant-experts, and any related flags like
disable-shared-experts-fusion if needed) so they reflect "EPLB off" (e.g., set
eplb-algorithm to a disabled/none value and ep-num-redundant-experts to 0) or
rename the recipe to remove the "eplb0" suffix so it accurately denotes EPLB is
active.
🧹 Nitpick comments (8)
recipes/gb300-fp4/128k8k_mtp/midcurve-ctx5_ctx_pp4_gen1_dep16_batch8_eplb0_mtp2.yaml (2)
46-49: Unresolved TODO comments — confirm whether these env vars are needed for prefill.Lines 48–49 flag open questions (
TODO: check if needed for prefill) forSGLANG_NCCL_ALL_GATHER_IN_OVERLAP_SCHEDULER_SYNC_BATCHandSGLANG_BLACKWELL_OVERLAP_SHARED_EXPERTS_OUTSIDE_SBO. Since these are currently absent fromprefill_environment, please verify and resolve the TODOs before merging, or leave a tracking issue.Would you like me to open an issue to track resolving these TODOs?
126-136: Commented-out config lines — clean up or document intent.Lines 126, 130, and 136 contain commented-out settings (
enable-symm-mem,max-total-tokens, alternativemoe-runner-backend). If these are intentionally kept as reference for future tuning, a brief comment explaining why would help; otherwise, remove them to reduce noise.recipes/gb300-fp4/128k8k/maxthroughput-ctx3_ctx_pp4_gen1_dep8_batch32_eplb0_mtp0.yaml (2)
21-66: Consider using YAML anchors to reduce environment duplication.The
decode_environmentandprefill_environmentblocks share ~15 identical key-value pairs, with decode adding only 3 extra vars (SGLANG_DEEPEP_NUM_MAX_DISPATCH_TOKENS_PER_RANK,SGLANG_MOE_NVFP4_DISPATCH,SGLANG_NVFP4_CKPT_FP8_NEXTN_MOE). YAML anchors (&/<<: *) would eliminate the drift risk when a shared variable needs updating.
113-117: Commented-out config keys left in prefill section.Lines 113 (
enable-symm-mem) and 117 (max-total-tokens) are commented out. If these are intentionally documented as tuning knobs, a brief inline comment explaining why they're disabled would help future readers; otherwise consider removing them to keep the config clean.recipes/gb300-fp4/128k8k/middlecurve-ctx8_ctx_pp4_gen1_dep32_batch8_eplb0_mtp0.yaml (2)
113-113: Clean up commented-out config lines before merging.Lines 113 (
# enable-symm-mem: true) and 117 (# max-total-tokens: 544000) are development remnants. If they're intentionally disabled, they can simply be removed to keep the config clean. If they may be re-enabled, consider adding a brief comment explaining why they're preserved.Also applies to: 117-117
43-44: Minor style inconsistency: unquoted string values.
FLASHINFER_WORKSPACE_BASEandSGLANG_DG_CACHE_DIRvalues are bare paths while most other env vars use quoted strings. YAML handles this fine, but quoting them would be consistent with the rest of the file.Also applies to: 65-66
recipes/gb300-fp4/128k8k_mtp/highthroughput-ctx3_ctx_pp4_gen1_dep8_batch16_eplb0_mtp1.yaml (2)
46-49: Resolve TODO comments before merging.Lines 48–49 flag open questions about whether
SGLANG_NCCL_ALL_GATHER_IN_OVERLAP_SCHEDULER_SYNC_BATCHandSGLANG_BLACKWELL_OVERLAP_SHARED_EXPERTS_OUTSIDE_SBOare needed for prefill. These are only set in the decode environment; if they are indeed needed for prefill as well, they should be added toprefill_environment. Please resolve these TODOs (or confirm they're decode-only) before this ships.Would you like me to open an issue to track confirming and removing these TODOs?
43-44: Unquoted path values — minor inconsistency.
FLASHINFER_WORKSPACE_BASEandSGLANG_DG_CACHE_DIRare unquoted on Lines 43–44 and 70–71, while most other values in the environment blocks are quoted strings. YAML will still parse these as strings, but quoting them would be consistent with the rest of the file.Also applies to: 70-71
| # Config: ctx3_ctx_pp4_gen1_dep8_batch16_eplb0_mtp1 | ||
| # max global bs: 128, concurrency: 256 | ||
|
|
||
| name: "gb300-fp4-128k8k-ctx3_ctx_pp4_gen1_dep8_batch16_eplb0_mtp1" |
There was a problem hiding this comment.
Filename says eplb0 but config enables EPLB.
The filename encodes eplb0 (EPLB off), yet the decode sglang_config sets eplb-algorithm: deepseek (Line 99) and ep-num-redundant-experts: 32 (Line 97). Either the filename should be updated to reflect EPLB being on, or those config keys should be removed if EPLB is truly intended to be disabled.
🤖 Prompt for AI Agents
In
`@recipes/gb300-fp4/128k8k_mtp/highthroughput-ctx3_ctx_pp4_gen1_dep8_batch16_eplb0_mtp1.yaml`
around lines 1 - 4, The recipe filename indicates EPLB is off (eplb0) but the
sglang_config enables EPLB via the keys eplb-algorithm and
ep-num-redundant-experts; either rename the recipe to reflect EPLB on (update
name to include eplb1) or disable EPLB by removing or commenting out the
sglang_config keys eplb-algorithm and ep-num-redundant-experts (and any other
EPLB-related params) so the file name and the actual config are consistent;
locate these keys (eplb-algorithm, ep-num-redundant-experts) in the recipe and
apply one of the two options.
recipes/gb300-fp4/128k8k_mtp/highthroughput-ctx3_ctx_pp4_gen1_dep8_batch16_eplb0_mtp1.yaml
Outdated
Show resolved
Hide resolved
| # Config: ctx5_ctx_pp4_gen1_dep16_batch8_eplb0_mtp2 | ||
| # max global bs: 128, concurrency: 256 | ||
|
|
||
| name: "gb300-fp4-128k8k-ctx5_ctx_pp4_gen1_dep16_batch8_eplb0_mtp2" |
There was a problem hiding this comment.
Possible typo in the config identifier: ctx5_ctx.
The name and filename both contain the substring ctx5_ctx_pp4, where the repeated ctx fragment looks like a copy-paste artifact. Should this be ctx5_pp4 (or a different token)?
🤖 Prompt for AI Agents
In
`@recipes/gb300-fp4/128k8k_mtp/midcurve-ctx5_ctx_pp4_gen1_dep16_batch8_eplb0_mtp2.yaml`
around lines 1 - 4, The config name contains a duplicated token "ctx5_ctx_pp4"
which looks like a copy-paste typo; update the YAML name value (the "name:"
string in this file) to the corrected identifier (e.g.,
"gb300-fp4-128k8k-ctx5_pp4_gen1_dep16_batch8_eplb0_mtp2") and ensure the recipe
filename and any other references that mirror the name use the same corrected
token ("ctx5_pp4") so the identifier is consistent across the file and
references.
| ep-num-redundant-experts: 32 | ||
| disable-shared-experts-fusion: true | ||
| eplb-algorithm: deepseek |
There was a problem hiding this comment.
Filename says eplb0 but config sets eplb-algorithm: deepseek.
The filename encodes eplb0, which implies EPLB is disabled or zeroed, but the decode config explicitly enables it with eplb-algorithm: deepseek and ep-num-redundant-experts: 32. Either the filename should reflect the actual EPLB setting or the config should disable EPLB to match the name.
🤖 Prompt for AI Agents
In
`@recipes/gb300-fp4/128k8k_mtp/midcurve-ctx5_ctx_pp4_gen1_dep16_batch8_eplb0_mtp2.yaml`
around lines 97 - 99, The recipe filename indicates "eplb0" but the config
enables EPLB (eplb-algorithm: deepseek) and sets ep-num-redundant-experts: 32;
update one of them to be consistent: either rename the recipe file to include
the EPLB variant (e.g., remove "eplb0" or add "eplb-deepseek") or change the
config to disable EPLB (set eplb-algorithm to "none"/"" or remove
ep-num-redundant-experts and set ep-num-redundant-experts: 0) so that
ep-num-redundant-experts and eplb-algorithm in the YAML match the filename.
recipes/gb300-fp4/128k8k_mtp/midcurve-ctx5_ctx_pp4_gen1_dep16_batch8_eplb0_mtp2.yaml
Outdated
Show resolved
Hide resolved
| decode_environment: | ||
| TORCH_DISTRIBUTED_DEFAULT_TIMEOUT: "7200" | ||
| PYTHONUNBUFFERED: "1" | ||
| DYN_SKIP_SGLANG_LOG_FORMATTING: "1" | ||
| SGLANG_NVFP4_CKPT_FP8_GEMM_IN_ATTN: "1" | ||
| SGLANG_PER_TOKEN_GROUP_QUANT_8BIT_V2: "1" | ||
| SGLANG_DISAGGREGATION_HEARTBEAT_MAX_FAILURE: "100000" | ||
| SGLANG_DISAGGREGATION_BOOTSTRAP_TIMEOUT: "100000" | ||
| SGLANG_DISAGGREGATION_WAITING_TIMEOUT: "100000" | ||
| SGLANG_HACK_SEQ_BOOTSTRAP_ROOM: "1" | ||
| MC_TE_METRIC: "true" | ||
| MC_FORCE_MNNVL: "1" | ||
| NCCL_MNNVL_ENABLE: "1" | ||
| NCCL_CUMEM_ENABLE: "1" | ||
| SGLANG_MOONCAKE_CUSTOM_MEM_POOL: "True" | ||
| SGLANG_USE_MESSAGE_QUEUE_BROADCASTER: "0" | ||
| SGLANG_DISABLE_TP_MEMORY_INBALANCE_CHECK: "1" | ||
| SGLANG_DEEPEP_NUM_MAX_DISPATCH_TOKENS_PER_RANK: "1024" | ||
| SGLANG_MOE_NVFP4_DISPATCH: "1" | ||
| SGLANG_NVFP4_CKPT_FP8_NEXTN_MOE: "1" | ||
| SGLANG_FLASHINFER_FP4_GEMM_BACKEND: "cutlass" | ||
| FLASHINFER_DISABLE_VERSION_CHECK: "1" | ||
| FLASHINFER_WORKSPACE_BASE: /tmp/flashinfer-cache | ||
| SGLANG_DG_CACHE_DIR: /configs/deepgemm-cache | ||
|
|
||
| prefill_environment: | ||
| TORCH_DISTRIBUTED_DEFAULT_TIMEOUT: "7200" | ||
| PYTHONUNBUFFERED: "1" | ||
| DYN_SKIP_SGLANG_LOG_FORMATTING: "1" | ||
| SGLANG_NVFP4_CKPT_FP8_GEMM_IN_ATTN: "1" | ||
| SGLANG_PER_TOKEN_GROUP_QUANT_8BIT_V2: "1" | ||
| SGLANG_DISAGGREGATION_HEARTBEAT_MAX_FAILURE: "100000" | ||
| SGLANG_DISAGGREGATION_BOOTSTRAP_TIMEOUT: "100000" | ||
| SGLANG_DISAGGREGATION_WAITING_TIMEOUT: "100000" | ||
| SGLANG_HACK_SEQ_BOOTSTRAP_ROOM: "1" | ||
| MC_TE_METRIC: "true" | ||
| MC_FORCE_MNNVL: "1" | ||
| NCCL_MNNVL_ENABLE: "1" | ||
| NCCL_CUMEM_ENABLE: "1" | ||
| SGLANG_MOONCAKE_CUSTOM_MEM_POOL: "True" | ||
| SGLANG_USE_MESSAGE_QUEUE_BROADCASTER: "0" | ||
| SGLANG_DISABLE_TP_MEMORY_INBALANCE_CHECK: "1" | ||
| SGLANG_FLASHINFER_FP4_GEMM_BACKEND: "cutlass" | ||
| FLASHINFER_DISABLE_VERSION_CHECK: "1" | ||
| FLASHINFER_WORKSPACE_BASE: /tmp/flashinfer-cache | ||
| SGLANG_DG_CACHE_DIR: /configs/deepgemm-cache |
There was a problem hiding this comment.
Minor inconsistency: "true" vs "True" casing in environment values.
MC_TE_METRIC uses lowercase "true" (lines 31, 56) while SGLANG_MOONCAKE_CUSTOM_MEM_POOL uses title-case "True" (lines 35, 60). If the consuming code does strict string comparison (e.g., == "true"), the title-case variant may not be recognized. Worth normalizing to a consistent casing.
🤖 Prompt for AI Agents
In
`@recipes/gb300-fp4/128k8k/highthroughput-ctx5_ctx_pp4_gen1_dep16_batch16_eplb0_mtp0.yaml`
around lines 21 - 66, The environment YAML has inconsistent boolean string
casing: MC_TE_METRIC is "true" while SGLANG_MOONCAKE_CUSTOM_MEM_POOL is "True";
update the values for SGLANG_MOONCAKE_CUSTOM_MEM_POOL in both decode_environment
and prefill_environment to the same lowercase "true" (or normalize both to
"True" if your code expects title-case) so string comparisons like MC_TE_METRIC
== "true" behave consistently; ensure you change the
SGLANG_MOONCAKE_CUSTOM_MEM_POOL entries that appear alongside MC_TE_METRIC in
both blocks.
recipes/gb300-fp4/128k8k/highthroughput-ctx5_ctx_pp4_gen1_dep16_batch16_eplb0_mtp0.yaml
Outdated
Show resolved
Hide resolved
recipes/gb300-fp4/128k8k/maxthroughput-ctx3_ctx_pp4_gen1_dep8_batch32_eplb0_mtp0.yaml
Outdated
Show resolved
Hide resolved
| deepep-mode: low_latency | ||
| ep-dispatch-algorithm: static | ||
| ep-num-redundant-experts: 32 | ||
| disable-shared-experts-fusion: true | ||
| eplb-algorithm: deepseek |
There was a problem hiding this comment.
Filename says eplb0 but config enables EPLB.
The filename suffix eplb0 implies EPLB is disabled, yet the decode config sets eplb-algorithm: deepseek with ep-num-redundant-experts: 32. Either the filename is stale or these config values are unintended. Please reconcile.
🤖 Prompt for AI Agents
In
`@recipes/gb300-fp4/128k8k/middlecurve-ctx8_ctx_pp4_gen1_dep32_batch8_eplb0_mtp0.yaml`
around lines 90 - 94, The filename indicates EPLB is disabled (suffix "eplb0")
but the config enables it via ep-num-redundant-experts and eplb-algorithm;
reconcile by either updating the filename to reflect EPLB enabled or disabling
EPLB in the config: change eplb-related fields (eplb-algorithm,
ep-num-redundant-experts, and any related flags like
disable-shared-experts-fusion if needed) so they reflect "EPLB off" (e.g., set
eplb-algorithm to a disabled/none value and ep-num-redundant-experts to 0) or
rename the recipe to remove the "eplb0" suffix so it accurately denotes EPLB is
active.
recipes/gb300-fp4/128k8k/middlecurve-ctx8_ctx_pp4_gen1_dep32_batch8_eplb0_mtp0.yaml
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
These two flags are decode only
There was a problem hiding this comment.
Got it, removing in next patchset
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Fix all issues with AI agents
In `@configs/gb300-fp4-128k8k-setup.sh`:
- Around line 39-42: The direct wheel URL for sgl-kernel
(sgl_kernel-0.3.21+cu130-cp310-abi3_manylinux2014_aarch64.whl) is pinned to
aarch64 and CPython 3.10 and may install an incompatible binary; add a pre-check
that validates the host architecture and Python major/minor (e.g., check uname
-m and python -c "import sys; print(sys.version_info[:2])") before running the
pip install, and if they don’t match aarch64/3.10 either abort with a clear
error message or choose an alternative wheel/source, or at minimum add an
explicit comment documenting the assumed architecture/Python and reference
PIP_ARGS and the exact wheel filename so future changes will update this guard.
- Around line 4-10: The setup script currently hardcodes BRANCH="gb300_blog" and
replaces the remote with a personal fork via the git remote add origin
https://github.com/YAMY1234/sglang.git and then checks out origin/${BRANCH},
which risks non-reproducible and broken deployments; change the script to accept
the remote URL and branch/commit as parameters (or default to the official
upstream repo sgl-project/sglang), stop force-replacing origin blindly (use git
remote set-url or conditionally add if missing), and pin the checkout to a
specific commit SHA instead of checking out origin/${BRANCH} so the code is
reproducible (update the BRANCH variable usage and the git fetch / git checkout
steps to fetch the chosen remote and checkout the provided commit SHA).
In
`@recipes/gb300-fp4/128k8k/maxthroughput-ctx3_ctx_pp4_gen1_dep8_batch32_eplb0_mtp0.yaml`:
- Around line 1-4: The filename indicates EPLB disabled (eplb0) but the config
enables it — fix the mismatch by either renaming the recipe to use eplb1 in the
name or disabling the EPLB keys; locate and update the entries
ep-num-redundant-experts, disable-shared-experts-fusion, and eplb-algorithm
(lines shown in the diff) to be consistent with the filename: if you want EPLB
off remove or zero out ep-num-redundant-experts and reset/disable
disable-shared-experts-fusion and eplb-algorithm, or else rename the file and
top-level name string from eplb0 to eplb1 to reflect that EPLB is enabled.
In
`@recipes/gb300-fp4/128k8k/minttft-ctx1_ctx_pp4_gen1_tp4_eplb0_mtp0-chunked-32k-dynchunk.yaml`:
- Around line 1-4: The config filename contains "tp4" while the YAML name field
and top comment contain "tep4", causing a mismatch; pick the canonical shorthand
and make them consistent by either renaming the file to use "tep4" or updating
the name field string and comment to use "tp4" (edit the value of the name key
"gb300-fp4-128k8k-ctx1_ctx_pp4_gen1_tep4_batch1_eplb0_mtp0" and the top comment
accordingly), ensuring any tooling or references that parse the filename or the
name key will match.
🧹 Nitpick comments (1)
recipes/gb300-fp4/128k8k_mtp/midcurve-ctx5_ctx_pp4_gen1_dep16_batch8_eplb0_mtp2.yaml (1)
19-66: Decode and prefill environment blocks share ~15 identical variables.Consider extracting common environment variables into a shared/base section (if supported by the deployment framework) to reduce duplication and drift risk. Not blocking, just a maintainability suggestion.
| BRANCH="gb300_blog" | ||
|
|
||
| cd /sgl-workspace/sglang | ||
| git remote remove origin | ||
| git remote add origin https://github.com/YAMY1234/sglang.git | ||
| git fetch origin | ||
| git checkout origin/${BRANCH} |
There was a problem hiding this comment.
Setup script points origin to a personal fork — this should not be merged as-is.
Replacing the container's origin remote with https://github.com/YAMY1234/sglang.git ties every deployment to a personal fork. If this fork diverges, is force-pushed, or is deleted, all environments running this setup will break or pull unexpected code.
Additionally, checking out origin/${BRANCH} (a branch tip) without pinning a specific commit SHA makes this non-reproducible.
Consider:
- Using the upstream
sgl-project/sglangrepository, or at minimum parameterizing the remote URL. - Pinning a specific commit hash instead of a branch name.
Suggested improvement
-BRANCH="gb300_blog"
+REPO_URL="${SGLANG_REPO_URL:-https://github.com/sgl-project/sglang.git}"
+COMMIT="${SGLANG_COMMIT:-<pin-a-specific-sha>}"
cd /sgl-workspace/sglang
git remote remove origin
-git remote add origin https://github.com/YAMY1234/sglang.git
+git remote add origin "$REPO_URL"
git fetch origin
-git checkout origin/${BRANCH}
+git checkout "$COMMIT"🤖 Prompt for AI Agents
In `@configs/gb300-fp4-128k8k-setup.sh` around lines 4 - 10, The setup script
currently hardcodes BRANCH="gb300_blog" and replaces the remote with a personal
fork via the git remote add origin https://github.com/YAMY1234/sglang.git and
then checks out origin/${BRANCH}, which risks non-reproducible and broken
deployments; change the script to accept the remote URL and branch/commit as
parameters (or default to the official upstream repo sgl-project/sglang), stop
force-replacing origin blindly (use git remote set-url or conditionally add if
missing), and pin the checkout to a specific commit SHA instead of checking out
origin/${BRANCH} so the code is reproducible (update the BRANCH variable usage
and the git fetch / git checkout steps to fetch the chosen remote and checkout
the provided commit SHA).
| # Install sgl-kernel 0.3.21 for CUDA 13.0 (aarch64) | ||
| echo "Installing sgl-kernel 0.3.21..." | ||
| pip install https://github.com/sgl-project/whl/releases/download/v0.3.21/sgl_kernel-0.3.21+cu130-cp310-abi3-manylinux2014_aarch64.whl --force-reinstall $PIP_ARGS | ||
| echo "sgl-kernel 0.3.21 installed" |
There was a problem hiding this comment.
Direct wheel URL for sgl-kernel is architecture- and Python-version-specific.
The wheel sgl_kernel-0.3.21+cu130-cp310-abi3-manylinux2014_aarch64.whl is pinned to aarch64 and cp310 ABI. If the container ever changes Python version or architecture, this will silently install an incompatible binary. Consider adding a guard or comment documenting these assumptions.
🤖 Prompt for AI Agents
In `@configs/gb300-fp4-128k8k-setup.sh` around lines 39 - 42, The direct wheel URL
for sgl-kernel (sgl_kernel-0.3.21+cu130-cp310-abi3_manylinux2014_aarch64.whl) is
pinned to aarch64 and CPython 3.10 and may install an incompatible binary; add a
pre-check that validates the host architecture and Python major/minor (e.g.,
check uname -m and python -c "import sys; print(sys.version_info[:2])") before
running the pip install, and if they don’t match aarch64/3.10 either abort with
a clear error message or choose an alternative wheel/source, or at minimum add
an explicit comment documenting the assumed architecture/Python and reference
PIP_ARGS and the exact wheel filename so future changes will update this guard.
| # Config: ctx3_ctx_pp4_gen1_dep8_batch32_eplb0_mtp0 | ||
| # max global bs: 256, concurrency: 512 | ||
|
|
||
| name: "gb300-fp4-128k8k-ctx3_ctx_pp4_gen1_dep8_batch32_eplb0_mtp0" |
There was a problem hiding this comment.
Filename encodes eplb0 but decode config enables EPLB.
Lines 87-89 set ep-num-redundant-experts: 32, disable-shared-experts-fusion: true, and eplb-algorithm: deepseek, which all indicate EPLB is active. The filename eplb0 suggests it's off. This mismatch is present across multiple YAML files in this PR (maxthroughput, highthroughput, middlecurve). Please reconcile — either rename the files to eplb1 or remove the EPLB-related keys.
Also applies to: 85-89
🤖 Prompt for AI Agents
In
`@recipes/gb300-fp4/128k8k/maxthroughput-ctx3_ctx_pp4_gen1_dep8_batch32_eplb0_mtp0.yaml`
around lines 1 - 4, The filename indicates EPLB disabled (eplb0) but the config
enables it — fix the mismatch by either renaming the recipe to use eplb1 in the
name or disabling the EPLB keys; locate and update the entries
ep-num-redundant-experts, disable-shared-experts-fusion, and eplb-algorithm
(lines shown in the diff) to be consistent with the filename: if you want EPLB
off remove or zero out ep-num-redundant-experts and reset/disable
disable-shared-experts-fusion and eplb-algorithm, or else rename the file and
top-level name string from eplb0 to eplb1 to reflect that EPLB is enabled.
| # Config: ctx1_ctx_pp4_gen1_tep4_batch1_eplb0_mtp0 | ||
| # chunked prefill size: 32000 (dynamic chunking enabled) | ||
|
|
||
| name: "gb300-fp4-128k8k-ctx1_ctx_pp4_gen1_tep4_batch1_eplb0_mtp0" |
There was a problem hiding this comment.
Filename says tp4 but the name field and comment say tep4 — which is correct?
The filename uses tp4 while Line 1 and Line 4 both use tep4. If tep4 is a shorthand for tensor+expert parallel, this could confuse tooling or humans grepping by filename vs. the name key. Please reconcile so the filename and config name match.
🤖 Prompt for AI Agents
In
`@recipes/gb300-fp4/128k8k/minttft-ctx1_ctx_pp4_gen1_tp4_eplb0_mtp0-chunked-32k-dynchunk.yaml`
around lines 1 - 4, The config filename contains "tp4" while the YAML name field
and top comment contain "tep4", causing a mismatch; pick the canonical shorthand
and make them consistent by either renaming the file to use "tep4" or updating
the name field string and comment to use "tp4" (edit the value of the name key
"gb300-fp4-128k8k-ctx1_ctx_pp4_gen1_tep4_batch1_eplb0_mtp0" and the top comment
accordingly), ensuring any tooling or references that parse the filename or the
name key will match.
Summary by CodeRabbit