feat(sglang): relay forward pass metrics to event plane #7386

Draft
ishandhanani wants to merge 4 commits into main from idhanani/sglang-fpm-relay

@ishandhanani
Contributor

Summary

  • Extend DynamoSglangPublisher with init_fpm_relay(), which subscribes to sglang's ForwardPassMetrics ZMQ PUB socket and relays the metrics to the event plane
  • Map the DYN_FORWARDPASS_METRIC_PORT env var to sglang's server_args.forward_pass_metrics_port in args.py
  • Use the same FpmEventRelay Rust binding as the vLLM integration (feat: ForwardPassMetrics dynamo event plane integration #7250)
  • Handle DP attention topology (one relay per local DP rank)
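
To illustrate the "one relay per local DP rank" point, here is a minimal sketch. Everything in it is an assumption made for illustration: the even split of DP ranks across nodes, the `base_port + dp_rank` port layout, and the helper names `local_dp_ranks` and `relay_endpoints` are hypothetical and not taken from this PR's code.

```python
from typing import List


def local_dp_ranks(dp_size: int, nnodes: int, node_rank: int) -> List[int]:
    """Return the DP ranks hosted on this node, assuming an even split across nodes."""
    if dp_size % nnodes != 0:
        raise ValueError("this sketch assumes dp_size is divisible by nnodes")
    per_node = dp_size // nnodes
    start = node_rank * per_node
    return list(range(start, start + per_node))


def relay_endpoints(base_port: int, dp_size: int, nnodes: int, node_rank: int) -> List[str]:
    """Build one localhost endpoint per local DP rank.

    The base_port + rank layout is a hypothetical convention, not the PR's.
    """
    return [
        f"tcp://127.0.0.1:{base_port + rank}"
        for rank in local_dp_ranks(dp_size, nnodes, node_rank)
    ]
```

Under these assumptions, a node would create one relay for each endpoint returned by `relay_endpoints`, so a two-node deployment with `dp_size=8` would run four relays per node.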

Motivation

Companion to sgl-project/sglang#20569, which adds per-iteration ForwardPassMetrics emission from sglang's scheduler. This PR wires up the Dynamo side to receive those metrics and forward them to the event plane, where the planner consumes them.

Architecture

sglang scheduler (SchedulerMetricsMixin):
  _emit_forward_pass_metrics() -> _FpmPublisherThread -> ZMQ PUB (localhost)

Dynamo parent (DynamoSglangPublisher):
  init_fpm_relay() -> FpmEventRelay (ZMQ SUB) -> Event Plane (NATS)

Consumer (planner):
  FpmEventSubscriber -> decode() -> ForwardPassMetrics
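
The three-stage flow above (publisher, relay, consumer) can be sketched with standard-library stand-ins. This is only a toy model of the data flow: `queue.Queue` replaces both the ZMQ PUB/SUB socket and the NATS event plane, JSON replaces the real wire encoding, and `ForwardPassMetricsSketch` with its fields is hypothetical, since the actual schema is not shown in this PR description.

```python
import json
import queue
import threading
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class ForwardPassMetricsSketch:
    # Hypothetical fields; the real ForwardPassMetrics schema is not shown here.
    iteration: int
    num_tokens: int


def encode(metrics: ForwardPassMetricsSketch) -> bytes:
    """Serialize metrics to bytes (JSON is an assumption, not the real format)."""
    return json.dumps(asdict(metrics)).encode()


def decode(payload: bytes) -> ForwardPassMetricsSketch:
    """Inverse of encode()."""
    return ForwardPassMetricsSketch(**json.loads(payload.decode()))


_STOP = object()  # sentinel that shuts the relay down


def relay(source: queue.Queue, sink: queue.Queue) -> None:
    """Forward raw payloads unchanged from source to sink until _STOP arrives."""
    while True:
        item = source.get()
        sink.put(item)
        if item is _STOP:
            return


def run_pipeline(iterations: int) -> List[ForwardPassMetricsSketch]:
    zmq_like: queue.Queue = queue.Queue()          # stand-in for ZMQ PUB -> SUB
    event_plane_like: queue.Queue = queue.Queue()  # stand-in for the event plane
    worker = threading.Thread(target=relay, args=(zmq_like, event_plane_like))
    worker.start()

    # Scheduler side: emit one metrics record per iteration.
    for i in range(iterations):
        zmq_like.put(encode(ForwardPassMetricsSketch(iteration=i, num_tokens=128 + i)))
    zmq_like.put(_STOP)

    # Consumer side: decode everything the relay forwarded.
    received = []
    while True:
        item = event_plane_like.get()
        if item is _STOP:
            break
        received.append(decode(item))
    worker.join()
    return received
```

The relay never decodes the payload; it only moves bytes, which matches the idea that decoding happens on the consumer side.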

Changes

  • args.py: Map DYN_FORWARDPASS_METRIC_PORT env var to server_args.forward_pass_metrics_port before engine creation
  • publisher.py: Add init_fpm_relay(), which reads the port from server_args; tear the relay down in cleanup(); call init_fpm_relay() from setup_sgl_metrics()
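
A minimal sketch of the args.py mapping described above. `ServerArgsSketch` is a stand-in for sglang's server_args that models only the one field named in this PR, and `apply_fpm_port_env` is a hypothetical helper name. Whether the env var overrides a value already set on server_args is not stated here; this sketch assumes that it does.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class ServerArgsSketch:
    # Stand-in for sglang's server_args; only the field named in the PR is modelled.
    forward_pass_metrics_port: Optional[int] = None


def apply_fpm_port_env(
    server_args: ServerArgsSketch,
    environ: Mapping[str, str] = os.environ,
) -> ServerArgsSketch:
    """Copy DYN_FORWARDPASS_METRIC_PORT into server_args before engine creation."""
    raw = environ.get("DYN_FORWARDPASS_METRIC_PORT")
    if not raw:
        # Env var unset or empty: leave server_args untouched (feature stays off).
        return server_args
    port = int(raw)  # raises ValueError on non-numeric input
    if not 0 < port < 65536:
        raise ValueError(f"DYN_FORWARDPASS_METRIC_PORT out of range: {port}")
    # Assumption: the env var takes precedence over any existing value.
    server_args.forward_pass_metrics_port = port
    return server_args
```

Passing the environment as a parameter keeps the helper easy to test; a publisher would then read only `server_args.forward_pass_metrics_port` and never touch the env var.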

Test plan

  • Wire compatibility verified: sglang-encoded ForwardPassMetrics decoded by Dynamo's decoder
  • Live server test: sglang with --forward-pass-metrics-port 20380 emits prefill/decode metrics per iteration
  • Full stack test with event plane consumer (requires NATS/etcd)

…elay

Extend DynamoSglangPublisher with init_fpm_relay() to subscribe to
sglang's new ForwardPassMetrics ZMQ PUB and relay to the Dynamo event
plane. Uses the same FpmEventRelay Rust binding as the vLLM integration.

Gated behind the DYN_FORWARDPASS_METRIC_PORT env var. Creates one relay
per local DP rank and handles multi-node DP attention topology.

Map the DYN_FORWARDPASS_METRIC_PORT env var to sglang's new
server_args.forward_pass_metrics_port in args.py. The publisher
reads the port from server_args instead of reading the env var directly.
@github-actions (bot) added the feat, documentation, and backend::sglang labels and removed the feat label on Mar 14, 2026
Labels

  • backend::sglang: Relates to the sglang backend
  • documentation: Improvements or additions to documentation
  • feat
  • size/L