Conversation

@ekagra-ranjan (Owner) commented Apr 9, 2025

I was looking into SD metrics in V1 and found that spec_decoding_stats is re-initialized every time we do an engine step, yet we use an observe function which, from the name, is supposed to aggregate over multiple observe calls. However, since it is re-initialized every time, there will always be exactly one observe call and no aggregation.

To enable AL (acceptance length) computation for checking correctness, this PR aggregates the metrics across steps in EngineCoreOutputs.scheduler_stats.
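To make the reported behaviour concrete, here is a minimal, self-contained sketch (an illustrative stand-in class, not the actual vLLM code) of why re-initializing the stats object every step defeats observe()'s aggregation:

class SpecDecodingStats:  # stand-in for illustration only
    def __init__(self):
        self.num_draft_tokens = 0
        self.num_accepted_tokens = 0

    def observe(self, num_draft_tokens, num_accepted_tokens):
        # accumulates only for the lifetime of this object
        self.num_draft_tokens += num_draft_tokens
        self.num_accepted_tokens += num_accepted_tokens

# what happens today: a fresh object per engine step
stats = SpecDecodingStats()   # step 1
stats.observe(4, 2)
stats = SpecDecodingStats()   # step 2: re-init discards step 1's counts
stats.observe(4, 3)
print(stats.num_draft_tokens, stats.num_accepted_tokens)   # 4 3, not 8 5 -- no aggregation across steps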

Comment on lines +109 to +112
num_draft_tokens = scheduler_stats.spec_decoding_stats.num_draft_tokens
num_accepted_tokens = scheduler_stats.spec_decoding_stats.num_accepted_tokens
num_spec_proposal = num_draft_tokens / args.num_spec_tokens
mean_accepted_tokens = 1 + num_accepted_tokens / num_spec_proposal
@ekagra-ranjan (Owner, Author) commented Apr 9, 2025
num_spec_proposal is the number of times the speculative decoding (SD) call was made.

mean_accepted_tokens = (sum of tokens generated over num_spec_proposal calls) / num_spec_proposal
                     = (num_spec_proposal + sum of accepted tokens over num_spec_proposal calls) / num_spec_proposal
                     = 1 + num_accepted_tokens / num_spec_proposal

(each SD proposal always yields one token from the target model in addition to the accepted draft tokens, hence the +1)
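For example, with made-up numbers and assuming args.num_spec_tokens = 3 draft tokens per proposal:

num_spec_tokens = 3         # draft tokens proposed per SD call (example value)
num_draft_tokens = 3000     # aggregated over all engine steps (made-up)
num_accepted_tokens = 1800  # aggregated over all engine steps (made-up)

num_spec_proposal = num_draft_tokens / num_spec_tokens               # 1000 SD calls
mean_accepted_tokens = 1 + num_accepted_tokens / num_spec_proposal   # 1 + 1.8 = 2.8

On average each SD call emits 2.8 tokens: the one token the target model always produces plus 1.8 accepted draft tokens.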

Comment on lines +573 to +574
# spec_decoding_stats: Optional[SpecDecodingStats] = None
spec_decoding_stats = self.spec_decoding_stats
@ekagra-ranjan (Owner, Author)
Cache the spec_decoding_stats so that it keeps a running metric instead of being re-initialized every engine step.
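A minimal sketch of the caching pattern (illustrative names, reusing the SpecDecodingStats stand-in from the sketch above, not the exact vLLM scheduler code): the stats object lives on the scheduler instead of being recreated inside every update, so observe() keeps running totals across steps.

class Scheduler:  # illustrative stand-in, not the actual vLLM scheduler
    def __init__(self):
        # created once and cached across engine steps
        self.spec_decoding_stats = SpecDecodingStats()

    def update_from_output(self, num_draft_tokens, num_accepted_tokens):
        # reuse the cached object instead of re-initializing it each step
        spec_decoding_stats = self.spec_decoding_stats
        spec_decoding_stats.observe(num_draft_tokens, num_accepted_tokens)
        return spec_decoding_stats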

model_dir = "meta-llama/Meta-Llama-3-8B-Instruct"
eagle_dir = "abhigoyal/EAGLE-LLaMA3-Instruct-8B-vllm"
# eagle_dir = "yuhuili/EAGLE-LLaMA3-Instruct-8B"
eagle_dir = "lmsys/sglang-EAGLE-LLaMA3-Instruct-8B"

@markmc commented May 9, 2025

hi @ekagra-ranjan

I was looking into SD metrics in V1 and found that spec_decoding_stats is re-initialized every time we do an engine step, yet we use an observe function which, from the name, is supposed to aggregate over multiple observe calls. However, since it is re-initialized every time, there will always be exactly one observe call and no aggregation.

Hmm, we discussed this on Slack shortly after you submitted this PR

spec_decoding_stats = self.make_spec_decoding_stats(
    spec_decoding_stats,

A new SpecDecodingStats should only be created once per update_from_output() call; we should aggregate across all requests in a single step.

    def make_spec_decoding_stats(
        self,
        spec_decoding_stats: Optional[SpecDecodingStats],
        ...
    ) -> Optional[SpecDecodingStats]:
        ...
        # created lazily: at most one SpecDecodingStats per update_from_output() call
        if spec_decoding_stats is None:
            spec_decoding_stats = SpecDecodingStats()
        ...
        return spec_decoding_stats

Your response, for reference:

Oh, the purpose of SpecDecodingStats is just to aggregate across the requests within a step, and it is re-initialized per step. Then it is working fine.

@ekagra-ranjan (Owner, Author)
Hi @markmc - yup, we are good. I am still using this hacky PR whenever I want to quickly find the AL for my evals, since vllm-project#16367 is still not merged.

github-actions bot commented Aug 8, 2025

This pull request has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this pull request should remain open. Thank you!

github-actions bot added the stale label Aug 8, 2025
github-actions bot commented Sep 8, 2025

This pull request has been automatically closed due to inactivity. Please feel free to reopen if you intend to continue working on it. Thank you!

github-actions bot closed this Sep 8, 2025