
[Feature] logger refactor#466

Merged
hsliuustc0106 merged 3 commits into vllm-project:main from Bounty-hunter:logger_refactor
Dec 26, 2025

Conversation

Contributor

@Bounty-hunter Bounty-hunter commented Dec 25, 2025


Purpose

As described in #442, this PR does the following:
(1) Unifies the logging system with the vllm logger
(2) Adds a per-stage prefix to log messages
(3) Moves some log metrics from the stages to the omni client

Going forward, we use only logger.warning/info/debug.

Future work:
refactor the metrics system

Test Plan

Test Result

Online

(1) Check logs with the stage prefix:

[Stage-2] DEBUG 12-25 02:34:16 [omni_stage.py:1035] Enqueued result for request chatcmpl-bbcb8e6133a0db88 to downstream

(2) Metric logs in the client:

(APIServer pid=169357) INFO 12-25 09:45:59 [log_utils.py:104] {'type': 'Request_stage_stats',
(APIServer pid=169357) INFO 12-25 09:45:59 [log_utils.py:104]  'stage_id': 1,
(APIServer pid=169357) INFO 12-25 09:45:59 [log_utils.py:104]  'request_id': 'chatcmpl-8489f6f4f0a3e58b',
(APIServer pid=169357) INFO 12-25 09:45:59 [log_utils.py:104]  'batch_size': 1,
(APIServer pid=169357) INFO 12-25 09:45:59 [log_utils.py:104]  'num_tokens_out': 1786,
(APIServer pid=169357) INFO 12-25 09:45:59 [log_utils.py:104]  'stage_gen_time_ms': 33674.50213432312,
(APIServer pid=169357) INFO 12-25 09:45:59 [log_utils.py:104]  'tokens_per_s': 53.0371612585654,
(APIServer pid=169357) INFO 12-25 09:45:59 [log_utils.py:104]  'rx_transfer_bytes': 2538941,
(APIServer pid=169357) INFO 12-25 09:45:59 [log_utils.py:104]  'rx_decode_time_ms': 2.0301342010498047,
(APIServer pid=169357) INFO 12-25 09:45:59 [log_utils.py:104]  'rx_mbps': 10005.017397124135}
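The `[Stage-N]` prefix shown in (1) can be reproduced with a standard `logging.LoggerAdapter` — a minimal sketch under the assumption that the underlying vllm logger behaves like an ordinary `logging.Logger`; the actual vllm-omni implementation may differ, and the adapter class and `stage_id` key below are illustrative:

```python
import logging
import sys

# Minimal sketch: prepend a "[Stage-N]" prefix to every record via LoggerAdapter.
# StagePrefixAdapter and the stage_id key are illustrative, not vllm-omni's code.
class StagePrefixAdapter(logging.LoggerAdapter):
    def process(self, msg, kwargs):
        # Called for every log call; rewrite the message, pass kwargs through.
        return f"[Stage-{self.extra['stage_id']}] {msg}", kwargs

logging.basicConfig(stream=sys.stdout, level=logging.DEBUG,
                    format="%(levelname)s %(message)s")
stage_logger = StagePrefixAdapter(logging.getLogger("omni_stage"),
                                  {"stage_id": 2})
stage_logger.debug("Enqueued result for request %s to downstream",
                   "chatcmpl-bbcb8e6133a0db88")
```

With an adapter like this, call sites keep using plain `logger.debug/info/warning` and the prefix is added uniformly.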

Offline

INFO 12-25 09:52:07 [omni_llm.py:469] [Summary] {'e2e_avg_time_per_request_ms': 22568.159818649292,
INFO 12-25 09:52:07 [omni_llm.py:469]  'e2e_avg_tokens_per_s': 0.0,
INFO 12-25 09:52:07 [omni_llm.py:469]  'e2e_requests': 1,
INFO 12-25 09:52:07 [omni_llm.py:469]  'e2e_sum_time_ms': 22568.159818649292,
INFO 12-25 09:52:07 [omni_llm.py:469]  'e2e_total_time_ms': 22568.99666786194,
INFO 12-25 09:52:07 [omni_llm.py:469]  'e2e_total_tokens': 0,
INFO 12-25 09:52:07 [omni_llm.py:469]  'final_stage_id': {'0_b565e4cd-1c32-43f2-ba13-f6bcd9e5e141': 2},
INFO 12-25 09:52:07 [omni_llm.py:469]  'stages': [{'avg_time_per_request_ms': 1426.2864589691162,
INFO 12-25 09:52:07 [omni_llm.py:469]              'avg_tokens_per_s': 43.46952858601212,
INFO 12-25 09:52:07 [omni_llm.py:469]              'requests': 1,
INFO 12-25 09:52:07 [omni_llm.py:469]              'stage_id': 0,
INFO 12-25 09:52:07 [omni_llm.py:469]              'tokens': 62,
INFO 12-25 09:52:07 [omni_llm.py:469]              'total_time_ms': 1426.2864589691162},
INFO 12-25 09:52:07 [omni_llm.py:469]             {'avg_time_per_request_ms': 16772.900581359863,
INFO 12-25 09:52:07 [omni_llm.py:469]              'avg_tokens_per_s': 53.419502229432325,
INFO 12-25 09:52:07 [omni_llm.py:469]              'requests': 1,
INFO 12-25 09:52:07 [omni_llm.py:469]              'stage_id': 1,
INFO 12-25 09:52:07 [omni_llm.py:469]              'tokens': 896,
INFO 12-25 09:52:07 [omni_llm.py:469]              'total_time_ms': 16772.900581359863},
INFO 12-25 09:52:07 [omni_llm.py:469]             {'avg_time_per_request_ms': 4331.4759731292725,
INFO 12-25 09:52:07 [omni_llm.py:469]              'avg_tokens_per_s': 0.0,
INFO 12-25 09:52:07 [omni_llm.py:469]              'requests': 1,
INFO 12-25 09:52:07 [omni_llm.py:469]              'stage_id': 2,
INFO 12-25 09:52:07 [omni_llm.py:469]              'tokens': 0,
INFO 12-25 09:52:07 [omni_llm.py:469]              'total_time_ms': 4331.4759731292725}],
...

Diffusion

INFO 12-25 07:19:46 [diffusion_engine.py:48] Generation completed successfully.
INFO 12-25 07:19:46 [diffusion_engine.py:53] Post-processing completed in 0.0658 seconds
Total generation time: 115.3691 seconds (115369.05 ms)
Saved edited image to /home/d00806799/code/vllm-omni/examples/offline_inference/image_to_image/output_image_edit_0.png
INFO 12-25 07:19:46 [gpu_worker.py:201] Worker 0: Received shutdown message
INFO 12-25 07:19:46 [gpu_worker.py:225] event loop terminated.
INFO 12-25 07:19:46 [gpu_worker.py:256] Worker 0: Shutdown complete.


) -> None:
    _safe_append_jsonl(
        stats_file,
        logger.info(
Collaborator


Should we rely on printing this out, or should we consider exposing metrics through a Prometheus interface instead?

Contributor Author

@Bounty-hunter Bounty-hunter Dec 25, 2025


I think this relates to the metrics refactor: we collect metrics on the client side, and whether to use a logger or Prometheus depends on the user specifying the appropriate stat logger, just like vllm's LoggingStatLogger and PrometheusStatLogger. That metrics refactor can be done in a new RFC.
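The pluggable pattern described here — same stats, different sink — can be sketched as follows. The names are modeled on vllm's LoggingStatLogger/PrometheusStatLogger, but the interface below is hypothetical, and the Prometheus variant uses a plain dict as a stand-in for real exporter metrics:

```python
from abc import ABC, abstractmethod
import logging

class StatLogger(ABC):
    """Hypothetical sink interface: each implementation receives the same stats."""
    @abstractmethod
    def log(self, stats: dict) -> None: ...

class LoggingStatLogger(StatLogger):
    """Emit stats through the standard logger (the current behavior in this PR)."""
    def __init__(self) -> None:
        self._logger = logging.getLogger("metrics")

    def log(self, stats: dict) -> None:
        self._logger.info("%s", stats)

class PrometheusStatLogger(StatLogger):
    """Would export the same stats as Prometheus gauges/counters instead."""
    def __init__(self) -> None:
        self._exported: dict = {}  # stand-in for prometheus_client metrics

    def log(self, stats: dict) -> None:
        self._exported.update(stats)

def emit(stats: dict, loggers: list[StatLogger]) -> None:
    # The client collects stats once and fans them out to every configured sink.
    for sl in loggers:
        sl.log(stats)
```

The point of the design is that the collection path stays fixed while users choose the sink at configuration time.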

@Bounty-hunter Bounty-hunter force-pushed the logger_refactor branch 2 times, most recently from d8e25fb to ac2ca18, December 25, 2025 07:00
@Bounty-hunter Bounty-hunter marked this pull request as ready for review December 25, 2025 07:34
@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

@hsliuustc0106 hsliuustc0106 added the ready label (to trigger buildkite CI) Dec 25, 2025
Collaborator

@hsliuustc0106 hsliuustc0106 left a comment

Can we break the metrics for multiple stages across lines? Currently it's all on one line, which is too long.

@hsliuustc0106
Collaborator

INFO 12-25 07:04:33 [log_utils.py:491] {'type': 'request_level_metrics', 'request_id': '0_27b041e4-b66e-4f7e-aa95-7816d8703766', 'e2e_time_ms': 22666.945219039917, 'e2e_tpt': 0.0, 'num_tokens_out': 0, 'transfers_total_time_ms': 6.960630416870117, 'transfers_total_bytes': 1782362, 'stages': {0: {'stage_gen_time_ms': 1386.4283561706543, 'num_tokens_out': 62}, 1: {'stage_gen_time_ms': 16854.31432723999, 'num_tokens_out': 896}, 2: {'stage_gen_time_ms': 4335.120916366577, 'num_tokens_out': 0}}}

@hsliuustc0106
Collaborator

The summary is also too long; can we change it to multiple lines?

INFO 12-25 07:04:33 [omni_llm.py:468] [Summary] {'e2e_requests': 1, 'e2e_total_time_ms': 22667.661428451538, 'e2e_sum_time_ms': 22666.945219039917, 'e2e_total_tokens': 0, 'e2e_avg_time_per_request_ms': 22666.945219039917, 'e2e_avg_tokens_per_s': 0.0, 'wall_time_ms': 22667.661428451538, 'final_stage_id': {'0_27b041e4-b66e-4f7e-aa95-7816d8703766': 2}, 'stages': [{'stage_id': 0, 'requests': 1, 'tokens': 62, 'total_time_ms': 1398.9851474761963, 'avg_time_per_request_ms': 1398.9851474761963, 'avg_tokens_per_s': 44.317840051303996}, {'stage_id': 1, 'requests': 1, 'tokens': 896, 'total_time_ms': 16869.991779327393, 'avg_time_per_request_ms': 16869.991779327393, 'avg_tokens_per_s': 53.11205907628033}, {'stage_id': 2, 'requests': 1, 'tokens': 0, 'total_time_ms': 4347.3546504974365, 'avg_time_per_request_ms': 4347.3546504974365, 'avg_tokens_per_s': 0.0}], 'transfers': [{'from_stage': 0, 'to_stage': 1, 'samples': 1, 'total_bytes': 1779125, 'total_time_ms': 2.4514198303222656, 'tx_mbps': 5806.02303365104, 'rx_samples': 1, 'rx_total_bytes': 1779125, 'rx_total_time_ms': 1.5587806701660156, 'rx_mbps': 9130.854822881614, 'total_samples': 1, 'total_transfer_time_ms': 4.788637161254883, 'total_mbps': 2972.244402887727}, {'from_stage': 1, 'to_stage': 2, 'samples': 1, 'total_bytes': 3237, 'total_time_ms': 0.31065940856933594, 'tx_mbps': 83.35817067075979, 'rx_samples': 1, 'rx_total_bytes': 3237, 'rx_total_time_ms': 1.1925697326660156, 'rx_mbps': 21.714453495401838, 'total_samples': 1, 'total_transfer_time_ms': 2.1719932556152344, 'total_mbps': 11.922688955433589}]}

@Bounty-hunter Bounty-hunter force-pushed the logger_refactor branch 2 times, most recently from a28b958 to 1b05ca1, December 25, 2025 08:52
@Bounty-hunter Bounty-hunter changed the title [Feature] logger refactor [WIP] logger refactor Dec 25, 2025
@Bounty-hunter Bounty-hunter force-pushed the logger_refactor branch 2 times, most recently from a0771ed to f9484cb, December 25, 2025 09:49
@Bounty-hunter Bounty-hunter changed the title [WIP] logger refactor [FEATURE] logger refactor Dec 25, 2025
@Bounty-hunter
Contributor Author

Can we break the metrics for multiple stages across lines? Currently it's all on one line, which is too long.

Changed to use pformat for metric logging to improve readability.
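For reference, `pprint.pformat` is what turns a one-line dict dump into multi-line output like the test results above — a generic sketch, not the exact vllm-omni call site; the logger name and `width` value are illustrative:

```python
import logging
import sys
from pprint import pformat

logging.basicConfig(stream=sys.stdout, level=logging.INFO,
                    format="INFO %(message)s")
logger = logging.getLogger("log_utils")

stats = {"type": "Request_stage_stats", "stage_id": 1,
         "num_tokens_out": 1786, "tokens_per_s": 53.0371612585654}

# pformat wraps the dict once its repr exceeds `width`, putting each
# key on its own line instead of one very long line.
logger.info("%s", pformat(stats, width=40))
```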

@hsliuustc0106
Collaborator

Please check whether the metrics are aligned with vllm upstream.

@Bounty-hunter
Contributor Author

Please check whether the metrics are aligned with vllm upstream.

This PR only refactors the logging mechanism; the contents of the metrics remain unchanged.
Current metrics have some limitations. For example, in online mode, it can't summarize metrics over a period of time. Addressing this would require a separate RFC to refactor the metrics system.

@Bounty-hunter Bounty-hunter force-pushed the logger_refactor branch 3 times, most recently from f0c02b1 to ef1b8c4, December 25, 2025 12:12
@Bounty-hunter Bounty-hunter changed the title [FEATURE] logger refactor [WIP] logger refactor Dec 25, 2025
@Bounty-hunter Bounty-hunter changed the title [WIP] logger refactor [Feature] logger refactor Dec 25, 2025
Signed-off-by: dengyunyang <[email protected]>
@hsliuustc0106 hsliuustc0106 merged commit a500ca1 into vllm-project:main Dec 26, 2025
7 checks passed
@gcanlin
Contributor

gcanlin commented Dec 26, 2025

@Bounty-hunter Could we make the final log not an ERROR when the offline script completes normally? My concern is that the red ERROR will confuse users into thinking a real error occurred.

INFO 12-26 02:56:30 [omni_llm.py:469]                 'rx_total_time_ms': 1.2178421020507812,
INFO 12-26 02:56:30 [omni_llm.py:469]                 'rx_mbps': 20.83685558026625,
INFO 12-26 02:56:30 [omni_llm.py:469]                 'total_samples': 1,
INFO 12-26 02:56:30 [omni_llm.py:469]                 'total_transfer_time_ms': 2.4716854095458984,
INFO 12-26 02:56:30 [omni_llm.py:469]                 'total_mbps': 10.266678721327288}]}
[Stage-0] ERROR 12-26 02:56:30 [omni_stage.py:559] Received shutdown signal
[Stage-2] ERROR 12-26 02:56:30 [omni_stage.py:559] Received shutdown signal
[Stage-1] ERROR 12-26 02:56:30 [omni_stage.py:559] Received shutdown signal
Request ID: 0_8716d63d-13b5-49a0-9772-b252c2cad419, Text saved to output_audio/0_8716d63d-13b5-49a0-9772-b252c2cad419.txt
Request ID: 0_8716d63d-13b5-49a0-9772-b252c2cad419, Saved audio to output_audio/output_0_8716d63d-13b5-49a0-9772-b252c2cad419.wav

@Bounty-hunter
Contributor Author


Using INFO makes more sense; I'll update it in a later PR.

yenuo26 pushed a commit to yenuo26/vllm-omni that referenced this pull request Dec 29, 2025
Signed-off-by: dengyunyang <[email protected]>
Co-authored-by: Hongsheng Liu <[email protected]>
Signed-off-by: wangyu31577 <[email protected]>
princepride pushed a commit to princepride/vllm-omni that referenced this pull request Jan 10, 2026
Signed-off-by: dengyunyang <[email protected]>
Co-authored-by: Hongsheng Liu <[email protected]>
