[misc] [doc] [frontend] LLM torch profiler support #7943
Conversation
/ready
👋 Hi! Thank you for contributing to the vLLM project. Once the PR is approved and ready to go, please make sure to run full CI as it is required to merge (or just use auto-merge). To run full CI, you can do one of these:
vllm/executor/gpu_executor.py
Maybe we should also add start_profile/stop_profile for CPU-only targets?
+1, either that or make it clear in the documentation & example that this is currently only supported on GPUs
Thanks for the suggestion, I missed it. @DamonFool @ywang96, added to cpu_executor.py in 2b23e8f. PTAL.
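For context, a minimal sketch of what such executor-level hooks could look like; the method and attribute names here (`driver_worker`, `start_profile`) are assumptions based on this discussion, not necessarily the exact code in 2b23e8f:

```python
# Hypothetical sketch: the executor delegates profiling calls to the
# worker, which owns the torch.profiler handle. Names are assumed.
class CPUExecutor:
    def start_profile(self) -> None:
        self.driver_worker.start_profile()

    def stop_profile(self) -> None:
        self.driver_worker.stop_profile()
```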
Force-pushed 2880ae8 to 2b23e8f
comaniac left a comment
Somehow missed this PR before. LGTM
DamonFool left a comment
Thanks for the update.
LGTM
Not sure if it's appropriate to ask this here, but just in case there is any help: when I tried to run examples/offline_inference_with_profiler.py, the following error occurred (the output of `python collect_env.py`, the error stack, and my code are attached in the collapsed sections).
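(For readers hitting similar issues: the example boils down to roughly the following; the model name and trace directory are placeholders, and `VLLM_TORCH_PROFILER_DIR` must be set before the `LLM` engine is constructed.)

```python
import os

# Point the profiler at a trace directory before building the engine.
# The directory name here is a placeholder.
os.environ["VLLM_TORCH_PROFILER_DIR"] = "./vllm_profile"

from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")  # placeholder model
sampling_params = SamplingParams(temperature=0.8, top_p=0.95)

llm.start_profile()
outputs = llm.generate(["Hello, my name is"], sampling_params)
llm.stop_profile()
```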
Besides, do you think it's a good idea to create a specific configuration for profiling? In my use case, I care about memory usage too, so I add
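(The snippet referred to above is elided; for illustration only, memory tracking in plain `torch.profiler` is typically switched on like this, independent of how vLLM wires it up:)

```python
import torch
from torch.profiler import ProfilerActivity, profile

x = torch.randn(1024, 1024)

with profile(
    activities=[ProfilerActivity.CPU],
    profile_memory=True,   # record tensor allocations and frees
    record_shapes=True,    # attribute memory to operator input shapes
) as prof:
    y = x @ x  # stand-in workload

print(prof.key_averages().table(sort_by="self_cpu_memory_usage", row_limit=10))
```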
Feel free to open an issue for bug reports and discussions like this.
Got it, will do, thank you!
Signed-off-by: Alvant <[email protected]>
Signed-off-by: Amit Garg <[email protected]>
Signed-off-by: LeiWang1999 <[email protected]>

Follow-up to #7451.
cc @comaniac @DamonFool
Thanks for the tips from @sfc-gh-mkeralapura.