UPSTREAM PR #17914: Restore clip's cb() to its rightful glory #516
Explore the complete analysis inside the Version Insights.

Performance Analysis Summary - PR #516

PR Title: Restore clip's cb() to its rightful glory

Analysis Overview: This PR refactors the debug callback mechanism in the CLIP vision encoder. The changes replace the previous …

Code Changes: …

Performance Impact: The … Functions in the CLIP image processing pipeline show improvements: …

Tokens Per Second Impact: No impact on tokenization or text inference performance. The modified functions ( …

Power Consumption: The …

Key Findings: The refactoring improves debug callback execution by eliminating redundant tensor copies and moving statistics computation into the graph execution phase. The changes are isolated to debug functionality and vision preprocessing, with no effect on text generation throughput.
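To make the Key Findings point concrete, here is a minimal, self-contained sketch of the general pattern: a per-node callback that runs during graph execution and computes statistics by reading the node's buffer in place, rather than copying the tensor out first. Everything in it (`Node`, `Stats`, `compute_stats`, `run_graph`, `EvalCallback`) is a hypothetical stand-in invented for illustration. It is not the code from this PR and does not use ggml's actual types or API.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for a graph node; NOT ggml's tensor type.
struct Node {
    std::string        name;
    std::vector<float> data;
};

// Summary statistics a debug callback might report for one node.
struct Stats {
    double      sum;
    float       min;
    float       max;
    std::size_t n;
};

// Reads the node's buffer in place; no intermediate copy of the data is made.
Stats compute_stats(const Node & node) {
    Stats s{0.0, 0.0f, 0.0f, node.data.size()};
    if (node.data.empty()) {
        return s;
    }
    s.min = s.max = node.data[0];
    for (float v : node.data) {
        s.sum += v;
        if (v < s.min) s.min = v;
        if (v > s.max) s.max = v;
    }
    return s;
}

// Hypothetical callback type, invoked once per node after it is evaluated.
using EvalCallback = std::function<void(const Node &)>;

// Toy "graph execution": evaluates nodes in order and calls the callback
// right after each node, while its result is still available.
void run_graph(std::vector<Node> & nodes, const EvalCallback & cb) {
    for (Node & node : nodes) {
        for (float & v : node.data) {
            v *= 2.0f;  // stand-in for the node's real computation
        }
        if (cb) {
            cb(node);
        }
    }
}
```

A caller would pass a lambda such as `[&](const Node & n) { seen.push_back(compute_stats(n)); }` to `run_graph`. The design point is only that the statistics are gathered at the moment each node finishes, so no second pass or extra buffer is needed.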
…D_DEBUG_GRAPH with same functionality
Mirrored from ggml-org/llama.cpp#17914
I used my callback function from my Qwen3Next testing days; it seems to work more cleanly than the previous one, which was causing some problems with the scheduler / buffers.