fix: prompt token #6727
Conversation
Pull Request Overview
This PR fixes token counting issues by ensuring the enable_thinking parameter is consistently set to false when calculating tokens, preventing failures when models have different default settings.
Key changes:
- Added an optional chat_template_kwargs parameter to the token counting API
- Modified token counting calls to explicitly disable thinking mode
- Applied code formatting improvements to function calls
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| web-app/src/services/models/default.ts | Added chat_template_kwargs type definition and explicit disable thinking parameter in token counting |
| extensions/llamacpp-extension/src/index.ts | Updated token counting implementation to use chat_template_kwargs with fallback logic and applied code formatting |
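The fallback logic described for the llamacpp extension can be sketched as follows. This is an illustrative reconstruction, not the actual diff: the function name `resolveTemplateKwargs` and its exact shape are assumptions; the PR only states that the extension uses chat_template_kwargs with a fallback that disables thinking.

```typescript
// Hypothetical sketch of the fallback behavior in
// extensions/llamacpp-extension/src/index.ts (names are illustrative).
function resolveTemplateKwargs(
  passed?: Record<string, unknown>
): Record<string, unknown> {
  // When the caller supplies no kwargs, fall back to explicitly
  // disabling thinking so token counting never depends on a model's
  // default enable_thinking setting.
  return passed ?? { enable_thinking: false }
}

// Caller-provided kwargs win; otherwise thinking is disabled.
const explicit = resolveTemplateKwargs({ enable_thinking: true })
const fallback = resolveTemplateKwargs()
```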
LGTM
LGTM
Describe Your Changes
Frontend Changes
- Added chat_template_kwargs: { enable_thinking: false } when calling engine.getTokensCount()
- Updated the type definition to include an optional chat_template_kwargs parameter
- This ensures token counting always succeeds regardless of a model's default thinking setting
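A minimal sketch of the frontend change described above. The real getTokensCount implementation in web-app/src/services/models/default.ts lives behind the engine and its signature may differ; the parameter interface, the mock counting logic, and the word-based count here are assumptions made to keep the example self-contained.

```typescript
// Illustrative parameter type: the PR adds an optional
// chat_template_kwargs field alongside the messages.
interface TokenCountParams {
  messages: { role: string; content: string }[]
  chat_template_kwargs?: Record<string, unknown>
}

// Stand-in for engine.getTokensCount(); the real call goes through
// the llama.cpp backend. Word counting here is purely illustrative.
function getTokensCount(params: TokenCountParams): number {
  const kwargs = params.chat_template_kwargs ?? {}
  const text = params.messages.map((m) => m.content).join(' ')
  // With thinking disabled, no extra reasoning tokens are templated
  // in; the fixed overhead below mimics a model whose default
  // enable_thinking setting would otherwise inflate the count.
  const overhead = kwargs['enable_thinking'] === false ? 0 : 8
  return text.split(/\s+/).length + overhead
}

// Explicitly disabling thinking, as the PR does for every count call:
const count = getTokensCount({
  messages: [{ role: 'user', content: 'hello world' }],
  chat_template_kwargs: { enable_thinking: false },
})
```

Passing enable_thinking: false on every call means the count is the same whether or not the loaded model defaults to thinking mode, which is the consistency the PR is after.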
Fixes Issues
Self Checklist