Commit fa7a0f3
authored
Fix for Deepseek r1 parsing (#676)
* Implement function calling / tools for ik_llama.cpp for Kimi K2
* Implement basic tool choice
* Backport llama.cpp tool calls support
* Enhance function calls with improved chat parser and string utilities
- Add new chat.h/chat.cpp and chat-parser.h/chat-parser.cpp for better chat handling
- Improve function calls parsing with fallback to llama.cpp builder pattern
- Add string utility functions (starts_with, ends_with, find_partial_stop)
- Update README with function calls testing instructions
- Enhance Kimi K2 parser and function calls documentation
- Add comprehensive test suite for function calls
- Update CMakeLists.txt and Makefile for new components
* Enhance function calling with unified streaming and parser improvements
- Fix streaming content cleanup to prevent function syntax in output
- Unify content extraction patterns with llama.cpp approach
- Improve Kimi K2 parser robustness and partial content handling
- Add comprehensive test coverage for function call scenarios
- Optimize chat message parsing and diff computation
* Replace hardcoded values in kimi_k2_parser.hpp with named constants
- Add compile-time constants for all token format markers
- Add compile-time constants for XML format markers
- Add compile-time constants for simple format patterns
- Replace all hardcoded string literals with named constants
- Use compile-time length calculation to avoid manual counting
- Improve maintainability and reduce magic numbers throughout parser
* Fix duplicate common_chat_parse definition
- Remove duplicate implementation from chat-parser.cpp
- Keep single implementation in chat.cpp following llama.cpp patterns
- Resolves linker error: multiple definition of common_chat_parse
* Fix JSON assertion failure in function call parsing
- Add proper validation that 'function' field is an object before accessing nested keys
- Handle missing 'arguments' field gracefully with default "{}"
- Prevents crash when parsing malformed tool call JSON structures
* Add comprehensive Qwen3 XML tool calling support with unit tests
- Implement Qwen3 XML parser with <tool_call>{"name": "func", "arguments": {...}}</tool_call> format
- Add model detection and routing for Qwen3 vs Kimi-K2 formats
- Create 8 comprehensive unit tests covering parsing, streaming, error handling
- Fix token format cleaning bug in kimi_k2_parser.hpp processing order
- Remove progressive parsing code and related utilities
- Add tool injection support for Qwen3 format in server utils
* Add DeepSeek R1 function calling support with comprehensive unit tests
- Implement complete DeepSeek R1 tool call parsing in common_chat_parser.cpp
- Add DeepSeek R1 model detection and tool injection in deepseek_r1_tools.hpp
- Update function_calls.hpp with DeepSeek R1 integration and content extraction
- Update documentation to reflect support for Kimi-K2, Qwen3, and DeepSeek R1 models
- Add comprehensive unit tests for DeepSeek R1 reasoning, tool calls, and integration
- Port exact implementation patterns from original llama.cpp for compatibility
Key features:
- Native DeepSeek R1 format: <|tool▁calls▁begin|>function<|tool▁sep|>name```json{}```<|tool▁call▁end|><|tool▁calls▁end|>
- Reasoning content extraction from <think>...</think> tags
- Multiple tool calls support with separate call blocks
- Model detection for deepseek-r1, deepseek_r1 naming patterns
- Integration with incremental parsing and streaming support
* Add partial parsing support for JSON and regex
- json-partial.h/cpp: JSON partial parsing functionality
- regex-partial.h/cpp: Regex partial parsing functionality
* Add format_chat integration tests for Qwen3 tool injection
- Add test_qwen3_format_chat_integration() to validate tool injection pipeline
- Test tool injection conditions and system message enhancement
- Verify JSON formatting and anti-preamble instructions
- Add comprehensive test documentation
Tests confirm tool injection works correctly - conversational preamble
issue is not in ik_llama.cpp but likely in UI configuration.
* Fix Qwen3 tool call parsing - pass model name to parser
Server was not passing model name to parse_chat_message_incremental(),
causing Qwen3 to fall back to Kimi-K2 parser and return tool calls
as content instead of proper tool_calls array.
* Fix non-streaming path to use model-specific parsing
Non-streaming responses were hardcoded to use Kimi-K2 format,
causing Qwen3 XML tool calls to be returned as content instead
of proper tool_calls array. Now uses same model detection as
streaming path for consistency.
* Update Qwen3 function call handling in server and tests
- Enhanced server function call detection and response formatting
- Improved test coverage for Qwen3 tool call scenarios
- Refined XML parsing for better tool execution support
* Add DeepSeek-R1 function call parsing support
Implements comprehensive parsing for all 4 DeepSeek-R1 function call formats:
- Format 1: Standard function call syntax (already supported)
- Format 2: Alternative function call patterns (already supported)
- Format 3: Tools array format - function\n```json\n{"tools": [...]}
- Format 4: XML wrapped format - <tool_call>function</think>Name\n```json\n{...}```</tool_call>
Key changes:
- Added parse_deepseek_r1_tools_array() following original parse_prefixed_json_tool_call_array pattern
- Added parse_deepseek_r1_xml_wrapped() following Hermes-2-Pro XML wrapper patterns
- Integrated both parsers into exception handling chain for robust fallback
- Added comprehensive TDD test coverage for all formats
- Anonymized all confidential information while preserving functionality
Resolves tool_calls_count=0 issue where DeepSeek-R1 models generated valid tool calls
but server failed to parse them correctly.
* Update function_calls.md documentation for DeepSeek-R1 Format 4
- Added Format 4 (XML wrapped) documentation with examples
- Updated implementation notes with correct parser order (3→4→1→2)
- Marked all DeepSeek-R1 formats as working (July 2025 update)
- Updated test status for Format 3 and 4 as passing
- Added parse_deepseek_r1_xml_wrapped() function reference
- Corrected implementation file line numbers
* Fix merge conflict in test-function-calls.cpp
- Removed incomplete merge conflict marker from line 3027
- Ensured all tests compile and pass successfully
- All DeepSeek-R1 formats (1-4) working correctly
- All streaming and content cleaning tests passing
* Fix DeepSeek R1 parsing issue with responses wrapped in think tags
Restore missing consume_rest() call from working PR #648 implementation.
When responses don't contain tool calls, remaining content after reasoning
parsing must be preserved as displayable content.
Fixes issue where entire responses wrapped in <think> tags resulted in
empty content output.
* Implement proper reasoning handling following original llama.cpp patterns
- Add missing reasoning_format and reasoning_in_content fields to common_chat_syntax
- Update try_parse_reasoning to match original llama.cpp logic exactly
- Add TDD test case with reasoning_in_content=true for DeepSeek R1
- Following TDD: test should now pass with proper syntax configuration
Based on original llama.cpp implementation patterns.
* TDD SUCCESS: Fix DeepSeek R1 thinking tag termination issue
✅ Test passes with reasoning_in_content=true configuration
- Content properly preserved: '<think>content</think>' displays fully
- Reasoning field empty as expected
- Following TDD: test-first approach validates the fix
Next: Update server to automatically apply this configuration.
* Complete server integration fix for DeepSeek R1 thinking tag termination
- Server now automatically sets reasoning_in_content=true for DeepSeek R1 models
- Fixes issue where responses wrapped in <think> tags appear empty to users
* Add TDD test case for DeepSeek R1 thinking tag termination issue
- Test reproduces the exact failure scenario reported by user
- Validates that reasoning_in_content=true fixes the issue
- Demonstrates empty content problem and working solution
* Add remaining TDD test changes for DeepSeek R1 thinking tag fix
* Add debug output after upstream merge
* Remove temporary benchmark and debug files
- Remove tests/benchmark-progressive-parsing.cpp (development tool, not part of core functionality)
- Remove tests/reproduce_bug.sh (debugging script, not needed for PR)1 parent 41f0d2e commit fa7a0f3
File tree
5 files changed
+101
-19
lines changed- common
- examples/server
- tests
5 files changed
+101
-19
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
82 | 82 | | |
83 | 83 | | |
84 | 84 | | |
85 | | - | |
86 | | - | |
87 | | - | |
88 | | - | |
| 85 | + | |
| 86 | + | |
| 87 | + | |
| 88 | + | |
| 89 | + | |
| 90 | + | |
| 91 | + | |
| 92 | + | |
| 93 | + | |
| 94 | + | |
| 95 | + | |
| 96 | + | |
| 97 | + | |
| 98 | + | |
| 99 | + | |
89 | 100 | | |
90 | | - | |
91 | | - | |
92 | | - | |
93 | | - | |
94 | | - | |
95 | | - | |
96 | | - | |
| 101 | + | |
| 102 | + | |
| 103 | + | |
| 104 | + | |
| 105 | + | |
| 106 | + | |
| 107 | + | |
| 108 | + | |
| 109 | + | |
| 110 | + | |
| 111 | + | |
| 112 | + | |
97 | 113 | | |
98 | 114 | | |
99 | | - | |
100 | 115 | | |
101 | | - | |
102 | | - | |
103 | | - | |
104 | | - | |
105 | | - | |
106 | | - | |
| 116 | + | |
107 | 117 | | |
108 | 118 | | |
109 | 119 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
278 | 278 | | |
279 | 279 | | |
280 | 280 | | |
| 281 | + | |
| 282 | + | |
| 283 | + | |
281 | 284 | | |
282 | 285 | | |
283 | 286 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
135 | 135 | | |
136 | 136 | | |
137 | 137 | | |
| 138 | + | |
| 139 | + | |
| 140 | + | |
| 141 | + | |
| 142 | + | |
| 143 | + | |
138 | 144 | | |
139 | 145 | | |
| 146 | + | |
| 147 | + | |
| 148 | + | |
| 149 | + | |
140 | 150 | | |
141 | 151 | | |
142 | 152 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
89 | 89 | | |
90 | 90 | | |
91 | 91 | | |
| 92 | + | |
| 93 | + | |
92 | 94 | | |
93 | 95 | | |
94 | 96 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
3298 | 3298 | | |
3299 | 3299 | | |
3300 | 3300 | | |
| 3301 | + | |
| 3302 | + | |
| 3303 | + | |
| 3304 | + | |
| 3305 | + | |
| 3306 | + | |
| 3307 | + | |
| 3308 | + | |
| 3309 | + | |
| 3310 | + | |
| 3311 | + | |
| 3312 | + | |
| 3313 | + | |
| 3314 | + | |
| 3315 | + | |
| 3316 | + | |
| 3317 | + | |
| 3318 | + | |
| 3319 | + | |
| 3320 | + | |
| 3321 | + | |
| 3322 | + | |
| 3323 | + | |
| 3324 | + | |
| 3325 | + | |
| 3326 | + | |
| 3327 | + | |
| 3328 | + | |
| 3329 | + | |
| 3330 | + | |
| 3331 | + | |
| 3332 | + | |
| 3333 | + | |
| 3334 | + | |
| 3335 | + | |
| 3336 | + | |
| 3337 | + | |
| 3338 | + | |
| 3339 | + | |
| 3340 | + | |
| 3341 | + | |
| 3342 | + | |
| 3343 | + | |
| 3344 | + | |
| 3345 | + | |
| 3346 | + | |
| 3347 | + | |
| 3348 | + | |
| 3349 | + | |
| 3350 | + | |
| 3351 | + | |
| 3352 | + | |
| 3353 | + | |
| 3354 | + | |
| 3355 | + | |
| 3356 | + | |
| 3357 | + | |
3301 | 3358 | | |
3302 | 3359 | | |
3303 | 3360 | | |
3304 | 3361 | | |
3305 | 3362 | | |
3306 | 3363 | | |
3307 | 3364 | | |
3308 | | - | |
| 3365 | + | |
0 commit comments