Deepseek R1 function calls (more formats) #652
Conversation
- Add new chat.h/chat.cpp and chat-parser.h/chat-parser.cpp for better chat handling
- Improve function call parsing with fallback to the llama.cpp builder pattern
- Add string utility functions (starts_with, ends_with, find_partial_stop)
- Update README with function call testing instructions
- Enhance Kimi K2 parser and function calls documentation
- Add comprehensive test suite for function calls
- Update CMakeLists.txt and Makefile for new components
- Fix streaming content cleanup to prevent function syntax in output
- Unify content extraction patterns with llama.cpp approach
- Improve Kimi K2 parser robustness and partial content handling
- Add comprehensive test coverage for function call scenarios
- Optimize chat message parsing and diff computation
- Add compile-time constants for all token format markers
- Add compile-time constants for XML format markers
- Add compile-time constants for simple format patterns
- Replace all hardcoded string literals with named constants
- Use compile-time length calculation to avoid manual counting
- Improve maintainability and reduce magic numbers throughout parser
- Remove duplicate implementation from chat-parser.cpp
- Keep single implementation in chat.cpp following llama.cpp patterns
- Resolves linker error: multiple definition of common_chat_parse
- Add proper validation that 'function' field is an object before accessing nested keys
- Handle missing 'arguments' field gracefully with default "{}"
- Prevents crash when parsing malformed tool call JSON structures
- Implement Qwen3 XML parser with <tool_call>{"name": "func", "arguments": {...}}</tool_call> format
- Add model detection and routing for Qwen3 vs Kimi-K2 formats
- Create 8 comprehensive unit tests covering parsing, streaming, error handling
- Fix token format cleaning bug in kimi_k2_parser.hpp processing order
- Remove progressive parsing code and related utilities
- Add tool injection support for Qwen3 format in server utils
- Implement complete DeepSeek R1 tool call parsing in common_chat_parser.cpp
- Add DeepSeek R1 model detection and tool injection in deepseek_r1_tools.hpp
- Update function_calls.hpp with DeepSeek R1 integration and content extraction
- Update documentation to reflect support for Kimi-K2, Qwen3, and DeepSeek R1 models
- Add comprehensive unit tests for DeepSeek R1 reasoning, tool calls, and integration
- Port exact implementation patterns from original llama.cpp for compatibility
Key features:
- Native DeepSeek R1 format: <|tool▁calls▁begin|>function<|tool▁sep|>name```json{}```<|tool▁call▁end|><|tool▁calls▁end|>
- Reasoning content extraction from <think>...</think> tags
- Multiple tool calls support with separate call blocks
- Model detection for deepseek-r1, deepseek_r1 naming patterns
- Integration with incremental parsing and streaming support
- json-partial.h/cpp: JSON partial parsing functionality
- regex-partial.h/cpp: Regex partial parsing functionality
- Add test_qwen3_format_chat_integration() to validate tool injection pipeline
- Test tool injection conditions and system message enhancement
- Verify JSON formatting and anti-preamble instructions
- Add comprehensive test documentation

Tests confirm tool injection works correctly; the conversational preamble issue is not in ik_llama.cpp but likely in UI configuration.
Server was not passing model name to parse_chat_message_incremental(), causing Qwen3 to fall back to Kimi-K2 parser and return tool calls as content instead of proper tool_calls array.
Non-streaming responses were hardcoded to use Kimi-K2 format, causing Qwen3 XML tool calls to be returned as content instead of proper tool_calls array. Now uses same model detection as streaming path for consistency.
- Enhanced server function call detection and response formatting
- Improved test coverage for Qwen3 tool call scenarios
- Refined XML parsing for better tool execution support
- Integrated latest upstream changes from ikawrakow/ik_llama.cpp
- Resolved conflicts in test-function-calls.cpp
- Maintained local enhancements for Qwen3 function calling support
Implements comprehensive parsing for all 4 DeepSeek-R1 function call formats:
- Format 1: Standard function call syntax (already supported)
- Format 2: Alternative function call patterns (already supported)
- Format 3: Tools array format - function\n```json\n{"tools": [...]}
- Format 4: XML wrapped format - <tool_call>function</think>Name\n```json\n{...}```</tool_call>
Key changes:
- Added parse_deepseek_r1_tools_array() following original parse_prefixed_json_tool_call_array pattern
- Added parse_deepseek_r1_xml_wrapped() following Hermes-2-Pro XML wrapper patterns
- Integrated both parsers into exception handling chain for robust fallback
- Added comprehensive TDD test coverage for all formats
- Anonymized all confidential information while preserving functionality
Resolves tool_calls_count=0 issue where DeepSeek-R1 models generated valid tool calls
but server failed to parse them correctly.
- Added Format 4 (XML wrapped) documentation with examples
- Updated implementation notes with correct parser order (3→4→1→2)
- Marked all DeepSeek-R1 formats as working (July 2025 update)
- Updated test status for Format 3 and 4 as passing
- Added parse_deepseek_r1_xml_wrapped() function reference
- Corrected implementation file line numbers
Resolved merge conflict in tests/test-function-calls.cpp by combining:
- DeepSeek-R1 Format 4 XML wrapper tests
- Streaming finish_reason logic tests from origin/main

Both test suites now coexist and provide comprehensive coverage.
- Removed incomplete merge conflict marker from line 3027
- Ensured all tests compile and pass successfully
- All DeepSeek-R1 formats (1-4) working correctly
- All streaming and content cleaning tests passing
Thank you for your hard work. But after this pull was merged, the entire response from DeepSeek-R1-0528 is wrapped within the "thinking" tag.
Rolling back to pull #648, it works normally.
My parameters here: /build/bin/llama-server
Do you see a fix? Else it would be better to revert.
@raidshoebox1 thank you for the bug report!
I tested #676 and observed the same result as in this PR. The "Thinking" doesn't terminate.
@raidshoebox1 I would be very thankful if you could test this branch: https://github.com/iSevenDays/ik_llama.cpp/tree/deepseek-r1-parsing
If it doesn't work, I think we have to revert this merge request, as I don't have other solutions available at the moment.
@iSevenDays Just tested this branch. It works perfectly now. Thanks for the fix! |


Implemented support for more DeepSeek R1 function tool call formats.

The diff of examples/server/function_calls.md shows which formats are supported. I was testing DeepSeek R1 and found out that it often uses different formats with Claude Code, so I decided to support them as well. This can be useful when the next version of DeepSeek is released, so we will have better support than even the original llama.cpp.