fix(compaction): estimate context usage after compaction and show 0.1% precision #1269
Pull request overview
This PR improves context usage reporting immediately after context compaction by estimating token usage when exact usage isn’t available, and by displaying context usage percentages with one-decimal precision in the web UI.
Changes:
- Introduces `CompactionResult` (messages plus optional token usage) and adds `estimated_token_count` for post-compaction token estimation.
- Updates `KimiSoul.compact_context()` to update `Context.token_count` right after compaction using the estimate.
- Adjusts the web UI to show context usage with 0.1% precision (e.g., `12.3%`).
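The one-decimal display described above can be sketched in Python (the function name and values here are illustrative; the actual UI code is TypeScript in `chat.tsx`):

```python
def format_usage(token_count: int, max_context_size: int) -> str:
    """Format context usage as a percentage with 0.1% precision."""
    percent = token_count / max_context_size * 100
    return f"{percent:.1f}%"

# Illustrative values only:
print(format_usage(12_345, 100_000))  # → 12.3%
print(format_usage(0, 100_000))       # → 0.0%
```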
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| `web/src/features/chat/components/prompt-toolbar/toolbar-context.tsx` | Displays context percentage with one decimal place. |
| `web/src/features/chat/chat.tsx` | Computes usagePercent with 0.1% precision. |
| `tests/core/test_simple_compaction.py` | Adds unit tests for `CompactionResult.estimated_token_count` behavior. |
| `src/kimi_cli/soul/kimisoul.py` | Uses `CompactionResult` and updates context token count immediately after compaction. |
| `src/kimi_cli/soul/compaction.py` | Adds `CompactionResult` and token estimation logic; changes compaction return type accordingly. |
```python
"""Estimate tokens from message text content using a character-based heuristic."""
total_chars = 0
for msg in messages:
    for part in msg.content:
        if isinstance(part, TextPart):
            total_chars += len(part.text)
# ~4 chars per token for English; somewhat underestimates for CJK text,
# but this is a temporary estimate that gets corrected on the next LLM call.
return total_chars // 4
```
`_estimate_text_tokens` currently only counts `TextPart` in `message.content` and ignores other token-bearing fields like `Message.tool_calls` (function names/arguments) and any non-text content that still consumes tokens (e.g., images). Since `Context.token_count` is used to decide when to compact (`token_count + reserved >= max_context_size`), this underestimation can prevent compaction and lead to provider context-limit errors. Consider extending the estimator to include tool call names/arguments (and optionally apply a fallback cost for non-text parts) so the estimate is biased high rather than low.
Suggested change:

```python
"""Estimate tokens from message content using a character-based heuristic.

This includes:
- Textual content parts (TextPart, ThinkPart) counted by character length.
- Tool call payloads, approximated from their string representation.
- A conservative fallback token cost for any non-text content parts.
"""
total_chars = 0
extra_tokens = 0
for msg in messages:
    # Count textual content parts.
    for part in getattr(msg, "content", []) or []:
        if isinstance(part, TextPart):
            total_chars += len(part.text)
        elif isinstance(part, ThinkPart):
            total_chars += len(part.text)
        else:
            # Non-text parts (e.g., images, custom structures) still consume tokens
            # at the provider. Assign a small conservative cost so we bias high.
            extra_tokens += 32
    # Roughly account for tool call names/arguments, which are serialized as text.
    for tool_call in getattr(msg, "tool_calls", []) or []:
        # Use repr() to capture both the function name and arguments textually.
        total_chars += len(repr(getattr(tool_call, "function", tool_call)))
# ~4 chars per token for English; somewhat underestimates for CJK text,
# but this is a temporary estimate that gets corrected on the next LLM call.
# Add extra_tokens so that non-text parts are not underestimated to zero.
return total_chars // 4 + extra_tokens
```
```python
# ~4 chars per token for English; somewhat underestimates for CJK text,
# but this is a temporary estimate that gets corrected on the next LLM call.
return total_chars // 4
```
The estimator uses floor division (`total_chars // 4`), which can return 0 for short-but-non-empty text (and will always round down). Because this value is written into `Context.token_count` and drives both UI context usage and compaction triggering, rounding down is risky. Consider using `math.ceil(total_chars / 4)` (and possibly `max(1, …)` when `total_chars > 0`) so the estimate is not systematically under-reporting.
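A minimal sketch of the ceiling-based variant the comment suggests (a standalone function for illustration; the real estimator iterates over messages):

```python
import math

def estimate_tokens(total_chars: int) -> int:
    # Round up so short-but-non-empty text never estimates to zero tokens.
    return math.ceil(total_chars / 4)

print(estimate_tokens(3))  # → 1 (floor division would give 0)
print(estimate_tokens(0))  # → 0
print(estimate_tokens(9))  # → 3
```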
```python
await self._context.clear()
await self._checkpoint()
await self._context.append_message(compaction_result.messages)

# Estimate token count so context_usage is not reported as 0%
await self._context.update_token_count(compaction_result.estimated_token_count)
```
`update_token_count` is fed from `compaction_result.estimated_token_count`, but this method calls `_checkpoint()` between `clear()` and `append_message()`. When `_checkpoint_with_user_message` is enabled, `_checkpoint()` appends a CHECKPOINT … user message into the context history; that message's tokens are not included in `estimated_token_count`, so `Context.token_count` becomes inconsistent with `Context.history`. Consider estimating from the full post-compaction history (or adding the checkpoint message cost) before updating the token count.
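A toy illustration (all names hypothetical, plain strings standing in for messages) of why estimating from the full post-compaction history matters once a checkpoint message is appended:

```python
def estimate_tokens(messages: list[str]) -> int:
    # Same ~4 chars/token heuristic as the PR, over plain strings for brevity.
    return sum(len(m) for m in messages) // 4

compacted = ["Summary of the earlier conversation." * 5]
checkpoint = "CHECKPOINT: conversation state saved."

# Estimating only from the compacted messages misses the checkpoint's tokens,
# so Context.token_count would undercount the actual history:
partial = estimate_tokens(compacted)
full = estimate_tokens(compacted + [checkpoint])
assert full > partial
```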
```python
# Estimate token count so context_usage is not reported as 0%
await self._context.update_token_count(compaction_result.estimated_token_count)
```
This change updates `Context.token_count` during `compact_context()`, which also affects the compaction trigger logic (`token_count + reserved >= max_context_size`). There doesn't appear to be a test exercising `compact_context()` end-to-end to ensure token counts are updated as expected (including the checkpoint message case). Consider adding a unit/integration test around `KimiSoul.compact_context()` to prevent regressions in context usage reporting and compaction behavior.
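As a rough sketch of the shape such a test could take, here is a self-contained toy: the `Context` and `compact_context` stand-ins below are hypothetical, not the project's real classes, and only illustrate asserting that `token_count` is non-zero right after compaction.

```python
import asyncio

class Context:
    """Minimal hypothetical stand-in for the project's Context."""
    def __init__(self) -> None:
        self.history: list[str] = ["x" * 400] * 10
        self.token_count = 0

    async def clear(self) -> None:
        self.history = []

    async def append_message(self, messages: list[str]) -> None:
        self.history.extend(messages)

    async def update_token_count(self, count: int) -> None:
        self.token_count = count

async def compact_context(ctx: Context) -> None:
    summary = "summary of the prior conversation"
    await ctx.clear()
    await ctx.append_message([summary])
    # Estimate so context usage is not reported as 0% until the next LLM call.
    await ctx.update_token_count(len(summary) // 4)

ctx = Context()
asyncio.run(compact_context(ctx))
assert ctx.token_count > 0, "usage should not read 0% right after compaction"
```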
This PR fixes context usage reporting right after compaction.

- `SimpleCompaction` now returns a `CompactionResult` with compacted messages plus optional token usage.
- `CompactionResult.estimated_token_count` uses `usage.output` for the generated summary when available.
- `KimiSoul` now updates the context token count immediately after compaction, preventing a temporary `0%` usage display.
- The web UI shows context usage with one-decimal precision (e.g., `12.3%`) for clearer feedback.

Tests: adds unit tests for `CompactionResult.estimated_token_count` behavior.
Checklist
- Run `make gen-changelog` to update the changelog.
- Run `make gen-docs` to update the user documentation.