feat[DRAFT]: Introduce structured thinking UI and total thinking time measurement #6835

qnixsynapse · 2025-10-28T10:33:57Z

Describe Your Changes

Structured Thinking UI (ThinkingBlock):
- Replaced raw text streaming with step-by-step display for reasoning, tool_call, and tool_output using a new ReActStep structure.
- Implemented streaming logic that displays the previously completed step while the current step accumulates content.
- Added visual cues (icons, styling) for completed steps and total thinking duration upon finalization.
- Added step content truncation (1000 chars) to prevent UI slowdowns from large outputs.
Thinking Time Measurement (useChat):
- Captured startTime at message creation and computed totalThinkingTime upon final completion.
- Included totalThinkingTime in message metadata when reasoning or tool calls occur.
Unified Reasoning/Tool Presentation (ThreadContent):
- Consolidated presentation into a single ThinkingBlock component whenever reasoning or tool calls are present, simplifying the thread rendering logic.
- Implemented chronological event tracking (streamEvents) to accurately order reasoning chunks and tool steps for reliable step-by-step display.
Agent Loop Refactor (useChat & postMessageProcessing):
- Removed the recursive while loop in useChat; agent execution is now self-contained and driven by recursive calls within postMessageProcessing.
- Replaced the global toolStepCounter with a recursive currentStepCount parameter to accurately limit tool execution steps per message chain, fixing potential infinite loop issues.

This results in a richer, step-by-step trace of the model's internal thought process, clearer visualization of tool usage, and accurate measurement of agent latency.
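
The step-limit change described above can be sketched as follows. This is a simplified, synchronous illustration rather than the PR's actual `postMessageProcessing`: the names `processTurn`, `ModelTurn`, `CompleteFn`, and `RunToolFn` are hypothetical, and the default of 20 steps is taken from the commit message that introduced the cap.

```typescript
// Hypothetical shapes for illustration only; not the PR's real types.
interface SketchToolCall {
  name: string
  args: string
}

interface ModelTurn {
  text: string
  toolCalls: SketchToolCall[]
}

type CompleteFn = (history: string[]) => ModelTurn
type RunToolFn = (call: SketchToolCall) => string

// The step count travels down the recursion as a parameter, so every
// message chain gets its own budget instead of sharing a global counter.
function processTurn(
  turn: ModelTurn,
  history: string[],
  complete: CompleteFn,
  runTool: RunToolFn,
  maxToolSteps = 20,
  currentStepCount = 0
): string {
  // No tool calls requested: the chain is finished.
  if (turn.toolCalls.length === 0) return turn.text

  let steps = currentStepCount
  for (const call of turn.toolCalls) {
    // Budget exhausted: stop here and return what we have.
    if (steps >= maxToolSteps) return turn.text
    history.push(`${call.name} -> ${runTool(call)}`)
    steps += 1
  }

  // Ask the model for a follow-up turn and recurse with the updated count.
  const next = complete(history)
  return processTurn(next, history, complete, runTool, maxToolSteps, steps)
}
```

Because the counter is an argument rather than module state, two chains running at the same time cannot consume each other's budget, which appears to be the motivation given for dropping the global `toolStepCounter`.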

Fixes Issues

Closes #
Closes #

Self Checklist

Added relevant comments, esp in complex areas
Updated docs (for bug fixes / features)
Created issues for follow-up changes or refactoring needed

github-actions · 2025-10-28T13:21:32Z

Barecheck - Code coverage report

Total: 29.2%

Your code coverage diff: -0.40% ▾

Uncovered files and lines

File	Lines
web-app/src/containers/ScrollToBottom.tsx	1-11, 13-16, 19-28, 30, 32-44, 46-51, 53-55, 57-58, 60, 62, 64
web-app/src/containers/ThinkingBlock.tsx	3-9, 29-35, 38-41, 49-58, 61-66, 68-77, 80, 83-86, 89, 93, 96-100, 103-106, 109-110, 112, 115-118, 121-126, 128-133, 135-138, 140-143, 145, 147, 149-153, 155, 157-170, 172, 174-175, 177-179, 181-195, 197-202, 204, 206-209, 211-221, 223-230, 232, 234-237, 239, 241, 243-244, 246, 248, 250-259, 262-263, 267-268, 270-275, 277-281, 283-286, 288-291, 293-295, 298-301, 303, 305-309, 313-317, 319-320, 322, 324-325, 327-332, 334-335, 337-338, 342-354, 357-362, 365-369, 371, 373-374, 376, 378
web-app/src/containers/ThreadContent.tsx	3-9, 11-12, 17, 22-23, 25, 27-30, 53-58, 61-63, 65, 67-68, 71-72, 74-78, 81-83, 85-86, 88-90, 92-96, 98-101, 103-107, 109-116, 118, 120, 123-125, 139-141, 144-147, 149-152, 154-156, 158-161, 164-169, 177-184, 187-188, 192, 194-197, 200-202, 205-209, 213-215, 217, 220-221, 223-229, 231-235, 237-239, 241, 243-251, 253-275, 277-279, 281-293, 295-298, 300-305, 307, 312-313, 316-317, 319, 321, 323, 325-327, 329-330, 332-333, 335-336, 339-341, 343-348, 350-351, 353, 355-368, 370-371, 373, 375-376, 378-389, 391-398, 401-405, 407-414, 417-423, 425, 427, 429-435, 438-445, 448-450, 452, 454-467, 469-474, 477-484, 487-488, 490-492, 494-501, 503-510, 515-516, 519, 522-523, 528-529, 533-534, 536-539, 541-552, 556-562, 564-568, 570-581, 583-586, 590-600, 602-613, 615-619, 622-629, 631-641, 643-654, 657-664, 666-667, 671-676, 679-683, 685-686, 689-693, 697-703, 705-708, 710-711, 713-720, 722-728, 730, 732-739, 741, 744-752, 754-755, 757-758
web-app/src/containers/dialogs/MessageMetadataDialog.tsx	73-81, 86-96, 101-105, 107, 109-121, 123-129, 132-139, 141, 143-149, 151-177, 180-196, 199-206, 208-213, 215, 217-218, 222-245, 248-251, 253-256, 258
web-app/src/hooks/useChat.ts	114-115, 119-123, 126, 129-138, 140-150, 153-155, 158-160, 162-166, 174-183, 193-215, 218, 220, 222, 225, 228-233, 235-236, 241-243, 246-250, 253, 256-258, 260-270, 305-306, 308-311, 313-315, 317-341, 344-345, 347-350, 353-355, 357-360, 363-370, 373-387, 389-391, 405-406, 431-432, 439-443, 452-453, 455-460, 463, 465-472, 479-481, 483-496, 511-520, 523-526, 528-534, 536-546, 548-552, 554-557, 559-594, 596-622, 624-626, 629, 631-633, 635-636, 639-645, 647-652, 654-655, 657, 659, 662-666, 668-673, 676-681, 683-688, 690-693, 695-703, 706, 710-713, 715-725, 727-730, 733-734, 736-739, 741-746, 748-752, 754-756, 758, 760, 762-766, 769-770, 773-777, 779, 781-786, 798-802, 814-815
web-app/src/lib/completion.ts	79-84, 89, 104-112, 184-192, 194, 196-197, 199-201, 203, 205, 208-213, 215-222, 224-234, 237-245, 248-273, 275-276, 278, 280-296, 298-319, 335-341, 392, 395-400, 402-405, 476-477, 480-481, 496, 498, 500-506, 508-510, 512-514, 568-572, 582-583, 590, 620-621, 630, 634-638, 644-664, 674-680, 682-690, 695-699, 752-753, 755, 757-765, 767, 769-775, 777, 779-788, 790-792, 794-796, 798-803, 805-811, 813, 815-817, 819-825, 828-833, 835, 838-846, 848-854, 856-861, 864-865, 867-889, 891-905

… thinking time

- **ThinkingBlock**
  - Added `ThoughtStep` type and UI handling for step kinds: `thought`, `tool_call`, `tool_output`, and `done`.
  - Integrated `Check` icon for completed steps and formatted duration (seconds) display.
  - Implemented streaming paragraph extraction, fade-in/out animation, and improved loading state handling.
  - Updated header to show dynamic titles (thinking/thought + duration) and disabled expand/collapse while loading.
  - Utilized `cn` utility for conditional class names and added relevant imports.
- **ThreadContent**
  - Defined `ToolCall` and `ThoughtStep` types for type safety.
  - Constructed `allSteps` via `useMemo`, extracting thought paragraphs, tool calls/outputs, and a final `done` step with total thinking time.
  - Passed `steps`, `loading`, and `duration` props to `ThinkingBlock`.
  - Introduced `hasReasoning` flag to conditionally render the reasoning block and avoid duplicate tool call rendering.
  - Adjusted rendering logic to hide empty reasoning and ensure tool call blocks only appear when no reasoning is present.
- **useChat**
  - Refactored `getCurrentThread` for clearer async flow while preserving temporary-chat behavior.
  - Captured `startTime` at message creation and computed `totalThinkingTime` on completion.
  - Included `totalThinkingTime` in message metadata when appropriate.
  - Minor cleanup: improved error handling for image ingestion and formatting adjustments.

Overall, these changes provide a richer, step-by-step thinking UI, better state handling during streaming, and expose total thinking duration for downstream components.

- Replace raw text parsing with step-based streaming logic in `ThinkingBlock`.
  - Introduced `stepsWithoutDone`, `currentStreamingStepIndex`, and `displayedStepIndex` to drive the streaming UI.
  - Added placeholder UI for empty streaming state and hide block when there is no content after streaming finishes.
  - Simplified expansion handling and bullet-point rendering, using `renderStepContent` for both streaming and expanded views.
  - Removed unused `extractThinkingContent` import and related code.
  - Updated translation keys and duration formatting.
- Consolidate reasoning and tool-call presentation in `ThreadContent`.
  - Introduced `shouldShowThinkingBlock` to render a single `ThinkingBlock` when either reasoning or tool calls are present.
  - Adjusted `ThinkingBlock` props (`text`, `steps`, `loading`) and ID generation.
  - Commented out the now-redundant `ToolCallBlock` import and removed its conditional rendering block.
  - Cleaned up comments, unused variables, and minor formatting/typo fixes.
- General cleanup:
  - Updated comments for clarity.
  - Fixed typo in deletion loop comment.
  - Minor UI tweaks (bullet styling, border handling).

…soning step

…d step limit

* Move the assistant-loop logic out of `useChat` and into `postMessageProcessing`.
* Eliminate the while-loop that drove repeated completions; now a single completion is sent and subsequent tool calls are processed recursively.
* Introduce early-abort checks and guard against missing provider before proceeding.
* Add `ReasoningProcessor` import and use it consistently for streaming reasoning chunks.
* Add `ToolCallEntry` type and a global `toolStepCounter` to track and cap total tool steps (default 20) to prevent infinite loops.
* Extend `postMessageProcessing` signature to accept thread, provider, tools, UI update callback, and max tool steps.
* Update UI-update logic to use a single `updateStreamingUI` callback and ensure RAF scheduling is cleaned up reliably.
* Refactor token-speed / progress handling, improve error handling for out-of-context situations, and tidy up code formatting.
* Minor clean-ups: const-ify `availableTools`, remove unused variables, improve readability.

…ention

This commit significantly refactors how assistant message content containing reasoning steps (`<think>` blocks) and tool calls is parsed and split into final output text and streamed reasoning text in `ThreadContent.tsx`. It introduces new logic to correctly handle multiple, open, or closed `<think>` tags, ensuring that:

1. All text outside of `<think>...</think>` tags is correctly extracted as final output text.
2. Content inside all `<think>` tags is aggregated as streamed reasoning text.
3. The message correctly determines if reasoning is actively loading during a stream.

Additionally, this commit:

* **Fixes infinite tool loop prevention:** The global `toolStepCounter` in `completion.ts` is replaced with an explicit `currentStepCount` parameter passed recursively in `postMessageProcessing`. This ensures that the tool step limit is correctly enforced per message chain, preventing potential race conditions and correctly resolving the chain.
* **Fixes large step content rendering:** Limits the content of a single thinking step in `ThinkingBlock.tsx` to 1000 characters to prevent UI slowdowns from rendering extremely large JSON or text outputs.
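
A minimal sketch of the three `<think>` handling rules listed in this commit message, assuming a plain string scan. The function name `splitThinkTags` and the `ThinkSplit` shape are hypothetical, and joining multiple reasoning blocks with a newline is a choice made here for readability; the real logic in `ThreadContent.tsx` may differ.

```typescript
interface ThinkSplit {
  finalText: string // everything outside <think>...</think>
  reasoning: string // contents of all <think> blocks, joined by newlines
  reasoningOpen: boolean // true when the last <think> has no closing tag yet
}

function splitThinkTags(raw: string): ThinkSplit {
  const OPEN = '<think>'
  const CLOSE = '</think>'
  const reasoningParts: string[] = []
  let finalText = ''
  let reasoningOpen = false
  let pos = 0

  while (pos < raw.length) {
    const start = raw.indexOf(OPEN, pos)
    if (start === -1) {
      // No more reasoning blocks: the rest is final output.
      finalText += raw.slice(pos)
      break
    }
    finalText += raw.slice(pos, start)

    const bodyStart = start + OPEN.length
    const end = raw.indexOf(CLOSE, bodyStart)
    if (end === -1) {
      // Unclosed tag: the model is still streaming its reasoning.
      reasoningParts.push(raw.slice(bodyStart))
      reasoningOpen = true
      break
    }
    reasoningParts.push(raw.slice(bodyStart, end))
    pos = end + CLOSE.length
  }

  return {
    finalText: finalText.trim(),
    reasoning: reasoningParts.join('\n').trim(),
    reasoningOpen,
  }
}
```

In this sketch, `reasoningOpen` combined with a "still streaming" flag would be enough to decide whether to show a loading indicator for the reasoning block.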

Implement support for displaying images returned in the Multi-Content Part (MCP) format within the `tool_output` step of the ReAct thinking block.

This change:

- Safely parses `tool_output` content to detect and extract image data (base64).
- Renders images as clickable thumbnails using data URLs.
- Integrates `ImageModal` to allow users to view the generated images in full size.
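
The "safely parses" step could look roughly like the sketch below. It assumes the tool output arrives as a JSON string with a `content` array whose image parts look like `{ type: 'image', data, mimeType }`; that shape and the helper name `extractImageDataUrls` are assumptions made for illustration, not taken from the PR.

```typescript
// Returns one data URL per image part found in the tool output.
// Any malformed or non-JSON input yields an empty list instead of throwing.
function extractImageDataUrls(toolOutput: string): string[] {
  let parsed: unknown
  try {
    parsed = JSON.parse(toolOutput)
  } catch {
    return [] // plain-text tool output: nothing to render as an image
  }
  if (typeof parsed !== 'object' || parsed === null) return []

  const content = (parsed as { content?: unknown }).content
  if (!Array.isArray(content)) return []

  const urls: string[] = []
  for (const part of content) {
    if (typeof part !== 'object' || part === null) continue
    const p = part as { type?: unknown; data?: unknown; mimeType?: unknown }
    if (
      p.type === 'image' &&
      typeof p.data === 'string' &&
      typeof p.mimeType === 'string'
    ) {
      // Base64 payload wrapped as a data URL, usable as an <img> src.
      urls.push(`data:${p.mimeType};base64,${p.data}`)
    }
  }
  return urls
}
```

Each returned URL could then back a thumbnail, with the full-size view delegated to a modal.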

- Remove legacy `<think>` tag parsing and accumulation of reasoning chunks in the main text buffer.
- Rely exclusively on `streamEvents` to derive reasoning and loading state.
- Update loading logic to account for both tool calls and reasoning events.
- Adjust memo dependencies and return values to avoid stale references.
- Update `useChat` and `completion.ts` to stop mutating the accumulated text with reasoning, keeping the logic purely event-driven.
- Ensure the ThinkingBlock always renders from the structured steps list, improving consistency and eliminating duplicate content.
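
To illustrate the event-driven approach in the list above, here is a small sketch that folds a chronological event list into display steps. The event and step shapes (`SketchStreamEvent`, `SketchStep`) and the function `buildSteps` are hypothetical stand-ins; the PR's real `streamEvents` entries may carry different fields.

```typescript
// Hypothetical event shapes, ordered as they arrived from the stream.
type SketchStreamEvent =
  | { type: 'reasoning_chunk'; text: string }
  | { type: 'tool_call'; name: string; args: string }
  | { type: 'tool_output'; output: string }

interface SketchStep {
  kind: 'reasoning' | 'tool_call' | 'tool_output'
  content: string
}

// Consecutive reasoning chunks merge into one step; tool events always
// start a new step, so the original ordering is preserved.
function buildSteps(events: SketchStreamEvent[]): SketchStep[] {
  const steps: SketchStep[] = []
  for (const ev of events) {
    if (ev.type === 'reasoning_chunk') {
      const last = steps[steps.length - 1]
      if (last !== undefined && last.kind === 'reasoning') {
        last.content += ev.text
      } else {
        steps.push({ kind: 'reasoning', content: ev.text })
      }
    } else if (ev.type === 'tool_call') {
      steps.push({ kind: 'tool_call', content: `${ev.name}(${ev.args})` })
    } else {
      steps.push({ kind: 'tool_output', content: ev.output })
    }
  }
  return steps
}
```

Since the steps are derived purely from events, the main text buffer never needs to contain reasoning, which matches the stated goal of eliminating duplicate content.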

Add more descriptive loading and finished state labels for the ThinkingBlock component.

The update:

- Uses new translation keys (`chat:calling_tool`, `chat:thought_and_tool_call`, etc.) for clearer tool-call and reasoning messages.
- Handles `tool_call`, `tool_output`, and `reasoning` steps explicitly, providing a fallback when no active step is present.
- Adjusts the final label logic to use the new i18n keys and formats durations consistently.
- Adds missing locale entries for all new keys and a trailing newline to the JSON file.

These changes improve user feedback during chat interactions and make the messages easier to maintain and localize.

The update fixes how total thinking time is calculated during a chat message flow. Previously the elapsed time from the initial completion was incorrectly added to the overall thinking time, leading to inflated metrics. The new logic splits the computation into separate phases (initial completion, tool execution, and follow‑up completions) and accumulates them into `totalThinkingTime`, ensuring accurate measurement. Additionally, translation keys for chat components are now namespaced with `chat:` to avoid collisions and clearly indicate the context in which they are used. The diff also removes a stray comment line and keeps metadata updates consistent across recursive calls.
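
One way to picture the per-phase accumulation is the sketch below. The helper `createPhaseTimer` is hypothetical and synchronous for brevity (the real phases are asynchronous), and it measures in milliseconds as an assumption; only time spent inside a measured phase counts toward the total.

```typescript
// Accumulates elapsed time across separately measured phases, e.g.
// initial completion, tool execution, and follow-up completions.
function createPhaseTimer(now: () => number = Date.now) {
  let totalMs = 0
  return {
    // Runs one phase and adds its duration to the running total,
    // even if the phase throws.
    measure<T>(phase: () => T): T {
      const start = now()
      try {
        return phase()
      } finally {
        totalMs += now() - start
      }
    },
    get totalThinkingTime(): number {
      return totalMs
    },
  }
}
```

Time that passes between two `measure` calls is not counted, which is the property that avoids the inflated totals described in the commit message.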

Remove the separate “Thinking…” placeholder component and embed the empty‑streaming state directly inside the main block. Adjust the click handler and button disabled logic to only allow toggling when content is available, preventing accidental collapse during loading. This change simplifies the component, eliminates duplicate markup, and improves UX by consistently showing the thinking indicator within the block.

Previously the component used an `isStreamingEmpty` flag to display a “thinking” placeholder when the block was loading but had no steps yet. The new implementation handles this case directly in the streaming block by checking `activeStep || N === 0`, removing the unused flag and simplifying the conditional rendering. In addition, the click and button‑disable logic were clarified, and extraneous comments were removed for cleaner code. These changes improve readability and maintainability without altering external behavior.

Adds a `linkComponents` prop that can be supplied to `RenderMarkdown` within `ThinkingBlock` and propagates this prop to the thread view. The change enables custom link rendering (e.g., special handling of URLs or interactions) without modifying the core component logic. It also renames the “Tool Call” label to “Tool Input” to more accurately describe the content being displayed.

Use the `MessageStatus` enum to determine when the “Generate AI Response” button should appear. Previously the visibility logic checked the last message’s role or the presence of tool calls, which became unreliable once tool calls and reasoning were combined into a single block. The button is now shown only when the last message exists and its status is not `Ready`. A message becomes `Ready` once generation finishes on an EOT token, so the button appears only when the AI stopped generating before it could end properly. This improves UX and aligns the UI state with the underlying message state. The change also imports `MessageStatus` and cleans up formatting for readability.
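
The visibility rule reduces to a small predicate, sketched here. `SketchMessageStatus` is a local stand-in for the real `MessageStatus` enum; its members other than `Ready` and the `SketchMessage` shape are assumptions.

```typescript
// Local stand-in for MessageStatus; only `Ready` is confirmed by the source.
const SketchMessageStatus = {
  Ready: 'ready',
  Pending: 'pending',
  Stopped: 'stopped',
  Error: 'error',
} as const

type SketchMessageStatus =
  (typeof SketchMessageStatus)[keyof typeof SketchMessageStatus]

interface SketchMessage {
  role: 'user' | 'assistant'
  status: SketchMessageStatus
}

// Show the button only when there is a last message and it did not
// reach the Ready state, i.e. generation ended prematurely.
function shouldShowGenerateButton(messages: SketchMessage[]): boolean {
  const last = messages[messages.length - 1]
  return last !== undefined && last.status !== SketchMessageStatus.Ready
}
```

Note that the predicate no longer inspects the message role or tool calls at all, which is what makes it independent of the combined reasoning and tool-call block.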

Previously the dialog simply rendered the raw JSON of the metadata, which made it hard to read and required the CodeEditor dependency. This change replaces the raw viewer with a set of semantic sections that show assistant details, model parameters, token speed, and timestamps in a clean, icon‑rich layout. The component now uses TypeScript interfaces for better type safety, memoized formatting helpers, and removes the unnecessary CodeEditor import. Locale entries were added for all new labels. The updated UI improves user experience by making metadata more accessible and readable, while simplifying the code base and reducing bundle size.

…ormatting Add import for `captureProactiveScreenshots`, correct mock response formatting, and update test expectations to match the new API. Enhance coverage by adding scenarios for screenshot capture errors, abort controller handling, and proactive mode toggling. These changes provide clearer, more robust tests for the completion logic.

The metadata prop was previously required, but callers sometimes pass null or undefined. Updating the type to allow null/undefined prevents runtime errors and simplifies usage.

Rearrange the `postMessageProcessing` signature so that `isProactiveMode` now precedes the internal `currentStepCount`. This change improves the logical flow of the API: the proactive flag is a public flag that callers should set, whereas the step counter is an internal bookkeeping value. The updated order also aligns the JSDoc comment with the implementation. All related tests were updated to reflect the new parameter order, and comments were adjusted to describe the internal counter in the correct place. No functional behavior changes occur; the change simply makes the API easier to read and maintain.

Add a helper `extractContentAndClean` that pulls out the content between `<think>` tags and removes all auxiliary tags from the final output. Update the message rendering logic to use this helper for finalized messages that lack explicit stream events, ensuring that reasoning and final output are correctly separated and displayed. Adjust the reasoning detection to consider extracted reasoning as well as stream events, clean the copy button to use the actual final output, and eliminate duplicate `StreamEvent` type definitions. These changes improve message parsing accuracy and simplify the component’s handling of legacy messages that embed both reasoning and results in the same string.

github-project-automation bot added this to Jan Oct 28, 2025

qnixsynapse marked this pull request as draft October 28, 2025 10:34

github-actions bot assigned qnixsynapse Oct 28, 2025

qnixsynapse force-pushed the feat/new_combined_reasoning_tool_calling_block branch from 6cee9cf to 97c9407 Compare October 29, 2025 15:42

qnixsynapse and others added 24 commits November 1, 2025 20:40

fix: don't stream thought steps

b798623

fix: do not add done after tool call output when there is another rea…

3ee4d36

…soning step

chore: add smooth animation when step inside thinking and tool call

4e737db

fix: final text stream rendering

036ebad

chore: update title block thinking and tool call

b8a32bd

fix: disable any type checking in useChat

58b81d2

Remove defunct condition as tool calls and reasoning are unified

7a06691

refactor: allow nullable metadata in MessageMetadataDialog

4bbb8a6

The metadata prop was previously required, but callers sometimes pass null or undefined. Updating the type to allow null/undefined prevents runtime errors and simplifies usage.

qnixsynapse force-pushed the feat/new_combined_reasoning_tool_calling_block branch from ea922ea to 2455147 Compare November 1, 2025 15:12
