webui: Fix selecting generated output issues during active streaming#18091

Merged
allozaur merged 16 commits into ggml-org:master from allozaur:17132-select-message-during-generation
Dec 18, 2025

Conversation

@allozaur (Contributor) commented Dec 16, 2025

Close #17132

  • Implements incremental rendering (initial draft created by @ServeurpersoCom): splits markdown content into stable blocks and a single unstable block:
    • Stable blocks (all but the last) are cached and only rendered once
    • Only the last block (unstable) is re-rendered during streaming
    • This prevents DOM reconstruction of already-rendered content, enabling smooth text selection
    • Uses HAST node positions to identify stable blocks
  • Reduces unnecessary re-processing of unchanged markdown blocks
  • Improves code organization in the MarkdownContent component
demo.mp4
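The stable/unstable split described above can be sketched as pure logic. This is a minimal illustration, not the component's actual code: it uses naive blank-line block splitting and a toy renderer, whereas the real webui identifies blocks via HAST node positions.

```typescript
// Hypothetical sketch of incremental rendering with stable blocks:
// every block except the last is treated as stable and its rendered HTML
// is cached, so streaming only re-renders the unstable tail block.
type Rendered = { source: string; html: string };

const cache = new Map<number, Rendered>(); // keyed by block index

// Toy "renderer"; the real webui renders markdown through a HAST pipeline.
const render = (md: string) => `<p>${md}</p>`;

function renderIncremental(markdown: string): string[] {
  const blocks = markdown.split(/\n{2,}/); // simplistic block splitting
  return blocks.map((src, i) => {
    const isLast = i === blocks.length - 1;
    const hit = cache.get(i);
    // Stable block whose source is unchanged: reuse the cached HTML.
    if (!isLast && hit && hit.source === src) return hit.html;
    // Unstable (or changed) block: re-render, cache it once it is stable.
    const html = render(src);
    if (!isLast) cache.set(i, { source: src, html });
    return html;
  });
}
```

Because stable blocks return the exact cached strings, a keyed renderer on top of this would leave their DOM nodes untouched, which is what keeps an in-progress text selection alive.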

@allozaur (author) commented:

@ggerganov @ngxson @ServeurpersoCom

Please do some testing on your end as well, and let me know whether this PR is missing anything needed to address the issue.

@ServeurpersoCom (Contributor) commented Dec 16, 2025

Fantastic! I absolutely must stress test it with an MoE A3B model on a 5090, because my draft kept crashing after a while!

Edit: we hit the same edge-case bug as in the POC/draft (video watched together). We need to narrow it down by reproducing the content that triggers it.

@ServeurpersoCom (Contributor) commented Dec 16, 2025

Running GPT-OSS-20B at 330 tok/s, inference was faster than rendering, which made it easy to trigger the race condition between stable/unstable block updates. The solution is to `await tick()` to force a DOM sync.
I can no longer reproduce the bug with this patch.
I haven't noticed any performance drop, although rendering could later be throttled to at most one update per `requestAnimationFrame` (or a submultiple of it).
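The race and the `await tick()` fix can be modeled outside Svelte. This is a hypothetical simulation, not the webui's code: `setBlocks`/`flush` stand in for a framework whose state-to-DOM flush is asynchronous, and `tick` mimics Svelte's `tick()` by resolving after pending flushes have run.

```typescript
// Why awaiting a DOM flush between block updates avoids the
// stable/unstable race: without it, a fast producer can overwrite
// state before the DOM reflects the previous update.
type Block = { id: number; html: string };

let dom: string[] = [];              // simulated rendered DOM
let pending: Block[] | null = null;  // state waiting to be flushed

function setBlocks(blocks: Block[]) {
  pending = blocks;
  queueMicrotask(flush); // flushes asynchronously, like a framework would
}

function flush() {
  if (pending) {
    dom = pending.map((b) => b.html);
    pending = null;
  }
}

// Resolves after already-queued microtasks (i.e. the pending flush) ran.
const tick = () => new Promise<void>((r) => queueMicrotask(() => r()));

async function streamChunks(chunks: string[][]): Promise<string[]> {
  for (const chunk of chunks) {
    setBlocks(chunk.map((html, id) => ({ id, html })));
    await tick(); // force the DOM to catch up before the next chunk
  }
  return dom;
}
```

With the `await tick()` removed, the loop would finish before any flush ran, which is the shape of the corruption seen at 330 tok/s.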

@ggerganov (Member) commented:

Generally this also works on my end, but I occasionally see bigger selections than expected. Maybe it is related to the race that @ServeurpersoCom found:

webui-selection-0.mp4

allozaur force-pushed the 17132-select-message-during-generation branch from eb39de1 to 511a426 on December 17, 2025 at 09:50
@ggerganov (Member) commented:

Actually, I did some more testing: the problem I observed occurs only when trying to select text inside a code block that is currently being generated. After the block is closed, selecting within it works fine.

I think this is acceptable.

@allozaur (author) commented:

> Actually, I did some more testing and the problem that I observed occurs only when I am trying to select text inside a code block that is currently being generated. After it gets closed, then selecting for that block works ok.
>
> I think this is acceptable.

@ggerganov @ServeurpersoCom I've added some changes after @ngxson's review. Please re-test this on your end.

@ServeurpersoCom (Contributor) commented:

No regression on my end: more than 10 long, rich markdown generations succeeded, whereas the corruption previously occurred systematically after 2 or 3 generations.
Testing: GPT-OSS-20B
Prompt: "Write a long and rich markdown"

  • Smartphone test OK

@ggerganov (Member) commented:

Found a bug introduced here: when you use "Copy code", it drops the whitespace:

image

@ngxson (Contributor) commented:

Tested on my side; it works except inside a code block that is still being generated, as Georgi spotted earlier.

We can improve this in the future by somehow avoiding setting innerHTML via {@html block.html}. Instead, I think it's best to have a system that takes a HastRoot and uses depth-first search to compute the diff between the two virtual DOMs, updating only the changed nodes in the HTML. There are probably libraries that already do all of this heavy lifting for us, but we can look into it later.

For now, I think this PR is good to merge.
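The diff idea above can be sketched as a depth-first walk over a simplified HAST-like tree. The `HNode` type and `diff` helper are illustrative only, not an actual library API: both trees are walked in lockstep and the paths of differing nodes are recorded, so only those subtrees would be patched in the real DOM.

```typescript
// Hypothetical DFS diff between two virtual trees: collect paths of
// changed nodes instead of replacing the whole rendered output.
type HNode = { tag: string; text?: string; children?: HNode[] };

function diff(
  a: HNode | undefined,
  b: HNode | undefined,
  path = "0",
  out: string[] = []
): string[] {
  // Node added, removed, or changed: mark this path for patching and
  // treat the whole subtree as replaced (no need to descend further).
  if (!a || !b || a.tag !== b.tag || a.text !== b.text) {
    out.push(path);
    return out;
  }
  const ac = a.children ?? [];
  const bc = b.children ?? [];
  const n = Math.max(ac.length, bc.length);
  for (let i = 0; i < n; i++) diff(ac[i], bc[i], `${path}.${i}`, out);
  return out;
}
```

A production version would also need keyed matching so an inserted sibling does not cascade into diffs for every node after it; that is the heavy lifting an existing library would ideally provide.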

@ServeurpersoCom (Contributor) commented:

Same on Windows: copying and pasting yields no \n (or \r\n) line endings.
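The symptom both reports describe is consistent with joining per-line fragments of a highlighted code block without a separator. A minimal sketch, hypothetical rather than the webui's actual extraction code:

```typescript
// Hypothetical illustration of the "Copy code" whitespace bug:
// highlighted code is often rendered as one fragment per line, and
// joining the fragments without a separator drops the newlines the
// user expects to get on paste.
const lines = ["function f() {", "  return 1;", "}"];

const broken = lines.join("");   // newlines lost: "function f() {  return 1;}"
const fixed = lines.join("\n");  // preserves line structure for the clipboard
```

Reading the whole `<code>` element's `textContent` (which keeps the text nodes' newlines) instead of concatenating per-line fragments avoids the same pitfall in real DOM extraction.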

allozaur force-pushed the 17132-select-message-during-generation branch from 6768dec to aa461e8 on December 18, 2025 at 00:59
@allozaur (author) commented:

> Found a bug introduced here - when you "Copy code", it forgets the whitespaces:
> image

@ggerganov @ServeurpersoCom this should be fixed with 30c2c18

allozaur force-pushed the 17132-select-message-during-generation branch from aa461e8 to b4aa66a on December 18, 2025 at 10:13
allozaur merged commit 9ce64ae into ggml-org:master on Dec 18, 2025
10 checks passed
allozaur deleted the 17132-select-message-during-generation branch on December 18, 2025 at 10:17
@thomasjfox (Contributor) commented:

Thanks so much for this one! 🥳

It fixes a usability annoyance for good. The users will love it.

@allozaur (author) commented:

> Thanks so much for this one! 🥳
>
> It fixes a usability annoyance for good. The users will love it.

Great to hear! It's still not in a perfect state and we are planning a better strategy for rendering the generated content, but it solves the most pressing issue.

Anico2 added a commit to Anico2/llama.cpp that referenced this pull request Jan 15, 2026
…gml-org#18091)

* draft: incremental markdown rendering with stable blocks

* refactor: Logic improvements

* refactor: DRY Markdown post-processing logic

* refactor: ID generation improvements

* fix: Remove runes

* refactor: Clean up & add JSDocs

* chore: update webui static output

* fix: Add tick to prevent race conditions for rendering Markdown blocks

Suggestion from @ServeurpersoCom

Co-authored-by: Pascal <[email protected]>

* chore: Run `npm audit fix`

* chore: update webui static output

* feat: Improve performance using global counter & id instead of UUID

* refactor: Enhance Markdown rendering with link and code features

* chore: update webui static output

* fix: Code block content extraction

* chore: update webui static output

* chore: update webui static output

---------

Co-authored-by: Pascal <[email protected]>
blime4 referenced this pull request in blime4/llama.cpp Feb 5, 2026

Development

Successfully merging this pull request may close these issues.

Misc. bug: Webui: Problems selecting text while generating

5 participants