feat: telegram use parse mode ModeMarkdownV2 instead of ModeHTML by Alexandersfg4 · Pull Request #1018 · sipeed/picoclaw

Alexandersfg4 · 2026-03-03T07:59:15Z

📝 Description

🗣️ Type of Change

✨ New feature (non-breaking change which adds functionality)
⚡ Code refactoring (no functional changes, no api changes)

🤖 AI Code Generation

🛠️ Mostly AI-generated (AI draft, Human verified/modified)

🔗 Related Issue

Currently, Picoclaw sends agent messages to Telegram as follows:
Input in Markdown -> Converted to HTML -> Sent as a message.
If the message fails to send, a fallback message is sent with the HTML content and an empty parse mode.
With the current implementation I often see raw markup (heading tags, HTML tags, etc.) leaking into the delivered message.

My proposal:
Input in Markdown -> Converted to Telegram MarkdownV2 -> Sent as a message. (This should leave fewer formatting bugs to handle.)
If the message fails to send, a fallback message is sent with the original Markdown and an empty parse mode.
Telegram MarkdownV2 has no heading syntax, so headings are converted to bold text.
Unit tests are included for the feature.
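
The heading-to-bold step could look roughly like the sketch below. It is not the code from this PR: the names headingsToBold and headingRe are made up for illustration, and the sketch assumes ATX-style headings ("# Title") only. It also skips the escaping of reserved characters, so its output is not yet guaranteed to be valid MarkdownV2.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// headingRe matches ATX-style Markdown headings such as "## Title".
// (Hypothetical helper, not taken from the PR.)
var headingRe = regexp.MustCompile(`^#{1,6}\s+(.+?)\s*$`)

// headingsToBold rewrites every heading line as a bold line, because
// Telegram MarkdownV2 has no heading syntax. Other lines are left untouched.
// Escaping of reserved characters is deliberately out of scope here.
func headingsToBold(md string) string {
	lines := strings.Split(md, "\n")
	for i, line := range lines {
		if m := headingRe.FindStringSubmatch(line); m != nil {
			lines[i] = "*" + m[1] + "*"
		}
	}
	return strings.Join(lines, "\n")
}

func main() {
	fmt.Println(headingsToBold("## Release notes\nAll good"))
}
```

A real converter would also need to leave "#" lines inside code blocks alone, which this sketch does not attempt.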

📚 Technical Context (Skip for Docs)

Reference URL:
Reasoning:

🧪 Test Environment

Hardware: Apple M2 Pro
OS: 26.3 (25D125)
Model/Provider: groq/meta-llama/llama-4-scout-17b-16e-instruct
Channels: Telegram

📸 Evidence (Optional)


☑️ Checklist

My code/docs follow the style of this project.
I have performed a self-review of my own changes.
I have updated the documentation accordingly.

Alexandersfg4 · 2026-03-04T16:10:35Z

@alexhoshina Hi, could you look at my PR?

alexhoshina · 2026-03-04T16:42:09Z

Thanks for the reminder and for your patience.
This PR proposes switching Telegram’s parse mode to MarkdownV2.
At the same time, there’s another open PR (#935) that updates the Telegram sending logic under HTML mode—particularly around message formatting and output behavior.
These two PRs represent different directions: one assumes we’ll keep using HTML, the other assumes we’ll move to MarkdownV2.
Since they’re mutually exclusive in intent, we need to decide which parse mode we want to standardize on for the Telegram channel.

Should our long-term Telegram output use HTML or MarkdownV2?

Thanks again for your understanding and collaboration!

alexhoshina · 2026-03-04T16:43:45Z

@Alexandersfg4 @putueddy

putueddy · 2026-03-05T12:20:29Z

Hey @alexhoshina, thanks for flagging this — it's a good discussion to have before we go in two different directions!

After looking at both PRs, I'd lean toward sticking with HTML. Here's my thinking:

LLM output is messy by nature
The agent loves throwing around *, _, ., ! and all sorts of characters that happen to be reserved in MarkdownV2 (there are 19+ of them!). One unescaped dot or mismatched asterisk and Telegram rejects the whole message. With HTML, we only need to worry about <, >, and & — which almost never show up in normal responses.
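
To make the difference in escaping burden concrete, here is a minimal sketch of the two escapers for plain text with no intended formatting. The function names are hypothetical and not from either PR. The reserved set follows the characters Telegram documents for MarkdownV2, plus the backslash itself.

```go
package main

import (
	"fmt"
	"strings"
)

// mdV2Reserved lists the characters Telegram requires to be escaped in
// MarkdownV2 text, plus the backslash itself.
const mdV2Reserved = "\\_*[]()~`>#+-=|{}.!"

// escapeMarkdownV2 backslash-escapes every reserved character, treating
// the input as plain text with no intended formatting.
func escapeMarkdownV2(s string) string {
	var b strings.Builder
	for _, r := range s {
		if strings.ContainsRune(mdV2Reserved, r) {
			b.WriteByte('\\')
		}
		b.WriteRune(r)
	}
	return b.String()
}

// htmlEscaper covers the three characters that matter in HTML parse mode.
var htmlEscaper = strings.NewReplacer("&", "&amp;", "<", "&lt;", ">", "&gt;")

// escapeHTML escapes plain text for Telegram's HTML parse mode.
func escapeHTML(s string) string { return htmlEscaper.Replace(s) }

func main() {
	text := "Done. 2 + 2 = 4!"
	fmt.Println(escapeMarkdownV2(text)) // four characters need a backslash
	fmt.Println(escapeHTML(text))       // unchanged
}
```

An ordinary sentence needs four escapes under MarkdownV2 and none under HTML, which is the asymmetry described above.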

The community generally says the same thing
Most Telegram bot framework maintainers (Telegraf, GramIO, etc.) actually recommend HTML over MarkdownV2 for dynamic content. Some go as far as calling MarkdownV2 "dangerous" when you don't fully control the input — and with LLM output, we definitely don't.

A few concerns with the MarkdownV2 parser here
I noticed some things that could bite us — there's a potential out-of-bounds panic in the expandable blockquote handler when the input is short, the link detection relies on a single-char lookbehind that can misfire (e.g. array0), and backslashes aren't being escaped. The test suite covers the happy path nicely but doesn't quite get to these edge cases yet.
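
The parser code itself is not quoted in this thread, so the following is only an illustration of the kind of bounds-safe check that avoids a panic on short input. The function name is hypothetical. It relies on the MarkdownV2 rule that an expandable block quotation starts with "**>".

```go
package main

import (
	"fmt"
	"strings"
)

// isExpandableQuoteStart reports whether a line opens a MarkdownV2
// expandable block quotation, which starts with "**>".
// strings.HasPrefix is safe for inputs shorter than the prefix, whereas
// slicing with line[:3] would panic on them.
func isExpandableQuoteStart(line string) bool {
	return strings.HasPrefix(line, "**>")
}

func main() {
	for _, line := range []string{"", "*", "**>hidden details"} {
		fmt.Printf("%q -> %v\n", line, isExpandableQuoteStart(line))
	}
}
```

Table-driven tests that include the empty string and one- or two-character inputs would cover the edge cases the happy-path suite misses.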

The formatting issues are fixable under HTML
The screenshots in this PR showing leaked HTML tags are real and annoying, but I think that's more about bugs in the current converter than a fundamental problem with HTML mode. My PR (#935) tackles the chunking and fallback logic, which should clean those up.

All that said — I really appreciate the work in this PR, @Alexandersfg4! The heading-to-bold conversion is a nice touch and something we could totally bring over to the HTML path too.

Curious what you both think!

CLAassistant · 2026-03-05T15:00:10Z

All committers have signed the CLA.

Alexandersfg4 · 2026-03-06T04:48:11Z

Hi, thank you for the feedback! I appreciate the thorough review.
The arguments for using HTML are very convincing, especially regarding how messy LLM outputs can be. I agree that HTML is a safer default.
That said, I’d be happy to refine my PR to address your concerns:

Add a configuration flag so users can select MarkdownV2 if they prefer (keeping HTML as the default).

{
  "channels": {
    "telegram": {
      "enabled": true,
      "token": "YOUR_BOT_TOKEN",
      "allow_from": ["YOUR_USER_ID"],
      "use_markdown_v2": true
    }
  }
}

Improve the MarkdownV2 parser by adding more test cases and fixing the edge cases you pointed out (like the potential panic and backslash escaping).

Let me know if that sounds like a good middle ground!
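
A minimal sketch of how such a flag could be read on the Go side, assuming the JSON layout shown above. The type and function names are illustrative and not taken from the Picoclaw config package. Because the zero value of a Go bool is false, leaving the key out keeps HTML as the default.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// telegramConfig mirrors the keys from the JSON example above.
// (Hypothetical type, not the project's actual config struct.)
type telegramConfig struct {
	Enabled       bool     `json:"enabled"`
	Token         string   `json:"token"`
	AllowFrom     []string `json:"allow_from"`
	UseMarkdownV2 bool     `json:"use_markdown_v2"` // false when omitted
}

type config struct {
	Channels struct {
		Telegram telegramConfig `json:"telegram"`
	} `json:"channels"`
}

// parseConfig decodes the JSON document into the config struct.
func parseConfig(data []byte) (config, error) {
	var cfg config
	err := json.Unmarshal(data, &cfg)
	return cfg, err
}

func main() {
	raw := []byte(`{"channels":{"telegram":{"enabled":true,"use_markdown_v2":true}}}`)
	cfg, err := parseConfig(raw)
	if err != nil {
		fmt.Println("invalid config:", err)
		return
	}
	fmt.Println("use MarkdownV2:", cfg.Channels.Telegram.UseMarkdownV2)
}
```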

alexhoshina · 2026-03-06T04:51:41Z

I think this is feasible


Alexandersfg4 · 2026-03-06T11:44:49Z

Hi, I pushed the changes that add the use_markdown_v2 flag.

alexhoshina · 2026-03-08T07:24:49Z

Hi, we merged #935 yesterday, so you might need to resolve the conflicts

Alexandersfg4 · 2026-03-08T08:57:58Z

@alexhoshina Hi, I resolved the conflicts

I also tested the new feature with use_markdown_v2 set to both false and true.

alexhoshina · 2026-03-10T14:55:30Z

Sorry! I was a bit late in reviewing. Could you please resolve the conflicts again? Thank you very much.

Alexandersfg4 · 2026-03-10T17:09:08Z

Hi, I resolved the conflicts again.

@alexhoshina, a kind ping. Thank you very much!

alexhoshina · 2026-03-11T04:19:31Z

make lint plz

Alexandersfg4 · 2026-03-11T07:23:51Z

I fixed the linter issues and resolved the merge conflicts :)

Kind ping @alexhoshina

alexhoshina · 2026-03-12T06:19:12Z

pkg/channels/telegram/telegram.go

 		}

-		if err := c.sendHTMLChunk(ctx, chatID, threadID, htmlContent, chunk, replyToID); err != nil {
+		if err := c.sendChunk(ctx, chatID, threadID, content, chunk, replyToID, useMarkdownV2); err != nil {


The arguments are passed in the wrong order: the chunk is passed as the replyToID parameter, and the actual replyToID is passed as the mdFallback parameter.
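
Swaps like this compile silently when several adjacent parameters share the same type. One possible guard, shown here only as a sketch and not as what the PR ended up doing, is to pass a struct with named fields. The type chunkRequest, its field names, and the describe function are all hypothetical.

```go
package main

import "fmt"

// chunkRequest is a hypothetical parameter struct. Named fields make it
// hard to mix up several string arguments at the call site.
type chunkRequest struct {
	Content       string // formatted text sent with a parse mode
	PlainFallback string // raw text sent with an empty parse mode on failure
	ReplyToID     string // message to reply to, empty for none
	UseMarkdownV2 bool
}

// describe stands in for the real send call and reports what would be sent.
func describe(r chunkRequest) string {
	return fmt.Sprintf("content=%q fallback=%q replyTo=%q mdv2=%v",
		r.Content, r.PlainFallback, r.ReplyToID, r.UseMarkdownV2)
}

func main() {
	req := chunkRequest{
		Content:       "*Hello*",
		PlainFallback: "Hello",
		ReplyToID:     "42",
		UseMarkdownV2: true,
	}
	fmt.Println(describe(req))
}
```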

Sorry, I was sick.

Fixed the bug.

alexhoshina · 2026-03-12T06:20:31Z

pkg/channels/telegram/telegram.go

+	parsedContent := parseContent(content, useMarkdownV2)
+	editMsg := tu.EditMessageText(tu.ID(cid), mid, parsedContent).
+		WithParseMode(telego.ModeMarkdownV2)


The edit request currently forces the MarkdownV2 parse mode. When use_markdown_v2 is not enabled, message edits will either fail to parse or fall back.
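
One way to keep the edit path consistent with the send path is to derive the parse mode from the same flag that selects the converter. The sketch below uses local constants that mirror the Telegram Bot API parse_mode strings so that it runs without dependencies. In the real code the telego constants (telego.ModeHTML, telego.ModeMarkdownV2) would presumably be used instead, and parseModeFor is a hypothetical helper name.

```go
package main

import "fmt"

// Local stand-ins for the Bot API parse_mode values.
const (
	parseModeHTML       = "HTML"
	parseModeMarkdownV2 = "MarkdownV2"
)

// parseModeFor returns the parse mode matching the converter selected by
// the use_markdown_v2 flag, so content and parse mode never disagree.
func parseModeFor(useMarkdownV2 bool) string {
	if useMarkdownV2 {
		return parseModeMarkdownV2
	}
	return parseModeHTML
}

func main() {
	fmt.Println(parseModeFor(false), parseModeFor(true))
}
```

Calling the same helper from both the send and the edit code paths avoids pairing HTML content with the MarkdownV2 parse mode.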

Alexandersfg4 · 2026-03-18T10:10:27Z

@alexhoshina Hi, sorry for the delay. I was out sick, but I've fixed the issues now. Could you please take another look at my PR?

alexhoshina · 2026-03-18T13:29:06Z

Thank you for your contribution!

…eed#1018)
* feat: telegram use parse mode ModeMarkdownV2 instead of ModeHTML
* handle expandable block quotation starts, add test for all md2 formats
* fix: linter issue
* feat: added flag use_markdown_v2, corrected config, updated documentation
* move parseChatID to parser_markdown_to_html
* fix: tests and linter issues
* fix: case with ~
* test: fixed Test_markdownToTelegramMarkdownV2
* fix: regex block-quote line >
* fix: linter issues
* fix: send chunk param mismatched, in edit msg use HTML parse mode too
* fix: remove from .gitignore redundant comment

feat: telegram use parse mode ModeMarkdownV2 instead of ModeHTML

c6cac8b

sipeed-bot bot added the type: enhancement and domain: channel labels Mar 3, 2026

Alexandersfg4 added 2 commits March 4, 2026 08:37

handle expandable block quotation starts, add test for all md2 formats

25d4f16

fix: linter issue

dd9fb95

alexhoshina self-assigned this Mar 5, 2026

feat: added flag use_markdown_v2, corrected config, updated documentation

1b189f3

move parseChatID to parser_markdown_to_html

174cd17

Merge remote-tracking branch 'origin/main' into feat/telegram-use-md2

8145622

fix: tests and linter issues

f4b4624

Alexandersfg4 added 3 commits March 10, 2026 18:38

fix: case with ~

83a831e

Merge remote-tracking branch 'origin/main' into feat/telegram-use-md2

1000daa

test: fixed Test_markdownToTelegramMarkdownV2

1678873

fix: regex block-quote line >

3d387ed

fix: linter issues

52ca274

Merge remote-tracking branch 'origin/main' into feat/telegram-use-md2

7fc601b

alexhoshina requested changes Mar 12, 2026


Alexandersfg4 added 3 commits March 18, 2026 12:26

fix: send chunk param mismatched, in edit msg use HTML parse mode too

5188bc7

Merge remote-tracking branch 'origin/main' into feat/telegram-use-md2

2147506

fix: remove from .gitignore redundant comment

df96533

alexhoshina approved these changes Mar 18, 2026


alexhoshina merged commit 12f4029 into sipeed:main Mar 18, 2026
3 checks passed

4 participants
