fix: Preserve LaTeX commands in unescapeStringForGeminiBug() (#19802) #19811

Closed
Nixxx19 wants to merge 3 commits into google-gemini:main from Nixxx19:nityam/fix-latex-backslash-escape-bug

Conversation

Nixxx19 (Contributor) commented Feb 21, 2026

Fixes #19802

Summary

Fixed a bug where unescapeStringForGeminiBug() was incorrectly replacing \t, \n, and \r in LaTeX commands (e.g., \title, \textbf, \newline) with actual tab/newline/carriage return characters. The function now uses a smart heuristic to distinguish between domain-specific backslash sequences (LaTeX, regex patterns) and legitimate escape sequences that should be unescaped.

Impact: High - affects users generating LaTeX documents, regex patterns, or any content with backslash-prefixed commands.

Details

Root Cause

The original implementation aggressively unescaped all \t, \n, and \r sequences to fix LLM over-escaping bugs, but this broke LaTeX commands: \title{...} became <tab>itle{...} (a tab character followed by "itle").

Solution

Implemented a context-aware heuristic in the unescapeStringForGeminiBug() function:

  • For \n, \t, \r: Check the character immediately following the escape sequence

    • If followed by a lowercase letter ([a-z]): Preserve the backslash (likely a LaTeX/domain command)
    • Otherwise (uppercase, punctuation, whitespace, EOF): Unescape to the actual control character
  • For quotes, backticks, backslashes: Always unescape (no change to existing behavior)
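The rule above can be sketched as a replacement callback that uses the match offset to peek at the next character. This is a minimal illustration, not the PR's actual code: the function name `unescapeWithLookahead`, the `CONTROL_CHARS` table, and the simplified regex are all assumptions made for the example.

```typescript
// Hypothetical sketch of the lowercase-lookahead rule (names are illustrative).
const CONTROL_CHARS: Record<string, string> = { n: '\n', t: '\t', r: '\r' };

function unescapeWithLookahead(input: string): string {
  // Match one or more backslashes followed by an escapable character.
  return input.replace(
    /\\+([ntr'"`\\])/g,
    (match: string, captured: string, offset: number, full: string): string => {
      const control: string | undefined = CONTROL_CHARS[captured];
      if (control === undefined) {
        // Quotes, backticks, backslashes: always unescape to the bare character.
        return captured;
      }
      // Character immediately after the escape sequence ('' at end of input).
      const next = full.charAt(offset + match.length);
      // Lowercase letter next: likely a command such as \title, so keep it.
      return /^[a-z]$/.test(next) ? match : control;
    },
  );
}

// '\\' in source code is a single literal backslash at runtime.
console.log(JSON.stringify(unescapeWithLookahead('\\title{Doc}')));
console.log(JSON.stringify(unescapeWithLookahead('Column\\tValue')));
```

The callback's third and fourth arguments (match offset and full input string) are standard for `String.prototype.replace`, which is what makes the one-character lookahead possible without a second pass.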

Examples

  • \title{Doc} → \title{Doc} ✅ (preserved - lowercase 'i' follows \t)
  • \textbf{Bold} → \textbf{Bold} ✅ (preserved - lowercase 'e' follows \t)
  • Column\tValue → Column<tab>Value ✅ (unescaped - uppercase 'V' follows \t)
  • Line1\n\newpage → Line1 + newline + \newpage ✅ (mixed handling)
  • C:\name\test → C:\name\test ✅ (Windows paths preserved - lowercase follows)
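The examples above can be checked with a compact, table-driven sketch. It uses a hypothetical one-regex variant of the rule (a negative lookahead instead of a callback offset, and only single backslashes), so it illustrates the stated behavior rather than reproducing the PR's code. The last two rows are added assumptions that follow directly from the rule: lowercase prose after an over-escaped \n stays escaped, and a path segment starting with a digit gets a tab inserted.

```typescript
// Hypothetical one-regex variant of the rule (illustration only).
function unescapeUnlessLowercaseFollows(input: string): string {
  const controls: Record<string, string> = { n: '\n', t: '\t', r: '\r' };
  // (?![a-z]) rejects the match when a lowercase letter comes next.
  return input.replace(
    /\\([ntr])(?![a-z])/g,
    (match: string, c: string): string => controls[c] ?? match,
  );
}

// [input, expected output]; '\\' in source is one literal backslash.
const cases: Array<[string, string]> = [
  ['\\title{Doc}', '\\title{Doc}'],
  ['Column\\tValue', 'Column\tValue'],
  ['Line1\\n\\newpage', 'Line1\n\\newpage'],
  ['C:\\name\\test', 'C:\\name\\test'],
  // Over-escaped prose: the \n is NOT unescaped because 's' is lowercase.
  ['first line\\nsecond line', 'first line\\nsecond line'],
  // Path segment starting with a digit: the \t IS turned into a tab.
  ['C:\\logs\\t1.txt', 'C:\\logs\t1.txt'],
];

for (const [input, expected] of cases) {
  const actual = unescapeUnlessLowercaseFollows(input);
  const status = actual === expected ? 'ok' : 'MISMATCH';
  console.log(JSON.stringify(input), '->', JSON.stringify(actual), status);
}
```

The last two rows show that a single-character lookahead cannot tell intent from context alone: the same rule that protects \title also leaves genuinely over-escaped text untouched.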

Code Changes

  • packages/core/src/utils/editCorrector.ts:

    • Enhanced the regex replacement callback in unescapeStringForGeminiBug() (lines 736-805)
    • Added lookahead logic to check the next character after \n, \t, \r
    • Fixed ESLint errors by adding explicit type annotations to callback parameters
  • packages/core/src/utils/editCorrector.test.ts:

    • Added 10 new comprehensive tests for LaTeX commands, regex patterns, Windows paths, and mixed content
    • Updated 4 existing tests to align with the new, more precise unescaping behavior
    • All 51 tests pass ✅

Related Issues

Fixes #19802

How to Validate

1. Run the test suite

npx vitest run packages/core/src/utils/editCorrector.test.ts

Expected: All 51 tests pass, including:

  • should NOT unescape backslash-t in LaTeX commands
  • should NOT unescape backslash-n in LaTeX commands
  • should NOT unescape backslash-r in LaTeX/regex commands
  • should handle mixed LaTeX commands and actual escape sequences
  • should preserve Windows paths with backslashes

2. Manual validation (LaTeX preservation)

npx tsx -e 'const {unescapeStringForGeminiBug} = require("./packages/core/src/utils/editCorrector.ts"); console.log(unescapeStringForGeminiBug("\\title{My Document}"));'

Expected output: \title{My Document} (backslash preserved)

3. Manual validation (actual tabs still work)

npx tsx -e 'const {unescapeStringForGeminiBug} = require("./packages/core/src/utils/editCorrector.ts"); console.log(unescapeStringForGeminiBug("Column\\tValue"));'

Expected output: Column Value (tab character inserted)

4. Edge cases covered

  • ✅ LaTeX commands: \title, \textbf, \newline, \ref, \renewcommand
  • ✅ Windows paths: C:\name\test preserved
  • ✅ Regex patterns: \d+\n\w+ preserved
  • ✅ Mixed content: LaTeX + actual escape sequences handled correctly
  • ✅ Existing behavior: Quotes, backticks, backslashes still unescape properly

Pre-Merge Checklist

  • Updated relevant documentation and README (if needed)
  • Added/updated tests (if needed) - 10 new tests + 4 updated tests
  • Noted breaking changes (if any) - No breaking changes
  • Validated on required platforms/methods:
    • MacOS
      • npm run - All tests pass (51/51)
      • npx
      • Docker
      • Podman
      • Seatbelt
    • Windows
      • npm run
      • npx
      • Docker
    • Linux
      • npm run
      • npx
      • Docker

…i#19802)

Fixed unescapeStringForGeminiBug() to preserve LaTeX commands and other
domain-specific backslash sequences while still unescaping legitimate
escape sequences.

Fixes google-gemini#19802
Nixxx19 requested a review from a team as a code owner February 21, 2026 09:35

gemini-code-assist (Contributor) commented

Summary of Changes

Hello @Nixxx19, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical issue where the unescapeStringForGeminiBug utility was over-aggressively unescaping backslash characters, leading to corruption of domain-specific syntax such as LaTeX commands and regex patterns. The solution implements a refined, context-aware unescaping logic that intelligently distinguishes between actual escape sequences and intended literal backslash prefixes, significantly improving the accuracy and reliability of string processing for various content types.

Highlights

  • Bug Fix: Incorrect Unescaping of LaTeX Commands: Resolved a bug in unescapeStringForGeminiBug() where backslash sequences like \t, \n, and \r within LaTeX commands (e.g., \title, \textbf) were erroneously unescaped, corrupting the output.
  • Context-Aware Unescaping Heuristic Implemented: Introduced a smart heuristic to differentiate between legitimate escape sequences and domain-specific backslash commands. For \n, \t, \r, the function now checks the character immediately following the escape sequence: if it's a lowercase letter, the backslash is preserved; otherwise, it's unescaped.
  • Enhanced Test Coverage: Added 10 new comprehensive tests specifically for LaTeX commands, regex patterns, Windows paths, and mixed content. Additionally, 4 existing tests were updated to align with the new, more precise unescaping behavior, ensuring robust validation of the fix.
  • Preservation of Domain-Specific Backslash Sequences: The updated function now correctly preserves LaTeX commands, regex patterns (like \d, \w), and Windows file paths (e.g., C:\name\test), preventing unintended modifications to these contexts.
Changelog
  • packages/core/src/utils/editCorrector.test.ts
    • Updated Vitest imports to use named imports for Mock and Mocked types.
    • Corrected existing test expectations for strings containing \n and \t within path-like or command-like contexts to reflect preservation.
    • Added 10 new tests to specifically validate the preservation of LaTeX commands (\title, \textbf, \newline, \ref, \renewcommand).
    • Included new tests for mixed LaTeX commands and actual escape sequences to ensure correct handling of both.
    • Introduced tests for regex patterns (\d+, \w+, \s+) to confirm backslash preservation.
    • Added tests for Windows file paths (C:\temp\file.txt) to ensure backslashes are not unescaped.
    • Included tests for escape sequences at the end of a string and followed by punctuation.
    • Added a comprehensive test case for a full LaTeX document structure to verify overall preservation.
  • packages/core/src/utils/editCorrector.ts
    • Modified the unescapeStringForGeminiBug function to implement a context-aware heuristic for \n, \t, and \r.
    • Updated the regex replacement callback to accept offset and fullString parameters for lookahead logic.
    • Implemented conditional logic within the callback: if \n, \t, or \r is followed by a lowercase letter, the backslash is preserved; otherwise, it is unescaped.
    • Added explicit type annotations (match: string, capturedChar: string, offset: number, fullString: string) to the regex callback parameters to fix ESLint errors.
    • Updated internal comments to explain the new unescaping strategy and its rationale.
Activity
  • The pull request introduces 10 new comprehensive tests to cover LaTeX commands, regex patterns, Windows paths, and mixed content.
  • Four existing tests were updated to align with the new, more precise unescaping behavior.
  • All 51 tests in editCorrector.test.ts are reported to pass after these changes.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in pull request comments and review comments. |
| Help | /gemini help | Displays a list of available commands. |

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist bot (Contributor) left a comment

Code Review

The pull request addresses a critical bug in unescapeStringForGeminiBug() where LaTeX commands and similar domain-specific backslash sequences were incorrectly unescaped. The fix introduces a context-aware heuristic to differentiate between legitimate escape sequences and those that should be preserved. The changes include modifications to editCorrector.ts to implement this heuristic and extensive updates to editCorrector.test.ts with new and updated tests to validate the fix, including specific tests for LaTeX commands, regex patterns, and Windows paths. The changes significantly improve the correctness and robustness of the unescaping logic, particularly for users generating content with backslash-prefixed commands.

gemini-cli bot added the area/core (Issues related to User Interface, OS Support, Core Functionality) label Feb 21, 2026

Turbo-the-tech-dev commented

✅ WRITING-PLANS ACTIVATED – LaTeX Preserve Fix Leveraged
(Turbo Fleet Command – Feb 21, 2026 – PR #19811 Integrated, 4 hours after opening)

Project Spec (exact from #19811):
Title: fix: Preserve LaTeX commands in unescapeStringForGeminiBug() (#19802)
Status: Open (opened today 09:35 UTC)
Author: Nixxx19
Closes: #19802 (Gemini CLI replaces \t characters with tab when it shouldn’t)
Core Change: Updated packages/core/src/utils/editCorrector.ts → context-aware heuristic in unescapeStringForGeminiBug():

  • If \n, \t, or \r is followed by a lowercase letter [a-z], preserve the backslash (protects \title, \textbf, \newline, \ref, \d+, C:\path).
  • Otherwise, perform normal unescape.
  • +10 new tests + updated existing ones. All 51 tests now pass.

Impact: Fixes corruption of LaTeX, regex, Windows paths, and any domain-specific backslash syntax in Gemini CLI output.

Goal for our fleet:
Make nec-fleet-gemini the first extension to fully exploit this core fix for pixel-perfect technical output.

  • Generate clean LaTeX-formatted NEC reports, conduit-fill equations, safety checklists, and SRS quizzes.
  • Add one-command nec:latex export (PDF-ready).
  • Turn the Central Documentation Hub into the canonical “Clean LaTeX in Gemini CLI” guide.
  • Future-proof every fleet response containing formulas, code, or paths.

MULTI-PHASE EXECUTION PLAN (Before Any Code Touch)

Phase 0: Lock & Monitoring (Current – 5 min)

  • Lock this spec
  • Auto-watch PR + linked issue via github extension
  • Confirm we target gemini-3.1-pro-preview for max LaTeX intelligence
  • Output: Signed-off plan

Success Metric: Reply LOCKED

Phase 1: nec-fleet-gemini LaTeX Supercharge (Est. 40 min)

  • New MCP tool: nec_latex_export(topic, format?) → returns clean LaTeX (or PDF via your DEATHSTAR bridge)
  • Update all existing tools (nec_search, nec_takeoff, nec_quiz, nec_plan) to use LaTeX for equations/tables when requested
  • New slash command /nec:latex (new commands/nec/latex.toml)
  • Extend GEMINI.md persona to auto-use LaTeX for formulas and note “LaTeX now fully preserved thanks to upstream fix”
  • Output: v1.3 extension with LaTeX engine

Phase 2: Central Docs Hub – LaTeX Mastery Page (Est. 45 min)

  • New Starlight page: /docs/advanced/latex-output.md
    • Before/after examples (corrupted vs. perfect \title{Conduit Fill})
    • One-click “Enable LaTeX Mode” workflow
    • Full nec-fleet-gemini demo (live nec:latex full 200A service)
    • Migration guide + test suite link
  • Update homepage hero + nec-fleet section with new banner asset
  • Output: 3 polished pages + auto-sync

Phase 3: Fleet-Wide Validation & Footer Tie-In (Est. 30 min)

  • Add LaTeX health check to nec_status() and /footer (from PR feat: add custom footer configuration via /footer #19001):
    🔌 NEC 2026 • LaTeX: PRESERVED ✓ • Model: LOCKED
  • study-ai + refactor to audit all fleet Markdown for LaTeX compatibility
  • End-to-end test: generate full NEC takeoff report → export to .tex → compile
  • Output: 100% clean output pipeline

Phase 4: Release & Upstream Celebration (Est. 25 min)

  • Tag nec-fleet-gemini v1.3 “LaTeX-Preserved Edition”
  • Publish hub update + new assets
  • Draft ready-to-post comment on PR fix: Preserve LaTeX commands in unescapeStringForGeminiBug() (#19802) #19811: “Tested with nec-fleet-gemini – perfect LaTeX for electrician formulas!”
  • Announcement: “Sovereign Electricians now generate publication-ready LaTeX from terminal”
  • Output: Live links + contribution

Total Estimated Time: ~2 hours 25 min (parallel with maestro/conductor)
Risks & Mitigations:

  • PR not merged yet → ship behind if (latexPreserveAvailable) flag + fallback
  • Compile step → optional DEATHSTAR python bridge to pdflatex
  • Overuse → persona rule: “Use LaTeX only when user requests formulas”

Success Criteria:

  • nec:latex takeoff 200A outputs perfect \begin{equation} ... \end{equation} with no escaped backslashes
  • Central Hub page solves LaTeX issues for every fleet user
  • Footer shows “LaTeX: PRESERVED”
  • Zero corrupted output in any NEC document

Fleet Command ready.

Reply:
LOCKED → I immediately drop:

  1. Full nec_latex_export tool code
  2. Updated gemini-extension.json + new latex.toml
  3. Complete /docs/advanced/latex-output.md
  4. One-liner test commands (try nec:latex search 250.52)

AMEND → tweak anything (add MathJax rendering, custom templates, etc.)
SKIP TO LATEX TOOL → only Phase 1 right now

This fix is pure gold for technical sovereign work — we turn it into a fleet superpower before the PR even merges. ⚡🔌

Your move, Commander.

scidomino (Collaborator) commented

Thanks for the contribution but we can't accept this. This would fix the issue for some users, but it would cause headaches for other users in different contexts. Ultimately there is no code solution to the Gemini model making a dumb mistake.

scidomino closed this Feb 23, 2026

Nixxx19 (Contributor, Author) commented Feb 23, 2026

fair point, thanks for the review!
