Skip to content

fix(core): add retry logic for transient SSL/TLS errors (#17318)#18310

Merged
sehoon38 merged 3 commits intogoogle-gemini:mainfrom
ppgranger:fix/ssl-error-retry-17318
Feb 5, 2026
Merged

fix(core): add retry logic for transient SSL/TLS errors (#17318)#18310
sehoon38 merged 3 commits intogoogle-gemini:mainfrom
ppgranger:fix/ssl-error-retry-17318

Conversation

@ppgranger
Copy link
Copy Markdown
Contributor

Summary

Add automatic retry for transient SSL errors like ERR_SSL_SSLV3_ALERT_BAD_RECORD_MAC that can occur during long sessions, preventing unnecessary crashes.

Previously, a transient SSL error would crash the entire CLI, forcing users to manually restart their session even though a simple retry would succeed
immediately.

Details

Changes

  • Add SSL error codes to RETRYABLE_NETWORK_CODES (retry.ts):

    • ERR_SSL_SSLV3_ALERT_BAD_RECORD_MAC
    • ERR_SSL_WRONG_VERSION_NUMBER
    • ERR_SSL_DECRYPTION_FAILED_OR_BAD_RECORD_MAC
    • ERR_SSL_BAD_RECORD_MAC
    • EPROTO (generic protocol error)
  • Improve error code extraction (retry.ts):

    • Traverse nested cause chains up to 5 levels deep
    • SSL errors from OpenSSL are often wrapped in multiple error layers
  • Allow retry during connection phase (geminiChat.ts):

    • Previously, connection phase errors were always thrown immediately
    • Now, retryable errors trigger the retry logic with exponential backoff

Test Coverage

Added 9 new tests covering:

  • SSL error codes recognition
  • Deeply nested SSL errors in cause chains
  • Connection phase SSL errors
  • Mid-stream SSL errors (exact scenario from the issue)
  • Gaxios-style error structure

Related Issues

Fixes #17318

How to Validate

  1. Run the new tests:
    npm test -w @google/gemini-cli-core -- --testNamePattern="SSL|GeminiChat Network Retries"
    
  2. Verify all tests pass (17 tests in the network retry suite)
  3. Build and lint:
    npm run build && npm run lint
  4. (Optional) To manually test, you could simulate an SSL error during a long session and verify the CLI retries automatically instead of crashing.

Pre-Merge Checklist

  • Updated relevant documentation and README (if needed)
  • Added/updated tests (if needed)
  • Noted breaking changes (if any)
  • Validated on required platforms/methods:
    • MacOS
      • npm run
      • npx
      • Docker
      • Podman
      • Seatbelt
    • Windows
      • npm run
      • npx
      • Docker
    • Linux
      • npm run
      • npx
      • Docker

…i#17318)

Add automatic retry for transient SSL errors like ERR_SSL_SSLV3_ALERT_BAD_RECORD_MAC
that can occur during long sessions, preventing unnecessary crashes.

Changes:
- Add SSL error codes to RETRYABLE_NETWORK_CODES list
- Improve error code extraction to traverse nested cause chains
- Allow retry of retryable errors during connection phase
- Add comprehensive tests for SSL error scenarios
@ppgranger ppgranger requested a review from a team as a code owner February 4, 2026 15:49
@gemini-code-assist
Copy link
Copy Markdown
Contributor

Summary of Changes

Hello @ppgranger, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the robustness of the core CLI by implementing automatic retry logic for transient SSL/TLS and network errors. Previously, such errors would cause the application to crash, requiring manual restarts. With these changes, the system can now gracefully recover from temporary network instabilities, particularly during long-running sessions, by automatically retrying failed operations.

Highlights

  • Enhanced SSL Error Handling: Added several transient SSL/TLS error codes (e.g., "ERR_SSL_SSLV3_ALERT_BAD_RECORD_MAC", "EPROTO") to the list of retryable network errors.
  • Deep Error Cause Traversal: Improved the error code extraction logic to traverse up to 5 levels deep into error "cause" chains, ensuring nested SSL errors are correctly identified for retry.
  • Connection Phase Retry: Enabled the retry mechanism to handle retryable errors that occur during the initial connection phase, preventing immediate crashes for transient issues.
Changelog
  • packages/core/src/core/geminiChat.ts
    • Modified the GeminiChat class to apply retry logic to errors encountered during the connection phase, specifically checking if they are retryable SSL/TLS errors.
  • packages/core/src/core/geminiChat_network_retry.test.ts
    • Added comprehensive tests to cover retry scenarios for SSL errors during the connection phase, ECONNRESET errors, and mid-stream SSL failures, ensuring the new retry logic functions as expected.
  • packages/core/src/utils/retry.test.ts
    • Introduced new tests to validate the retry mechanism's ability to handle various SSL error codes, including those deeply nested within error cause chains and gaxios-style error structures.
  • packages/core/src/utils/retry.ts
    • Expanded the RETRYABLE_NETWORK_CODES list to include specific transient SSL/TLS error codes like ERR_SSL_SSLV3_ALERT_BAD_RECORD_MAC and EPROTO.
    • Updated the getNetworkErrorCode function to recursively search for error codes within an error's cause property, up to a depth of 5, to correctly identify wrapped SSL errors.
Activity
  • The author added 9 new tests to cover various SSL error recognition scenarios, including deeply nested errors, connection phase errors, mid-stream errors, and gaxios-style error structures.
  • The pull request includes validation on MacOS and Linux platforms, confirming functionality across different environments.
  • Tests were updated and validated as part of the pre-merge checklist.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces robust retry logic for transient SSL/TLS errors, preventing unnecessary crashes during long-lived CLI sessions. The changes are well-implemented by expanding the list of retryable network codes and improving error inspection to traverse nested cause chains. The core chat logic is correctly modified to trigger retries for these errors even during the initial connection phase. The accompanying tests are comprehensive, covering various SSL error scenarios, including connection-phase failures and deeply nested error objects. Overall, this is an excellent contribution that significantly improves the resilience of the application.

@gemini-cli gemini-cli bot added priority/p1 Important and should be addressed in the near term. area/core Issues related to User Interface, OS Support, Core Functionality 🔒 maintainer only ⛔ Do not contribute. Internal roadmap item. help wanted We will accept PRs from all issues marked as "help wanted". Thanks for your support! labels Feb 4, 2026
Copy link
Copy Markdown
Contributor

@sehoon38 sehoon38 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just a minor nit, lgtm overall. thanks for the contribution!

@ppgranger
Copy link
Copy Markdown
Contributor Author

@sehoon38 Ready to merge :).

@sehoon38 sehoon38 enabled auto-merge February 5, 2026 15:37
@sehoon38 sehoon38 added this pull request to the merge queue Feb 5, 2026
Merged via the queue into google-gemini:main with commit e3b8490 Feb 5, 2026
26 checks passed
@ppgranger ppgranger deleted the fix/ssl-error-retry-17318 branch February 5, 2026 16:33
sidwan02 pushed a commit to sidwan02/gemini-cli-gemma that referenced this pull request Feb 6, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/core Issues related to User Interface, OS Support, Core Functionality help wanted We will accept PRs from all issues marked as "help wanted". Thanks for your support! 🔒 maintainer only ⛔ Do not contribute. Internal roadmap item. priority/p1 Important and should be addressed in the near term.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: Unhandled ERR_SSL_SSLV3_ALERT_BAD_RECORD_MAC causes crash during long sessions

2 participants