@XiaohanA2

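Problem

The current code fails to parse AI responses in which the JSON is accompanied by explanatory text. AI models often wrap the JSON in markdown code block markers or add explanatory text before or after it (e.g. "Here is the JSON:").

Solution

Added intelligent JSON extraction:

  • Automatically detects and extracts the JSON portion of the text
  • Supports markdown code block format
  • Removes common AI-generated prefix text
  • Maintains backward compatibility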
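
A minimal sketch of the markdown code block handling listed above, assuming the response contains at most one fenced block. The helper name `strip_markdown_fence` and the regular expression are illustrative and are not taken from the PR:

```python
import re

# The fence marker is built at runtime so this example can itself sit inside a fenced block.
FENCE = "`" * 3

# Hypothetical pattern: an opening fence with an optional "json" tag, the body, a closing fence.
_FENCE_RE = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL | re.IGNORECASE)


def strip_markdown_fence(text: str) -> str:
    """Return the body of the first fenced code block, or the stripped text if none exists."""
    match = _FENCE_RE.search(text)
    return match.group(1).strip() if match else text.strip()
```

Because the search is not anchored to the start of the text, a prefix such as "Here is the JSON:" before the fence is dropped along with the fence markers. Text with no fence is returned unchanged apart from surrounding whitespace.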

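Changes

  • backend/services/ai_service.py: added the _extract_json_from_text() method
    • Uses a bracket-matching algorithm to extract the complete JSON
    • Correctly handles nested structures and escape characters
    • Works with both pure JSON responses and mixed-text responses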
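
The bracket-matching approach described above could look roughly like the sketch below. This is not the PR's actual `_extract_json_from_text()` body, which is not shown in this conversation; the function name `extract_json_span` is hypothetical. It assumes the first `[` or `{` in the text opens the JSON, so prose that contains a bracket before the JSON would defeat it.

```python
from typing import Optional


def extract_json_span(text: str) -> Optional[str]:
    """Return the first balanced {...} or [...] span in text, or None if there is none."""
    start = next((i for i, ch in enumerate(text) if ch in "[{"), None)
    if start is None:
        return None

    closers = {"{": "}", "[": "]"}
    stack = []          # closing brackets we still expect, innermost last
    in_string = False   # True while inside a JSON string literal
    escaped = False     # True right after a backslash inside a string

    for i in range(start, len(text)):
        ch = text[i]
        if in_string:
            if escaped:
                escaped = False      # this character is escaped; ignore it
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            continue
        if ch == '"':
            in_string = True
        elif ch in closers:
            stack.append(closers[ch])
        elif ch in "}]":
            if not stack or stack.pop() != ch:
                return None          # mismatched bracket
            if not stack:
                return text[start:i + 1]
    return None                      # ran out of text before the brackets balanced
```

Tracking string state is what lets brackets and escaped quotes inside string values pass through without disturbing the depth count.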

Added intelligent JSON extraction to handle AI responses that include explanatory text before or after the JSON. This fixes issues where JSON parsing would fail due to markdown code blocks or AI-generated prefixes like "Here is the JSON:".

Changes:
- Added `_extract_json_from_text()` method to intelligently extract JSON from mixed text
- Improved bracket matching to find valid JSON boundaries
- Handles common AI response patterns (explanatory text, markdown blocks)
- Maintains backward compatibility with clean JSON responses
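
One way to keep clean JSON responses unaffected is to try a direct parse first and fall back to extraction only on failure. The sketch below swaps in the standard library's `json.JSONDecoder.raw_decode` for the fallback instead of the PR's bracket matching, and the name `parse_ai_json` is hypothetical:

```python
import json


def parse_ai_json(text: str):
    """Parse clean JSON directly; otherwise return the first decodable object or array."""
    try:
        return json.loads(text)  # fast path: pure JSON responses behave as before
    except json.JSONDecodeError:
        pass

    decoder = json.JSONDecoder()
    for i, ch in enumerate(text):
        if ch in "[{":
            try:
                # raw_decode parses one JSON value starting at index i and ignores what follows.
                obj, _end = decoder.raw_decode(text, i)
                return obj
            except json.JSONDecodeError:
                continue  # this bracket did not start valid JSON; try the next one
    raise ValueError("no JSON object or array found in text")
```

Letting the decoder validate each candidate start position means stray brackets in the surrounding prose are skipped, at the cost of repeated parse attempts on long inputs.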

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@gemini-code-assist
Contributor

Summary of Changes

Hello @XiaohanA2, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

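This pull request aims to make the AI service more robust when handling JSON responses. It introduces an intelligent JSON extraction mechanism that addresses parsing failures caused by AI models adding explanatory text or Markdown code block markers around the JSON they return. With this change the system can obtain structured data from AI responses more reliably, improving service stability and user experience.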

Highlights

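  • JSON parsing fix: Resolves parsing failures when the AI service receives JSON responses that contain explanatory text or Markdown code block markers.
  • Intelligent JSON extraction: Introduces a new _extract_json_from_text method that identifies and extracts JSON content from mixed text, supports Markdown format, and removes common AI prefixes.
  • Improved robustness: The extraction logic uses a bracket-matching algorithm that correctly handles nested structures and escape characters, raising the JSON parsing success rate and system stability.
  • Backward compatibility: The new feature remains backward compatible with pure JSON responses, so existing functionality is unaffected.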

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

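This pull request introduces intelligent JSON extraction, which resolves the parsing failures that occur when the AI service returns JSON mixed with explanatory text. The new _extract_json_from_text method identifies and extracts the JSON portion of the text, supports markdown code block format, removes common AI-generated prefixes, and maintains backward compatibility. This is a valuable improvement that makes the system more robust.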

        Returns:
            The extracted JSON string
        """
        import re

medium

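Please move the import re statement to the top of the file instead of keeping it inside the function. An import inside a function is executed again on every call. Python caches the module after the first load, so each later call only pays for a cache lookup and a name binding, but that is still unnecessary overhead, especially if _extract_json_from_text is called frequently.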

import re
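
As a small illustration of the function-level import discussed in this comment, the sketch below puts the two placements side by side. The function names are hypothetical and the whitespace normalization is only a stand-in for real work. Both variants behave identically; the difference is that the local `import re` statement runs on every call, where it is served from the `sys.modules` cache.

```python
import re
import sys


def normalize_local(text: str) -> str:
    import re  # runs on every call; after the first load it is served from sys.modules
    return re.sub(r"\s+", " ", text).strip()


def normalize_top(text: str) -> str:
    # Relies on the module-level import at the top of the file, as PEP 8 recommends.
    return re.sub(r"\s+", " ", text).strip()
```

The per-call cost of the local import is small, so the stronger argument for the change is the PEP 8 convention of keeping imports at the top of the file.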

Comment on lines 206 to 209
        if text.startswith('['):
            start_idx = 0
        elif text.startswith('{'):
            start_idx = 0

medium

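The if text.startswith('[') and elif text.startswith('{') checks here are redundant. The for loop below naturally handles text that starts with [ or { and correctly sets start_idx to 0. Removing these redundant checks makes the code more concise.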

        # Find the first [ or {
        for i, char in enumerate(text):
            if char in '[{':
                start_idx = i
                break
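
To check that the loop alone covers the cases the removed `startswith` branches handled, the suggested loop can be wrapped in a small standalone function. The wrapper name `find_json_start` and the `-1` sentinel for "not found" are assumptions made for this sketch; the initial value of `start_idx` in the PR is not shown here.

```python
def find_json_start(text: str) -> int:
    """Return the index of the first [ or { in text, or -1 if neither occurs."""
    start_idx = -1  # assumed sentinel for "no bracket found"
    # Find the first [ or {
    for i, char in enumerate(text):
        if char in '[{':
            start_idx = i
            break
    return start_idx
```

When the text already begins with `[` or `{`, the loop matches on its first iteration and yields index 0, which is the value the explicit checks assigned.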

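Suggested improvements:
- Remove the duplicate `import re` inside the function (it is already imported at the top of the file)
- Remove the redundant `startswith` checks and let the loop handle those cases

These optimizations:
1. Follow the Python PEP 8 convention (all imports at the top of the file)
2. Improve code readability and conciseness
3. Avoid unnecessary performance overhead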

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>