
[Feature]integrate LazyLLM framework to support more text/image model#158

Merged
Anionex merged 39 commits into Anionex:main from uestcsh917-art:feature/integrate-sensetime-LazyLLM-framework
Feb 13, 2026

Conversation

Contributor

@uestcsh917-art uestcsh917-art commented Jan 12, 2026

Feature Overview

Adds LazyLLM framework integration, making it easy to call text/image generation models from different vendors.

1. Text generation (LazyllmTextProvider)

  • Supports text generation with mainstream LLMs such as Qwen, Deepseek, doubao, GLM, MINIMAX, and sensenova.
  • A unified LazyllmTextProvider interface automatically routes requests to the specified model and vendor, making it easy to switch and extend.

2. Image generation and editing (LazyllmImageProvider)

  • Integrates the LazyLLM framework's image generation and editing capabilities; currently supports mainstream image-editing models from vendors and platforms such as doubao, qwen, and siliconflow.
  • Compatible with multiple resolutions and aspect ratios; automatically handles API response formats for better robustness and usability.
  • Improved exception handling and logging make it easier to diagnose image-generation failures.

3. Image captioning

  • Integrates VLM models within the LazyLLM framework to process images in user-supplied reference files and generate text descriptions.

Implementation details

  1. Added lazyllm_provider under backend/services/ai_provider/text, which calls vendors' text generation models via the lazyllm package.
  2. Added lazyllm_provider under backend/services/ai_provider/image, which calls vendors' image-editing models via the lazyllm package.
  3. Added a LazyLLM-based VLM integration in backend/services/file_parser_service.py, plus the corresponding API import in backend/controllers/reference_file_controller.py.
  4. Added lazyllm and its related dependencies to pyproject.toml.
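As an illustration of the routing in point 1, here is a minimal sketch of what such a provider wrapper could look like. This is not the PR's actual code: `lazyllm.OnlineChatModule` with a `source` parameter is an assumption about the lazyllm API, and the vendor list is taken from this PR's description.

```python
from typing import Optional

# Vendors mentioned in this PR; the real provider may support more.
SUPPORTED_TEXT_SOURCES = {"qwen", "deepseek", "doubao", "glm", "minimax",
                          "sensenova", "siliconflow"}


class LazyllmTextProvider:
    """Hypothetical sketch of a provider that routes (source, model)
    pairs through the lazyllm framework."""

    def __init__(self, source: str, model: Optional[str] = None):
        if source not in SUPPORTED_TEXT_SOURCES:
            raise ValueError(f"unsupported LazyLLM text source: {source!r}")
        self.source, self.model = source, model
        self._client = None  # created lazily so import/key errors surface late

    def _get_client(self):
        if self._client is None:
            import lazyllm  # imported here so the dependency stays optional
            self._client = lazyllm.OnlineChatModule(source=self.source,
                                                    model=self.model)
        return self._client

    def generate(self, prompt: str) -> str:
        # lazyllm modules are callable; pass the prompt straight through.
        return self._get_client()(prompt)
```

Keeping the lazyllm import inside the lazy client constructor means the backend still starts when the optional dependency is not installed.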

Notes

The dependencies pillow>=12.0.0 and google-genai>=1.52.0 in pyproject.toml conflict with lazyllm's dependencies. To use lazyllm, change them to pillow<12.0.0,>=8.0.0 and google-genai<1.52.0, then run uv pip install '.[sdk]' to install lazyllm and the necessary SDKs.
The compatibility issue has since been fixed.
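The constraint changes described in the note above would look roughly like this in pyproject.toml (a sketch; the section layout and surrounding dependency list depend on the actual project file):

```toml
[project]
dependencies = [
    # relaxed from "pillow>=12.0.0" so it resolves with lazyllm's pins
    "pillow<12.0.0,>=8.0.0",
    # pinned below 1.52.0 for the same reason
    "google-genai<1.52.0",
]
```

followed by uv pip install '.[sdk]' to pull in lazyllm and the vendor SDKs.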

@gemini-code-assist
Contributor

Summary of Changes

Hello @uestcsh917-art, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request aims to significantly expand the system's support for various AI model providers by integrating the LazyLLM framework. The change lets users flexibly select and switch between different models for text generation, image generation, and image captioning, improving the system's versatility and extensibility. With a unified interface and simplified configuration, developers can more easily plug in advanced AI capabilities from different vendors, laying the groundwork for future iterations.

Highlights

  • LazyLLM framework integration: Introduces the LazyLLM framework to support a wider range of text and image generation models, improving the system's versatility and extensibility.
  • Multi-vendor text models: A unified LazyllmTextProvider interface supports text generation with mainstream LLMs including Qwen, Deepseek, doubao, GLM, MINIMAX, and sensenova.
  • Multi-vendor image models: Integrates the LazyLLM framework's image generation and editing capabilities; currently supports vendors such as doubao, qwen, and siliconflow.
  • Image captioning: Integrates VLM models within the LazyLLM framework to generate text descriptions from user-supplied reference images.
  • Configuration and dependency updates: Updates .env.example, README.md, and backend/config.py with LazyLLM-related configuration and API key settings, and adds lazyllm and its related SDK dependencies to pyproject.toml.
  • Dependency-conflict guidance: Documents how to resolve the pillow and google-genai dependency conflicts so LazyLLM installs and runs smoothly.


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request integrates the LazyLLM framework to support a wider range of text and image generation models, significantly enhancing the application's flexibility and capabilities. The changes include configuration updates, new provider implementations for text and image generation, and LazyLLM integration for image captioning in the file parser service. Documentation in README.md and .env.example has also been updated to reflect the new configuration options.

Comment on lines 86 to 87
self._lazyllm_api_keys = lazyllm_api_keys
self._lazyllm_image_caption_source = lazyllm_image_caption_source

high

__init__方法中,_lazyllm_api_keys被初始化为None。然而,在_get_lazyllm_client方法中,api_key = self._lazyllm_api_keys.get(source, "")这行代码直接访问了_lazyllm_api_keys.get()方法,如果_lazyllm_api_keysNone,则会引发AttributeError。建议在访问_lazyllm_api_keys之前添加一个None检查。

Suggested change
- self._lazyllm_api_keys = lazyllm_api_keys
+ self._lazyllm_api_keys = lazyllm_api_keys or {}
  self._lazyllm_image_caption_source = lazyllm_image_caption_source

Comment on lines 15 to 21
# import lazyllm
from typing import Optional, List
from PIL import Image
from .base import ImageProvider
from config import get_config
# from lazyllm.components.formatter import decode_query_with_filepaths
# from lazyllm import LOG

medium

The lazyllm-related imports at the top of the file are commented out, but they are re-imported inside a try-except block in __init__. This is redundant and potentially confusing. Consider deleting the commented-out imports and keeping only the ones inside the try-except block for clarity.

Suggested change
- # import lazyllm
  from typing import Optional, List
  from PIL import Image
  from .base import ImageProvider
  from config import get_config
- # from lazyllm.components.formatter import decode_query_with_filepaths
- # from lazyllm import LOG

Comment on lines 185 to 186
text_source = _get_config_value('LAZYLLM_TEXT_SOURCE', 'siliconflow')
image_source = _get_config_value('LAZYLLM_IMAGE_SOURCE', 'siliconflow')

medium

_get_provider_config方法中,LAZYLLM_TEXT_SOURCELAZYLLM_IMAGE_SOURCE的默认值被设置为'siliconflow'。然而,在.env.examplebackend/config.py中,这些变量的默认值分别是'deepseek''doubao'。这种默认值的不一致可能会导致配置上的混淆。建议将此处的默认值与.env.examplebackend/config.py中的默认值保持一致。

Suggested change
- text_source = _get_config_value('LAZYLLM_TEXT_SOURCE', 'siliconflow')
- image_source = _get_config_value('LAZYLLM_IMAGE_SOURCE', 'siliconflow')
+ text_source = _get_config_value('LAZYLLM_TEXT_SOURCE', 'deepseek')
+ image_source = _get_config_value('LAZYLLM_IMAGE_SOURCE', 'doubao')

README.md Outdated
Comment on lines 219 to 220
LAZYLLM_TEXT_SOURCE=deepseek # text generation vendor
LAZYLLM_IMAGE_SOURCE=doubao # image generation vendor

medium

The LazyLLM configuration example in README.md is missing documentation for LAZYLLM_IMAGE_CAPTION_SOURCE. Since .env.example includes this setting, consider adding it to README.md for completeness and clarity.

Suggested change
  LAZYLLM_TEXT_SOURCE=deepseek # text generation vendor
  LAZYLLM_IMAGE_SOURCE=doubao # image generation vendor
+ LAZYLLM_IMAGE_CAPTION_SOURCE=qwen # image caption vendor

@uestcsh917-art uestcsh917-art changed the title [Feature]integrate lazy llm framework to support more text/image model [Feature]integrate LazyLLM framework to support more text/image model Jan 13, 2026
@uestcsh917-art
Contributor Author

@Anionex Hi, I have synced the latest code for this PR and resolved all conflicts. The new feature makes it easy to integrate models from many different providers (including doubao, qwen, siliconflow, ...) via the LazyLLM framework. Could you please take some time to review it? I will make timely revisions if there are any issues.

@Anionex
Owner

Anionex commented Jan 23, 2026

Thanks🚀, I’ll review it soon.

@Anionex
Owner

Anionex commented Jan 29, 2026

/gemini review


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This PR successfully integrates the LazyLLM framework, extending support for various text and image models, which is a valuable enhancement. However, several issues require attention. Critical security vulnerabilities related to predictable temporary filenames, race conditions, and resource leakage were identified. Additionally, the review highlighted an uninitialized variable that could lead to runtime errors, inconsistencies in default configurations, unused imports and parameters, and hardcoded paths in demo scripts. Addressing these points will significantly improve the code's robustness, reliability, and maintainability.

Some reasoning models include <think>...</think> blocks in their output.
Add strip_think_tags() utility and apply it in all text providers and
image caption generation to clean these tags from responses.
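The utility described in this commit message can be sketched as a simple regex pass (the function name comes from the commit message above; the exact pattern and whitespace handling in the merged code may differ):

```python
import re

# Non-greedy match so multiple <think>...</think> blocks are each removed;
# DOTALL lets a block span newlines.
_THINK_RE = re.compile(r"<think>.*?</think>", re.DOTALL)


def strip_think_tags(text: str) -> str:
    """Strip <think>...</think> reasoning blocks that some reasoning
    models prepend to their answers, then trim leftover leading space."""
    return _THINK_RE.sub("", text).lstrip()
```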
Use the user's configured resolution from settings instead of
hardcoding 1K, which is unsupported by some models like seedream-4-5.
Update test description to say '固定分辨率' (fixed resolution) instead of '1K'.
Include vendor SDKs (dashscope for qwen, zhipuai for glm) in the
lazyllm optional dependency group so they install together with lazyllm.
All lazyllm vendor APIs (qwen/dashscope, doubao, siliconflow) expect
width*height format. Convert resolution (1K/2K/4K) + aspect ratio to
actual pixel dimensions instead of passing shorthand values.
When LazyLLM format is selected, hide irrelevant API Base URL / API Key
fields and show vendor source dropdowns (text/image/caption model) with
dynamic per-vendor API key inputs. Includes backend model fields,
migration, controller sync logic, and i18n support.
@Anionex Anionex force-pushed the feature/integrate-sensetime-LazyLLM-framework branch from ff15ccf to 51512eb on February 13, 2026 13:45
Doubao API requires 'WIDTHxHEIGHT' format (with 'x'), not 'WIDTH*HEIGHT'.
Some models (doubao-seedream) require >= 3686400 total pixels.
Scale up dimensions when needed and round up to 64-pixel alignment.
- qwen: uses '*' separator, max 2048px per dimension
- doubao: uses 'x' separator, min 3686400 total pixels
- Scale up for doubao minimum, cap for qwen maximum
@Anionex
Owner

Anionex commented Feb 13, 2026

/gemini review


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces support for the LazyLLM framework as a new AI provider option, alongside existing OpenAI and Gemini formats. Key changes include adding LazyLLM-specific configuration fields for text, image, and image caption model sources, as well as vendor API keys, to the .env.example file, backend configuration, and database settings. The backend now includes new LazyLLMTextProvider and LazyLLMImageProvider classes, along with a lazyllm_env.py utility for API key management. The FileParserService has been updated to support image captioning via LazyLLM, and a strip_think_tags utility was added to clean AI responses across all text providers. The README.md and README_EN.md files were updated to reflect the new configuration options and revised Docker logging commands. The frontend Settings page was modified to include UI elements for selecting LazyLLM as a provider and configuring its specific model sources and API keys.

Review comments highlighted a potential environment variable injection vulnerability in _sync_settings_to_config due to unvalidated vendor strings, suggested removing sudo from docker compose commands in the documentation for broader applicability, recommended refactoring complex image dimension calculation logic in LazyLLMImageProvider into a separate helper method, and advised reusing the strip_think_tags function in _generate_single_caption to avoid code duplication.

1. Security: Add whitelist for lazyllm vendor names to prevent
   environment variable injection via arbitrary vendor strings

2. Refactor: Extract image dimension calculation into a standalone
   _calculate_image_dimensions() function with clear vendor constraints
   documented in VENDOR_IMAGE_CONSTRAINTS dict

3. DRY: Reuse strip_think_tags() from text providers module instead
   of duplicating regex logic in file_parser_service
@Anionex
Owner

Anionex commented Feb 13, 2026

Follow-up improvements

The following fixes and enhancements were made on top of the original PR:

🔧 Bug fixes

  1. Image resolution compatibility - vendors expect different formats for the size parameter:

    • qwen: uses the * separator, sides limited to 512-2048px
    • doubao: uses the x separator, requires at least 3,686,400 total pixels
    • Refactored into a _calculate_image_dimensions() function that adapts automatically per vendor
  2. Stripping <think> tags - some reasoning models (such as DeepSeek) include <think>...</think> blocks in their output; a strip_think_tags() cleanup has been added to all text providers

  3. Test configuration - image model tests now use the user-configured resolution instead of a hardcoded 1K, which some models do not support

🔒 Security fixes

  1. Environment variable injection protection - added an ALLOWED_LAZYLLM_VENDORS whitelist to prevent arbitrary vendor names from being used to overwrite system environment variables
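A minimal sketch of such a whitelist guard (the constant name comes from the summary above; the helper name, the LAZYLLM_<VENDOR>_API_KEY naming scheme, and the vendor list are illustrative assumptions):

```python
import os

# Vendors mentioned in this PR; the merged whitelist may differ.
ALLOWED_LAZYLLM_VENDORS = {
    "qwen", "deepseek", "doubao", "glm", "minimax", "sensenova", "siliconflow",
}


def set_vendor_api_key(vendor: str, api_key: str) -> None:
    """Export a vendor API key into the environment, rejecting vendor
    names outside the whitelist so a crafted vendor string cannot be
    used to overwrite arbitrary environment variables."""
    vendor = vendor.lower()
    if vendor not in ALLOWED_LAZYLLM_VENDORS:
        raise ValueError(f"unsupported LazyLLM vendor: {vendor!r}")
    os.environ[f"LAZYLLM_{vendor.upper()}_API_KEY"] = api_key
```

Validating against a fixed set, rather than sanitizing the string, keeps the attack surface to exactly the environment variables the application intends to manage.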

📦 Dependencies

  1. Vendor SDKs - added dashscope and zhipuai to the lazyllm optional dependency group, so uv sync --extra lazyllm installs everything in one step

✅ Test verification

  • qwen-image-edit-max: 2048*1152
  • doubao-seedream-4-5: 2560x1472

@Anionex Anionex merged commit 7b8d583 into Anionex:main Feb 13, 2026
2 checks passed