support OnlineEmbeddingModuleBase batch input by wangtianxiong-sensetime · Pull Request #725 · LazyAGI/LazyLLM

wangtianxiong-sensetime · 2025-08-26T02:24:03Z

完善OnlineEmbeddingModuleBase解析结果时固定取0造成不支持batch输入的现状

根据input类型输出 fix flake8 fix onlineEmbed支持batch输入 fix _parse_response fix

wzh1994 · 2025-08-26T10:25:02Z

完善一下测例

ChenJiahaoST

commented
另外：

是否需要一起修改一下本地部署的embed？这块如果需要，辛苦整体修改一下
正确填写pr description模板内容
考虑一下如果需要list的场景，额外增加一个入参batch（默认64），以免请求体过大导致一些场景oom。
测例补充一下batch的场景

lazyllm/module/llms/onlinemodule/supplier/glm.py

ChenJiahaoST · 2025-08-26T12:11:26Z

lazyllm/module/llms/onlinemodule/base/onlineEmbeddingModuleBase.py

+        if isinstance(input, str):
+            return response['data'][0]['embedding']
+        else:
+            return [res['embedding'] for res in response['data']]


建议使用.get、预先判断列表长度等，增加_parse_response的健壮性。更多的，比如验证列表向量长度是否与input一致

ChenJiahaoST · 2025-08-26T12:13:58Z

lazyllm/module/llms/onlinemodule/base/onlineEmbeddingModuleBase.py


-    def _parse_response(self, response: Dict[str, Any]) -> List[float]:
-        return response['data'][0]['embedding']
+    def _parse_response(self, response: Dict[str, Any], input: Union[List, str]) -> List[float]:


-> List[float]
按照实际类型改对

ChenJiahaoST · 2025-08-26T12:16:50Z

lazyllm/module/llms/onlinemodule/supplier/qwen.py


-    def _parse_response(self, response: Dict[str, Any]) -> List[float]:
-        return response['output']['embeddings'][0]['embedding']
+    def _parse_response(self, response: Dict[str, Any], input: Union[List, str]) -> List[float]:


same，顺便优化一下取值逻辑，尽量用get，增加鲁棒性

ChenJiahaoST · 2025-08-26T12:17:09Z

lazyllm/module/llms/onlinemodule/supplier/sensenova.py


-    def _parse_response(self, response: Dict[str, Any]) -> List[float]:
-        return response['embeddings'][0]['embedding']
+    def _parse_response(self, response: Dict[str, Any], input: Union[List, str]) -> List[float]:


ChenJiahaoST · 2025-08-27T01:34:37Z

lazyllm/module/llms/onlinemodule/supplier/doubao.py

        return json_data

-    def _parse_response(self, response: Dict[str, Any]) -> List[float]:
+    def _parse_response(self, response: Dict[str, Any], input: Union[List, str]) -> List[float]:


这里是否也需要支持多模态的列表输入？

豆包输入限制最多为1段文本+1张图片

ChenJiahaoST · 2025-08-27T01:35:24Z

lazyllm/module/llms/onlinemodule/supplier/qwen.py

        return json_data

-    def _parse_response(self, response: Dict[str, Any]) -> List[float]:
+    def _parse_response(self, response: Dict[str, Any], input: Union[List, str]) -> List[float]:


same，rerank是否需要修改

ChenJiahaoST

批处理和并行放在base实现

ChenJiahaoST · 2025-08-27T11:37:57Z

lazyllm/module/llms/onlinemodule/base/onlineEmbeddingModuleBase.py

+            return []
        if isinstance(input, str):
-            return response['data'][0]['embedding']
+            return data[0].get("embedding", [])


单引号优先

ChenJiahaoST · 2025-08-27T11:38:35Z

lazyllm/module/llms/onlinemodule/base/onlineEmbeddingModuleBase.py

+    def _parse_response(self, response: Dict, input: Union[List, str]) -> Union[List[List[float]], List[float]]:
+        data = response.get("data", [])
+        if not data:
+            return []


data没有元素是否是异常？raise？

ChenJiahaoST · 2025-08-27T11:40:54Z

lazyllm/module/llms/onlinemodule/base/onlineEmbeddingModuleBase.py

batchsize、num_workers在基类中添加

ChenJiahaoST · 2025-08-27T12:08:54Z

lazyllm/module/llms/onlinemodule/base/onlineEmbeddingModuleBase.py

+                else:
+                    raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))
+
+    def run_embed_batch(self, input: Union[List, str], data: List, proxies):


如沟通，run_embed_batch增加一下超过上限后的动态降批次机制，可参考flagembed的实现机制

ChenJiahaoST

测例跑通
测例中添加一下考虑降批次、并行/非并行的情况

ChenJiahaoST · 2025-09-01T02:26:23Z

lazyllm/module/llms/onlinemodule/base/onlineEmbeddingModuleBase.py

        }

-    def forward(self, input: Union[List, str], **kwargs) -> List[float]:
+    def forward(self, input: Union[List, str], **kwargs):


forward还是加一下返回类型吧

ChenJiahaoST · 2025-09-01T02:28:50Z

lazyllm/module/llms/onlinemodule/base/onlineEmbeddingModuleBase.py

-        with requests.post(self._embed_url, json=data, headers=self._headers, proxies=proxies) as r:
-            if r.status_code == 200:
-                return self._parse_response(r.json())
+        if isinstance(data, List):


判断类型尽量不使用typing， if isinstance(data, list)

ChenJiahaoST · 2025-09-01T02:38:51Z

lazyllm/module/llms/onlinemodule/base/onlineEmbeddingModuleBase.py


-    def _parse_response(self, response: Dict[str, Any]) -> List[float]:
-        return response['data'][0]['embedding']
+    def run_embed_batch_parallel(self, input: Union[List, str], data: List, proxies, **kwargs):


我想了一下，看是否还是把并行和非并行提取一下公共逻辑，比如：

if workers == 1: with requests.Session() as sess: for start, chunk in chunks: embeds = self._send_chunk_with_autoshrink(sess, chunk, proxies, **kwargs) results[start:start+len(embeds)] = embeds else: with ThreadPoolExecutor(max_workers=workers) as ex: futs = [ex.submit(self._send_chunk_with_autoshrink, None, chunk, proxies, **kwargs) for _, chunk in chunks] # 用 as_completed 获取完成结果，并按 start 索引回填 for (start, _), fut in zip(chunks, as_completed(futs)): embeds = fut.result() results[start:start+len(embeds)] = embeds

_send_chunk_with_autoshrink中包含当前批次内部降批次的逻辑：

def _send_chunk_with_autoshrink(self, sess: requests.Session, chunk: List[str], proxies, **kwargs) -> List[List[float]]: # 对单个 micro-batch 发送；若报“批太大/体积过大”，减半重试 cur = list(chunk) local_sess = sess or requests.Session() while True: try: resp = self._post_once(local_sess, self._build_payload(cur, **kwargs), proxies) out = self._parse_response(resp, cur) # 此处传“当前批”，不是整包 return out # List[List[float]] except requests.HTTPError as e: code = getattr(e.response, "status_code", None) text = (getattr(e.response, "text", "") or "").lower() too_big = code in (413, 422) or ("too many" in text) or ("max" in text and "input" in text) if too_big and len(cur) > 1: cur = cur[:max(1, len(cur)//2)] continue # 401/其他错误：直接抛 raise

从而不直接修改self._batch_size

ChenJiahaoST · 2025-09-01T02:40:14Z

lazyllm/module/llms/onlinemodule/supplier/openai.py

+        if isinstance(input, str):
+            return data[0].get('embedding', [])
+        else:
+            return [res.get('embedding', []) for res in data]


最后空一行

This reverts commit f9c0928.

fix flake8

ChenJiahaoST

修复跳出的bug，完善重试机制
完善请求最好加一下timeout

ChenJiahaoST · 2025-09-02T07:32:23Z

lazyllm/module/llms/onlinemodule/base/onlineEmbeddingModuleBase.py

+                                self._batch_size = max(self._batch_size // 2, 1)
+                                data = self._encapsulated_data(input, **kwargs)
+                                break
+                    break


降批次后，break跳出接一个break，是否逻辑出错？直接跳出来了

ChenJiahaoST · 2025-09-02T07:57:01Z

lazyllm/module/llms/onlinemodule/base/onlineEmbeddingModuleBase.py

+                            vec = self._parse_response(r.json(), input)
+                            start = i * self._batch_size
+                            end = start + len(vec)
+                            ret[start:end] = vec


ret[start: start + len(vec)] = vec

wangtianxiong added 2 commits August 26, 2025 10:14

fix OnlineEmbeddingModuleBase解析结果固定取0丢失向量

de00ca3

根据input类型输出 fix flake8 fix onlineEmbed支持batch输入 fix _parse_response fix

修正flake8 warning

dd9db30

wzh1994 requested a review from ChenJiahaoST August 26, 2025 03:55

ChenJiahaoST requested changes Aug 26, 2025

View reviewed changes

ChenJiahaoST reviewed Aug 27, 2025

View reviewed changes

wangtianxiong-sensetime changed the title ~~支持OnlineEmbeddingModuleBase batch输入~~ support OnlineEmbeddingModuleBase batch input Aug 27, 2025

onlineEmbedding支持batch输入

67dd16a

ChenJiahaoST reviewed Aug 27, 2025

View reviewed changes

wangtianxiong added 6 commits August 28, 2025 10:56

embed失败后动态调整batch_size

0994169

move batch_size to OnlineEmbeddingModuleBase

a2d6f79

raise when received is empty

eb19195

增加onlineEmbedModuleBase并行post

fec4259

数据前后处理由子类单独实现

86dd8bb

改用abstractmethod

9865794

ChenJiahaoST reviewed Sep 1, 2025

View reviewed changes

wangtianxiong added 4 commits September 2, 2025 10:09

优化run_embed_batch逻辑，实现基础_encapsulated_data和_parse_response

f9c0928

Revert "优化run_embed_batch逻辑，实现基础_encapsulated_data和_parse_response"

70ef8b6

This reverts commit f9c0928.

优化run_embed_batch逻辑，实现基础_encapsulated_data和_parse_response

14e1bd0

fix flake8

返回429抛异常

1dcc928

wangtianxiong-sensetime force-pushed the fix_onlineEmbeddingModule_parse_response branch from 844fd7b to 1dcc928 Compare September 2, 2025 07:25

修复run_embed_batch逻辑

81e9323

ChenJiahaoST reviewed Sep 2, 2025

View reviewed changes

add timeout

bb4d586

ChenJiahaoST approved these changes Sep 3, 2025

View reviewed changes

Merge branch 'main' into fix_onlineEmbeddingModule_parse_response

54f7168

mergify bot added the lint_pass label Sep 5, 2025

wzh1994 merged commit a3db624 into LazyAGI:main Sep 5, 2025
19 of 23 checks passed

Conversation

wangtianxiong-sensetime commented Aug 26, 2025

Uh oh!

wzh1994 commented Aug 26, 2025

Uh oh!

ChenJiahaoST left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

ChenJiahaoST Aug 26, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ChenJiahaoST left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ChenJiahaoST left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ChenJiahaoST left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

ChenJiahaoST Aug 26, 2025 •

edited

Loading