[Models] add fleet model fallback by xiaoguoguo626807 · Pull Request #7732 · PaddlePaddle/FastDeploy

xiaoguoguo626807 · 2026-05-07T07:07:32Z

Motivation

新增 PaddleFleet 作为模型推理后端（--model-impl paddlefleet），通过将 PaddleFleet TransformerLayer 中的 core_attention 替换为 FastDeploy Attention 内核，实现在 PaddleFleet 模型结构上复用 FastDeploy 的 KV Cache 和高性能 Attention 计算。

Modifications

config.py: 新增 paddlefleet 到 ModelImpl 类型定义
engine/args_utils.py: 支持 --model-impl paddlefleet CLI 参数，并补充校验逻辑
model_executor/models/paddleformers/base_fleet.py: 新增 PaddleFleetModelBase 基类、FastDeployAttention 层及 patch_paddlefleet_core_attention 替换函数
model_executor/models/paddleformers/__init__.py: 注册 PaddleFleetForCausalLM 模型类

Usage or Command

python -m fastdeploy.entrypoints.openai.api_server \
    --model /path/to/model \
    --model-impl paddlefleet

Accuracy Tests

N/A（本 PR 新增 PaddleFleet 推理后端，尚未提供与参考实现的 logits 对齐数据）

Checklist

Add at least a tag in the PR title.
- Tag list: [[FDConfig],[APIServer],[Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]]
- You can add new tags based on the PR content, but the semantics must be clear.
Format your code, run pre-commit before commit.
Add unit tests. Please write the reason in this PR if no unit tests.
Provide accuracy results.
If the current PR is submitting to the release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.

paddle-bot · 2026-05-07T07:07:39Z

Thanks for your contribution!

add fleet fallback

6898863

xiaoguoguo626807 had a problem deploying to Metax_ci May 7, 2026 07:07 — with GitHub Actions Failure

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Models] add fleet model fallback#7732

[Models] add fleet model fallback#7732
xiaoguoguo626807 wants to merge 1 commit intoPaddlePaddle:developfrom
xiaoguoguo626807:fleet

xiaoguoguo626807 commented May 7, 2026 •

edited

Loading

Uh oh!

paddle-bot Bot commented May 7, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

xiaoguoguo626807 commented May 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Motivation

Modifications

Usage or Command

Accuracy Tests

Checklist

Uh oh!

paddle-bot Bot commented May 7, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

xiaoguoguo626807 commented May 7, 2026 •

edited

Loading