
Conversation

Contributor

@Wangzheee Wangzheee commented Dec 1, 2023

PR types

Others

PR changes

Others

Description

Pcard-71503
Paddle-TRT supports sub-blocks: while op, conditional_block op:

Execution logic:

  1. The Paddle Inference executor can correctly invoke the while op and the conditional_block op, and can correctly invoke the tensorrt_engine op inside their sub-blocks;
  2. Paddle-TRT runs the designated passes inside the sub-block (of a while op or conditional_block op) and maps the ops in the sub-block to TensorRT layers and plugins (see the sketch after this list);

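To make the control flow above concrete, here is a minimal self-contained C++ sketch of the idea. The Op/Block types, the allowlist, and RunTRTSubgraphPass are illustrative stand-ins, not Paddle's actual IR or pass API:

```cpp
// Illustrative sketch only: Op and Block are simplified stand-ins, not Paddle's
// IR classes. It mimics the described flow: run the TensorRT subgraph pass on a
// block, recursing into the sub-blocks owned by while / conditional_block ops,
// and fuse convertible ops into tensorrt_engine ops.
#include <iostream>
#include <memory>
#include <set>
#include <string>
#include <vector>

struct Block;

struct Op {
  std::string type;                  // e.g. "while", "conditional_block", "matmul"
  std::shared_ptr<Block> sub_block;  // non-null only for control-flow ops
};

struct Block {
  int id = 0;
  std::vector<Op> ops;
};

bool ConvertibleToTRT(const Op& op) {
  // Toy stand-in for Paddle-TRT's converter check: a fixed allowlist of op types.
  static const std::set<std::string> kSupported = {"matmul", "relu", "conv2d"};
  return kSupported.count(op.type) > 0;
}

void RunTRTSubgraphPass(Block* block) {
  int converted = 0;
  for (auto& op : block->ops) {
    if (op.sub_block) {
      // Recurse so while / conditional_block sub-blocks also get their
      // convertible ops turned into tensorrt_engine ops.
      RunTRTSubgraphPass(op.sub_block.get());
    } else if (ConvertibleToTRT(op)) {
      op.type = "tensorrt_engine";  // stands in for building a real TRT engine
      ++converted;
    }
  }
  std::cout << "block " << block->id << ": " << converted
            << " op(s) mapped to tensorrt_engine\n";
}

int main() {
  auto sub = std::make_shared<Block>();
  sub->id = 1;
  sub->ops = {{"matmul", nullptr}, {"relu", nullptr}};

  Block main_block;
  main_block.id = 0;
  main_block.ops = {{"feed", nullptr}, {"while", sub}, {"fetch", nullptr}};

  RunTRTSubgraphPass(&main_block);  // prints per-block conversion counts
  return 0;
}
```
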
Benefits:

  1. For inference optimization and deployment of generative models and models using conditional_block, this provides an inference solution that is more general, higher-performance, more flexible, lower in engineering cost, and less bug-prone;
  2. Paddle Inference's IR performs general graph optimization on the model, and the TensorRT-subgraph + native-inference mode serves as a fallback for cases TensorRT does not support;
  3. Inside the sub-block, all TensorRT layers and the engine's analysis and kernel-selection mechanisms are available, and business-specific, customized high-performance kernels can be implemented in plugins;
  4. Provides the framework foundation for integrating TensorRT-LLM.

@Wangzheee Wangzheee force-pushed the PaddleTRT_sub_block branch from 2ed4a17 to 6511140 on December 1, 2023 08:22
Comment on lines 136 to 137
VarDesc *new_var = new VarDesc(*var_desc);
vars_[name] = std::make_unique<VarDesc>(*new_var);
Contributor

I wonder whether there is a problem here: the var you append and the var actually stored in the block are not the same object, so how are they kept in sync? Also, line 136 here is completely redundant; make_unique already constructs a new copy internally...

Contributor Author

@Wangzheee Wangzheee Dec 4, 2023

> I wonder whether there is a problem here: the var you append and the var actually stored in the block are not the same object, so how are they kept in sync? Also, line 136 here is completely redundant; make_unique already constructs a new copy internally...

The current IR passes only modify the graph of their own block (and may modify var_desc attributes); the boundary var_descs in different blocks are kept as distinct objects so that graph operations on other blocks are not disturbed. At the op execution stage, the scope is looked up by the name stored in the var_desc, so as long as the var name is the same, execution works correctly.

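For reference, a simplified sketch of the two points in this exchange; VarDesc and Scope here are stand-ins for Paddle's classes, not the actual implementation:

```cpp
// Illustrative sketch only: shows (a) why the quoted `new VarDesc(...)` line is
// redundant and (b) why per-block VarDesc copies still resolve to the same
// runtime variable, as the reply explains.
#include <iostream>
#include <map>
#include <memory>
#include <string>

struct VarDesc {
  explicit VarDesc(std::string n) : name(std::move(n)) {}
  std::string name;
};

struct Scope {
  std::map<std::string, float> vars;  // runtime storage, keyed by var name
};

int main() {
  VarDesc original("hidden_state");

  // (a) The quoted pattern copies twice and never frees the intermediate copy:
  //   VarDesc* new_var = new VarDesc(original);
  //   vars_[name] = std::make_unique<VarDesc>(*new_var);
  // make_unique already copy-constructs, so one line is enough:
  auto block0_copy = std::make_unique<VarDesc>(original);  // main block's copy
  auto block1_copy = std::make_unique<VarDesc>(original);  // sub-block's copy

  // (b) The two descriptors are distinct objects, but at execution time the op
  // looks the variable up in the Scope by name, so both copies hit the same data.
  Scope scope;
  scope.vars[block0_copy->name] = 1.0f;
  std::cout << "sub-block reads: " << scope.vars[block1_copy->name] << "\n";  // 1
  return 0;
}
```
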
return res;
}
// The block this SubGraph belongs to.
int block_id_{0};
Contributor

What is the reason for moving this from private to public?

Contributor Author

> What is the reason for moving this from private to public?

It was for debugging; I forgot to remove it.

raindrops2sea previously approved these changes Dec 8, 2023

// block inputs
std::vector<std::string> block_inputs = {};
if (block_op->Type() == "while") {
Contributor

Only the while op is handled here; aren't the other condition ops handled?

Contributor Author

ops_type currently only contains "conditional_block" and "while"; these ops have different input names, so each has to be handled individually (see the sketch below).

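As a minimal sketch of that one-to-one handling: the OpLike type is an illustrative stand-in for Paddle's OpDesc, and the slot names "X"/"Condition" for while and "Input"/"Cond" for conditional_block reflect my reading of Paddle's control-flow op definitions, not code from this PR:

```cpp
// Illustrative only: collect a block op's input var names per op type, since
// while and conditional_block expose their inputs under different slot names.
#include <map>
#include <string>
#include <vector>

struct OpLike {
  std::string type;
  std::map<std::string, std::vector<std::string>> inputs;  // slot -> var names

  const std::vector<std::string>& Input(const std::string& slot) const {
    static const std::vector<std::string> kEmpty;
    auto it = inputs.find(slot);
    return it == inputs.end() ? kEmpty : it->second;
  }
};

std::vector<std::string> CollectBlockInputs(const OpLike& block_op) {
  std::vector<std::string> block_inputs;
  auto append = [&](const std::vector<std::string>& names) {
    block_inputs.insert(block_inputs.end(), names.begin(), names.end());
  };
  if (block_op.type == "while") {
    append(block_op.Input("X"));          // loop-carried inputs
    append(block_op.Input("Condition"));  // loop condition var
  } else if (block_op.type == "conditional_block") {
    append(block_op.Input("Input"));      // branch inputs
    append(block_op.Input("Cond"));       // branch condition var
  }
  return block_inputs;
}
```
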
Contributor

@lanxianghit lanxianghit left a comment

LGTM for flag

Contributor

@Aurelius84 Aurelius84 left a comment

LGTM

FusePassBase::Init(name_scope_, graph);

VLOG(3) << "Running conv_bn_fuse_pass.";
if (graph->IsMainGraph()) {
Contributor

For log-related checks like this, it is recommended to use VLOG_IS_ON so the block is skipped by default.

Contributor Author

> For log-related checks like this, it is recommended to use VLOG_IS_ON so the block is skipped by default.

Good to know; I will change it in the next PR.

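For reference, a rough sketch of the suggested pattern with glog's VLOG_IS_ON guard (simplified; the function name and the debug body are placeholders, not code from this PR):

```cpp
// Sketch of the suggestion above: wrap verbose-log-only work in VLOG_IS_ON so
// the whole block is skipped when verbose logging is disabled (the default).
#include <glog/logging.h>

void DebugDumpGraphInfo(bool is_main_graph) {
  if (VLOG_IS_ON(3) && is_main_graph) {
    // Potentially expensive debug-only work (e.g. walking sub-graphs just to
    // build a log string) only runs when -v >= 3 is actually set.
    VLOG(3) << "Running conv_bn_fuse_pass on the main graph.";
  }
}
```
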
Contributor

@winter-wang winter-wang left a comment

LGTM

@Wangzheee Wangzheee merged commit 33f439a into PaddlePaddle:develop Dec 18, 2023
HermitSun pushed a commit to HermitSun/Paddle that referenced this pull request on Dec 21, 2023:

[Paddle-TRT] support sub block: while op, condition_block op (…addle#59588)