rm convertToSSA API, test=huawei_ascend_npu test=nvidia_tensorrt test=verisilicon_timvx #8988
Merged
weishengying merged 1 commit into PaddlePaddle:develop from weishengying:rm_convert_to_ssa on May 11, 2022
WeiLi233 pushed a commit to WeiLi233/Paddle-Lite that referenced this pull request on Jun 28, 2022
WeiLi233 pushed a commit to WeiLi233/Paddle-Lite that referenced this pull request on Jun 28, 2022
WeiLi233 pushed a commit to WeiLi233/Paddle-Lite that referenced this pull request on Feb 13, 2023
test [OpTestPy] fix expand/expand_v2, fc, flatten_contiguous_range, gather, generate_proposals_v2, greater_equal diff! (PaddlePaddle#8339) (PaddlePaddle#8394)
* test=document_fix fix Android C++ demo compile Fail bug (PaddlePaddle#8245)
* fix demo MakeFile
Add a convert_to_ssa macro definition (PaddlePaddle#8869)
rm convertToSSA API, test=huawei_ascend_npu test=nvidia_tensorrt test=verisilicon_timvx (PaddlePaddle#8988)
Sync with offical 3af2ffb
Sync with 3af2ffb
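PR #8967, which fixed the broken topological ordering, has already been merged. This PR tests removing every use of the convertToSSA API; if it merges cleanly, that confirms the earlier fix works.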
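Besides removing the calls to the convertToSSA API and its CMake definition, this PR also modifies type_target_cast_pass.
The reason is that simply removing convertToSSA makes the ernie_gen model fail in the xpu_dasou CI:
The model's eighth input, placeholder_7, is a model input and therefore a tensor on the host. In a sub-block, however, an assign op reassigns this variable, which turns placeholder_7 into a tensor on the XPU. When the model runs a second time and the input data is initialized, using memcpy to write through the pointer held by the placeholder_7 tensor fails. Only xpu_memcpy can be used.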
A simplified demo is shown in the figure below:



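feed is the input of op, op writes its result back to feed, and then the other ops run. Assuming this model runs on the XPU, the resulting intermediate representation is as follows:
As can be seen, the feed var changes from a tensor on the host to a tensor on the XPU. So when the model runs a second time, only xpu_memcpy can be used to initialize the data.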
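The failure mode can be reproduced with a small toy model. This is only an illustrative sketch, not Paddle-Lite code: `Tensor`, `feed_input`, and `run_once` are hypothetical names, and the "device" is just a string tag. It assumes that input initialization always performs a host-side copy.

```python
from dataclasses import dataclass


@dataclass
class Tensor:
    """Toy tensor: a name plus the device its buffer lives on."""
    name: str
    target: str  # "host" or "xpu"


def feed_input(tensor: Tensor) -> str:
    """Mimic input initialization, which assumes a host buffer."""
    if tensor.target != "host":
        raise RuntimeError(
            f"{tensor.name} lives on {tensor.target}; a host memcpy cannot write to it"
        )
    return "memcpy"


def run_once(scope: dict) -> None:
    """One run: feed the input, then an XPU op writes its result back into `feed`."""
    feed_input(scope["feed"])
    scope["feed"].target = "xpu"  # the write-back moves the variable to the XPU


scope = {"feed": Tensor("feed", "host")}
run_once(scope)  # first run: feed is still a host tensor

try:
    run_once(scope)  # second run: feed is now an XPU tensor
    second_run_ok = True
except RuntimeError:
    second_run_ok = False
```

The first run succeeds because `feed` starts on the host; the second run fails at initialization because the previous run left `feed` tagged as an XPU tensor.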
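One feasible solution is to insert an io copy op on the output side as well. (This creates a new variable, similar to the new variables that the original convertToSSA API generated.)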
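One way to read this proposal is sketched below as a toy graph rewrite. It is an assumption-laden illustration rather than the actual type_target_cast_pass change: `Op`, `insert_output_io_copy`, and the `/xpu_out` suffix are hypothetical, and the sketch assumes the device op is redirected to a fresh variable whose value is then copied back into the original host variable.

```python
from dataclasses import dataclass


@dataclass
class Op:
    """Toy op: a type name plus input and output variable names."""
    type: str
    inputs: list
    outputs: list


def insert_output_io_copy(ops, host_vars, suffix="/xpu_out"):
    """Route device writes to host variables through a new variable and a copy-back op."""
    new_ops = []
    for op in ops:
        if op.type == "io_copy":
            new_ops.append(op)  # existing copies are left untouched
            continue
        outputs, copies = [], []
        for name in op.outputs:
            if name in host_vars:
                tmp = name + suffix  # new device-side variable (hypothetical naming)
                outputs.append(tmp)
                copies.append(Op("io_copy", [tmp], [name]))  # copy back to the host var
            else:
                outputs.append(name)
        new_ops.append(Op(op.type, list(op.inputs), outputs))
        new_ops.extend(copies)
    return new_ops


# `assign` writes back into the model input `feed`, as in the demo above.
ops = [Op("assign", ["x"], ["feed"]), Op("relu", ["feed"], ["out"])]
rewritten = insert_output_io_copy(ops, host_vars={"feed"})
```

After the rewrite, `assign` writes to `feed/xpu_out` and an `io_copy` moves the value back into `feed`, so in this toy model `feed` is never written directly by a device op and can stay a host tensor across runs.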