[KernelDispatch] Add a temporary hack for 1x1 shapes with tile sizes M,N#1276

Open
Abhishek-Varma wants to merge 3 commits into main from avarma_hack_32x512x64_1x1

Contributor

@Abhishek-Varma Abhishek-Varma commented May 22, 2025

-- This commit adds a temporary workaround for a 1x1 AIE array so that 2D
matmul shapes work when all of the operands get pulled into the L2 buffer.
Once reprogramming of DMA ops is supported, the workaround can be removed.
It is needed only for pack-peel-4-level-tiling, NOT for pack-peel. The
workaround ensures that the first-level tile size is NOT equal to M,N by
halving n0Tile, and by also halving the corresponding packing size if n0Tile
becomes smaller than the packing size.
-- Also adds an e2e test for the 32x512x64 shape on a 1x1 npu4 array.

Signed-off-by: Abhishek Varma [email protected]
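
The adjustment described above can be sketched roughly as follows. This is only an illustration of the halving logic, not the code in KernelDispatch.cpp: the `TileConfig` struct, the function name `applyOneByOneWorkaround`, and the explicit check that the first-level tile equals (M, N) are assumptions made for the example. Only `n0Tile`, the packing size, and the 1x1 condition come from the description.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical container for the values the workaround touches.
struct TileConfig {
  int64_t m0Tile;  // first-level tile size along M
  int64_t n0Tile;  // first-level tile size along N
  int64_t n0Pack;  // packing size that corresponds to n0Tile
};

// Sketch of the workaround: on a 1x1 array, if the first-level tile covers
// the whole M x N output, halve n0Tile so that it no longer equals N.
// If n0Tile then drops below the packing size, halve the packing size too.
// The "tile covers the whole output" check is an assumption of this sketch.
TileConfig applyOneByOneWorkaround(TileConfig cfg, int64_t M, int64_t N,
                                   int64_t numRows, int64_t numCols) {
  assert(cfg.n0Tile > 0 && cfg.n0Pack > 0 && "tile and pack sizes must be positive");
  if (numRows != 1 || numCols != 1) return cfg;         // only 1x1 arrays
  if (cfg.m0Tile != M || cfg.n0Tile != N) return cfg;   // tile already smaller
  if (cfg.n0Tile < 2) return cfg;                       // nothing left to halve
  cfg.n0Tile /= 2;
  if (cfg.n0Tile < cfg.n0Pack) cfg.n0Pack /= 2;
  return cfg;
}
```

For the 32x512x64 shape added as a test, and assuming a packing size of 32 along N, a first-level tile of 32x512 would become 32x256 while the packing size stays at 32.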

@Abhishek-Varma Abhishek-Varma marked this pull request as ready for review May 22, 2025 19:01
Comment thread compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/KernelDispatch.cpp Outdated
Comment on lines +453 to +455
// TODO(avarma): This is currently a workaround for 1x1 AIE array to make
// those 2D matmul shapes work for which all of the operands get pulled in
// to L2 buffer. Once reprogramming of DMA ops is supported, we can get rid
Contributor

I don't follow what the problem is here. What's the error message?

Contributor Author

Without this, we get the error "'aie.memtile_dma' op could not find and assign a valid BD id". It is caused by the exploding memtile_dma issue, which should be solved once DMA ops can be reconfigured. Since that support is not yet available, this is a temporary workaround.

@Abhishek-Varma Abhishek-Varma force-pushed the avarma_hack_32x512x64_1x1 branch from 9a9672a to 7aa28d1 Compare May 26, 2025 06:10
-- This commit adds a temporary workaround for a 1x1 AIE array so that 2D
   matmul shapes work when all of the operands get pulled into the L2 buffer.
   Once reprogramming of DMA ops is supported, the workaround can be removed.
   It is needed only for pack-peel-4-level-tiling, NOT for pack-peel. The
   workaround ensures that the first-level tile size is NOT equal to M,N by
   halving n0Tile.
-- Also adds an e2e test for the 32x512x64 shape on a 1x1 npu4 array.

Signed-off-by: Abhishek Varma <[email protected]>
@Abhishek-Varma Abhishek-Varma force-pushed the avarma_hack_32x512x64_1x1 branch from 7aa28d1 to 2d0a1de Compare May 26, 2025 06:11
@Abhishek-Varma Abhishek-Varma requested a review from yzhang93 May 26, 2025 06:11
// of this workaround. We need to add this only for pack-peel-4-level-tiling
// NOT pack-peel. The workaround just ensures that the tile size of first
// level is NOT equal to M, N by halving the N0 tile.
if (numRows == 1 && numCols == 1) {
Collaborator

Do you only see this issue for 1x1? I would expect it for any number of cores.


3 participants