[AIEX] Canonicalize contiguous NpuDmaMemcpyNdOp accesses to linear form#2924

Draft
hunhoffe wants to merge 6 commits into main from fix/linearize-contiguous-dma-memcpy-nd

Conversation

hunhoffe (Collaborator) commented Mar 5, 2026

A row-major contiguous access pattern such as sizes=[s3, s2, s1, s0] strides=[st3, st2, st1, 1]
where st1==s0 and st2==s0*s1 (or the corresponding size is 1) is semantically identical to a linear transfer of N=s0*s1*s2 elements.

Add a canonicalization pattern that rewrites such ops to the canonical linear form: sizes=[s3, 1, 1, N] strides=[st3, 0, 0, 1]
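
The condition and rewrite described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the C++ pattern added by this PR: the function names `linearize` and `offsets` are hypothetical, and sizes and strides are passed outermost-first, as in the notation above.

```python
from itertools import product


def linearize(sizes, strides):
    """Return (sizes, strides) in linear form, or None if not contiguous.

    Both lists are outermost-first: [s3, s2, s1, s0] and [st3, st2, st1, st0].
    """
    s3, s2, s1, s0 = sizes
    st3, st2, st1, st0 = strides
    if st0 != 1:
        return None  # innermost dimension must be unit-stride
    if s1 != 1 and st1 != s0:
        return None  # d1 must step over exactly one d0 row
    if s2 != 1 and st2 != s0 * s1:
        return None  # d2 must step over exactly one d1 x d0 block
    return [s3, 1, 1, s0 * s1 * s2], [st3, 0, 0, 1]


def offsets(sizes, strides):
    """Element offsets visited by the access pattern, outermost loop first."""
    return [
        sum(i * st for i, st in zip(idx, strides))
        for idx in product(*(range(s) for s in sizes))
    ]


# A contiguous 4 x 2048 buffer collapses to one linear run of 8192 elements.
print(linearize([1, 1, 4, 2048], [0, 0, 2048, 1]))
```

Comparing `offsets` before and after the rewrite is a quick way to confirm that both forms visit the same elements in the same order.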

In this form isLinearTransferWithoutTransformation() returns true, so verifyStridesWraps() skips the 10-bit d0 wrap-size check. The hardware uses a wider transfer-length register in linear mode, so larger values of N are supported.

This fixes the motivating case from issue #2825 where fill/drain on a 2D buffer (e.g. memref<M x K x bf16>) generates sizes=[1,1,M,K] with K>1023, which previously failed verification as a data-layout-transform dimension but is simply a contiguous linear transfer.
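
As a rough model of the motivating case: the sketch below assumes the 10-bit check rejects a d0 size above 1023 (consistent with "K>1023" above) and uses a hypothetical stand-in, `is_linear`, for isLinearTransferWithoutTransformation(), modeled on the canonical form in this description. The real verifier checks more than this, so read it as an illustration only.

```python
D0_WRAP_MAX = 1023  # largest value of a 10-bit field, per the description


def is_linear(sizes, strides):
    # Assumption: stand-in for isLinearTransferWithoutTransformation(),
    # modeled on the canonical form sizes=[s3, 1, 1, N] strides=[st3, 0, 0, 1].
    _, s2, s1, _ = sizes
    _, st2, st1, st0 = strides
    return s2 == 1 and s1 == 1 and st2 == 0 and st1 == 0 and st0 == 1


def d0_check_passes(sizes, strides):
    # Linear transfers skip the d0 wrap-size check entirely.
    if is_linear(sizes, strides):
        return True
    return sizes[3] <= D0_WRAP_MAX


M, K = 4, 2048  # example shape for memref<M x K x bf16>
before = ([1, 1, M, K], [0, 0, K, 1])      # as generated by fill/drain
after = ([1, 1, 1, M * K], [0, 0, 0, 1])   # after canonicalization

print(d0_check_passes(*before), d0_check_passes(*after))
```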

hunhoffe and others added 2 commits March 4, 2026 17:01
A row-major contiguous access pattern such as
  sizes=[s3, s2, s1, s0] strides=[st3, st2, st1, 1]
where st1==s0 and st2==s0*s1 (or the corresponding size is 1) is
semantically identical to a linear transfer of N=s0*s1*s2 elements.

Add a canonicalization pattern that rewrites such ops to the canonical
linear form:
  sizes=[s3, 1, 1, N] strides=[st3, 0, 0, 1]

In this form isLinearTransferWithoutTransformation() returns true, so
verifyStridesWraps() skips the 10-bit d0 wrap-size check. The hardware
uses a wider transfer-length register in linear mode, so arbitrarily
large N is supported.

This fixes the motivating case from issue #2825 where fill/drain on a
2D buffer (e.g. memref<M x K x bf16>) generates sizes=[1,1,M,K] with
K>1023, which previously failed verification as a data-layout-transform
dimension but is simply a contiguous linear transfer.

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
github-actions bot (Contributor) commented Mar 5, 2026

Coverage Report

Created: 2026-03-13 03:16

| Filename | Function Coverage | Line Coverage | Region Coverage | Branch Coverage |
| --- | --- | --- | --- | --- |
| home/runner/work/mlir-aie/mlir-aie/lib/Dialect/AIEX/IR/AIEXDialect.cpp | 98.21% | 85.62% | 88.71% | 78.40% |
| Totals | 98.21% | 85.62% | 88.71% | 78.40% |
Generated by llvm-cov -- llvm version 18.1.3

andrej (Collaborator) left a comment

Nice clear PR description! I think this canonicalization is useful.

A row-major contiguous access pattern such as sizes=[s3, s2, s1, s0] strides=[st3, st2, st1, 1]
where st1==s0 and st2==s0*s1 (or the corresponding size is 1)

For all tests, if size == 1 the corresponding stride == 0. Does this pass also canonicalize if the stride != 0? Do we want to canonicalize in that case, or perhaps error (since with a size of 1, the stride will never be applied, so might indicate user confusion)? Either way, can we add a test for some cases where size == 1 and stride != 0?
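
To illustrate why the stride is irrelevant when size == 1, here is a small sketch that enumerates the offsets an access pattern visits. The helper `offsets` is hypothetical and says nothing about how the pass in this PR treats such inputs; it only shows that a size-1 dimension contributes index 0, so its stride never affects the addresses.

```python
from itertools import product


def offsets(sizes, strides):
    """Element offsets visited by the access pattern, outermost loop first."""
    return [
        sum(i * st for i, st in zip(idx, strides))
        for idx in product(*(range(s) for s in sizes))
    ]


# Size-1 dimensions with nonzero strides ...
with_stride = offsets([1, 1, 1, 8], [0, 7, 5, 1])
# ... visit the same elements as the same dimensions with zero strides.
zero_stride = offsets([1, 1, 1, 8], [0, 0, 0, 1])

print(with_stride == zero_stride)
```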

Tests for the DMA task syntax would be good too.

Is this same canonicalization also applied to the dimensions of an ObjectFifo (dimensionsToStream, dimensionsFromStream)?

// limit violations: in the resulting linear form, isLinearTransferWithout-
// Transformation() returns true, so verifyStridesWraps() skips the 10-bit
// d0 wrap-size check. The hardware uses a wider transfer-length register in
// linear mode, so arbitrarily large N is supported.

I think there are still limits but they are very large.
