Skip to content

[Codegen] Fix correctness issues in ConvertGatherToLDS narrow type emulation#23763

Merged
lialan merged 2 commits intousers/lialan/subbyte_gather_to_ldsfrom
copilot/sub-pr-23758
Mar 12, 2026
Merged

[Codegen] Fix correctness issues in ConvertGatherToLDS narrow type emulation#23763
lialan merged 2 commits intousers/lialan/subbyte_gather_to_ldsfrom
copilot/sub-pr-23758

Conversation

Copy link
Contributor

Copilot AI commented Mar 12, 2026

ConvertGatherToLDS had several correctness issues in linearizeAndPack that could silently produce wrong IR or crash in non-assert builds.

Fixes:

  • Offset ignored: Memref layout offset was never incorporated into the linearized index. Now checks for dynamic offset (fails the pattern) and initializes linearIdx = offset + sum(idx[i] * stride[i]).
  • Silent rank mismatch: llvm::zip(indices, strides) silently truncated when sizes differed. Added explicit indices.size() != strides.size() guard.
  • Assert in rewrite path: assert(newBits > origBits && newBits % origBits == 0) would crash in debug and silently miscompile in release. Replaced with return nullptr (propagated as notifyMatchFailure by callers).
  • Misleading error message: "not a multiple of byte width" described the wrong invariant; corrected to "not a multiple of the new element bit width" to match the actual check (totalBits % newSrcBits != 0).
  • Unsafe early return: Removed the origBits == newBits && 1D fast-path that bypassed offset handling entirely.

✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.

Co-authored-by: lialan <450283+lialan@users.noreply.github.com>
Copilot AI changed the title [WIP] [Codegen] Emulate gather to lds for narrow element types [Codegen] Fix correctness issues in ConvertGatherToLDS narrow type emulation Mar 12, 2026
Copilot AI requested a review from lialan March 12, 2026 20:31
@lialan lialan marked this pull request as ready for review March 12, 2026 20:33
@lialan lialan merged commit cef7504 into users/lialan/subbyte_gather_to_lds Mar 12, 2026
1 check passed
@lialan lialan deleted the copilot/sub-pr-23758 branch March 12, 2026 20:33

auto [strides, offset] = origType.getStridesAndOffset();

// Fail if the offset or any stride is dynamic.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... Why is this a failure?

// Linearize: sum(idx[i] * stride[i]).
Value linearIdx = arith::ConstantIndexOp::create(rewriter, loc, 0);
// Fail if the number of indices doesn't match the rank.
if (indices.size() != strides.size()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How can this happen?

}

// Linearize: offset + sum(idx[i] * stride[i]).
Value linearIdx = arith::ConstantIndexOp::create(rewriter, loc, offset);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't there a utility function for this these days?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants