Add more tma tests #5540

liqiangxl · 2025-11-15T15:21:54Z

For #5503 (comment)

github-actions · 2025-11-15T15:22:39Z

Review updated until commit 5868dd7

Description

Add comprehensive debug logging to TMA analysis code for better troubleshooting
Implement 256-element hardware limitation check in MergeTileGroupsByRotation
Add size validation in DomainMerger to prevent exceeding hardware limits
Add three new test cases: two illegal size tests and one legal size test

Changes walkthrough

Relevant files

Enhancement

tma.cpp `Add TMA debug logging and 256-element limit checks` csrc/device_lower/analysis/tma.cpp Added extensive debug output throughout TMA analysis passes for better troubleshooting Implemented 256-element hardware limitation check in MergeTileGroupsByRotation::condition to prevent invalid merges Enhanced DomainMerger::shouldMerge with size validation against 256-element limit Added comprehensive logging for TMA domain merging phases and final dimension states	+265/-6

Tests

test_memory.cpp `Add TMA illegal size test cases` tests/cpp/test_memory.cpp Added OneDimTensor2dTile2x256Illegal test for 1D tensor with invalid 2x256 tiling Added TwoDimTensor2dTile2x256Illegal test for 2D tensor with invalid 2x256 tiling Added TwoDimTensor2dTile2x256Legal test for 2D tensor with valid 2x256 tiling	+133/-0

PR Reviewer Guide

Here are some key observations to aid the review process:

🧪 PR contains tests
🔒 No security concerns identified
⚡ Recommended focus areas for review
Debug Output Removal Extensive debug output has been added throughout the TMA analysis code (std::cout statements). This debug output should be removed or made conditional (guarded by a debug flag) before merging to production code. std::cout << "[MergeTileGroupsByRotation::condition] Checking expr: " << expr->front()->toString() << std::endl; std::cout << " from[0] (partitioned): " << from_.at(0)->toString() << " extent=" << from_.at(0)->front()->as<IterDomain>()->extent()->toString() << std::endl; std::cout << " from[1]: " << from_.at(1)->toString() << " extent=" << from_.at(1)->front()->as<IterDomain>()->extent()->toString() << std::endl; std::cout << " to[0]: " << to_.at(0)->toString() << " extent=" << to_.at(0)->front()->as<IterDomain>()->extent()->toString() << std::endl; std::cout << " partitioned_dim->box extent: " << partitioned_dim->box->front()->as<IterDomain>()->extent()->toString() << std::endl; std::cout << " partitioned_dim->box is bulk: " << box_is_bulk << std::endl; std::cout << " from[1] is bulk: " << from1_is_bulk << std::endl; if (bulk_groups_.count(partitioned_dim->box) == 0 \|\| bulk_groups_.count(from_.at(1)) == 0) { std::cout << " Result: FALSE (not both bulk)" << std::endl; return false; } NVF_ERROR( partitioned_dim->tile == partitioned_dim->box && partitioned_dim->stride == nullptr); // Check if merging would exceed the 256-element hardware limit auto box_extent = partitioned_dim->box->front()->as<IterDomain>()->extent(); auto from1_extent = from_.at(1)->front()->as<IterDomain>()->extent(); Val* merged_extent = SimplifyingIrBuilder::mulExpr(box_extent, from1_extent); constexpr int64_t largest_dim_size = 256; // Hardware limitation Val* too_large_after_merge = SimplifyingIrBuilder::gtExpr( merged_extent, IrBuilder::create<Val>(largest_dim_size)); std::cout << " Merged box size would be: " << box_extent->toString() << " * " << from1_extent->toString() << " = " << merged_extent->toString() << std::endl; if (simplifyExpr(too_large_after_merge)->isTrue()) { std::cout << " Result: FALSE (would exceed 256 element limit)" << std::endl; return false; } std::cout << " Result: TRUE - WILL MERGE (pattern matches and within limits!)" << std::endl; Performance Impact The debug output in loops (lines 661-703) could significantly impact performance if left in production code, especially for complex TMA operations with many iterations. std::cout << "\n[INFER_ROLES DEBUG] Starting infer_roles passes" << std::endl; int iteration = 0; bool changed = true; while (changed) { changed = false; std::cout << "\n[INFER_ROLES DEBUG] === Iteration " << iteration++ << " ===" << std::endl; std::cout << " Remaining exprs: " << exprs.size() << std::endl; std::cout << " Current inferred_dims: " << inferred_dims.size() << std::endl; for (const auto& dim : inferred_dims) { std::cout << " " << dim << std::endl; } bool b = bulk_pass.run(exprs); if (b) std::cout << " bulk_pass made changes" << std::endl; changed = changed \|\| b; b = nonbulk_pass.run(exprs); if (b) std::cout << " nonbulk_pass made changes" << std::endl; changed = changed \|\| b; b = striding_split_pass.run(exprs); if (b) std::cout << " striding_split_pass made changes" << std::endl; changed = changed \|\| b; b = boxing_split_pass.run(exprs); if (b) std::cout << " boxing_split_pass made changes" << std::endl; changed = changed \|\| b; b = move_partitioned_pass.run(exprs); if (b) std::cout << " move_partitioned_pass made changes" << std::endl; changed = changed \|\| b; b = merge_tile_pass.run(exprs); if (b) std::cout << " merge_tile_pass made changes" << std::endl; changed = changed \|\| b;

liqiangxl added 2 commits November 15, 2025 07:18

add more tma tests

dae85d5

add more tma tests

f24eea0

liqiangxl mentioned this pull request Nov 15, 2025

manual schedule pointwise kernel using 2D tma domain and tiles #5503

Merged

check 256 size limitation in MergeTileGroupsByRotation

5868dd7

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add more tma tests #5540

Add more tma tests #5540

Uh oh!

liqiangxl commented Nov 15, 2025

Uh oh!

github-actions bot commented Nov 15, 2025 •

edited

Loading

Changes walkthrough

PR Reviewer Guide

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Add more tma tests #5540

Are you sure you want to change the base?

Add more tma tests #5540

Uh oh!

Conversation

liqiangxl commented Nov 15, 2025

Uh oh!

github-actions bot commented Nov 15, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Changes walkthrough

PR Reviewer Guide

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

github-actions bot commented Nov 15, 2025 •

edited

Loading