Add event trace configuration to aie dialect #2705
This reverts commit 431a705b637b7849dc6220c15e7a392dbd6835f0.
- Created AIETraceAttrs.td with TraceModeAttr, TracePacketTypeAttr, TraceEventAttr
- Created AIETraceOps.td with trace operations:
  - aie.trace (symbol operation)
  - aie.trace.mode, aie.trace.event, aie.trace.packet
  - aie.trace.start, aie.trace.stop
  - aie.trace.config, aie.trace.reg (intermediate ops)
  - aie.trace.start_config (runtime invocation)
- Implemented basic C++ verifiers in AIETraceOps.cpp
- Updated AIEAttrs.td and AIEOps.td to include trace definitions
- Updated lib/Dialect/AIE/IR/CMakeLists.txt to build AIETraceOps.cpp
- Added parsing test that validates operations parse correctly
- Test passes: operations parse and print correctly
- Created test_trace_verify.mlir with negative tests:
  - Too many events (>8)
  - Packet ID out of range (0 and 32)
  - Start/stop events missing parameters
  - Start/stop events with conflicting parameters
- All verifiers working correctly
- Tests pass successfully
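
The first two negative tests listed above can be illustrated with a small standalone model. This is only a sketch, not the verifier in AIETraceOps.cpp: `TraceDecl` and `verifyTrace` are hypothetical names, and the valid packet ID range of 1 to 31 is an assumption inferred from the tests for 0 and 32. The start/stop parameter checks are left out because their parameters are not described here.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical plain-C++ stand-in for a trace declaration. The real ops are
// MLIR operations; their accessors are not shown in this PR text.
struct TraceDecl {
  std::vector<std::string> events; // one event name per trace slot
  bool hasPacket;                  // true if a packet op is present
  uint32_t packetId;               // only meaningful when hasPacket is true
};

// Returns an empty string when the declaration passes both checks,
// otherwise a diagnostic message.
std::string verifyTrace(const TraceDecl &t) {
  const std::size_t kMaxEvents = 8; // event slots 0-7
  if (t.events.size() > kMaxEvents)
    return "too many trace events (maximum is 8)";
  // Assumed valid range 1..31, inferred from the negative tests for 0 and 32.
  if (t.hasPacket && (t.packetId < 1 || t.packetId > 31))
    return "packet ID out of range [1, 31]";
  return "";
}
```

With this model, a declaration with nine events or with packet ID 0 or 32 yields a non-empty diagnostic, while four events with packet ID 5 passes.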
- Added pass definitions to AIEPasses.td for all three trace passes
- Updated AIEPasses.h with pass creation function declarations
- Implemented AIETraceToConfig.cpp:
  - Converts aie.trace → aie.trace.config
  - Emits aie.trace.reg operations for each register field
  - Handles Trace_Control0 (mode, start/stop events)
  - Handles Trace_Control1 (packet ID, packet type)
  - Handles Trace_Event0/1 (event slots 0-7)
  - Updates trace.start_config symbol references
- Created stub implementations for AIEInlineTraceConfig and AIEConfigToNPU
- Added test_trace_to_config.mlir that validates transformation
- Test passes: trace ops correctly lowered to config ops
- Implemented AIEInlineTraceConfig.cpp:
  - Finds all aie.trace.start_config operations
  - Looks up referenced trace.config symbol
  - Clones all aie.trace.reg operations to call site
  - Removes trace.start_config invocation
- Relaxed parent constraint on aie.trace.reg to allow DeviceOp parent (needed for inlined reg ops)
- Added test_inline_trace_config.mlir
- Test passes: trace.reg ops successfully inlined
- Implemented AIEConfigToNPU.cpp with prototype stub
- Pass collects inlined trace.reg operations
- Placeholder for full implementation that would:
  - Load RegisterDatabase
  - Resolve register names to offsets
  - Encode bitfield values
  - Merge writes to same register
  - Generate aiex.npu.write32 operations
- Compiles successfully
- Created test_trace_end_to_end.mlir demonstrating complete pipeline
- Tests full transformation: aie.trace → aie.trace.config → inlined aie.trace.reg
- Validates:
  - High-level trace configuration with 4 events
  - Mode, packet routing, start/stop events
  - Correct lowering through both passes
  - Symbol references updated correctly
  - Register specifications generated for all fields
- Test passes: complete pipeline working end-to-end
- Enhanced aie.trace.reg to include optional tile operand
- Updated AIETraceToConfig to pass nullptr for tile (parent has it)
- Updated AIEInlineTraceConfig to pass tile reference when cloning
- Implemented full AIEConfigToNPU with:
  - RegisterDatabase loading and integration
  - Register name → offset resolution
  - Event name → event code resolution
  - Bitfield value encoding
  - Register write merging (multiple fields → single register)
  - Absolute address calculation
  - aiex.npu.write32 generation (when AIEX dialect available)
- Pass validates register/field lookups work correctly
- Demonstrates complete lowering pipeline infrastructure
Note: NPU write generation requires AIEX dialect to be pre-loaded. This will be addressed in production integration.
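
The "bitfield value encoding" and "register write merging" steps above can be sketched in standalone C++. This is an illustration under stated assumptions, not the pass's code: `FieldWrite`, `PackedWrite`, `fieldMask`, and `packWrites` are hypothetical names, and the bit positions used in the usage note are assumptions chosen to reproduce the 0x1E2E0001 value that appears in a later commit message. The real positions come from the RegisterDatabase.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// One field-level write, modelled loosely on an aie.trace.reg record.
struct FieldWrite {
  std::string reg; // register name, e.g. "Trace_Control0"
  unsigned shift;  // bit position of the field's least significant bit
  unsigned width;  // field width in bits (1..32)
  uint32_t value;  // unshifted field value
};

// Result of merging every field write that targets one register.
struct PackedWrite {
  uint32_t value = 0; // merged register value
  uint32_t mask = 0;  // bits covered by at least one field
};

// Mask covering `width` bits starting at bit `shift`.
uint32_t fieldMask(unsigned shift, unsigned width) {
  uint32_t ones = width >= 32 ? 0xFFFFFFFFu : ((1u << width) - 1u);
  return ones << shift;
}

// Merge field writes per register; a later write to the same bits wins.
std::map<std::string, PackedWrite>
packWrites(const std::vector<FieldWrite> &fields) {
  std::map<std::string, PackedWrite> out;
  for (const FieldWrite &f : fields) {
    uint32_t m = fieldMask(f.shift, f.width);
    PackedWrite &p = out[f.reg];
    p.value = (p.value & ~m) | ((f.value << f.shift) & m);
    p.mask |= m;
  }
  return out;
}
```

As a usage example, assume a 2-bit mode field at bit 0, a 7-bit start event at bit 16, and a 7-bit stop event at bit 24. Packing mode 1, start event 0x2E, and stop event 0x1E into `Trace_Control0` gives value 0x1E2E0001 with mask 0x7F7F0003. Tracking the mask alongside the value is what lets a later stage decide between a full write and a masked write.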
PROBLEM FIXED:
- trace.reg with 'for %tile' lost col/row information
- Inlined trace.reg at device level was fragile
SOLUTION:
- Removed tile operand from trace.reg (now only in trace.config)
- Simplified parent constraint: trace.reg only in TraceConfigOp
- Moved RegisterDatabase integration from Pass 3 to Pass 2
- Pass 2 now generates npu.write32 directly with col/row from tile
- Pass 3 is now a no-op (kept for extensibility)
BENEFITS:
- Col/row extracted immediately during inlining (not lost)
- Cleaner IR (no intermediate trace.reg at device level)
- Two-pass pipeline instead of three
- npu.write32 has explicit column/row attributes
This fixes the architectural issue identified in code review.
PROBLEM: AIEX dialect couldn't be loaded during AIE pass execution
SOLUTION: Move NPU-generating passes to AIEX dialect where they belong
- Moved AIEInlineTraceConfig.cpp to lib/Dialect/AIEX/Transforms/
- Moved AIEConfigToNPU.cpp to lib/Dialect/AIEX/Transforms/
- Updated pass definitions in AIEXPasses.td
- Removed from AIEPasses.td
- Updated CMakeLists for both dialects
- Updated pass registration headers
- Fixed namespaces (AIEX, not AIE)
RESULT: npu.write32 generation now works!
- Pass renamed: aie-inline-trace-config → aiex-inline-trace-config
- Pass renamed: aie-config-to-npu → aiex-config-to-npu
- Col/row preserved in npu.write32 operations
- RegisterDatabase integration functional
- Bitfield merging working
Example output:
aiex.npu.write32 {address=0xB40D0, column=0, row=2, value=0x1E2E0001}
This is the correct architectural placement: AIEX depends on AIE.
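
In the example output above, the address is a tile-local offset and the tile is identified separately by column and row. For reference, here is a sketch of how the three could be combined into one absolute address. The shift amounts are an assumption (an AIE2-style layout with the column starting at bit 25, the row at bit 20, and a 20-bit offset), not something this PR states. `tileAddress` is a hypothetical helper. The actual values should come from the target model.

```cpp
#include <cstdint>

// Assumed AIE2-style layout: column from bit 25 up, row in bits 20-24,
// tile-local register offset in the low 20 bits.
constexpr unsigned kColShift = 25;
constexpr unsigned kRowShift = 20;

// Combine column, row, and tile-local offset into one absolute address.
constexpr uint32_t tileAddress(uint32_t col, uint32_t row, uint32_t offset) {
  return (col << kColShift) | (row << kRowShift) | (offset & 0xFFFFFu);
}
```

Under that assumption, the write shown above (column 0, row 2, offset 0xB40D0) would target absolute address 0x2B40D0.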
- Moved test_inline_trace_config.mlir to test/Dialect/AIEX/trace/
- Moved test_trace_end_to_end.mlir to test/Dialect/AIEX/trace/
- Updated pass name: aie-inline-trace-config → aiex-inline-trace-config
- Added CHECK for aiex.npu.write32 operations
- Verified col/row attributes are preserved
- Both tests pass successfully
Test organization:
- AIE tests: parse, verify, trace-to-config (AIE dialect operations)
- AIEX tests: inline-trace-config, end-to-end (NPU generation)
This reflects the correct architectural separation.
- Added aiex.runtime_sequence wrapper to both tests
- Shows proper usage: trace.start_config inside runtime_sequence
- Tests validate npu.write32 generation within runtime context
- All tests passing with correct architectural pattern
This demonstrates the intended usage pattern where trace configuration is invoked from within a runtime sequence.
Implement aie.trace.port operation for hardware stream switch port monitoring:
- Extend AIETargetModel with port mapping API (getStreamSwitchPortIndex, isValidStreamSwitchPort)
- Add AIE_TracePortOp to AIETraceOps.td with slot (0-7), port, channel, master attributes
- Implement TracePortOp::verify() with duplicate slot detection and port validation
- Extend RegisterDatabase with resolvePortValue() for PORT:CHANNEL string parsing
- Update AIETraceToConfig to process TracePortOp and generate register writes
- Update AIETracePackRegWrites to resolve PORT:CHANNEL to hardware indices
- Add comprehensive test suite: parse, verify, lowering, end-to-end tests
Target registers: Stream_Switch_Event_Port_Selection_0/1
Enables monitoring of up to 8 stream switch ports with PORT_RUNNING, PORT_IDLE, PORT_STALLED, PORT_TLAST events.
Note: Current implementation uses stub port mappings - actual hardware tables needed.
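
The PORT:CHANNEL string parsing mentioned above might look roughly like the following. This sketch covers only the string-splitting half. `PortRef` and `parsePortChannel` are hypothetical names rather than the signature of `resolvePortValue()`, the port name "DMA" in the usage note is an assumed example, and the mapping from a parsed port to a hardware index is omitted because the commit notes that the real tables are not in place yet.

```cpp
#include <cstddef>
#include <string>

// Parsed form of a "PORT:CHANNEL" string.
struct PortRef {
  std::string port; // port name, the text before the colon
  int channel;      // channel number, the digits after the colon
  bool valid;       // false if the string was malformed
};

// Split "PORT:CHANNEL" into its parts. Rejects a missing colon, an empty
// port name, an empty channel, and non-digit channel characters.
PortRef parsePortChannel(const std::string &s) {
  PortRef r{"", -1, false};
  std::size_t colon = s.find(':');
  if (colon == std::string::npos || colon == 0 || colon + 1 >= s.size())
    return r;
  int ch = 0;
  for (std::size_t i = colon + 1; i < s.size(); ++i) {
    if (s[i] < '0' || s[i] > '9')
      return r;
    ch = ch * 10 + (s[i] - '0');
    if (ch > 255) // arbitrary sanity bound, also prevents overflow
      return r;
  }
  r.port = s.substr(0, colon);
  r.channel = ch;
  r.valid = true;
  return r;
}
```

For example, `parsePortChannel("DMA:1")` yields port "DMA" and channel 1, while "DMA", "DMA:", ":1", and "DMA:x" are all reported as invalid.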
Pull request overview
This PR proposes a declarative event-trace configuration API in the AIE dialect (aie.trace and related ops), plus a lowering pipeline that converts those declarations into register writes and finally inlines them as aiex.npu.write32/maskwrite32-style operations in runtime sequences.
Changes:
- Adds new AIE trace ops/attrs (trace mode, packet routing, events, ports, combo/edge events, and runtime trace.start_config) with parsing/verification support.
- Introduces lowering passes: -aie-trace-to-config, -aie-trace-pack-reg-writes, and -aie-inline-trace-config.
- Adds MLIR tests and a full programming example (MLIR + host code + trace visualization tooling).
Reviewed changes
Copilot reviewed 48 out of 48 changed files in this pull request and generated 14 comments.
| File | Description |
|---|---|
| utils/generate_events_tablegen.py | Adjusts AIE1 event enum naming suffix generation used by TableGen. |
| test/dialect/AIEX/trace/test_trace_end_to_end.mlir | End-to-end trace lowering/inlining test for AIEX pipeline. |
| test/dialect/AIEX/trace/test_inline_trace_config_verify.mlir | Verifies diagnostics when inlining is run without packing pass. |
| test/dialect/AIEX/trace/test_inline_trace_config.mlir | Tests inlining trace config into runtime sequence. |
| test/dialect/AIE/trace/test_trace_verify.mlir | Verifier coverage for trace op constraints and typed-enum arch mismatches. |
| test/dialect/AIE/trace/test_trace_to_config_verify.mlir | Verifies lowering errors for unknown events during trace-to-config. |
| test/dialect/AIE/trace/test_trace_to_config.mlir | Checks aie-trace-to-config output register-field writes. |
| test/dialect/AIE/trace/test_trace_parse.mlir | Parser/pretty-printer coverage for trace ops. |
| test/Dialect/AIEX/trace/test_trace_port_end_to_end.mlir | End-to-end test for port tracing lowering/inlining. |
| test/Dialect/AIE/trace/test_trace_port_verify.mlir | Verifier coverage for port slot/channel constraints and duplicates. |
| test/Dialect/AIE/trace/test_trace_port_to_config.mlir | Checks port configuration lowering to aie.trace.reg field writes. |
| test/Dialect/AIE/trace/test_trace_port_parse.mlir | Parser/pretty-printer coverage for trace port syntax. |
| test/Dialect/AIE/combo_edge/test_edge_to_config.mlir | Checks edge-event lowering to register writes. |
| test/Dialect/AIE/combo_edge/test_combo_to_config.mlir | Checks combo-event lowering to register writes. |
| test/Dialect/AIE/combo_edge/test_combo_event_verify.mlir | Verifier coverage for combo/edge slot rules and invalid sources. |
| test/Dialect/AIE/combo_edge/test_combo_event_parse.mlir | Parser/pretty-printer coverage for combo/edge event syntax. |
| test/Dialect/AIE/combo_edge/test_combo_edge_full.mlir | Integrated lowering test combining combo + edge + standard trace config. |
| test/CppTests/register_database.cpp | Minor formatting tweaks in register database unit tests. |
| python/dialects/aie.py | Adds Python attribute builder for TraceEventAttr and a trace region-op helper. |
| python/compiler/aiecc/main.py | Adds trace lowering passes into the default aie.device pipeline. |
| programming_examples/basic/event_trace/visualize_trace.py | New script to render timeline PNG from parsed trace JSON. |
| programming_examples/basic/event_trace/vector_scalar_mul.cc | New kernel emitting event markers for tracing. |
| programming_examples/basic/event_trace/test.py | New Python host runner that captures and dumps trace. |
| programming_examples/basic/event_trace/test.cpp | New C++ host runner wired into repo test utilities. |
| programming_examples/basic/event_trace/run_strix_makefile.lit | Lit recipe to run the example on NPU2 (“strix”). |
| programming_examples/basic/event_trace/run_makefile.lit | Lit recipe to run the example on NPU1. |
| programming_examples/basic/event_trace/aie_trace.mlir | Full example MLIR using the new declarative trace syntax. |
| programming_examples/basic/event_trace/README.md | Documentation for the example and lowering pipeline. |
| programming_examples/basic/event_trace/Makefile | Build/run orchestration for the example (kernel, xclbin, host, parsing, viz). |
| programming_examples/basic/event_trace/CMakeLists.txt | Builds the example host executable against XRT and test utils. |
| lib/Dialect/AIEX/Transforms/CMakeLists.txt | Registers the new AIEX inline-trace-config transform source. |
| lib/Dialect/AIEX/Transforms/AIEInlineTraceConfig.cpp | New pass that inlines trace configs into runtime sequences as NPU writes. |
| lib/Dialect/AIE/Transforms/CMakeLists.txt | Registers new AIE trace-to-config transform source. |
| lib/Dialect/AIE/Transforms/AIETraceToConfig.cpp | New passes: trace-to-config lowering + packing/merging register writes. |
| lib/Dialect/AIE/IR/CMakeLists.txt | Adds trace ops implementation file to AIE IR library build. |
| lib/Dialect/AIE/IR/AIETraceOps.cpp | Implements parsing/printing/verification for trace ops/attrs. |
| lib/Dialect/AIE/IR/AIETargetModel.cpp | Adds event-module mapping tweaks and helpers for masks/port resolution. |
| lib/Dialect/AIE/IR/AIEDialect.cpp | Hooks custom parse/print helpers for trace-related attribute/value forms. |
| include/aie/Dialect/AIEX/Transforms/AIEXPasses.td | Declares aie-inline-trace-config pass. |
| include/aie/Dialect/AIEX/Transforms/AIEXPasses.h | Adds factory declaration for inline-trace-config pass. |
| include/aie/Dialect/AIE/Transforms/AIEPasses.td | Declares aie-trace-to-config and aie-trace-pack-reg-writes passes. |
| include/aie/Dialect/AIE/Transforms/AIEPasses.h | Adds factory declarations for new AIE trace passes. |
| include/aie/Dialect/AIE/IR/AIETraceOps.td | ODS/TableGen definitions for trace ops. |
| include/aie/Dialect/AIE/IR/AIETraceAttrs.td | ODS/TableGen definitions for trace attrs/enums. |
| include/aie/Dialect/AIE/IR/AIETargetModel.h | Adds APIs for stream-switch port index, field mask, and port resolution. |
| include/aie/Dialect/AIE/IR/AIEOps.td | Includes trace ops definitions into the AIE op set. |
| include/aie/Dialect/AIE/IR/AIEDialect.h | Exposes shared trace event parse/print helpers. |
| include/aie/Dialect/AIE/IR/AIEAttrs.td | Includes trace attrs and updates event enum include naming. |
    // Extract value (mask is discarded)
    uint32_t value = 0;
    if (auto intAttr = llvm::dyn_cast<IntegerAttr>(regOp.getValue())) {
      value = intAttr.getInt();
    } else {
      regOp.emitError("value must be an integer after packing");
      return signalPassFailure();
    }

    // Generate aiex.npu.write32 operation with col/row
    builder.create<AIEX::NpuWrite32Op>(
        regOp.getLoc(), builder.getUI32IntegerAttr(regInfo->offset),
        builder.getUI32IntegerAttr(value),
        nullptr,                        // buffer
        builder.getI32IntegerAttr(col), // column
        builder.getI32IntegerAttr(row)  // row
    );
aie-inline-trace-config currently discards the mask on packed aie.trace.reg writes and always emits aiex.npu.write32. If mask is not 0xFFFFFFFF, this will clobber unrelated bits in the target register. Consider emitting aiex.npu.maskwrite32 when regOp.getMask() is present (and only using write32 when the mask is full).
This is a good catch, but for the record this was an intentional design decision. It ends up producing the same code as one would write by hand, instead of more costly maskwrite32 (rmw) instructions masking off useless bits.
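
To make the trade-off in this thread concrete, here is a small standalone model of the two write styles. It is a sketch of the general semantics only (a full write replaces every bit, a masked write is a read-modify-write that replaces only the masked bits). `applyWrite32` and `applyMaskWrite32` are hypothetical helpers, not the AIEX ops, and the mask in the usage note is an assumed example.

```cpp
#include <cstdint>

// Full-register write: the previous contents are replaced entirely.
constexpr uint32_t applyWrite32(uint32_t /*old*/, uint32_t value) {
  return value;
}

// Masked write (read-modify-write): only bits set in `mask` are replaced,
// all other bits keep their previous contents.
constexpr uint32_t applyMaskWrite32(uint32_t old, uint32_t value,
                                    uint32_t mask) {
  return (old & ~mask) | (value & mask);
}
```

Suppose a register already holds 0x80000000 and the packed trace value is 0x1E2E0001 with mask 0x7F7F0003. The full write leaves 0x1E2E0001, clearing bit 31. The masked write leaves 0x9E2E0001, preserving it. The two agree whenever the mask is 0xFFFFFFFF or the uncovered bits are already zero, and the reply above relies on that second case holding for the trace registers.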
Add aie-trace-to-config, aie-trace-pack-reg-writes, and aie-inline-trace-config to runResourceAllocationPipeline(), matching the ordering from PR #2705 which added them to the Python aiecc main.py pipeline before aie-assign-lock-ids.
These passes were never ported to the C++ driver, causing the event_trace programming example to fail.
Co-Authored-By: Claude Opus 4.6 <[email protected]>
This PR is a proposal to add declarative event trace configuration to AIE dialect.
The current implementation:
Input code to describe the event trace configuration for one tile (what the user writes):
Intermediate steps in the lowering, where the aie.trace description has been lowered to an aie.trace.config sequence of register writes, then to npu.write32:
Combo events:
Edge Events:
Depends on #2712 and #2696