Add event trace configuration to aie dialect #2705

Merged
fifield merged 50 commits into Xilinx:main from fifield:events_proposal
Mar 4, 2026

fifield (Collaborator) commented Nov 12, 2025

This PR is a proposal to add declarative event trace configuration to the AIE dialect.

The current implementation:

Input code to describe the event trace configuration for one tile (what the user writes):

aie.device(npu1_1col) {
  %tile_0_2 = aie.tile(0, 2)
  
    // Trace configuration for compute tile (0,2) - core events
    aie.trace @core_trace(%tile_0_2) {
      // Set trace mode (Event-Time captures timestamps)
      aie.trace.mode "Event-Time"

      // Configure packet routing (ID and type for packet-switched routing)
      aie.trace.packet id=1 type=core

      // Specify which events to capture (up to 8 events)
      aie.trace.event<"INSTR_EVENT_0">        // User event 0 (start marker)
      aie.trace.event<"INSTR_EVENT_1">        // User event 1 (end marker)
      aie.trace.event<"INSTR_VECTOR">         // Vector instructions
      aie.trace.event<"MEMORY_STALL">         // Memory access stalls
      aie.trace.event<"STREAM_STALL">         // Stream buffer stalls
      aie.trace.event<"LOCK_STALL">           // Lock acquisition stalls
      aie.trace.event<"PORT_RUNNING_0">       // Port slot 0 (DMA channel 0, S2MM) running
      aie.trace.event<"PORT_IDLE_1">          // Port slot 1 (DMA channel 0, MM2S) idle
      aie.trace.port<0> port=DMA channel=0 direction=S2MM
      aie.trace.port<1> port=DMA channel=0 direction=MM2S

      // Specify start/stop control (broadcast events)
      aie.trace.start event=<"BROADCAST_15">
      aie.trace.stop event=<"BROADCAST_14">
    }
  
  // Runtime sequence with trace invocation
  aiex.runtime_sequence @seq(%arg0: memref<32xi32>) {
    aie.trace.start_config @core_trace
    // ... other runtime operations
  }
}

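Intermediate steps of the lowering: the aie.trace description is first lowered to an aie.trace.config containing a sequence of register-field writes, the field writes are then packed into one write per register, and finally the packed writes are emitted as aiex.npu.write32 operations: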

  // Intermediate representation (after -aie-trace-to-config)
  aie.trace.config @core_trace_config(%tile_0_2) packet_type = core {
    aie.trace.reg register = "Trace_Control0" field = "Mode" value = 0 : i32 comment = "trace mode"
    aie.trace.reg register = "Trace_Control1" field = "ID" value = 1 : i32 comment = "packet ID"
    aie.trace.reg register = "Trace_Control1" field = "Packet_Type" value = 0 : i32 comment = "packet type"
    aie.trace.reg register = "Trace_Control0" field = "Trace_Start_Event" value = 122 : i32 comment = "start event"
    aie.trace.reg register = "Trace_Control0" field = "Trace_Stop_Event" value = 121 : i32 comment = "stop event"
    aie.trace.reg register = "Stream_Switch_Event_Port_Selection_0" field = "Port_0_ID" value = "DMA:0" comment = "port 0 ID"
    aie.trace.reg register = "Stream_Switch_Event_Port_Selection_0" field = "Port_0_Master_Slave" value = 1 : i32 comment = "port 0 master/slave"
    aie.trace.reg register = "Stream_Switch_Event_Port_Selection_0" field = "Port_1_ID" value = "DMA:0" comment = "port 1 ID"
    aie.trace.reg register = "Stream_Switch_Event_Port_Selection_0" field = "Port_1_Master_Slave" value = 0 : i32 comment = "port 1 master/slave"
    aie.trace.reg register = "Trace_Event0" field = "Trace_Event0" value = 33 : i32 comment = "event slot 0"
    aie.trace.reg register = "Trace_Event0" field = "Trace_Event1" value = 34 : i32 comment = "event slot 1"
    aie.trace.reg register = "Trace_Event0" field = "Trace_Event2" value = 37 : i32 comment = "event slot 2"
    aie.trace.reg register = "Trace_Event0" field = "Trace_Event3" value = 23 : i32 comment = "event slot 3"
    aie.trace.reg register = "Trace_Event1" field = "Trace_Event4" value = 24 : i32 comment = "event slot 4"
    aie.trace.reg register = "Trace_Event1" field = "Trace_Event5" value = 26 : i32 comment = "event slot 5"
    aie.trace.reg register = "Trace_Event1" field = "Trace_Event6" value = 79 : i32 comment = "event slot 6"
    aie.trace.reg register = "Trace_Event1" field = "Trace_Event7" value = 78 : i32 comment = "event slot 7"
  }

  // Intermediate representation (after -aie-trace-pack-reg-writes)
  aie.trace.config @core_trace_config(%tile_0_2) packet_type = core {
    aie.trace.reg register = "Trace_Control0" value = 2038038528 : i32 mask = 2139029507 comment = "trace mode + start event + stop event"
    aie.trace.reg register = "Trace_Control1" value = 1 : i32 mask = 28703 comment = "packet ID + packet type"
    aie.trace.reg register = "Stream_Switch_Event_Port_Selection_0" value = 289 : i32 mask = 16191 comment = "port 0 ID + port 0 master/slave + port 1 ID + port 1 master/slave"
    aie.trace.reg register = "Trace_Event0" value = 388309537 : i32 mask = 2139062143 comment = "event slot 0 + event slot 1 + event slot 2 + event slot 3"
    aie.trace.reg register = "Trace_Event1" value = 1313806872 : i32 mask = 2139062143 comment = "event slot 4 + event slot 5 + event slot 6 + event slot 7"
  }

  // Final output (after -aiex-inline-trace-config)
  aiex.runtime_sequence @seq(%arg0: memref<32xi32>) {
    aiex.npu.write32 {address = 213200 : ui32, column = 0 : i32, row = 2 : i32, value = 2038038528 : ui32}
    aiex.npu.write32 {address = 213204 : ui32, column = 0 : i32, row = 2 : i32, value = 1 : ui32}
    aiex.npu.write32 {address = 261888 : ui32, column = 0 : i32, row = 2 : i32, value = 289 : ui32}
    aiex.npu.write32 {address = 213216 : ui32, column = 0 : i32, row = 2 : i32, value = 388309537 : ui32}
    aiex.npu.write32 {address = 213220 : ui32, column = 0 : i32, row = 2 : i32, value = 1313806872 : ui32}
    // Additional npu.write32 for other registers...
  }
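
The packed values above can be re-derived by hand. The following is a minimal Python sketch of that arithmetic, not the pass's implementation. The field bit offsets and widths are assumptions inferred from the masks printed in the packed IR (for example, 2139029507 = 0x7F7F0003 for Trace_Control0), the port index 1 for "DMA:0" is inferred from the packed value 289, and the helper name `pack_fields` is hypothetical.

```python
def pack_fields(fields):
    """Merge (value, shift, width) field writes into one (value, mask) pair."""
    value, mask = 0, 0
    for field_value, shift, width in fields:
        if field_value >= (1 << width):
            raise ValueError("field value does not fit in its bitfield")
        field_mask = ((1 << width) - 1) << shift
        value |= field_value << shift
        mask |= field_mask
    return value, mask


# Assumed layouts: (value, bit offset, bit width), inferred from the masks above.
trace_control0 = pack_fields([
    (0, 0, 2),     # Mode
    (122, 16, 7),  # Trace_Start_Event (BROADCAST_15)
    (121, 24, 7),  # Trace_Stop_Event (BROADCAST_14)
])
trace_control1 = pack_fields([
    (1, 0, 5),     # ID
    (0, 12, 3),    # Packet_Type
])
port_selection0 = pack_fields([
    (1, 0, 5),     # Port_0_ID ("DMA:0", assumed to resolve to index 1)
    (1, 5, 1),     # Port_0_Master_Slave
    (1, 8, 5),     # Port_1_ID
    (0, 13, 1),    # Port_1_Master_Slave
])
trace_event0 = pack_fields([(33, 0, 7), (34, 8, 7), (37, 16, 7), (23, 24, 7)])
trace_event1 = pack_fields([(24, 0, 7), (26, 8, 7), (79, 16, 7), (78, 24, 7)])

for name, (value, mask) in [
    ("Trace_Control0", trace_control0),
    ("Trace_Control1", trace_control1),
    ("Stream_Switch_Event_Port_Selection_0", port_selection0),
    ("Trace_Event0", trace_event0),
    ("Trace_Event1", trace_event1),
]:
    print(f"{name}: value = {value}, mask = {mask}")
```

Under these assumed layouts, each pair reproduces the value and mask shown after -aie-trace-pack-reg-writes.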

Combo events:

aie.trace @my_trace(%tile02) {
  // Combo 0: lock stalled AND NOT DMA S2MM channel 0 stalled
  aie.trace.combo_event<0> "LOCK_STALL" AND_NOT "DMA_S2MM_0_STALLED"
  
  // Combo 1: instruction event OR vector operation
  aie.trace.combo_event<1> "INSTR_EVENT_0" OR "INSTR_VECTOR"
  
  // Combo 2: (combo0) AND (combo1)
  aie.trace.combo_event<2> "COMBO_EVENT_0" AND "COMBO_EVENT_1"
  
  // Trace the combo results
  aie.trace.event<"COMBO_EVENT_0">
  aie.trace.event<"COMBO_EVENT_1">
  aie.trace.event<"COMBO_EVENT_2">
  ...
}
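
As a rough illustration of the logic the three combo slots describe, here is a small Python sketch that treats each input event as a boolean. It models only the boolean composition from the snippet above; the function name `combo_events` is hypothetical, and how the hardware samples these signals is not modeled.

```python
def combo_events(lock_stall, dma_s2mm_0_stalled, instr_event_0, instr_vector):
    """Evaluate the three combo slots for one set of input event states."""
    combo0 = lock_stall and not dma_s2mm_0_stalled  # slot 0: A AND_NOT B
    combo1 = instr_event_0 or instr_vector          # slot 1: A OR B
    combo2 = combo0 and combo1                      # slot 2: COMBO_EVENT_0 AND COMBO_EVENT_1
    return combo0, combo1, combo2


# Lock stalled while the DMA is not stalled, with a vector instruction issued.
print(combo_events(True, False, False, True))
# Lock stalled, but the DMA is stalled too: combo 0 (and so combo 2) stays low.
print(combo_events(True, True, True, False))
```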

Edge Events:

aie.trace @my_trace(%tile02) {
  // Edge detector 0: count lock stalls (rising edges)
  aie.trace.edge_event<0> event="LOCK_STALL" trigger=RISING
  
  // Edge detector 1: count transitions (both edges)
  aie.trace.edge_event<1> event="LOCK_STALL" trigger=BOTH
  
  // Trace the edge-detected events
  aie.trace.event<"EDGE_DETECTION_EVENT_0">
  aie.trace.event<"EDGE_DETECTION_EVENT_1">
  ...
}
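
The difference between the two triggers can be sketched in Python on a sampled signal. This is an illustrative model only: `edge_events` is a hypothetical helper, and the FALLING trigger is an assumption added for symmetry (only RISING and BOTH appear in the snippet above).

```python
def edge_events(samples, trigger):
    """Return the sample indices at which an edge-detection event would fire."""
    if trigger not in ("RISING", "FALLING", "BOTH"):
        raise ValueError(f"unknown trigger: {trigger}")
    fired = []
    for i in range(1, len(samples)):
        prev, cur = bool(samples[i - 1]), bool(samples[i])
        rising = cur and not prev
        falling = prev and not cur
        if (rising and trigger in ("RISING", "BOTH")) or (
            falling and trigger in ("FALLING", "BOTH")
        ):
            fired.append(i)
    return fired


lock_stall = [0, 1, 1, 0, 1]  # sampled LOCK_STALL level over five cycles
rising_hits = edge_events(lock_stall, "RISING")
both_hits = edge_events(lock_stall, "BOTH")
print(rising_hits, both_hits)
```

With this input, the RISING detector fires at the start of each stall, while BOTH additionally fires when a stall ends.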

Depends on #2712 and #2696

fifield force-pushed the events_proposal branch 5 times, most recently from 6539334 to fc99d49 (November 17, 2025 22:32)
fifield mentioned this pull request Jan 15, 2026
fifield force-pushed the events_proposal branch 2 times, most recently from 427f90a to 30f035d (March 3, 2026 04:21)
fifield changed the title from "[proposal][wip] Add event trace configuration to mlir-aie" to "Add event trace configuration to aie dialect" Mar 3, 2026
fifield added 18 commits March 3, 2026 14:36
This reverts commit 431a705b637b7849dc6220c15e7a392dbd6835f0.
- Created AIETraceAttrs.td with TraceModeAttr, TracePacketTypeAttr, TraceEventAttr
- Created AIETraceOps.td with trace operations:
  - aie.trace (symbol operation)
  - aie.trace.mode, aie.trace.event, aie.trace.packet
  - aie.trace.start, aie.trace.stop
  - aie.trace.config, aie.trace.reg (intermediate ops)
  - aie.trace.start_config (runtime invocation)
- Implemented basic C++ verifiers in AIETraceOps.cpp
- Updated AIEAttrs.td and AIEOps.td to include trace definitions
- Updated lib/Dialect/AIE/IR/CMakeLists.txt to build AIETraceOps.cpp
- Added parsing test that validates operations parse correctly
- Test passes: operations parse and print correctly
- Created test_trace_verify.mlir with negative tests:
  - Too many events (>8)
  - Packet ID out of range (0 and 32)
  - Start/stop events missing parameters
  - Start/stop events with conflicting parameters
- All verifiers working correctly
- Tests pass successfully
- Added pass definitions to AIEPasses.td for all three trace passes
- Updated AIEPasses.h with pass creation function declarations
- Implemented AIETraceToConfig.cpp:
  - Converts aie.trace → aie.trace.config
  - Emits aie.trace.reg operations for each register field
  - Handles Trace_Control0 (mode, start/stop events)
  - Handles Trace_Control1 (packet ID, packet type)
  - Handles Trace_Event0/1 (event slots 0-7)
  - Updates trace.start_config symbol references
- Created stub implementations for AIEInlineTraceConfig and AIEConfigToNPU
- Added test_trace_to_config.mlir that validates transformation
- Test passes: trace ops correctly lowered to config ops
- Implemented AIEInlineTraceConfig.cpp:
  - Finds all aie.trace.start_config operations
  - Looks up referenced trace.config symbol
  - Clones all aie.trace.reg operations to call site
  - Removes trace.start_config invocation
- Relaxed parent constraint on aie.trace.reg to allow DeviceOp parent
  (needed for inlined reg ops)
- Added test_inline_trace_config.mlir
- Test passes: trace.reg ops successfully inlined
- Implemented AIEConfigToNPU.cpp with prototype stub
- Pass collects inlined trace.reg operations
- Placeholder for full implementation that would:
  - Load RegisterDatabase
  - Resolve register names to offsets
  - Encode bitfield values
  - Merge writes to same register
  - Generate aiex.npu.write32 operations
- Compiles successfully
- Created test_trace_end_to_end.mlir demonstrating complete pipeline
- Tests full transformation: aie.trace → aie.trace.config → inlined aie.trace.reg
- Validates:
  - High-level trace configuration with 4 events
  - Mode, packet routing, start/stop events
  - Correct lowering through both passes
  - Symbol references updated correctly
  - Register specifications generated for all fields
- Test passes: complete pipeline working end-to-end
- Enhanced aie.trace.reg to include optional tile operand
- Updated AIETraceToConfig to pass nullptr for tile (parent has it)
- Updated AIEInlineTraceConfig to pass tile reference when cloning
- Implemented full AIEConfigToNPU with:
  - RegisterDatabase loading and integration
  - Register name → offset resolution
  - Event name → event code resolution
  - Bitfield value encoding
  - Register write merging (multiple fields → single register)
  - Absolute address calculation
  - aiex.npu.write32 generation (when AIEX dialect available)
- Pass validates register/field lookups work correctly
- Demonstrates complete lowering pipeline infrastructure

Note: NPU write generation requires AIEX dialect to be pre-loaded.
This will be addressed in production integration.
PROBLEM FIXED:
- trace.reg with 'for %tile' lost col/row information
- Inlined trace.reg at device level was fragile

SOLUTION:
- Removed tile operand from trace.reg (now only in trace.config)
- Simplified parent constraint: trace.reg only in TraceConfigOp
- Moved RegisterDatabase integration from Pass 3 to Pass 2
- Pass 2 now generates npu.write32 directly with col/row from tile
- Pass 3 is now a no-op (kept for extensibility)

BENEFITS:
- Col/row extracted immediately during inlining (not lost)
- Cleaner IR (no intermediate trace.reg at device level)
- Two-pass pipeline instead of three
- npu.write32 has explicit column/row attributes

This fixes the architectural issue identified in code review.
PROBLEM: AIEX dialect couldn't be loaded during AIE pass execution

SOLUTION: Move NPU-generating passes to AIEX dialect where they belong
- Moved AIEInlineTraceConfig.cpp to lib/Dialect/AIEX/Transforms/
- Moved AIEConfigToNPU.cpp to lib/Dialect/AIEX/Transforms/
- Updated pass definitions in AIEXPasses.td
- Removed from AIEPasses.td
- Updated CMakeLists for both dialects
- Updated pass registration headers
- Fixed namespaces (AIEX, not AIE)

RESULT: npu.write32 generation now works!
- Pass renamed: aie-inline-trace-config → aiex-inline-trace-config
- Pass renamed: aie-config-to-npu → aiex-config-to-npu
- Col/row preserved in npu.write32 operations
- RegisterDatabase integration functional
- Bitfield merging working

Example output:
aiex.npu.write32 {address=0xB40D0, column=0, row=2, value=0x1E2E0001}

This is the correct architectural placement: AIEX depends on AIE.
- Moved test_inline_trace_config.mlir to test/Dialect/AIEX/trace/
- Moved test_trace_end_to_end.mlir to test/Dialect/AIEX/trace/
- Updated pass name: aie-inline-trace-config → aiex-inline-trace-config
- Added CHECK for aiex.npu.write32 operations
- Verified col/row attributes are preserved
- Both tests pass successfully

Test organization:
- AIE tests: parse, verify, trace-to-config (AIE dialect operations)
- AIEX tests: inline-trace-config, end-to-end (NPU generation)

This reflects the correct architectural separation.
- Added aiex.runtime_sequence wrapper to both tests
- Shows proper usage: trace.start_config inside runtime_sequence
- Tests validate npu.write32 generation within runtime context
- All tests passing with correct architectural pattern

This demonstrates the intended usage pattern where trace configuration
is invoked from within a runtime sequence.
Implement aie.trace.port operation for hardware stream switch port monitoring:

- Extend AIETargetModel with port mapping API (getStreamSwitchPortIndex, isValidStreamSwitchPort)
- Add AIE_TracePortOp to AIETraceOps.td with slot (0-7), port, channel, master attributes
- Implement TracePortOp::verify() with duplicate slot detection and port validation
- Extend RegisterDatabase with resolvePortValue() for PORT:CHANNEL string parsing
- Update AIETraceToConfig to process TracePortOp and generate register writes
- Update AIETracePackRegWrites to resolve PORT:CHANNEL to hardware indices
- Add comprehensive test suite: parse, verify, lowering, end-to-end tests

Target registers: Stream_Switch_Event_Port_Selection_0/1
Enables monitoring of up to 8 stream switch ports with PORT_RUNNING, PORT_IDLE, PORT_STALLED, PORT_TLAST events.

Note: Current implementation uses stub port mappings - actual hardware tables needed.
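
The PORT:CHANNEL resolution described in the last commit can be sketched as follows. This is a minimal Python model, not the C++ resolvePortValue() API: the function `resolve_port_value` and the table `STUB_PORT_INDEX` are hypothetical, and the table entries are placeholders in the same spirit as the stub mappings noted above (the entry for DMA channel 0 matches the index implied by the packed value 289 in the PR description).

```python
# Hypothetical stub table: (port name, channel) -> stream switch port index.
STUB_PORT_INDEX = {
    ("DMA", 0): 1,
    ("DMA", 1): 2,  # placeholder value, not taken from hardware tables
}


def resolve_port_value(spec, table=STUB_PORT_INDEX):
    """Parse a 'PORT:CHANNEL' string and look up its port index."""
    name, sep, channel = spec.partition(":")
    if not sep or not channel.isdigit():
        raise ValueError(f"expected PORT:CHANNEL, got {spec!r}")
    key = (name.upper(), int(channel))
    if key not in table:
        raise ValueError(f"unknown stream switch port {spec!r}")
    return table[key]


print(resolve_port_value("DMA:0"))
```

Malformed strings such as "DMA" and ports missing from the table raise ValueError, roughly mirroring the diagnostics a verifier or lowering pass would emit.
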
fifield marked this pull request as ready for review March 3, 2026 21:36
Copilot AI review requested due to automatic review settings March 3, 2026 21:36

Copilot AI left a comment

Pull request overview

This PR proposes a declarative event-trace configuration API in the AIE dialect (aie.trace and related ops), plus a lowering pipeline that converts those declarations into register writes and finally inlines them as aiex.npu.write32/maskwrite32-style operations in runtime sequences.

Changes:

  • Adds new AIE trace ops/attrs (trace mode, packet routing, events, ports, combo/edge events, and runtime trace.start_config) with parsing/verification support.
  • Introduces lowering passes: -aie-trace-to-config, -aie-trace-pack-reg-writes, and -aie-inline-trace-config.
  • Adds MLIR tests and a full programming example (MLIR + host code + trace visualization tooling).

Reviewed changes

Copilot reviewed 48 out of 48 changed files in this pull request and generated 14 comments.

| File | Description |
| --- | --- |
| utils/generate_events_tablegen.py | Adjusts AIE1 event enum naming suffix generation used by TableGen. |
| test/dialect/AIEX/trace/test_trace_end_to_end.mlir | End-to-end trace lowering/inlining test for AIEX pipeline. |
| test/dialect/AIEX/trace/test_inline_trace_config_verify.mlir | Verifies diagnostics when inlining is run without packing pass. |
| test/dialect/AIEX/trace/test_inline_trace_config.mlir | Tests inlining trace config into runtime sequence. |
| test/dialect/AIE/trace/test_trace_verify.mlir | Verifier coverage for trace op constraints and typed-enum arch mismatches. |
| test/dialect/AIE/trace/test_trace_to_config_verify.mlir | Verifies lowering errors for unknown events during trace-to-config. |
| test/dialect/AIE/trace/test_trace_to_config.mlir | Checks aie-trace-to-config output register-field writes. |
| test/dialect/AIE/trace/test_trace_parse.mlir | Parser/pretty-printer coverage for trace ops. |
| test/Dialect/AIEX/trace/test_trace_port_end_to_end.mlir | End-to-end test for port tracing lowering/inlining. |
| test/Dialect/AIE/trace/test_trace_port_verify.mlir | Verifier coverage for port slot/channel constraints and duplicates. |
| test/Dialect/AIE/trace/test_trace_port_to_config.mlir | Checks port configuration lowering to aie.trace.reg field writes. |
| test/Dialect/AIE/trace/test_trace_port_parse.mlir | Parser/pretty-printer coverage for trace port syntax. |
| test/Dialect/AIE/combo_edge/test_edge_to_config.mlir | Checks edge-event lowering to register writes. |
| test/Dialect/AIE/combo_edge/test_combo_to_config.mlir | Checks combo-event lowering to register writes. |
| test/Dialect/AIE/combo_edge/test_combo_event_verify.mlir | Verifier coverage for combo/edge slot rules and invalid sources. |
| test/Dialect/AIE/combo_edge/test_combo_event_parse.mlir | Parser/pretty-printer coverage for combo/edge event syntax. |
| test/Dialect/AIE/combo_edge/test_combo_edge_full.mlir | Integrated lowering test combining combo + edge + standard trace config. |
| test/CppTests/register_database.cpp | Minor formatting tweaks in register database unit tests. |
| python/dialects/aie.py | Adds Python attribute builder for TraceEventAttr and a trace region-op helper. |
| python/compiler/aiecc/main.py | Adds trace lowering passes into the default aie.device pipeline. |
| programming_examples/basic/event_trace/visualize_trace.py | New script to render timeline PNG from parsed trace JSON. |
| programming_examples/basic/event_trace/vector_scalar_mul.cc | New kernel emitting event markers for tracing. |
| programming_examples/basic/event_trace/test.py | New Python host runner that captures and dumps trace. |
| programming_examples/basic/event_trace/test.cpp | New C++ host runner wired into repo test utilities. |
| programming_examples/basic/event_trace/run_strix_makefile.lit | Lit recipe to run the example on NPU2 (“strix”). |
| programming_examples/basic/event_trace/run_makefile.lit | Lit recipe to run the example on NPU1. |
| programming_examples/basic/event_trace/aie_trace.mlir | Full example MLIR using the new declarative trace syntax. |
| programming_examples/basic/event_trace/README.md | Documentation for the example and lowering pipeline. |
| programming_examples/basic/event_trace/Makefile | Build/run orchestration for the example (kernel, xclbin, host, parsing, viz). |
| programming_examples/basic/event_trace/CMakeLists.txt | Builds the example host executable against XRT and test utils. |
| lib/Dialect/AIEX/Transforms/CMakeLists.txt | Registers the new AIEX inline-trace-config transform source. |
| lib/Dialect/AIEX/Transforms/AIEInlineTraceConfig.cpp | New pass that inlines trace configs into runtime sequences as NPU writes. |
| lib/Dialect/AIE/Transforms/CMakeLists.txt | Registers new AIE trace-to-config transform source. |
| lib/Dialect/AIE/Transforms/AIETraceToConfig.cpp | New passes: trace-to-config lowering + packing/merging register writes. |
| lib/Dialect/AIE/IR/CMakeLists.txt | Adds trace ops implementation file to AIE IR library build. |
| lib/Dialect/AIE/IR/AIETraceOps.cpp | Implements parsing/printing/verification for trace ops/attrs. |
| lib/Dialect/AIE/IR/AIETargetModel.cpp | Adds event-module mapping tweaks and helpers for masks/port resolution. |
| lib/Dialect/AIE/IR/AIEDialect.cpp | Hooks custom parse/print helpers for trace-related attribute/value forms. |
| include/aie/Dialect/AIEX/Transforms/AIEXPasses.td | Declares aie-inline-trace-config pass. |
| include/aie/Dialect/AIEX/Transforms/AIEXPasses.h | Adds factory declaration for inline-trace-config pass. |
| include/aie/Dialect/AIE/Transforms/AIEPasses.td | Declares aie-trace-to-config and aie-trace-pack-reg-writes passes. |
| include/aie/Dialect/AIE/Transforms/AIEPasses.h | Adds factory declarations for new AIE trace passes. |
| include/aie/Dialect/AIE/IR/AIETraceOps.td | ODS/TableGen definitions for trace ops. |
| include/aie/Dialect/AIE/IR/AIETraceAttrs.td | ODS/TableGen definitions for trace attrs/enums. |
| include/aie/Dialect/AIE/IR/AIETargetModel.h | Adds APIs for stream-switch port index, field mask, and port resolution. |
| include/aie/Dialect/AIE/IR/AIEOps.td | Includes trace ops definitions into the AIE op set. |
| include/aie/Dialect/AIE/IR/AIEDialect.h | Exposes shared trace event parse/print helpers. |
| include/aie/Dialect/AIE/IR/AIEAttrs.td | Includes trace attrs and updates event enum include naming. |

Comment on lines +95 to +111
// Extract value (mask is discarded)
uint32_t value = 0;
if (auto intAttr = llvm::dyn_cast<IntegerAttr>(regOp.getValue())) {
  value = intAttr.getInt();
} else {
  regOp.emitError("value must be an integer after packing");
  return signalPassFailure();
}

// Generate aiex.npu.write32 operation with col/row
builder.create<AIEX::NpuWrite32Op>(
    regOp.getLoc(), builder.getUI32IntegerAttr(regInfo->offset),
    builder.getUI32IntegerAttr(value),
    nullptr,                        // buffer
    builder.getI32IntegerAttr(col), // column
    builder.getI32IntegerAttr(row)  // row
);

Copilot AI Mar 3, 2026

aie-inline-trace-config currently discards the mask on packed aie.trace.reg writes and always emits aiex.npu.write32. If mask is not 0xFFFFFFFF, this will clobber unrelated bits in the target register. Consider emitting aiex.npu.maskwrite32 when regOp.getMask() is present (and only using write32 when the mask is full).

fifield (Collaborator, Author) replied:

This is a good catch, but for the record this was an intentional design decision. It ends up producing the same code as one would write by hand, instead of more costly maskwrite32 (rmw) instructions masking off useless bits.
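
The trade-off discussed in this thread can be illustrated with a short Python sketch. It is a model under stated assumptions, not the NPU instruction semantics: `write32` and `maskwrite32` here are hypothetical helpers, with maskwrite32 assumed to behave as a read-modify-write that only touches the bits selected by the mask.

```python
MASK32 = 0xFFFFFFFF


def write32(old, value):
    """Full-register write: every bit of the old contents is replaced."""
    return value & MASK32


def maskwrite32(old, value, mask):
    """Assumed read-modify-write: only the bits selected by mask change."""
    return ((old & ~mask) | (value & mask)) & MASK32


# Packed Trace_Control1 write from the PR description: value 1, mask 28703 (0x701F).
value, mask = 1, 0x701F

# If the register holds no other state, both forms give the same result.
same_from_reset = write32(0, value) == maskwrite32(0, value, mask)

# If a bit outside the mask was already set, write32 clears it; maskwrite32 keeps it.
old = 0x00010000
full_result = write32(old, value)
masked_result = maskwrite32(old, value, mask)
print(same_from_reset, hex(full_result), hex(masked_result))
```

The two forms differ only when bits outside the mask hold state that matters, which is the case the review comment warns about and the reply treats as not applicable to these trace registers.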

fifield enabled auto-merge March 4, 2026 21:33
fifield added this pull request to the merge queue Mar 4, 2026
Merged via the queue into Xilinx:main with commit e2e9c5f Mar 4, 2026
59 of 60 checks passed
jgmelber added a commit that referenced this pull request Mar 5, 2026
Add aie-trace-to-config, aie-trace-pack-reg-writes, and
aie-inline-trace-config to runResourceAllocationPipeline(), matching
the ordering from PR #2705 which added them to the Python aiecc
main.py pipeline before aie-assign-lock-ids. These passes were never
ported to the C++ driver, causing the event_trace programming example
to fail.

Co-Authored-By: Claude Opus 4.6 <[email protected]>