@cofibrant
Contributor

Adds ISel patterns for the LDAPURS* instructions. With these patterns in place, the compiler can infer that the instructions may load, instead of treating them as having unmodeled side effects.

Related to #171142
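
For orientation, here is a minimal C++ sketch of source code that would be expected to produce the kind of IR the new tests exercise: a narrow acquire load at a small constant offset, followed by a sign extension. The function names are hypothetical and not part of the patch, and whether a given compiler selects an LDAPURS* instruction for them depends on the target features and on the UseLDAPUR predicate.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical helpers (illustrative names, not from the patch). Each one
// performs an acquire load of a narrow atomic at element index 4 and
// sign-extends the result on return, mirroring the shape of IR functions
// such as load_atomic_i8_aligned_acquire_sext_i32.

// i8 acquire load at byte offset 4, then sext i8 -> i32.
std::int32_t load_i8_acquire_sext_i32(const std::atomic<std::int8_t> *p) {
  return p[4].load(std::memory_order_acquire);
}

// i16 acquire load at byte offset 8, then sext i16 -> i64.
std::int64_t load_i16_acquire_sext_i64(const std::atomic<std::int16_t> *p) {
  return p[4].load(std::memory_order_acquire);
}

// i32 acquire load at byte offset 16, then sext i32 -> i64.
std::int64_t load_i32_acquire_sext_i64(const std::atomic<std::int32_t> *p) {
  return p[4].load(std::memory_order_acquire);
}
```

Without the new patterns, the SelectionDAG path would select a plain acquire load followed by a separate sxtb/sxth/sxtw for these shapes; with them, the load and the extension can fold into one instruction.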

@llvmbot
Member

llvmbot commented Dec 11, 2025

@llvm/pr-subscribers-backend-aarch64

Author: Nathan Corbyn (cofibrant)


Patch is 51.24 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/171788.diff

8 Files Affected:

  • (modified) llvm/lib/Target/AArch64/AArch64InstrAtomics.td (+22)
  • (modified) llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-rcpc_immo.ll (+307)
  • (modified) llvm/test/tools/llvm-mca/AArch64/Neoverse/N2-rcpc-immo-instructions.s (+19-19)
  • (modified) llvm/test/tools/llvm-mca/AArch64/Neoverse/N3-rcpc-immo-instructions.s (+19-19)
  • (modified) llvm/test/tools/llvm-mca/AArch64/Neoverse/V1-rcpc-immo-instructions.s (+19-19)
  • (modified) llvm/test/tools/llvm-mca/AArch64/Neoverse/V2-rcpc-immo-instructions.s (+19-19)
  • (modified) llvm/test/tools/llvm-mca/AArch64/Neoverse/V3-rcpc-immo-instructions.s (+19-19)
  • (modified) llvm/test/tools/llvm-mca/AArch64/Neoverse/V3AE-rcpc-immo-instructions.s (+19-19)
diff --git a/llvm/lib/Target/AArch64/AArch64InstrAtomics.td b/llvm/lib/Target/AArch64/AArch64InstrAtomics.td
index 5d9215dd71233..32a86cbbff18c 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrAtomics.td
+++ b/llvm/lib/Target/AArch64/AArch64InstrAtomics.td
@@ -602,6 +602,8 @@ let Predicates = [HasRCPC3, HasNEON] in {
 // v8.4a FEAT_LRCPC2 patterns
 let Predicates = [HasRCPC_IMMO, UseLDAPUR] in {
   // Load-Acquire RCpc Register unscaled loads
+
+  // Zero-extended
   def : Pat<(acquiring_load<atomic_load_azext_8>
                (am_unscaled8 GPR64sp:$Rn, simm9:$offset)),
           (LDAPURBi GPR64sp:$Rn, simm9:$offset)>;
@@ -614,6 +616,26 @@ let Predicates = [HasRCPC_IMMO, UseLDAPUR] in {
   def : Pat<(acquiring_load<atomic_load_nonext_64>
                (am_unscaled64 GPR64sp:$Rn, simm9:$offset)),
           (LDAPURXi GPR64sp:$Rn, simm9:$offset)>;
+
+  // Sign-extended
+  def : Pat<(sext_inreg (acquiring_load<atomic_load_8>
+                (am_unscaled8 GPR64sp:$Rn, simm9:$offset)), i8),
+          (LDAPURSBWi GPR64sp:$Rn, i64:$offset)>;
+  def : Pat<(sext_inreg (i64 (anyext (i32
+                (acquiring_load<atomic_load_8>
+                    (am_unscaled8 GPR64sp:$Rn, simm9:$offset))))), i8),
+          (LDAPURSBXi GPR64sp:$Rn, i64:$offset)>;
+  def : Pat<(sext_inreg (acquiring_load<atomic_load_16>
+                (am_unscaled16 GPR64sp:$Rn, simm9:$offset)), i16),
+          (LDAPURSHWi GPR64sp:$Rn, i64:$offset)>;
+  def : Pat<(sext_inreg (i64 (anyext (i32
+                (acquiring_load<atomic_load_16>
+                    (am_unscaled16 GPR64sp:$Rn, simm9:$offset))))), i16),
+          (LDAPURSHXi GPR64sp:$Rn, i64:$offset)>;
+  def : Pat<(i64 (sext (i32
+                (acquiring_load<atomic_load_32>
+                    (am_unscaled32 GPR64sp:$Rn, simm9:$offset))))),
+          (LDAPURSWi GPR64sp:$Rn, i64:$offset)>;
 }
 
 let Predicates = [HasRCPC_IMMO] in {
diff --git a/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-rcpc_immo.ll b/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-rcpc_immo.ll
index 02ff12c27fcda..572f8215b2211 100644
--- a/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-rcpc_immo.ll
+++ b/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-rcpc_immo.ll
@@ -59,6 +59,44 @@ define i8 @load_atomic_i8_aligned_acquire(ptr %ptr) {
     ret i8 %r
 }
 
+define i32 @load_atomic_i8_aligned_acquire_sext_i32(ptr %ptr) {
+; GISEL-LABEL: load_atomic_i8_aligned_acquire_sext_i32:
+; GISEL:    add x8, x0, #4
+; GISEL:    ldaprb w8, [x8]
+; GISEL:    sxtb w0, w8
+;
+; SDAG-AVOIDLDAPUR-LABEL: load_atomic_i8_aligned_acquire_sext_i32:
+; SDAG-AVOIDLDAPUR:    add x8, x0, #4
+; SDAG-AVOIDLDAPUR:    ldaprb w8, [x8]
+; SDAG-AVOIDLDAPUR:    sxtb w0, w8
+;
+; SDAG-NOAVOIDLDAPUR-LABEL: load_atomic_i8_aligned_acquire_sext_i32:
+; SDAG-NOAVOIDLDAPUR:    ldapursb w0, [x0, #4]
+    %gep = getelementptr inbounds i8, ptr %ptr, i32 4
+    %r = load atomic i8, ptr %gep acquire, align 1
+    %r.sext = sext i8 %r to i32
+    ret i32 %r.sext
+}
+
+define i64 @load_atomic_i8_aligned_acquire_sext_i64(ptr %ptr) {
+; GISEL-LABEL: load_atomic_i8_aligned_acquire_sext_i64:
+; GISEL:    add x8, x0, #4
+; GISEL:    ldaprb w8, [x8]
+; GISEL:    sxtb x0, w8
+;
+; SDAG-AVOIDLDAPUR-LABEL: load_atomic_i8_aligned_acquire_sext_i64:
+; SDAG-AVOIDLDAPUR:    add x8, x0, #4
+; SDAG-AVOIDLDAPUR:    ldaprb w8, [x8]
+; SDAG-AVOIDLDAPUR:    sxtb x0, w8
+;
+; SDAG-NOAVOIDLDAPUR-LABEL: load_atomic_i8_aligned_acquire_sext_i64:
+; SDAG-NOAVOIDLDAPUR:    ldapursb x0, [x0, #4]
+    %gep = getelementptr inbounds i8, ptr %ptr, i32 4
+    %r = load atomic i8, ptr %gep acquire, align 1
+    %r.sext = sext i8 %r to i64
+    ret i64 %r.sext
+}
+
 define i8 @load_atomic_i8_aligned_acquire_const(ptr readonly %ptr) {
 ; GISEL-LABEL: load_atomic_i8_aligned_acquire_const:
 ; GISEL:    add x8, x0, #4
@@ -75,6 +113,44 @@ define i8 @load_atomic_i8_aligned_acquire_const(ptr readonly %ptr) {
     ret i8 %r
 }
 
+define i32 @load_atomic_i8_aligned_acquire_const_sext_i32(ptr readonly %ptr) {
+; GISEL-LABEL: load_atomic_i8_aligned_acquire_const_sext_i32:
+; GISEL:    add x8, x0, #4
+; GISEL:    ldaprb w8, [x8]
+; GISEL:    sxtb w0, w8
+;
+; SDAG-AVOIDLDAPUR-LABEL: load_atomic_i8_aligned_acquire_const_sext_i32:
+; SDAG-AVOIDLDAPUR:    add x8, x0, #4
+; SDAG-AVOIDLDAPUR:    ldaprb w8, [x8]
+; SDAG-AVOIDLDAPUR:    sxtb w0, w8
+;
+; SDAG-NOAVOIDLDAPUR-LABEL: load_atomic_i8_aligned_acquire_const_sext_i32:
+; SDAG-NOAVOIDLDAPUR:    ldapursb w0, [x0, #4]
+    %gep = getelementptr inbounds i8, ptr %ptr, i32 4
+    %r = load atomic i8, ptr %gep acquire, align 1
+    %r.sext = sext i8 %r to i32
+    ret i32 %r.sext
+}
+
+define i64 @load_atomic_i8_aligned_acquire_const_sext_i64(ptr readonly %ptr) {
+; GISEL-LABEL: load_atomic_i8_aligned_acquire_const_sext_i64:
+; GISEL:    add x8, x0, #4
+; GISEL:    ldaprb w8, [x8]
+; GISEL:    sxtb x0, w8
+;
+; SDAG-AVOIDLDAPUR-LABEL: load_atomic_i8_aligned_acquire_const_sext_i64:
+; SDAG-AVOIDLDAPUR:    add x8, x0, #4
+; SDAG-AVOIDLDAPUR:    ldaprb w8, [x8]
+; SDAG-AVOIDLDAPUR:    sxtb x0, w8
+;
+; SDAG-NOAVOIDLDAPUR-LABEL: load_atomic_i8_aligned_acquire_const_sext_i64:
+; SDAG-NOAVOIDLDAPUR:    ldapursb x0, [x0, #4]
+    %gep = getelementptr inbounds i8, ptr %ptr, i32 4
+    %r = load atomic i8, ptr %gep acquire, align 1
+    %r.sext = sext i8 %r to i64
+    ret i64 %r.sext
+}
+
 define i8 @load_atomic_i8_aligned_seq_cst(ptr %ptr) {
 ; CHECK-LABEL: load_atomic_i8_aligned_seq_cst:
 ; CHECK:    add x8, x0, #4
@@ -141,6 +217,44 @@ define i16 @load_atomic_i16_aligned_acquire(ptr %ptr) {
     ret i16 %r
 }
 
+define i32 @load_atomic_i16_aligned_acquire_sext_i32(ptr %ptr) {
+; GISEL-LABEL: load_atomic_i16_aligned_acquire_sext_i32:
+; GISEL:    add x8, x0, #8
+; GISEL:    ldaprh w8, [x8]
+; GISEL:    sxth w0, w8
+;
+; SDAG-AVOIDLDAPUR-LABEL: load_atomic_i16_aligned_acquire_sext_i32:
+; SDAG-AVOIDLDAPUR:    add x8, x0, #8
+; SDAG-AVOIDLDAPUR:    ldaprh w8, [x8]
+; SDAG-AVOIDLDAPUR:    sxth w0, w8
+;
+; SDAG-NOAVOIDLDAPUR-LABEL: load_atomic_i16_aligned_acquire_sext_i32:
+; SDAG-NOAVOIDLDAPUR:    ldapursh w0, [x0, #8]
+    %gep = getelementptr inbounds i16, ptr %ptr, i32 4
+    %r = load atomic i16, ptr %gep acquire, align 2
+    %r.sext = sext i16 %r to i32
+    ret i32 %r.sext
+}
+
+define i64 @load_atomic_i16_aligned_acquire_sext_i64(ptr %ptr) {
+; GISEL-LABEL: load_atomic_i16_aligned_acquire_sext_i64:
+; GISEL:    add x8, x0, #8
+; GISEL:    ldaprh w8, [x8]
+; GISEL:    sxth x0, w8
+;
+; SDAG-AVOIDLDAPUR-LABEL: load_atomic_i16_aligned_acquire_sext_i64:
+; SDAG-AVOIDLDAPUR:    add x8, x0, #8
+; SDAG-AVOIDLDAPUR:    ldaprh w8, [x8]
+; SDAG-AVOIDLDAPUR:    sxth x0, w8
+;
+; SDAG-NOAVOIDLDAPUR-LABEL: load_atomic_i16_aligned_acquire_sext_i64:
+; SDAG-NOAVOIDLDAPUR:    ldapursh x0, [x0, #8]
+    %gep = getelementptr inbounds i16, ptr %ptr, i32 4
+    %r = load atomic i16, ptr %gep acquire, align 2
+    %r.sext = sext i16 %r to i64
+    ret i64 %r.sext
+}
+
 define i16 @load_atomic_i16_aligned_acquire_const(ptr readonly %ptr) {
 ; GISEL-LABEL: load_atomic_i16_aligned_acquire_const:
 ; GISEL:    add x8, x0, #8
@@ -157,6 +271,44 @@ define i16 @load_atomic_i16_aligned_acquire_const(ptr readonly %ptr) {
     ret i16 %r
 }
 
+define i32 @load_atomic_i16_aligned_acquire_const_sext_i32(ptr readonly %ptr) {
+; GISEL-LABEL: load_atomic_i16_aligned_acquire_const_sext_i32:
+; GISEL:    add x8, x0, #8
+; GISEL:    ldaprh w8, [x8]
+; GISEL:    sxth w0, w8
+;
+; SDAG-AVOIDLDAPUR-LABEL: load_atomic_i16_aligned_acquire_const_sext_i32:
+; SDAG-AVOIDLDAPUR:    add x8, x0, #8
+; SDAG-AVOIDLDAPUR:    ldaprh w8, [x8]
+; SDAG-AVOIDLDAPUR:    sxth w0, w8
+;
+; SDAG-NOAVOIDLDAPUR-LABEL: load_atomic_i16_aligned_acquire_const_sext_i32:
+; SDAG-NOAVOIDLDAPUR:    ldapursh w0, [x0, #8]
+    %gep = getelementptr inbounds i16, ptr %ptr, i32 4
+    %r = load atomic i16, ptr %gep acquire, align 2
+    %r.sext = sext i16 %r to i32
+    ret i32 %r.sext
+}
+
+define i64 @load_atomic_i16_aligned_acquire_const_sext_i64(ptr readonly %ptr) {
+; GISEL-LABEL: load_atomic_i16_aligned_acquire_const_sext_i64:
+; GISEL:    add x8, x0, #8
+; GISEL:    ldaprh w8, [x8]
+; GISEL:    sxth x0, w8
+;
+; SDAG-AVOIDLDAPUR-LABEL: load_atomic_i16_aligned_acquire_const_sext_i64:
+; SDAG-AVOIDLDAPUR:    add x8, x0, #8
+; SDAG-AVOIDLDAPUR:    ldaprh w8, [x8]
+; SDAG-AVOIDLDAPUR:    sxth x0, w8
+;
+; SDAG-NOAVOIDLDAPUR-LABEL: load_atomic_i16_aligned_acquire_const_sext_i64:
+; SDAG-NOAVOIDLDAPUR:    ldapursh x0, [x0, #8]
+    %gep = getelementptr inbounds i16, ptr %ptr, i32 4
+    %r = load atomic i16, ptr %gep acquire, align 2
+    %r.sext = sext i16 %r to i64
+    ret i64 %r.sext
+}
+
 define i16 @load_atomic_i16_aligned_seq_cst(ptr %ptr) {
 ; CHECK-LABEL: load_atomic_i16_aligned_seq_cst:
 ; CHECK:    add x8, x0, #8
@@ -222,6 +374,24 @@ define i32 @load_atomic_i32_aligned_acquire(ptr %ptr) {
     ret i32 %r
 }
 
+define i64 @load_atomic_i32_aligned_acquire_sext_i64(ptr %ptr) {
+; GISEL-LABEL: load_atomic_i32_aligned_acquire_sext_i64:
+; GISEL:    ldapur w8, [x0, #16]
+; GISEL:    ldapursw x0, [x0, #16]
+;
+; SDAG-AVOIDLDAPUR-LABEL: load_atomic_i32_aligned_acquire_sext_i64:
+; SDAG-AVOIDLDAPUR:    add x8, x0, #16
+; SDAG-AVOIDLDAPUR:    ldapr w8, [x8]
+; SDAG-AVOIDLDAPUR:    sxtw x0, w8
+;
+; SDAG-NOAVOIDLDAPUR-LABEL: load_atomic_i32_aligned_acquire_sext_i64:
+; SDAG-NOAVOIDLDAPUR:    ldapursw x0, [x0, #16]
+    %gep = getelementptr inbounds i32, ptr %ptr, i32 4
+    %r = load atomic i32, ptr %gep acquire, align 4
+    %r.sext = sext i32 %r to i64
+    ret i64 %r.sext
+}
+
 define i32 @load_atomic_i32_aligned_acquire_const(ptr readonly %ptr) {
 ; GISEL-LABEL: load_atomic_i32_aligned_acquire_const:
 ; GISEL:    ldapur w0, [x0, #16]
@@ -237,6 +407,24 @@ define i32 @load_atomic_i32_aligned_acquire_const(ptr readonly %ptr) {
     ret i32 %r
 }
 
+define i64 @load_atomic_i32_aligned_acquire_const_sext_i64(ptr readonly %ptr) {
+; GISEL-LABEL: load_atomic_i32_aligned_acquire_const_sext_i64:
+; GISEL:    ldapur w8, [x0, #16]
+; GISEL:    ldapursw x0, [x0, #16]
+;
+; SDAG-AVOIDLDAPUR-LABEL: load_atomic_i32_aligned_acquire_const_sext_i64:
+; SDAG-AVOIDLDAPUR:    add x8, x0, #16
+; SDAG-AVOIDLDAPUR:    ldapr w8, [x8]
+; SDAG-AVOIDLDAPUR:    sxtw x0, w8
+;
+; SDAG-NOAVOIDLDAPUR-LABEL: load_atomic_i32_aligned_acquire_const_sext_i64:
+; SDAG-NOAVOIDLDAPUR:    ldapursw x0, [x0, #16]
+    %gep = getelementptr inbounds i32, ptr %ptr, i32 4
+    %r = load atomic i32, ptr %gep acquire, align 4
+    %r.sext = sext i32 %r to i64
+    ret i64 %r.sext
+}
+
 define i32 @load_atomic_i32_aligned_seq_cst(ptr %ptr) {
 ; CHECK-LABEL: load_atomic_i32_aligned_seq_cst:
 ; CHECK:    add x8, x0, #16
@@ -922,6 +1110,54 @@ define i8 @load_atomic_i8_from_gep() {
   ret i8 %l
 }
 
+define i32 @load_atomic_i8_from_gep_sext_i32() {
+; GISEL-LABEL: load_atomic_i8_from_gep_sext_i32:
+; GISEL:    bl init
+; GISEL:    add x8, x8, #1
+; GISEL:    ldaprb w8, [x8]
+; GISEL:    sxtb w0, w8
+;
+; SDAG-AVOIDLDAPUR-LABEL: load_atomic_i8_from_gep_sext_i32:
+; SDAG-AVOIDLDAPUR:    bl init
+; SDAG-AVOIDLDAPUR:    orr x8, x19, #0x1
+; SDAG-AVOIDLDAPUR:    ldaprb w8, [x8]
+; SDAG-AVOIDLDAPUR:    sxtb w0, w8
+;
+; SDAG-NOAVOIDLDAPUR-LABEL: load_atomic_i8_from_gep_sext_i32:
+; SDAG-NOAVOIDLDAPUR:    bl init
+; SDAG-NOAVOIDLDAPUR:    ldapursb w0, [sp, #13]
+  %a = alloca [3 x i8]
+  call void @init(ptr %a)
+  %arrayidx  = getelementptr [3 x i8], ptr %a, i64 0, i64 1
+  %l = load atomic i8, ptr %arrayidx acquire, align 8
+  %l.sext = sext i8 %l to i32
+  ret i32 %l.sext
+}
+
+define i64 @load_atomic_i8_from_gep_sext_i64() {
+; GISEL-LABEL: load_atomic_i8_from_gep_sext_i64:
+; GISEL:    bl init
+; GISEL:    add x8, x8, #1
+; GISEL:    ldaprb w8, [x8]
+; GISEL:    sxtb x0, w8
+;
+; SDAG-AVOIDLDAPUR-LABEL: load_atomic_i8_from_gep_sext_i64:
+; SDAG-AVOIDLDAPUR:    bl init
+; SDAG-AVOIDLDAPUR:    orr x8, x19, #0x1
+; SDAG-AVOIDLDAPUR:    ldaprb w8, [x8]
+; SDAG-AVOIDLDAPUR:    sxtb x0, w8
+;
+; SDAG-NOAVOIDLDAPUR-LABEL: load_atomic_i8_from_gep_sext_i64:
+; SDAG-NOAVOIDLDAPUR:    bl init
+; SDAG-NOAVOIDLDAPUR:    ldapursb x0, [sp, #13]
+  %a = alloca [3 x i8]
+  call void @init(ptr %a)
+  %arrayidx  = getelementptr [3 x i8], ptr %a, i64 0, i64 1
+  %l = load atomic i8, ptr %arrayidx acquire, align 8
+  %l.sext = sext i8 %l to i64
+  ret i64 %l.sext
+}
+
 define i16 @load_atomic_i16_from_gep() {
 ; GISEL-LABEL: load_atomic_i16_from_gep:
 ; GISEL:    bl init
@@ -943,6 +1179,54 @@ define i16 @load_atomic_i16_from_gep() {
   ret i16 %l
 }
 
+define i32 @load_atomic_i16_from_gep_sext_i32() {
+; GISEL-LABEL: load_atomic_i16_from_gep_sext_i32:
+; GISEL:    bl init
+; GISEL:    add x8, x8, #2
+; GISEL:    ldaprh w8, [x8]
+; GISEL:    sxth w0, w8
+;
+; SDAG-AVOIDLDAPUR-LABEL: load_atomic_i16_from_gep_sext_i32:
+; SDAG-AVOIDLDAPUR:    bl init
+; SDAG-AVOIDLDAPUR:    orr x8, x19, #0x2
+; SDAG-AVOIDLDAPUR:    ldaprh w8, [x8]
+; SDAG-AVOIDLDAPUR:    sxth w0, w8
+;
+; SDAG-NOAVOIDLDAPUR-LABEL: load_atomic_i16_from_gep_sext_i32:
+; SDAG-NOAVOIDLDAPUR:    bl init
+; SDAG-NOAVOIDLDAPUR:    ldapursh w0, [sp, #10]
+  %a = alloca [3 x i16]
+  call void @init(ptr %a)
+  %arrayidx  = getelementptr [3 x i16], ptr %a, i64 0, i64 1
+  %l = load atomic i16, ptr %arrayidx acquire, align 8
+  %l.sext = sext i16 %l to i32
+  ret i32 %l.sext
+}
+
+define i64 @load_atomic_i16_from_gep_sext_i64() {
+; GISEL-LABEL: load_atomic_i16_from_gep_sext_i64:
+; GISEL:    bl init
+; GISEL:    add x8, x8, #2
+; GISEL:    ldaprh w8, [x8]
+; GISEL:    sxth x0, w8
+;
+; SDAG-AVOIDLDAPUR-LABEL: load_atomic_i16_from_gep_sext_i64:
+; SDAG-AVOIDLDAPUR:    bl init
+; SDAG-AVOIDLDAPUR:    orr x8, x19, #0x2
+; SDAG-AVOIDLDAPUR:    ldaprh w8, [x8]
+; SDAG-AVOIDLDAPUR:    sxth x0, w8
+;
+; SDAG-NOAVOIDLDAPUR-LABEL: load_atomic_i16_from_gep_sext_i64:
+; SDAG-NOAVOIDLDAPUR:    bl init
+; SDAG-NOAVOIDLDAPUR:    ldapursh x0, [sp, #10]
+  %a = alloca [3 x i16]
+  call void @init(ptr %a)
+  %arrayidx  = getelementptr [3 x i16], ptr %a, i64 0, i64 1
+  %l = load atomic i16, ptr %arrayidx acquire, align 8
+  %l.sext = sext i16 %l to i64
+  ret i64 %l.sext
+}
+
 define i32 @load_atomic_i32_from_gep() {
 ; GISEL-LABEL: load_atomic_i32_from_gep:
 ; GISEL:    bl init
@@ -963,6 +1247,29 @@ define i32 @load_atomic_i32_from_gep() {
   ret i32 %l
 }
 
+define i64 @load_atomic_i32_from_gep_sext_i64() {
+; GISEL-LABEL: load_atomic_i32_from_gep_sext_i64:
+; GISEL:    bl init
+; GISEL:    ldapur w9, [x8, #4]
+; GISEL:    ldapursw x0, [x8, #4]
+;
+; SDAG-AVOIDLDAPUR-LABEL: load_atomic_i32_from_gep_sext_i64:
+; SDAG-AVOIDLDAPUR:    bl init
+; SDAG-AVOIDLDAPUR:    add x8, x19, #4
+; SDAG-AVOIDLDAPUR:    ldapr w8, [x8]
+; SDAG-AVOIDLDAPUR:    sxtw x0, w8
+;
+; SDAG-NOAVOIDLDAPUR-LABEL: load_atomic_i32_from_gep_sext_i64:
+; SDAG-NOAVOIDLDAPUR:    bl init
+; SDAG-NOAVOIDLDAPUR:    ldapursw x0, [sp, #8]
+  %a = alloca [3 x i32]
+  call void @init(ptr %a)
+  %arrayidx  = getelementptr [3 x i32], ptr %a, i64 0, i64 1
+  %l = load atomic i32, ptr %arrayidx acquire, align 8
+  %l.sext = sext i32 %l to i64
+  ret i64 %l.sext
+}
+
 define i64 @load_atomic_i64_from_gep() {
 ; GISEL-LABEL: load_atomic_i64_from_gep:
 ; GISEL:    bl init
diff --git a/llvm/test/tools/llvm-mca/AArch64/Neoverse/N2-rcpc-immo-instructions.s b/llvm/test/tools/llvm-mca/AArch64/Neoverse/N2-rcpc-immo-instructions.s
index d9943f342b827..d967b1b651b4b 100644
--- a/llvm/test/tools/llvm-mca/AArch64/Neoverse/N2-rcpc-immo-instructions.s
+++ b/llvm/test/tools/llvm-mca/AArch64/Neoverse/N2-rcpc-immo-instructions.s
@@ -10,15 +10,15 @@
 # CHECK-NEXT: [6]: HasSideEffects (U)
 
 # CHECK:      [1]    [2]    [3]    [4]    [5]    [6]    Instructions:
-# CHECK-NEXT:  1      4     0.33    *                   ldapur	w7, [x24]
-# CHECK-NEXT:  1      4     0.33    *                   ldapur	x20, [x13]
-# CHECK-NEXT:  1      4     0.33    *                   ldapurb	w13, [x17]
-# CHECK-NEXT:  1      4     0.33    *                   ldapurh	w3, [x22]
-# CHECK-NEXT:  1      4     0.33                  U     ldapursb	w7, [x8]
-# CHECK-NEXT:  1      4     0.33                  U     ldapursb	x29, [x7]
-# CHECK-NEXT:  1      4     0.33                  U     ldapursh	w17, [x19]
-# CHECK-NEXT:  1      4     0.33                  U     ldapursh	x3, [x3]
-# CHECK-NEXT:  1      4     0.33                  U     ldapursw	x3, [x18]
+# CHECK-NEXT:  2      1     0.50    *                   ldapur	w7, [x24]
+# CHECK-NEXT:  2      1     0.50    *                   ldapur	x20, [x13]
+# CHECK-NEXT:  2      1     0.50    *                   ldapurb	w13, [x17]
+# CHECK-NEXT:  2      1     0.50    *                   ldapurh	w3, [x22]
+# CHECK-NEXT:  2      1     0.50    *                   ldapursb	w7, [x8]
+# CHECK-NEXT:  2      1     0.50    *                   ldapursb	x29, [x7]
+# CHECK-NEXT:  2      1     0.50    *                   ldapursh	w17, [x19]
+# CHECK-NEXT:  2      1     0.50    *                   ldapursh	x3, [x3]
+# CHECK-NEXT:  2      1     0.50    *                   ldapursw	x3, [x18]
 # CHECK-NEXT:  2      1     0.50           *            stlur	w3, [x27]
 # CHECK-NEXT:  2      1     0.50           *            stlur	x23, [x25]
 # CHECK-NEXT:  2      1     0.50           *            stlurb	w30, [x17]
@@ -41,19 +41,19 @@
 
 # CHECK:      Resource pressure per iteration:
 # CHECK-NEXT: [0.0]  [0.1]  [1.0]  [1.1]  [2]    [3.0]  [3.1]  [4]    [5]    [6.0]  [6.1]  [7]    [8]
-# CHECK-NEXT:  -      -     2.00   2.00   3.00   5.00   5.00    -      -      -      -      -      -
+# CHECK-NEXT:  -      -     6.50   6.50    -     6.50   6.50    -      -      -      -      -      -
 
 # CHECK:      Resource pressure by instruction:
 # CHECK-NEXT: [0.0]  [0.1]  [1.0]  [1.1]  [2]    [3.0]  [3.1]  [4]    [5]    [6.0]  [6.1]  [7]    [8]    Instructions:
-# CHECK-NEXT:  -      -      -      -     0.33   0.33   0.33    -      -      -      -      -      -     ldapur	w7, [x24]
-# CHECK-NEXT:  -      -      -      -     0.33   0.33   0.33    -      -      -      -      -      -     ldapur	x20, [x13]
-# CHECK-NEXT:  -      -      -      -     0.33   0.33   0.33    -      -      -      -      -      -     ldapurb	w13, [x17]
-# CHECK-NEXT:  -      -      -      -     0.33   0.33   0.33    -      -      -      -      -      -     ldapurh	w3, [x22]
-# CHECK-NEXT:  -      -      -      -     0.33   0.33   0.33    -      -      -      -      -      -     ldapursb	w7, [x8]
-# CHECK-NEXT:  -      -      -      -     0.33   0.33   0.33    -      -      -      -      -      -     ldapursb	x29, [x7]
-# CHECK-NEXT:  -      -      -      -     0.33   0.33   0.33    -      -      -      -      -      -     ldapursh	w17, [x19]
-# CHECK-NEXT:  -      -      -      -     0.33   0.33   0.33    -      -      -      -      -      -     ldapursh	x3, [x3]
-# CHECK-NEXT:  -      -      -      -     0.33   0.33   0.33    -      -      -      -      -      -     ldapursw	x3, [x18]
+# CHECK-NEXT:  -      -     0.50   0.50    -     0.50   0.50    -      -      -      -      -      -     ldapur	w7, [x24]
+# CHECK-NEXT:  -      -     0.50   0.50    -     0.50   0.50    -      -      -      -      -      -     ldapur	x20, [x13]
+# CHECK-NEXT:  -      -     0.50   0.50    -     0.50   0.50    -      -      -      -      -      -     ldapurb	w13, [x17]
+# CHECK-NEXT:  -      -     0.50   0.50    -     0.50   0.50    -      -      -      -      -      -     ldapurh	w3, [x22]
+# CHECK-NEXT:  -      -     0.50   0.50    -     0.50   0.50    -      -      -      -      -      -     ldapursb	w7, [x8]
+# CHECK-NEXT:  -      -     0.50   0.50    -     0.50   0.50    -      -      -      -      -      -     ldapursb	x29, [x7]
+# CHECK-NEXT:  -      -     0.50   0.50    -     0.50   0.50    -      -      -      -      -      -     ldapursh	w17, [x19]
+# CHECK-NEXT:  -      -     0.50   0.50    -     0.50   0.50    -      -      -      -      -      -     ldapursh	x3, [x3]
+# CHECK-NEXT:  -      -     0.50   0.50    -     0.50   0.50    -      -      -      -      -      -     ldapursw	x3, [x18]
 # CHECK-NEXT:  -      -     0.50   0.50    -     0.50   0.50    -      -      -      -      -      -     stlur	w3, [x27]
 # CHECK-NEXT:  -      -     0.50   0.50    -     0.50   0.50    -      -      -      -      -      -     stlur	x23, [x25]
 # CHECK-NEXT:  -      -     0.50   0.50    -     0.50   0.50    -      -      -      -      -      -     stlurb	w30, [x17]
diff --git a/llvm/test/tools/llvm-mca/AArch64/Neoverse/N3-rcpc-immo...
[truncated]
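
The new patterns match addresses of the form `(am_unscaledN GPR64sp:$Rn, simm9:$offset)`, so the offset only folds into the instruction when it fits a signed 9-bit immediate. The sketch below checks that range for the byte offsets used in the tests (#4, #8 and #16). The helper names are hypothetical and chosen for illustration only.

```cpp
#include <cstdint>

// Range of a signed 9-bit immediate, as used by the unscaled offset form.
constexpr std::int64_t kSimm9Min = -256;
constexpr std::int64_t kSimm9Max = 255;

// Hypothetical helper: true if a byte offset is encodable as simm9.
constexpr bool fitsSimm9(std::int64_t byteOffset) {
  return byteOffset >= kSimm9Min && byteOffset <= kSimm9Max;
}

// Hypothetical helper: byte offset of `getelementptr T, ptr %p, i32 index`
// for an element type of elemSize bytes.
constexpr std::int64_t gepByteOffset(std::int64_t index, std::int64_t elemSize) {
  return index * elemSize;
}

// The offsets used by the tests all fit, so the add can be folded away.
static_assert(fitsSimm9(gepByteOffset(4, 1)), "i8 at index 4 is offset #4");
static_assert(fitsSimm9(gepByteOffset(4, 2)), "i16 at index 4 is offset #8");
static_assert(fitsSimm9(gepByteOffset(4, 4)), "i32 at index 4 is offset #16");
// An offset just outside the range would need a separate address computation.
static_assert(!fitsSimm9(256), "256 does not fit in simm9");
```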

@llvm llvm deleted a comment from github-actions bot Dec 11, 2025
@davemgreen
Collaborator

Can you rebase? I think this sounds OK.

@cofibrant
Contributor Author

> Can you rebase? I think this sounds OK.

Will do, thanks!

@cofibrant force-pushed the users/cofibrant/ldapurs-patterns branch from b3afaf0 to 77599c7 on December 15, 2025 09:05
ret i8 %r
}

define i32 @load_atomic_i8_aligned_acquire_sext_i32(ptr %ptr) {
Collaborator

Did you use llvm/utils/update_llc_test_checks.py to generate the CHECK lines? In the existing test above, SDAG-NOAVOIDLDAPUR comes before SDAG-AVOIDLDAPUR, and when I run the script I see a diff.

Also, pre-committing tests or staging the PR commits so the before/after can be seen makes reviewing this sort of patch much easier. I'm not requesting any changes; it's just something to bear in mind in future.

Contributor Author

@cofibrant cofibrant Dec 15, 2025

I did use llvm/utils/update_llc_test_checks.py to generate the CHECK lines, but I reverted the hunks that just reordered the CHECK lines in the existing tests as I thought this would just add noise in the PR. I can reorder these if you like, though.

You're right, though, I should have pre-committed the tests. Good spot 😅

Collaborator

Ah OK, so it's the existing tests needing regeneration that causes the diff. No worries.

@cofibrant
Contributor Author

@davemgreen, @c-rhodes, do you think this looks alright to merge, then?

Collaborator

@c-rhodes c-rhodes left a comment

The patterns and tests look good to me; just one question about the changes in GISel.

Comment on lines +379 to +380
; GISEL: ldapur w8, [x0, #16]
; GISEL: ldapursw x0, [x0, #16]
Collaborator

It looks like GISel has regressed here, with the original ldapur remaining? I know this is O0, but it also happens at higher optimization levels.

Contributor Author

Oh, wow! I hadn't spotted this. I'll get the debugger out.

Contributor Author

For whatever reason, when the importer matches this pattern, it only erases the corresponding G_SEXT instruction, not the associated G_LOAD. As such, the G_LOAD ends up being selected twice. Later passes identify the ldapur as dead, but don't delete the instruction.
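
The behaviour described above can be illustrated with a toy model. This is only a sketch under stated assumptions: the types and the `selectBlock` function are hypothetical and are not LLVM's MachineInstr or InstructionSelector APIs, and it assumes the block is visited bottom-up. It shows how erasing only the root of a two-instruction match leaves the load behind to be selected a second time.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy model only: hypothetical types, not LLVM data structures.
enum class Op { Load32Acquire, SExt32To64 };

struct Instr {
  Op op;
  bool erased;
};

// Visits the block bottom-up. A SExt fed by the preceding acquire load is
// matched as a single sign-extending load. When eraseMatchedLoad is false,
// only the root (the SExt) is consumed, so the load is still visited and
// selected on its own afterwards.
std::vector<std::string> selectBlock(std::vector<Instr> block,
                                     bool eraseMatchedLoad) {
  std::vector<std::string> out;
  for (std::size_t i = block.size(); i-- > 0;) {
    if (block[i].erased)
      continue;
    if (block[i].op == Op::SExt32To64) {
      bool folds = i > 0 && block[i - 1].op == Op::Load32Acquire &&
                   !block[i - 1].erased;
      if (folds) {
        out.insert(out.begin(), "ldapursw");
        if (eraseMatchedLoad)
          block[i - 1].erased = true;
      } else {
        out.insert(out.begin(), "sxtw");
      }
    } else {
      out.insert(out.begin(), "ldapur");
    }
  }
  return out;
}
```

Calling `selectBlock` on a load followed by a sign extension with `eraseMatchedLoad` set to false yields `ldapur` then `ldapursw`, the same pair seen in the GISEL CHECK lines; with it set to true, only `ldapursw` remains.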

Contributor Author

@cofibrant cofibrant Dec 15, 2025

Maybe @davemgreen knows what to do in these situations? (I'm not even sure why this is lowered to a G_LOAD followed by G_SEXT rather than just G_SEXTLOAD...)

@cofibrant
Contributor Author

Sorry to ping, but I wanted to bump this, as I'd like to get it squared away before I finish for the year.

4 participants