Skip to content

Conversation

@dtcxzyw
Copy link
Member

@dtcxzyw dtcxzyw commented May 31, 2024

Proof: https://alive2.llvm.org/ce/z/J7GBMU

Same as #92753, the types of LHS and RHS in shift nodes may differ.

  • When VT is smaller than ShiftVT, it is safe to use trunc.
  • When VT is larger than ShiftVT, it is safe to use zext iff is_zero_poison is true (i.e., opcode == ISD::CTTZ_ZERO_UNDEF). See also the counterexample src_shl_cttz2 -> tgt_shl_cttz2 in the alive2 proofs.

Fixes issue #85066 (comment).

@llvmbot
Copy link
Member

llvmbot commented May 31, 2024

@llvm/pr-subscribers-llvm-selectiondag

@llvm/pr-subscribers-backend-powerpc

Author: Yingwei Zheng (dtcxzyw)

Changes

Proof: https://alive2.llvm.org/ce/z/J7GBMU

Same as #92753, the types of LHS and RHS in shift nodes may differ.

  • When VT is smaller than ShiftVT, it is safe to use trunc.
  • When VT is larger than ShiftVT, it is safe to use zext iff is_zero_poison is true (i.e., opcode == ISD::CTTZ_ZERO_UNDEF). See also the counterexample src_shl_cttz2 -> tgt_shl_cttz2 in the alive2 proofs.

Fixes issue #85066 (comment).


Full diff: https://github.com/llvm/llvm-project/pull/94008.diff

2 Files Affected:

  • (modified) llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (+7-4)
  • (added) llvm/test/CodeGen/PowerPC/pr85066.ll (+45)
diff --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
index 42e861e61201c..2084f9727f9bb 100644
--- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -10109,13 +10109,16 @@ SDValue DAGCombiner::visitSHL(SDNode *N) {
 
   // fold (shl X, cttz(Y)) -> (mul (Y & -Y), X) if cttz is unsupported on the
   // target.
-  if ((N1.getOpcode() == ISD::CTTZ || N1.getOpcode() == ISD::CTTZ_ZERO_UNDEF) &&
-      N1.hasOneUse() && !TLI.isOperationLegalOrCustom(ISD::CTTZ, VT) &&
+  if (((N1.getOpcode() == ISD::CTTZ &&
+        VT.getScalarSizeInBits() >= ShiftVT.getScalarSizeInBits()) ||
+       N1.getOpcode() == ISD::CTTZ_ZERO_UNDEF) &&
+      N1.hasOneUse() && !TLI.isOperationLegalOrCustom(ISD::CTTZ, ShiftVT) &&
       TLI.isOperationLegalOrCustom(ISD::MUL, VT)) {
     SDValue Y = N1.getOperand(0);
     SDLoc DL(N);
-    SDValue NegY = DAG.getNegative(Y, DL, VT);
-    SDValue And = DAG.getNode(ISD::AND, DL, VT, Y, NegY);
+    SDValue NegY = DAG.getNegative(Y, DL, ShiftVT);
+    SDValue And =
+        DAG.getZExtOrTrunc(DAG.getNode(ISD::AND, DL, ShiftVT, Y, NegY), DL, VT);
     return DAG.getNode(ISD::MUL, DL, VT, And, N0);
   }
 
diff --git a/llvm/test/CodeGen/PowerPC/pr85066.ll b/llvm/test/CodeGen/PowerPC/pr85066.ll
new file mode 100644
index 0000000000000..9b47d6e6508fa
--- /dev/null
+++ b/llvm/test/CodeGen/PowerPC/pr85066.ll
@@ -0,0 +1,45 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc -mtriple=powerpc64le -verify-machineinstrs < %s | FileCheck %s
+
+; Tests from pr85066
+define i64 @test_shl_zext_cttz(i16 %x) {
+; CHECK-LABEL: test_shl_zext_cttz:
+; CHECK:       # %bb.0: # %entry
+; CHECK-NEXT:    oris 3, 3, 1
+; CHECK-NEXT:    neg 4, 3
+; CHECK-NEXT:    and 3, 3, 4
+; CHECK-NEXT:    clrldi 3, 3, 32
+; CHECK-NEXT:    blr
+entry:
+  %cttz = tail call i16 @llvm.cttz.i16(i16 %x, i1 false)
+  %zext = zext i16 %cttz to i64
+  %res = shl i64 1, %zext
+  ret i64 %res
+}
+
+define i64 @test_shl_zext_cttz_zero_is_poison(i16 %x) {
+; CHECK-LABEL: test_shl_zext_cttz_zero_is_poison:
+; CHECK:       # %bb.0: # %entry
+; CHECK-NEXT:    neg 4, 3
+; CHECK-NEXT:    and 3, 3, 4
+; CHECK-NEXT:    clrldi 3, 3, 32
+; CHECK-NEXT:    blr
+entry:
+  %cttz = tail call i16 @llvm.cttz.i16(i16 %x, i1 true)
+  %zext = zext i16 %cttz to i64
+  %res = shl i64 1, %zext
+  ret i64 %res
+}
+
+define i16 @test_shl_trunc_cttz(i32 %x) {
+; CHECK-LABEL: test_shl_trunc_cttz:
+; CHECK:       # %bb.0: # %entry
+; CHECK-NEXT:    neg 4, 3
+; CHECK-NEXT:    and 3, 3, 4
+; CHECK-NEXT:    blr
+entry:
+  %cttz = tail call i32 @llvm.cttz.i32(i32 %x, i1 false)
+  %trunc = trunc i32 %cttz to i16
+  %res = shl i16 1, %trunc
+  ret i16 %res
+}

@dtcxzyw dtcxzyw merged commit 47fd32f into llvm:main Jun 1, 2024
@dtcxzyw dtcxzyw deleted the fix-pr85066 branch June 1, 2024 11:04
dtcxzyw added a commit that referenced this pull request Jun 8, 2024
The pr description in #94008 mismatches with the code.
> + When VT is smaller than ShiftVT, it is safe to use trunc.
> + When VT is larger than ShiftVT, it is safe to use zext iff
`is_zero_poison` is true (i.e., `opcode == ISD::CTTZ_ZERO_UNDEF`). See
also the counterexample `src_shl_cttz2 -> tgt_shl_cttz2` in the alive2
    proofs.

Closes #94824.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backend:PowerPC llvm:SelectionDAG SelectionDAGISel as well

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants