[RISCV] Refine cost on Min/Max reduction #79402

arcbbb · 2024-01-25T04:07:34Z

This patch is split off from #77342 and follows #79103.

- Fix the CodeSize cost, which was missing one instruction. The cost is now 3, from {VMV.S, ReductionOp, VMV.X}.
- Add SplitCost, which chains a series of VMAX/VMIN/... operations and scales with LMUL.
- Use the MVT to estimate VL.
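
As a rough illustration of the code-size arithmetic described above, the sketch below models the refined cost as a three-instruction reduction sequence plus one split operation per extra legalized part. The function name `minMaxReductionCodeSizeCost` and the parameter `NumParts` (standing in for `LT.first`) are hypothetical, and the sketch assumes every instruction counts as 1 for code size; it is not the code in `RISCVTargetTransformInfo.cpp`.

```cpp
#include <cassert>

// Hypothetical stand-in for the code-size path of the refined cost.
// Assumption: each emitted instruction contributes exactly 1 to the cost.
unsigned minMaxReductionCodeSizeCost(unsigned NumParts) {
  assert(NumParts >= 1 && "a legalized type has at least one part");
  const unsigned SplitOpCost = 1;      // one VMAX/VMIN/... per extra part
  const unsigned ReductionSeqCost = 3; // {VMV.S, ReductionOp, VMV.X}
  // Extra parts are first combined pairwise, then reduced once.
  unsigned SplitCost = (NumParts - 1) * SplitOpCost;
  return SplitCost + ReductionSeqCost;
}
```

Under these assumptions, one, two, and four parts give 3, 4, and 6. This appears consistent with the updated SIZE expectations in `reduce-max.ll`, where `v64i32` moves from 2 to 3, `v128i32` from 3 to 4, and `v128i64` from 5 to 6.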

llvmbot · 2024-01-25T04:08:03Z

@llvm/pr-subscribers-llvm-analysis

@llvm/pr-subscribers-backend-risc-v

Author: Shih-Po Hung (arcbbb)

Changes

This patch is split off from #77342 and follows #79103.

- Fix the CodeSize cost, which was missing one instruction. The cost is now 3, from {VMV.S, ReductionOp, VMV.X}.
- Add SplitCost, which chains a series of VMAX/VMIN/... operations and scales with LMUL.
- Use the MVT to estimate VL.

Patch is 107.33 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/79402.diff

5 Files Affected:

(modified) llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp (+36-7)
(modified) llvm/test/Analysis/CostModel/RISCV/reduce-max.ll (+70-70)
(modified) llvm/test/Analysis/CostModel/RISCV/reduce-min.ll (+70-70)
(modified) llvm/test/Analysis/CostModel/RISCV/reduce-scalable-fp.ll (+42-42)
(modified) llvm/test/Analysis/CostModel/RISCV/reduce-scalable-int.ll (+48-48)

diff --git a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
index 0799af0a7bad550..0ab795ec99d8a87 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
@@ -942,13 +942,42 @@ RISCVTTIImpl::getMinMaxReductionCost(Intrinsic::ID IID, VectorType *Ty,
     return (LT.first - 1) + 3;
 
   // IR Reduction is composed by two vmv and one rvv reduction instruction.
-  InstructionCost BaseCost = 2;
-
-  if (CostKind == TTI::TCK_CodeSize)
-    return (LT.first - 1) + BaseCost;
-
-  unsigned VL = getEstimatedVLFor(Ty);
-  return (LT.first - 1) + BaseCost + Log2_32_Ceil(VL);
+  unsigned SplitOp;
+  SmallVector<unsigned, 3> Opcodes;
+  switch (IID) {
+  default:
+    llvm_unreachable("Unsupported intrinsic");
+  case Intrinsic::smax:
+    SplitOp = RISCV::VMAX_VV;
+    Opcodes = {RISCV::VMV_S_X, RISCV::VREDMAX_VS, RISCV::VMV_X_S};
+    break;
+  case Intrinsic::smin:
+    SplitOp = RISCV::VMIN_VV;
+    Opcodes = {RISCV::VMV_S_X, RISCV::VREDMIN_VS, RISCV::VMV_X_S};
+    break;
+  case Intrinsic::umax:
+    SplitOp = RISCV::VMAXU_VV;
+    Opcodes = {RISCV::VMV_S_X, RISCV::VREDMAXU_VS, RISCV::VMV_X_S};
+    break;
+  case Intrinsic::umin:
+    SplitOp = RISCV::VMINU_VV;
+    Opcodes = {RISCV::VMV_S_X, RISCV::VREDMINU_VS, RISCV::VMV_X_S};
+    break;
+  case Intrinsic::maxnum:
+    SplitOp = RISCV::VFMAX_VV;
+    Opcodes = {RISCV::VFMV_S_F, RISCV::VFREDMAX_VS, RISCV::VFMV_F_S};
+    break;
+  case Intrinsic::minnum:
+    SplitOp = RISCV::VFMIN_VV;
+    Opcodes = {RISCV::VFMV_S_F, RISCV::VFREDMIN_VS, RISCV::VFMV_F_S};
+    break;
+  }
+  // Add a cost for data larger than LMUL8
+  InstructionCost SplitCost =
+      (LT.first > 1) ? (LT.first - 1) *
+                           getRISCVInstructionCost(SplitOp, LT.second, CostKind)
+                     : 0;
+  return SplitCost + getRISCVInstructionCost(Opcodes, LT.second, CostKind);
 }
 
 InstructionCost
diff --git a/llvm/test/Analysis/CostModel/RISCV/reduce-max.ll b/llvm/test/Analysis/CostModel/RISCV/reduce-max.ll
index c21b8520112cf88..11fa50d355e833d 100644
--- a/llvm/test/Analysis/CostModel/RISCV/reduce-max.ll
+++ b/llvm/test/Analysis/CostModel/RISCV/reduce-max.ll
@@ -51,14 +51,14 @@ define i32 @reduce_umax_i8(i32 %arg) {
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef
 ;
 ; SIZE-LABEL: 'reduce_umax_i8'
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V1 = call i8 @llvm.vector.reduce.umax.v1i8(<1 x i8> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V2 = call i8 @llvm.vector.reduce.umax.v2i8(<2 x i8> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V4 = call i8 @llvm.vector.reduce.umax.v4i8(<4 x i8> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V8 = call i8 @llvm.vector.reduce.umax.v8i8(<8 x i8> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V16 = call i8 @llvm.vector.reduce.umax.v16i8(<16 x i8> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V32 = call i8 @llvm.vector.reduce.umax.v32i8(<32 x i8> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V64 = call i8 @llvm.vector.reduce.umax.v64i8(<64 x i8> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V128 = call i8 @llvm.vector.reduce.umax.v128i8(<128 x i8> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V1 = call i8 @llvm.vector.reduce.umax.v1i8(<1 x i8> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V2 = call i8 @llvm.vector.reduce.umax.v2i8(<2 x i8> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V4 = call i8 @llvm.vector.reduce.umax.v4i8(<4 x i8> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V8 = call i8 @llvm.vector.reduce.umax.v8i8(<8 x i8> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V16 = call i8 @llvm.vector.reduce.umax.v16i8(<16 x i8> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V32 = call i8 @llvm.vector.reduce.umax.v32i8(<32 x i8> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V64 = call i8 @llvm.vector.reduce.umax.v64i8(<64 x i8> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V128 = call i8 @llvm.vector.reduce.umax.v128i8(<128 x i8> undef)
 ; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
 ;
   %V1   = call i8 @llvm.vector.reduce.umax.v1i8(<1 x i8> undef)
@@ -85,14 +85,14 @@ define i32 @reduce_umax_i16(i32 %arg) {
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef
 ;
 ; SIZE-LABEL: 'reduce_umax_i16'
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V1 = call i16 @llvm.vector.reduce.umax.v1i16(<1 x i16> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V2 = call i16 @llvm.vector.reduce.umax.v2i16(<2 x i16> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V4 = call i16 @llvm.vector.reduce.umax.v4i16(<4 x i16> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V8 = call i16 @llvm.vector.reduce.umax.v8i16(<8 x i16> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V16 = call i16 @llvm.vector.reduce.umax.v16i16(<16 x i16> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V32 = call i16 @llvm.vector.reduce.umax.v32i16(<32 x i16> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V64 = call i16 @llvm.vector.reduce.umax.v64i16(<64 x i16> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V128 = call i16 @llvm.vector.reduce.umax.v128i16(<128 x i16> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V1 = call i16 @llvm.vector.reduce.umax.v1i16(<1 x i16> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V2 = call i16 @llvm.vector.reduce.umax.v2i16(<2 x i16> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V4 = call i16 @llvm.vector.reduce.umax.v4i16(<4 x i16> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V8 = call i16 @llvm.vector.reduce.umax.v8i16(<8 x i16> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V16 = call i16 @llvm.vector.reduce.umax.v16i16(<16 x i16> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V32 = call i16 @llvm.vector.reduce.umax.v32i16(<32 x i16> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V64 = call i16 @llvm.vector.reduce.umax.v64i16(<64 x i16> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V128 = call i16 @llvm.vector.reduce.umax.v128i16(<128 x i16> undef)
 ; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
 ;
   %V1   = call i16 @llvm.vector.reduce.umax.v1i16(<1 x i16> undef)
@@ -115,18 +115,18 @@ define i32 @reduce_umax_i32(i32 %arg) {
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %V16 = call i32 @llvm.vector.reduce.umax.v16i32(<16 x i32> undef)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 7 for instruction: %V32 = call i32 @llvm.vector.reduce.umax.v32i32(<32 x i32> undef)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %V64 = call i32 @llvm.vector.reduce.umax.v64i32(<64 x i32> undef)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 10 for instruction: %V128 = call i32 @llvm.vector.reduce.umax.v128i32(<128 x i32> undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %V128 = call i32 @llvm.vector.reduce.umax.v128i32(<128 x i32> undef)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef
 ;
 ; SIZE-LABEL: 'reduce_umax_i32'
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V1 = call i32 @llvm.vector.reduce.umax.v1i32(<1 x i32> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V2 = call i32 @llvm.vector.reduce.umax.v2i32(<2 x i32> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V4 = call i32 @llvm.vector.reduce.umax.v4i32(<4 x i32> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V8 = call i32 @llvm.vector.reduce.umax.v8i32(<8 x i32> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V16 = call i32 @llvm.vector.reduce.umax.v16i32(<16 x i32> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V32 = call i32 @llvm.vector.reduce.umax.v32i32(<32 x i32> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V64 = call i32 @llvm.vector.reduce.umax.v64i32(<64 x i32> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V128 = call i32 @llvm.vector.reduce.umax.v128i32(<128 x i32> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V1 = call i32 @llvm.vector.reduce.umax.v1i32(<1 x i32> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V2 = call i32 @llvm.vector.reduce.umax.v2i32(<2 x i32> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V4 = call i32 @llvm.vector.reduce.umax.v4i32(<4 x i32> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V8 = call i32 @llvm.vector.reduce.umax.v8i32(<8 x i32> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V16 = call i32 @llvm.vector.reduce.umax.v16i32(<16 x i32> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V32 = call i32 @llvm.vector.reduce.umax.v32i32(<32 x i32> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V64 = call i32 @llvm.vector.reduce.umax.v64i32(<64 x i32> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %V128 = call i32 @llvm.vector.reduce.umax.v128i32(<128 x i32> undef)
 ; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
 ;
   %V1   = call i32 @llvm.vector.reduce.umax.v1i32(<1 x i32> undef)
@@ -148,19 +148,19 @@ define i32 @reduce_umax_i64(i32 %arg) {
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %V8 = call i64 @llvm.vector.reduce.umax.v8i64(<8 x i64> undef)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %V16 = call i64 @llvm.vector.reduce.umax.v16i64(<16 x i64> undef)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 7 for instruction: %V32 = call i64 @llvm.vector.reduce.umax.v32i64(<32 x i64> undef)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 9 for instruction: %V64 = call i64 @llvm.vector.reduce.umax.v64i64(<64 x i64> undef)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 12 for instruction: %V128 = call i64 @llvm.vector.reduce.umax.v128i64(<128 x i64> undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 15 for instruction: %V64 = call i64 @llvm.vector.reduce.umax.v64i64(<64 x i64> undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 31 for instruction: %V128 = call i64 @llvm.vector.reduce.umax.v128i64(<128 x i64> undef)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef
 ;
 ; SIZE-LABEL: 'reduce_umax_i64'
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V1 = call i64 @llvm.vector.reduce.umax.v1i64(<1 x i64> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V2 = call i64 @llvm.vector.reduce.umax.v2i64(<2 x i64> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V4 = call i64 @llvm.vector.reduce.umax.v4i64(<4 x i64> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V8 = call i64 @llvm.vector.reduce.umax.v8i64(<8 x i64> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V16 = call i64 @llvm.vector.reduce.umax.v16i64(<16 x i64> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V32 = call i64 @llvm.vector.reduce.umax.v32i64(<32 x i64> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V64 = call i64 @llvm.vector.reduce.umax.v64i64(<64 x i64> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %V128 = call i64 @llvm.vector.reduce.umax.v128i64(<128 x i64> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V1 = call i64 @llvm.vector.reduce.umax.v1i64(<1 x i64> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V2 = call i64 @llvm.vector.reduce.umax.v2i64(<2 x i64> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V4 = call i64 @llvm.vector.reduce.umax.v4i64(<4 x i64> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V8 = call i64 @llvm.vector.reduce.umax.v8i64(<8 x i64> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V16 = call i64 @llvm.vector.reduce.umax.v16i64(<16 x i64> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V32 = call i64 @llvm.vector.reduce.umax.v32i64(<32 x i64> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %V64 = call i64 @llvm.vector.reduce.umax.v64i64(<64 x i64> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %V128 = call i64 @llvm.vector.reduce.umax.v128i64(<128 x i64> undef)
 ; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
 ;
   %V1   = call i64 @llvm.vector.reduce.umax.v1i64(<1 x i64> undef)
@@ -221,14 +221,14 @@ define i32 @reduce_smax_i8(i32 %arg) {
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef
 ;
 ; SIZE-LABEL: 'reduce_smax_i8'
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V1 = call i8 @llvm.vector.reduce.smax.v1i8(<1 x i8> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V2 = call i8 @llvm.vector.reduce.smax.v2i8(<2 x i8> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V4 = call i8 @llvm.vector.reduce.smax.v4i8(<4 x i8> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V8 = call i8 @llvm.vector.reduce.smax.v8i8(<8 x i8> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V16 = call i8 @llvm.vector.reduce.smax.v16i8(<16 x i8> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V32 = call i8 @llvm.vector.reduce.smax.v32i8(<32 x i8> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V64 = call i8 @llvm.vector.reduce.smax.v64i8(<64 x i8> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V128 = call i8 @llvm.vector.reduce.smax.v128i8(<128 x i8> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V1 = call i8 @llvm.vector.reduce.smax.v1i8(<1 x i8> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V2 = call i8 @llvm.vector.reduce.smax.v2i8(<2 x i8> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V4 = call i8 @llvm.vector.reduce.smax.v4i8(<4 x i8> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V8 = call i8 @llvm.vector.reduce.smax.v8i8(<8 x i8> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V16 = call i8 @llvm.vector.reduce.smax.v16i8(<16 x i8> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V32 = call i8 @llvm.vector.reduce.smax.v32i8(<32 x i8> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V64 = call i8 @llvm.vector.reduce.smax.v64i8(<64 x i8> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V128 = call i8 @llvm.vector.reduce.smax.v128i8(<128 x i8> undef)
 ; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
 ;
   %V1   = call i8 @llvm.vector.reduce.smax.v1i8(<1 x i8> undef)
@@ -255,14 +255,14 @@ define i32 @reduce_smax_i16(i32 %arg) {
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef
 ;
 ; SIZE-LABEL: 'reduce_smax_i16'
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V1 = call i16 @llvm.vector.reduce.smax.v1i16(<1 x i16> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V2 = call i16 @llvm.vector.reduce.smax.v2i16(<2 x i16> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V4 = call i16 @llvm.vector.reduce.smax.v4i16(<4 x i16> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V8 = call i16 @llvm.vector.reduce.smax.v8i16(<8 x i16> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V16 = call i16 @llvm.vector.reduce.smax.v16i16(<16 x i16> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V32 = call i16 @llvm.vector.reduce.smax.v32i16(<32 x i16> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V64 = call i16 @llvm.vector.reduce.smax.v64i16(<64 x i16> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V128 = call i16 @llvm.vector.reduce.smax.v128i16(<128 x i16> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V1 = call i16 @llvm.vector.reduce.smax.v1i16(<1 x i16> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V2 = call i16 @llvm.vector.reduce.smax.v2i16(<2 x i16> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V4 = call i16 @llvm.vector.reduce.smax.v4i16(<4 x i16> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V8 = call i16 @llvm.vector.reduce.smax.v8i16(<8 x i16> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V16 = call i16 @llvm.vector.reduce.smax.v16i16(<16 x i16> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V32 = call i16 @llvm.vector.reduce.smax.v32i16(<32 x i16> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V64 = call i16 @llvm.vector.reduce.smax.v64i16(<64 x i16> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V128 = call i16 @llvm.vector.reduce.smax.v128i16(<128 x i16> undef)
 ; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
 ;
   %V1   = call i16 @llvm.vector.reduce.smax.v1i16(<1 x i16> undef)
@@ -285,18 +285,18 @@ define i32 @reduce_smax_i32(i32 %arg) {
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %V16 = call i32 @llvm.vector.reduce.smax.v16i32(<16 x i32> undef)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 7 for instruction: %V32 = call i32 @llvm.vector.reduce.smax.v32i32(<32 x i32> undef)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %V64 = call i32 @llvm.vector.reduce.smax.v64i32(<64 x i32> undef)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 10 for instruction: %V128 = call i32 @llvm.vector.reduce.smax.v128i32(<128 x i32> undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %V128 = call i32 @llvm.vector.reduce.smax.v128i32(<128 x i32> undef)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef
 ;
 ; SIZE-LABEL: 'reduce_smax_i32'
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V1 = call i32 @llvm.vector.reduce.smax.v1i32(<1 x i32> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V2 = call i32 @llvm.vector.reduce.smax.v2i32(<2 x i32> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V4 = call i32 @llvm.vector.reduce.smax.v4i32(<4 x i32> undef)
-; SIZE-NEXT:  Cost Model: Found an ...
[truncated]

preames

LGTM

This reverts commit 2800448.

hanhanW · 2024-02-01T19:53:38Z

Hi there, I bisected a downstream compiler crash on RISC-V to this change. Here is the stack trace:

FAILED: tests/e2e/regression/check_regression_llvm-cpu_dynamic_reduce_min.mlir_module.vmfb /work/build-linux-riscv_64/tests/e2e/regression/check_regression_llvm-cpu_dynamic_reduce_min.mlir_module.vmfb 
cd /work/build-linux-riscv_64/tests/e2e/regression && /work/full-build-dir/install/bin/iree-compile --output-format=vm-bytecode --mlir-print-op-on-diagnostic=false --iree-hal-target-backends=llvm-cpu --iree-input-type=stablehlo /work/tests/e2e/regression/dynamic_reduce_min.mlir -o check_regression_llvm-cpu_dynamic_reduce_min.mlir_module.vmfb --iree-hal-executable-object-search-path=\"/work/build-linux-riscv_64\" --iree-llvmcpu-target-triple=riscv64 --iree-llvmcpu-target-abi=lp64d --iree-llvmcpu-target-cpu-features=+m,+a,+f,+d,+c,+zvl512b,+v --riscv-v-fixed-length-vector-lmul-max=8
Unsupported intrinsic
UNREACHABLE executed at /work/third_party/llvm-project/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp:954!
Please report issues to https://github.com/openxla/iree/issues and include the crash backtrace.
Stack dump:
0.	Program arguments: /work/full-build-dir/install/bin/iree-compile --output-format=vm-bytecode --mlir-print-op-on-diagnostic=false --iree-hal-target-backends=llvm-cpu --iree-input-type=stablehlo /work/tests/e2e/regression/dynamic_reduce_min.mlir -o check_regression_llvm-cpu_dynamic_reduce_min.mlir_module.vmfb --iree-hal-executable-object-search-path=\"/work/build-linux-riscv_64\" --iree-llvmcpu-target-triple=riscv64 --iree-llvmcpu-target-abi=lp64d --iree-llvmcpu-target-cpu-features=+m,+a,+f,+d,+c,+zvl512b,+v --riscv-v-fixed-length-vector-lmul-max=8
 #0 0x00007f53278d96eb llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /work/third_party/llvm-project/llvm/lib/Support/Unix/Signals.inc:723:13
 #1 0x00007f53278d7740 llvm::sys::RunSignalHandlers() /work/third_party/llvm-project/llvm/lib/Support/Signals.cpp:106:18
 #2 0x00007f53278d9dcf SignalHandler(int) /work/third_party/llvm-project/llvm/lib/Support/Unix/Signals.inc:413:1
 #3 0x00007f532f3fd420 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x14420)
 #4 0x00007f532106f00b raise (/lib/x86_64-linux-gnu/libc.so.6+0x4300b)
 #5 0x00007f532104e859 abort (/lib/x86_64-linux-gnu/libc.so.6+0x22859)
 #6 0x00007f5327859151 (/work/full-build-dir/install/bin/../lib/libIREECompiler.so+0x6459151)
 #7 0x00007f532c7383c3 (/work/full-build-dir/install/bin/../lib/libIREECompiler.so+0xb3383c3)
 #8 0x00007f532c73dadc llvm::BasicTTIImplBase<llvm::RISCVTTIImpl>::getTypeBasedIntrinsicInstrCost(llvm::IntrinsicCostAttributes const&, llvm::TargetTransformInfo::TargetCostKind) /work/third_party/llvm-project/llvm/include/llvm/CodeGen/BasicTTIImpl.h:0:0
 #9 0x00007f532c735d01 llvm::BasicTTIImplBase<llvm::RISCVTTIImpl>::getIntrinsicInstrCost(llvm::IntrinsicCostAttributes const&, llvm::TargetTransformInfo::TargetCostKind) /work/third_party/llvm-project/llvm/include/llvm/CodeGen/BasicTTIImpl.h:0:0
#10 0x00007f532c735ba5 llvm::RISCVTTIImpl::getIntrinsicInstrCost(llvm::IntrinsicCostAttributes const&, llvm::TargetTransformInfo::TargetCostKind) /work/third_party/llvm-project/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp:837:17
#11 0x00007f532c6f217d llvm::TargetTransformInfoImplCRTPBase<llvm::RISCVTTIImpl>::getInstructionCost(llvm::User const*, llvm::ArrayRef<llvm::Value const*>, llvm::TargetTransformInfo::TargetCostKind) /work/third_party/llvm-project/llvm/include/llvm/Analysis/TargetTransformInfoImpl.h:1169:25
#12 0x00007f532e50c20d llvm::TargetTransformInfo::getInstructionCost(llvm::User const*, llvm::ArrayRef<llvm::Value const*>, llvm::TargetTransformInfo::TargetCostKind) const /work/third_party/llvm-project/llvm/lib/Analysis/TargetTransformInfo.cpp:270:3
#13 0x00007f532d372149 llvm::TargetTransformInfo::getInstructionCost(llvm::User const*, llvm::TargetTransformInfo::TargetCostKind) const /work/third_party/llvm-project/llvm/include/llvm/Analysis/TargetTransformInfo.h:411:12
#14 0x00007f532e2c00a4 llvm::InstructionCost::propagateState(llvm::InstructionCost const&) /work/third_party/llvm-project/llvm/include/llvm/Support/InstructionCost.h:57:19
#15 0x00007f532e2c00a4 llvm::InstructionCost::operator+=(llvm::InstructionCost const&) /work/third_party/llvm-project/llvm/include/llvm/Support/InstructionCost.h:100:5
#16 0x00007f532e2c00a4 llvm::CodeMetrics::analyzeBasicBlock(llvm::BasicBlock const*, llvm::TargetTransformInfo const&, llvm::SmallPtrSetImpl<llvm::Value const*> const&, bool) /work/third_party/llvm-project/llvm/lib/Analysis/CodeMetrics.cpp:180:14
#17 0x00007f532d9e5c9b llvm::FunctionSpecializer::run() /work/third_party/llvm-project/llvm/lib/Transforms/IPO/FunctionSpecialization.cpp:0:0
#18 0x00007f532d9df563 runIPSCCP(llvm::Module&, llvm::DataLayout const&, llvm::AnalysisManager<llvm::Function>*, std::function<llvm::TargetLibraryInfo const& (llvm::Function&)>, std::function<llvm::TargetTransformInfo& (llvm::Function&)>, std::function<llvm::AssumptionCache& (llvm::Function&)>, std::function<llvm::DominatorTree& (llvm::Function&)>, std::function<llvm::BlockFrequencyInfo& (llvm::Function&)>, bool) /work/third_party/llvm-project/llvm/lib/Transforms/IPO/SCCP.cpp:166:5
#19 0x00007f532d9df563 llvm::IPSCCPPass::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /work/third_party/llvm-project/llvm/lib/Transforms/IPO/SCCP.cpp:415:8
#20 0x00007f532d20c77d llvm::detail::PassModel<llvm::Module, llvm::IPSCCPPass, llvm::PreservedAnalyses, llvm::AnalysisManager<llvm::Module> >::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /work/third_party/llvm-project/llvm/include/llvm/IR/PassManagerInternal.h:89:5
#21 0x00007f532eaaf312 llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module> >::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /work/third_party/llvm-project/llvm/include/llvm/IR/PassManager.h:547:10
#22 0x00007f5329394380 llvm::SmallPtrSetImplBase::isSmall() const /work/third_party/llvm-project/llvm/include/llvm/ADT/SmallPtrSet.h:196:33
#23 0x00007f5329394380 llvm::SmallPtrSetImplBase::~SmallPtrSetImplBase() /work/third_party/llvm-project/llvm/include/llvm/ADT/SmallPtrSet.h:84:10
#24 0x00007f5329394380 llvm::PreservedAnalyses::~PreservedAnalyses() /work/third_party/llvm-project/llvm/include/llvm/IR/PassManager.h:172:7
#25 0x00007f5329394380 mlir::iree_compiler::IREE::HAL::runLLVMIRPasses(mlir::iree_compiler::IREE::HAL::LLVMTarget const&, llvm::TargetMachine*, llvm::Module*) /work/compiler/src/iree/compiler/Dialect/HAL/Target/LLVMCPU/LLVMIRPasses.cpp:105:5
#26 0x00007f532930b41f mlir::LogicalResult::failed() const /work/third_party/llvm-project/mlir/include/mlir/Support/LogicalResult.h:44:33
#27 0x00007f532930b41f mlir::failed(mlir::LogicalResult) /work/third_party/llvm-project/mlir/include/mlir/Support/LogicalResult.h:72:58
#28 0x00007f532930b41f mlir::iree_compiler::IREE::HAL::LLVMCPUTargetBackend::serializeExecutable(mlir::iree_compiler::IREE::HAL::TargetBackend::SerializationOptions const&, mlir::iree_compiler::IREE::HAL::ExecutableVariantOp, mlir::OpBuilder&) /work/compiler/src/iree/compiler/Dialect/HAL/Target/LLVMCPU/LLVMCPUTarget.cpp:526:9
#29 0x00007f5328f955a0 mlir::LogicalResult::failed() const /work/third_party/llvm-project/mlir/include/mlir/Support/LogicalResult.h:44:33
#30 0x00007f5328f955a0 mlir::failed(mlir::LogicalResult) /work/third_party/llvm-project/mlir/include/mlir/Support/LogicalResult.h:72:58
#31 0x00007f5328f955a0 mlir::iree_compiler::IREE::HAL::(anonymous namespace)::SerializeTargetExecutablesPass::runOnOperation() /work/compiler/src/iree/compiler/Dialect/HAL/Transforms/SerializeExecutables.cpp:87:11
#32 0x00007f5327a81406 mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int)::$_7::operator()() const /work/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:519:17
#33 0x00007f5327a81406 void llvm::function_ref<void ()>::callback_fn<mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int)::$_7>(long) /work/third_party/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:45:12
#34 0x00007f5327a81406 llvm::function_ref<void ()>::operator()() const /work/third_party/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:68:12
#35 0x00007f5327a81406 void mlir::MLIRContext::executeAction<mlir::PassExecutionAction, mlir::Pass&>(llvm::function_ref<void ()>, llvm::ArrayRef<mlir::IRUnit>, mlir::Pass&) /work/third_party/llvm-project/mlir/include/mlir/IR/MLIRContext.h:275:7
#36 0x00007f5327a81406 mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int) /work/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:513:21
#37 0x00007f5327a81d38 mlir::LogicalResult::failed() const /work/third_party/llvm-project/mlir/include/mlir/Support/LogicalResult.h:44:33
#38 0x00007f5327a81d38 mlir::failed(mlir::LogicalResult) /work/third_party/llvm-project/mlir/include/mlir/Support/LogicalResult.h:72:58
#39 0x00007f5327a81d38 mlir::detail::OpToOpPassAdaptor::runPipeline(mlir::OpPassManager&, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int, mlir::PassInstrumentor*, mlir::PassInstrumentation::PipelineParentInfo const*) /work/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:585:9
#40 0x00007f5327a86d41 mlir::LogicalResult llvm::function_ref<mlir::LogicalResult (mlir::OpPassManager&, mlir::Operation*)>::callback_fn<mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int)::$_6>(long, mlir::OpPassManager&, mlir::Operation*) /work/third_party/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:45:5
#41 0x00007f5328f96425 mlir::LogicalResult::failed() const /work/third_party/llvm-project/mlir/include/mlir/Support/LogicalResult.h:44:33
#42 0x00007f5328f96425 mlir::failed(mlir::LogicalResult) /work/third_party/llvm-project/mlir/include/mlir/Support/LogicalResult.h:72:58
#43 0x00007f5328f96425 mlir::iree_compiler::IREE::HAL::(anonymous namespace)::SerializeExecutablesPass::runOnOperation() /work/compiler/src/iree/compiler/Dialect/HAL/Transforms/SerializeExecutables.cpp:118:9
#44 0x00007f5327a81406 mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int)::$_7::operator()() const /work/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:519:17
#45 0x00007f5327a81406 void llvm::function_ref<void ()>::callback_fn<mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int)::$_7>(long) /work/third_party/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:45:12
#46 0x00007f5327a81406 llvm::function_ref<void ()>::operator()() const /work/third_party/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:68:12
#47 0x00007f5327a81406 void mlir::MLIRContext::executeAction<mlir::PassExecutionAction, mlir::Pass&>(llvm::function_ref<void ()>, llvm::ArrayRef<mlir::IRUnit>, mlir::Pass&) /work/third_party/llvm-project/mlir/include/mlir/IR/MLIRContext.h:275:7
#48 0x00007f5327a81406 mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int) /work/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:513:21
#49 0x00007f5327a81d38 mlir::LogicalResult::failed() const /work/third_party/llvm-project/mlir/include/mlir/Support/LogicalResult.h:44:33
#50 0x00007f5327a81d38 mlir::failed(mlir::LogicalResult) /work/third_party/llvm-project/mlir/include/mlir/Support/LogicalResult.h:72:58
#51 0x00007f5327a81d38 mlir::detail::OpToOpPassAdaptor::runPipeline(mlir::OpPassManager&, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int, mlir::PassInstrumentor*, mlir::PassInstrumentation::PipelineParentInfo const*) /work/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:585:9
#52 0x00007f5327a87e4e mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::$_15::operator()(mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo&) const /work/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:810:5
#53 0x00007f5327a833fb mlir::LogicalResult::failed() const /work/third_party/llvm-project/mlir/include/mlir/Support/LogicalResult.h:44:33
#54 0x00007f5327a833fb mlir::failed(mlir::LogicalResult) /work/third_party/llvm-project/mlir/include/mlir/Support/LogicalResult.h:72:58
#55 0x00007f5327a833fb mlir::LogicalResult mlir::failableParallelForEach<__gnu_cxx::__normal_iterator<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo*, std::vector<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo, std::allocator<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo> > >, mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::$_15&>(mlir::MLIRContext*, __gnu_cxx::__normal_iterator<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo*, std::vector<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo, std::allocator<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo> > >, __gnu_cxx::__normal_iterator<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo*, std::vector<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo, std::allocator<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo> > >, mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::$_15&) /work/third_party/llvm-project/mlir/include/mlir/IR/Threading.h:46:11
#56 0x00007f5327a833fb mlir::LogicalResult mlir::failableParallelForEach<std::vector<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo, std::allocator<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo> >&, mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::$_15&>(mlir::MLIRContext*, std::vector<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo, std::allocator<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo> >&, mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::$_15&) /work/third_party/llvm-project/mlir/include/mlir/IR/Threading.h:92:10
#57 0x00007f5327a833fb mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool) /work/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:815:14
#58 0x00007f5327a815a2 mlir::detail::OpToOpPassAdaptor::runOnOperation(bool) /work/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:706:5
#59 0x00007f5327a815a2 mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int)::$_7::operator()() const /work/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:517:20
#60 0x00007f5327a815a2 void llvm::function_ref<void ()>::callback_fn<mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int)::$_7>(long) /work/third_party/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:45:12
#61 0x00007f5327a815a2 llvm::function_ref<void ()>::operator()() const /work/third_party/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:68:12
#62 0x00007f5327a815a2 void mlir::MLIRContext::executeAction<mlir::PassExecutionAction, mlir::Pass&>(llvm::function_ref<void ()>, llvm::ArrayRef<mlir::IRUnit>, mlir::Pass&) /work/third_party/llvm-project/mlir/include/mlir/IR/MLIRContext.h:275:7
#63 0x00007f5327a815a2 mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int) /work/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:513:21
#64 0x00007f5327a844b6 mlir::LogicalResult::failed() const /work/third_party/llvm-project/mlir/include/mlir/Support/LogicalResult.h:44:33
#65 0x00007f5327a844b6 mlir::failed(mlir::LogicalResult) /work/third_party/llvm-project/mlir/include/mlir/Support/LogicalResult.h:72:58
#66 0x00007f5327a844b6 mlir::detail::OpToOpPassAdaptor::runPipeline(mlir::OpPassManager&, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int, mlir::PassInstrumentor*, mlir::PassInstrumentation::PipelineParentInfo const*) /work/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:585:9
#67 0x00007f5327a844b6 mlir::PassManager::runPasses(mlir::Operation*, mlir::AnalysisManager) /work/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:896:10
#68 0x00007f5327a8425f mlir::PassManager::run(mlir::Operation*) /work/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:0:0
#69 0x00007f532782e993 mlir::LogicalResult::failed() const /work/third_party/llvm-project/mlir/include/mlir/Support/LogicalResult.h:44:33
#70 0x00007f532782e993 mlir::failed(mlir::LogicalResult) /work/third_party/llvm-project/mlir/include/mlir/Support/LogicalResult.h:72:58
#71 0x00007f532782e993 mlir::iree_compiler::embed::(anonymous namespace)::Invocation::runPipeline(iree_compiler_pipeline_t) /work/compiler/src/iree/compiler/API/Internal/CompilerDriver.cpp:958:7
#72 0x00007f532782e993 ireeCompilerInvocationPipeline /work/compiler/src/iree/compiler/API/Internal/CompilerDriver.cpp:1388:23
#73 0x00007f5327a4b779 mlir::iree_compiler::runIreecMain(int, char**)::$_0::operator()(iree_compiler_source_t*) const /work/compiler/src/iree/compiler/Tools/iree_compile_lib.cc:247:11
#74 0x00007f5327a4aff0 mlir::iree_compiler::runIreecMain(int, char**) /work/compiler/src/iree/compiler/Tools/iree_compile_lib.cc:348:10
#75 0x00007f5321050083 __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x24083)
#76 0x000000000020102e _start (/work/full-build-dir/install/bin/iree-compile+0x20102e)
Aborted (core dumped)

I'm going to revert it locally, because many of our RISC-V tests and benchmarks are failing. I'm not sure how to generate a repro for you at this moment, but I will try to do that.

This reverts commit 2800448.

Reverts #79402. Crash reported. On closer inspection, this patch does not handle Intrinsic::maximum and Intrinsic::minimum.

preames · 2024-02-01T21:09:47Z

Reverted, @arcbbb please post a new version with support for the two missing cases. You can reuse this PR if desired.

hanhanW · 2024-02-02T07:33:20Z

> Reverted, @arcbbb please post a new version with support for the two missing cases. You can reuse this PR if desired.

Thanks!

arcbbb · 2024-02-03T16:51:40Z

Thanks @preames and @hanhanW. I added test coverage for this in #80553.

The ‘llvm.vector.reduce.fmaximum/fminimum.*’ intrinsics propagate NaNs if any element of the vector is a NaN. Following llvm#79402, the patch adds the cost of the NaN check (vmfne + vcpop).
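
The instruction counts discussed in this PR can be put together in a small toy model. This is a minimal sketch, not the code in `RISCVTargetTransformInfo.cpp`: the names `MinMaxKind`, `propagatesNaN`, and `estimateMinMaxReductionInsts` are hypothetical, `NumSplits` stands in for `LT.first`, and it assumes the NaN check adds exactly one instruction each for `vmfne` and `vcpop`. It counts instructions only and ignores the LMUL scaling that the real cost model applies.

```cpp
// Hypothetical, simplified stand-in for the reduction kinds; NOT LLVM's types.
enum class MinMaxKind { SMax, SMin, UMax, UMin, MaxNum, MinNum, Maximum, Minimum };

// True for the NaN-propagating kinds (llvm.vector.reduce.fmaximum/fminimum).
inline bool propagatesNaN(MinMaxKind K) {
  return K == MinMaxKind::Maximum || K == MinMaxKind::Minimum;
}

// Toy instruction-count estimate for a min/max reduction.
// NumSplits: number of legal pieces the vector type is split into (>= 1),
// playing the role of LT.first in the PR description.
inline unsigned estimateMinMaxReductionInsts(MinMaxKind K, unsigned NumSplits) {
  unsigned Cost = 3;        // {VMV.S, ReductionOp, VMV.X}
  Cost += NumSplits - 1;    // one VMAX/VMIN per extra piece to combine
  if (propagatesNaN(K))
    Cost += 2;              // assumed: vmfne + vcpop, one instruction each
  return Cost;
}
```

For a type that legalizes without splitting, the sketch gives 3 instructions for an integer max reduction and 5 for an `fmaximum` reduction; each extra split adds one.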

This is recommitted as the test and fix for llvm.vector.reduce.fmaximum/fminimum are covered in #80553 and #80697

arcbbb requested review from lukel97, preames, topperc and wangpc-pp January 25, 2024 04:07

llvmbot added the backend:RISC-V and llvm:analysis labels Jan 25, 2024

preames reviewed Jan 29, 2024

preames approved these changes Jan 29, 2024

preames mentioned this pull request Jan 29, 2024

[RISCV][CostModel] Updates reduction and shuffle cost #77342

Merged

arcbbb merged commit 2800448 into llvm:main Jan 30, 2024

arcbbb deleted the cost-minmax-reduction branch January 30, 2024 08:47

hanhanW added a commit to iree-org/llvm-project that referenced this pull request Feb 1, 2024

Revert "[RISCV] Refine cost on Min/Max reduction (llvm#79402)"

44866b2

This reverts commit 2800448.

preames added a commit that referenced this pull request Feb 1, 2024

Revert "[RISCV] Refine cost on Min/Max reduction (#79402)"

e226ed0

This reverts commit 2800448.

preames mentioned this pull request Feb 1, 2024

Revert "[RISCV] Refine cost on Min/Max reduction" #80340

Merged

preames added a commit that referenced this pull request Feb 1, 2024

Revert "[RISCV] Refine cost on Min/Max reduction" (#80340)

59e5590

Reverts #79402. Crash reported. On closer inspection, this patch does not handle Intrinsic::maximum and Intrinsic::minimum.

arcbbb mentioned this pull request Feb 6, 2024

[RISCV][CostModel] Estimate cost of llvm.vector.reduce.fmaximum/fminimum #80697

Merged

arcbbb added a commit that referenced this pull request Apr 1, 2024

Recommit "[RISCV] Refine cost on Min/Max reduction (#79402)" (#86480)

c7954ca

This is recommitted as the test and fix for llvm.vector.reduce.fmaximum/fminimum are covered in #80553 and #80697

arcbbb added a commit to arcbbb/llvm-project that referenced this pull request Apr 1, 2024

Recommit "[RISCV] Refine cost on Min/Max reduction (llvm#79402)"

0700e64

This is recommitted as the test and fix for llvm.vector.reduce.fmaximum/fminimum are covered in llvm#80553 and llvm#80697
