Skip to content

Conversation

@arcbbb
Copy link
Contributor

@arcbbb arcbbb commented Jan 25, 2024

This patch is split off from #77342, and follows #79103

  • Correct for CodeSize cost that 1 instruction is not included. 3 is from {VMV.S, ReductionOp, VMV.X}
  • Add SplitCost which chains a series of VMAX/VMIN/... which scales with LMUL.
  • Use MVT to estimate VL.

This patch is split off from llvm#77342, and follows llvm#79103

- Correct for CodeSize cost that 1 instruction is not included. 3 is
  from {VMV.S, ReductionOp, VMV.X}
- Add SplitCost which chains a series of VMAX/VMIN/... which scales
    with LMUL.
- Use MVT to estimate VL.
@llvmbot llvmbot added backend:RISC-V llvm:analysis Includes value tracking, cost tables and constant folding labels Jan 25, 2024
@llvmbot
Copy link
Member

llvmbot commented Jan 25, 2024

@llvm/pr-subscribers-llvm-analysis

@llvm/pr-subscribers-backend-risc-v

Author: Shih-Po Hung (arcbbb)

Changes

This patch is split off from #77342, and follows #79103

  • Correct for CodeSize cost that 1 instruction is not included. 3 is from {VMV.S, ReductionOp, VMV.X}
  • Add SplitCost which chains a series of VMAX/VMIN/... which scales with LMUL.
  • Use MVT to estimate VL.

Patch is 107.33 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/79402.diff

5 Files Affected:

  • (modified) llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp (+36-7)
  • (modified) llvm/test/Analysis/CostModel/RISCV/reduce-max.ll (+70-70)
  • (modified) llvm/test/Analysis/CostModel/RISCV/reduce-min.ll (+70-70)
  • (modified) llvm/test/Analysis/CostModel/RISCV/reduce-scalable-fp.ll (+42-42)
  • (modified) llvm/test/Analysis/CostModel/RISCV/reduce-scalable-int.ll (+48-48)
diff --git a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
index 0799af0a7bad550..0ab795ec99d8a87 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
@@ -942,13 +942,42 @@ RISCVTTIImpl::getMinMaxReductionCost(Intrinsic::ID IID, VectorType *Ty,
     return (LT.first - 1) + 3;
 
   // IR Reduction is composed by two vmv and one rvv reduction instruction.
-  InstructionCost BaseCost = 2;
-
-  if (CostKind == TTI::TCK_CodeSize)
-    return (LT.first - 1) + BaseCost;
-
-  unsigned VL = getEstimatedVLFor(Ty);
-  return (LT.first - 1) + BaseCost + Log2_32_Ceil(VL);
+  unsigned SplitOp;
+  SmallVector<unsigned, 3> Opcodes;
+  switch (IID) {
+  default:
+    llvm_unreachable("Unsupported intrinsic");
+  case Intrinsic::smax:
+    SplitOp = RISCV::VMAX_VV;
+    Opcodes = {RISCV::VMV_S_X, RISCV::VREDMAX_VS, RISCV::VMV_X_S};
+    break;
+  case Intrinsic::smin:
+    SplitOp = RISCV::VMIN_VV;
+    Opcodes = {RISCV::VMV_S_X, RISCV::VREDMIN_VS, RISCV::VMV_X_S};
+    break;
+  case Intrinsic::umax:
+    SplitOp = RISCV::VMAXU_VV;
+    Opcodes = {RISCV::VMV_S_X, RISCV::VREDMAXU_VS, RISCV::VMV_X_S};
+    break;
+  case Intrinsic::umin:
+    SplitOp = RISCV::VMINU_VV;
+    Opcodes = {RISCV::VMV_S_X, RISCV::VREDMINU_VS, RISCV::VMV_X_S};
+    break;
+  case Intrinsic::maxnum:
+    SplitOp = RISCV::VFMAX_VV;
+    Opcodes = {RISCV::VFMV_S_F, RISCV::VFREDMAX_VS, RISCV::VFMV_F_S};
+    break;
+  case Intrinsic::minnum:
+    SplitOp = RISCV::VFMIN_VV;
+    Opcodes = {RISCV::VFMV_S_F, RISCV::VFREDMIN_VS, RISCV::VFMV_F_S};
+    break;
+  }
+  // Add a cost for data larger than LMUL8
+  InstructionCost SplitCost =
+      (LT.first > 1) ? (LT.first - 1) *
+                           getRISCVInstructionCost(SplitOp, LT.second, CostKind)
+                     : 0;
+  return SplitCost + getRISCVInstructionCost(Opcodes, LT.second, CostKind);
 }
 
 InstructionCost
diff --git a/llvm/test/Analysis/CostModel/RISCV/reduce-max.ll b/llvm/test/Analysis/CostModel/RISCV/reduce-max.ll
index c21b8520112cf88..11fa50d355e833d 100644
--- a/llvm/test/Analysis/CostModel/RISCV/reduce-max.ll
+++ b/llvm/test/Analysis/CostModel/RISCV/reduce-max.ll
@@ -51,14 +51,14 @@ define i32 @reduce_umax_i8(i32 %arg) {
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef
 ;
 ; SIZE-LABEL: 'reduce_umax_i8'
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V1 = call i8 @llvm.vector.reduce.umax.v1i8(<1 x i8> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V2 = call i8 @llvm.vector.reduce.umax.v2i8(<2 x i8> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V4 = call i8 @llvm.vector.reduce.umax.v4i8(<4 x i8> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V8 = call i8 @llvm.vector.reduce.umax.v8i8(<8 x i8> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V16 = call i8 @llvm.vector.reduce.umax.v16i8(<16 x i8> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V32 = call i8 @llvm.vector.reduce.umax.v32i8(<32 x i8> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V64 = call i8 @llvm.vector.reduce.umax.v64i8(<64 x i8> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V128 = call i8 @llvm.vector.reduce.umax.v128i8(<128 x i8> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V1 = call i8 @llvm.vector.reduce.umax.v1i8(<1 x i8> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V2 = call i8 @llvm.vector.reduce.umax.v2i8(<2 x i8> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V4 = call i8 @llvm.vector.reduce.umax.v4i8(<4 x i8> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V8 = call i8 @llvm.vector.reduce.umax.v8i8(<8 x i8> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V16 = call i8 @llvm.vector.reduce.umax.v16i8(<16 x i8> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V32 = call i8 @llvm.vector.reduce.umax.v32i8(<32 x i8> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V64 = call i8 @llvm.vector.reduce.umax.v64i8(<64 x i8> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V128 = call i8 @llvm.vector.reduce.umax.v128i8(<128 x i8> undef)
 ; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
 ;
   %V1   = call i8 @llvm.vector.reduce.umax.v1i8(<1 x i8> undef)
@@ -85,14 +85,14 @@ define i32 @reduce_umax_i16(i32 %arg) {
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef
 ;
 ; SIZE-LABEL: 'reduce_umax_i16'
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V1 = call i16 @llvm.vector.reduce.umax.v1i16(<1 x i16> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V2 = call i16 @llvm.vector.reduce.umax.v2i16(<2 x i16> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V4 = call i16 @llvm.vector.reduce.umax.v4i16(<4 x i16> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V8 = call i16 @llvm.vector.reduce.umax.v8i16(<8 x i16> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V16 = call i16 @llvm.vector.reduce.umax.v16i16(<16 x i16> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V32 = call i16 @llvm.vector.reduce.umax.v32i16(<32 x i16> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V64 = call i16 @llvm.vector.reduce.umax.v64i16(<64 x i16> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V128 = call i16 @llvm.vector.reduce.umax.v128i16(<128 x i16> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V1 = call i16 @llvm.vector.reduce.umax.v1i16(<1 x i16> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V2 = call i16 @llvm.vector.reduce.umax.v2i16(<2 x i16> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V4 = call i16 @llvm.vector.reduce.umax.v4i16(<4 x i16> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V8 = call i16 @llvm.vector.reduce.umax.v8i16(<8 x i16> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V16 = call i16 @llvm.vector.reduce.umax.v16i16(<16 x i16> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V32 = call i16 @llvm.vector.reduce.umax.v32i16(<32 x i16> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V64 = call i16 @llvm.vector.reduce.umax.v64i16(<64 x i16> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V128 = call i16 @llvm.vector.reduce.umax.v128i16(<128 x i16> undef)
 ; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
 ;
   %V1   = call i16 @llvm.vector.reduce.umax.v1i16(<1 x i16> undef)
@@ -115,18 +115,18 @@ define i32 @reduce_umax_i32(i32 %arg) {
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %V16 = call i32 @llvm.vector.reduce.umax.v16i32(<16 x i32> undef)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 7 for instruction: %V32 = call i32 @llvm.vector.reduce.umax.v32i32(<32 x i32> undef)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %V64 = call i32 @llvm.vector.reduce.umax.v64i32(<64 x i32> undef)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 10 for instruction: %V128 = call i32 @llvm.vector.reduce.umax.v128i32(<128 x i32> undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %V128 = call i32 @llvm.vector.reduce.umax.v128i32(<128 x i32> undef)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef
 ;
 ; SIZE-LABEL: 'reduce_umax_i32'
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V1 = call i32 @llvm.vector.reduce.umax.v1i32(<1 x i32> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V2 = call i32 @llvm.vector.reduce.umax.v2i32(<2 x i32> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V4 = call i32 @llvm.vector.reduce.umax.v4i32(<4 x i32> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V8 = call i32 @llvm.vector.reduce.umax.v8i32(<8 x i32> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V16 = call i32 @llvm.vector.reduce.umax.v16i32(<16 x i32> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V32 = call i32 @llvm.vector.reduce.umax.v32i32(<32 x i32> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V64 = call i32 @llvm.vector.reduce.umax.v64i32(<64 x i32> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V128 = call i32 @llvm.vector.reduce.umax.v128i32(<128 x i32> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V1 = call i32 @llvm.vector.reduce.umax.v1i32(<1 x i32> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V2 = call i32 @llvm.vector.reduce.umax.v2i32(<2 x i32> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V4 = call i32 @llvm.vector.reduce.umax.v4i32(<4 x i32> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V8 = call i32 @llvm.vector.reduce.umax.v8i32(<8 x i32> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V16 = call i32 @llvm.vector.reduce.umax.v16i32(<16 x i32> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V32 = call i32 @llvm.vector.reduce.umax.v32i32(<32 x i32> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V64 = call i32 @llvm.vector.reduce.umax.v64i32(<64 x i32> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %V128 = call i32 @llvm.vector.reduce.umax.v128i32(<128 x i32> undef)
 ; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
 ;
   %V1   = call i32 @llvm.vector.reduce.umax.v1i32(<1 x i32> undef)
@@ -148,19 +148,19 @@ define i32 @reduce_umax_i64(i32 %arg) {
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %V8 = call i64 @llvm.vector.reduce.umax.v8i64(<8 x i64> undef)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %V16 = call i64 @llvm.vector.reduce.umax.v16i64(<16 x i64> undef)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 7 for instruction: %V32 = call i64 @llvm.vector.reduce.umax.v32i64(<32 x i64> undef)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 9 for instruction: %V64 = call i64 @llvm.vector.reduce.umax.v64i64(<64 x i64> undef)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 12 for instruction: %V128 = call i64 @llvm.vector.reduce.umax.v128i64(<128 x i64> undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 15 for instruction: %V64 = call i64 @llvm.vector.reduce.umax.v64i64(<64 x i64> undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 31 for instruction: %V128 = call i64 @llvm.vector.reduce.umax.v128i64(<128 x i64> undef)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef
 ;
 ; SIZE-LABEL: 'reduce_umax_i64'
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V1 = call i64 @llvm.vector.reduce.umax.v1i64(<1 x i64> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V2 = call i64 @llvm.vector.reduce.umax.v2i64(<2 x i64> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V4 = call i64 @llvm.vector.reduce.umax.v4i64(<4 x i64> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V8 = call i64 @llvm.vector.reduce.umax.v8i64(<8 x i64> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V16 = call i64 @llvm.vector.reduce.umax.v16i64(<16 x i64> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V32 = call i64 @llvm.vector.reduce.umax.v32i64(<32 x i64> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V64 = call i64 @llvm.vector.reduce.umax.v64i64(<64 x i64> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %V128 = call i64 @llvm.vector.reduce.umax.v128i64(<128 x i64> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V1 = call i64 @llvm.vector.reduce.umax.v1i64(<1 x i64> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V2 = call i64 @llvm.vector.reduce.umax.v2i64(<2 x i64> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V4 = call i64 @llvm.vector.reduce.umax.v4i64(<4 x i64> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V8 = call i64 @llvm.vector.reduce.umax.v8i64(<8 x i64> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V16 = call i64 @llvm.vector.reduce.umax.v16i64(<16 x i64> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V32 = call i64 @llvm.vector.reduce.umax.v32i64(<32 x i64> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %V64 = call i64 @llvm.vector.reduce.umax.v64i64(<64 x i64> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %V128 = call i64 @llvm.vector.reduce.umax.v128i64(<128 x i64> undef)
 ; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
 ;
   %V1   = call i64 @llvm.vector.reduce.umax.v1i64(<1 x i64> undef)
@@ -221,14 +221,14 @@ define i32 @reduce_smax_i8(i32 %arg) {
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef
 ;
 ; SIZE-LABEL: 'reduce_smax_i8'
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V1 = call i8 @llvm.vector.reduce.smax.v1i8(<1 x i8> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V2 = call i8 @llvm.vector.reduce.smax.v2i8(<2 x i8> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V4 = call i8 @llvm.vector.reduce.smax.v4i8(<4 x i8> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V8 = call i8 @llvm.vector.reduce.smax.v8i8(<8 x i8> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V16 = call i8 @llvm.vector.reduce.smax.v16i8(<16 x i8> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V32 = call i8 @llvm.vector.reduce.smax.v32i8(<32 x i8> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V64 = call i8 @llvm.vector.reduce.smax.v64i8(<64 x i8> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V128 = call i8 @llvm.vector.reduce.smax.v128i8(<128 x i8> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V1 = call i8 @llvm.vector.reduce.smax.v1i8(<1 x i8> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V2 = call i8 @llvm.vector.reduce.smax.v2i8(<2 x i8> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V4 = call i8 @llvm.vector.reduce.smax.v4i8(<4 x i8> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V8 = call i8 @llvm.vector.reduce.smax.v8i8(<8 x i8> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V16 = call i8 @llvm.vector.reduce.smax.v16i8(<16 x i8> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V32 = call i8 @llvm.vector.reduce.smax.v32i8(<32 x i8> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V64 = call i8 @llvm.vector.reduce.smax.v64i8(<64 x i8> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V128 = call i8 @llvm.vector.reduce.smax.v128i8(<128 x i8> undef)
 ; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
 ;
   %V1   = call i8 @llvm.vector.reduce.smax.v1i8(<1 x i8> undef)
@@ -255,14 +255,14 @@ define i32 @reduce_smax_i16(i32 %arg) {
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef
 ;
 ; SIZE-LABEL: 'reduce_smax_i16'
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V1 = call i16 @llvm.vector.reduce.smax.v1i16(<1 x i16> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V2 = call i16 @llvm.vector.reduce.smax.v2i16(<2 x i16> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V4 = call i16 @llvm.vector.reduce.smax.v4i16(<4 x i16> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V8 = call i16 @llvm.vector.reduce.smax.v8i16(<8 x i16> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V16 = call i16 @llvm.vector.reduce.smax.v16i16(<16 x i16> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V32 = call i16 @llvm.vector.reduce.smax.v32i16(<32 x i16> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V64 = call i16 @llvm.vector.reduce.smax.v64i16(<64 x i16> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V128 = call i16 @llvm.vector.reduce.smax.v128i16(<128 x i16> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V1 = call i16 @llvm.vector.reduce.smax.v1i16(<1 x i16> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V2 = call i16 @llvm.vector.reduce.smax.v2i16(<2 x i16> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V4 = call i16 @llvm.vector.reduce.smax.v4i16(<4 x i16> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V8 = call i16 @llvm.vector.reduce.smax.v8i16(<8 x i16> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V16 = call i16 @llvm.vector.reduce.smax.v16i16(<16 x i16> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V32 = call i16 @llvm.vector.reduce.smax.v32i16(<32 x i16> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V64 = call i16 @llvm.vector.reduce.smax.v64i16(<64 x i16> undef)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V128 = call i16 @llvm.vector.reduce.smax.v128i16(<128 x i16> undef)
 ; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
 ;
   %V1   = call i16 @llvm.vector.reduce.smax.v1i16(<1 x i16> undef)
@@ -285,18 +285,18 @@ define i32 @reduce_smax_i32(i32 %arg) {
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %V16 = call i32 @llvm.vector.reduce.smax.v16i32(<16 x i32> undef)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 7 for instruction: %V32 = call i32 @llvm.vector.reduce.smax.v32i32(<32 x i32> undef)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %V64 = call i32 @llvm.vector.reduce.smax.v64i32(<64 x i32> undef)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 10 for instruction: %V128 = call i32 @llvm.vector.reduce.smax.v128i32(<128 x i32> undef)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %V128 = call i32 @llvm.vector.reduce.smax.v128i32(<128 x i32> undef)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef
 ;
 ; SIZE-LABEL: 'reduce_smax_i32'
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V1 = call i32 @llvm.vector.reduce.smax.v1i32(<1 x i32> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V2 = call i32 @llvm.vector.reduce.smax.v2i32(<2 x i32> undef)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V4 = call i32 @llvm.vector.reduce.smax.v4i32(<4 x i32> undef)
-; SIZE-NEXT:  Cost Model: Found an ...
[truncated]

Copy link
Collaborator

@preames preames left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@arcbbb arcbbb merged commit 2800448 into llvm:main Jan 30, 2024
@arcbbb arcbbb deleted the cost-minmax-reduction branch January 30, 2024 08:47
hanhanW added a commit to iree-org/llvm-project that referenced this pull request Feb 1, 2024
@hanhanW
Copy link
Contributor

hanhanW commented Feb 1, 2024

Hi there, I bisected to this change in a downstream compiler crash on RISC-V. Here is the stack trace:

FAILED: tests/e2e/regression/check_regression_llvm-cpu_dynamic_reduce_min.mlir_module.vmfb /work/build-linux-riscv_64/tests/e2e/regression/check_regression_llvm-cpu_dynamic_reduce_min.mlir_module.vmfb 
cd /work/build-linux-riscv_64/tests/e2e/regression && /work/full-build-dir/install/bin/iree-compile --output-format=vm-bytecode --mlir-print-op-on-diagnostic=false --iree-hal-target-backends=llvm-cpu --iree-input-type=stablehlo /work/tests/e2e/regression/dynamic_reduce_min.mlir -o check_regression_llvm-cpu_dynamic_reduce_min.mlir_module.vmfb --iree-hal-executable-object-search-path=\"/work/build-linux-riscv_64\" --iree-llvmcpu-target-triple=riscv64 --iree-llvmcpu-target-abi=lp64d --iree-llvmcpu-target-cpu-features=+m,+a,+f,+d,+c,+zvl512b,+v --riscv-v-fixed-length-vector-lmul-max=8
Unsupported intrinsic
UNREACHABLE executed at /work/third_party/llvm-project/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp:954!
Please report issues to https://github.com/openxla/iree/issues and include the crash backtrace.
Stack dump:
0.	Program arguments: /work/full-build-dir/install/bin/iree-compile --output-format=vm-bytecode --mlir-print-op-on-diagnostic=false --iree-hal-target-backends=llvm-cpu --iree-input-type=stablehlo /work/tests/e2e/regression/dynamic_reduce_min.mlir -o check_regression_llvm-cpu_dynamic_reduce_min.mlir_module.vmfb --iree-hal-executable-object-search-path=\"/work/build-linux-riscv_64\" --iree-llvmcpu-target-triple=riscv64 --iree-llvmcpu-target-abi=lp64d --iree-llvmcpu-target-cpu-features=+m,+a,+f,+d,+c,+zvl512b,+v --riscv-v-fixed-length-vector-lmul-max=8
 #0 0x00007f53278d96eb llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /work/third_party/llvm-project/llvm/lib/Support/Unix/Signals.inc:723:13
 #1 0x00007f53278d7740 llvm::sys::RunSignalHandlers() /work/third_party/llvm-project/llvm/lib/Support/Signals.cpp:106:18
 #2 0x00007f53278d9dcf SignalHandler(int) /work/third_party/llvm-project/llvm/lib/Support/Unix/Signals.inc:413:1
 #3 0x00007f532f3fd420 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x14420)
 #4 0x00007f532106f00b raise (/lib/x86_64-linux-gnu/libc.so.6+0x4300b)
 #5 0x00007f532104e859 abort (/lib/x86_64-linux-gnu/libc.so.6+0x22859)
 #6 0x00007f5327859151 (/work/full-build-dir/install/bin/../lib/libIREECompiler.so+0x6459151)
 #7 0x00007f532c7383c3 (/work/full-build-dir/install/bin/../lib/libIREECompiler.so+0xb3383c3)
 #8 0x00007f532c73dadc llvm::BasicTTIImplBase<llvm::RISCVTTIImpl>::getTypeBasedIntrinsicInstrCost(llvm::IntrinsicCostAttributes const&, llvm::TargetTransformInfo::TargetCostKind) /work/third_party/llvm-project/llvm/include/llvm/CodeGen/BasicTTIImpl.h:0:0
 #9 0x00007f532c735d01 llvm::BasicTTIImplBase<llvm::RISCVTTIImpl>::getIntrinsicInstrCost(llvm::IntrinsicCostAttributes const&, llvm::TargetTransformInfo::TargetCostKind) /work/third_party/llvm-project/llvm/include/llvm/CodeGen/BasicTTIImpl.h:0:0
#10 0x00007f532c735ba5 llvm::RISCVTTIImpl::getIntrinsicInstrCost(llvm::IntrinsicCostAttributes const&, llvm::TargetTransformInfo::TargetCostKind) /work/third_party/llvm-project/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp:837:17
#11 0x00007f532c6f217d llvm::TargetTransformInfoImplCRTPBase<llvm::RISCVTTIImpl>::getInstructionCost(llvm::User const*, llvm::ArrayRef<llvm::Value const*>, llvm::TargetTransformInfo::TargetCostKind) /work/third_party/llvm-project/llvm/include/llvm/Analysis/TargetTransformInfoImpl.h:1169:25
#12 0x00007f532e50c20d llvm::TargetTransformInfo::getInstructionCost(llvm::User const*, llvm::ArrayRef<llvm::Value const*>, llvm::TargetTransformInfo::TargetCostKind) const /work/third_party/llvm-project/llvm/lib/Analysis/TargetTransformInfo.cpp:270:3
#13 0x00007f532d372149 llvm::TargetTransformInfo::getInstructionCost(llvm::User const*, llvm::TargetTransformInfo::TargetCostKind) const /work/third_party/llvm-project/llvm/include/llvm/Analysis/TargetTransformInfo.h:411:12
#14 0x00007f532e2c00a4 llvm::InstructionCost::propagateState(llvm::InstructionCost const&) /work/third_party/llvm-project/llvm/include/llvm/Support/InstructionCost.h:57:19
#15 0x00007f532e2c00a4 llvm::InstructionCost::operator+=(llvm::InstructionCost const&) /work/third_party/llvm-project/llvm/include/llvm/Support/InstructionCost.h:100:5
#16 0x00007f532e2c00a4 llvm::CodeMetrics::analyzeBasicBlock(llvm::BasicBlock const*, llvm::TargetTransformInfo const&, llvm::SmallPtrSetImpl<llvm::Value const*> const&, bool) /work/third_party/llvm-project/llvm/lib/Analysis/CodeMetrics.cpp:180:14
#17 0x00007f532d9e5c9b llvm::FunctionSpecializer::run() /work/third_party/llvm-project/llvm/lib/Transforms/IPO/FunctionSpecialization.cpp:0:0
#18 0x00007f532d9df563 runIPSCCP(llvm::Module&, llvm::DataLayout const&, llvm::AnalysisManager<llvm::Function>*, std::function<llvm::TargetLibraryInfo const& (llvm::Function&)>, std::function<llvm::TargetTransformInfo& (llvm::Function&)>, std::function<llvm::AssumptionCache& (llvm::Function&)>, std::function<llvm::DominatorTree& (llvm::Function&)>, std::function<llvm::BlockFrequencyInfo& (llvm::Function&)>, bool) /work/third_party/llvm-project/llvm/lib/Transforms/IPO/SCCP.cpp:166:5
#19 0x00007f532d9df563 llvm::IPSCCPPass::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /work/third_party/llvm-project/llvm/lib/Transforms/IPO/SCCP.cpp:415:8
#20 0x00007f532d20c77d llvm::detail::PassModel<llvm::Module, llvm::IPSCCPPass, llvm::PreservedAnalyses, llvm::AnalysisManager<llvm::Module> >::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /work/third_party/llvm-project/llvm/include/llvm/IR/PassManagerInternal.h:89:5
#21 0x00007f532eaaf312 llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module> >::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /work/third_party/llvm-project/llvm/include/llvm/IR/PassManager.h:547:10
#22 0x00007f5329394380 llvm::SmallPtrSetImplBase::isSmall() const /work/third_party/llvm-project/llvm/include/llvm/ADT/SmallPtrSet.h:196:33
#23 0x00007f5329394380 llvm::SmallPtrSetImplBase::~SmallPtrSetImplBase() /work/third_party/llvm-project/llvm/include/llvm/ADT/SmallPtrSet.h:84:10
#24 0x00007f5329394380 llvm::PreservedAnalyses::~PreservedAnalyses() /work/third_party/llvm-project/llvm/include/llvm/IR/PassManager.h:172:7
#25 0x00007f5329394380 mlir::iree_compiler::IREE::HAL::runLLVMIRPasses(mlir::iree_compiler::IREE::HAL::LLVMTarget const&, llvm::TargetMachine*, llvm::Module*) /work/compiler/src/iree/compiler/Dialect/HAL/Target/LLVMCPU/LLVMIRPasses.cpp:105:5
#26 0x00007f532930b41f mlir::LogicalResult::failed() const /work/third_party/llvm-project/mlir/include/mlir/Support/LogicalResult.h:44:33
#27 0x00007f532930b41f mlir::failed(mlir::LogicalResult) /work/third_party/llvm-project/mlir/include/mlir/Support/LogicalResult.h:72:58
#28 0x00007f532930b41f mlir::iree_compiler::IREE::HAL::LLVMCPUTargetBackend::serializeExecutable(mlir::iree_compiler::IREE::HAL::TargetBackend::SerializationOptions const&, mlir::iree_compiler::IREE::HAL::ExecutableVariantOp, mlir::OpBuilder&) /work/compiler/src/iree/compiler/Dialect/HAL/Target/LLVMCPU/LLVMCPUTarget.cpp:526:9
#29 0x00007f5328f955a0 mlir::LogicalResult::failed() const /work/third_party/llvm-project/mlir/include/mlir/Support/LogicalResult.h:44:33
#30 0x00007f5328f955a0 mlir::failed(mlir::LogicalResult) /work/third_party/llvm-project/mlir/include/mlir/Support/LogicalResult.h:72:58
#31 0x00007f5328f955a0 mlir::iree_compiler::IREE::HAL::(anonymous namespace)::SerializeTargetExecutablesPass::runOnOperation() /work/compiler/src/iree/compiler/Dialect/HAL/Transforms/SerializeExecutables.cpp:87:11
#32 0x00007f5327a81406 mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int)::$_7::operator()() const /work/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:519:17
#33 0x00007f5327a81406 void llvm::function_ref<void ()>::callback_fn<mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int)::$_7>(long) /work/third_party/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:45:12
#34 0x00007f5327a81406 llvm::function_ref<void ()>::operator()() const /work/third_party/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:68:12
#35 0x00007f5327a81406 void mlir::MLIRContext::executeAction<mlir::PassExecutionAction, mlir::Pass&>(llvm::function_ref<void ()>, llvm::ArrayRef<mlir::IRUnit>, mlir::Pass&) /work/third_party/llvm-project/mlir/include/mlir/IR/MLIRContext.h:275:7
#36 0x00007f5327a81406 mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int) /work/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:513:21
#37 0x00007f5327a81d38 mlir::LogicalResult::failed() const /work/third_party/llvm-project/mlir/include/mlir/Support/LogicalResult.h:44:33
#38 0x00007f5327a81d38 mlir::failed(mlir::LogicalResult) /work/third_party/llvm-project/mlir/include/mlir/Support/LogicalResult.h:72:58
#39 0x00007f5327a81d38 mlir::detail::OpToOpPassAdaptor::runPipeline(mlir::OpPassManager&, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int, mlir::PassInstrumentor*, mlir::PassInstrumentation::PipelineParentInfo const*) /work/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:585:9
#40 0x00007f5327a86d41 mlir::LogicalResult llvm::function_ref<mlir::LogicalResult (mlir::OpPassManager&, mlir::Operation*)>::callback_fn<mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int)::$_6>(long, mlir::OpPassManager&, mlir::Operation*) /work/third_party/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:45:5
#41 0x00007f5328f96425 mlir::LogicalResult::failed() const /work/third_party/llvm-project/mlir/include/mlir/Support/LogicalResult.h:44:33
#42 0x00007f5328f96425 mlir::failed(mlir::LogicalResult) /work/third_party/llvm-project/mlir/include/mlir/Support/LogicalResult.h:72:58
#43 0x00007f5328f96425 mlir::iree_compiler::IREE::HAL::(anonymous namespace)::SerializeExecutablesPass::runOnOperation() /work/compiler/src/iree/compiler/Dialect/HAL/Transforms/SerializeExecutables.cpp:118:9
#44 0x00007f5327a81406 mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int)::$_7::operator()() const /work/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:519:17
#45 0x00007f5327a81406 void llvm::function_ref<void ()>::callback_fn<mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int)::$_7>(long) /work/third_party/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:45:12
#46 0x00007f5327a81406 llvm::function_ref<void ()>::operator()() const /work/third_party/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:68:12
#47 0x00007f5327a81406 void mlir::MLIRContext::executeAction<mlir::PassExecutionAction, mlir::Pass&>(llvm::function_ref<void ()>, llvm::ArrayRef<mlir::IRUnit>, mlir::Pass&) /work/third_party/llvm-project/mlir/include/mlir/IR/MLIRContext.h:275:7
#48 0x00007f5327a81406 mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int) /work/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:513:21
#49 0x00007f5327a81d38 mlir::LogicalResult::failed() const /work/third_party/llvm-project/mlir/include/mlir/Support/LogicalResult.h:44:33
#50 0x00007f5327a81d38 mlir::failed(mlir::LogicalResult) /work/third_party/llvm-project/mlir/include/mlir/Support/LogicalResult.h:72:58
#51 0x00007f5327a81d38 mlir::detail::OpToOpPassAdaptor::runPipeline(mlir::OpPassManager&, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int, mlir::PassInstrumentor*, mlir::PassInstrumentation::PipelineParentInfo const*) /work/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:585:9
#52 0x00007f5327a87e4e mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::$_15::operator()(mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo&) const /work/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:810:5
#53 0x00007f5327a833fb mlir::LogicalResult::failed() const /work/third_party/llvm-project/mlir/include/mlir/Support/LogicalResult.h:44:33
#54 0x00007f5327a833fb mlir::failed(mlir::LogicalResult) /work/third_party/llvm-project/mlir/include/mlir/Support/LogicalResult.h:72:58
#55 0x00007f5327a833fb mlir::LogicalResult mlir::failableParallelForEach<__gnu_cxx::__normal_iterator<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo*, std::vector<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo, std::allocator<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo> > >, mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::$_15&>(mlir::MLIRContext*, __gnu_cxx::__normal_iterator<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo*, std::vector<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo, std::allocator<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo> > >, __gnu_cxx::__normal_iterator<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo*, std::vector<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo, std::allocator<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo> > >, mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::$_15&) /work/third_party/llvm-project/mlir/include/mlir/IR/Threading.h:46:11
#56 0x00007f5327a833fb mlir::LogicalResult mlir::failableParallelForEach<std::vector<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo, std::allocator<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo> >&, mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::$_15&>(mlir::MLIRContext*, std::vector<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo, std::allocator<mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo> >&, mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::$_15&) /work/third_party/llvm-project/mlir/include/mlir/IR/Threading.h:92:10
#57 0x00007f5327a833fb mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool) /work/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:815:14
#58 0x00007f5327a815a2 mlir::detail::OpToOpPassAdaptor::runOnOperation(bool) /work/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:706:5
#59 0x00007f5327a815a2 mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int)::$_7::operator()() const /work/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:517:20
#60 0x00007f5327a815a2 void llvm::function_ref<void ()>::callback_fn<mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int)::$_7>(long) /work/third_party/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:45:12
#61 0x00007f5327a815a2 llvm::function_ref<void ()>::operator()() const /work/third_party/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:68:12
#62 0x00007f5327a815a2 void mlir::MLIRContext::executeAction<mlir::PassExecutionAction, mlir::Pass&>(llvm::function_ref<void ()>, llvm::ArrayRef<mlir::IRUnit>, mlir::Pass&) /work/third_party/llvm-project/mlir/include/mlir/IR/MLIRContext.h:275:7
#63 0x00007f5327a815a2 mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int) /work/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:513:21
#64 0x00007f5327a844b6 mlir::LogicalResult::failed() const /work/third_party/llvm-project/mlir/include/mlir/Support/LogicalResult.h:44:33
#65 0x00007f5327a844b6 mlir::failed(mlir::LogicalResult) /work/third_party/llvm-project/mlir/include/mlir/Support/LogicalResult.h:72:58
#66 0x00007f5327a844b6 mlir::detail::OpToOpPassAdaptor::runPipeline(mlir::OpPassManager&, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int, mlir::PassInstrumentor*, mlir::PassInstrumentation::PipelineParentInfo const*) /work/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:585:9
#67 0x00007f5327a844b6 mlir::PassManager::runPasses(mlir::Operation*, mlir::AnalysisManager) /work/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:896:10
#68 0x00007f5327a8425f mlir::PassManager::run(mlir::Operation*) /work/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:0:0
#69 0x00007f532782e993 mlir::LogicalResult::failed() const /work/third_party/llvm-project/mlir/include/mlir/Support/LogicalResult.h:44:33
#70 0x00007f532782e993 mlir::failed(mlir::LogicalResult) /work/third_party/llvm-project/mlir/include/mlir/Support/LogicalResult.h:72:58
#71 0x00007f532782e993 mlir::iree_compiler::embed::(anonymous namespace)::Invocation::runPipeline(iree_compiler_pipeline_t) /work/compiler/src/iree/compiler/API/Internal/CompilerDriver.cpp:958:7
#72 0x00007f532782e993 ireeCompilerInvocationPipeline /work/compiler/src/iree/compiler/API/Internal/CompilerDriver.cpp:1388:23
#73 0x00007f5327a4b779 mlir::iree_compiler::runIreecMain(int, char**)::$_0::operator()(iree_compiler_source_t*) const /work/compiler/src/iree/compiler/Tools/iree_compile_lib.cc:247:11
#74 0x00007f5327a4aff0 mlir::iree_compiler::runIreecMain(int, char**) /work/compiler/src/iree/compiler/Tools/iree_compile_lib.cc:348:10
#75 0x00007f5321050083 __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x24083)
#76 0x000000000020102e _start (/work/full-build-dir/install/bin/iree-compile+0x20102e)
Aborted (core dumped)

I'm going to revert it locally, because many of our RISC-V tests and benchmarks are failing. I'm not sure how to generate a repro for you at this moment, but I will try to do that.

preames added a commit that referenced this pull request Feb 1, 2024
preames added a commit that referenced this pull request Feb 1, 2024
Reverts #79402. Crash reported. On closer inspection,
this patch does not handle Intrinsic::maximum and Intrinsic::minimum.
@preames
Copy link
Collaborator

preames commented Feb 1, 2024

Reverted, @arcbbb please post a new version with support for the two missing cases. You can reuse this PR if desired.

@hanhanW
Copy link
Contributor

hanhanW commented Feb 2, 2024

Reverted, @arcbbb please post a new version with support for the two missing cases. You can reuse this PR if desired.

Thanks!

@arcbbb
Copy link
Contributor Author

arcbbb commented Feb 3, 2024

Thanks @preames and @hanhanW . I created a test coverage for this in #80553

agozillon pushed a commit to agozillon/llvm-project that referenced this pull request Feb 5, 2024
Reverts llvm#79402. Crash reported. On closer inspection,
this patch does not handle Intrinsic::maximum and Intrinsic::minimum.
arcbbb added a commit to arcbbb/llvm-project that referenced this pull request Mar 15, 2024
The ‘llvm.vector.reduce.fmaximum/fminimum.*’ intrinsics propagate NaNs.
and if any element of the vector is a NaN.

Following llvm#79402, the patch add the cost of NaN check  (vmfne + vcpop)
arcbbb added a commit that referenced this pull request Mar 25, 2024
…mum (#80697)

The ‘llvm.vector.reduce.fmaximum/fminimum.*’ intrinsics propagate NaNs
if any element of the vector is a NaN.
Following #79402, the patch adds the cost for NaN check (vmfne + vcpop)
arcbbb added a commit that referenced this pull request Apr 1, 2024
This is recommitted as the test and fix for
llvm.vector.reduce.fmaximum/fminimum are covered in #80553 and #80697
arcbbb added a commit to arcbbb/llvm-project that referenced this pull request Apr 1, 2024
This is recommitted as the test and fix for
llvm.vector.reduce.fmaximum/fminimum are covered in llvm#80553 and llvm#80697
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backend:RISC-V llvm:analysis Includes value tracking, cost tables and constant folding

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants