pre-commit: PR165258 #3007

zyw-bot · 2025-10-31T13:53:44Z

Link: llvm/llvm-project#165258
Requested by: @Camsyn

zyw-bot · 2025-10-31T14:17:10Z

Diff mode

runner: ariselab-64c-docker
baseline: llvm/llvm-project@511c9c0
patch: llvm/llvm-project#165258
sha256: c92a254e91f8790b15e188e6ce95b60c0239a21056effe25e9e0bd537083bef6
commit: 88be3e6

8147 files changed, 1318112 insertions(+), 1323505 deletions(-)

Improvements:
  mem2reg.NumPromoted 70917 -> 71311 +0.56%
  argpromotion.NumArgumentsDead 476619 -> 478442 +0.38%
  function-attrs.NumReadNoneArg 836667 -> 838484 +0.22%
  sccp.NumDeadBlocks 685312 -> 686614 +0.19%
  argpromotion.NumArgumentsPromoted 699786 -> 701113 +0.19%
  function-attrs.NumReadOnlyArg 1881577 -> 1885029 +0.18%
  function-attrs.NumCapturesNone 3404731 -> 3409094 +0.13%
  bdce.NumRemoved 383814 -> 384086 +0.07%
  instcombine.NumSunkInst 3402777 -> 3404705 +0.06%
  globaldce.NumFunctions 348446 -> 348636 +0.05%
Regressions:
  correlated-value-propagation.NumDeadCases 65942 -> 56499 -14.32%
  correlated-value-propagation.NumSExt 47306 -> 46990 -0.67%
  correlated-value-propagation.NumAShrsConverted 3484 -> 3465 -0.55%
  sccp.NumInstReplaced 132887 -> 132233 -0.49%
  correlated-value-propagation.NumNNeg 96021 -> 95768 -0.26%
  simplifycfg.NumSinkCommonCode 378565 -> 377679 -0.23%
  correlated-value-propagation.NumCmps 272160 -> 271564 -0.22%
  local.NumRemoved 5303713 -> 5293801 -0.19%
  correlated-value-propagation.NumUDivURemsNarrowedExpanded 1674 -> 1671 -0.18%
  adce.NumRemoved 95338 -> 95174 -0.17%

+6 llvm/Target.ll
+3 wireshark/packet-rf4ce-nwk.ll
+3 z3/expr_pattern_match.ll
+2 abc/abcSweep.ll
+1 casadi/rootfinder.ll
+1 hdf5/H5Olink.ll
+1 libjpeg-turbo/rdjpgcom.ll
+1 libjpeg-turbo/tjunittest.ll
+1 llama.cpp/llama-vocab.ll
+0 cxxopts/example.ll
+0 fish-rs/e69mx4kebbw5h90l2bpw0bwyt.ll
+0 gromacs/membed.ll
+0 lz4/lz4cli.ll
+0 tinyrenderer/model.ll
-1 abc/amapParse.ll
-1 abc/verFormula.ll
-1 arrow/string-to-double.ll
-1 box2d/imgui_demo.ll
-1 cpython/socketmodule.ll
-1 csmith/FunctionInvocationBinary.ll
-1 ffmpeg/hevcdec.ll
-1 ffmpeg/rv34.ll
-1 icu/double-conversion-string-to-double.ll
-1 linux/fault.ll
-1 mitsuba3/plastic.ll
-1 opencv/array.ll
-1 opencv/ts_perf.ll
-1 openusd/string-to-double.ll
-1 php/zend_ini_scanner.ll
-1 postgres/gindatapage.ll
-1 postgres/pl_exec.ll
-1 postgres/type.ll
-1 quiche-rs/6lp2oyapnsojevo64mk9ap806.ll
-1 ruby/date_parse.ll
-1 stockfish/uci.ll
-1 typst-rs/3kgmqnxcsl3z3n0n.ll
-1 wireshark/packet-ansi_637.ll
-1 zed-rs/dw4qzuo904yf8wu71sutofhxl.ll
-2 abc/ioUtil.ll
-2 icu/number_patternstring.ll
-2 linux/locks.ll
-2 nix/print-ambiguous.ll
-2 openjdk/jvmFlag.ll
-2 zed-rs/20igqmfettcex48uahr8huyna.ll
-2 zed-rs/2g6g1uvat5pik6wc3r3hl3kr7.ll
-3 cmake/zstd_compress.ll
-3 duckdb/zstd_compress_superblock.ll
-3 linux/xhci-debugfs.ll
-3 meshoptimizer/vertexcodec.ll
-3 ruff-rs/1t5d2y321zgutphrasyamrpjz.ll
-3 rustfmt-rs/3xcdaapyewyrfogi.ll
-3 zstd/zstd_compress.ll
-4 cpython/listobject.ll
-4 luau/Quantify.ll
-4 meilisearch-rs/48hhebymxr5ff2nk.ll
-5 flatbuffers/idl_gen_kotlin_kmp.ll
-5 openssl/quic_stream_map.ll
-5 openssl/rsa_ameth.ll
-5 postgres/jsonfuncs.ll
-5 quantlib/fdklugeextouspreadengine.ll
-6 llvm/InstrProfiling.ll
-6 llvm/Program.ll
-6 typst-rs/1ru1rhojhbz2vfey.ll
-6 typst-rs/59tuvc5m3xlovl3o.ll
-8 wasmtime-rs/24jxjxhx40nukvhl.ll
-10 hermes/JSObject.ll
-10 ruff-rs/9ezhgv3vaoku7b96fwwr4f701.ll
-10 rust-analyzer-rs/hf9vzunhg9aziex.ll
-11 c3c/sema_casts.ll
-16 pola-rs/dgtr4n6toyrs0lo6gtn8sd4wk.ll
-19 c3c/types.ll
-20 lief/bignum.ll
-27 hermes/RegexParser.ll
-28 diesel-rs/32aaw0bzsmxs81tm.ll
-55 z3/realclosure.ll
-56 diesel-rs/285i4t9uy6n6phhi.ll
-64 minetest/serialization.ll

github-actions · 2025-10-31T14:19:49Z

Control Flow Simplification: Several functions replace complex conditional checks with simpler icmp comparisons and direct branching, improving code clarity and potentially performance by reducing unnecessary operations.
Phi Node Adjustments: Multiple phi nodes in loops and exit blocks are updated to reflect changes in control flow, such as redirecting incoming blocks or simplifying value selection, ensuring correct dominance and SSA form after structural changes.
Switch Statement Optimization: Some switch statements are streamlined by removing unreachable cases or consolidating duplicate labels, reducing branch overhead and enhancing readability.
Memory Operation Refinement: Stores and loads are adjusted for better alignment and precision (e.g., replacing wide stores with narrower ones), and redundant memory accesses are eliminated through improved aliasing and scope metadata.
Function Signature Updates: Certain function declarations now include range attributes on parameters, providing more precise semantic information to the optimizer, which can enable better optimization decisions.

model: qwen-plus-latest
CompletionUsage(completion_tokens=187, prompt_tokens=107175, total_tokens=107362, completion_tokens_details=None, prompt_tokens_details=None)

Camsyn · 2025-11-01T16:58:40Z

bench/llvm/optimized/Target.ll

+_ZN4llvm12StringSwitchINS_5MachO12PlatformTypeES2_E4CaseENS_13StringLiteralES2_.exit97.thread: ; preds = %_ZN4llvmeqENS_9StringRefES0_.exit.i.i, %_ZN4llvmeqENS_9StringRefES0_.exit.i.i14, %_ZN4llvmeqENS_9StringRefES0_.exit.i.i94, %_ZN4llvmeqENS_9StringRefES0_.exit.i.i86
+  %.sroa.30.11.ph = phi i64 [ 4294967297, %_ZN4llvmeqENS_9StringRefES0_.exit.i.i14 ], [ 4294967296, %_ZN4llvmeqENS_9StringRefES0_.exit.i.i ], [ 0, %_ZN4llvmeqENS_9StringRefES0_.exit.i.i86 ], [ 0, %_ZN4llvmeqENS_9StringRefES0_.exit.i.i94 ]
+  br label %_ZN4llvm12StringSwitchINS_5MachO12PlatformTypeES2_E4CaseENS_13StringLiteralES2_.exit105



Regression?

This change is introduced by different inputs to jump-threading (I am trying to figure out what happens).
And the input difference is introduced by SCCP folding a conditional br to uncond br, as follows:

I see the reason: this appears to be a missed optimization in TryToSimplifyUncondBranchFromEmptyBlock, which bails out and does not fold the other phi values of BB into its successor Succ when BB and Succ share more than one predecessor.

Details

The original JumpThreading generates the following block:

foo.exit97.thread6: ; preds = %i.i, %i.i14, %i.i86
  %.sroa.10.ph = phi i64 [ 5, %i.i14 ], [ 7, %i.i ], [ 9, %i.i86 ]
  %.sroa.30.10.ph = phi i64 [ 4294967297, %i.i14 ], [ 4294967296, %i.i ], [ 0, %i.i86 ]
  br label %foo.exit105
%foo.exit105:
  ...

When there is only ONE common predecessor (i.e., %i.i86) between foo.exit97.thread6 and its successor %foo.exit105, TryToSimplifyUncondBranchFromEmptyBlock can fold the other phi values into %foo.exit105, as follows:

foo.exit97.thread6: ; preds = %i.i86
  br label %foo.exit105
%foo.exit105:
  _ = phi i64 ..., [ 5, %i.i14 ], [ 7, %i.i ], [ 9, %foo.exit97.thread6 ]
  _ = phi i64 ..., [ 4294967297, %i.i14 ], [ 4294967296, %i.i ], [ 0, %foo.exit97.thread6 ]
  ...

Eventually, with the simplified branch values of {{0}, {9}}, JumpThreading can further fold foo.exit97.thread6 into %foo.exit105's successor %foo.exit105.thread as follows:

BB 'foo.exit105': FOUND condition = i1 true for pred 'foo.exit97.thread6'. Threading edge from 'foo.exit97.thread6' to 'foo.exit105.thread', across block: foo.exit105

The SCCP enhancement lets JumpThreading thread more edges, generating the following block (one more predecessor, %i.i94, is merged):

foo.exit97.thread: ; preds = %i.i, %i.i14, %i.i94, %i.i86
  %.sroa.0.ph = phi i64 [ 5, %i.i14 ], [ 7, %i.i ], [ 9, %i.i86 ], [ 4, %i.i94 ]
  %.sroa.1.ph = phi i64 [ 4294967297, %i.i14 ], [ 4294967296, %i.i ], [ 0, %i.i86 ], [ 0, %i.i94 ]
  br label %foo.exit105
%foo.exit105:
  ...

%i.i94 is also a shared predecessor of %foo.exit97.thread and %foo.exit105, so there are now TWO shared predecessors (%i.i94 and %i.i86).
However, TryToSimplifyUncondBranchFromEmptyBlock currently cannot fold the other phi values when there is more than one shared predecessor.

As a result, JumpThreading CANNOT further thread foo.exit97.thread into %foo.exit105's successor %foo.exit105.thread, because the set of values that can flow in from that branch is too complex ( {{0, 4294967296, 4294967297}, {4, 5, 7, 9}} ).

Maybe we can still fold the other phi values in this case, as follows:

foo.exit97.thread: ; preds = %i.i94, %i.i86
  ; %.sroa.0.ph = phi i64 [ 5, %i.i14 ], [ 7, %i.i ], [ 9, %i.i86 ], [ 4, %i.i94 ]
  ; %.sroa.1.ph = phi i64 [ 4294967297, %i.i14 ], [ 4294967296, %i.i ], [ 0, %i.i86 ], [ 0, %i.i94 ]
  %.sroa.0.ph = phi i64 [ 9, %i.i86 ], [ 4, %i.i94 ]
  %.sroa.1.ph = phi i64 [ 0, %i.i86 ], [ 0, %i.i94 ]
  br label %foo.exit105
%foo.exit105:
  _ = phi i64 ..., [ 5, %i.i14 ], [ 7, %i.i ], [ %.sroa.0.ph, %foo.exit97.thread ]
  _ = phi i64 ..., [ 4294967297, %i.i14 ], [ 4294967296, %i.i ], [ %.sroa.1.ph, %foo.exit97.thread ]
  ...

Should we do such optimization?
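
To make the proposed fold concrete, here is a minimal Python sketch on a toy model, not LLVM code. It assumes a phi node can be represented as a dict from predecessor name to incoming value; the helper name `redirect_non_common_preds` and the `"?"` / `"bb_phi"` placeholders are hypothetical and exist only for illustration.

```python
def redirect_non_common_preds(bb, bb_phi, succ_phi):
    """Toy model of the proposed fold for one phi in BB and its user phi in Succ.

    bb_phi:   predecessor name -> incoming value of the phi in BB.
    succ_phi: predecessor name -> incoming value of the phi in Succ,
              where succ_phi[bb] stands for "the value of BB's phi".
    """
    common = set(bb_phi) & set(succ_phi)
    # BB keeps only the predecessors it shares with Succ.
    new_bb_phi = {p: v for p, v in bb_phi.items() if p in common}
    new_succ_phi = dict(succ_phi)
    # Every other predecessor is redirected straight to Succ and carries
    # the value it used to feed into BB's phi.
    for p, v in bb_phi.items():
        if p not in common:
            new_succ_phi[p] = v
    if not new_bb_phi:
        # BB lost all of its predecessors, so its edge into Succ disappears.
        del new_succ_phi[bb]
    else:
        remaining = set(new_bb_phi.values())
        # A phi whose incoming values all agree collapses to that constant.
        new_succ_phi[bb] = remaining.pop() if len(remaining) == 1 else "bb_phi"
    return new_bb_phi, new_succ_phi


# %.sroa.1.ph from the example above. The values that Succ receives directly
# from %i.i86 and %i.i94 are elided in the source, so "?" is a placeholder.
bb_phi = {"i.i14": 4294967297, "i.i": 4294967296, "i.i86": 0, "i.i94": 0}
succ_phi = {"i.i86": "?", "i.i94": "?", "foo.exit97.thread": "bb_phi"}
new_bb_phi, new_succ_phi = redirect_non_common_preds(
    "foo.exit97.thread", bb_phi, succ_phi)
print(new_bb_phi)
print(new_succ_phi)
```

In this model the edge from foo.exit97.thread into the successor carries the single constant 0 again, which is the shape JumpThreading could exploit in the one-shared-predecessor case. For %.sroa.0.ph the remaining values 9 and 4 differ, so that edge would still carry a phi.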

Camsyn · 2025-11-05T07:20:32Z

bench/abc/optimized/abcSweep.ll


 ; Function Attrs: nounwind uwtable
-define range(i32 -1, 2) i32 @Abc_NtkCheckConstant_rec(ptr noundef %0) local_unnamed_addr #0 {
+define i32 @Abc_NtkCheckConstant_rec(ptr noundef %0) local_unnamed_addr #0 {


Regression in SCCP: fails to derive the return range of such a recursive function.

The recurrence is as follows:
$R^{k+1} = [-1, 2) \cup R^k$
$R^0 = [-1, 2)$
Here $k$ is the recursion depth, so the fixed point $R^*$ should clearly have been inferred as $[-1, 2)$.

The core reason is the monotonicity of the SCCP analysis: it uses union (mergeInValue) to update the lattice state, so a value's state can only grow.

E.g., for predicate info, we infer a new range via $\text{CR} = \text{CR} \cup ( \text{ImposedCR} \cap \text{CopyCR} )$.

Assume that initially $\text{CR} = \bot$, $\text{ImposedCR} = [0, 2]$ (edge constraint), and $\text{CopyCR} = \bot$ (e.g., the return value of a call to a function whose return range is unknown).

Then we infer $\text{CR} = [0, 2]$.

If $\text{CopyCR}$ is later inferred to be $[-1, 1]$, we would expect the new $\text{CR}$ to be $[0, 1]$.

However, because the range is updated with $\cup$, the final $\text{CR}$ remains $[0, 2]$.

Perhaps, specifically for constant ranges, we can relax the monotonicity of range evolution.
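
The numbers in this example can be checked with a small Python sketch. It is only a toy model under stated assumptions: ranges are plain sets of integers rather than LLVM's ConstantRange, `None` stands for the unknown state $\bot$, and the helper names `merge_in` and `predicate_update` are hypothetical.

```python
FULL = frozenset(range(-128, 128))  # every signed i8 value, a toy "full" range


def merge_in(old, new):
    """Monotone update: None models the unknown state, otherwise take the union."""
    return new if old is None else old | new


def predicate_update(cr, imposed, copy):
    """CR = CR ∪ (ImposedCR ∩ CopyCR); an unknown CopyCR does not constrain."""
    return merge_in(cr, imposed & (FULL if copy is None else copy))


imposed = frozenset({0, 1, 2})       # ImposedCR = [0, 2]
cr_first = predicate_update(None, imposed, None)

copy_later = frozenset({-1, 0, 1})   # CopyCR = [-1, 1], learned later
cr_second = predicate_update(cr_first, imposed, copy_later)

cr_expected = imposed & copy_later   # what recomputing from scratch would give
print(sorted(cr_first), sorted(cr_second), sorted(cr_expected))
```

The union keeps the stale value 2 in `cr_second`, while recomputing the intersection from scratch would drop it.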

Details

Considering such a simplified situation:

define i8 @foo() {
  ...
switch:
  %ret = tail call i8 @foo()
  switch i8 %ret, label %default [
    i8 0, label %end
  ]
default:
  br label %end
  ...
end:
  %phi = phi i8 [ 0, %switch ], [ %ret, %default ]
  ret i8 %phi
}

Suppose SCCP visits the blocks in the DFS order switch -> end -> default.

Before this patch, SCCP performed analysis as follows:

Visit switch:

%ret = tail call i8 @foo() --> $\bot$ (unknown ), as the ret range of foo is unknown.

Mark edges switch -> end and switch -> default as feasible (markEdgeExecutable)

Visit end:

%phi = phi i8 [ 0, %switch], [ %ret, %default ] -> $[0,1)$ (constantrange), as edge default -> end is not yet feasible.

ret i8 %phi -> set the ret range of foo as $[0,1)$

Add user %ret = tail call i8 @foo() to worklist

Update %ret as $[0,1) = \bot \cup [0,1)$, as old range is $\bot$ and the new range is $[0,1)$.

Visit default:

Mark edge default-> end as feasible (markEdgeExecutable) and push %phi to worklist

Update %phi as union of 0 and %ret -> $[0,1) \cup [0,1)= [0,1)$

SCCP ends with %phi unchanged.

The final ret range of foo is $[0,1)$

After this patch, SCCP performed analysis as follows:

PredicateInfo:

Insert a no-op predicate-info copy %ret.default = bitcast i8 %ret to i8 for edge switch -> default

Visit switch:

%ret = tail call i8 @foo() --> $\bot$ (unknown ), as the ret range of foo is unknown.

handlePredicateInfo: %ret.default = bitcast i8 %ret to i8 -> $[1, 0) = \bot \cup \big([1,0) \cap \bot\big)$, i.e., OrigCR ∪ (ImposedCR ∩ RetCR), where ImposedCR = [1,0) encodes the edge constraint %ret != 0.

Mark edges switch -> end and switch -> default as feasible (markEdgeExecutable)

Visit end:

%phi = phi i8 [ 0, %switch], [ %ret.default, %default ] -> [0,1) (constantrange), as edge default -> end is not yet feasible.

ret i8 %phi -> set the ret range of foo as [0,1)

Add user %ret = tail call i8 @foo() to worklist

Update %ret as $[0,1) = \bot \cup [0,1)$, as old range is $\bot$ and the new range is $[0,1)$.

Add user %ret.default to worklist to update it.

handlePredicateInfo: %ret.default = bitcast i8 %ret to i8 -> $[1, 0) = [1,0) \cup \big([1,0) \cap [0, 1)\big)$, i.e., OrigCR ∪ (ImposedCR ∩ RetCR).

Visit default:

Mark edge default-> end as feasible (markEdgeExecutable) and push %phi to worklist

Update %phi as the union of 0 and %ret.default -> $[0,1) \cup [1,0) = \top$

Update ret range of foo as $\top$, i.e., overdefined

.... SCCP ends with %phi unchanged.

The final ret range of foo is $\top$, i.e., overdefined.
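
Both traces can be replayed with a short Python sketch. It is a simplified model, not the SCCP solver: ranges are sets of i8 bit patterns, `None` stands for $\bot$, the visiting order switch -> end -> default is hard-coded, and `merge_in` and `run` are hypothetical helper names.

```python
FULL = frozenset(range(256))   # all i8 bit patterns, models the overdefined state
ZERO = frozenset({0})          # the range [0, 1)
NONZERO = FULL - ZERO          # the wrapped range [1, 0)


def merge_in(old, new):
    """Monotone update: None models the unknown state, otherwise take the union."""
    return new if old is None else old | new


def run(with_predicate_info):
    """Replay the visit order switch -> end -> default and return %phi's range."""
    ret = None          # %ret: the return range of @foo is unknown at first
    ret_default = None  # %ret.default: only exists with predicate info

    # Visit switch: an unknown %ret does not constrain the imposed range.
    if with_predicate_info:
        ret_default = merge_in(ret_default, NONZERO & FULL)

    # Visit end: only the edge switch -> end is feasible so far.
    phi = ZERO
    ret = merge_in(ret, phi)  # the return range of @foo feeds back into %ret

    # %ret changed, so its user %ret.default is revisited.
    if with_predicate_info:
        ret_default = merge_in(ret_default, NONZERO & ret)

    # Visit default: the edge default -> end becomes feasible.
    incoming = ret_default if with_predicate_info else ret
    phi = merge_in(phi, ZERO | incoming)
    return phi


before = run(with_predicate_info=False)
after = run(with_predicate_info=True)
print(len(before), len(after))
```

In this model the stale $[1, 0)$ stored for %ret.default during the first visit is never shrunk, which is what pushes %phi to the full set in the second run.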

pre-commit: PR165258

8d7485c

github-actions bot mentioned this pull request Oct 31, 2025

Task submission #1312

Open

github-actions bot added 2 commits October 31, 2025 14:16

pre-commit: Update

1633000

pre-commit: Remap

88be3e6

Camsyn reviewed Nov 1, 2025


Camsyn mentioned this pull request Nov 4, 2025

[SimplifyCFG][Local] Redirect other phi values from BB to Succ except commom preds llvm/llvm-project#166390

Open

Camsyn reviewed Nov 5, 2025


dtcxzyw closed this Dec 10, 2025

dtcxzyw deleted the test-run18974668236 branch December 10, 2025 15:29
