Skip to content

Conversation

@dtcxzyw
Copy link
Owner

@dtcxzyw dtcxzyw commented Dec 27, 2023

@github-actions
Copy link
Contributor

baseline: llvm/llvm-project@fe94ae8
patch: llvm/llvm-project#76458
sha256: 41e6a325a9c1b1fdd9794e6a641d2a150e02df3f8cf0999086a3885213806a66
commit: ac88e3d
ac88e3d pre-commit: Update
bench/abseil-cpp/optimized/algorithm_test.cc.ll
bench/abseil-cpp/optimized/bernoulli_distribution_test.cc.ll
bench/abseil-cpp/optimized/beta_distribution_test.cc.ll
bench/abseil-cpp/optimized/bit_gen_ref_test.cc.ll
bench/abseil-cpp/optimized/chi_square_test.cc.ll
bench/abseil-cpp/optimized/common_policy_traits_test.cc.ll
bench/abseil-cpp/optimized/container_test.cc.ll
bench/abseil-cpp/optimized/convert_test.cc.ll
bench/abseil-cpp/optimized/cord.cc.ll
bench/abseil-cpp/optimized/cord_rep_btree_navigator_test.cc.ll
bench/abseil-cpp/optimized/cord_rep_btree_reader_test.cc.ll
bench/abseil-cpp/optimized/cord_rep_btree_test.cc.ll
bench/abseil-cpp/optimized/cord_rep_crc_test.cc.ll
bench/abseil-cpp/optimized/cord_test.cc.ll
bench/abseil-cpp/optimized/cordz_handle.cc.ll
bench/abseil-cpp/optimized/cordz_handle_test.cc.ll
bench/abseil-cpp/optimized/cordz_info_statistics_test.cc.ll
bench/abseil-cpp/optimized/cordz_info_test.cc.ll
bench/abseil-cpp/optimized/cordz_sample_token_test.cc.ll
bench/abseil-cpp/optimized/cordz_test.cc.ll
bench/abseil-cpp/optimized/crc_cord_state.cc.ll
bench/abseil-cpp/optimized/crc_cord_state_test.cc.ll
bench/abseil-cpp/optimized/crc_memcpy_test.cc.ll
bench/abseil-cpp/optimized/discrete_distribution.cc.ll
bench/abseil-cpp/optimized/discrete_distribution_test.cc.ll
bench/abseil-cpp/optimized/duration_test.cc.ll
bench/abseil-cpp/optimized/endian_test.cc.ll
bench/abseil-cpp/optimized/examples_test.cc.ll
bench/abseil-cpp/optimized/exception_safety_testing_test.cc.ll
bench/abseil-cpp/optimized/explicit_seed_seq_test.cc.ll
bench/abseil-cpp/optimized/exponential_biased_test.cc.ll
bench/abseil-cpp/optimized/exponential_distribution_test.cc.ll
bench/abseil-cpp/optimized/failure_signal_handler_test.cc.ll
bench/abseil-cpp/optimized/fixed_array_test.cc.ll
bench/abseil-cpp/optimized/flat_hash_map_test.cc.ll
bench/abseil-cpp/optimized/flat_hash_set_test.cc.ll
bench/abseil-cpp/optimized/gaussian_distribution_test.cc.ll
bench/abseil-cpp/optimized/graphcycles.cc.ll
bench/abseil-cpp/optimized/graphcycles_test.cc.ll
bench/abseil-cpp/optimized/hash_generator_testing.cc.ll
bench/abseil-cpp/optimized/hash_instantiated_test.cc.ll
bench/abseil-cpp/optimized/hash_policy_traits_test.cc.ll
bench/abseil-cpp/optimized/hash_test.cc.ll
bench/abseil-cpp/optimized/hashtablez_sampler_test.cc.ll
bench/abseil-cpp/optimized/inlined_vector_exception_safety_test.cc.ll
bench/abseil-cpp/optimized/inlined_vector_test.cc.ll
bench/abseil-cpp/optimized/int128_stream_test.cc.ll
bench/abseil-cpp/optimized/int128_test.cc.ll
bench/abseil-cpp/optimized/layout_test.cc.ll
bench/abseil-cpp/optimized/log_basic_test.cc.ll
bench/abseil-cpp/optimized/log_modifier_methods_test.cc.ll
bench/abseil-cpp/optimized/log_severity_test.cc.ll
bench/abseil-cpp/optimized/log_sink_set.cc.ll
bench/abseil-cpp/optimized/log_sink_test.cc.ll
bench/abseil-cpp/optimized/log_streamer_test.cc.ll
bench/abseil-cpp/optimized/log_uniform_int_distribution_test.cc.ll
bench/abseil-cpp/optimized/marshalling.cc.ll
bench/abseil-cpp/optimized/marshalling_test.cc.ll
bench/abseil-cpp/optimized/mock_distributions_test.cc.ll
bench/abseil-cpp/optimized/mocking_bit_gen_test.cc.ll
bench/abseil-cpp/optimized/mutex_test.cc.ll
bench/abseil-cpp/optimized/no_destructor_test.cc.ll
bench/abseil-cpp/optimized/node_hash_map_test.cc.ll
bench/abseil-cpp/optimized/node_hash_set_test.cc.ll
bench/abseil-cpp/optimized/non_temporal_memcpy_test.cc.ll
bench/abseil-cpp/optimized/nonsecure_base_test.cc.ll
bench/abseil-cpp/optimized/numbers_test.cc.ll
bench/abseil-cpp/optimized/parse.cc.ll
bench/abseil-cpp/optimized/parse_test.cc.ll
bench/abseil-cpp/optimized/parser_test.cc.ll
bench/abseil-cpp/optimized/pcg_engine_test.cc.ll
bench/abseil-cpp/optimized/periodic_sampler_test.cc.ll
bench/abseil-cpp/optimized/poisson_distribution_test.cc.ll
bench/abseil-cpp/optimized/randen_engine_test.cc.ll
bench/abseil-cpp/optimized/raw_hash_set_test.cc.ll
bench/abseil-cpp/optimized/reflection.cc.ll
bench/abseil-cpp/optimized/reflection_test.cc.ll
bench/abseil-cpp/optimized/salted_seed_seq_test.cc.ll
bench/abseil-cpp/optimized/sample_recorder_test.cc.ll
bench/abseil-cpp/optimized/seed_sequences.cc.ll
bench/abseil-cpp/optimized/seed_sequences_test.cc.ll
bench/abseil-cpp/optimized/sequence_lock_test.cc.ll
bench/abseil-cpp/optimized/span_test.cc.ll
bench/abseil-cpp/optimized/status_internal.cc.ll
bench/abseil-cpp/optimized/status_test.cc.ll
bench/abseil-cpp/optimized/statusor_test.cc.ll
bench/abseil-cpp/optimized/stderr_log_sink_test.cc.ll
bench/abseil-cpp/optimized/str_cat_test.cc.ll
bench/abseil-cpp/optimized/str_join_test.cc.ll
bench/abseil-cpp/optimized/str_replace.cc.ll
bench/abseil-cpp/optimized/str_replace_test.cc.ll
bench/abseil-cpp/optimized/str_split_test.cc.ll
bench/abseil-cpp/optimized/time_test.cc.ll
bench/abseil-cpp/optimized/time_zone_impl.cc.ll
bench/abseil-cpp/optimized/time_zone_info.cc.ll
bench/abseil-cpp/optimized/unordered_map_test.cc.ll
bench/abseil-cpp/optimized/unordered_set_test.cc.ll
bench/abseil-cpp/optimized/usage.cc.ll
bench/abseil-cpp/optimized/vlog_config.cc.ll

@dtcxzyw
Copy link
Owner Author

dtcxzyw commented Dec 27, 2023

Regression:

diff --git a/bench/proxygen/optimized/HTTPTransactionEgressSM.cpp.ll b/bench/proxygen/optimized/HTTPTransactionEgressSM.cpp.ll
index 94a76d09..b01dce6a 100644
--- a/bench/proxygen/optimized/HTTPTransactionEgressSM.cpp.ll
+++ b/bench/proxygen/optimized/HTTPTransactionEgressSM.cpp.ll
@@ -243,7 +243,13 @@ if.then.i.i:                                      ; preds = %entry
 
 _ZNSt6vectorISt4pairIS0_IN8proxygen27HTTPTransactionEgressSMData5StateENS2_5EventEES3_ESaIS6_EE17_S_check_init_lenEmRKS7_.exit.i: ; preds = %entry
   %cmp.not.i.i = icmp eq i64 %__l.coerce1, 0
-  br i1 %cmp.not.i.i, label %invoke.cont, label %for.body.i.i.i.i.preheader.i
+  br i1 %cmp.not.i.i, label %_ZNSt12_Vector_baseISt4pairIS0_IN8proxygen27HTTPTransactionEgressSMData5StateENS2_5EventEES3_ESaIS6_EE11_M_allocateEm.exit.thread.i, label %for.body.i.i.i.i.preheader.i
+
+_ZNSt12_Vector_baseISt4pairIS0_IN8proxygen27HTTPTransactionEgressSMData5StateENS2_5EventEES3_ESaIS6_EE11_M_allocateEm.exit.thread.i: ; preds = %_ZNSt6vectorISt4pairIS0_IN8proxygen27HTTPTransactionEgressSMData5StateENS2_5EventEES3_ESaIS6_EE17_S_check_init_lenEmRKS7_.exit.i
+  %add.ptr8.i = getelementptr inbounds i8, ptr null, i64 %add.ptr.i.idx
+  %_M_end_of_storage9.i = getelementptr inbounds %"struct.std::_Vector_base<std::pair<std::pair<proxygen::HTTPTransactionEgressSMData::State, proxygen::HTTPTransactionEgressSMData::Event>, proxygen::HTTPTransactionEgressSMData::State>, std::allocator<std::pair<std::pair<proxygen::HTTPTransactionEgressSMData::State, proxygen::HTTPTransactionEgressSMData::Event>, proxygen::HTTPTransactionEgressSMData::State>>>::_Vector_impl_data", ptr %this, i64 0, i32 2
+  store ptr %add.ptr8.i, ptr %_M_end_of_storage9.i, align 8
+  br label %invoke.cont
 
 for.body.i.i.i.i.preheader.i:                     ; preds = %_ZNSt6vectorISt4pairIS0_IN8proxygen27HTTPTransactionEgressSMData5StateENS2_5EventEES3_ESaIS6_EE17_S_check_init_lenEmRKS7_.exit.i
   %call5.i.i.i.i2 = invoke noalias noundef nonnull ptr @_Znwm(i64 noundef %add.ptr.i.idx) #15
@@ -251,7 +257,7 @@ for.body.i.i.i.i.preheader.i:                     ; preds = %_ZNSt6vectorISt4pai
 
 call5.i.i.i.i.noexc:                              ; preds = %for.body.i.i.i.i.preheader.i
   store ptr %call5.i.i.i.i2, ptr %this, align 8
-  %add.ptr.i1 = getelementptr inbounds %"struct.std::pair.5", ptr %call5.i.i.i.i2, i64 %__l.coerce1
+  %add.ptr.i1 = getelementptr inbounds i8, ptr %call5.i.i.i.i2, i64 %add.ptr.i.idx
   %_M_end_of_storage.i = getelementptr inbounds %"struct.std::_Vector_base<std::pair<std::pair<proxygen::HTTPTransactionEgressSMData::State, proxygen::HTTPTransactionEgressSMData::Event>, proxygen::HTTPTransactionEgressSMData::State>, std::allocator<std::pair<std::pair<proxygen::HTTPTransactionEgressSMData::State, proxygen::HTTPTransactionEgressSMData::Event>, proxygen::HTTPTransactionEgressSMData::State>>>::_Vector_impl_data", ptr %this, i64 0, i32 2
   store ptr %add.ptr.i1, ptr %_M_end_of_storage.i, align 8
   %0 = add nsw i64 %add.ptr.i.idx, -3
@@ -262,8 +268,8 @@ call5.i.i.i.i.noexc:                              ; preds = %for.body.i.i.i.i.pr
   %scevgep.i = getelementptr i8, ptr %call5.i.i.i.i2, i64 %3
   br label %invoke.cont
 
-invoke.cont:                                      ; preds = %_ZNSt6vectorISt4pairIS0_IN8proxygen27HTTPTransactionEgressSMData5StateENS2_5EventEES3_ESaIS6_EE17_S_check_init_lenEmRKS7_.exit.i, %call5.i.i.i.i.noexc
-  %__cur.0.lcssa.i.i.i.i.i = phi ptr [ %scevgep.i, %call5.i.i.i.i.noexc ], [ null, %_ZNSt6vectorISt4pairIS0_IN8proxygen27HTTPTransactionEgressSMData5StateENS2_5EventEES3_ESaIS6_EE17_S_check_init_lenEmRKS7_.exit.i ]
+invoke.cont:                                      ; preds = %call5.i.i.i.i.noexc, %_ZNSt12_Vector_baseISt4pairIS0_IN8proxygen27HTTPTransactionEgressSMData5StateENS2_5EventEES3_ESaIS6_EE11_M_allocateEm.exit.thread.i
+  %__cur.0.lcssa.i.i.i.i.i = phi ptr [ %scevgep.i, %call5.i.i.i.i.noexc ], [ null, %_ZNSt12_Vector_baseISt4pairIS0_IN8proxygen27HTTPTransactionEgressSMData5StateENS2_5EventEES3_ESaIS6_EE11_M_allocateEm.exit.thread.i ]
   %_M_finish.i = getelementptr inbounds %"struct.std::_Vector_base<std::pair<std::pair<proxygen::HTTPTransactionEgressSMData::State, proxygen::HTTPTransactionEgressSMData::Event>, proxygen::HTTPTransactionEgressSMData::State>, std::allocator<std::pair<std::pair<proxygen::HTTPTransactionEgressSMData::State, proxygen::HTTPTransactionEgressSMData::Event>, proxygen::HTTPTransactionEgressSMData::State>>>::_Vector_impl_data", ptr %this, i64 0, i32 1
   store ptr %__cur.0.lcssa.i.i.i.i.i, ptr %_M_finish.i, align 8
   ret void

@nikic
Copy link

nikic commented Dec 28, 2023

@dtcxzyw It's possible to fold getelementptr inbounds i8, ptr null, i64 %add.ptr.i.idx to null (assuming this is not compiled with -fno-delete-null-pointer-checks).

@dtcxzyw
Copy link
Owner Author

dtcxzyw commented Dec 28, 2023

@dtcxzyw It's possible to fold getelementptr inbounds i8, ptr null, i64 %add.ptr.i.idx to null (assuming this is not compiled with -fno-delete-null-pointer-checks).

Alive2: https://alive2.llvm.org/ce/z/Zy_4Dz
My concern is that it would break some incorrect offsetof idioms like #define offsetof(st, m) ((size_t)&(((st *)0)->m)), which is UB according to the C/C++ standard.
Example: https://godbolt.org/z/PvT57z895

I will post a patch later.

@nikic
Copy link

nikic commented Dec 28, 2023

@dtcxzyw We could avoid doing the transform if the offset is also constant. Or change clang frontend to not emit inbounds for this specific pattern. I agree that it would be best not to break it.

@dtcxzyw
Copy link
Owner Author

dtcxzyw commented Dec 28, 2023

A possible miscompilation after simplifying gep inbounds null, idx -> null:
https://godbolt.org/z/Gshj9rz6e

#include <vector>

std::vector<std::vector<int>> test(std::vector<int> y) {
	std::vector<std::vector<int>> x;
	x.push_back(y);
	return x;
}

diff:

148,151c148
<   %add.ptr.i.i.i.i.i70 = getelementptr inbounds i32, ptr null, i64 %sub.ptr.div.i.i.i.i
<   %_M_end_of_storage.i.i.i.i.i71 = getelementptr inbounds %"struct.std::_Vector_base<int, std::allocator<int>>::_Vector_impl_data", ptr %add.ptr, i64 0, i32 2
<   tail call void @llvm.memset.p0.i64(ptr noundef nonnull align 8 dereferenceable(16) %add.ptr, i8 0, i64 16, i1 false)
<   store ptr %add.ptr.i.i.i.i.i70, ptr %_M_end_of_storage.i.i.i.i.i71, align 8, !tbaa !16
---
>   tail call void @llvm.memset.p0.i64(ptr noundef nonnull align 8 dereferenceable(24) %add.ptr, i8 0, i64 24, i1 false)

But this BB seems to be unreachable.

@dtcxzyw dtcxzyw force-pushed the main branch 9 times, most recently from 296115f to 7225cd8 Compare September 17, 2024 07:23
@dtcxzyw dtcxzyw closed this Oct 22, 2024
@dtcxzyw dtcxzyw deleted the test-pr76458 branch October 22, 2024 12:01
dtcxzyw added a commit to llvm/llvm-project that referenced this pull request Apr 11, 2025
…se pointers (#130734)

In the LLVM middle-end we want to fold `gep inbounds null, idx -> null`:
https://alive2.llvm.org/ce/z/5ZkPx-
This pattern is common in real-world programs
(dtcxzyw/llvm-opt-benchmark#55 (comment)).
Generally, it exists in some (actually) unreachable blocks, which is
introduced by JumpThreading.

However, some old-style offsetof macros are still widely used in
real-world C/C++ code (e.g., hwloc/slurm/luajit). To avoid breaking
existing code and inconvenience to downstream users, this patch removes
the inbounds flag from the struct gep if the base pointer is null.
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Apr 11, 2025
…ith null base pointers (#130734)

In the LLVM middle-end we want to fold `gep inbounds null, idx -> null`:
https://alive2.llvm.org/ce/z/5ZkPx-
This pattern is common in real-world programs
(dtcxzyw/llvm-opt-benchmark#55 (comment)).
Generally, it exists in some (actually) unreachable blocks, which is
introduced by JumpThreading.

However, some old-style offsetof macros are still widely used in
real-world C/C++ code (e.g., hwloc/slurm/luajit). To avoid breaking
existing code and inconvenience to downstream users, this patch removes
the inbounds flag from the struct gep if the base pointer is null.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants