forked from JuliaLang/julia
Patch2 #6
Some recent additions were not placed into the "correct" spot. Note: empirically, the sort ordering employed here ignores underscores, so I am doing this here, too. A script performed the sorting.
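As a rough illustration of the ordering described above, the sketch below sorts a few names while ignoring underscores. The entries and the `sortkey` helper are made up for this example; the actual script used for the commit is not shown here, and whether it also ignores case is not stated, so this sketch stays case-sensitive.

```julia
# Hypothetical helper: the sort key is the name with all underscores removed.
sortkey(name::AbstractString) = replace(name, "_" => "")

# Made-up entries, not taken from the sorted file.
entries = ["foo_c", "foobaz", "foo_bar"]

# Plain lexicographic sort: '_' sorts before lowercase letters.
plain = sort(entries)

# Underscore-ignoring sort, as described in the commit message.
ignoring = sort(entries; by = sortkey)

println(plain)
println(ignoring)
```

With a plain sort `foo_c` lands before `foobaz`, whereas the underscore-ignoring order places it after.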
…ang#36177)

Co-authored-by: Viral B. Shah
Co-authored-by: Simeon Schaub
Raw `unreachable` in LLVM is somewhat dangerous since LLVM will literally stop emitting code there, so if the unreachable is wrong, execution will crash into whatever subsequent code exists, causing strange and mysterious errors and incorrect backtraces that are hard to debug without rr. As a result, we basically always emit a safety trap call before an `unreachable` terminator, which will abort execution at the point where unreachable was executed and at least provide proper backtraces. However, we neglected to do this for literal unreachables that came from Julia IR (i.e. ReturnNodes with undef val field). Fix that to make debugging easier if unreachable ever gets accidentally executed.
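For readers unfamiliar with the IR form mentioned above, here is a minimal sketch of how an "unreachable" return looks at the Julia level, assuming a Julia version that provides `Core.ReturnNode` (1.6 or later). The helper name `is_unreachable` is hypothetical and is not part of the compiler; the sketch only inspects the node and does not touch codegen.

```julia
# A ReturnNode constructed without an argument leaves its `val` field
# undefined; this is the IR-level encoding of `unreachable`.
unreachable_node = Core.ReturnNode()

# An ordinary return carries the returned value in `val`.
normal_node = Core.ReturnNode(42)

# Hypothetical helper (not a compiler API): detect the unreachable form.
is_unreachable(node::Core.ReturnNode) = !isdefined(node, :val)

println(is_unreachable(unreachable_node))
println(is_unreachable(normal_node))
```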
* Add "Check whitespace" buildkite job
* Add `key` values for later `wait` blocks
* Add "embedding tests" to buildkite configuration
* Eliminate unnecessary work in buildkite builds

  Don't bother to precompile the Julia system image like we normally would want to if we're just going to run things once. Also use `JULIA_NUM_CORES` instead of hard-coding `-j 6` into the buildsystem.
* Update embedding.yml
* Update whitespace.yml
* Comment out signed pipeline test

  This is confirmed working, so let's comment it out until it's actually used by a codesigning step or similar.
* Specifically notify llvm passes
* Add `/cache/repos` as a mapping into the CI sandbox

  This should allow `git` to find its cached objects properly, which should silence the warnings on CI, and also give us the proper git version info within buildkite builds
* Break up `llvmpasses` output a bit
* Provide `/cache/repos` for `whitespace` as well
* Give a positive message if whitespace check passes

  It's a little unnerving to have a silent command block in buildkite, so let's output a success message if everything is on the up-and-up
when building with USE_BINARYBUILDER=0
fix for 0d Cartesian AbstractArray. This version should be fast enough if the size of Cartesian array's first dim is larger than 16 (eltype Float64).
N5N3 added a commit that referenced this pull request on Oct 29, 2021
commit c054dbc
Author: Shuhei Kadowaki
Date: Fri Oct 29 01:31:55 2021 +0900

optimizer: eliminate allocations (JuliaLang#42833)

commit 6a9737d
Author: Jeff Bezanson
Date: Thu Oct 28 12:23:53 2021 -0400

fix JuliaLang#42659, move `jl_coverage_visit_line` to runtime library (JuliaLang#42810)

commit c762f10
Author: Marc Ittel
Date: Thu Oct 28 12:19:13 2021 +0200

change `julia` to `julia-repl` in docstrings (JuliaLang#42824)

Co-authored-by: Michael Abbott

commit 9f52ec0
Author: Dilum Aluthge
Date: Thu Oct 28 05:30:11 2021 -0400

CI (Buildkite): Update all rootfs images to the latest versions (JuliaLang#42802)

* CI (Buildkite): Update all rootfs images to the latest versions
* Re-sign all of the signed pipelines

commit 404e584
Author: DilumAluthgeBot
Date: Wed Oct 27 21:11:04 2021 -0400

🤖 Bump the Statistics stdlib from 74897fe to 5256d57 (JuliaLang#42826)

Co-authored-by: Dilum Aluthge

commit c74814e
Author: Jeff Bezanson
Date: Wed Oct 27 16:34:46 2021 -0400

reset `RandomDevice` file from `__init__` (JuliaLang#42537)

This prevents us from seeing an invalid `IOStream` object from a saved system image, and also ensures the files are opened once for all threads.
commit 05ed348
Author: Jeff Bezanson
Date: Wed Oct 27 15:24:17 2021 -0400

only visit nonfunction_mt once when traversing method tables (JuliaLang#42821)

commit d71b77d
Author: DilumAluthgeBot
Date: Tue Oct 26 20:39:08 2021 -0400

🤖 Bump the Downloads stdlib from 5f1509d to dbb0625 (JuliaLang#42811)

Co-authored-by: Dilum Aluthge

commit b4fddc1
Author: DilumAluthgeBot
Date: Tue Oct 26 14:46:20 2021 -0400

🤖 Bump the Pkg stdlib from bc32103f to 26918395 (JuliaLang#42806)

Co-authored-by: Dilum Aluthge

commit 6a386de
Author: Dilum Aluthge
Date: Tue Oct 26 12:15:51 2021 -0400

CI (Buildkite): make sure to hit ignore any unencrypted repo keys, regardless of where they are located in the repository (JuliaLang#42803)

commit 021a6b5
Author: Shuhei Kadowaki
Date: Wed Oct 27 01:08:33 2021 +0900

optimizer: clean up inlining test code (JuliaLang#42804)

commit 16eb196
Merge: 21ebabf 1510eaa
Author: Shuhei Kadowaki
Date: Tue Oct 26 23:25:41 2021 +0900

Merge pull request JuliaLang#42766 from JuliaLang/avi/42754

optimizer: fix JuliaLang#42754, inline union-split const-prop'ed sources

commit 21ebabf
Author: Kristoffer Carlsson
Date: Tue Oct 26 16:11:32 2021 +0200

simplify code loading test now that TOML files are parsed with a real TOML parser (JuliaLang#42328)

commit 1510eaa
Author: Shuhei Kadowaki
Date: Mon Oct 25 01:35:12 2021 +0900

optimizer: fix JuliaLang#42754, inline union-split const-prop'ed sources

This commit complements JuliaLang#39754 and JuliaLang#39305: implements a logic to use constant-prop'ed results for inlining at union-split callsite. Currently it works only for cases when constant-prop' succeeded for all (union-split) signatures.
> example
```julia
julia> mutable struct X
           # NOTE in order to confuse `fieldtype_tfunc`, we need to have at least two fields with different types
           a::Union{Nothing, Int}
           b::Symbol
       end;

julia> code_typed((X, Union{Nothing,Int})) do x, a
           # this `setproperty` call would be union-split and constant-prop will happen for
           # each signature: inlining would fail if we don't use constant-prop'ed source
           # since the approximated inlining cost of `convert(fieldtype(X, sym), a)` would
           # end up very high if we don't propagate `sym::Const(:a)`
           x.a = a
           x
       end |> only |> first
```

> before this commit
```julia
CodeInfo(
1 ─ %1 = Base.setproperty!::typeof(setproperty!)
│   %2 = (isa)(a, Nothing)::Bool
└──      goto #3 if not %2
2 ─ %4 = π (a, Nothing)
│        invoke %1(_2::X, :a::Symbol, %4::Nothing)::Any
└──      goto #6
3 ─ %7 = (isa)(a, Int64)::Bool
└──      goto #5 if not %7
4 ─ %9 = π (a, Int64)
│        invoke %1(_2::X, :a::Symbol, %9::Int64)::Any
└──      goto #6
5 ─      Core.throw(ErrorException("fatal error in type inference (type bound)"))::Union{}
└──      unreachable
6 ┄      return x
)
```

> after this commit
```julia
CodeInfo(
1 ─ %1 = (isa)(a, Nothing)::Bool
└──      goto #3 if not %1
2 ─      Base.setfield!(x, :a, nothing)::Nothing
└──      goto #6
3 ─ %5 = (isa)(a, Int64)::Bool
└──      goto #5 if not %5
4 ─ %7 = π (a, Int64)
│        Base.setfield!(x, :a, %7)::Int64
└──      goto #6
5 ─      Core.throw(ErrorException("fatal error in type inference (type bound)"))::Union{}
└──      unreachable
6 ┄      return x
)
```

commit 4c3ae20
Author: Chris Foster
Date: Tue Oct 26 21:48:32 2021 +1000

Make Base.ifelse a generic function (JuliaLang#37343)

Allow user code to directly extend `Base.ifelse` rather than needing a special package for it.
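To make the `Base.ifelse` change concrete, here is a minimal sketch of what extending it from user code might look like, assuming a Julia version that includes this commit (on earlier versions `ifelse` is a builtin and adding a method is an error). The `Flag` type is hypothetical and exists only for this illustration.

```julia
# Hypothetical wrapper type standing in for a user-defined "boolean-like" value.
struct Flag
    on::Bool
end

# Extend the generic function directly; this forwards to the Bool method.
Base.ifelse(c::Flag, x, y) = ifelse(c.on, x, y)

println(ifelse(Flag(true), 1, 2))
println(ifelse(Flag(false), 1, 2))
```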
commit 2e388e3
Author: Shuhei Kadowaki
Date: Mon Oct 25 01:30:09 2021 +0900

optimizer: eliminate excessive specialization in inlining code

This commit includes several code quality improvements in inlining code:
- eliminate excessive specializations around:
  * `item::Pair{Any, Any}` constructions
  * iterations on `Vector{Pair{Any, Any}}`
- replace `Pair{Any, Any}` with new, more explicit data type `InliningCase`
- remove dead code
N5N3 pushed a commit that referenced this pull request on Oct 29, 2021
This commit complements JuliaLang#39754 and JuliaLang#39305: implements a logic to use constant-prop'ed results for inlining at union-split callsite. Currently it works only for cases when constant-prop' succeeded for all (union-split) signatures.

> example
```julia
julia> mutable struct X
           # NOTE in order to confuse `fieldtype_tfunc`, we need to have at least two fields with different types
           a::Union{Nothing, Int}
           b::Symbol
       end;

julia> code_typed((X, Union{Nothing,Int})) do x, a
           # this `setproperty` call would be union-split and constant-prop will happen for
           # each signature: inlining would fail if we don't use constant-prop'ed source
           # since the approximated inlining cost of `convert(fieldtype(X, sym), a)` would
           # end up very high if we don't propagate `sym::Const(:a)`
           x.a = a
           x
       end |> only |> first
```

> before this commit
```julia
CodeInfo(
1 ─ %1 = Base.setproperty!::typeof(setproperty!)
│   %2 = (isa)(a, Nothing)::Bool
└──      goto #3 if not %2
2 ─ %4 = π (a, Nothing)
│        invoke %1(_2::X, :a::Symbol, %4::Nothing)::Any
└──      goto #6
3 ─ %7 = (isa)(a, Int64)::Bool
└──      goto #5 if not %7
4 ─ %9 = π (a, Int64)
│        invoke %1(_2::X, :a::Symbol, %9::Int64)::Any
└──      goto #6
5 ─      Core.throw(ErrorException("fatal error in type inference (type bound)"))::Union{}
└──      unreachable
6 ┄      return x
)
```

> after this commit
```julia
CodeInfo(
1 ─ %1 = (isa)(a, Nothing)::Bool
└──      goto #3 if not %1
2 ─      Base.setfield!(x, :a, nothing)::Nothing
└──      goto #6
3 ─ %5 = (isa)(a, Int64)::Bool
└──      goto #5 if not %5
4 ─ %7 = π (a, Int64)
│        Base.setfield!(x, :a, %7)::Int64
└──      goto #6
5 ─      Core.throw(ErrorException("fatal error in type inference (type bound)"))::Union{}
└──      unreachable
6 ┄      return x
)
```
N5N3 pushed a commit that referenced this pull request on Apr 3, 2022
Currently the optimizer handles an abstract callsite only when there is a single dispatch candidate (in most cases), and so inlining and static dispatch are prohibited when the callsite is union-split (in other words, union-split happens only when all the dispatch candidates are concrete).

However, there are certain patterns of code (most notably our Julia-level compiler code) that inherently need to deal with abstract callsites. The following example is taken from a `Core.Compiler` utility:
```julia
julia> @inline isType(@nospecialize t) = isa(t, DataType) && t.name === Type.body.name
isType (generic function with 1 method)

julia> code_typed((Any,)) do x # abstract, but no union-split, successful inlining
           isType(x)
       end |> only
CodeInfo(
1 ─ %1 = (x isa Main.DataType)::Bool
└──      goto #3 if not %1
2 ─ %3 = π (x, DataType)
│   %4 = Base.getfield(%3, :name)::Core.TypeName
│   %5 = Base.getfield(Type{T}, :name)::Core.TypeName
│   %6 = (%4 === %5)::Bool
└──      goto #4
3 ─      goto #4
4 ┄ %9 = φ (#2 => %6, #3 => false)::Bool
└──      return %9
) => Bool

julia> code_typed((Union{Type,Nothing},)) do x # abstract, union-split, unsuccessful inlining
           isType(x)
       end |> only
CodeInfo(
1 ─ %1 = (isa)(x, Nothing)::Bool
└──      goto #3 if not %1
2 ─      goto #4
3 ─ %4 = Main.isType(x)::Bool
└──      goto #4
4 ┄ %6 = φ (#2 => false, #3 => %4)::Bool
└──      return %6
) => Bool
```
(note that this is a limitation of the inlining algorithm, and so user-provided hints like callsite inlining annotations don't help here)

This commit enables inlining and static dispatch for abstract union-split callsites.
The core idea here is that we can simulate our dispatch semantics by generating `isa` checks in order of the specificity of the dispatch candidates:
```julia
julia> code_typed((Union{Type,Nothing},)) do x # abstract, union-split, successful inlining
           isType(x)
       end |> only
CodeInfo(
1 ─ %1  = (isa)(x, Nothing)::Bool
└──       goto #3 if not %1
2 ─       goto #9
3 ─ %4  = (isa)(x, Type)::Bool
└──       goto #8 if not %4
4 ─ %6  = π (x, Type)
│   %7  = (%6 isa Main.DataType)::Bool
└──       goto #6 if not %7
5 ─ %9  = π (%6, DataType)
│   %10 = Base.getfield(%9, :name)::Core.TypeName
│   %11 = Base.getfield(Type{T}, :name)::Core.TypeName
│   %12 = (%10 === %11)::Bool
└──       goto #7
6 ─       goto #7
7 ┄ %15 = φ (#5 => %12, #6 => false)::Bool
└──       goto #9
8 ─       Core.throw(ErrorException("fatal error in type inference (type bound)"))::Union{}
└──       unreachable
9 ┄ %19 = φ (#2 => false, #7 => %15)::Bool
└──       return %19
) => Bool
```

Inlining/static-dispatch of abstract union-split callsites will improve performance in such situations (and so this commit will improve the latency of our JIT compilation). In particular, this commit helps us avoid excessive specializations of `Core.Compiler` code by statically resolving `@nospecialize`d callsites, and as a result, the number of precompiled statements is now reduced from `2005` ([`master`](f782430)) to `1912` (this commit).

As a side effect, the implementation of our inlining algorithm is also much simpler now, since we no longer need the previous special handling for abstract callsites.

One possible drawback would be increased code size.
This change does seem to increase the size of the sysimage, but I think these numbers are in an acceptable range:

> [`master`](f782430)
```
❯ du -shk usr/lib/julia/*
17604   usr/lib/julia/corecompiler.ji
194072  usr/lib/julia/sys-o.a
169424  usr/lib/julia/sys.dylib
23784   usr/lib/julia/sys.dylib.dSYM
103772  usr/lib/julia/sys.ji
```

> this commit
```
❯ du -shk usr/lib/julia/*
17512   usr/lib/julia/corecompiler.ji
195588  usr/lib/julia/sys-o.a
170908  usr/lib/julia/sys.dylib
23776   usr/lib/julia/sys.dylib.dSYM
105360  usr/lib/julia/sys.ji
```
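The idea of simulating dispatch with ordered `isa` checks can be sketched by hand in ordinary Julia. The functions `kind` and `kind_split` below are hypothetical and are not what the optimizer emits; they only show why the checks have to run from the most specific candidate to the least specific one.

```julia
# Three dispatch candidates; `Int` is more specific than `Integer`.
kind(::Nothing) = :nothing
kind(::Int)     = :int
kind(::Integer) = :integer

# Hand-written "union-split": test candidates in order of specificity, so
# that an `Int` never falls into the more general `Integer` branch.
function kind_split(x::Union{Integer,Nothing})
    if x isa Nothing
        return :nothing
    elseif x isa Int
        return :int
    elseif x isa Integer
        return :integer
    else
        # Mirrors the "type bound" error branch in the IR above.
        error("unreachable: argument violates its declared type bound")
    end
end

for x in (nothing, 1, 0x01, big(2))
    println(x, " => ", kind_split(x))
end
```

Swapping the `Int` and `Integer` branches would make `kind_split(1)` disagree with real dispatch, which is why the ordering matters.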
N5N3 pushed a commit that referenced this pull request on Jun 27, 2022
…Lang#45790)

Currently the `@nospecialize`-d `push!(::Vector{Any}, ...)` can only take a single item, and we end up with runtime dispatch when we try to call it with multiple items:
```julia
julia> code_typed(push!, (Vector{Any}, Any))
1-element Vector{Any}:
 CodeInfo(
1 ─      $(Expr(:foreigncall, :(:jl_array_grow_end), Nothing, svec(Any, UInt64), 0, :(:ccall), Core.Argument(2), 0x0000000000000001, 0x0000000000000001))::Nothing
│   %2 = Base.arraylen(a)::Int64
│        Base.arrayset(true, a, item, %2)::Vector{Any}
└──      return a
) => Vector{Any}

julia> code_typed(push!, (Vector{Any}, Any, Any))
1-element Vector{Any}:
 CodeInfo(
1 ─ %1 = Base.append!(a, iter)::Vector{Any}
└──      return %1
) => Vector{Any}
```

This commit adds a new specialization that can take an arbitrary number of items. Our compiler should still be able to optimize the single-input case as before via the dispatch mechanism.
```julia
julia> code_typed(push!, (Vector{Any}, Any))
1-element Vector{Any}:
 CodeInfo(
1 ─      $(Expr(:foreigncall, :(:jl_array_grow_end), Nothing, svec(Any, UInt64), 0, :(:ccall), Core.Argument(2), 0x0000000000000001, 0x0000000000000001))::Nothing
│   %2 = Base.arraylen(a)::Int64
│        Base.arrayset(true, a, item, %2)::Vector{Any}
└──      return a
) => Vector{Any}

julia> code_typed(push!, (Vector{Any}, Any, Any))
1-element Vector{Any}:
 CodeInfo(
1 ─ %1  = Base.arraylen(a)::Int64
│         $(Expr(:foreigncall, :(:jl_array_grow_end), Nothing, svec(Any, UInt64), 0, :(:ccall), Core.Argument(2), 0x0000000000000002, 0x0000000000000002))::Nothing
└──       goto #7 if not true
2 ┄ %4  = φ (#1 => 1, #6 => %14)::Int64
│   %5  = φ (#1 => 1, #6 => %15)::Int64
│   %6  = Base.getfield(x, %4, true)::Any
│   %7  = Base.add_int(%1, %4)::Int64
│         Base.arrayset(true, a, %6, %7)::Vector{Any}
│   %9  = (%5 === 2)::Bool
└──       goto #4 if not %9
3 ─       goto #5
4 ─ %12 = Base.add_int(%5, 1)::Int64
└──       goto #5
5 ┄ %14 = φ (#4 => %12)::Int64
│   %15 = φ (#4 => %12)::Int64
│   %16 = φ (#3 => true, #4 => false)::Bool
│   %17 = Base.not_int(%16)::Bool
└──       goto #7 if not %17
6 ─       goto #2
7 ┄       return a
) => Vector{Any}
```

This commit also adds the equivalent implementations for `pushfirst!`.
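The shape of such a varargs method can be sketched with public API only. The function `push_all!` below is a hypothetical stand-in, not the method added to Base: it uses `resize!` and plain indexing instead of the internal grow/`arrayset` calls visible in the IR above.

```julia
# Hypothetical sketch: append any number of items to a Vector{Any}
# without specializing on the item types.
function push_all!(a::Vector{Any}, @nospecialize items...)
    n = length(a)
    # Grow the vector once, then fill the new slots in order.
    resize!(a, n + length(items))
    for i in 1:length(items)
        a[n + i] = items[i]
    end
    return a
end

v = Any[1]
push_all!(v, 2, :c, "d")
println(v)
```

Growing the vector once up front, rather than once per item, matches the single `jl_array_grow_end` call in the "after" IR.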
N5N3 pushed a commit that referenced this pull request on Jun 29, 2022
When calling `jl_error()` or `jl_errorf()`, we must check to see if we
are so early in the bringup process that it is dangerous to attempt to
construct a backtrace because the data structures used to provide line
information are not yet properly set up.
This can be easily triggered by running:
```
julia -C invalid
```
On an `i686-linux-gnu` build, this will hit the "Invalid CPU Name"
branch in `jitlayers.cpp`, which calls `jl_errorf()`. This in turn
calls `jl_throw()`, which will eventually call `jl_DI_for_fptr` as part
of the backtrace printing process, which fails as the object maps are
not fully initialized. See the below `gdb` stacktrace for details:
```
$ gdb -batch -ex 'r' -ex 'bt' --args ./julia -C invalid
...
fatal: error thrown and no exception handler available.
ErrorException("Invalid CPU name "invalid".")
Thread 1 "julia" received signal SIGSEGV, Segmentation fault.
0xf75bd665 in std::_Rb_tree<unsigned int, std::pair<unsigned int const, JITDebugInfoRegistry::ObjectInfo>, std::_Select1st<std::pair<unsigned int const, JITDebugInfoRegistry::ObjectInfo> >, std::greater<unsigned int>, std::allocator<std::pair<unsigned int const, JITDebugInfoRegistry::ObjectInfo> > >::lower_bound (__k=<optimized out>, this=0x248) at /usr/local/i686-linux-gnu/include/c++/9.1.0/bits/stl_tree.h:1277
1277 /usr/local/i686-linux-gnu/include/c++/9.1.0/bits/stl_tree.h: No such file or directory.
#0 0xf75bd665 in std::_Rb_tree<unsigned int, std::pair<unsigned int const, JITDebugInfoRegistry::ObjectInfo>, std::_Select1st<std::pair<unsigned int const, JITDebugInfoRegistry::ObjectInfo> >, std::greater<unsigned int>, std::allocator<std::pair<unsigned int const, JITDebugInfoRegistry::ObjectInfo> > >::lower_bound (__k=<optimized out>, this=0x248) at /usr/local/i686-linux-gnu/include/c++/9.1.0/bits/stl_tree.h:1277
#1 std::map<unsigned int, JITDebugInfoRegistry::ObjectInfo, std::greater<unsigned int>, std::allocator<std::pair<unsigned int const, JITDebugInfoRegistry::ObjectInfo> > >::lower_bound (__x=<optimized out>, this=0x248) at /usr/local/i686-linux-gnu/include/c++/9.1.0/bits/stl_map.h:1258
#2 jl_DI_for_fptr (fptr=4155049385, symsize=symsize@entry=0xffffcfa8, slide=slide@entry=0xffffcfa0, Section=Section@entry=0xffffcfb8, context=context@entry=0xffffcf94) at /cache/build/default-amdci5-4/julialang/julia-master/src/debuginfo.cpp:1181
#3 0xf75c056a in jl_getFunctionInfo_impl (frames_out=0xffffd03c, pointer=4155049385, skipC=0, noInline=0) at /cache/build/default-amdci5-4/julialang/julia-master/src/debuginfo.cpp:1210
#4 0xf7a6ca98 in jl_print_native_codeloc (ip=4155049385) at /cache/build/default-amdci5-4/julialang/julia-master/src/stackwalk.c:636
#5 0xf7a6cd54 in jl_print_bt_entry_codeloc (bt_entry=0xf0798018) at /cache/build/default-amdci5-4/julialang/julia-master/src/stackwalk.c:657
#6 jlbacktrace () at /cache/build/default-amdci5-4/julialang/julia-master/src/stackwalk.c:1090
#7 0xf7a3cd2b in ijl_no_exc_handler (e=0xf0794010) at /cache/build/default-amdci5-4/julialang/julia-master/src/task.c:605
#8 0xf7a3d10a in throw_internal (ct=ct@entry=0xf070c010, exception=<optimized out>, exception@entry=0xf0794010) at /cache/build/default-amdci5-4/julialang/julia-master/src/task.c:638
#9 0xf7a3d330 in ijl_throw (e=0xf0794010) at /cache/build/default-amdci5-4/julialang/julia-master/src/task.c:654
#10 0xf7a905aa in ijl_errorf (fmt=fmt@entry=0xf7647cd4 "Invalid CPU name \"%s\".") at /cache/build/default-amdci5-4/julialang/julia-master/src/rtutils.c:77
#11 0xf75a4b22 in (anonymous namespace)::createTargetMachine () at /cache/build/default-amdci5-4/julialang/julia-master/src/jitlayers.cpp:823
#12 JuliaOJIT::JuliaOJIT (this=<optimized out>) at /cache/build/default-amdci5-4/julialang/julia-master/src/jitlayers.cpp:1044
#13 0xf7531793 in jl_init_llvm () at /cache/build/default-amdci5-4/julialang/julia-master/src/codegen.cpp:8585
#14 0xf75318a8 in jl_init_codegen_impl () at /cache/build/default-amdci5-4/julialang/julia-master/src/codegen.cpp:8648
#15 0xf7a51a52 in jl_restore_system_image_from_stream (f=<optimized out>) at /cache/build/default-amdci5-4/julialang/julia-master/src/staticdata.c:2131
#16 0xf7a55c03 in ijl_restore_system_image_data (buf=0xe859c1c0 <jl_system_image_data> "8'\031\003", len=125161105) at /cache/build/default-amdci5-4/julialang/julia-master/src/staticdata.c:2184
#17 0xf7a55cf9 in jl_load_sysimg_so () at /cache/build/default-amdci5-4/julialang/julia-master/src/staticdata.c:424
#18 ijl_restore_system_image (fname=0x80a0900 "/build/bk_download/julia-d78fdad601/lib/julia/sys.so") at /cache/build/default-amdci5-4/julialang/julia-master/src/staticdata.c:2157
#19 0xf7a3bdfc in _finish_julia_init (rel=rel@entry=JL_IMAGE_JULIA_HOME, ct=<optimized out>, ptls=<optimized out>) at /cache/build/default-amdci5-4/julialang/julia-master/src/init.c:741
#20 0xf7a3c8ac in julia_init (rel=<optimized out>) at /cache/build/default-amdci5-4/julialang/julia-master/src/init.c:728
#21 0xf7a7f61d in jl_repl_entrypoint (argc=<optimized out>, argv=0xffffddf4) at /cache/build/default-amdci5-4/julialang/julia-master/src/jlapi.c:705
#22 0x080490a7 in main (argc=3, argv=0xffffddf4) at /cache/build/default-amdci5-4/julialang/julia-master/cli/loader_exe.c:59
```
To prevent this, we simply avoid calling `jl_errorf` this early in the
process, punting the problem to a later PR that can update guard
conditions within `jl_error*`.
mbauman pushed a commit that referenced this pull request on Dec 11, 2023
This is part of the work to address JuliaLang#51352 by attempting to allow the compiler to perform SROA on persistent data structures like `PersistentDict` as if they were regular immutable data structures. These sorts of data structures have very complicated internals (with lots of mutation, memory sharing, etc.), but a relatively simple interface. As such, it is unlikely that our compiler will have sufficient power to optimize this interface by analyzing the implementation.

We thus need to come up with some other mechanism that gives the compiler license to perform the requisite optimization. One way would be to just hardcode `PersistentDict` into the compiler, optimizing it like any of the other builtin datatypes. However, this is of course very unsatisfying. At the other end of the spectrum would be something like a generic rewrite rule system (e-graphs anyone?) that would let the `PersistentDict` implementation declare its interface to the compiler and the compiler would use this for optimization (in a perfect world, the actual rewrite would then be checked using some sort of formal methods). I think that would be interesting, but we're very far from even being able to design something like that (at least in Base - experiments with external AbstractInterpreters in this direction are encouraged).

This PR tries to come up with a reasonable middle ground, where the compiler gets some knowledge of the protocol hardcoded without having to know about the implementation details of the data structure.

The basic idea is that `Core` provides some magic generic functions that implementations can extend. Semantically, they are not special. They dispatch as usual, and implementations are expected to work properly even in the absence of any compiler optimizations.

However, the compiler is semantically permitted to perform structural optimization using these magic generic functions.
In the concrete case, this PR introduces the `KeyValue` interface which consists of two generic functions, `get` and `set`. The core optimization is that the compiler is allowed to rewrite any occurrence of `get(set(x, k, v), k)` into `v` without additional legality checks. In particular, the compiler performs no type checks, conversions, etc. The higher level implementation code is expected to do all that.

This approach closely matches the general direction we've been taking in external AbstractInterpreters for embedding additional semantics and optimization opportunities into Julia code (although we generally use methods there, rather than full generic functions), so I think we have some evidence that this sort of approach works reasonably well.

Nevertheless, this is certainly an experiment and the interface is explicitly declared unstable.

## Current Status

This is fully working and implemented, but the optimization currently bails on anything but the simplest cases. Filling all those cases in is not particularly hard, but should be done along with a more invasive refactoring of SROA, so we should figure out the general direction here first and then we can finish all that up in a follow-up cleanup.

## Obligatory benchmark

Before:
```
julia> using BenchmarkTools

julia> function foo()
           a = Base.PersistentDict(:a => 1)
           return a[:a]
       end
foo (generic function with 1 method)

julia> @benchmark foo()
BenchmarkTools.Trial: 10000 samples with 993 evaluations.
 Range (min … max):  32.940 ns … 28.754 μs  ┊ GC (min … max):  0.00% … 99.76%
 Time  (median):     49.647 ns              ┊ GC (median):     0.00%
 Time  (mean ± σ):   57.519 ns ± 333.275 ns ┊ GC (mean ± σ):  10.81% ±  2.22%

  ▃█▅ ▁▃▅▅▃▁ ▁▃▂ ▂
  ▁▂▄▃▅▇███▇▃▁▂▁▁▁▁▁▁▁▁▂▂▅██████▅▂▁▁▁▁▁▁▁▁▁▁▂▃▃▇███▇▆███▆▄▃▃▂▂ ▃
  32.9 ns         Histogram: frequency by time         68.6 ns <

 Memory estimate: 128 bytes, allocs estimate: 4.

julia> @code_typed foo()
CodeInfo(
1 ─ %1  = invoke Vector{Union{Base.HashArrayMappedTries.HAMT{Symbol, Int64}, Base.HashArrayMappedTries.Leaf{Symbol, Int64}}}(Base.HashArrayMappedTries.undef::UndefInitializer, 1::Int64)::Vector{Union{Base.HashArrayMappedTries.HAMT{Symbol, Int64}, Base.HashArrayMappedTries.Leaf{Symbol, Int64}}}
│   %2  = %new(Base.HashArrayMappedTries.HAMT{Symbol, Int64}, %1, 0x00000000)::Base.HashArrayMappedTries.HAMT{Symbol, Int64}
│   %3  = %new(Base.HashArrayMappedTries.Leaf{Symbol, Int64}, :a, 1)::Base.HashArrayMappedTries.Leaf{Symbol, Int64}
│   %4  = Base.getfield(%2, :data)::Vector{Union{Base.HashArrayMappedTries.HAMT{Symbol, Int64}, Base.HashArrayMappedTries.Leaf{Symbol, Int64}}}
│   %5  = $(Expr(:boundscheck, true))::Bool
└──       goto #5 if not %5
2 ─ %7  = Base.sub_int(1, 1)::Int64
│   %8  = Base.bitcast(UInt64, %7)::UInt64
│   %9  = Base.getfield(%4, :size)::Tuple{Int64}
│   %10 = $(Expr(:boundscheck, true))::Bool
│   %11 = Base.getfield(%9, 1, %10)::Int64
│   %12 = Base.bitcast(UInt64, %11)::UInt64
│   %13 = Base.ult_int(%8, %12)::Bool
└──       goto #4 if not %13
3 ─       goto #5
4 ─ %16 = Core.tuple(1)::Tuple{Int64}
│         invoke Base.throw_boundserror(%4::Vector{Union{Base.HashArrayMappedTries.HAMT{Symbol, Int64}, Base.HashArrayMappedTries.Leaf{Symbol, Int64}}}, %16::Tuple{Int64})::Union{}
└──       unreachable
5 ┄ %19 = Base.getfield(%4, :ref)::MemoryRef{Union{Base.HashArrayMappedTries.HAMT{Symbol, Int64}, Base.HashArrayMappedTries.Leaf{Symbol, Int64}}}
│   %20 = Base.memoryref(%19, 1, false)::MemoryRef{Union{Base.HashArrayMappedTries.HAMT{Symbol, Int64}, Base.HashArrayMappedTries.Leaf{Symbol, Int64}}}
│         Base.memoryrefset!(%20, %3, :not_atomic, false)::MemoryRef{Union{Base.HashArrayMappedTries.HAMT{Symbol, Int64}, Base.HashArrayMappedTries.Leaf{Symbol, Int64}}}
└──       goto #6
6 ─ %23 = Base.getfield(%2, :bitmap)::UInt32
│   %24 = Base.or_int(%23, 0x00010000)::UInt32
│         Base.setfield!(%2, :bitmap, %24)::UInt32
└──       goto #7
7 ─ %27 = %new(Base.PersistentDict{Symbol, Int64}, %2)::Base.PersistentDict{Symbol, Int64}
└──       goto #8
8 ─ %29 = invoke Base.getindex(%27::Base.PersistentDict{Symbol, Int64}, :a::Symbol)::Int64
└──       return %29
```

After:
```
julia> using BenchmarkTools

julia> function foo()
           a = Base.PersistentDict(:a => 1)
           return a[:a]
       end
foo (generic function with 1 method)

julia> @benchmark foo()
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min … max):  2.459 ns … 11.320 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     2.460 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   2.469 ns ±  0.183 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▂ █ ▁ █ ▂
  █▁▁▁▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█▁▁▁▁█ █
  2.46 ns      Histogram: log(frequency) by time      2.47 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @code_typed foo()
CodeInfo(
1 ─     return 1
```
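The law that the compiler is allowed to exploit, `get(set(x, k, v), k) == v`, can be illustrated with a toy persistent structure. Everything below is hypothetical: `AssocList`, `kv_set` and `kv_get` are made-up names for this sketch and do not use or imitate the actual `KeyValue` functions introduced by the PR.

```julia
# Toy persistent map: an immutable association list that shares its tail.
struct AssocList{K,V}
    key::K
    value::V
    rest::Union{Nothing,AssocList{K,V}}
end

# "set" never mutates; it returns a new list that points at the old one.
kv_set(m::Union{Nothing,AssocList{K,V}}, k::K, v::V) where {K,V} =
    AssocList{K,V}(k, v, m)

# "get" walks the list; the most recent binding for a key wins.
kv_get(::Nothing, k) = throw(KeyError(k))
kv_get(m::AssocList, k) = m.key == k ? m.value : kv_get(m.rest, k)

m1 = kv_set(nothing, :a, 1)
m2 = kv_set(m1, :b, 2)
m3 = kv_set(m2, :a, 10)   # shadows the older :a binding; m1 is unchanged

println(kv_get(m3, :a))
println(kv_get(m1, :a))
```

For any such implementation, a lookup of `k` immediately after setting `k` to `v` yields `v`, which is the rewrite the PR lets the compiler perform without inspecting the internals.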
N5N3 pushed a commit that referenced this pull request on Jan 27, 2024
`@something` eagerly unwraps any `Some` given to it, while keeping the
variable between its arguments the same. This can be an issue if a
previously unpacked value is used as input to `@something`, leading to a
type instability on more than two arguments (e.g. because of a fallback
to `Some(nothing)`). By using different variables for each argument,
type inference has an easier time handling these cases that are isolated
to single branches anyway.
This also adds some comments to the macro, since it's non-obvious what
it does.
Benchmarking the specific case I encountered this in led to a ~2x
performance improvement on multiple machines.
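For context, here is a small sketch of what `@something` does at the user level, assuming Julia 1.7 or later where the macro is available. The function `first_some` is a hypothetical helper that mirrors the `may_a` update in the benchmark; it does not show the macro's internals or the change made in this PR.

```julia
# Returns the first argument that is not `nothing`, unwrapping `Some`.
# Later arguments are only evaluated if the earlier ones were `nothing`.
function first_some(may_a, digit_res)
    return @something may_a digit_res Some(nothing)
end

println(first_some(nothing, Some(0x05)))   # falls through to the second argument
println(first_some(0x03, Some(0x05)))      # first argument already has a value
println(first_some(nothing, nothing))      # falls back to Some(nothing)

# Short-circuiting: the error call is never evaluated here.
short_circuit = @something 1 error("not evaluated")
```

The `Some(nothing)` fallback is what lets the result itself be `nothing` instead of raising an error when every earlier argument is `nothing`.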
1.10-beta3/master:
```
[sukera@tower 01]$ jl1100 -q --project=. -L 01.jl -e 'bench()'
v"1.10.0-beta3"
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
Range (min … max): 38.670 μs … 70.350 μs ┊ GC (min … max): 0.00% … 0.00%
Time (median): 43.340 μs ┊ GC (median): 0.00%
Time (mean ± σ): 43.395 μs ± 1.518 μs ┊ GC (mean ± σ): 0.00% ± 0.00%
▆█▂ ▁▁
▂▂▂▂▂▂▂▂▂▁▂▂▂▃▃▃▂▂▃▃▃▂▂▂▂▂▄▇███▆██▄▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂ ▃
38.7 μs Histogram: frequency by time 48 μs <
Memory estimate: 0 bytes, allocs estimate: 0.
```
This PR:
```
[sukera@tower 01]$ julia -q --project=. -L 01.jl -e 'bench()'
v"1.11.0-DEV.970"
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
Range (min … max): 22.820 μs … 44.980 μs ┊ GC (min … max): 0.00% … 0.00%
Time (median): 24.300 μs ┊ GC (median): 0.00%
Time (mean ± σ): 24.370 μs ± 832.239 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
▂▅▇██▇▆▅▁
▂▂▂▂▂▂▂▂▃▃▄▅▇███████████▅▄▃▃▂▂▂▂▂▂▂▂▂▂▁▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▁▂▂ ▃
22.8 μs Histogram: frequency by time 27.7 μs <
Memory estimate: 0 bytes, allocs estimate: 0.
```
<details>
<summary>Benchmarking code (spoilers for Advent Of Code 2023 Day 01,
Part 01). Running this requires the input of that Advent Of Code
day.</summary>
```julia
using BenchmarkTools
using InteractiveUtils
isdigit(d::UInt8) = UInt8('0') <= d <= UInt8('9')
someDigit(c::UInt8) = isdigit(c) ? Some(c - UInt8('0')) : nothing
function part1(data)
    total = 0
    may_a = nothing
    may_b = nothing
    for c in data
        digitRes = someDigit(c)
        may_a = @something may_a digitRes Some(nothing)
        may_b = @something digitRes may_b Some(nothing)
        if c == UInt8('\n')
            digit_a = may_a::UInt8
            digit_b = may_b::UInt8
            total += digit_a*0xa + digit_b
            may_a = nothing
            may_b = nothing
        end
    end
    return total
end
function bench()
    data = read("input.txt")
    display(VERSION)
    println()
    display(@benchmark part1($data))
    nothing
end
```
</details>
<details>
<summary>`@code_warntype` before</summary>
```julia
julia> @code_warntype part1(data)
MethodInstance for part1(::Vector{UInt8})
from part1(data) @ Main ~/Documents/projects/AOC/2023/01/01.jl:7
Arguments
#self#::Core.Const(part1)
data::Vector{UInt8}
Locals
@_3::Union{Nothing, Tuple{UInt8, Int64}}
may_b::Union{Nothing, UInt8}
may_a::Union{Nothing, UInt8}
total::Int64
c::UInt8
digit_b::UInt8
digit_a::UInt8
val@_10::Any
val@_11::Any
digitRes::Union{Nothing, Some{UInt8}}
@_13::Union{Some{Nothing}, Some{UInt8}, UInt8}
@_14::Union{Some{Nothing}, Some{UInt8}}
@_15::Some{Nothing}
@_16::Union{Some{Nothing}, Some{UInt8}, UInt8}
@_17::Union{Some{Nothing}, UInt8}
@_18::Some{Nothing}
Body::Int64
1 ── (total = 0)
│ (may_a = Main.nothing)
│ (may_b = Main.nothing)
│ %4 = data::Vector{UInt8}
│ (@_3 = Base.iterate(%4))
│ %6 = (@_3 === nothing)::Bool
│ %7 = Base.not_int(%6)::Bool
└─── goto #24 if not %7
2 ┄─ Core.NewvarNode(:(digit_b))
│ Core.NewvarNode(:(digit_a))
│ Core.NewvarNode(:(val@_10))
│ %12 = @_3::Tuple{UInt8, Int64}
│ (c = Core.getfield(%12, 1))
│ %14 = Core.getfield(%12, 2)::Int64
│ (digitRes = Main.someDigit(c))
│ (val@_11 = may_a)
│ %17 = (val@_11::Union{Nothing, UInt8} !== Base.nothing)::Bool
└─── goto #4 if not %17
3 ── (@_13 = val@_11::UInt8)
└─── goto #11
4 ── (val@_11 = digitRes)
│ %22 = (val@_11::Union{Nothing, Some{UInt8}} !== Base.nothing)::Bool
└─── goto #6 if not %22
5 ── (@_14 = val@_11::Some{UInt8})
└─── goto #10
6 ── (val@_11 = Main.Some(Main.nothing))
│ %27 = (val@_11::Core.Const(Some(nothing)) !== Base.nothing)::Core.Const(true)
└─── goto #8 if not %27
7 ── (@_15 = val@_11::Core.Const(Some(nothing)))
└─── goto #9
8 ── Core.Const(:(@_15 = Base.nothing))
9 ┄─ (@_14 = @_15)
10 ┄ (@_13 = @_14)
11 ┄ %34 = @_13::Union{Some{Nothing}, Some{UInt8}, UInt8}
│ (may_a = Base.something(%34))
│ (val@_10 = digitRes)
│ %37 = (val@_10::Union{Nothing, Some{UInt8}} !== Base.nothing)::Bool
└─── goto #13 if not %37
12 ─ (@_16 = val@_10::Some{UInt8})
└─── goto #20
13 ─ (val@_10 = may_b)
│ %42 = (val@_10::Union{Nothing, UInt8} !== Base.nothing)::Bool
└─── goto #15 if not %42
14 ─ (@_17 = val@_10::UInt8)
└─── goto #19
15 ─ (val@_10 = Main.Some(Main.nothing))
│ %47 = (val@_10::Core.Const(Some(nothing)) !== Base.nothing)::Core.Const(true)
└─── goto #17 if not %47
16 ─ (@_18 = val@_10::Core.Const(Some(nothing)))
└─── goto #18
17 ─ Core.Const(:(@_18 = Base.nothing))
18 ┄ (@_17 = @_18)
19 ┄ (@_16 = @_17)
20 ┄ %54 = @_16::Union{Some{Nothing}, Some{UInt8}, UInt8}
│ (may_b = Base.something(%54))
│ %56 = c::UInt8
│ %57 = Main.UInt8('\n')::Core.Const(0x0a)
│ %58 = (%56 == %57)::Bool
└─── goto #22 if not %58
21 ─ (digit_a = Core.typeassert(may_a, Main.UInt8))
│ (digit_b = Core.typeassert(may_b, Main.UInt8))
│ %62 = total::Int64
│ %63 = (digit_a * 0x0a)::UInt8
│ %64 = (%63 + digit_b)::UInt8
│ (total = %62 + %64)
│ (may_a = Main.nothing)
└─── (may_b = Main.nothing)
22 ┄ (@_3 = Base.iterate(%4, %14))
│ %69 = (@_3 === nothing)::Bool
│ %70 = Base.not_int(%69)::Bool
└─── goto #24 if not %70
23 ─ goto #2
24 ┄ return total
```
</details>
<details>
<summary>`@code_native debuginfo=:none` Before </summary>
```julia
julia> @code_native debuginfo=:none part1(data)
.text
.file "part1"
.globl julia_part1_418 # -- Begin function julia_part1_418
.p2align 4, 0x90
.type julia_part1_418,@function
julia_part1_418: # @julia_part1_418
# %bb.0: # %top
push rbp
mov rbp, rsp
push r15
push r14
push r13
push r12
push rbx
sub rsp, 40
mov rax, qword ptr [rdi + 8]
test rax, rax
je .LBB0_1
# %bb.2: # %L17
mov rcx, qword ptr [rdi]
dec rax
mov r10b, 1
xor r14d, r14d
# implicit-def: $r12b
# implicit-def: $r13b
# implicit-def: $r9b
# implicit-def: $sil
mov qword ptr [rbp - 64], rax # 8-byte Spill
mov al, 1
mov dword ptr [rbp - 48], eax # 4-byte Spill
# implicit-def: $al
# kill: killed $al
xor eax, eax
mov qword ptr [rbp - 56], rax # 8-byte Spill
mov qword ptr [rbp - 72], rcx # 8-byte Spill
# implicit-def: $cl
jmp .LBB0_3
.p2align 4, 0x90
.LBB0_8: # in Loop: Header=BB0_3 Depth=1
mov dword ptr [rbp - 48], 0 # 4-byte Folded Spill
.LBB0_24: # %post_union_move
# in Loop: Header=BB0_3 Depth=1
movzx r13d, byte ptr [rbp - 41] # 1-byte Folded Reload
mov r12d, r8d
cmp qword ptr [rbp - 64], r14 # 8-byte Folded Reload
je .LBB0_13
.LBB0_25: # %guard_exit113
# in Loop: Header=BB0_3 Depth=1
inc r14
mov r10d, ebx
.LBB0_3: # %L19
# =>This Inner Loop Header: Depth=1
mov rax, qword ptr [rbp - 72] # 8-byte Reload
xor ebx, ebx
xor edi, edi
movzx r15d, r9b
movzx ecx, cl
movzx esi, sil
mov r11b, 1
# implicit-def: $r9b
movzx edx, byte ptr [rax + r14]
lea eax, [rdx - 58]
lea r8d, [rdx - 48]
cmp al, -10
setae bl
setb dil
test r10b, 1
cmovne r15d, edi
mov edi, 0
cmovne ecx, ebx
mov bl, 1
cmovne esi, edi
test r15b, 1
jne .LBB0_7
# %bb.4: # %L76
# in Loop: Header=BB0_3 Depth=1
mov r11b, 2
test cl, 1
jne .LBB0_5
# %bb.6: # %L78
# in Loop: Header=BB0_3 Depth=1
mov ebx, r10d
mov r9d, r15d
mov byte ptr [rbp - 41], r13b # 1-byte Spill
test sil, 1
je .LBB0_26
.LBB0_7: # %L82
# in Loop: Header=BB0_3 Depth=1
cmp al, -11
jbe .LBB0_9
jmp .LBB0_8
.p2align 4, 0x90
.LBB0_5: # in Loop: Header=BB0_3 Depth=1
mov ecx, r8d
mov sil, 1
xor ebx, ebx
mov byte ptr [rbp - 41], r8b # 1-byte Spill
xor r9d, r9d
xor ecx, ecx
cmp al, -11
ja .LBB0_8
.LBB0_9: # %L90
# in Loop: Header=BB0_3 Depth=1
test byte ptr [rbp - 48], 1 # 1-byte Folded Reload
jne .LBB0_23
# %bb.10: # %L115
# in Loop: Header=BB0_3 Depth=1
cmp dl, 10
jne .LBB0_11
# %bb.14: # %L122
# in Loop: Header=BB0_3 Depth=1
test r15b, 1
jne .LBB0_15
# %bb.12: # %L130.thread
# in Loop: Header=BB0_3 Depth=1
movzx eax, byte ptr [rbp - 41] # 1-byte Folded Reload
mov bl, 1
add eax, eax
lea eax, [rax + 4*rax]
add al, r12b
movzx eax, al
add qword ptr [rbp - 56], rax # 8-byte Folded Spill
mov al, 1
mov dword ptr [rbp - 48], eax # 4-byte Spill
cmp qword ptr [rbp - 64], r14 # 8-byte Folded Reload
jne .LBB0_25
jmp .LBB0_13
.p2align 4, 0x90
.LBB0_23: # %L115.thread
# in Loop: Header=BB0_3 Depth=1
mov al, 1
# implicit-def: $r8b
mov dword ptr [rbp - 48], eax # 4-byte Spill
cmp dl, 10
jne .LBB0_24
jmp .LBB0_21
.LBB0_11: # in Loop: Header=BB0_3 Depth=1
mov r8d, r12d
jmp .LBB0_24
.LBB0_1:
xor eax, eax
mov qword ptr [rbp - 56], rax # 8-byte Spill
.LBB0_13: # %L159
mov rax, qword ptr [rbp - 56] # 8-byte Reload
add rsp, 40
pop rbx
pop r12
pop r13
pop r14
pop r15
pop rbp
ret
.LBB0_21: # %L122.thread
test r15b, 1
jne .LBB0_15
# %bb.22: # %post_box_union58
movabs rdi, offset .L_j_str1
movabs rax, offset ijl_type_error
movabs rsi, 140008511215408
movabs rdx, 140008667209736
call rax
.LBB0_15: # %fail
cmp r11b, 1
je .LBB0_19
# %bb.16: # %fail
movzx eax, r11b
cmp eax, 2
jne .LBB0_17
# %bb.20: # %box_union54
movzx eax, byte ptr [rbp - 41] # 1-byte Folded Reload
movabs rcx, offset jl_boxed_uint8_cache
mov rdx, qword ptr [rcx + 8*rax]
jmp .LBB0_18
.LBB0_26: # %L80
movabs rax, offset ijl_throw
movabs rdi, 140008495049392
call rax
.LBB0_19: # %box_union
movabs rdx, 140008667209736
jmp .LBB0_18
.LBB0_17:
xor edx, edx
.LBB0_18: # %post_box_union
movabs rdi, offset .L_j_str1
movabs rax, offset ijl_type_error
movabs rsi, 140008511215408
call rax
.Lfunc_end0:
.size julia_part1_418, .Lfunc_end0-julia_part1_418
# -- End function
.type .L_j_str1,@object # @_j_str1
.section .rodata.str1.1,"aMS",@progbits,1
.L_j_str1:
.asciz "typeassert"
.size .L_j_str1, 11
.section ".note.GNU-stack","",@progbits
```
</details>
<details>
<summary>`@code_warntype` After</summary>
```julia
[sukera@tower 01]$ julia -q --project=. -L 01.jl
julia> data = read("input.txt");
julia> @code_warntype part1(data)
MethodInstance for part1(::Vector{UInt8})
from part1(data) @ Main ~/Documents/projects/AOC/2023/01/01.jl:7
Arguments
#self#::Core.Const(part1)
data::Vector{UInt8}
Locals
@_3::Union{Nothing, Tuple{UInt8, Int64}}
may_b::Union{Nothing, UInt8}
may_a::Union{Nothing, UInt8}
total::Int64
val@_7::Union{}
val@_8::Union{}
c::UInt8
digit_b::UInt8
digit_a::UInt8
##215::Some{Nothing}
##216::Union{Nothing, UInt8}
##217::Union{Nothing, Some{UInt8}}
##212::Some{Nothing}
##213::Union{Nothing, Some{UInt8}}
##214::Union{Nothing, UInt8}
digitRes::Union{Nothing, Some{UInt8}}
@_19::Union{Nothing, UInt8}
@_20::Union{Nothing, UInt8}
@_21::Nothing
@_22::Union{Nothing, UInt8}
@_23::Union{Nothing, UInt8}
@_24::Nothing
Body::Int64
1 ── (total = 0)
│ (may_a = Main.nothing)
│ (may_b = Main.nothing)
│ %4 = data::Vector{UInt8}
│ (@_3 = Base.iterate(%4))
│ %6 = @_3::Union{Nothing, Tuple{UInt8, Int64}}
│ %7 = (%6 === nothing)::Bool
│ %8 = Base.not_int(%7)::Bool
└─── goto #24 if not %8
2 ┄─ Core.NewvarNode(:(val@_7))
│ Core.NewvarNode(:(val@_8))
│ Core.NewvarNode(:(digit_b))
│ Core.NewvarNode(:(digit_a))
│ Core.NewvarNode(:(##215))
│ Core.NewvarNode(:(##216))
│ Core.NewvarNode(:(##217))
│ Core.NewvarNode(:(##212))
│ Core.NewvarNode(:(##213))
│ %19 = @_3::Tuple{UInt8, Int64}
│ (c = Core.getfield(%19, 1))
│ %21 = Core.getfield(%19, 2)::Int64
│ %22 = c::UInt8
│ (digitRes = Main.someDigit(%22))
│ %24 = may_a::Union{Nothing, UInt8}
│ (##214 = %24)
│ %26 = Base.:!::Core.Const(!)
│ %27 = ##214::Union{Nothing, UInt8}
│ %28 = Base.isnothing(%27)::Bool
│ %29 = (%26)(%28)::Bool
└─── goto #4 if not %29
3 ── %31 = ##214::UInt8
│ (@_19 = Base.something(%31))
└─── goto #11
4 ── %34 = digitRes::Union{Nothing, Some{UInt8}}
│ (##213 = %34)
│ %36 = Base.:!::Core.Const(!)
│ %37 = ##213::Union{Nothing, Some{UInt8}}
│ %38 = Base.isnothing(%37)::Bool
│ %39 = (%36)(%38)::Bool
└─── goto #6 if not %39
5 ── %41 = ##213::Some{UInt8}
│ (@_20 = Base.something(%41))
└─── goto #10
6 ── %44 = Main.Some::Core.Const(Some)
│ %45 = Main.nothing::Core.Const(nothing)
│ (##212 = (%44)(%45))
│ %47 = Base.:!::Core.Const(!)
│ %48 = ##212::Core.Const(Some(nothing))
│ %49 = Base.isnothing(%48)::Core.Const(false)
│ %50 = (%47)(%49)::Core.Const(true)
└─── goto #8 if not %50
7 ── %52 = ##212::Core.Const(Some(nothing))
│ (@_21 = Base.something(%52))
└─── goto #9
8 ── Core.Const(nothing)
│ Core.Const(:(val@_8 = Base.something(Base.nothing)))
│ Core.Const(nothing)
│ Core.Const(:(val@_8))
└─── Core.Const(:(@_21 = %58))
9 ┄─ %60 = @_21::Core.Const(nothing)
└─── (@_20 = %60)
10 ┄ %62 = @_20::Union{Nothing, UInt8}
└─── (@_19 = %62)
11 ┄ %64 = @_19::Union{Nothing, UInt8}
│ (may_a = %64)
│ %66 = digitRes::Union{Nothing, Some{UInt8}}
│ (##217 = %66)
│ %68 = Base.:!::Core.Const(!)
│ %69 = ##217::Union{Nothing, Some{UInt8}}
│ %70 = Base.isnothing(%69)::Bool
│ %71 = (%68)(%70)::Bool
└─── goto #13 if not %71
12 ─ %73 = ##217::Some{UInt8}
│ (@_22 = Base.something(%73))
└─── goto #20
13 ─ %76 = may_b::Union{Nothing, UInt8}
│ (##216 = %76)
│ %78 = Base.:!::Core.Const(!)
│ %79 = ##216::Union{Nothing, UInt8}
│ %80 = Base.isnothing(%79)::Bool
│ %81 = (%78)(%80)::Bool
└─── goto #15 if not %81
14 ─ %83 = ##216::UInt8
│ (@_23 = Base.something(%83))
└─── goto #19
15 ─ %86 = Main.Some::Core.Const(Some)
│ %87 = Main.nothing::Core.Const(nothing)
│ (##215 = (%86)(%87))
│ %89 = Base.:!::Core.Const(!)
│ %90 = ##215::Core.Const(Some(nothing))
│ %91 = Base.isnothing(%90)::Core.Const(false)
│ %92 = (%89)(%91)::Core.Const(true)
└─── goto #17 if not %92
16 ─ %94 = ##215::Core.Const(Some(nothing))
│ (@_24 = Base.something(%94))
└─── goto #18
17 ─ Core.Const(nothing)
│ Core.Const(:(val@_7 = Base.something(Base.nothing)))
│ Core.Const(nothing)
│ Core.Const(:(val@_7))
└─── Core.Const(:(@_24 = %100))
18 ┄ %102 = @_24::Core.Const(nothing)
└─── (@_23 = %102)
19 ┄ %104 = @_23::Union{Nothing, UInt8}
└─── (@_22 = %104)
20 ┄ %106 = @_22::Union{Nothing, UInt8}
│ (may_b = %106)
│ %108 = Main.:(==)::Core.Const(==)
│ %109 = c::UInt8
│ %110 = Main.UInt8('\n')::Core.Const(0x0a)
│ %111 = (%108)(%109, %110)::Bool
└─── goto #22 if not %111
21 ─ %113 = may_a::Union{Nothing, UInt8}
│ (digit_a = Core.typeassert(%113, Main.UInt8))
│ %115 = may_b::Union{Nothing, UInt8}
│ (digit_b = Core.typeassert(%115, Main.UInt8))
│ %117 = Main.:+::Core.Const(+)
│ %118 = total::Int64
│ %119 = Main.:+::Core.Const(+)
│ %120 = Main.:*::Core.Const(*)
│ %121 = digit_a::UInt8
│ %122 = (%120)(%121, 0x0a)::UInt8
│ %123 = digit_b::UInt8
│ %124 = (%119)(%122, %123)::UInt8
│ (total = (%117)(%118, %124))
│ (may_a = Main.nothing)
└─── (may_b = Main.nothing)
22 ┄ (@_3 = Base.iterate(%4, %21))
│ %129 = @_3::Union{Nothing, Tuple{UInt8, Int64}}
│ %130 = (%129 === nothing)::Bool
│ %131 = Base.not_int(%130)::Bool
└─── goto #24 if not %131
23 ─ goto #2
24 ┄ %134 = total::Int64
└─── return %134
```
</details>
<details>
<summary>`@code_native debuginfo=:none` After </summary>
```julia
julia> @code_native debuginfo=:none part1(data)
.text
.file "part1"
.globl julia_part1_1203 # -- Begin function julia_part1_1203
.p2align 4, 0x90
.type julia_part1_1203,@function
julia_part1_1203: # @julia_part1_1203
; Function Signature: part1(Array{UInt8, 1})
# %bb.0: # %top
#DEBUG_VALUE: part1:data <- [DW_OP_deref] $rdi
push rbp
mov rbp, rsp
push r15
push r14
push r13
push r12
push rbx
sub rsp, 40
vxorps xmm0, xmm0, xmm0
#APP
mov rax, qword ptr fs:[0]
#NO_APP
lea rdx, [rbp - 64]
vmovaps xmmword ptr [rbp - 64], xmm0
mov qword ptr [rbp - 48], 0
mov rcx, qword ptr [rax - 8]
mov qword ptr [rbp - 64], 4
mov rax, qword ptr [rcx]
mov qword ptr [rbp - 72], rcx # 8-byte Spill
mov qword ptr [rbp - 56], rax
mov qword ptr [rcx], rdx
#DEBUG_VALUE: part1:data <- [DW_OP_deref] 0
mov r15, qword ptr [rdi + 16]
test r15, r15
je .LBB0_1
# %bb.2: # %L34
mov r14, qword ptr [rdi]
dec r15
mov r11b, 1
mov r13b, 1
# implicit-def: $r12b
# implicit-def: $r10b
xor eax, eax
jmp .LBB0_3
.p2align 4, 0x90
.LBB0_4: # in Loop: Header=BB0_3 Depth=1
xor r11d, r11d
mov ebx, edi
mov r10d, r8d
.LBB0_9: # %L114
# in Loop: Header=BB0_3 Depth=1
mov r12d, esi
test r15, r15
je .LBB0_12
.LBB0_10: # %guard_exit126
# in Loop: Header=BB0_3 Depth=1
inc r14
dec r15
mov r13d, ebx
.LBB0_3: # %L36
# =>This Inner Loop Header: Depth=1
movzx edx, byte ptr [r14]
test r13b, 1
movzx edi, r13b
mov ebx, 1
mov ecx, 0
cmove ebx, edi
cmovne edi, ecx
movzx ecx, r10b
lea esi, [rdx - 48]
lea r9d, [rdx - 58]
movzx r8d, sil
cmove r8d, ecx
cmp r9b, -11
ja .LBB0_4
# %bb.5: # %L89
# in Loop: Header=BB0_3 Depth=1
test r11b, 1
jne .LBB0_8
# %bb.6: # %L102
# in Loop: Header=BB0_3 Depth=1
cmp dl, 10
jne .LBB0_7
# %bb.13: # %L106
# in Loop: Header=BB0_3 Depth=1
test r13b, 1
jne .LBB0_14
# %bb.11: # %L114.thread
# in Loop: Header=BB0_3 Depth=1
add ecx, ecx
mov bl, 1
mov r11b, 1
lea ecx, [rcx + 4*rcx]
add cl, r12b
movzx ecx, cl
add rax, rcx
test r15, r15
jne .LBB0_10
jmp .LBB0_12
.p2align 4, 0x90
.LBB0_8: # %L102.thread
# in Loop: Header=BB0_3 Depth=1
mov r11b, 1
# implicit-def: $sil
cmp dl, 10
jne .LBB0_9
jmp .LBB0_15
.LBB0_7: # in Loop: Header=BB0_3 Depth=1
mov esi, r12d
jmp .LBB0_9
.LBB0_1:
xor eax, eax
.LBB0_12: # %L154
mov rcx, qword ptr [rbp - 56]
mov rdx, qword ptr [rbp - 72] # 8-byte Reload
mov qword ptr [rdx], rcx
add rsp, 40
pop rbx
pop r12
pop r13
pop r14
pop r15
pop rbp
ret
.LBB0_15: # %L106.thread
test r13b, 1
jne .LBB0_14
# %bb.16: # %post_box_union47
movabs rax, offset jl_nothing
movabs rcx, offset jl_small_typeof
movabs rdi, offset ".L_j_str_typeassert#1"
mov rdx, qword ptr [rax]
mov rsi, qword ptr [rcx + 336]
movabs rax, offset ijl_type_error
mov qword ptr [rbp - 48], rsi
call rax
.LBB0_14: # %post_box_union
movabs rax, offset jl_nothing
movabs rcx, offset jl_small_typeof
movabs rdi, offset ".L_j_str_typeassert#1"
mov rdx, qword ptr [rax]
mov rsi, qword ptr [rcx + 336]
movabs rax, offset ijl_type_error
mov qword ptr [rbp - 48], rsi
call rax
.Lfunc_end0:
.size julia_part1_1203, .Lfunc_end0-julia_part1_1203
# -- End function
.type ".L_j_str_typeassert#1",@object # @"_j_str_typeassert#1"
.section .rodata.str1.1,"aMS",@progbits,1
".L_j_str_typeassert#1":
.asciz "typeassert"
.size ".L_j_str_typeassert#1", 11
.section ".note.GNU-stack","",@progbits
```
</details>
Co-authored-by: Sukera <[email protected]>
N5N3 pushed a commit that referenced this pull request on Apr 22, 2024
…ce. (JuliaLang#54113)

The former also handles vectors of pointers, which can occur after vectorization:

```
#5 0x00007f5bfe94de5e in llvm::cast<llvm::PointerType, llvm::Type> (Val=<optimized out>) at llvm/Support/Casting.h:578
578 assert(isa<To>(Val) && "cast<Ty>() argument of incompatible type!");
(rr) up
#6 GCInvariantVerifier::visitAddrSpaceCastInst (this=this@entry=0x7ffd022fbf56, I=...) at julia/src/llvm-gc-invariant-verifier.cpp:66
66 unsigned ToAS = cast<PointerType>(I.getDestTy())->getAddressSpace();
(rr) call I.dump()
%23 = addrspacecast <4 x ptr addrspace(10)> %wide.load to <4 x ptr addrspace(11)>, !dbg !43
```

Fixes aborts seen in JuliaLang#53070
N5N3 pushed a commit that referenced this pull request on Oct 23, 2024
E.g. this allows `finalizer` inlining in the following case:
```julia
mutable struct ForeignBuffer{T}
    const ptr::Ptr{T}
end
const foreign_buffer_finalized = Ref(false)
function foreign_alloc(::Type{T}, length) where T
    ptr = Libc.malloc(sizeof(T) * length)
    ptr = Base.unsafe_convert(Ptr{T}, ptr)
    obj = ForeignBuffer{T}(ptr)
    return finalizer(obj) do obj
        Base.@assume_effects :notaskstate :nothrow
        foreign_buffer_finalized[] = true
        Libc.free(obj.ptr)
    end
end
function f_EA_finalizer(N::Int)
    workspace = foreign_alloc(Float64, N)
    GC.@preserve workspace begin
        (;ptr) = workspace
        Base.@assume_effects :nothrow @noinline println(devnull, "ptr = ", ptr)
    end
end
```
```julia
julia> @code_typed f_EA_finalizer(42)
CodeInfo(
1 ── %1 = Base.mul_int(8, N)::Int64
│ %2 = Core.lshr_int(%1, 63)::Int64
│ %3 = Core.trunc_int(Core.UInt8, %2)::UInt8
│ %4 = Core.eq_int(%3, 0x01)::Bool
└─── goto #3 if not %4
2 ── invoke Core.throw_inexacterror(:convert::Symbol, UInt64::Type, %1::Int64)::Union{}
└─── unreachable
3 ── goto #4
4 ── %9 = Core.bitcast(Core.UInt64, %1)::UInt64
└─── goto #5
5 ── goto #6
6 ── goto #7
7 ── goto #8
8 ── %14 = $(Expr(:foreigncall, :(:malloc), Ptr{Nothing}, svec(UInt64), 0, :(:ccall), :(%9), :(%9)))::Ptr{Nothing}
└─── goto #9
9 ── %16 = Base.bitcast(Ptr{Float64}, %14)::Ptr{Float64}
│ %17 = %new(ForeignBuffer{Float64}, %16)::ForeignBuffer{Float64}
└─── goto #10
10 ─ %19 = $(Expr(:gc_preserve_begin, :(%17)))
│ %20 = Base.getfield(%17, :ptr)::Ptr{Float64}
│ invoke Main.println(Main.devnull::Base.DevNull, "ptr = "::String, %20::Ptr{Float64})::Nothing
│ $(Expr(:gc_preserve_end, :(%19)))
│ %23 = Main.foreign_buffer_finalized::Base.RefValue{Bool}
│ Base.setfield!(%23, :x, true)::Bool
│ %25 = Base.getfield(%17, :ptr)::Ptr{Float64}
│ %26 = Base.bitcast(Ptr{Nothing}, %25)::Ptr{Nothing}
│ $(Expr(:foreigncall, :(:free), Nothing, svec(Ptr{Nothing}), 0, :(:ccall), :(%26), :(%25)))::Nothing
└─── return nothing
) => Nothing
```
However, this is still a WIP. Before merging, I want to improve the
precision of escape analysis (EA) a bit and at least fix the test case
that is currently marked as `broken`. I also need to check its impact
on compiler performance.
Additionally, I believe this feature is not yet practical. In
particular, there is still significant room for improvement in the
following areas:
- EA's interprocedural capabilities: currently EA is performed ad-hoc
for limited frames because of latency reasons, which significantly
reduces its precision in the presence of interprocedural calls.
- Relaxing the `:nothrow` check for finalizer inlining: the current
algorithm requires `:nothrow`-ness on all paths from the allocation of
the mutable struct to its last use, which is not practical for
real-world cases. Even when `:nothrow` cannot be guaranteed, auxiliary
optimizations such as inserting a `finalize` call after the last use
might still be possible (JuliaLang#55990).
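To make the last point concrete, the sketch below shows what an explicit `finalize` call after the last use does at the user level. It is only an illustration of the runtime behaviour, not of the compiler transformation discussed above; `Handle` and `released` are hypothetical names introduced for this example:

```julia
# Hypothetical resource wrapper; finalizers require a mutable type.
mutable struct Handle
    id::Int
end

const released = Int[]

# Register a finalizer that records which handle was released.
h = finalizer(x -> push!(released, x.id), Handle(7))

# ... last use of `h` would go here ...

# `finalize` runs the registered finalizers immediately instead of
# waiting for the garbage collector to find the object unreachable.
finalize(h)
```

After `finalize(h)` returns, `released` already contains the handle's id, which is the effect an automatically inserted `finalize` call would aim for.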
adienes pushed a commit that referenced this pull request on Sep 9, 2025
Use an atomic fetch and add to fix a data race in `Module()` identified by tsan:

```
./usr/bin/julia -t4,0 --gcthreads=1 -e 'Threads.@threads for i=1:100 Module() end'
==================
WARNING: ThreadSanitizer: data race (pid=5575)
Write of size 4 at 0xffff9bf9bd28 by thread T9:
#0 jl_new_module__ /home/user/c/julia/src/module.c:487:22 (libjulia-internal.so.1.13+0x897d4)
#1 jl_new_module_ /home/user/c/julia/src/module.c:527:22 (libjulia-internal.so.1.13+0x897d4)
#2 jl_f_new_module /home/user/c/julia/src/module.c:649:22 (libjulia-internal.so.1.13+0x8a968)
#3 <null> <null> (0xffff76a21164)
#4 <null> <null> (0xffff76a1f074)
#5 <null> <null> (0xffff76a1f0c4)
#6 _jl_invoke /home/user/c/julia/src/gf.c (libjulia-internal.so.1.13+0x5ea04)
#7 ijl_apply_generic /home/user/c/julia/src/gf.c:3892:12 (libjulia-internal.so.1.13+0x5ea04)
#8 jl_apply /home/user/c/julia/src/julia.h:2343:12 (libjulia-internal.so.1.13+0x9e4c4)
#9 start_task /home/user/c/julia/src/task.c:1249:19 (libjulia-internal.so.1.13+0x9e4c4)
Previous write of size 4 at 0xffff9bf9bd28 by thread T10:
#0 jl_new_module__ /home/user/c/julia/src/module.c:487:22 (libjulia-internal.so.1.13+0x897d4)
#1 jl_new_module_ /home/user/c/julia/src/module.c:527:22 (libjulia-internal.so.1.13+0x897d4)
#2 jl_f_new_module /home/user/c/julia/src/module.c:649:22 (libjulia-internal.so.1.13+0x8a968)
#3 <null> <null> (0xffff76a21164)
#4 <null> <null> (0xffff76a1f074)
#5 <null> <null> (0xffff76a1f0c4)
#6 _jl_invoke /home/user/c/julia/src/gf.c (libjulia-internal.so.1.13+0x5ea04)
#7 ijl_apply_generic /home/user/c/julia/src/gf.c:3892:12 (libjulia-internal.so.1.13+0x5ea04)
#8 jl_apply /home/user/c/julia/src/julia.h:2343:12 (libjulia-internal.so.1.13+0x9e4c4)
#9 start_task /home/user/c/julia/src/task.c:1249:19 (libjulia-internal.so.1.13+0x9e4c4)
Location is global 'jl_new_module__.mcounter' of size 4 at 0xffff9bf9bd28 (libjulia-internal.so.1.13+0x3dbd28)
```
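The actual fix lives in the C runtime (`module.c`), but the same idea can be sketched at the Julia level. The snippet below is an analogy under that assumption, not the patch itself; `counter` and `ids` are hypothetical names. Each task obtains a unique id through a single atomic fetch-and-add instead of a separate read followed by a write:

```julia
using Base.Threads

# Shared counter, playing the role of a global id counter.
counter = Threads.Atomic{UInt32}(0)
ids = Vector{UInt32}(undef, 100)

Threads.@threads for i in 1:100
    # `atomic_add!` increments the counter and returns the previous value
    # in one indivisible step, so no two tasks can observe the same id.
    ids[i] = Threads.atomic_add!(counter, UInt32(1))
end
```

With a plain `counter += 1` on a shared non-atomic variable, two threads could read the same value before either writes back, which is the kind of race the report above describes.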
adienes pushed a commit that referenced this pull request on Sep 9, 2025
Simplify `workqueue_for`. While not strictly necessary, the acquire load
in `getindex(once::OncePerThread{T,F}, tid::Integer)` makes
ThreadSanitizer happy. With the existing implementation, we get false
positives whenever a thread other than the one that originally allocated
the array reads it:
```
==================
WARNING: ThreadSanitizer: data race (pid=6819)
Atomic read of size 8 at 0xffff86bec058 by main thread:
#0 getproperty Base_compiler.jl:57 (sys.so+0x113b478)
#1 julia_pushNOT._1925 task.jl:868 (sys.so+0x113b478)
#2 julia_enq_work_1896 task.jl:969 (sys.so+0x5cd218)
#3 schedule task.jl:983 (sys.so+0x892294)
#4 macro expansion threadingconstructs.jl:522 (sys.so+0x892294)
#5 julia_start_profile_listener_60681 Base.jl:355 (sys.so+0x892294)
#6 julia___init___60641 Base.jl:392 (sys.so+0x1178dc)
#7 jfptr___init___60642 <null> (sys.so+0x118134)
#8 _jl_invoke /home/user/c/julia/src/gf.c (libjulia-internal.so.1.13+0x5e9a4)
#9 ijl_apply_generic /home/user/c/julia/src/gf.c:3892:12 (libjulia-internal.so.1.13+0x5e9a4)
#10 jl_apply /home/user/c/julia/src/julia.h:2343:12 (libjulia-internal.so.1.13+0xbba74)
#11 jl_module_run_initializer /home/user/c/julia/src/toplevel.c:68:13 (libjulia-internal.so.1.13+0xbba74)
#12 _finish_jl_init_ /home/user/c/julia/src/init.c:632:13 (libjulia-internal.so.1.13+0x9c0fc)
#13 ijl_init_ /home/user/c/julia/src/init.c:783:5 (libjulia-internal.so.1.13+0x9bcf4)
#14 jl_repl_entrypoint /home/user/c/julia/src/jlapi.c:1125:5 (libjulia-internal.so.1.13+0xf7ec8)
#15 jl_load_repl /home/user/c/julia/cli/loader_lib.c:601:12 (libjulia.so.1.13+0x11934)
#16 main /home/user/c/julia/cli/loader_exe.c:58:15 (julia+0x10dc20)
Previous write of size 8 at 0xffff86bec058 by thread T2:
#0 IntrusiveLinkedListSynchronized task.jl:863 (sys.so+0x78d220)
#1 macro expansion task.jl:932 (sys.so+0x78d220)
#2 macro expansion lock.jl:376 (sys.so+0x78d220)
#3 julia_workqueue_for_1933 task.jl:924 (sys.so+0x78d220)
#4 julia_wait_2048 task.jl:1204 (sys.so+0x6255ac)
#5 julia_task_done_hook_49205 task.jl:839 (sys.so+0x128fdc0)
#6 jfptr_task_done_hook_49206 <null> (sys.so+0x902218)
#7 _jl_invoke /home/user/c/julia/src/gf.c (libjulia-internal.so.1.13+0x5e9a4)
#8 ijl_apply_generic /home/user/c/julia/src/gf.c:3892:12 (libjulia-internal.so.1.13+0x5e9a4)
#9 jl_apply /home/user/c/julia/src/julia.h:2343:12 (libjulia-internal.so.1.13+0x9c79c)
#10 jl_finish_task /home/user/c/julia/src/task.c:345:13 (libjulia-internal.so.1.13+0x9c79c)
#11 jl_threadfun /home/user/c/julia/src/scheduler.c:122:5 (libjulia-internal.so.1.13+0xe7db8)
Thread T2 (tid=6824, running) created by main thread at:
#0 pthread_create <null> (julia+0x85f88)
#1 uv_thread_create_ex /workspace/srcdir/libuv/src/unix/thread.c:172 (libjulia-internal.so.1.13+0x1a8d70)
#2 _finish_jl_init_ /home/user/c/julia/src/init.c:618:5 (libjulia-internal.so.1.13+0x9c010)
#3 ijl_init_ /home/user/c/julia/src/init.c:783:5 (libjulia-internal.so.1.13+0x9bcf4)
#4 jl_repl_entrypoint /home/user/c/julia/src/jlapi.c:1125:5 (libjulia-internal.so.1.13+0xf7ec8)
#5 jl_load_repl /home/user/c/julia/cli/loader_lib.c:601:12 (libjulia.so.1.13+0x11934)
#6 main /home/user/c/julia/cli/loader_exe.c:58:15 (julia+0x10dc20)
SUMMARY: ThreadSanitizer: data race Base_compiler.jl:57 in getproperty
==================
```
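As a rough illustration of the acquire load mentioned above, the following sketch pairs a `:release` store with an `:acquire` load on an `@atomic` field (available since Julia 1.7). It does not reproduce the `OncePerThread` implementation; `LazySlot`, `slot`, and `wait_for_value` are hypothetical names used only here:

```julia
# A slot that one task fills and another task reads.
mutable struct LazySlot
    @atomic value::Union{Nothing, Vector{Int}}
end

slot = LazySlot(nothing)

# Writer: build the array first, then publish it with a release store so
# that its contents are visible to any task that acquires the pointer.
writer = Threads.@spawn begin
    data = collect(1:4)
    @atomic :release slot.value = data
end

# Reader: the acquire load pairs with the release store above.
function wait_for_value(s::LazySlot)
    while true
        v = @atomic :acquire s.value
        v === nothing || return v
        yield()  # give the writer a chance to run
    end
end

result = wait_for_value(slot)
wait(writer)
```

The release/acquire pair is what tells a tool like ThreadSanitizer that the array's initialization happens before the reader touches it.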