Commit fb08dcb

optimizer: inline abstract union-split callsite
Currently the optimizer handles an abstract callsite only when there is a single dispatch candidate (in most cases), so inlining and static dispatch are prohibited when the callsite is union-split (in other words, union-splitting happens only when all the dispatch candidates are concrete).

However, there are certain patterns of code (most notably our Julia-level compiler code) that inherently need to deal with abstract callsites. The following example is taken from a `Core.Compiler` utility:

```julia
julia> @inline isType(@nospecialize t) = isa(t, DataType) && t.name === Type.body.name
isType (generic function with 1 method)

julia> code_typed((Any,)) do x # abstract, but no union-split, successful inlining
           isType(x)
       end |> only
CodeInfo(
1 ─ %1 = (x isa Main.DataType)::Bool
└──      goto #3 if not %1
2 ─ %3 = π (x, DataType)
│   %4 = Base.getfield(%3, :name)::Core.TypeName
│   %5 = Base.getfield(Type{T}, :name)::Core.TypeName
│   %6 = (%4 === %5)::Bool
└──      goto #4
3 ─      goto #4
4 ┄ %9 = φ (#2 => %6, #3 => false)::Bool
└──      return %9
) => Bool

julia> code_typed((Union{Type,Nothing},)) do x # abstract, union-split, unsuccessful inlining
           isType(x)
       end |> only
CodeInfo(
1 ─ %1 = (isa)(x, Nothing)::Bool
└──      goto #3 if not %1
2 ─      goto #4
3 ─ %4 = Main.isType(x)::Bool
└──      goto #4
4 ┄ %6 = φ (#2 => false, #3 => %4)::Bool
└──      return %6
) => Bool
```

(Note that this is a limitation of the inlining algorithm, so user-provided hints such as callsite inlining annotations don't help here.)

This commit enables inlining and static dispatch for abstract union-split callsites.
The core idea here is that we can simulate our dispatch semantics by generating `isa` checks in order of the specificity of the dispatch candidates:

```julia
julia> code_typed((Union{Type,Nothing},)) do x # abstract, union-split, now successfully inlined
           isType(x)
       end |> only
CodeInfo(
1 ─ %1  = (isa)(x, Nothing)::Bool
└──       goto #3 if not %1
2 ─       goto #9
3 ─ %4  = (isa)(x, Type)::Bool
└──       goto #8 if not %4
4 ─ %6  = π (x, Type)
│   %7  = (%6 isa Main.DataType)::Bool
└──       goto #6 if not %7
5 ─ %9  = π (%6, DataType)
│   %10 = Base.getfield(%9, :name)::Core.TypeName
│   %11 = Base.getfield(Type{T}, :name)::Core.TypeName
│   %12 = (%10 === %11)::Bool
└──       goto #7
6 ─       goto #7
7 ┄ %15 = φ (#5 => %12, #6 => false)::Bool
└──       goto #9
8 ─       Core.throw(ErrorException("fatal error in type inference (type bound)"))::Union{}
└──       unreachable
9 ┄ %19 = φ (#2 => false, #7 => %15)::Bool
└──       return %19
) => Bool
```

Inlining and static dispatch of abstract union-split callsites will improve performance in such situations (and so this commit will improve the latency of our JIT compilation). In particular, this commit helps us avoid excessive specialization of `Core.Compiler` code by statically resolving `@nospecialize`d callsites; as a result, the number of precompiled statements is reduced from `1956` ([`master`](dc45d77)) to `1901` (this commit).

As a side effect, the implementation of our inlining algorithm is now much simpler, since we no longer need the previous special handling for abstract callsites.

One possible drawback would be increased code size.
This change does increase the size of the sysimage, but I think these numbers are in an acceptable range:

> [`master`](dc45d77)
```
❯ du -sh usr/lib/julia/*
 17M    usr/lib/julia/corecompiler.ji
188M    usr/lib/julia/sys-o.a
164M    usr/lib/julia/sys.dylib
 23M    usr/lib/julia/sys.dylib.dSYM
101M    usr/lib/julia/sys.ji
```

> this commit
```
❯ du -sh usr/lib/julia/*
 17M    usr/lib/julia/corecompiler.ji
190M    usr/lib/julia/sys-o.a
166M    usr/lib/julia/sys.dylib
 23M    usr/lib/julia/sys.dylib.dSYM
102M    usr/lib/julia/sys.ji
```
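To make the ordering requirement in the commit message concrete, here is a minimal hand-written sketch of an `isa` chain that mimics dispatch over two abstract signatures. The functions `kind`, `kind_split`, and `kind_misordered` are hypothetical names invented for illustration; the real transformation happens on the optimizer's IR, not in source code.

```julia
# Hypothetical methods with abstract, overlapping signatures.
kind(x::Integer) = :integer
kind(x::Number) = :number

# Hand-written analogue of the `isa` chain: candidates are tested from the
# most specific signature to the least specific one.
function kind_split(@nospecialize x)
    if x isa Integer
        return :integer   # stands in for the inlined body of kind(::Integer)
    elseif x isa Number
        return :number    # stands in for the inlined body of kind(::Number)
    else
        # mirrors the fallback block seen in the IR above
        throw(ErrorException("fatal error in type inference (type bound)"))
    end
end

# Same checks in the wrong order: every Integer is also a Number, so the
# first branch shadows the more specific method.
function kind_misordered(@nospecialize x)
    if x isa Number
        return :number
    elseif x isa Integer
        return :integer
    else
        throw(ErrorException("fatal error in type inference (type bound)"))
    end
end

kind_split(1) === kind(1)        # true
kind_split(1.5) === kind(1.5)    # true
kind_misordered(1) === kind(1)   # false: returns :number instead of :integer
```

With concrete signatures the order of the checks would not matter, because at most one check can succeed; with abstract signatures the chain is only faithful to dispatch if it follows method specificity.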
1 parent 1e64682 commit fb08dcb

3 files changed

+177
-90
lines changed

base/compiler/ssair/inlining.jl

Lines changed: 75 additions & 85 deletions
@@ -241,7 +241,7 @@ function cfg_inline_unionsplit!(ir::IRCode, idx::Int,
         push!(from_bbs, length(state.new_cfg_blocks))
         # TODO: Right now we unconditionally generate a fallback block
         # in case of subtyping errors - This is probably unnecessary.
-        if i != length(cases) || (!fully_covered || (!params.trust_inference && isdispatchtuple(cases[i].sig)))
+        if i != length(cases) || (!fully_covered || (!params.trust_inference))
             # This block will have the next condition or the final else case
             push!(state.new_cfg_blocks, BasicBlock(StmtRange(idx, idx)))
             push!(state.new_cfg_blocks[cond_bb].succs, length(state.new_cfg_blocks))
@@ -313,7 +313,6 @@ function ir_inline_item!(compact::IncrementalCompact, idx::Int, argexprs::Vector
     spec = item.spec::ResolvedInliningSpec
     sparam_vals = item.mi.sparam_vals
     def = item.mi.def::Method
-    inline_cfg = spec.ir.cfg
     linetable_offset::Int32 = length(linetable)
     # Append the linetable of the inlined function to our line table
     inlined_at = Int(compact.result[idx][:line])
@@ -462,6 +461,47 @@ end

 const FATAL_TYPE_BOUND_ERROR = ErrorException("fatal error in type inference (type bound)")

+"""
+    ir_inline_unionsplit!
+
+The core idea of this function is to simulate the dispatch semantics by generating
+(flat) `isa`-checks corresponding to the signatures of union-split dispatch candidates,
+and then inline their bodies into each `isa`-conditional block.
+
+This `isa`-based virtual dispatch requires some pre-conditions to hold in order to simulate
+the actual semantics correctly.
+
+The first one is that these dispatch candidates need to be processed in order of their specificity,
+and the corresponding `isa`-checks should reflect the method specificities, since now their
+signatures are not necessarily concrete.
+Fortunately, `ml_matches` should already sorted them in that way, except cases when there is
+any ambiguity, from which we already bail out at this point.
+
+Another consideration is type equality constraint from type variables: the `isa`-checks are
+not enough to simulate the dispatch semantics in cases like:
+
+Given a definition:
+
+    f(x::T, y::T) where T<:Integer = ...
+
+Transform a callsite:
+
+    (x::Any, y::Any)
+
+Into the optimized form:
+
+    if isa(x, Integer) && isa(y, Integer)
+        f(x::Integer, y::Integer)
+    else
+        f(x::Integer, y::Integer)
+    end
+
+But again, we should already bail out from such cases at this point, essentially by
+excluding cases where `case.sig::UnionAll`.
+
+In short, here we can process the dispatch candidates in order, assuming we haven't changed
+their order somehow somewhere up to this point.
+"""
 function ir_inline_unionsplit!(compact::IncrementalCompact, idx::Int,
                                argexprs::Vector{Any}, linetable::Vector{LineInfoNode},
                                (; fully_covered, atype, cases, bbs)::UnionSplit,
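The type-equality caveat in the docstring above can be reproduced at the source level. The sketch below is only an illustration under assumed names (`same` and `same_naive` are hypothetical): it shows a method whose signature ties two arguments to the same type variable, and a naive per-argument `isa` check that fails to capture that constraint.

```julia
# Hypothetical definitions: the first method only applies when both
# arguments have the same concrete Integer type.
same(x::T, y::T) where {T<:Integer} = :same_type
same(x, y) = :fallback

# Naive simulation using one `isa` check per argument. It ignores the
# constraint that both arguments must share the type variable T.
function same_naive(@nospecialize(x), @nospecialize(y))
    if x isa Integer && y isa Integer
        return :same_type
    else
        return :fallback
    end
end

same(1, 2)           # :same_type (both Int)
same(1, 0x02)        # :fallback  (Int and UInt8 do not share a T)
same_naive(1, 0x02)  # :same_type, which disagrees with real dispatch

# The constrained method's signature is a UnionAll rather than a DataType.
which(same, Tuple{Int,Int}).sig isa UnionAll   # true
```

This is why the optimizer bails out when a candidate signature is a `UnionAll`: flat `isa` checks on each argument cannot express the relation between arguments.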
@@ -471,17 +511,17 @@ function ir_inline_unionsplit!(compact::IncrementalCompact, idx::Int,
     join_bb = bbs[end]
     pn = PhiNode()
     local bb = compact.active_result_bb
-    @assert length(bbs) >= length(cases)
-    for i in 1:length(cases)
+    ncases = length(cases)
+    @assert length(bbs) >= ncases
+    for i = 1:ncases
         ithcase = cases[i]
         mtype = ithcase.sig::DataType # checked within `handle_cases!`
         case = ithcase.item
         next_cond_bb = bbs[i]
         cond = true
         nparams = fieldcount(atype)
         @assert nparams == fieldcount(mtype)
-        if i != length(cases) || !fully_covered ||
-           (!params.trust_inference && isdispatchtuple(cases[i].sig))
+        if i != ncases || !fully_covered || !params.trust_inference
             for i = 1:nparams
                 a, m = fieldtype(atype, i), fieldtype(mtype, i)
                 # If this is always true, we don't need to check for it
@@ -538,7 +578,7 @@ function ir_inline_unionsplit!(compact::IncrementalCompact, idx::Int,
     bb += 1
     # We're now in the fall through block, decide what to do
     if fully_covered
-        if !params.trust_inference && isdispatchtuple(cases[end].sig)
+        if !params.trust_inference
             e = Expr(:call, GlobalRef(Core, :throw), FATAL_TYPE_BOUND_ERROR)
             insert_node_here!(compact, NewInstruction(e, Union{}, line))
             insert_node_here!(compact, NewInstruction(ReturnNode(), Union{}, line))
@@ -561,7 +601,7 @@ function batch_inline!(todo::Vector{Pair{Int, Any}}, ir::IRCode, linetable::Vect
     state = CFGInliningState(ir)
     for (idx, item) in todo
         if isa(item, UnionSplit)
-            cfg_inline_unionsplit!(ir, idx, item::UnionSplit, state, params)
+            cfg_inline_unionsplit!(ir, idx, item, state, params)
         else
             item = item::InliningTodo
             spec = item.spec::ResolvedInliningSpec
@@ -1175,12 +1215,8 @@ function analyze_single_call!(
         sig::Signature, state::InliningState, todo::Vector{Pair{Int, Any}})
     argtypes = sig.argtypes
     cases = InliningCase[]
-    local only_method = nothing # keep track of whether there is one matching method
-    local meth::MethodLookupResult
+    local any_fully_covered = false
     local handled_all_cases = true
-    local any_covers_full = false
-    local revisit_idx = nothing
-
     for i in 1:length(infos)
         meth = infos[i].results
         if meth.ambig
@@ -1191,66 +1227,20 @@ function analyze_single_call!(
             # No applicable methods; try next union split
             handled_all_cases = false
             continue
-        else
-            if length(meth) == 1 && only_method !== false
-                if only_method === nothing
-                    only_method = meth[1].method
-                elseif only_method !== meth[1].method
-                    only_method = false
-                end
-            else
-                only_method = false
-            end
         end
-        for (j, match) in enumerate(meth)
-            any_covers_full |= match.fully_covers
-            if !isdispatchtuple(match.spec_types)
-                if !match.fully_covers
-                    handled_all_cases = false
-                    continue
-                end
-                if revisit_idx === nothing
-                    revisit_idx = (i, j)
-                else
-                    handled_all_cases = false
-                    revisit_idx = nothing
-                end
-            else
-                handled_all_cases &= handle_match!(match, argtypes, flag, state, cases)
-            end
+        for match in meth
+            handled_all_cases &= handle_match!(match, argtypes, flag, state, cases, true)
+            any_fully_covered |= match.fully_covers
         end
     end

-    atype = argtypes_to_type(argtypes)
-    if handled_all_cases && revisit_idx !== nothing
-        # If there's only one case that's not a dispatchtuple, we can
-        # still unionsplit by visiting all the other cases first.
-        # This is useful for code like:
-        # foo(x::Int) = 1
-        # foo(@nospecialize(x::Any)) = 2
-        # where we where only a small number of specific dispatchable
-        # cases are split off from an ::Any typed fallback.
-        (i, j) = revisit_idx
-        match = infos[i].results[j]
-        handled_all_cases &= handle_match!(match, argtypes, flag, state, cases, true)
-    elseif length(cases) == 0 && only_method isa Method
-        # if the signature is fully covered and there is only one applicable method,
-        # we can try to inline it even if the signature is not a dispatch tuple.
-        # -- But don't try it if we already tried to handle the match in the revisit_idx
-        # case, because that'll (necessarily) be the same method.
-        if length(infos) > 1
-            (metharg, methsp) = ccall(:jl_type_intersection_with_env, Any, (Any, Any),
-                atype, only_method.sig)::SimpleVector
-            match = MethodMatch(metharg, methsp::SimpleVector, only_method, true)
-        else
-            @assert length(meth) == 1
-            match = meth[1]
-        end
-        handle_match!(match, argtypes, flag, state, cases, true) || return nothing
-        any_covers_full = handled_all_cases = match.fully_covers
+    if !handled_all_cases
+        # if we've not seen all candidates, union split is valid only for dispatch tuples
+        filter!(case::InliningCase->isdispatchtuple(case.sig), cases)
     end

-    handle_cases!(ir, idx, stmt, atype, cases, any_covers_full && handled_all_cases, todo, state.params)
+    handle_cases!(ir, idx, stmt, argtypes_to_type(argtypes), cases,
+        handled_all_cases & any_fully_covered, todo, state.params)
 end

 # similar to `analyze_single_call!`, but with constant results
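The new `filter!` call in the hunk above keeps only cases whose signature is a dispatch tuple when some candidate could not be handled. A rough sketch of what that predicate selects, using plain tuple types in place of the optimizer's `InliningCase` objects (the `sigs` vector is made up for illustration):

```julia
# Hypothetical candidate signatures; in the optimizer these would be the
# `sig` fields of `InliningCase` objects.
sigs = Any[Tuple{Int}, Tuple{Integer}, Tuple{Nothing}, Tuple{Type}]

# `isdispatchtuple` is true only for tuple types made of concrete types,
# i.e. signatures that a runtime value can match exactly.
kept = filter(Base.isdispatchtuple, sigs)

kept == Any[Tuple{Int}, Tuple{Nothing}]   # true
```

One way to read the comment in the diff: a check against a concrete signature such as `Tuple{Int}` cannot capture values that belong to another method, whereas a check against an abstract signature such as `Tuple{Integer}` could shadow a more specific candidate that was never collected. Dropping the abstract cases is therefore the conservative choice when the candidate list is incomplete.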
@@ -1261,8 +1251,8 @@ function handle_const_call!(
     (; call, results) = cinfo
     infos = isa(call, MethodMatchInfo) ? MethodMatchInfo[call] : call.matches
     cases = InliningCase[]
+    local any_fully_covered = false
     local handled_all_cases = true
-    local any_covers_full = false
     local j = 0
     for i in 1:length(infos)
         meth = infos[i].results
@@ -1278,42 +1268,39 @@ function handle_const_call!(
         for match in meth
             j += 1
             result = results[j]
-            any_covers_full |= match.fully_covers
+            any_fully_covered |= match.fully_covers
             if isa(result, ConstResult)
                 case = const_result_item(result, state)
                 push!(cases, InliningCase(result.mi.specTypes, case))
             elseif isa(result, InferenceResult)
-                handled_all_cases &= handle_inf_result!(result, argtypes, flag, state, cases)
+                handled_all_cases &= handle_inf_result!(result, argtypes, flag, state, cases, true)
             else
                 @assert result === nothing
-                handled_all_cases &= handle_match!(match, argtypes, flag, state, cases)
+                handled_all_cases &= handle_match!(match, argtypes, flag, state, cases, true)
             end
         end
     end

-    # if the signature is fully covered and there is only one applicable method,
-    # we can try to inline it even if the signature is not a dispatch tuple
-    atype = argtypes_to_type(argtypes)
-    if length(cases) == 0
-        length(results) == 1 || return nothing
-        result = results[1]
-        isa(result, InferenceResult) || return nothing
-        handle_inf_result!(result, argtypes, flag, state, cases, true) || return nothing
-        spec_types = cases[1].sig
-        any_covers_full = handled_all_cases = atype <: spec_types
+    if !handled_all_cases
+        # if we've not seen all candidates, union split is valid only for dispatch tuples
+        filter!(case::InliningCase->isdispatchtuple(case.sig), cases)
     end

-    handle_cases!(ir, idx, stmt, atype, cases, any_covers_full && handled_all_cases, todo, state.params)
+    handle_cases!(ir, idx, stmt, argtypes_to_type(argtypes), cases,
+        handled_all_cases & any_fully_covered, todo, state.params)
 end

 function handle_match!(
     match::MethodMatch, argtypes::Vector{Any}, flag::UInt8, state::InliningState,
     cases::Vector{InliningCase}, allow_abstract::Bool = false)
     spec_types = match.spec_types
     allow_abstract || isdispatchtuple(spec_types) || return false
+    # we may see duplicated dispatch signatures here when a signature gets widened
+    # during abstract interpretation: for the purpose of inlining, we can just skip
+    # processing this dispatch candidate
+    _any(case->case.sig === spec_types, cases) && return true
     item = analyze_method!(match, argtypes, flag, state)
     item === nothing && return false
-    _any(case->case.sig === spec_types, cases) && return true
     push!(cases, InliningCase(spec_types, item))
     return true
 end
@@ -1349,7 +1336,9 @@ function handle_cases!(ir::IRCode, idx::Int, stmt::Expr, @nospecialize(atype),
         handle_single_case!(ir, idx, stmt, cases[1].item, todo, params)
     elseif length(cases) > 0
         isa(atype, DataType) || return nothing
-        all(case::InliningCase->isa(case.sig, DataType), cases) || return nothing
+        for case in cases
+            isa(case.sig, DataType) || return nothing
+        end
         push!(todo, idx=>UnionSplit(fully_covered, atype, cases))
     end
     return nothing
@@ -1445,7 +1434,8 @@ function assemble_inline_todo!(ir::IRCode, state::InliningState)

         analyze_single_call!(ir, idx, stmt, infos, flag, sig, state, todo)
     end
-    todo
+
+    return todo
 end

 function linear_inline_eligible(ir::IRCode)

base/sort.jl

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@ module Sort
 import ..@__MODULE__, ..parentmodule
 const Base = parentmodule(@__MODULE__)
 using .Base.Order
-using .Base: copymutable, LinearIndices, length, (:),
+using .Base: copymutable, LinearIndices, length, (:), iterate,
     eachindex, axes, first, last, similar, zip, OrdinalRange,
     AbstractVector, @inbounds, AbstractRange, @eval, @inline, Vector, @noinline,
     AbstractMatrix, AbstractUnitRange, isless, identity, eltype, >, <, <=, >=, |, +, -, *, !,
