Commit 7d4cc1f

optimizer: inline abstract union-split callsite
Currently the optimizer handles an abstract callsite only when there is a single dispatch candidate (in most cases), and so inlining and static dispatch are prohibited when the callsite is union-split (in other words, union-splitting happens only when all the dispatch candidates are concrete).

However, there are certain patterns of code (most notably our Julia-level compiler code) that inherently need to deal with abstract callsites. The following example is taken from a `Core.Compiler` utility:

```julia
julia> @inline isType(@nospecialize t) = isa(t, DataType) && t.name === Type.body.name
isType (generic function with 1 method)

julia> code_typed((Any,)) do x # abstract, but no union-split, successful inlining
           isType(x)
       end |> only
CodeInfo(
1 ─ %1 = (x isa Main.DataType)::Bool
└──      goto #3 if not %1
2 ─ %3 = π (x, DataType)
│   %4 = Base.getfield(%3, :name)::Core.TypeName
│   %5 = Base.getfield(Type{T}, :name)::Core.TypeName
│   %6 = (%4 === %5)::Bool
└──      goto #4
3 ─      goto #4
4 ┄ %9 = φ (#2 => %6, #3 => false)::Bool
└──      return %9
) => Bool

julia> code_typed((Union{Type,Nothing},)) do x # abstract, union-split, unsuccessful inlining
           isType(x)
       end |> only
CodeInfo(
1 ─ %1 = (isa)(x, Nothing)::Bool
└──      goto #3 if not %1
2 ─      goto #4
3 ─ %4 = Main.isType(x)::Bool
└──      goto #4
4 ┄ %6 = φ (#2 => false, #3 => %4)::Bool
└──      return %6
) => Bool
```

(Note that this is a limitation of the inlining algorithm, so user-provided hints such as a callsite inlining annotation don't help here.)

This commit enables inlining and static dispatch for abstract union-split callsites. The core idea is that we can simulate our dispatch semantics by generating `isa` checks in order of the specificity of the dispatch candidates:

```julia
julia> code_typed((Union{Type,Nothing},)) do x # union-split, unsuccessful inlining
           isType(x)
       end |> only
CodeInfo(
1 ─ %1  = (isa)(x, Nothing)::Bool
└──       goto #3 if not %1
2 ─       goto #9
3 ─ %4  = (isa)(x, Type)::Bool
└──       goto #8 if not %4
4 ─ %6  = π (x, Type)
│   %7  = (%6 isa Main.DataType)::Bool
└──       goto #6 if not %7
5 ─ %9  = π (%6, DataType)
│   %10 = Base.getfield(%9, :name)::Core.TypeName
│   %11 = Base.getfield(Type{T}, :name)::Core.TypeName
│   %12 = (%10 === %11)::Bool
└──       goto #7
6 ─       goto #7
7 ┄ %15 = φ (#5 => %12, #6 => false)::Bool
└──       goto #9
8 ─       Core.throw(ErrorException("fatal error in type inference (type bound)"))::Union{}
└──       unreachable
9 ┄ %19 = φ (#2 => false, #7 => %15)::Bool
└──       return %19
) => Bool
```

Inlining and static dispatch of abstract union-split callsites will improve performance in such situations (and so this commit will improve the latency of our JIT compilation). In particular, this commit helps us avoid excessive specialization of `Core.Compiler` code by statically resolving `@nospecialize`d callsites, and as a result, the number of precompiled statements is reduced from `1956` ([`master`](dc45d77)) to `1901` (this commit).

As a side effect, the implementation of our inlining algorithm is now much simpler, since we no longer need the previous special handling for abstract callsites.

One possible drawback is increased code size. This change does appear to increase the size of the sysimage, but I think these numbers are in an acceptable range:

> [`master`](dc45d77)
```
❯ du -sh usr/lib/julia/*
 17M    usr/lib/julia/corecompiler.ji
188M    usr/lib/julia/sys-o.a
164M    usr/lib/julia/sys.dylib
 23M    usr/lib/julia/sys.dylib.dSYM
101M    usr/lib/julia/sys.ji
```

> this commit
```
❯ du -sh usr/lib/julia/*
 17M    usr/lib/julia/corecompiler.ji
190M    usr/lib/julia/sys-o.a
166M    usr/lib/julia/sys.dylib
 23M    usr/lib/julia/sys.dylib.dSYM
102M    usr/lib/julia/sys.ji
```
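
To make the "`isa` checks in order of specificity" idea concrete, here is a minimal hand-written sketch of the control flow the optimizer now generates for the `Union{Type,Nothing}` callsite. It is only an illustration: `isType_unionsplit` is a hypothetical name that does not appear in the commit, and the real transformation happens on the SSA IR, not on source code.

```julia
# Utility taken from the commit message above.
@inline isType(@nospecialize t) = isa(t, DataType) && t.name === Type.body.name

# Hypothetical source-level analogue of the union-split IR shown above:
# each dispatch candidate becomes one `isa` branch, checked in order,
# and the callee body is inlined into the matching branch.
function isType_unionsplit(@nospecialize x)
    if isa(x, Nothing)
        # `isType(nothing)` constant-folds to `false`
        return false
    elseif isa(x, Type)
        # inlined body of `isType` for the abstract `Type` case
        return isa(x, DataType) && x.name === Type.body.name
    else
        # fallback block, emitted only when inference is not trusted
        throw(ErrorException("fatal error in type inference (type bound)"))
    end
end

# The hand-written split agrees with the original function on every input
# that inference allows at this callsite.
for x in Any[Type{Int}, Int, Union{Int,Float64}, nothing]
    @assert isType_unionsplit(x) === isType(x)
end
```

The branch order matters once signatures are no longer concrete, which is why the commit relies on the method matches already being sorted by specificity.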
1 parent d5f5d52 commit 7d4cc1f

3 files changed: +177 -90 lines changed

base/compiler/ssair/inlining.jl

Lines changed: 75 additions & 85 deletions
@@ -241,7 +241,7 @@ function cfg_inline_unionsplit!(ir::IRCode, idx::Int,
         push!(from_bbs, length(state.new_cfg_blocks))
         # TODO: Right now we unconditionally generate a fallback block
         # in case of subtyping errors - This is probably unnecessary.
-        if i != length(cases) || (!fully_covered || (!params.trust_inference && isdispatchtuple(cases[i].sig)))
+        if i != length(cases) || (!fully_covered || (!params.trust_inference))
             # This block will have the next condition or the final else case
             push!(state.new_cfg_blocks, BasicBlock(StmtRange(idx, idx)))
             push!(state.new_cfg_blocks[cond_bb].succs, length(state.new_cfg_blocks))
@@ -313,7 +313,6 @@ function ir_inline_item!(compact::IncrementalCompact, idx::Int, argexprs::Vector
     spec = item.spec::ResolvedInliningSpec
     sparam_vals = item.mi.sparam_vals
     def = item.mi.def::Method
-    inline_cfg = spec.ir.cfg
     linetable_offset::Int32 = length(linetable)
     # Append the linetable of the inlined function to our line table
     inlined_at = Int(compact.result[idx][:line])
@@ -459,6 +458,47 @@ end
 
 const FATAL_TYPE_BOUND_ERROR = ErrorException("fatal error in type inference (type bound)")
 
+"""
+    ir_inline_unionsplit!
+
+The core idea of this function is to simulate the dispatch semantics by generating
+(flat) `isa`-checks corresponding to the signatures of union-split dispatch candidates,
+and then inline their bodies into each `isa`-conditional block.
+
+This `isa`-based virtual dispatch requires some pre-conditions to hold in order to simulate
+the actual semantics correctly.
+
+The first one is that these dispatch candidates need to be processed in order of their specificity,
+and the corresponding `isa`-checks should reflect the method specificities, since now their
+signatures are not necessarily concrete.
+Fortunately, `ml_matches` should already sorted them in that way, except cases when there is
+any ambiguity, from which we already bail out at this point.
+
+Another consideration is type equality constraint from type variables: the `isa`-checks are
+not enough to simulate the dispatch semantics in cases like:
+
+Given a definition:
+
+    f(x::T, y::T) where T<:Integer = ...
+
+Transform a callsite:
+
+    (x::Any, y::Any)
+
+Into the optimized form:
+
+    if isa(x, Integer) && isa(y, Integer)
+        f(x::Integer, y::Integer)
+    else
+        f(x::Integer, y::Integer)
+    end
+
+But again, we should already bail out from such cases at this point, essentially by
+excluding cases where `case.sig::UnionAll`.
+
+In short, here we can process the dispatch candidates in order, assuming we haven't changed
+their order somehow somewhere up to this point.
+"""
 function ir_inline_unionsplit!(compact::IncrementalCompact, idx::Int,
                                argexprs::Vector{Any}, linetable::Vector{LineInfoNode},
                                (; fully_covered, atype, cases, bbs)::UnionSplit,
@@ -468,17 +508,17 @@ function ir_inline_unionsplit!(compact::IncrementalCompact, idx::Int,
     join_bb = bbs[end]
     pn = PhiNode()
     local bb = compact.active_result_bb
-    @assert length(bbs) >= length(cases)
-    for i in 1:length(cases)
+    ncases = length(cases)
+    @assert length(bbs) >= ncases
+    for i = 1:ncases
         ithcase = cases[i]
         mtype = ithcase.sig::DataType # checked within `handle_cases!`
         case = ithcase.item
         next_cond_bb = bbs[i]
         cond = true
         nparams = fieldcount(atype)
         @assert nparams == fieldcount(mtype)
-        if i != length(cases) || !fully_covered ||
-           (!params.trust_inference && isdispatchtuple(cases[i].sig))
+        if i != ncases || !fully_covered || !params.trust_inference
             for i = 1:nparams
                 a, m = fieldtype(atype, i), fieldtype(mtype, i)
                 # If this is always true, we don't need to check for it
@@ -535,7 +575,7 @@ function ir_inline_unionsplit!(compact::IncrementalCompact, idx::Int,
     bb += 1
     # We're now in the fall through block, decide what to do
     if fully_covered
-        if !params.trust_inference && isdispatchtuple(cases[end].sig)
+        if !params.trust_inference
             e = Expr(:call, GlobalRef(Core, :throw), FATAL_TYPE_BOUND_ERROR)
             insert_node_here!(compact, NewInstruction(e, Union{}, line))
             insert_node_here!(compact, NewInstruction(ReturnNode(), Union{}, line))
@@ -558,7 +598,7 @@ function batch_inline!(todo::Vector{Pair{Int, Any}}, ir::IRCode, linetable::Vect
     state = CFGInliningState(ir)
     for (idx, item) in todo
         if isa(item, UnionSplit)
-            cfg_inline_unionsplit!(ir, idx, item::UnionSplit, state, params)
+            cfg_inline_unionsplit!(ir, idx, item, state, params)
         else
             item = item::InliningTodo
             spec = item.spec::ResolvedInliningSpec
@@ -1172,12 +1212,8 @@ function analyze_single_call!(
         sig::Signature, state::InliningState, todo::Vector{Pair{Int, Any}})
     argtypes = sig.argtypes
     cases = InliningCase[]
-    local only_method = nothing  # keep track of whether there is one matching method
-    local meth::MethodLookupResult
+    local any_fully_covered = false
     local handled_all_cases = true
-    local any_covers_full = false
-    local revisit_idx = nothing
-
     for i in 1:length(infos)
         meth = infos[i].results
         if meth.ambig
@@ -1188,66 +1224,20 @@ function analyze_single_call!(
             # No applicable methods; try next union split
             handled_all_cases = false
             continue
-        else
-            if length(meth) == 1 && only_method !== false
-                if only_method === nothing
-                    only_method = meth[1].method
-                elseif only_method !== meth[1].method
-                    only_method = false
-                end
-            else
-                only_method = false
-            end
         end
-        for (j, match) in enumerate(meth)
-            any_covers_full |= match.fully_covers
-            if !isdispatchtuple(match.spec_types)
-                if !match.fully_covers
-                    handled_all_cases = false
-                    continue
-                end
-                if revisit_idx === nothing
-                    revisit_idx = (i, j)
-                else
-                    handled_all_cases = false
-                    revisit_idx = nothing
-                end
-            else
-                handled_all_cases &= handle_match!(match, argtypes, flag, state, cases)
-            end
+        for match in meth
+            handled_all_cases &= handle_match!(match, argtypes, flag, state, cases, true)
+            any_fully_covered |= match.fully_covers
         end
     end
 
-    atype = argtypes_to_type(argtypes)
-    if handled_all_cases && revisit_idx !== nothing
-        # If there's only one case that's not a dispatchtuple, we can
-        # still unionsplit by visiting all the other cases first.
-        # This is useful for code like:
-        # foo(x::Int) = 1
-        # foo(@nospecialize(x::Any)) = 2
-        # where we where only a small number of specific dispatchable
-        # cases are split off from an ::Any typed fallback.
-        (i, j) = revisit_idx
-        match = infos[i].results[j]
-        handled_all_cases &= handle_match!(match, argtypes, flag, state, cases, true)
-    elseif length(cases) == 0 && only_method isa Method
-        # if the signature is fully covered and there is only one applicable method,
-        # we can try to inline it even if the signature is not a dispatch tuple.
-        # -- But don't try it if we already tried to handle the match in the revisit_idx
-        # case, because that'll (necessarily) be the same method.
-        if length(infos) > 1
-            (metharg, methsp) = ccall(:jl_type_intersection_with_env, Any, (Any, Any),
-                atype, only_method.sig)::SimpleVector
-            match = MethodMatch(metharg, methsp::SimpleVector, only_method, true)
-        else
-            @assert length(meth) == 1
-            match = meth[1]
-        end
-        handle_match!(match, argtypes, flag, state, cases, true) || return nothing
-        any_covers_full = handled_all_cases = match.fully_covers
+    if !handled_all_cases
+        # if we've not seen all candidates, union split is valid only for dispatch tuples
+        filter!(case::InliningCase->isdispatchtuple(case.sig), cases)
     end
 
-    handle_cases!(ir, idx, stmt, atype, cases, any_covers_full && handled_all_cases, todo, state.params)
+    handle_cases!(ir, idx, stmt, argtypes_to_type(argtypes), cases,
+        handled_all_cases & any_fully_covered, todo, state.params)
 end
 
 # similar to `analyze_single_call!`, but with constant results
# similar to `analyze_single_call!`, but with constant results
@@ -1258,8 +1248,8 @@ function handle_const_call!(
     (; call, results) = cinfo
     infos = isa(call, MethodMatchInfo) ? MethodMatchInfo[call] : call.matches
     cases = InliningCase[]
+    local any_fully_covered = false
     local handled_all_cases = true
-    local any_covers_full = false
     local j = 0
     for i in 1:length(infos)
         meth = infos[i].results
@@ -1275,42 +1265,39 @@ function handle_const_call!(
         for match in meth
             j += 1
             result = results[j]
-            any_covers_full |= match.fully_covers
+            any_fully_covered |= match.fully_covers
             if isa(result, ConstResult)
                 case = const_result_item(result, state)
                 push!(cases, InliningCase(result.mi.specTypes, case))
             elseif isa(result, InferenceResult)
-                handled_all_cases &= handle_inf_result!(result, argtypes, flag, state, cases)
+                handled_all_cases &= handle_inf_result!(result, argtypes, flag, state, cases, true)
             else
                 @assert result === nothing
-                handled_all_cases &= handle_match!(match, argtypes, flag, state, cases)
+                handled_all_cases &= handle_match!(match, argtypes, flag, state, cases, true)
             end
         end
     end
 
-    # if the signature is fully covered and there is only one applicable method,
-    # we can try to inline it even if the signature is not a dispatch tuple
-    atype = argtypes_to_type(argtypes)
-    if length(cases) == 0
-        length(results) == 1 || return nothing
-        result = results[1]
-        isa(result, InferenceResult) || return nothing
-        handle_inf_result!(result, argtypes, flag, state, cases, true) || return nothing
-        spec_types = cases[1].sig
-        any_covers_full = handled_all_cases = atype <: spec_types
+    if !handled_all_cases
+        # if we've not seen all candidates, union split is valid only for dispatch tuples
+        filter!(case::InliningCase->isdispatchtuple(case.sig), cases)
     end
 
-    handle_cases!(ir, idx, stmt, atype, cases, any_covers_full && handled_all_cases, todo, state.params)
+    handle_cases!(ir, idx, stmt, argtypes_to_type(argtypes), cases,
+        handled_all_cases & any_fully_covered, todo, state.params)
 end
 
 function handle_match!(
     match::MethodMatch, argtypes::Vector{Any}, flag::UInt8, state::InliningState,
     cases::Vector{InliningCase}, allow_abstract::Bool = false)
     spec_types = match.spec_types
     allow_abstract || isdispatchtuple(spec_types) || return false
+    # we may see duplicated dispatch signatures here when a signature gets widened
+    # during abstract interpretation: for the purpose of inlining, we can just skip
+    # processing this dispatch candidate
+    _any(case->case.sig === spec_types, cases) && return true
     item = analyze_method!(match, argtypes, flag, state)
     item === nothing && return false
-    _any(case->case.sig === spec_types, cases) && return true
     push!(cases, InliningCase(spec_types, item))
     return true
 end
@@ -1346,7 +1333,9 @@ function handle_cases!(ir::IRCode, idx::Int, stmt::Expr, @nospecialize(atype),
         handle_single_case!(ir, idx, stmt, cases[1].item, todo, params)
     elseif length(cases) > 0
         isa(atype, DataType) || return nothing
-        all(case::InliningCase->isa(case.sig, DataType), cases) || return nothing
+        for case in cases
+            isa(case.sig, DataType) || return nothing
+        end
         push!(todo, idx=>UnionSplit(fully_covered, atype, cases))
     end
     return nothing
@@ -1442,7 +1431,8 @@ function assemble_inline_todo!(ir::IRCode, state::InliningState)
 
         analyze_single_call!(ir, idx, stmt, infos, flag, sig, state, todo)
     end
-    todo
+
+    return todo
 end
 
 function linear_inline_eligible(ir::IRCode)
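
The docstring added to `ir_inline_unionsplit!` notes that plain `isa` checks cannot express the type-equality constraint imposed by a shared type variable. The following sketch, which is not part of the commit, illustrates why such `UnionAll` signatures have to be excluded. `naive_match` is a hypothetical helper standing in for the per-argument `isa` checks; `applicable` is the standard Base query for whether a method exists for the given arguments.

```julia
# Definition from the docstring: both arguments must share one concrete type `T`.
f(x::T, y::T) where T<:Integer = x + y

# Hypothetical stand-in for the per-argument `isa` checks the inliner would emit.
naive_match(@nospecialize(x), @nospecialize(y)) = isa(x, Integer) && isa(y, Integer)

# Same concrete type on both sides: the `isa` checks and real dispatch agree.
@assert naive_match(1, 2)
@assert applicable(f, 1, 2)

# Mixed integer types: the `isa` checks pass, but no method of `f` applies,
# so an `isa`-guarded static call to `f` would be wrong here.
@assert naive_match(1, Int8(2))
@assert !applicable(f, 1, Int8(2))
```

This is the situation the inliner avoids by requiring every `case.sig` to be a `DataType` in `handle_cases!`.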

base/sort.jl

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@ module Sort
 import ..@__MODULE__, ..parentmodule
 const Base = parentmodule(@__MODULE__)
 using .Base.Order
-using .Base: copymutable, LinearIndices, length, (:),
+using .Base: copymutable, LinearIndices, length, (:), iterate,
     eachindex, axes, first, last, similar, zip, OrdinalRange,
     AbstractVector, @inbounds, AbstractRange, @eval, @inline, Vector, @noinline,
     AbstractMatrix, AbstractUnitRange, isless, identity, eltype, >, <, <=, >=, |, +, -, *, !,
