Commit deb87b9
improve performance issue of `@nospecialize`-d keyword func call
This commit improves performance when calling keyword
funcs whose argument types are not fully known but are `@nospecialize`-d.
The final result would look like (this particular example is taken from
our Julia-level compiler implementation):
```julia
abstract type CallInfo end
struct NoCallInfo <: CallInfo end
struct NewInstruction
    stmt::Any
    type::Any
    info::CallInfo
    line::Union{Int32,Nothing} # if nothing, copy the line from previous statement in the insertion location
    flag::Union{UInt8,Nothing} # if nothing, IR flags will be recomputed on insertion
    function NewInstruction(@nospecialize(stmt), @nospecialize(type), @nospecialize(info::CallInfo),
            line::Union{Int32,Nothing}, flag::Union{UInt8,Nothing})
        return new(stmt, type, info, line, flag)
    end
end
@nospecialize
function NewInstruction(newinst::NewInstruction;
    stmt=newinst.stmt,
    type=newinst.type,
    info::CallInfo=newinst.info,
    line::Union{Int32,Nothing}=newinst.line,
    flag::Union{UInt8,Nothing}=newinst.flag)
    return NewInstruction(stmt, type, info, line, flag)
end
@specialize
using BenchmarkTools
struct VirtualKwargs
    stmt::Any
    type::Any
    info::CallInfo
end
vkws = VirtualKwargs(nothing, Any, NoCallInfo())
newinst = NewInstruction(nothing, Any, NoCallInfo(), nothing, nothing)
runner(newinst, vkws) = NewInstruction(newinst; vkws.stmt, vkws.type, vkws.info)
@benchmark runner($newinst, $vkws)
```
> on master
```
BenchmarkTools.Trial: 10000 samples with 186 evaluations.
Range (min … max): 559.898 ns … 4.173 μs ┊ GC (min … max): 0.00% … 85.29%
Time (median): 605.608 ns ┊ GC (median): 0.00%
Time (mean ± σ): 638.170 ns ± 125.080 ns ┊ GC (mean ± σ): 0.06% ± 0.85%
█▇▂▆▄ ▁█▇▄▂ ▂
██████▅██████▇▇▇██████▇▇▇▆▆▅▄▅▄▂▄▄▅▇▆▆▆▆▆▅▆▆▄▄▅▅▄▃▄▄▄▅▃▅▅▆▅▆▆ █
560 ns Histogram: log(frequency) by time 1.23 μs <
Memory estimate: 32 bytes, allocs estimate: 2.
```
> on this commit
```
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
Range (min … max): 3.080 ns … 83.177 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 3.098 ns ┊ GC (median): 0.00%
Time (mean ± σ): 3.118 ns ± 0.885 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
▂▅▇█▆▅▄▂
▂▄▆▆▇████████▆▃▃▃▃▃▃▃▃▃▃▂▂▂▂▂▂▂▂▂▁▁▂▂▂▁▂▂▂▂▂▂▁▁▂▁▂▂▂▂▂▂▂▂▂ ▃
3.08 ns Histogram: frequency by time 3.19 ns <
Memory estimate: 0 bytes, allocs estimate: 0.
```
So for this particular case it achieves roughly a 200x speedup.
This is because this commit allows a call to the keyword sorter to be inlined
and the `NamedTuple` call to be removed.
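As background for the terms used here: a keyword call packs its keyword arguments into a `NamedTuple` and passes it to a "keyword sorter", which unpacks them, fills in defaults, and rejects unknown keywords. The following is a rough hand-written model, assuming a hypothetical function `f(x; a=1, b=2)`. The names `f_body` and `f_kwsorter` are made up for illustration, and the code the compiler actually generates differs in detail.

```julia
# Hypothetical body function: receives every keyword as a positional argument.
f_body(a, b, x) = (x, a, b)

# Hypothetical hand-written keyword sorter for `f(x; a=1, b=2)`.
function f_kwsorter(kws::NamedTuple, x)
    # Pull each known keyword out of `kws`, falling back to its default.
    a = haskey(kws, :a) ? getfield(kws, :a) : 1
    b = haskey(kws, :b) ? getfield(kws, :b) : 2
    # Whatever remains after removing the known names is unsupported.
    rest = Base.structdiff(kws, NamedTuple{(:a, :b)})
    isempty(pairs(rest)) ||
        throw(ArgumentError("unsupported keyword arguments: $(keys(rest))"))
    return f_body(a, b, x)
end

# Roughly what a call site `f(0; a=10, b=20)` amounts to:
kws = NamedTuple{(:a, :b)}((10, 20))
result = f_kwsorter(kws, 0)
```

When the field types of `kws` are not known to the compiler, both the `NamedTuple` construction and the `structdiff`/`pairs` check get in the way of inlining, which is the situation this commit targets.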
Specifically, this commit consists of the following improvements:
- add early return case for `structdiff`:
This change improves return type inference for the case where the
compared `NamedTuple`s are type unstable but their names do not differ,
e.g. given two `NamedTuple{(:a,:b),T} where T<:Tuple{Any,Any}`s.
In such a case the optimizer removes the `structdiff` call and the succeeding
`pairs` call, allowing the keyword sorter to be inlined.
- add special SROA handling for `NamedTuple` generated by keyword sorter:
With the change on `structdiff`, IR for a call with type-unstable
keyword arguments after inlining would look like:
```
%1 = tuple(a, b, c)::Tuple{Any, Any, Any}
%2 = NamedTuple{(:a, :b, :c)}(%1)::NamedTuple{(:a, :b, :c), _A} where _A<:Tuple{Any, Any, Any}
%3 = Core.getfield(%2, :a)::Any
%4 = Core.getfield(%2, :b)::Any
%5 = Core.getfield(%2, :c)::Any
[... other body of the keyword func ...]
```
We can implement somewhat hacky special handling within our SROA pass
that checks whether this definition (%2) is a partly well-known `NamedTuple`
construction, where its names are fully known, and also checks whether its
call argument (%1) is a fully-known `tuple` call. When the
length of the `NamedTuple` names matches the number of arguments of
the `tuple` call, we can safely replace those `getfield` calls with
the corresponding `tuple` call arguments, while letting the later DCE
pass delete the constructions of the tuple and named tuple altogether.
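The two improvements can be illustrated on plain values, outside the compiler. This is a minimal sketch and not the code added by this commit: `structdiff_sketch` and `forward_getfield` are hypothetical names, the first is a simplified stand-in for `Base.structdiff`, and the second only models the condition under which a `getfield` on the named tuple can be forwarded to a `tuple` argument.

```julia
# Hypothetical, simplified stand-in for `Base.structdiff` (not the Base code).
function structdiff_sketch(a::NamedTuple{an}, b::NamedTuple{bn}) where {an, bn}
    # Keep only the names of `a` that do not appear in `b`.
    names = filter(n -> !(n in bn), an)
    # Early return: with no names left the result is the empty `NamedTuple`,
    # however loosely the field types of `a` and `b` are known.
    isempty(names) && return NamedTuple()
    return NamedTuple{names}(map(n -> getfield(a, n), names))
end

# Hypothetical value-level model of the SROA special case: given the names of
# a `NamedTuple{names}(tuple(args...))` construction, find the `tuple` argument
# that `getfield(nt, field)` would return. Returns `nothing` to "give up".
function forward_getfield(names::Tuple{Vararg{Symbol}}, args::Tuple, field::Symbol)
    length(names) == length(args) || return nothing  # lengths must match
    idx = findfirst(==(field), names)
    idx === nothing && return nothing                # field is not one of the names
    return Some(args[idx])                           # wrapped so `nothing` values survive
end

args = (1, 2, 3)
nt = NamedTuple{(:a, :b, :c)}(args)
same_names = structdiff_sketch((a = 1, b = 2), (a = 0, b = 0))
forwarded = forward_getfield((:a, :b, :c), args, :b)
```

Here `forwarded.value` is the same value as `getfield(nt, :b)`, which is why the named tuple itself no longer needs to be constructed once every `getfield` has been replaced.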
With these changes, the IR for the example `NewInstruction` constructor
is fairly well optimized:
```julia
julia> Base.code_ircode((NewInstruction,Any,Any,CallInfo)) do newinst, stmt, type, info
NewInstruction(newinst; stmt, type, info)
end |> only
2 1 ── %1 = Base.getfield(_2, :line)::Union{Nothing, Int32} │╻╷ Type##kw
│ %2 = Base.getfield(_2, :flag)::Union{Nothing, UInt8} ││┃ getproperty
│ %3 = (isa)(%1, Nothing)::Bool ││
│ %4 = (isa)(%2, Nothing)::Bool ││
│ %5 = (Core.Intrinsics.and_int)(%3, %4)::Bool ││
└─── goto #3 if not %5 ││
2 ── %7 = %new(Main.NewInstruction, _3, _4, _5, nothing, nothing)::NewInstruction NewInstruction
└─── goto #10 ││
3 ── %9 = (isa)(%1, Int32)::Bool ││
│ %10 = (isa)(%2, Nothing)::Bool ││
│ %11 = (Core.Intrinsics.and_int)(%9, %10)::Bool ││
└─── goto #5 if not %11 ││
4 ── %13 = π (%1, Int32) ││
│ %14 = %new(Main.NewInstruction, _3, _4, _5, %13, nothing)::NewInstruction│││╻ NewInstruction
└─── goto #10 ││
5 ── %16 = (isa)(%1, Nothing)::Bool ││
│ %17 = (isa)(%2, UInt8)::Bool ││
│ %18 = (Core.Intrinsics.and_int)(%16, %17)::Bool ││
└─── goto #7 if not %18 ││
6 ── %20 = π (%2, UInt8) ││
│ %21 = %new(Main.NewInstruction, _3, _4, _5, nothing, %20)::NewInstruction│││╻ NewInstruction
└─── goto #10 ││
7 ── %23 = (isa)(%1, Int32)::Bool ││
│ %24 = (isa)(%2, UInt8)::Bool ││
│ %25 = (Core.Intrinsics.and_int)(%23, %24)::Bool ││
└─── goto #9 if not %25 ││
8 ── %27 = π (%1, Int32) ││
│ %28 = π (%2, UInt8) ││
│ %29 = %new(Main.NewInstruction, _3, _4, _5, %27, %28)::NewInstruction │││╻ NewInstruction
└─── goto #10 ││
9 ── Core.throw(ErrorException("fatal error in type inference (type bound)"))::Union{}
└─── unreachable ││
10 ┄ %33 = φ (#2 => %7, #4 => %14, #6 => %21, #8 => %29)::NewInstruction ││
└─── goto #11 ││
11 ─ return %33 │
=> NewInstruction
```
1 parent d498d36 commit deb87b9
4 files changed, +57 -4 lines changed
- base
- compiler
- ssair
- test/compiler