This repository was archived by the owner on Jul 13, 2021. It is now read-only.

Conversation

@timholy
Copy link
Contributor

@timholy timholy commented Sep 23, 2020

This is a small demonstration of a potential alternate path towards reducing latency compared to #502. This PR takes one tiny sliver of the problem, update_limits!, and improves its inferrability---not just in terms of its return type (which is Scene), but also its internals. The reasoning here is that if we improve the quality of inference, then (1) there's less abstract inference, which typically takes more time than concrete inference, and (2) it makes it possible to prevent runtime inference altogether, since we can typically precompile concrete signatures.

The main point of this PR is to tell you how I went about this and allow you to see the result: first, I profiled display(plot(rand(5))) and then used ProfileView:

using Profile
Profile.init(delay=0.01, n=10^7)    # just to make sure we don't overrun the buffer
@profile display(plot(rand(5)))
using ProfileView
ProfileView.view()

[screenshot: ProfileView flame graph of `display(plot(rand(5)))`]

The flames that rise high to the sky are almost always inference. By hovering over the bars at the base you can sometimes figure out what is being inferred (obviously, @snoopi gives you more direct routes to figuring this out, but sometimes it's nice to know the call chain and profiling is really good for that). update_limits! is an incredibly skinny (i.e., not very consequential) vertical bar fairly near the beginning; I didn't pick it for any particular reason other than a glance or two showed me what was being inferred.

Then, I just did

using Cthulhu
descend(AbstractPlotting.update_limits!, (Scene,))

Hit 'o' to turn off optimization and 'w' to turn on coloration of non-concrete types. Then I just used the menu to follow the calls into update_limits!(::Scene, ::Automatic).

The problems I found stemmed largely from JuliaLang/julia#15276, and given the heavy do-block usage in this package I expect that to be pretty common. The underlying issues are well-described in https://docs.julialang.org/en/v1/manual/performance-tips/#man-performance-captured. It may be worth considering the use of FastClosures.jl if this proves to be widespread. In this case, however, I just solved the problem manually: rebinding the captured variables with `let` statements prior to their use in an anonymous function fixes the inference (with the small addition of making sure that the `ifelse` statements inferred well).
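To make the capture issue concrete, here is a minimal, hypothetical sketch (not Makie code) of the pattern and the `let` fix described in the performance-tips link above:

```julia
# A variable assigned before being captured by a closure can get boxed
# (JuliaLang/julia#15276), so the closure body infers poorly.
function scale_bad(flip::Bool, xs)
    s = flip ? -1.0 : 1.0
    return map(x -> s * x, xs)   # `s` may be boxed; closure sees an abstract type
end

function scale_good(flip::Bool, xs)
    s = flip ? -1.0 : 1.0
    return let s = s             # rebind in a `let`: the closure now captures
        map(x -> s * x, xs)      # a local with a concrete, known type
    end
end
```

The `let s = s` idiom creates a fresh binding that is never reassigned, which is what allows inference to treat it as a constant inside the closure.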

There were also a couple of other more localized problems:

  • scene.padding[] was not inferrable so I added a type-assert on the argument. This guarantees that the method will never be inferred for anything non-concrete at that argument position, no matter where it's called from.
  • data_limits was also not inferrable. One of the things to keep in mind is that if Julia has n applicable methods (given what it knows about the argument type(s)), all of which return the same type T, Julia still will infer it as Any if n > 3. (It might be 4 on 1.5 but it's 3 on 1.6.) This doesn't come up with objects of known argument type, because there's only ever 1 applicable method, but when you're pulling things out of containers with abstract element types it's crucial to be aware of this issue.
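Both bullets can be illustrated with a small, hypothetical Julia sketch (none of these names are from Makie):

```julia
# (1) Pulling a value out of an abstractly-typed container is not inferrable;
#     a type-assert restores concrete inference at the use site:
get_padding(d::Dict{Symbol,Any}) = d[:padding]::Float64   # now inferred as Float64

# (2) Method-count widening: when more than 3 methods are applicable
#     (the limit on Julia 1.6), inference returns Any even though every
#     method here returns Float64:
limits(x::Int)      = Float64(x)
limits(x::Float64)  = x
limits(x::Char)     = Float64(codepoint(x))
limits(x::Rational) = Float64(x)
# For `limits(d[:val])` with `d::Dict{Symbol,Any}`, all four methods are
# applicable, so the call is inferred as Any rather than Float64.
```

This is why the issue surfaces mainly when arguments come out of containers with abstract element types: with a concretely-typed argument there is only ever one applicable method.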

Again, this particular PR is such an infinitesimal step that I expect no measurable improvements whatsoever; the main point is to demonstrate the idea and show you my workflow. For the record, this is quite similar to the work that I did eliminating invalidations in Julia itself, which I think has had a pretty big impact on latency.

Making this viable

I'd really rather not feel like I have to tackle this alone, so I'm curious how interested the Makie developers really are in this as a mini-project. If you're not already familiar with these tools, I hope somebody reading this actually tries these same steps to follow along, so you can get a sense for what such an adventure would look like. As a word of warning, while the tools are really good for this now, it still can be slow, laborious work, but the cumulative effect really adds up.

As motivation, let me show you where I've gotten Revise. Here's the test I did:

juliamns -i -e 'using Profile; @profile (using Revise; using Example; cp("/tmp/Example.jl", "/home/tim/.julia/dev/Example/src/Example.jl"; force=true); revise())'

This test profiles the entire Revise pipeline, from tracking packages to detecting diffs to evaluating them and creating new methods. Revise is not a tiny package, and moreover all this requires both JuliaInterpreter and LoweredCodeUtils, so there's quite a lot of functionality actively being compiled during this demo.

After the REPL came up then I did using ProfileView; ProfileView.view() and got this:

[screenshot: ProfileView flame graph of the Revise pipeline]

You can see lots of code-loading and active compilation (the red bars going across the top), but the amount that inference contributes to this is almost nothing (there are relatively few tall flames and they are very skinny). The few stragglers come from the fact that calling methods owned by other packages or Base with novel types is not currently precompilable, and for Revise that's mostly Dict and OrderedDict methods. So yes there is still latency, but in a very nontrivial example I've almost completely gotten rid of the contribution of inference to that latency. This is a very different result from Makie, but I predict one could probably get the Makie one looking pretty close to Revise's with sufficient effort. One might predict a 2-fold decrease in latency, which would not be a bad accomplishment.

@timholy
Copy link
Contributor Author

timholy commented Sep 23, 2020

By the way, while I submitted this as a demo, unless any of my type-asserts are wrong there should be no reason not to merge this---in addition to less latency, it should yield faster runtime performance and greater resistance to invalidation by other packages that get loaded later in the same session. What's not to like?

@SimonDanisch
Copy link
Member

Thanks a lot :) I will try to put some time into finding more of these!
If I'm not mistaken, /src/makielayout won't be refactored much anytime soon, so it would be a nice place for optimization (@jkrumbiegel, that's about right, no?)
src/conversions.jl should also mainly get new code added, but stay mostly similar.

@SimonDanisch SimonDanisch merged commit 0601c06 into JuliaPlots:master Sep 23, 2020
@timholy timholy deleted the teh/infer_update_limits branch September 24, 2020 05:57