Conversation

@vytas7 commented Dec 20, 2020

This is an early prototype of what implementing the https://github.com/pythonspeed/cachegrind-benchmarking approach could look like for Falcon.

The prototype builds upon this excellent article by @itamarst: https://pythonspeed.com/articles/consistent-benchmarking-in-ci/ (thanks @njsmith for kindly pointing out this idea).

Closes #1450

To do if we want to proceed with this:

  • ASGI performance metric
  • Barebones "Hello, World!" performance metric
  • Media performance metric
  • Routing performance metric
  • URL params and headers performance metric
  • Clean up tox environments, requirements, etc.
  • Add Cython support
  • Run Cython gates for master only (?)
  • Add PyPy support, master only (?); probably only informational?
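
The core of the cachegrind-benchmarking approach is to run a benchmark script under Valgrind's Cachegrind tool and read the deterministic event counts back from the output file. A minimal sketch of what that could look like (the benchmark script name and output path here are assumptions, not part of this PR; the `events:`/`summary:` parsing follows Cachegrind's documented file format):

```python
import subprocess
import sys


def run_under_cachegrind(script, out_file="cachegrind.out"):
    """Run *script* under Cachegrind, then return its instruction count."""
    subprocess.run(
        ["valgrind", "--tool=cachegrind",
         f"--cachegrind-out-file={out_file}", sys.executable, script],
        check=True,
    )
    return parse_instruction_count(out_file)


def parse_instruction_count(out_file):
    """Extract the total instruction count (Ir) from a cachegrind.out file.

    Cachegrind files contain an ``events:`` header naming the counter
    columns and a ``summary:`` line with the totals in the same order.
    """
    events, summary = [], []
    with open(out_file) as f:
        for line in f:
            if line.startswith("events:"):
                events = line.split()[1:]
            elif line.startswith("summary:"):
                summary = [int(tok) for tok in line.split()[1:]]
    return dict(zip(events, summary))["Ir"]
```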

@vytas7 vytas7 marked this pull request as draft December 20, 2020 19:08
codecov bot commented Dec 20, 2020

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 100.00%. Comparing base (29b05ed) to head (30d2886).
⚠️ Report is 386 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff            @@
##            master     #1825   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           54        54           
  Lines         5154      5154           
  Branches       831       831           
=========================================
  Hits          5154      5154           

☔ View full report in Codecov by Sentry.

@itamarst

Excited to see this, and to see if it turns out to be useful. I'm going to be setting up something similar for one of my projects, so will share what I come up with if this is still in progress, or maybe steal from you, depending 😀

@itamarst

Note that I've found a bug in the cachegrind.py script calculation, so you'll want to pull a new version once I've updated it (tomorrow hopefully).

@vytas7 commented Dec 21, 2020

Heh, thanks for the heads-up @itamarst!
I did notice that the cachegrind.py metric was a lot (1–2 orders of magnitude) noisier than the raw instruction count from Valgrind, if that is what is being revised.
But maybe it is simply noisier by nature, so I didn't pay too much attention to it.

@vytas7 commented Dec 21, 2020

@itamarst Btw, another thing that really surprised me when toying with this approach was how much "rogue" Python hash seeds can affect performance 🙂

@itamarst

Yeah, you really want to set a fixed PYTHONHASHSEED for consistency.
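
To illustrate why: without a fixed `PYTHONHASHSEED`, each interpreter run gets a random string-hash seed, so dict/set memory layout (and hence the instruction count) differs from run to run. A quick sketch showing that pinning the seed makes hashes reproducible across runs:

```shell
# With PYTHONHASHSEED pinned, string hashes are identical across
# separate interpreter invocations; unset, they would vary randomly.
a=$(PYTHONHASHSEED=0 python3 -c 'print(hash("falcon"))')
b=$(PYTHONHASHSEED=0 python3 -c 'print(hash("falcon"))')
echo "$a $b"
```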

The issue was that I was counting L3 hits wrong: if something hit RAM, I also counted that as hitting L3 (i.e. L3 hits were too high). I am not sure this will have much impact on the noisiness, though.
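
The corrected accounting can be sketched as follows, using Cachegrind's counter names (`Ir`/`Dr`/`Dw` are instruction/data reads and writes, `*1m*` are L1 misses, `*Lm*` are last-level misses). The 1/5/35 weights are the cost model used by the cachegrind.py script, stated here as an assumption; the fix is that an access that missed all the way to RAM must no longer also be counted as an L3 hit:

```python
def combined_score(c):
    """Estimate a CPU-cycle-like score from Cachegrind counters in *c*."""
    ram_hits = c["ILmr"] + c["DLmr"] + c["DLmw"]      # missed the last level
    l1_misses = c["I1mr"] + c["D1mr"] + c["D1mw"]
    l3_hits = l1_misses - ram_hits                    # served by L3 only (the fix)
    l1_hits = (c["Ir"] + c["Dr"] + c["Dw"]) - l1_misses
    return l1_hits + 5 * l3_hits + 35 * ram_hits
```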

@itamarst

OK, https://github.com/pythonspeed/cachegrind-benchmarking has been updated.

@vytas7 commented Dec 22, 2020

Thanks @itamarst , I'll check that out!

@vytas7 commented Dec 26, 2020

@itamarst thanks again for the update.
The noisiness is gone, and the variation in the least-squares linear regression fitting error is now unbelievably low 💯 .
In fact, I'm now getting exactly the same cost per iteration in two different CI runs (matching to at least 9 significant digits!).
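
The per-iteration cost mentioned above can be recovered with an ordinary least-squares fit: measure total instruction counts for several iteration counts, fit a line, and take the slope as the cost of one iteration, while the intercept absorbs fixed startup overhead. A minimal dependency-free sketch (not the PR's actual code):

```python
def fit_cost_per_iteration(iterations, counts):
    """Fit counts ~= slope * iterations + intercept by least squares."""
    n = len(iterations)
    mean_x = sum(iterations) / n
    mean_y = sum(counts) / n
    sxx = sum((x - mean_x) ** 2 for x in iterations)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(iterations, counts))
    slope = sxy / sxx                  # cost of one iteration
    intercept = mean_y - slope * mean_x  # fixed (startup) overhead
    return slope, intercept
```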

@itamarst

I got benchmarks working. Beyond what is in the original repository:

  1. I store the results as a pretty-printed, key-sorted JSON file on disk. The expectation is that a developer runs this locally and checks in the result, but mine is a single-person project.
  2. On every PR, I run the benchmarks again and add the diff as a GitHub comment: https://github.com/pythonspeed/filprofiler/blob/master/.github/workflows/main.yml#L120
  3. In addition to setting PYTHONHASHSEED and figuring out an equivalent fixed seed for Rust, I ended up using Conda environments to reduce noise between my machine and the GitHub Actions machines (so that, e.g., it's the same Python binary rather than different dot-versions or different compilers). The result is quite consistent on my machine, and a little noisy between my machine and the VMs, but better than it would be without Conda.

Example output: pythonspeed/filprofiler#110
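
Step 1 above might look roughly like the following (function names are illustrative, not filprofiler's actual code): pretty-printed, key-sorted JSON keeps the checked-in file diff-friendly, and a changed-entry comparison is what would feed the PR comment in step 2:

```python
import json


def save_results(results, path):
    """Write benchmark results as pretty-printed, key-sorted JSON."""
    with open(path, "w") as f:
        json.dump(results, f, indent=2, sort_keys=True)
        f.write("\n")


def diff_results(old, new):
    """Map each changed benchmark name to its (old, new) scores."""
    return {
        name: (old.get(name), score)
        for name, score in sorted(new.items())
        if old.get(name) != score
    }
```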

@kgriffs mentioned this pull request Aug 4, 2021
@vytas7 commented Mar 13, 2022

This has been rotting for so long due to my lack of bandwidth that I'm now thinking of waiting a couple more weeks and then migrating this to Ubuntu 22.04 + CPython 3.10, so as to keep the same gauge for a longer time.

I'm hoping to circle back on this shortly after we release 3.1.



Development

Successfully merging this pull request may close these issues.

Add basic performance testing to the gate

3 participants