perf: sparse patch storage for as_of variables (~60x memory reduction)#1366
Conversation
eraviart
left a comment
Much-needed complement to as_of variables
force-pushed from 32ad81b to 8beb444
Instead of storing one full array per set_input call, as_of variables now use a base array + list of (instant, indices, values) sparse patches.

Storage model:
- _as_of_base: first full array (read-only), set once
- _as_of_patches: sorted list of sparse diffs (Instant, idx, vals)
- _as_of_patch_instants: parallel bisect index for O(log P) lookup
- _as_of_snapshot: (Instant, array, patch_cursor) for incremental GETs

Memory savings for 0.5% monthly change rate, N=1M, 120 months:
- before (dense): ~240 MB
- after (patches): ~4 MB (~60x reduction)

GET performance (forward-sequential, snapshot cursor):
- first access: O(N) (base copy)
- subsequent: O(k) (apply only new patches since snapshot)
- cache hit: O(1) (same instant as snapshot)
- backward jump: O(N + k·P) (full reconstruction — degrades gracefully)

Other changes:
- _set routes as_of writes to _set_as_of, bypassing _memory_storage
- get_array short-circuits _memory_storage for as_of variables
- _register_instant removed (replaced by sorted insert in _set_as_of)
- clone() shares the read-only base, deep-copies patch lists, resets the snapshot
- delete_arrays() leaves _memory_storage empty for as_of (no-op, harmless)
- Retroactive set_input supported (out-of-order patches, snapshot invalidation)
- 18 tests: 14 semantics + 4 new (no-patch-on-unchanged, diff correctness, retroactive patch, snapshot cursor reuse)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
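The base + sparse patches model described in the commit message can be sketched as follows. This is an illustrative, self-contained stand-in: the attribute names mirror the PR (`_as_of_base`, `_as_of_patches`, `_as_of_patch_instants`, `_as_of_snapshot`), but the class and its `set`/`get` signatures are hypothetical, not OpenFisca's actual holder API, and the snapshot-cursor fast path is omitted here for brevity (GET always reconstructs from the base).

```python
import bisect

import numpy as np


class SparseAsOfStorage:
    """Hedged sketch of sparse patch storage: base array + (instant, idx, vals) diffs."""

    def __init__(self):
        self._as_of_base = None          # first full array (read-only), set once
        self._as_of_patches = []         # sorted list of (instant, idx, vals)
        self._as_of_patch_instants = []  # parallel list for O(log P) bisect lookup
        self._as_of_snapshot = None      # (instant, array, cursor); unused in this sketch

    def set(self, instant, value):
        if self._as_of_base is None:
            self._as_of_base = value.copy()
            self._as_of_base.setflags(write=False)  # share safely, never mutate
            return
        # Diff against the reconstructed state so only changed indices are stored.
        current = self.get(instant)
        idx = np.flatnonzero(current != value)
        if idx.size == 0:
            return  # no patch stored when the input is unchanged
        pos = bisect.bisect_right(self._as_of_patch_instants, instant)
        self._as_of_patches.insert(pos, (instant, idx, value[idx].copy()))
        self._as_of_patch_instants.insert(pos, instant)
        self._as_of_snapshot = None  # simplest invalidation policy

    def get(self, instant):
        # Full reconstruction: copy the base, then apply patches up to `instant`.
        result = self._as_of_base.copy()
        cursor = bisect.bisect_right(self._as_of_patch_instants, instant)
        for _, idx, vals in self._as_of_patches[:cursor]:
            result[idx] = vals
        return result
```

Out-of-order (retroactive) `set` calls are handled by the sorted insert: the patch lands at its chronological position, and later GETs pick it up.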
…ruction

In the forward-sequential GET→SET→GET pattern (the real use case for as_of variables: each month's echelon depends on the previous one), _set_as_of was unconditionally clearing the snapshot after every write. This forced the next get_array(M) to fully reconstruct from the base + all M patches — O(N + M·k) — making the total simulation cost quadratic.

Root cause: _reconstruct_at(instant) advanced the snapshot to `instant` during the diff computation inside _set_as_of, so the invalidation condition `snapshot[0] >= instant` was always true (equality case).

Fix: distinguish forward vs retroactive SETs.
- Forward (patch appended at end of list): store value.copy() as the new snapshot so the next GET(instant) is an O(1) exact cache hit.
- Retroactive (out-of-order insert): keep the existing >= guard to prevent a stale snapshot from silently ignoring the new patch.

Benchmark (N=1M, forward sim):
- 1 yr, 10% change/month: 66 ms → 46 ms (×1.4)
- 5 yr, 10% change/month: 772 ms → 188 ms (×4.1)
- 5 yr, 30% change/month: 1598 ms → 294 ms (×5.4)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
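The forward-vs-retroactive rule from this commit can be sketched as a small helper. Everything here is hypothetical scaffolding (a dict stands in for the holder's attributes; the function name and signature are invented for illustration); only the invalidation logic itself follows the commit message.

```python
import numpy as np


def update_snapshot_after_set(holder_state, instant, value, inserted_at_end):
    """Sketch of the snapshot policy after a SET (hypothetical signature).

    holder_state: dict with "_as_of_patches" (sorted patch list) and
    "_as_of_snapshot" ((instant, array, cursor) or None).
    """
    snapshot = holder_state.get("_as_of_snapshot")
    if inserted_at_end:
        # Forward SET: the next GET(instant) should be an O(1) exact cache
        # hit, so store the freshly written value as the new snapshot.
        holder_state["_as_of_snapshot"] = (
            instant,
            value.copy(),
            len(holder_state["_as_of_patches"]),  # cursor past all patches
        )
    elif snapshot is not None and snapshot[0] >= instant:
        # Retroactive SET at or before the snapshot instant: the snapshot no
        # longer reflects every patch, so drop it (next GET reconstructs).
        holder_state["_as_of_snapshot"] = None
```

The key difference from the buggy behaviour described above: the `>=` guard now only fires for out-of-order inserts, so forward-sequential writes keep a warm snapshot instead of forcing a full O(N + M·k) reconstruction on every GET.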
…pattern

TestForwardSimulationBench models the actual use case: month-by-month simulation where each echelon depends on the previous one plus a stochastic transition rule. Parametrised by duration (1 yr / 5 yr) and change rate (10% / 30%).

Used to surface and measure the O(N + M·k) snapshot regression fixed in the previous commit (×1.4 to ×5.4 speedup depending on scenario).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
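A standalone micro-benchmark in the spirit of the one described above might look like this. The function name, parameters, and patch layout are all assumptions for illustration, not the PR's actual test code; a plain list of per-month diffs stands in for the as_of storage.

```python
import time

import numpy as np


def forward_simulation_bench(n=100_000, months=12, change_rate=0.10, seed=0):
    """Month-by-month forward simulation: each month's echelon depends on the
    previous one plus a stochastic transition rule (hypothetical sketch)."""
    rng = np.random.default_rng(seed)
    echelon = np.zeros(n, dtype=np.int16)
    patches = []  # stand-in for sparse as_of patch storage
    start = time.perf_counter()
    for month in range(months):
        moved = rng.random(n) < change_rate        # stochastic transition rule
        echelon = echelon + moved                  # promote the movers one echelon
        patches.append((month, np.flatnonzero(moved), echelon[moved]))
    return time.perf_counter() - start, patches
```

Parametrising `months` (12 vs 60) and `change_rate` (0.10 vs 0.30) reproduces the four scenarios from the benchmark table.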
force-pushed from b1374c4 to f5583ab
…f test and benchmark

Made-with: Cursor
@benjello and @eraviart, have you read all the changes? Do you understand them? @benjello, can you add use cases for these PRs coded with Claude Code and consorts? It's a highly technical change, and I am a bit worried about such a PR, because the tests and the code are very complex and it is really hard to have a critical point of view on the added lines of code.
Yes, I did as much as I could, and I tested on a real implementation soon to be exposed to the public.
@guillett you can find more information here: https://github.com/openfisca/openfisca-core/tree/master/docs/implementation
This series of PRs, in particular #1363, has broken production. The fix proposed in #1371 is again generated with Claude, but does not seem to address the root cause. Testing “on a real implementation soon to be exposed to public” is NOT sufficient for software that is used in production in many different environments.
I read the changes and I made sure the tests passed. I tested as much as I could. But we may need to test on a real package (which I did), and we may want to test on France, because it uses all the features every time, so we would have to implement that in the workflow. And if your question is: do I want to use agents? My answer is yes!
Thank you for this reply.

Testing the impact of changes is necessary but not sufficient. Understanding the changes that are applied is also necessary. No testing procedure can ever replace that. Do you understand the big-O notation that is used in the “comparison table” that was generated by an LLM in the description of this pull request?

Few people have the permission to greenlight pushes to Core. And, as always, with great power comes great responsibility.

The question is not about tooling (using agents or not). You could be using your mind to generate code, an army of interns, or an LLM; it's quite the same to me (except on copyright issues, but that's a different topic that we will also need to discuss). What matters is that the people who author and review code take full responsibility for shipping into production, and thus for handling the consequences of their mistakes when they arise, and this is not what I see happening here.

In the end, other contributors who did NOT author, review, nor ship these changes ended up working against the clock to investigate the production issues that arose and to yank the published versions; basically, to clean up the mess. This is not fair to them, and it negatively impacts the entire OpenFisca community.

Will either of you take the time to cleanly revert the changes in the codebase now, in order to prevent the next version bump from publishing again code that breaks production? Will either of you take the time to clean-install and restart the demonstration APIs that have been knocked out following this rushed review? The mess is not cleaned up yet, and everyone already has too much work on their hands.

I completely understand that the pace of updates and reviews on Core can be slow and frustrating. However, until we secure hundreds of thousands of euros in yearly funding, this will stay that way. And breaking production is not exactly the best way to secure such funding.

You are always welcome to use working branches or a fork to try out new features. Shipping to the main branch and triggering a deployment literally impacts hundreds of production services around the world. Luckily not in real time, but still. There are 200 to 500 daily installs of Core; that's about one install every 4 minutes. That's a lot of broken installs in the hours between your deployment of this changeset and the yanking, and not an easy revert for those who already deployed it.

Hardening the supply chain and increasing the pace of updates are both important endeavours. I am looking forward to establishing work on that topic in the coming months. Until mid-April, considering that many senior contributors are mobilised on public events (RaC EU conference, OpenFisca Conference), I ask you to please be extra careful with deployments, in order to avoid the extra cleanup burden and the reputational risk that comes with production incidents.

I understand that “being careful” might not be clear enough in the context of the supply chain, so I'll spell it out: just don't press “merge” if you don't understand 100% of the lines in the changeset, and if you're not fully equipped and ready to handle an emergency rollback in case things go wrong after deployment. Thank you.
Context
Follows #1365, which introduced `as_of` variables with dense storage.

Dense storage means one full array per `set_input` call — even when only 0.5% of individuals changed value that month, we stored 100% of the data.

For 1M people, 0.5% change rate, 120 months: ~240 MB vs ~4 MB with patches.
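As a back-of-envelope check of those figures, assuming a 2-byte value dtype (which matches the 240 MB dense figure) and 4-byte patch indices — the PR does not state the actual dtypes or patch layout:

```python
# Assumed layout: 2-byte values, 4-byte int32 indices (not stated in the PR).
N = 1_000_000          # individuals
MONTHS = 120
CHANGE_RATE = 0.005    # 0.5% of individuals change value each month
VALUE_BYTES = 2
INDEX_BYTES = 4

# Dense: one full array stored per month.
dense_mb = N * VALUE_BYTES * MONTHS / 1e6

# Sparse: one full base array, then one (indices, values) patch per month.
changed_per_month = int(N * CHANGE_RATE)
patch_mb = (N * VALUE_BYTES
            + (MONTHS - 1) * changed_per_month * (INDEX_BYTES + VALUE_BYTES)) / 1e6

print(dense_mb)              # 240.0
print(round(patch_mb, 1))    # 5.6
```

With this assumed layout the patch storage comes out around 5-6 MB rather than the PR's ~4 MB; the exact figure depends on dtype and patch layout, but the tens-of-times reduction is the same order of magnitude.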
What changed
Storage model (`openfisca_core/holders/holder.py`):
- `as_of` values are no longer stored in the `_memory_storage` dict
- instead: `_as_of_base` + `_as_of_patches` list, one sparse patch per `set_input`
- `_memory_storage` is no longer used for `as_of` variables

GET performance (snapshot cursor):
- first access: O(N) (base copy)
- subsequent forward GETs: O(k) (apply only new patches since the snapshot)
- cache hit: O(1) (same instant as the snapshot)
- backward jump: O(N + k·P) (full reconstruction)
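The snapshot-cursor GET path can be sketched as a pure function. The signature and data shapes here are hypothetical (sorted `patches` list of `(instant, idx, vals)`, `snapshot` as `(instant, array, cursor)` or `None`); snapshot invalidation after retroactive writes is assumed to happen on the SET side, as in the PR.

```python
import bisect

import numpy as np


def get_with_snapshot(base, patches, instants, snapshot, instant):
    """Sketch of the snapshot-cursor GET (hypothetical signature).

    Returns (array, new_snapshot). `instants` is the parallel sorted list of
    patch instants used for the bisect lookup.
    """
    cursor = bisect.bisect_right(instants, instant)
    if snapshot is not None:
        snap_instant, snap_array, snap_cursor = snapshot
        if snap_instant == instant:
            return snap_array, snapshot            # O(1) cache hit
        if snap_instant < instant:
            # Forward move: apply only the patches past the snapshot cursor.
            result = snap_array.copy()
            for _, idx, vals in patches[snap_cursor:cursor]:
                result[idx] = vals
            return result, (instant, result, cursor)
    # Backward jump or cold start: O(N) full reconstruction from the base.
    result = base.copy()
    for _, idx, vals in patches[:cursor]:
        result[idx] = vals
    return result, (instant, result, cursor)
```

The cache-hit branch returns the snapshot array itself, which is what the "same array object returned" test in the Test plan below exercises.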
Memory savings for N=1M, 0.5% monthly change, 120 months:
- before (dense): ~240 MB
- after (patches): ~4 MB (~60x reduction)
Test plan
- `tests/core/test_asof_variable.py` — all pass locally
- `test_asof_no_patch_when_value_unchanged`: no patch stored when identical
- `test_asof_patch_stores_only_changed_indices`: sparse diff correctness
- `test_asof_retroactive_patch`: out-of-order `set_input` with snapshot invalidation
- `test_asof_snapshot_cursor_no_copy_between_patches`: same array object returned for periods with no intermediate patches

🤖 Generated with Claude Code