
perf: sparse patch storage for as_of variables (~60x memory reduction)#1366

Merged
benjello merged 5 commits into master from feat/as-of-patches on Mar 4, 2026

Conversation

@benjello
Member

Context

Follows #1365 which introduced as_of variables with dense storage.
Dense storage means one full array per set_input call — even when only
0.5% of individuals changed value that month, we stored 100% of the data.
For 1M people, 0.5% change rate, 120 months: 240 MB vs ~4 MB with patches.

What changed

Storage model (openfisca_core/holders/holder.py):

| | Before (dense) | After (patches) |
|---|---|---|
| Storage structure | `_memory_storage` dict | `_as_of_base` + `_as_of_patches` list |
| Memory per `set_input` | O(N) — full array | O(k) — changed indices only |
| `_memory_storage` used for as_of | Yes | No (bypassed entirely) |
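The mechanism can be sketched roughly as follows. This is an illustrative guess at the patch-based model, not the actual `holder.py` code: the attribute names come from this PR's description, but the class, method names, and diffing logic are assumptions.

```python
import bisect
import numpy as np

class AsOfPatchStorage:
    """Illustrative sketch of base-plus-sparse-patches storage for as_of
    variables. Not the openfisca-core implementation."""

    def __init__(self):
        self._as_of_base = None          # first full array, set once (read-only)
        self._as_of_patches = []         # sorted list of (instant, indices, values)
        self._as_of_patch_instants = []  # parallel list for bisect lookup

    def set(self, instant, value):
        if self._as_of_base is None:
            self._as_of_base = value.copy()   # O(N), but only once
            return
        current = self.get(instant)           # state just before this write
        changed = np.flatnonzero(current != value)
        if changed.size == 0:
            return                            # no patch stored when identical
        pos = bisect.bisect_right(self._as_of_patch_instants, instant)
        self._as_of_patch_instants.insert(pos, instant)   # keeps lists sorted
        self._as_of_patches.insert(pos, (instant, changed, value[changed]))

    def get(self, instant):
        result = self._as_of_base.copy()      # O(N) base copy
        cutoff = bisect.bisect_right(self._as_of_patch_instants, instant)
        for _, idx, vals in self._as_of_patches[:cutoff]:
            result[idx] = vals                # O(k) per applicable patch
        return result
```

Each `set` stores only the changed indices and their new values, which is what makes memory cost O(k) per write instead of O(N).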

GET performance (snapshot cursor):

| Access pattern | Cost |
|---|---|
| Exact cache hit (same instant) | O(1) |
| Forward sequential (no new patches) | O(1) — same array reused |
| Forward sequential (new patches) | O(k) — apply only new patches |
| Backward jump / first access | O(N + k×P) — full reconstruction |
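A minimal sketch of that cursor logic, assuming a `(instant, array, cursor)` snapshot tuple as described in the commit message (function name and exact shape are mine, for illustration only):

```python
import numpy as np

def get_with_snapshot(base, patches, snapshot, instant):
    """Sketch of a snapshot-cursor GET. `patches` is a sorted list of
    (instant, indices, values); `snapshot` is (instant, array, cursor)
    or None. Illustrative, not the actual holder code."""
    if snapshot is not None and snapshot[0] == instant:
        return snapshot[1], snapshot                   # exact cache hit: O(1)
    if snapshot is not None and snapshot[0] < instant:
        result, cursor = snapshot[1], snapshot[2]
        new = [p for p in patches[cursor:] if p[0] <= instant]
        if not new:
            return result, (instant, result, cursor)   # O(1): same array reused
        result = result.copy()
        for _, idx, vals in new:
            result[idx] = vals                         # O(k): only new patches
        return result, (instant, result, cursor + len(new))
    # Backward jump or first access: full reconstruction, O(N + k*P).
    result = base.copy()
    applied = [p for p in patches if p[0] <= instant]
    for _, idx, vals in applied:
        result[idx] = vals
    return result, (instant, result, len(applied))
```

In the forward-sequential case with no intervening patches, the very same array object is handed back, which is what `test_asof_snapshot_cursor_no_copy_between_patches` checks.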

Memory savings for N=1M, 0.5% monthly change, 120 months:

  • Dense: ~240 MB
  • Patches: ~4 MB → ~60× reduction
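The arithmetic behind those figures can be reproduced under an assumption the PR does not state explicitly: 2-byte values (e.g. small integer echelon codes) and 4-byte patch indices are the dtypes that make the quoted ~240 MB and ~4 MB come out.

```python
N = 1_000_000          # individuals
months = 120           # 10 years of monthly set_input calls
change_rate = 0.005    # 0.5% of individuals change value each month
value_bytes = 2        # ASSUMED: 2-byte values (e.g. int16 echelon codes)
index_bytes = 4        # ASSUMED: int32 patch indices

dense = months * N * value_bytes                          # one full array per month
patches = months * int(N * change_rate) * (value_bytes + index_bytes)

print(f"dense:   {dense / 1e6:.0f} MB")     # 240 MB
print(f"patches: {patches / 1e6:.1f} MB")   # 3.6 MB
print(f"ratio:   ~{dense / patches:.0f}x")  # ~67x
```

With wider dtypes (e.g. float64 values, int64 indices) the absolute numbers scale up, but the ratio stays in the same ~60× range because both sides scale together.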

Test plan

  • 18 tests in tests/core/test_asof_variable.py — all pass locally
  • Full test suite: 487 passed, 0 new failures
  • test_asof_no_patch_when_value_unchanged: no patch stored when identical
  • test_asof_patch_stores_only_changed_indices: sparse diff correctness
  • test_asof_retroactive_patch: out-of-order set_input with snapshot invalidation
  • test_asof_snapshot_cursor_no_copy_between_patches: same array object returned for periods with no intermediate patches

🤖 Generated with Claude Code

Member

@eraviart eraviart left a comment


Much needed complement of as_of variables

@benjello benjello force-pushed the feat/as-of-variable branch 2 times, most recently from 32ad81b to 8beb444 on March 4, 2026 10:49
Base automatically changed from feat/as-of-variable to master March 4, 2026 10:55
benjello and others added 4 commits March 4, 2026 11:57
Instead of storing one full array per set_input call, as_of variables
now use a base array + list of (instant, indices, values) sparse patches.

Storage model:
- _as_of_base       : first full array (read-only), set once
- _as_of_patches    : sorted list of sparse diffs (Instant, idx, vals)
- _as_of_patch_instants: parallel bisect index for O(log P) lookup
- _as_of_snapshot   : (Instant, array, patch_cursor) for incremental GETs
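The O(log P) lookup mentioned above works because `_as_of_patch_instants` is kept sorted, so `bisect` can find how many patches apply at a queried instant. A minimal illustration (instants simplified to integers; the list contents are made up):

```python
import bisect

# Hypothetical sorted instant index, parallel to the patch list.
patch_instants = [202201, 202203, 202206, 202211]

def patches_up_to(instant):
    """Index one past the last patch with instant <= `instant` (O(log P))."""
    return bisect.bisect_right(patch_instants, instant)

print(patches_up_to(202204))  # 2: the patches at 202201 and 202203 apply
print(patches_up_to(202201))  # 1: bisect_right includes the equal instant
print(patches_up_to(202112))  # 0: before the first patch
```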

Memory savings for 0.5% monthly change rate, N=1M, 120 months:
  before (dense): ~240 MB    after (patches): ~4 MB    (~60x reduction)

GET performance (forward-sequential, snapshot cursor):
  first access : O(N)         (base copy)
  subsequent   : O(k)         (apply only new patches since snapshot)
  cache hit    : O(1)         (same instant as snapshot)
  backward jump: O(N + k*P)   (full reconstruction — degrades gracefully)

Other changes:
- _set routes as_of writes to _set_as_of, bypassing _memory_storage
- get_array short-circuits _memory_storage for as_of variables
- _register_instant removed (replaced by sorted insert in _set_as_of)
- clone() shares read-only base, deep-copies patch lists, resets snapshot
- delete_arrays() leaves _memory_storage empty for as_of (no-op, harmless)
- Retroactive set_input supported (out-of-order patches, snapshot invalidation)
- 18 tests: 14 semantics + 4 new (no-patch-on-unchanged, diff correctness,
  retroactive patch, snapshot cursor reuse)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ruction

In the forward-sequential GET→SET→GET pattern (the real use case for
as_of variables: each month's echelon depends on the previous one),
_set_as_of was unconditionally clearing the snapshot after every write.
This forced the next get_array(M) to fully reconstruct from the base +
all M patches — O(N + M·k) — making the total simulation cost quadratic.

Root cause: _reconstruct_at(instant) advanced the snapshot to `instant`
during the diff computation inside _set_as_of, so the invalidation
condition `snapshot[0] >= instant` was always true (equality case).

Fix: distinguish forward vs retroactive SETs.
- Forward (patch appended at end of list): store value.copy() as the new
  snapshot so the next GET(instant) is an O(1) exact cache hit.
- Retroactive (out-of-order insert): keep the existing >= guard to
  prevent a stale snapshot from silently ignoring the new patch.
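That forward/retroactive distinction can be sketched as a small decision in the SET path. The function name, signature, and snapshot shape here are mine; the logic paraphrases the commit message above.

```python
import numpy as np

def snapshot_after_set(snapshot, instant, value, patch_count, appended_at_end):
    """Illustrative snapshot policy after a write. `snapshot` is
    (instant, array, cursor) or None; `appended_at_end` is True when the
    new patch sorts after every existing one (a forward SET)."""
    if appended_at_end:
        # Forward SET: cache the freshly written value so the next
        # get(instant) is an O(1) exact hit, not a full reconstruction.
        return (instant, value.copy(), patch_count)
    if snapshot is not None and snapshot[0] >= instant:
        # Retroactive SET at or before the snapshot instant: the cached
        # array would silently ignore the new patch, so drop it.
        return None
    return snapshot  # snapshot predates the retroactive patch: still valid
```

The buggy version invalidated unconditionally after every write, so each forward GET in the GET→SET→GET loop paid the full O(N + M·k) reconstruction, making the whole simulation quadratic.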

Benchmark (N=1M, forward sim):
  1 yr, 10% change/month:   66 ms →  46 ms  (×1.4)
  5 yr, 10% change/month:  772 ms → 188 ms  (×4.1)
  5 yr, 30% change/month: 1598 ms → 294 ms  (×5.4)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…pattern

TestForwardSimulationBench models the actual use case: month-by-month
simulation where each echelon depends on the previous one plus a
stochastic transition rule. Parametrised by duration (1yr / 5yr) and
change rate (10% / 30%).

Used to surface and measure the O(N + M·k) snapshot regression fixed in
the previous commit (×1.4 to ×5.4 speedup depending on scenario).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@benjello benjello force-pushed the feat/as-of-patches branch from b1374c4 to f5583ab on March 4, 2026 10:58
@guillett
Member

guillett commented Mar 4, 2026

@benjello and @eraviart, have you read all the changes? Do you understand them?

@benjello, can you add use cases for those PRs coded with Claude Code and consorts?

It's a highly technical change and I am a bit worried by such a PR, because the tests and the code are very complex and it is really hard to have a critical point of view on the added lines of code.

@benjello benjello merged commit ff0d6d6 into master Mar 4, 2026
24 checks passed
@benjello benjello deleted the feat/as-of-patches branch March 4, 2026 13:38
@benjello
Member Author

benjello commented Mar 4, 2026

Yes, I did as much as I could, and I tested it on a real implementation soon to be exposed to the public.
It is based on a real use case: progression on a salary scale.

@benjello
Member Author

benjello commented Mar 4, 2026

@guillett
Member

guillett commented Mar 4, 2026

Did someone, @benjello and @eraviart, critically review this PR? We may be creating black boxes with open source code. Personally, I write code for humans (myself included), so if OpenFisca Core is now mainly developed for machines (to write and read), it is an important change in code governance.

@MattiSG
Member

MattiSG commented Mar 5, 2026

This series of PRs, in particular #1363, has broken production. The fix proposed in #1371 is again generated with Claude but does not seem to address the root cause. Testing "on a real implementation soon to be exposed to public" is NOT sufficient for software that is used in production in many different environments.

What is your plan @benjello @eraviart for correcting this?

@benjello
Member Author

benjello commented Mar 5, 2026

I read the changes and I made sure the tests passed.

I tested as much as I could.

But we may need to test on a real package (which I did), and we may test on France because it uses all the features every time, so we would have to implement that in the workflow.

And if your question is: do I want to use agents? My answer is yes!

@MattiSG
Member

MattiSG commented Mar 6, 2026

Thank you for this reply.

Testing the impact of changes is necessary but not sufficient. Understanding the changes that are applied is also necessary. No testing procedure can ever replace that.

  • Do you understand the big-O notation that is used in the “comparison table” that was generated by an LLM in the description of this pull request?
  • Did you notice that the Lint: black and flake8 claim in the changelog was not actually done on the entirety of the mentioned files?
  • Do you realise that this changeset added a benchmark file that creates a “hypothetical dense storage” that has absolutely no relevance to the features offered by the codebase?
  • Can you explain in plain language what the TestForwardSimulationBench method does?
  • Are you 100% sure that changing the conditions order and removing the `if value is not None: return value` statement in the holder code has no side-effect?

Only if the answer to all of these questions is “yes” for both author and reviewer, and if both are ready to oversee the deployment and potential rollback, is it fine to push a changeset like this one into production.

Few people have the permission to greenlight pushes to Core. And, as always, with great power comes great responsibility.

The question is not about tooling (using agents or not). You could be using your mind to generate code, an army of interns, or an LLM, it's quite the same to me (except on copyright issues, but that's a different topic that we will also need to discuss). What matters is that the people who author and review code take full responsibility for shipping into production, and thus handling the consequences of their mistakes when they arise, and this is not what I see happening here. In the end, other contributors who did NOT author nor review nor ship these changes ended up working against the clock to investigate the production issues that arose and to yank the published versions; basically, to clean up the mess. This is not fair to them, and it impacts negatively the entire OpenFisca community.

Will either of you take the time to cleanly revert the changes in the codebase now, in order to prevent the next version bump from publishing again code that breaks production? Will either of you take the time to clean install and restart the demonstration APIs that have been knocked out following rushed review? The mess is not cleaned yet, and everyone already has too much work on their hands.

I completely understand that the pace of updates and reviews on Core can be slow and frustrating. However, until we secure hundreds of thousands of euros in yearly funding, this will stay this way. And breaking production is not exactly the best way to secure such funding.

You are always welcome to use working branches or a fork to try out new features. Shipping to the main branch and triggering a deployment is literally impacting hundreds of production services around the world. Luckily not in real time, but still. There are 200 to 500 daily installs of Core. That's about one install every 4 minutes. That's a lot of broken installs in the hours between your deployment of this changeset and the yanking, and not an easy revert for those who already deployed it.

Hardening the supply chain and increasing the pace of update are both important endeavours. I am looking forward to establishing work on that topic in the coming months.

Until mid-April, considering that many senior contributors are mobilised on public events (RaC EU conference, OpenFisca Conference), I ask to please be extra careful in deployments in order to avoid the extra cleanup burden and the reputational risk that comes with production incidents. I understand that “being careful” might not be clear enough in the context of supply chain, so I'll spell it out: just don't press “merge” if you don't understand 100% of the lines in the changeset and if you're not fully equipped and ready to handle an emergency rollback in case things go wrong after deployment.

Thank you.
