I received access to the model referenced in discussion #790. There are a few issues affecting that model that need to be fixed in SDEverywhere. The following describes one of those defects.
Current Behavior
Overview
The expandedRefIdsForVar() function in packages/compile/src/model/read-equations.js determines which RHS variable instances are referenced by a LHS variable. It uses a cartesian product algorithm that is extremely slow for models with large subscript dimensions.
The algorithm (lines 1024-1045):
- Computes the cartesian product of all LHS subscript dimensions → array of combo strings
- For each RHS variable instance, computes the cartesian product of its subscript dimensions → another array of combo strings
- Does brute-force Array.includes() string matching between the two sets
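The steps above can be sketched roughly as follows. This is a simplified illustration, not the actual code in read-equations.js: the names cartesianProduct and referencesBruteForce are hypothetical, and each variable instance is assumed to be an array of positions, where each position lists the subscript indices it can take.

```javascript
// Hypothetical sketch of the brute-force approach (not the SDEverywhere source).

// Expand a list of positions into every combination, as comma-joined strings.
function cartesianProduct(positions) {
  let combos = [[]]
  for (const indices of positions) {
    // Each pass multiplies the number of combos by indices.length.
    combos = combos.flatMap(combo => indices.map(index => [...combo, index]))
  }
  return combos.map(combo => combo.join(','))
}

// Return true if any RHS combo string also appears among the LHS combo strings.
function referencesBruteForce(lhsPositions, rhsPositions) {
  const lhsCombos = cartesianProduct(lhsPositions)
  const rhsCombos = cartesianProduct(rhsPositions)
  // Array.includes() is a linear scan, so this is a nested loop over both sets.
  return rhsCombos.some(combo => lhsCombos.includes(combo))
}

const lhs = [['c1', 'c2'], ['m1', 'm2', 'm3']]
console.log(cartesianProduct(lhs).length) // 6
console.log(referencesBruteForce(lhs, [['c2'], ['m3']])) // true
console.log(referencesBruteForce(lhs, [['c3'], ['m1']])) // false
```

Both the expansion and the matching grow with the product of the dimension sizes, which is where the cost comes from in this sketch.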
The model in discussion #790 has variables that use multiple dimensions, for example 36 countries × 56 materials × 22 kinds = 44,352 combinations per variable. Because the algorithm is O(product of dimension sizes) per call, the 112,753 calls made during model analysis add up to ~7 minutes of build time for what could be a ~35 second build.
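To make the gap concrete, here is the arithmetic for the example dimensions quoted above, comparing the product of the sizes with their sum. The variable names are illustrative only.

```javascript
// Dimension sizes taken from the example in this issue.
const dimensionSizes = { countries: 36, materials: 56, kinds: 22 }
const sizes = Object.values(dimensionSizes)

// Work proportional to the product of the sizes (one entry per combination).
const combosPerVariable = sizes.reduce((acc, n) => acc * n, 1)

// Work proportional to the sum of the sizes (one entry per index per dimension).
const indicesAcrossDimensions = sizes.reduce((acc, n) => acc + n, 0)

console.log(combosPerVariable) // 44352
console.log(indicesAcrossDimensions) // 114
```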
Profiling Data
BEFORE optimization:
expandedRefIdsForVar: 421,749ms across 112,753 calls
Total build: ~449,000ms (~7.5 min)
AFTER optimization:
expandedRefIdsForVar: 3,443ms across 112,753 calls
Total build: ~35,000ms (~35 sec)
Expected Behavior
The function should produce the same results but use an O(sum of dimension sizes) algorithm instead of O(product of dimension sizes).
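One way to get O(sum of dimension sizes) behavior is to compare the two variables position by position instead of expanding combinations. The sketch below is an assumption about the general shape of such an approach, not the fix itself: referencesPerPosition is a hypothetical name, and it assumes each instance covers the full cartesian product of its per-position index lists, in which case two products share a combination exactly when every position shares at least one index.

```javascript
// Hypothetical sketch of a per-position check (not the SDEverywhere source).
// Each argument is an array of positions; each position lists the subscript
// indices that position can take.
function referencesPerPosition(lhsPositions, rhsPositions) {
  // Instances with a different number of subscripts cannot match.
  if (lhsPositions.length !== rhsPositions.length) return false

  return lhsPositions.every((lhsIndices, pos) => {
    // Building the Set costs one step per LHS index at this position.
    const lhsSet = new Set(lhsIndices)
    // Each RHS index is then checked with a constant-time Set lookup.
    return rhsPositions[pos].some(index => lhsSet.has(index))
  })
}

const lhsVar = [['c1', 'c2'], ['m1', 'm2', 'm3']]
console.log(referencesPerPosition(lhsVar, [['c2'], ['m3']])) // true
console.log(referencesPerPosition(lhsVar, [['c3'], ['m1']])) // false
```

The total work is one pass over each position's indices on both sides, so it scales with the sum of the dimension sizes and no combination strings are ever built.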
AI Disclosure
I guided Claude Code (Opus 4.6) to investigate the performance issue and to propose a fix. I refactored and refined the fix and tests myself. I also used Claude to help prepare a summary of the problem to help populate the issue description above.
/cc @LSarribouette @m2gi-mohamami