Change PyEmu's RNG to be mostly locally scoped#689

Merged
briochh merged 10 commits into pypest:develop from mmorphew:develop
Mar 18, 2026

@mmorphew mmorphew commented Mar 11, 2026

This PR changes PyEmu’s RNG to be local rather than global, based on the discussion in #688. This protects both the user and PyEmu. When PyEmu’s RNG depends on the global seed, it can be altered by the user or by any imported package that calls np.random.seed(). A local RNG ensures that results are reproducible regardless of the state of the script that calls PyEmu functions: if the global seed is changed by the user or another package, PyEmu’s draws will not change, because nearly all draw functions now rely on local rng objects.
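A minimal sketch of the failure mode described above, using only NumPy. The SEED value and the two helper functions are placeholders for illustration, not PyEmu code:

```python
import numpy as np

SEED = 1234  # placeholder value, not necessarily PyEmu's actual seed


def draw_global(n):
    """Draw from NumPy's shared global state (the old pattern)."""
    return np.random.normal(size=n)


def draw_local(n, rng):
    """Draw from an explicitly supplied generator (the new pattern)."""
    return rng.normal(size=n)


# First run: nothing else touches the global seed.
np.random.seed(SEED)
global_a = draw_global(3)
local_a = draw_local(3, np.random.RandomState(SEED))

# Second run: other code reseeds the global state before the draw,
# e.g. the user's script or a third-party import.
np.random.seed(SEED)
np.random.seed(98765)
global_b = draw_global(3)
local_b = draw_local(3, np.random.RandomState(SEED))

print(np.array_equal(global_a, global_b))  # global draws were affected
print(np.array_equal(local_a, local_b))    # local draws were not
```

The local generator carries its own state, so reseeding the global state in between has no effect on it.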

The goal of this PR is to move RNG to a local scope while maintaining the hard-sought reproducibility of the draws that PyEmu makes for ensembles and other cases where values must be drawn from a distribution. Draws made by PyEmu should not change post-merge, and the fact that all tests pass gives me confidence in this. Some tests compare PyEmu output against a pickled state, and because PyEmu output still matches those saved base cases, I take it that the RNG sequence has remained intact even though the implementation has changed.
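The reason draws can stay bit-for-bit identical is that a RandomState seeded with a value reproduces the stream obtained by seeding the global state with that same value. A small check of that property, with a placeholder SEED (this is not one of PyEmu's tests):

```python
import numpy as np

SEED = 1234  # placeholder value

# Old pattern: seed the global state, then draw from np.random.
np.random.seed(SEED)
legacy_normal = np.random.normal(0.0, 1.0, size=5)
legacy_uniform = np.random.uniform(size=5)

# New pattern: a local RandomState seeded with the same value.
rng = np.random.RandomState(SEED)
local_normal = rng.normal(0.0, 1.0, size=5)
local_uniform = rng.uniform(size=5)

# Same algorithm, same seed, same call order: identical draws.
assert np.array_equal(legacy_normal, local_normal)
assert np.array_equal(legacy_uniform, local_uniform)

# By contrast, the newer Generator API uses a different bit generator,
# so the same seed gives a different sequence.
new_api_normal = np.random.default_rng(SEED).normal(0.0, 1.0, size=5)
assert not np.array_equal(legacy_normal, new_api_normal)
```

This is why switching to np.random.default_rng would have changed existing results, whereas RandomState does not.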

The main changes occur in en.py, where np.random.seed(SEED) is now replaced with rng = np.random.RandomState(SEED). RandomState is NumPy’s legacy generator class; seeding it with SEED yields the same sequence as calling np.random.seed(SEED) and drawing from np.random, and it ensures consistency regardless of the NumPy version in the environment that is using PyEmu. By default, functions or methods that rely on rng point to the RandomState object created at the start of en.py. Functions in en.py, helpers.py, geostats.py and pst_from.py are also altered to accept a local “rng” object. This allows tests that previously manipulated the global seed to still pass, and it also allows users who want a different generator, for whatever reason, to draw using their own. For the vast majority of users, the rng argument will remain set to None for all draw functions, resulting in the use of the RandomState object at the top of en.py (which produces the same sequences as the current development branch).
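The rng=None default described above might look roughly like the following. This is a sketch of the pattern only: _default_rng and uniform_draw are hypothetical names, and the real signatures in en.py may differ.

```python
import numpy as np

SEED = 1234  # placeholder value

# Module-level generator, created once when the module is imported.
_default_rng = np.random.RandomState(SEED)  # hypothetical name


def uniform_draw(lower, upper, num_reals, rng=None):
    """Hypothetical draw helper that accepts an optional local generator."""
    if rng is None:
        # Most callers take this path and share the module-level stream.
        rng = _default_rng
    return rng.uniform(lower, upper, size=num_reals)


# Default path: uses the module-level RandomState.
a = uniform_draw(0.0, 1.0, 4)

# A caller (or a test) can pass its own generator instead of reseeding globally.
b = uniform_draw(0.0, 1.0, 4, rng=np.random.RandomState(SEED))

# Any object with a compatible .uniform method works, including the newer Generator.
c = uniform_draw(0.0, 1.0, 4, rng=np.random.default_rng(42))

print(np.array_equal(a, b))  # both started from a fresh stream with the same seed
```

Passing a generator in keeps the caller in control of the stream without touching state that other code depends on.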

The methods whose rng code has been made local are listed below. The list is intended to be exhaustive, so if any methods are missing, please let me know. There are two special cases that I did not alter:

draw_conditional, which was already set up to accept a seed number, has not been altered. This results in a slight inconsistency in how rng is implemented in draw_conditional vs. other methods (seed number vs. rng object), but I don’t want to cause issues for current users.
The other culprit is DSIVC’s AutoEncoder in dsiae.py. Because TensorFlow’s and NumPy’s RNGs may be entangled under the hood of some of these ML functions, I have left the current global seed code untouched. I am unsure of the order of operations for these functions, but the current implementation leaves prepare_dsivc untouched, so that its draw function still points to the RNG from en.py. Once the AutoEncoder is fit, the global seed for both TensorFlow and NumPy will be set to the user-defined or default random_state value, at which point any np.random or tf.random operations performed afterward in the script will rely on that newly defined seed. Again, this is the same behavior as before. I am less worried about this particular case because this global seed definition is nestled away inside a function definition, rather than activated on import.
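To illustrate the seed-number vs. rng-object difference noted for the first special case, here is a generic comparison. Both functions are hypothetical, and the assumption that a seed-taking function rebuilds its generator on every call is mine, not something confirmed for draw_conditional:

```python
import numpy as np


def draw_with_seed(n, seed=None):
    """Hypothetical seed-number style: a generator is rebuilt on each call."""
    rng = np.random.RandomState(seed)
    return rng.normal(size=n)


def draw_with_rng(n, rng):
    """Hypothetical rng-object style: the caller's stream is advanced."""
    return rng.normal(size=n)


# Seed-number style: the same seed gives the same values on every call.
s1 = draw_with_seed(3, seed=7)
s2 = draw_with_seed(3, seed=7)

# rng-object style: successive calls continue along one stream.
shared = np.random.RandomState(7)
r1 = draw_with_rng(3, shared)
r2 = draw_with_rng(3, shared)

print(np.array_equal(s1, s2))  # repeated calls repeat the draw
print(np.array_equal(r1, r2))  # repeated calls give new values
```

Under that assumption, the two styles are both reproducible but differ in whether repeated calls within one script return the same or fresh values.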

en.py:
The rng at the top of the file has been changed to a RandomState object.
reseed now points to the new RandomState object.
If a function below has a definition for both ObservationEnsemble and ParameterEnsemble, both versions have been altered.
_gaussian_draw
_draw_new_ensemble
from_gaussian_draw
draw_new_ensemble
from_triangular_draw
from_uniform_draw
from_mixed_draws

geostats.py:
draw_arrays
grid_par_ensemble_helper

helpers.py:
autocorrelated_draw
draw_by_group
geostatistical_draws
get_maha_obs_summary

pst_from.py:
draw

(In various other locations and example notebooks, there are scattered np.random calls. These have been changed to point to the rng object in en.py instead.)

I know this is a substantial change and will require thorough review. I appreciate your time. I am investigating the large diffs in dsiae.py and transformers.py. I presume a stray whitespace change or something similar has caused GitHub's diff to overstate the changes.

Runners are now all passing.

@mmorphew mmorphew changed the title DRAFT: Change PyEmu's RNG to be mostly locally scoped Change PyEmu's RNG to be mostly locally scoped Mar 11, 2026

codecov-commenter commented Mar 11, 2026

Codecov Report

❌ Patch coverage is 81.06904% with 85 lines in your changes missing coverage. Please review.
✅ Project coverage is 75.15%. Comparing base (c3ac16d) to head (6059a00).

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| pyemu/emulators/transformers.py | 79.85% | 81 Missing ⚠️ |
| pyemu/utils/geostats.py | 80.00% | 2 Missing ⚠️ |
| pyemu/eds.py | 50.00% | 1 Missing ⚠️ |
| pyemu/plot/plot_utils.py | 0.00% | 1 Missing ⚠️ |

Additional details and impacted files
@@             Coverage Diff             @@
##           develop     #689      +/-   ##
===========================================
+ Coverage    75.14%   75.15%   +0.01%     
===========================================
  Files           37       37              
  Lines        19234    19253      +19     
===========================================
+ Hits         14453    14470      +17     
- Misses        4781     4783       +2     

@briochh briochh merged commit 59f22ee into pypest:develop Mar 18, 2026
12 checks passed