@marioevz
Member

🗒️ Description

This PR refactors the blockchain and state test infrastructure to leverage pytest's native collection mechanism via pytest_collect_file, eliminating redundant JSON file reads and improving test execution efficiency.

Key Improvements

  1. Native pytest Collection
  • Implements pytest_collect_file hook to collect tests directly from JSON files during pytest's discovery phase
  • Each JSON file is now read exactly once during collection, rather than being read multiple times during parameterization and execution
  • Test fixtures are created as pytest Item objects (e.g., BlockchainTestFixture, StateTestFixture) that encapsulate all test data
  2. Eliminated Redundant File I/O
  • Before: JSON files were read during test parameterization (fetch_blockchain_tests) and again during test execution (run_blockchain_st_test)
  • After: JSON files are read once in FixturesFile.collect(), and test data is stored in fixture objects for later execution
  • Removes intermediate dictionaries passing file paths that triggered repeated file reads
  3. Cleaner Architecture
  • Introduces Fixture base class for shared fixture behavior
  • Test execution logic moved into runtest() methods of fixture classes
  • Test metadata (markers, fork info) configured during collection rather than parameterization
  • Eliminates the need for custom idfn functions - pytest handles naming automatically
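
As a rough illustration of the collection flow described above, here is a minimal sketch, not the code from this PR. The names JsonFixturesFile and JsonCase, and the assumption that each JSON file maps test names to test data, are hypothetical; pytest_collect_file, pytest.File, and pytest.Item are pytest's own collection API (the pytest 7+ hook signature is assumed).

```python
import json
from pathlib import Path
from typing import Any, Dict, Iterator, Optional

import pytest


class JsonCase(pytest.Item):
    """One test case taken from a JSON fixture file (hypothetical name)."""

    def __init__(self, *, case_data: Dict[str, Any], **kwargs: Any) -> None:
        super().__init__(**kwargs)
        # The parsed data travels with the item, so running the test
        # does not need to open the file again.
        self.case_data = case_data

    def runtest(self) -> None:
        # Placeholder check only; a real fixture would execute the
        # blocks or transactions described by case_data here.
        assert "postState" in self.case_data, "missing postState"


class JsonFixturesFile(pytest.File):
    """Collector that parses its JSON file once, during collection."""

    def collect(self) -> Iterator[JsonCase]:
        # Assumption: the file is a JSON object mapping names to cases.
        data = json.loads(self.path.read_text())
        for name, case_data in data.items():
            yield JsonCase.from_parent(self, name=name, case_data=case_data)


def pytest_collect_file(
    file_path: Path, parent: pytest.Collector
) -> Optional[JsonFixturesFile]:
    # Only claim JSON files; returning None leaves other files to
    # pytest's default collectors and other plugins.
    if file_path.suffix == ".json":
        return JsonFixturesFile.from_parent(parent, path=file_path)
    return None
```

Placed in a conftest.py, a hook like this makes pytest derive each test ID from the file path and the item name, which is why no custom ID function is needed in this style.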

Performance Impact

This refactoring significantly reduces I/O overhead for large test suites where the same JSON files contain multiple test cases across different forks.

Open Issues

Some tests are still failing and need to be investigated. For now, I'd like to start running this in CI to see how it improves execution speed.

🔗 Related Issues or PRs

N/A.

✅ Checklist

  • All: Ran fast tox checks to avoid unnecessary CI fails, see also Code Standards and Enabling Pre-commit Checks:
    uvx --with=tox-uv tox -e static
  • All: PR title adheres to the repo standard - it will be used as the squash commit message and should start type(scope):.
  • All: Considered adding an entry to CHANGELOG.md.
  • All: Considered updating the online docs in the ./docs/ directory.
  • All: Set appropriate labels for the changes (only maintainers can apply labels).

@marioevz marioevz force-pushed the refactor-json-infra branch from d18197e to 8503878 Compare October 23, 2025 18:41
Comment on lines +270 to +280
# Remove any python files in the downloaded files to avoid
# importing them.
for python_file in glob(
    os.path.join(fixture_path, "**/*.py"), recursive=True
):
    try:
        os.unlink(python_file)
    except FileNotFoundError:
        # Not breaking error, another process deleted it first
        pass

Contributor

This feels... strange? I can't quite put my finger on why.

Like, why do the fixtures contain python files at all? Is there another way we could accomplish the same thing (like excluding a directory)?

I dunno, this just triggers my spidey sense 🤣

Member Author

This is the culprit: https://github.com/ethereum/legacytests/tree/1f581b8ccdc4c63acf5f2c5c1b155c690c32a8eb/src/LegacyTests/Cancun/GeneralStateTestsFiller/Pyspecs

Checking out ethereum/tests at this commit, when submodules are included, results in these python files being checked out too, and when collecting ./tests/json_infra/fixtures for JSON files, pytest tries to collect these files too.

Contributor

Don't we exclude that directory on the command line?

Member Author

I removed that because with this approach the files are collected directly by pytest, as opposed to doing a glob in the test itself.
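
For reference, one possible alternative to deleting the files is to tell pytest to skip them during collection. The sketch below is an assumption-laden illustration, not what this PR does: it uses pytest's pytest_ignore_collect hook (pytest 7+ signature) in a conftest.py, and FIXTURES_DIR_NAME is a hypothetical constant standing in for the downloaded fixtures directory.

```python
from pathlib import Path
from typing import Any, Optional

# Hypothetical: name of the directory that holds downloaded fixtures.
FIXTURES_DIR_NAME = "fixtures"


def pytest_ignore_collect(collection_path: Path, config: Any) -> Optional[bool]:
    """Skip Python files that live anywhere under the fixtures directory."""
    is_python = collection_path.suffix == ".py"
    in_fixtures = FIXTURES_DIR_NAME in collection_path.parts
    if is_python and in_fixtures:
        return True  # ignore: pytest will not try to import this file
    return None  # no opinion: defer to pytest's defaults and other hooks
```

Whether this behaves well with the submodule checkout described above has not been verified here; it only shows the shape of an exclusion-based approach.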

Comment on lines +7 to +8
ALL_FIXTURE_TYPES.append(BlockchainTestFixture)
ALL_FIXTURE_TYPES.append(StateTestFixture)
Contributor

Do these get executed when importing only, for example, .load_state_tests? From my limited knowledge of Python's import machinery, I would guess yes, but I'm just checking.

Member Author

Yes, that's correct: it gets executed only when importing from .helpers. If we were to, for example, import directly from .helpers.fixtures, this logic would not be executed and ALL_FIXTURE_TYPES would be empty, so it is indeed a bit brittle, to be honest.

Contributor

Oh really? I thought parent modules were implicitly imported. I'm glad I checked!

big_memory: Tuple[Pattern[str], ...]


@lru_cache
Contributor

How often is this called to require an lru_cache? O.o

Depending on when the cache is populated (in worker vs. in master), using lru_cache can explode memory: each worker has its own cache.

Member Author

I removed it thinking it might reduce the memory footprint, and it did by half a GB, but it still consumes around 30GB+ because all fixtures are in memory when running.

Contributor

image

chetna-mittal pushed a commit to gnosischain/execution-specs that referenced this pull request Oct 24, 2025
* zkevm: add BLOBHASH benchs

Signed-off-by: Ignacio Hagopian <[email protected]>

* generalize params

Signed-off-by: Ignacio Hagopian <[email protected]>

* improvements

Signed-off-by: Ignacio Hagopian <[email protected]>

---------

Signed-off-by: Ignacio Hagopian <[email protected]>
@SamWilsn
Contributor

I was thinking briefly about this. I also know next to nothing about pytest, so this might not make any sense at all, but...

What if we use an LRU cache for the JSON files (one per worker), and loadgroup all the tests that come from the same file?

So you'd read once during collection, find all the tests and group them by file, then while running the tests you minimize the number of times you need to re-read the same file.
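
The idea above could look roughly like the following sketch. It is an illustration under assumptions, not code from this PR: load_fixture_file, load_case, and the cache size of 4 are hypothetical, and the file layout (a JSON object mapping test names to test data) is assumed. functools.lru_cache is per process, so each worker would hold its own small cache.

```python
import json
from functools import lru_cache
from typing import Any, Dict


@lru_cache(maxsize=4)  # hypothetical size: keep only a few files per worker
def load_fixture_file(path: str) -> Dict[str, Any]:
    """Parse a JSON fixture file, reusing the result for repeated calls."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def load_case(path: str, name: str) -> Any:
    """Fetch one test case; the file is parsed only on a cache miss.

    The returned object is shared with the cache, so callers should
    treat it as read-only.
    """
    return load_fixture_file(path)[name]


# Grouping (assumption about setup): with pytest-xdist run as
# `--dist loadgroup`, items marked with the same `xdist_group` name are
# sent to the same worker, e.g. during collection:
#     item.add_marker(pytest.mark.xdist_group(name=str(path)))
# Tests from one file would then hit that worker's cache instead of
# re-reading the file.
```

The trade-off is memory versus I/O: a small maxsize bounds what each worker keeps resident, at the cost of a re-read when a file is evicted and needed again.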

@marioevz marioevz force-pushed the refactor-json-infra branch from c6408c9 to 0110511 Compare November 3, 2025 23:01
Contributor

@gurukamath gurukamath left a comment

Even though this is a much larger refactor than #1730, I do like this approach since it uses more of the pytest native patterns. So a one-time larger change might be worth it.

)

expected_post_state = load.json_to_state(json_data["postState"])
assert chain.state == expected_post_state
Contributor

I think this is currently not set up to catch tests where the blocks themselves do not throw any exceptions but the overall state comparison fails. I think this is causing the current CI failure.

* fix(tests): remove evm_tools marker from blockchain tests

* remove coverage from json_infra

* enhance(tools): add json_test_name to Hardfork

* fix(tests): handle failing transactions in state tests

* enhance(tests): add from and until fork option to json_infra

* enhance(tests): run json_infra selectively

* enhance(tests): subclass Hardfork

* bug(tests): run all tests for t8n changes

* enhance(tests): minor fix
@marioevz marioevz marked this pull request as ready for review November 20, 2025 15:19
@marioevz marioevz requested a review from SamWilsn November 20, 2025 15:19
@codecov

codecov bot commented Nov 21, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 85.90%. Comparing base (9563a51) to head (09bb76c).
⚠️ Report is 57 commits behind head on forks/osaka.

Additional details and impacted files
@@               Coverage Diff               @@
##           forks/osaka    #1666      +/-   ##
===============================================
- Coverage        86.07%   85.90%   -0.17%     
===============================================
  Files              743      743              
  Lines            44078    44076       -2     
  Branches          3894     3891       -3     
===============================================
- Hits             37938    37865      -73     
- Misses            5659     5722      +63     
- Partials           481      489       +8     
Flag Coverage Δ
unittests 85.90% <ø> (-0.17%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.

This commit refactors exception markers and marks the EEST static tests as slow
@gurukamath gurukamath marked this pull request as draft November 24, 2025 02:00
@gurukamath
Contributor

This PR is almost ready for review. However, I'm moving it to draft in order to resolve the discrepancy between the collected vs. run tests in CI for json_infra.

@gurukamath
Contributor

@SamWilsn This is now ready for review. The failing json_infra tests are unrelated and should be fixed with #1813.

@gurukamath gurukamath marked this pull request as ready for review November 26, 2025 14:21
Contributor

@SamWilsn SamWilsn left a comment

Partial review so far

# Get changed files and save to disk
FILE_LIST="changed_files.txt"
git diff --name-only "$BASE_SHA" "$HEAD_SHA" > "$FILE_LIST"
Contributor

Is BASE_SHA going to be the head of the base branch, or the merge-base of the two branches?

Contributor

BASE_SHA in this case would be the head of the base branch.

Contributor

Hm, so a file added in the base branch would get tested here, even if no changes were made to it in this pull request?

Contributor

@gurukamath gurukamath Nov 27, 2025

The added file itself would not be tested, but the addition might trigger a broader set of tests than what the PR explicitly changes. Perhaps this is not desirable and we should stick to comparing with the merge-base of the two branches. I'll give it a bit more thought.
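
To make the two options in this thread concrete, here is a small sketch of how the diff range differs. The helper names diff_command and changed_files are hypothetical and not part of the CI script; the git behavior relied on is that `git diff A..B` compares the two commits directly, while `git diff A...B` compares the merge-base of A and B against B.

```python
import subprocess
from typing import List


def diff_command(base: str, head: str, use_merge_base: bool) -> List[str]:
    """Build the git command that lists files changed between two commits."""
    # Two dots: compare base and head directly, so files that changed only
    # on the base branch also show up.
    # Three dots: compare merge-base(base, head) with head, so only changes
    # made on the PR branch show up.
    separator = "..." if use_merge_base else ".."
    return ["git", "diff", "--name-only", f"{base}{separator}{head}"]


def changed_files(base: str, head: str, use_merge_base: bool = True) -> List[str]:
    """Run the command in the current repository and return the file list."""
    result = subprocess.run(
        diff_command(base, head, use_merge_base),
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.splitlines()
```

With the merge-base form, a file added on the base branch after the PR branched off would not appear in the list, which matches the concern raised above.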

@SamWilsn SamWilsn merged commit afaa270 into ethereum:forks/osaka Nov 28, 2025
9 checks passed