[Core] Improve reading performance StlIO #14246

Open
loumalouomega wants to merge 20 commits into master from Core/performance-stl-io

loumalouomega commented Mar 2, 2026

📝 Description

Figure_1

This PR refactors StlIO to significantly improve reading performance by deferring node insertion and parallelising entity creation, and adds a Google Benchmark to measure the improvement (approximately 30% less read time, depending on the STL file size).

Motivation

The previous implementation created and added nodes to the ModelPart one by one inside ReadLoop, which triggered an expensive ordered-container search on every insertion. Entity creation (elements, conditions, geometries) was also done serially inside the read loop via a std::function functor passed all the way down the call stack.

How to test

# Build with benchmark support and run:
stl_io_benchmark.exe --benchmark_format=json --benchmark_out=output.json

🆕 Changelog

Performance improvements (stl_io.cpp / stl_io.h)

  • Deferred node insertion – nodes are now constructed locally into a std::vector<Node::Pointer> during facet reading and added to the model part in a single bulk call after the entire solid block is parsed. This avoids repeated searches through the already-inserted node container.
  • Capacity pre-allocation – before reading a solid block the stream position is saved, the file is scanned to count lines (7 lines per facet), and capacity for new_nodes is reserved accordingly with reserve(), eliminating repeated reallocations.
  • Parallel entity creation – after all nodes are collected, elements and conditions are created in parallel with IndexPartition::for_each using thread-local storage (TLS), then added to the model part in a single call. Geometry creation falls back to sequential due to an existing memory issue with parallel geometry construction.
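
The three steps above can be sketched in plain C++ as follows. This is a minimal illustration under stated assumptions, not the code in stl_io.cpp: SketchNode, SketchTriangle, CountFacetsAhead, ReadSolidNodes and CreateTriangles are hypothetical names, nodes are held by value rather than as Node::Pointer, and OpenMP pragmas stand in for IndexPartition::for_each with TLS. Without -fopenmp the pragmas are ignored and the loop simply runs serially.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-ins; these are not the Kratos Node or Element classes.
struct SketchNode { std::size_t id; std::array<double, 3> xyz; };
struct SketchTriangle { std::size_t id; std::array<std::size_t, 3> node_ids; };

// Step 1: save the position, count lines up to "endsolid", then rewind.
std::size_t CountFacetsAhead(std::istream& rInput)
{
    const std::istream::pos_type start = rInput.tellg();
    std::size_t lines = 0;
    std::string line;
    while (std::getline(rInput, line) && line.find("endsolid") == std::string::npos) ++lines;
    rInput.clear();   // drop eof/fail bits so the seek succeeds
    rInput.seekg(start);
    return lines / 7; // ASCII STL: 7 lines per facet
}

// Step 2: collect nodes locally; no mesh container is touched yet.
std::vector<SketchNode> ReadSolidNodes(std::istream& rInput)
{
    std::vector<SketchNode> new_nodes;
    new_nodes.reserve(3 * CountFacetsAhead(rInput)); // 3 vertices per facet
    std::string line, keyword;
    while (std::getline(rInput, line)) {
        std::istringstream tokens(line);
        if (!(tokens >> keyword)) continue; // blank line
        if (keyword == "endsolid") break;
        if (keyword != "vertex") continue;
        SketchNode node{new_nodes.size() + 1, {}};
        tokens >> node.xyz[0] >> node.xyz[1] >> node.xyz[2];
        new_nodes.push_back(node);
    }
    return new_nodes; // the caller would add these in one bulk call
}

// Step 3: one triangle per three consecutive nodes. Each thread fills its
// own buffer, and the buffers are merged into the result once per thread.
std::vector<SketchTriangle> CreateTriangles(const std::vector<SketchNode>& rNodes)
{
    const long n_facets = static_cast<long>(rNodes.size() / 3);
    std::vector<SketchTriangle> triangles;
    triangles.reserve(rNodes.size() / 3);
    #pragma omp parallel
    {
        std::vector<SketchTriangle> local; // thread-local buffer
        #pragma omp for nowait
        for (long i = 0; i < n_facets; ++i) {
            const std::size_t k = 3 * static_cast<std::size_t>(i);
            local.push_back({static_cast<std::size_t>(i) + 1,
                             {rNodes[k].id, rNodes[k + 1].id, rNodes[k + 2].id}});
        }
        #pragma omp critical
        triangles.insert(triangles.end(), local.begin(), local.end());
    }
    // The merge order depends on thread scheduling, so restore id order.
    std::sort(triangles.begin(), triangles.end(),
              [](const SketchTriangle& a, const SketchTriangle& b) { return a.id < b.id; });
    return triangles;
}
```

A caller would run ReadSolidNodes on the stream positioned at the start of a solid block, hand the returned vector to the mesh in one call, and then pass it to CreateTriangles. Keeping a buffer per thread means the shared result is locked once per thread instead of once per entity.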

Refactoring

  • Replaced the std::function<void(ModelPart&, NodesArrayType&)> functor passed through ReadSolid → ReadFacet → ReadLoop with a plain std::vector<Node::Pointer>&, removing unnecessary indirection.
  • ReadPoint now takes std::array<double, 3>& by reference instead of returning a Point value, avoiding an extra allocation and copy.
  • Cleaned up stl_io.h includes (removed redundant <string>, <iostream>, <filesystem>, "includes/define.h"; added "includes/kratos_filesystem.h").
  • Fixed Doxygen group from ApplicationNameApplication to KratosCore.
  • Fixed typo: corresponging → corresponding.
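
The out-parameter style described for ReadPoint could look roughly like the sketch below. It is an assumption-laden illustration rather than the actual member function: ReadPointSketch is a hypothetical free function, the bool return value is invented for the example, and it reads only the three coordinates (whether the real ReadPoint also consumes the vertex keyword is not stated here).

```cpp
#include <array>
#include <cassert>
#include <istream>
#include <sstream>

// Hypothetical analogue of ReadPoint: writes three coordinates into
// caller-owned storage and reports whether all three were parsed.
bool ReadPointSketch(std::istream& rInput, std::array<double, 3>& rCoordinates)
{
    return static_cast<bool>(rInput >> rCoordinates[0] >> rCoordinates[1] >> rCoordinates[2]);
}
```

With this shape the caller can declare a single std::array<double, 3> and reuse it for every vertex of a facet, instead of receiving a new Point object per call.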

New benchmark

  • Added kratos/benchmarks/stl_io_benchmark.cpp: a Google Benchmark suite (BM_StlIO) that measures StlIO::ReadModelPart on a real STL file, with a fresh ModelPart created for each iteration.
  • Added kratos/benchmarks/file.stl as the benchmark input asset.
  • Updated kratos/CMakeLists.txt to build the new benchmark target.

@loumalouomega loumalouomega disabled auto-merge March 3, 2026 14:45
@loumalouomega loumalouomega enabled auto-merge March 3, 2026 16:08