Optimize CI test speed with Minitest parallelization and test infrastructure improvements#2530
Optimize CI test speed with Minitest parallelization and test infrastructure improvements#2530
Conversation
…arse in DSL tests Use Prism.parse instead of shelling out to sorbet --stop-after parser for syntax validation in DSL compiler tests. This eliminates ~0.06-0.5s of subprocess overhead per rbi_for call across ~374 DSL tests.
…ests Default enforce_typechecking to false in MockProject#tapioca since no tests depend on runtime type validation. This reduces subprocess overhead by ~40% per tapioca invocation by skipping sorbet-runtime type checks.
…mmands Skip the overhead of bundle exec by directly invoking ruby with bundler/setup and BUNDLE_GEMFILE. This saves ~0.2-0.3s per tapioca invocation across ~130 calls. Also handles gems.rb as an alternative to Gemfile for projects that use it.
… that don't need it
…id subprocess overhead
…per gem isolation
…lled The lockfile cache was returning early without running bundle install, which meant gems listed in the cached lockfile might not actually be installed in the system gem path. This caused sqlite3 version conflicts on CI where activerecord 7.0.x expected sqlite3 ~> 1.4 but sqlite3 2.9.x was the only version installed. Now the cache only pre-populates the lockfile (skipping resolution) but still runs bundle install to ensure gems are present. Also removes --prefer-local which could cause stale resolution.
… serial performance" This reverts commit d7f8d36.
… print sequentially Each worker's stdout/stderr is redirected to a temp file during execution. When a worker finishes, its output is printed as a contiguous block with clear header/footer separators showing worker number, pass/fail status, elapsed time, and file count. A final summary table is printed at the end. This eliminates the interleaved output problem where multiple workers wrote to the same stdout/stderr simultaneously.
On CI (GITHUB_ACTIONS=true), each worker's output is wrapped in ::group::/::endgroup:: markers creating collapsible log sections. Passed workers are collapsed by default; failed workers get a ::error:: annotation visible in the PR checks summary. Progress lines like '[1/4] ✓ Worker 2 finished in 120s (3 still running)' appear outside groups so they're always visible, giving real-time feedback on which workers have completed. The final SUMMARY block stays outside any group and is always visible. Locally (no GITHUB_ACTIONS), the separator-based format is preserved.
…al file lock
Two race conditions caused sporadic CI failures when running tests in parallel:
1. ETXTBSY: Gem.install('bundler') rewrites binstubs in the shared gem bin
directory. If another worker is simultaneously exec'ing that binstub via
bundle exec, the kernel returns ETXTBSY. Fixed by using cross-process
marker files so Gem.install only runs once (first worker), not per-worker.
2. GemNotFound: Concurrent bundle install processes write gems into the same
GEM_HOME simultaneously. A worker running bundle exec can see partially-
installed gems. Fixed by serializing all bundle install calls under a
global file lock (.bundle_install_global.lock).
The performance impact is minimal because lockfile caching makes most
bundle install calls fast no-ops (~1-2s), and the lock only blocks when
two workers call bundle_install! at the exact same moment.
Also uses atomic writes (write-to-temp + rename) for the lockfile cache
to prevent readers from seeing partially-written lockfiles.
…le install The ETXTBSY race condition occurs when bundle install (writing binstubs/gems into GEM_HOME) runs concurrently with bundle exec (executing those binstubs). Solution: read-write file locking using flock: - bundle_install! takes LOCK_EX (exclusive) — runs alone, no concurrent execs - bundle_exec takes LOCK_SH (shared) — multiple execs run in parallel, but they wait if bundle install holds the exclusive lock This means bundle exec calls across workers can still run concurrently (good for performance), but they never overlap with bundle install (prevents ETXTBSY and GemNotFound from partially-installed gems).
A monitor thread tails each worker's output file every second, parsing minitest's dot output to count completed tests. Every 10 seconds it prints a compact progress line: [50s] W1: 19 tests (ok) | W2: 104 tests (ok) | W3: 148 tests (ok) | W4: done Failures/errors detected in the output are surfaced immediately with a worker prefix, so you don't have to wait for the worker to finish. Only lines consisting entirely of minitest result characters ([.FES]) are counted, avoiding false positives from error messages, stack traces, or forked process output.
Add Tapioca::Helpers::Test::Parallel module that calls parallelize_me! on included test classes, enabling Minitest's built-in thread pool for classes that don't rely on minitest-hooks' before(:all)/after(:all). This replaces the 352-line custom bin/parallel_test runner with a 25-line module included in 12 safe test classes (DslSpec, PipelineSpec, BuilderSpec, and all unit spec classes). CI switches back to bin/test. Measured: full suite 5m34s locally (44% faster than serial baseline).
amomchilov
left a comment
There was a problem hiding this comment.
Still working through reviewing spec/helpers/mock_project.rb, but the rest looks good so far. Left a few questions/comments
| module Dsl | ||
| module Helpers | ||
| class ActiveModelTypeHelperSpec < Minitest::Spec | ||
| include Tapioca::Helpers::Test::Parallel |
There was a problem hiding this comment.
Why not call parallelize_me! directly?
|
|
||
| def wait_until_exists(path) | ||
| Timeout.timeout(4) do | ||
| Timeout.timeout(30) do |
There was a problem hiding this comment.
Is this because we're expecting more CPU contention, now that we might actually be fully saturating multiple threads?
| backtrace_filter = Minitest::ExtensibleBacktraceFilter.default_filter | ||
| backtrace_filter.add_filter(%r{gems/sorbet-runtime}) | ||
| backtrace_filter.add_filter(%r{gems/railties}) | ||
| backtrace_filter.add_filter(%r{tapioca/helpers/test/}) |
There was a problem hiding this comment.
Are we ok with giving this up?
| } | ||
| } | ||
| config.logger = Logger.new('/dev/null') | ||
| config.log_level = :fatal |
There was a problem hiding this comment.
Can you please add a comment to explain that this is a performance optimization?
| @@ -34,6 +34,7 @@ class Dummy < Rails::Application | |||
| } | |||
| } | |||
| config.logger = Logger.new('/dev/null') | |||
There was a problem hiding this comment.
Actually, this would still be making syscalls. Wanna use a NullLogger like so?
class NullLogger
include Singleton
# : (*untyped) -> nil
def unknown(*) = nil
# : (*untyped) -> nil
def fatal(*) = nil
# : (*untyped) -> nil
def error(*) = nil
# : (*untyped) -> nil
def warn(*) = nil
# : (*untyped) -> nil
def info(*) = nil
# : (*untyped) -> nil
def debug(*) = nil
# : (untyped) -> nil
# : () -> bool
def fatal? = false
# : () -> bool
def error? = false
# : () -> bool
def warn? = false
# : () -> bool
def info? = false
# : () -> bool
def debug? = false
def level=(_)
nil
end
end| config.logger = Logger.new('/dev/null') | |
| config.logger = NullLogger.instance |
I'm curious to measure how much it might help
Motivation
CI test runs were taking ~22 minutes on average (up to ~27 minutes for the slowest matrix job). Most of this time was spent in subprocess overhead: each CLI test creates a temporary project, runs
bundle install, and invokesbundle exec tapioca <command>— spawning ~95bundle installprocesses and manytapiocasubprocesses per run.Implementation
This PR introduces serial test infrastructure optimizations and in-process parallel test execution via Minitest's built-in
parallelize_me!, reducing CI time by ~34%.Serial optimization breakdown (cumulative):
bundle installflags (--jobs=4 --quiet --retry=0)ENFORCE_TYPECHECKING=0in tapioca subprocessesTAPIOCA_SKIP_VALIDATION=1in tapioca subprocessesconfigure!replacing subprocess callsParallel execution via
parallelize_me!A
Tapioca::Helpers::Test::Parallelmodule (25 lines) callsparallelize_me!on any test class that includes it, enabling Minitest's built-in thread pool for safe classes. Thread count is controlled by theMT_CPUenv var (defaults toEtc.nprocessors).12 test classes include the module:
DslSpec(+ 38 subclasses) — 374 DSL compiler tests that use fork-based isolationPipelineSpec— 120 gem pipeline tests with fork-based isolationBuilderSpec,CompilerSpec,ExecutorSpec,RBIHelperSpec,SorbetHelperSpec,ReflectionSpec,GenericTypeRegistrySpec,ActiveModelTypeHelperSpec,GraphqlTypeHelperSpec,LockFileDiffParserSpec— unit tests with no shared stateClasses using minitest-hooks'
before(:all)/after(:all)(allSpecWithProjectsubclasses) remain serial, sinceparallelize_me!bypasses thewith_info_handlerlifecycle that minitest-hooks relies on.Gemfile.lock caching (
spec/helpers/mock_project.rb)bundle installafter cache hit to ensure gems are actually installed, but skips slow dependency resolutionbundle installcalls under a global exclusive file lock to prevent concurrent gem directory corruptionSubprocess overhead reduction
ENFORCE_TYPECHECKING=0: Disables sorbet runtime type checking in tapioca subprocesses (~40% faster per invocation)TAPIOCA_SKIP_VALIDATION=1: Skips sorbet static validation in test subprocesses (tests that specifically validate this behavior opt out)configure!method replacestapioca("configure")subprocess calls where possiblePrism.parsein DSL compiler testsMinor test infrastructure improvements
config.log_level = :fatalfor Rails logger--jobs=4 --quiet --retry=0flags forbundle installaddon_spectimeout from 4s to 30s for CI resilienceCI Results
All 14 CI matrix jobs pass (Ruby 3.2/3.3/3.4/4.0/head × Rails 8.0/current/main).
Tests
Existing test suite is unchanged and all tests pass. The
bin/testentry point is unmodified — all parallelism is opt-in viainclude Tapioca::Helpers::Test::Parallelin individual test classes. One pre-existing test (run_gem_rbi_check_spec.rb) hangs on Ruby 4.0 due to anOpen3.capture3/Bundler.with_unbundled_envinteraction unrelated to this PR.