Optimize CI test speed with Minitest parallelization and test infrastructure improvements by paracycle · Pull Request #2530 · Shopify/tapioca

paracycle · 2026-03-11T16:25:48Z

Motivation

CI test runs were taking ~22 minutes on average (up to ~27 minutes for the slowest matrix job). Most of this time was spent in subprocess overhead: each CLI test creates a temporary project, runs bundle install, and invokes bundle exec tapioca <command> — spawning ~95 bundle install processes and many tapioca subprocesses per run.

Implementation

This PR introduces serial test infrastructure optimizations and in-process parallel test execution via Minitest's built-in parallelize_me!, reducing CI time by ~34%.

Serial optimization breakdown (cumulative):

Optimization	Cumulative time	Improvement over previous row
Baseline (main)	~600s	—
Default reporter + Rails log silencing + minitest/hooks	~580s	~3%
Gemfile.lock content-hash caching	~420s	~28%
`bundle install` flags (`--jobs=4 --quiet --retry=0`)	~400s	~5%
Prism.parse replacing sorbet subprocess syntax check	~370s	~8%
`ENFORCE_TYPECHECKING=0` in tapioca subprocesses	~340s	~8%
`TAPIOCA_SKIP_VALIDATION=1` in tapioca subprocesses	~330s	~3%
In-process `configure!` replacing subprocess calls	~320s	~3%
Total serial improvement (vs. baseline)	~320s	~47%

Parallel execution via parallelize_me!

A Tapioca::Helpers::Test::Parallel module (25 lines) calls parallelize_me! on any test class that includes it, enabling Minitest's built-in thread pool for safe classes. Thread count is controlled by the MT_CPU env var (defaults to Etc.nprocessors).

12 test classes include the module:

DslSpec (+ 38 subclasses) — 374 DSL compiler tests that use fork-based isolation
PipelineSpec — 120 gem pipeline tests with fork-based isolation
BuilderSpec, CompilerSpec, ExecutorSpec, RBIHelperSpec, SorbetHelperSpec, ReflectionSpec, GenericTypeRegistrySpec, ActiveModelTypeHelperSpec, GraphqlTypeHelperSpec, LockFileDiffParserSpec — unit tests with no shared state

Classes using minitest-hooks' before(:all)/after(:all) (all SpecWithProject subclasses) remain serial, since parallelize_me! bypasses the with_info_handler lifecycle that minitest-hooks relies on.

Gemfile.lock caching (spec/helpers/mock_project.rb)

Caches lockfile resolution by content-hashing the Gemfile and all referenced gemspec files
Still runs bundle install after cache hit to ensure gems are actually installed, but skips slow dependency resolution
Serializes all bundle install calls under a global exclusive file lock to prevent concurrent gem directory corruption
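The caching idea above can be sketched roughly as follows. Both helper names, the cache directory layout, and the choice of SHA-256 are assumptions for illustration, not the code in spec/helpers/mock_project.rb.

```ruby
require "digest"

# Hypothetical helper: derive a cache key from the Gemfile and the gemspec
# files it references, so a change to any of them produces a new key.
def lockfile_cache_key(gemfile_path, gemspec_paths)
  digest = Digest::SHA256.new
  [gemfile_path, *gemspec_paths.sort].each do |path|
    # NUL separators keep one file's name and content from blurring into the next.
    digest << File.basename(path) << "\0" << File.read(path) << "\0"
  end
  digest.hexdigest
end

# Hypothetical helper: copy a cached lockfile into place on a cache hit.
# Returns true on a hit. The caller still runs `bundle install` either way,
# so the hit only saves dependency resolution.
def restore_cached_lockfile(cache_dir, key, lockfile_path)
  cached = File.join(cache_dir, "#{key}.lock")
  return false unless File.exist?(cached)

  File.write(lockfile_path, File.read(cached))
  true
end
```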

Subprocess overhead reduction

ENFORCE_TYPECHECKING=0: Disables sorbet runtime type checking in tapioca subprocesses (~40% faster per invocation)
TAPIOCA_SKIP_VALIDATION=1: Skips sorbet static validation in test subprocesses (tests that specifically validate this behavior opt out)
In-process configure! method replaces tapioca("configure") subprocess calls where possible
Replaces sorbet subprocess syntax checking with in-process Prism.parse in DSL compiler tests
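As a rough illustration of the in-process syntax check, the sketch below uses `Prism.parse` and inspects the parse result. The helper names are hypothetical, and the actual DSL test helper may report errors differently.

```ruby
require "prism"

# Hypothetical helper: true when `source` parses without syntax errors.
def valid_ruby_syntax?(source)
  Prism.parse(source).success?
end

# Hypothetical helper: human-readable messages for any syntax errors found.
def syntax_error_messages(source)
  Prism.parse(source).errors.map(&:message)
end
```

Because the parse happens inside the test process, there is no per-call process startup cost, which is where the savings described above come from.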

Minor test infrastructure improvements

Default minitest reporter instead of SpecReporter (less I/O)
config.log_level = :fatal for Rails logger
--jobs=4 --quiet --retry=0 flags for bundle install
Increased addon_spec timeout from 4s to 30s for CI resilience

CI Results

Metric	Main (baseline)	This PR	Improvement
CI average per job	22.4m	14.8m	34% faster
CI fastest job	19.3m	13.3m	31% faster
CI slowest job	26.8m	17.0m	37% faster

All 14 CI matrix jobs pass (Ruby 3.2/3.3/3.4/4.0/head × Rails 8.0/current/main).

Tests

Existing test suite is unchanged and all tests pass. The bin/test entry point is unmodified — all parallelism is opt-in via include Tapioca::Helpers::Test::Parallel in individual test classes. One pre-existing test (run_gem_rbi_check_spec.rb) hangs on Ruby 4.0 due to an Open3.capture3/Bundler.with_unbundled_env interaction unrelated to this PR.

…arse in DSL tests Use Prism.parse instead of shelling out to sorbet --stop-after parser for syntax validation in DSL compiler tests. This eliminates ~0.06-0.5s of subprocess overhead per rbi_for call across ~374 DSL tests.

…ests Default enforce_typechecking to false in MockProject#tapioca since no tests depend on runtime type validation. This reduces subprocess overhead by ~40% per tapioca invocation by skipping sorbet-runtime type checks.

…mmands Skip the overhead of bundle exec by directly invoking ruby with bundler/setup and BUNDLE_GEMFILE. This saves ~0.2-0.3s per tapioca invocation across ~130 calls. Also handles gems.rb as an alternative to Gemfile for projects that use it.

…lled The lockfile cache was returning early without running bundle install, which meant gems listed in the cached lockfile might not actually be installed in the system gem path. This caused sqlite3 version conflicts on CI where activerecord 7.0.x expected sqlite3 ~> 1.4 but sqlite3 2.9.x was the only version installed. Now the cache only pre-populates the lockfile (skipping resolution) but still runs bundle install to ensure gems are present. Also removes --prefer-local which could cause stale resolution.

… print sequentially Each worker's stdout/stderr is redirected to a temp file during execution. When a worker finishes, its output is printed as a contiguous block with clear header/footer separators showing worker number, pass/fail status, elapsed time, and file count. A final summary table is printed at the end. This eliminates the interleaved output problem where multiple workers wrote to the same stdout/stderr simultaneously.

On CI (GITHUB_ACTIONS=true), each worker's output is wrapped in ::group::/::endgroup:: markers creating collapsible log sections. Passed workers are collapsed by default; failed workers get a ::error:: annotation visible in the PR checks summary. Progress lines like '[1/4] ✓ Worker 2 finished in 120s (3 still running)' appear outside groups so they're always visible, giving real-time feedback on which workers have completed. The final SUMMARY block stays outside any group and is always visible. Locally (no GITHUB_ACTIONS), the separator-based format is preserved.
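The CI formatting described above could be sketched like this. `format_worker_output` is a hypothetical name and the exact titles and separators are assumptions. Only the `::group::`, `::endgroup::`, and `::error::` workflow commands come from the description.

```ruby
# Hypothetical formatter: wraps one worker's captured output for the log.
def format_worker_output(worker_id, output, passed:, ci: ENV["GITHUB_ACTIONS"] == "true")
  status = passed ? "passed" : "FAILED"
  title = "Worker #{worker_id} (#{status})"
  lines = []
  if ci
    # The annotation is emitted outside the group so it stays visible.
    lines << "::error::Worker #{worker_id} failed" unless passed
    lines << "::group::#{title}"
    lines << output.chomp
    lines << "::endgroup::"
  else
    # Local runs keep a plain separator-based layout.
    separator = "=" * 60
    lines << separator << title << separator
    lines << output.chomp
    lines << separator
  end
  lines.join("\n")
end
```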

…al file lock Two race conditions caused sporadic CI failures when running tests in parallel:

1. ETXTBSY: Gem.install('bundler') rewrites binstubs in the shared gem bin directory. If another worker is simultaneously exec'ing that binstub via bundle exec, the kernel returns ETXTBSY. Fixed by using cross-process marker files so Gem.install only runs once (first worker), not per-worker.
2. GemNotFound: Concurrent bundle install processes write gems into the same GEM_HOME simultaneously. A worker running bundle exec can see partially-installed gems. Fixed by serializing all bundle install calls under a global file lock (.bundle_install_global.lock).

The performance impact is minimal because lockfile caching makes most bundle install calls fast no-ops (~1-2s), and the lock only blocks when two workers call bundle_install! at the exact same moment.

Also uses atomic writes (write-to-temp + rename) for the lockfile cache to prevent readers from seeing partially-written lockfiles.
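A minimal sketch of the two file-based techniques named in this commit message, under the assumption that plain `File` calls are enough. Both helper names are hypothetical. Note that the marker sketch only stops later workers from repeating the work and does not make them wait for the first worker to finish.

```ruby
# Hypothetical helper: write-to-temp + rename, so readers see either the old
# file or the complete new one, never a partially written file.
def atomic_write(path, content)
  # The temp file sits next to the target so the rename stays on one filesystem.
  tmp_path = "#{path}.#{Process.pid}.#{Thread.current.object_id}.tmp"
  File.write(tmp_path, content)
  File.rename(tmp_path, path)
ensure
  # Clean up if the write or rename failed part-way.
  File.delete(tmp_path) if tmp_path && File.exist?(tmp_path)
end

# Hypothetical helper: run the block only in the first process that manages
# to create the marker file. Returns true if the block ran, false otherwise.
def run_once_across_processes(marker_path)
  begin
    # CREAT | EXCL makes creation fail if another process got there first.
    File.open(marker_path, File::WRONLY | File::CREAT | File::EXCL) do |file|
      file.write(Process.pid.to_s)
    end
  rescue Errno::EEXIST
    return false
  end
  yield
  true
end
```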

…le install The ETXTBSY race condition occurs when bundle install (writing binstubs/gems into GEM_HOME) runs concurrently with bundle exec (executing those binstubs).

Solution: read-write file locking using flock:

- bundle_install! takes LOCK_EX (exclusive): runs alone, no concurrent execs
- bundle_exec takes LOCK_SH (shared): multiple execs run in parallel, but they wait if bundle install holds the exclusive lock

This means bundle exec calls across workers can still run concurrently (good for performance), but they never overlap with bundle install (prevents ETXTBSY and GemNotFound from partially-installed gems).
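The reader-writer locking scheme can be sketched with `File#flock` as below. The helper names and the single-lock-file layout are assumptions for illustration. The PR's bundle_install! and bundle_exec methods would wrap their subprocess calls in something similar.

```ruby
# Hypothetical helper: hold a flock of the given mode for the duration of the block.
def with_file_lock(lock_path, mode)
  File.open(lock_path, File::RDWR | File::CREAT, 0o644) do |file|
    file.flock(mode) # blocks until the lock can be granted
    yield
  ensure
    file.flock(File::LOCK_UN)
  end
end

# Writer side: an install must run alone.
def with_install_lock(lock_path, &block)
  with_file_lock(lock_path, File::LOCK_EX, &block)
end

# Reader side: any number of execs may overlap, but none overlaps an install.
def with_exec_lock(lock_path, &block)
  with_file_lock(lock_path, File::LOCK_SH, &block)
end
```

A call site would then look like `with_exec_lock(path) { system("bundle", "exec", ...) }`, with the install path using `with_install_lock` around `bundle install`.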

A monitor thread tails each worker's output file every second, parsing minitest's dot output to count completed tests. Every 10 seconds it prints a compact progress line:

[50s] W1: 19 tests (ok) | W2: 104 tests (ok) | W3: 148 tests (ok) | W4: done

Failures/errors detected in the output are surfaced immediately with a worker prefix, so you don't have to wait for the worker to finish. Only lines consisting entirely of minitest result characters ([.FES]) are counted, avoiding false positives from error messages, stack traces, or forked process output.
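The counting rule in this commit message can be sketched as follows. `RESULT_LINE` and `count_results` are hypothetical names, and the real monitor also tracks elapsed time and per-worker state.

```ruby
# A line counts only if every character is a minitest result character.
RESULT_LINE = /\A[.FES]+\z/

# Hypothetical helper: tally result characters from captured worker output.
def count_results(output)
  counts = Hash.new(0)
  output.each_line(chomp: true) do |line|
    # Skip error messages, stack traces, and any other mixed-content line.
    next unless line.match?(RESULT_LINE)

    line.each_char { |char| counts[char] += 1 }
  end
  counts
end
```

Anchoring the pattern to the whole line is what keeps a message such as "Error: F.E.S failed" from being counted as results.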

bin/parallel_test

Add Tapioca::Helpers::Test::Parallel module that calls parallelize_me! on included test classes, enabling Minitest's built-in thread pool for classes that don't rely on minitest-hooks' before(:all)/after(:all). This replaces the 352-line custom bin/parallel_test runner with a 25-line module included in 12 safe test classes (DslSpec, PipelineSpec, BuilderSpec, and all unit spec classes). CI switches back to bin/test. Measured: full suite 5m34s locally (44% faster than serial baseline).

amomchilov

Still working through reviewing spec/helpers/mock_project.rb, but the rest looks good so far. Left a few questions/comments

amomchilov · 2026-03-13T21:42:54Z

spec/tapioca/dsl/helpers/active_model_type_helper_spec.rb

  module Dsl
    module Helpers
      class ActiveModelTypeHelperSpec < Minitest::Spec
+        include Tapioca::Helpers::Test::Parallel


Why not call parallelize_me! directly?

amomchilov · 2026-03-13T21:43:33Z

spec/tapioca/addon_spec.rb


      def wait_until_exists(path)
-        Timeout.timeout(4) do
+        Timeout.timeout(30) do


Is this because we're expecting more CPU contention, now that we might actually be fully saturating multiple threads?

amomchilov · 2026-03-13T21:44:32Z

spec/spec_helper.rb

-backtrace_filter = Minitest::ExtensibleBacktraceFilter.default_filter
-backtrace_filter.add_filter(%r{gems/sorbet-runtime})
-backtrace_filter.add_filter(%r{gems/railties})
-backtrace_filter.add_filter(%r{tapioca/helpers/test/})


Are we ok with giving this up?

amomchilov · 2026-03-13T22:07:06Z

spec/rails_spec_helper.rb

              }
            }
            config.logger = Logger.new('/dev/null')
+            config.log_level = :fatal


Can you please add a comment to explain that this is a performance optimization?

amomchilov · 2026-03-13T22:07:50Z

spec/rails_spec_helper.rb

@@ -34,6 +34,7 @@ class Dummy < Rails::Application
              }
            }
            config.logger = Logger.new('/dev/null')


Actually, this would still be making syscalls. Wanna use a NullLogger like so?

    class NullLogger
      include Singleton

      # : (*untyped) -> nil
      def unknown(*) = nil

      # : (*untyped) -> nil
      def fatal(*) = nil

      # : (*untyped) -> nil
      def error(*) = nil

      # : (*untyped) -> nil
      def warn(*) = nil

      # : (*untyped) -> nil
      def info(*) = nil

      # : (*untyped) -> nil
      def debug(*) = nil

      # : (untyped) -> nil

      # : () -> bool
      def fatal? = false

      # : () -> bool
      def error? = false

      # : () -> bool
      def warn? = false

      # : () -> bool
      def info? = false

      # : () -> bool
      def debug? = false

      def level=(_)
        nil
      end
    end

Suggested change

-            config.logger = Logger.new('/dev/null')
+            config.logger = NullLogger.instance

I'm curious to measure how much it might help

paracycle added 30 commits March 10, 2026 23:31

autoresearch: baseline - ~600s test time

6ed49f9

exp3: use default minitest reporter instead of SpecReporter

61965cd

exp4: disable debug prelude to reduce test startup overhead

255d744

exp6: reduce Rails logging overhead in tests

3b667b5

exp7: use minitest/hooks instead of minitest/hooks/default

c1be355

exp9: set RAILS_ENV=test for potential test-specific optimizations

01a2f83

exp10: silence deprecation warnings to reduce output overhead

9af27f0

exp15: cache Gem.install('bundler') across test runs

f9ba794

exp16: add --jobs=4 --prefer-local to bundle install in tests

554900c

exp19: cache Gemfile.lock by content hash to avoid redundant bundle i…

4df283f

…nstall

exp21: add --quiet to bundle install to reduce I/O overhead

5b5b8cf

exp24: add --retry=0 to bundle install to avoid retries in tests

c91806a

fix: remove RAILS_ENV=test setting that broke addon tests

eeb452c

fix: include gemspec content in lockfile cache key to handle version …

48ca13f

…changes

exp28: file-lock Gem.install to enable safe parallel test execution

607c944

exp29: add bin/parallel_test for 4-worker parallel test execution (~2…

2aeff2e

…x speedup)

exp30: add --disable=did_you_mean to tapioca subprocesses for faster …

b3b6776

…Ruby startup

exp31: skip sorbet namer validation in tapioca subprocesses for tests…

2ecd26d

… that don't need it

exp32: replace tapioca('configure') with in-process configure! to avo…

cebf824

…id subprocess overhead

update parallel_test runtime estimates to match current measurements

4650d61

use bin/parallel_test in CI and only exclude run_gem_rbi_check on Rub…

9b7a709

…y 4.0+

increase addon_spec wait_until_exists timeout from 4s to 30s for CI p…

c6459df

…arallelism

revert to bundle exec for tapioca subprocesses to fix gem isolation o…

c49556c

…n CI

fix tapioca() to use bundle_exec with bundler version pinning for pro…

39f3877

…per gem isolation

remove RUBYOPT override that clobbered bundler's -rbundler/setup in s…

ba30894

…ubprocesses

fix rubocop style offenses

c476f5d

paracycle added 10 commits March 12, 2026 17:18

remove accidental test.rb scratch file

d9b0ea5

revert CI to bin/test instead of bin/parallel_test to measure serial …

d7f8d36

…performance

Revert "revert CI to bin/test instead of bin/parallel_test to measure…

b38be1b

… serial performance" This reverts commit d7f8d36.

fix rubocop offenses in parallel_test: extract methods to reduce nesting

c039ac8

restore bin/test to match main — optimizations live in bin/parallel_test

58eb8fa

paracycle added the chore label Mar 12, 2026

paracycle changed the title ~~Autoresearch/test speed 2026 03 10~~ Optimize CI test speed with parallel runner and test infrastructure improvements Mar 12, 2026

paracycle marked this pull request as ready for review March 12, 2026 22:33

paracycle requested a review from a team as a code owner March 12, 2026 22:34

amomchilov reviewed Mar 12, 2026

bin/parallel_test (outdated, resolved)

paracycle added 2 commits March 13, 2026 21:05

fix Sorbet typecheck: use T::Module[top] for generic Module parameter

033c3f4

paracycle requested a review from amomchilov March 13, 2026 20:27

paracycle changed the title ~~Optimize CI test speed with parallel runner and test infrastructure improvements~~ Optimize CI test speed with Minitest parallelization and test infrastructure improvements Mar 13, 2026

amomchilov reviewed Mar 13, 2026

paracycle wants to merge 42 commits into main from autoresearch/test-speed-2026-03-10
