Optimize CI test speed with Minitest parallelization and test infrastructure improvements #2530
Status: Open

paracycle wants to merge 42 commits into main from autoresearch/test-speed-2026-03-10
Commits (42 total; changes shown from 40 commits)
All 42 commits are by paracycle:

6ed49f9  autoresearch: baseline - ~600s test time
61965cd  exp3: use default minitest reporter instead of SpecReporter
255d744  exp4: disable debug prelude to reduce test startup overhead
3b667b5  exp6: reduce Rails logging overhead in tests
c1be355  exp7: use minitest/hooks instead of minitest/hooks/default
01a2f83  exp9: set RAILS_ENV=test for potential test-specific optimizations
9af27f0  exp10: silence deprecation warnings to reduce output overhead
f9ba794  exp15: cache Gem.install('bundler') across test runs
554900c  exp16: add --jobs=4 --prefer-local to bundle install in tests
4df283f  exp19: cache Gemfile.lock by content hash to avoid redundant bundle i…
5b5b8cf  exp21: add --quiet to bundle install to reduce I/O overhead
c91806a  exp24: add --retry=0 to bundle install to avoid retries in tests
eeb452c  fix: remove RAILS_ENV=test setting that broke addon tests
48ca13f  fix: include gemspec content in lockfile cache key to handle version …
83a9953  exp25: replace sorbet subprocess syntax check with in-process Prism.p…
75b171c  exp26: disable runtime type checking in tapioca subprocesses during t…
1191d8a  exp27: use ruby -rbundler/setup instead of bundle exec for tapioca co…
607c944  exp28: file-lock Gem.install to enable safe parallel test execution
2aeff2e  exp29: add bin/parallel_test for 4-worker parallel test execution (~2…
b3b6776  exp30: add --disable=did_you_mean to tapioca subprocesses for faster …
2ecd26d  exp31: skip sorbet namer validation in tapioca subprocesses for tests…
cebf824  exp32: replace tapioca('configure') with in-process configure! to avo…
4650d61  update parallel_test runtime estimates to match current measurements
9b7a709  use bin/parallel_test in CI and only exclude run_gem_rbi_check on Rub…
c6459df  increase addon_spec wait_until_exists timeout from 4s to 30s for CI p…
c49556c  revert to bundle exec for tapioca subprocesses to fix gem isolation o…
39f3877  fix tapioca() to use bundle_exec with bundler version pinning for pro…
ba30894  remove RUBYOPT override that clobbered bundler's -rbundler/setup in s…
78c79b1  fix lockfile cache: still run bundle install to ensure gems are insta…
c476f5d  fix rubocop style offenses
d9b0ea5  remove accidental test.rb scratch file
d7f8d36  revert CI to bin/test instead of bin/parallel_test to measure serial …
b38be1b  Revert "revert CI to bin/test instead of bin/parallel_test to measure…
3c18d9e  fix parallel_test output: capture per-worker output to temp files and…
b6ff842  use GitHub Actions collapsible groups for parallel test output
111364d  fix parallel test race conditions: serialize bundle install with glob…
bacf6a6  fix ETXTBSY: use read-write lock so bundle exec never races with bund…
56a9d21  add live progress monitoring to parallel test runner
c039ac8  fix rubocop offenses in parallel_test: extract methods to reduce nesting
58eb8fa  restore bin/test to match main — optimizations live in bin/parallel_test
a85e195  replace bin/parallel_test with Minitest parallelize_me! module
033c3f4  fix Sorbet typecheck: use T::Module[top] for generic Module parameter
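The final commit (a85e195) drops the custom process-forking runner in favor of Minitest's built-in `parallelize_me!`, which runs a test class's methods on a thread pool. A minimal sketch of that mechanism, using a hypothetical test class (names below are illustrative, not taken from the Tapioca suite):

```ruby
require "minitest/autorun"

class ParallelSketchTest < Minitest::Test
  # Opt this class into Minitest's thread-pool parallelization: its test
  # methods may now run concurrently, so they must not share mutable state.
  parallelize_me!

  def test_addition
    assert_equal(4, 2 + 2)
  end

  def test_membership
    assert_includes([1, 2, 3], 2)
  end
end
```

The trade-off versus a forking runner: threads share one process (cheap startup, shared load of Rails), but tests that mutate global state, the current directory, or environment variables are unsafe under `parallelize_me!`.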
bin/parallel_test (new file, +352 lines):

```ruby
#!/usr/bin/env ruby
# frozen_string_literal: true

# Parallel test runner for Tapioca
# Splits test files across N worker processes using LPT scheduling
# for optimal load balancing based on measured test file runtimes.
#
# Output strategy:
#   - Each worker's output is captured to a temp file
#   - A monitor thread tails all files, printing a live progress line every 10s
#     and surfacing failures/errors immediately as they appear
#   - When a worker finishes, its full output is printed in a GitHub Actions
#     collapsible group (or with separators locally)
#   - The final summary is always visible
#
# Usage:
#   bin/parallel_test                   # run all tests with 4 workers
#   bin/parallel_test -n 8              # run with 8 workers
#   bin/parallel_test spec/path_spec.rb # run specific files

require "optparse"
require "tempfile"

workers = 4
# run_gem_rbi_check_spec.rb hangs on Ruby 4.0+ due to Open3.capture3 + Bundler.with_unbundled_env bug
exclude_patterns = RUBY_VERSION >= "4.0" ? ["run_gem_rbi_check"] : []

OptionParser.new do |opts|
  opts.banner = "Usage: bin/parallel_test [options] [test_files...]"
  opts.on("-n", "--workers N", Integer, "Number of parallel workers (default: 4)") { |n| workers = n }
  opts.on("-e", "--exclude PATTERN", "Exclude files matching pattern") { |p| exclude_patterns << p }
end.parse!

# Collect test files
test_files = if ARGV.any?
  ARGV.dup
else
  Dir.glob("spec/**/*_spec.rb").reject { |f| exclude_patterns.any? { |p| f.include?(p) } }.sort
end

if test_files.empty?
  $stderr.puts "No test files found"
  exit 0
end

# Detect GitHub Actions for collapsible log groups
GITHUB_ACTIONS = ENV["GITHUB_ACTIONS"] == "true"
PROGRESS_INTERVAL = 10 # seconds between progress lines

# Estimated runtimes (seconds) from profiling — used for load balancing
RUNTIME_ESTIMATES = {
  "gem_spec" => 130,
  "dsl_spec" => 58,
  "pipeline_spec" => 49,
  "active_record_associations_spec" => 19,
  "active_record_columns_spec" => 16,
  "addon_spec" => 16,
  "check_shims_spec" => 15,
  "annotations_spec" => 13,
  "active_record_scope_spec" => 11,
  "active_storage_spec" => 9,
  "active_record_typed_store_spec" => 8,
  "identity_cache_spec" => 8,
  "url_helpers_spec" => 7,
  "active_record_enum_spec" => 7,
  "config_spec" => 6,
  "action_controller_helpers_spec" => 5,
  "todo_spec" => 5,
  "active_record_fixtures_spec" => 5,
  "active_record_store_spec" => 5,
  "json_api_client" => 5,
}.freeze

def estimate_runtime(file)
  basename = File.basename(file, ".rb")
  RUNTIME_ESTIMATES.each { |pattern, time| return time if basename.include?(pattern) }
  3 # default estimate
end

# LPT (Longest Processing Time) scheduling: assign heaviest files first to lightest worker
group_times = Array.new(workers, 0.0)
groups = Array.new(workers) { [] }

test_files.sort_by { |f| -estimate_runtime(f) }.each do |file|
  min_idx = group_times.each_with_index.min_by { |t, _| t }[1]
  groups[min_idx] << file
  group_times[min_idx] += estimate_runtime(file)
end

$stderr.puts "Parallel test runner: #{workers} workers for #{test_files.size} files"
groups.each_with_index do |g, i|
  $stderr.puts "  Worker #{i + 1}: #{g.size} files, est. #{group_times[i].round(0)}s"
end
$stderr.puts

# Launch workers, capturing each worker's output to a temp file
tapioca_root = File.expand_path("..", __dir__)
start_time = Process.clock_gettime(Process::CLOCK_MONOTONIC)

worker_info = groups.each_with_index.filter_map do |group_files, idx|
  next if group_files.empty?

  output_file = Tempfile.new(["worker_#{idx}_", ".log"])
  output_path = output_file.path
  output_file.close

  pid = Process.fork do
    # Redirect both stdout and stderr to the temp file
    $stdout.reopen(output_path, "w")
    $stderr.reopen($stdout)
    $stdout.sync = true
    $stderr.sync = true

    cmd = [
      "ruby",
      "-e",
      "$LOAD_PATH << File.expand_path('spec', '#{tapioca_root}'); " \
        "ENV['DEFAULT_TEST'] = 'spec/**/*_spec.rb'; " \
        "ENV['TAPIOCA_SILENCE_DEPRECATIONS'] = '1'; " \
        "require 'bundler/setup'; " \
        "require 'logger'; " \
        "require 'active_support'; " \
        "require 'rails/test_unit/runner'; " \
        "ARGV.replace(#{group_files.inspect}); " \
        "Rails::TestUnit::Runner.parse_options(ARGV); " \
        "Rails::TestUnit::Runner.run(ARGV)",
    ]
    exec(*cmd)
  end

  { idx: idx, pid: pid, output_path: output_path, files: group_files }
end

# Count minitest result characters on a line of worker output
def count_test_results(state, line)
  if line.match?(/^Finished in/)
    state[:in_running] = false
  elsif line.match?(/\A[.FES]+\s*\z/)
    line.each_char do |c|
      case c
      when "." then state[:dots] += 1
      when "F"
        state[:dots] += 1
        state[:fail_chars] += 1
      when "E"
        state[:dots] += 1
        state[:error_chars] += 1
      when "S" then state[:dots] += 1
      end
    end
  end
end

# Build a compact progress label for one worker
def worker_progress_label(idx, state)
  label = "W#{idx + 1}"
  if state[:done]
    "#{label}: done"
  elsif state[:dots] > 0
    status = state[:fail_chars] > 0 || state[:error_chars] > 0 ? "#{state[:fail_chars]}F #{state[:error_chars]}E" : "ok"
    "#{label}: #{state[:dots]} tests (#{status})"
  else
    "#{label}: setup"
  end
end

# Monitor thread: tails worker output files for live progress and failure detection.
# Scans each file for minitest result lines and failure/error blocks, printing
# a compact progress summary every PROGRESS_INTERVAL seconds and surfacing
# failures immediately.
monitor_stop = false
monitor_mutex = Mutex.new
# Per-worker state tracked by the monitor
monitor_state = worker_info.each_with_object({}) do |w, h|
  h[w[:idx]] = {
    file_pos: 0,           # bytes read so far
    dots: 0,               # count of test result chars (. F E S)
    fail_chars: 0,         # count of F chars in test output
    error_chars: 0,        # count of E chars in test output
    in_running: false,     # seen "# Running:" — now counting dots
    done: false,           # worker process exited
    failure_lines: [],     # accumulated failure/error text to emit
    in_failure_block: false,
    failure_block_lines: 0,
  }
end

monitor_thread = Thread.new do
  last_progress_at = start_time

  until monitor_stop
    sleep(1)
    now = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    elapsed = now - start_time

    # Read new output from each worker
    monitor_mutex.synchronize do
      worker_info.each do |w|
        state = monitor_state[w[:idx]]
        file_size = File.size(w[:output_path]) rescue 0 # rubocop:disable Style/RescueModifier
        next if state[:done] && state[:file_pos] >= file_size

        begin
          File.open(w[:output_path], "r") do |f|
            f.seek(state[:file_pos])
            new_content = f.read
            next unless new_content && !new_content.empty?

            state[:file_pos] += new_content.bytesize

            new_content.each_line do |line|
              # Detect the "# Running:" marker — after this, dots are test results
              if line.include?("# Running:")
                state[:in_running] = true
                next
              end

              # Count test result characters (dots, F, E, S) in running output.
              # Minitest prints result chars on lines consisting ONLY of [.FES] characters
              # (plus optional trailing whitespace). This avoids false positives from error
              # messages, stack traces, or forked process output that contain these letters.
              count_test_results(state, line) if state[:in_running]

              # Detect failure/error blocks and accumulate them
              if line.match?(/^\s*(Failure|Error):/)
                state[:in_failure_block] = true
                state[:failure_block_lines] = 0
                state[:failure_lines] << line
              elsif state[:in_failure_block]
                state[:failure_lines] << line
                state[:failure_block_lines] += 1
                # End the block after a blank line or after enough context
                if line.strip.empty? && state[:failure_block_lines] > 2
                  state[:in_failure_block] = false
                end
              end
            end
          end
        rescue Errno::ENOENT
          # File not yet created
        end
      end

      # Emit accumulated failure lines immediately
      worker_info.each do |w|
        state = monitor_state[w[:idx]]
        next if state[:failure_lines].empty?

        lines = state[:failure_lines].dup
        state[:failure_lines].clear
        $stderr.puts "[W#{w[:idx] + 1}] #{lines.join}"
      end
    end

    # Print periodic progress summary
    if now - last_progress_at >= PROGRESS_INTERVAL
      last_progress_at = now
      parts = monitor_mutex.synchronize do
        worker_info.map { |w| worker_progress_label(w[:idx], monitor_state[w[:idx]]) }
      end
      $stderr.puts "[#{elapsed.round(0)}s] #{parts.join(" | ")}"
    end
  end
end

# Print a worker's captured output, using GitHub Actions grouping when available.
def print_worker_output(worker, status, elapsed)
  label = "Worker #{worker[:idx] + 1}"
  status_str = status.success? ? "PASSED" : "FAILED (exit #{status.exitstatus})"
  header = "#{label}: #{status_str} (#{elapsed.round(1)}s, #{worker[:files].size} files)"

  if GITHUB_ACTIONS
    if status.success?
      $stderr.puts "::group::#{header}"
    else
      $stderr.puts "::error::#{header}"
      $stderr.puts "::group::#{header} — full output"
    end
  else
    separator = "=" * 70
    $stderr.puts separator
    $stderr.puts header
    $stderr.puts separator
  end

  if File.exist?(worker[:output_path])
    File.open(worker[:output_path], "r") do |f|
      while (chunk = f.read(8192))
        $stderr.write(chunk)
      end
    end
    File.delete(worker[:output_path])
  end

  if GITHUB_ACTIONS
    $stderr.puts "::endgroup::"
  else
    $stderr.puts
  end
end

# Wait for workers and print their output as each one finishes (completion order).
pending = worker_info.map { |w| [w[:pid], w] }.to_h
results = []
completed_count = 0

until pending.empty?
  finished_pid, status = Process.waitpid2(-1, 0)
  worker = pending.delete(finished_pid)
  elapsed_worker = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start_time
  completed_count += 1

  # Mark worker as done in monitor state
  monitor_mutex.synchronize { monitor_state[worker[:idx]][:done] = true }

  results << { **worker, status: status, elapsed: elapsed_worker }

  # Progress indicator before the group (always visible)
  remaining = pending.size
  $stderr.puts "[#{completed_count}/#{worker_info.size}] #{status.success? ? "✓" : "✗"} Worker #{worker[:idx] + 1} " \
    "finished in #{elapsed_worker.round(1)}s#{remaining > 0 ? " (#{remaining} still running)" : ""}"

  print_worker_output(worker, status, elapsed_worker)
end

# Stop the monitor thread
monitor_stop = true
monitor_thread.join(2)

# Final summary — always visible (outside any group)
elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start_time

$stderr.puts
$stderr.puts "=" * 70
$stderr.puts "SUMMARY"
$stderr.puts "=" * 70

results.sort_by { |r| r[:idx] }.each do |r|
  icon = r[:status].success? ? "✓" : "✗"
  status_str = r[:status].success? ? "PASSED" : "FAILED"
  $stderr.puts "  #{icon} Worker #{r[:idx] + 1}: #{status_str} (#{r[:elapsed].round(1)}s, #{r[:files].size} files)"
end

$stderr.puts
$stderr.puts "Total: #{test_files.size} files across #{workers} workers in #{elapsed.round(1)}s"

failed = results.reject { |r| r[:status].success? }
if failed.any?
  $stderr.puts "#{failed.size} worker(s) FAILED"
  exit 1
else
  $stderr.puts "All workers PASSED"
  exit 0
end
```
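The monitor thread's tailing technique (commit 56a9d21) can be isolated to a few lines: remember the byte offset already consumed and, on each poll, seek there and read only what the worker has appended since. A self-contained sketch (`read_new_output` is an illustrative name, not a function in the script):

```ruby
require "tempfile"

# Incremental tail: return [new bytes since pos, updated pos].
def read_new_output(path, pos)
  File.open(path, "r") do |f|
    f.seek(pos)
    chunk = f.read || ""
    [chunk, pos + chunk.bytesize]
  end
end

log = Tempfile.new("worker")
log.write("# Running:\n")
log.flush
chunk, pos = read_new_output(log.path, 0)    # first poll sees the header
log.write("..F.\n")
log.flush
chunk2, pos2 = read_new_output(log.path, pos) # second poll sees only new chars
```

Tracking byte offsets (rather than re-reading whole files) keeps the 1-second polling loop cheap even when a worker produces megabytes of output.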