To speed up our build and deployment time, we recently implemented test parallelization for our two biggest apps. By doing so, we reduced the total deployment time of one service from 50 minutes to under 20 minutes. Both services also had problems with flaky tests, which are gone now. In this post, I will outline the steps we took to achieve this, including the scripts we use.
Parallelization + distribution of tests
We use the test-boosters gem, a simple runner that splits all tests into a defined number of groups and runs one of those groups. A single runner does not parallelize anything by itself; instead, you use multiple CI runners/workers, each of which runs one group.
To balance the load, we wrote a small script that takes the file RSpec writes to example_status_persistence_file_path, sums the runtime per spec file, distributes the files evenly across the number of runners, and writes a JSON file that we commit to the repository.
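To make the distribution idea concrete, here is a minimal sketch with made-up runtimes. It uses the same greedy approach as the full script further down: slowest file first, always assigned to the worker with the least total time so far. The file names, numbers, and the `distribute` helper are assumptions for illustration only.

```ruby
# Hypothetical per-file runtimes in seconds (made-up numbers).
runtimes = {
  'spec/models/user_spec.rb' => 120.0,
  'spec/features/checkout_spec.rb' => 90.0,
  'spec/models/order_spec.rb' => 60.0,
  'spec/lib/importer_spec.rb' => 45.0,
  'spec/helpers/date_spec.rb' => 15.0
}

# Greedy distribution: take the slowest file first and always give it
# to the worker that currently has the least total runtime.
def distribute(runtimes, worker_count)
  workers = Array.new(worker_count) { { files: [], time: 0.0 } }
  runtimes.sort_by { |_file, time| -time }.each do |file, time|
    target = workers.min_by { |worker| worker[:time] }
    target[:files] << file
    target[:time] += time
  end
  workers
end

workers = distribute(runtimes, 2)
workers.each_with_index do |worker, index|
  puts "worker #{index + 1}: #{worker[:time]}s, #{worker[:files].size} files"
end
# With these numbers both workers end up with 165.0 seconds of tests.
```

The greedy approach is not guaranteed to find the best possible split, but it is simple and usually close enough when there are many more files than workers.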
First, enable example_status_persistence_file_path in your RSpec configuration (for example in spec/spec_helper.rb):
RSpec.configure do |config|
  config.example_status_persistence_file_path = './tmp/rspec.failed.txt'
end
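After a test run, RSpec writes a pipe-separated table with one row per example to that path. The following sketch shows roughly what the file looks like and one way to turn the run_time column into seconds. The sample rows are made up, and `parse_run_time` is a hypothetical helper, not part of RSpec.

```ruby
# Made-up sample in the shape of the persistence file.
sample = <<~TEXT
  example_id                      | status  | run_time              |
  ------------------------------- | ------- | --------------------- |
  ./spec/models/user_spec.rb[1:1] | passed  | 0.00123 seconds       |
  ./spec/models/user_spec.rb[1:2] | failed  | 1 minute 12.5 seconds |
  ./spec/lib/slow_spec.rb[1:1]    | passed  | 2 minutes 3 seconds   |
  ./spec/lib/todo_spec.rb[1:1]    | pending |                       |
TEXT

# Convert a run_time cell to seconds. Handles "x seconds",
# "x minute(s) y seconds", whole numbers, and empty cells.
def parse_run_time(cell)
  minutes = cell[/(\d+)\s+minutes?/, 1].to_i
  seconds = cell[/(\d+(?:\.\d+)?)\s+seconds?/, 1].to_f
  minutes * 60 + seconds
end

# Skip the header and separator lines, then parse each row.
rows = sample.lines.drop(2).map do |line|
  example_id, status, run_time = line.split('|').map(&:strip)
  { example_id: example_id, status: status, seconds: parse_run_time(run_time.to_s) }
end

rows.each { |row| puts row }
```

Note that the example_id column contains the file path plus an example index in square brackets, which is why the runtimes have to be summed per file later.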
Then run all your tests (rspec spec), ideally on a machine with similar specs to your CI runner. After that, run the following script:
# bin/generate_ci_test_distribution.rb
require 'json'

worker_count = 4
rspec_split_configuration_path = './tmp/rspec_split_configuration.txt'
rspec_failed_path = './tmp/rspec.failed.txt'

class Worker
  attr_reader :jobs, :time

  def initialize
    @jobs = []
    @time = 0
  end

  def add_job(file:, time:)
    @jobs << { file: file, time: time }
    @time += time
  end
end

workers = Array.new(worker_count) { Worker.new }

# skip the two header lines of the persistence file and ignore blank lines
lines = File.read(rspec_failed_path).lines.drop(2).reject { |line| line.strip.empty? }
tests = lines.map do |line|
  file, _state, time = line.split('|').map(&:strip)
  # time is either "x.xxxx seconds" or "x minute(s) xx.xx seconds";
  # it can also be empty or a whole number, so match both parts explicitly
  minutes = time.to_s[/(\d+)\s+minutes?/, 1].to_i
  seconds = time.to_s[/(\d+(?:\.\d+)?)\s+seconds?/, 1].to_f
  { file: file, time: minutes * 60 + seconds }
end

# group by file without the [example index] suffix and sum the time,
# because rspec_booster cannot handle the per-example format
jobs = tests.group_by { |test| test[:file].split('[').first }.map do |file, file_tests|
  { file: file, time: file_tests.sum { |test| test[:time] } }
end

# slowest files first; each one goes to the worker with the least total time so far
jobs.sort_by! { |job| -job[:time] }
jobs.each do |job|
  worker = workers.min_by(&:time)
  worker.add_job(**job)
end

json = workers.map do |worker|
  { files: worker.jobs.map { |job| job[:file].sub('./', '') } }
end
File.write(rspec_split_configuration_path, JSON.pretty_generate(json))
Run it, and then commit the ./tmp/rspec_split_configuration.txt file to your repository.
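Because the configuration is a committed snapshot, it can drift as spec files are added or renamed. How the runner treats files that are missing from the configuration is up to the gem; either way, they are not balanced by measured runtime. The following is a minimal sketch for spotting such files. It assumes the JSON layout written by the script above and the common `spec/**/*_spec.rb` naming convention; `unassigned_spec_files` is a hypothetical helper.

```ruby
require 'json'

# Returns spec files on disk that the split configuration does not mention.
# Assumes the layout written above: [{ "files": ["spec/a_spec.rb", ...] }, ...]
def unassigned_spec_files(config_path, root: '.')
  assigned = JSON.parse(File.read(config_path)).flat_map { |group| group['files'] }
  on_disk = Dir.glob('spec/**/*_spec.rb', base: root)
  (on_disk - assigned).sort
end

config_path = './tmp/rspec_split_configuration.txt'
if File.exist?(config_path)
  missing = unassigned_spec_files(config_path)
  if missing.empty?
    puts 'All spec files are assigned.'
  else
    puts "Not in the split configuration:\n#{missing.join("\n")}"
  end
end
```

If the list grows, that is a good signal to re-run the full suite and regenerate the distribution.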
Now you can instruct each worker to run its slice of the tests. Depending on your CI system, you get environment variables that identify the current worker in a parallelized build. We use GitLab CI, so we use the CI_NODE_INDEX and CI_NODE_TOTAL variables, which GitLab sets for jobs that use the parallel keyword.
# ./bin/ci
# you can modify the RSpec command line options by overriding these environment variables
# export TB_RSPEC_OPTIONS='--tty --color --format documentation'
# export TB_RSPEC_FORMATTER='EnhancedDocumentationFormatter'
RSPEC_SPLIT_CONFIGURATION_PATH=./tmp/rspec_split_configuration.txt \
  rspec_booster --job "$CI_NODE_INDEX/$CI_NODE_TOTAL"
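When debugging an unbalanced pipeline, it helps to see which files a given worker is supposed to run. The sketch below looks up the group for a 1-based job index, which is how GitLab numbers CI_NODE_INDEX. It is not how test-boosters is implemented internally; `files_for_job` is a hypothetical helper, and the inline JSON stands in for the committed configuration file.

```ruby
require 'json'

# Hypothetical helper: returns the files listed for a 1-based job index.
def files_for_job(configuration, job_index)
  unless job_index.between?(1, configuration.size)
    raise ArgumentError, "no group for job #{job_index} (#{configuration.size} groups)"
  end

  configuration[job_index - 1].fetch('files')
end

# Inline stand-in for the committed split configuration (made-up file names).
configuration = JSON.parse(<<~CONFIG)
  [
    { "files": ["spec/models/user_spec.rb", "spec/lib/importer_spec.rb"] },
    { "files": ["spec/features/checkout_spec.rb"] }
  ]
CONFIG

puts files_for_job(configuration, 2)
```

In a CI job, the index would come from the environment instead, for example `Integer(ENV.fetch('CI_NODE_INDEX'))`, and the configuration from `File.read` on the committed file.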
Asset precompilation (GitLab CI)
If your app uses Sprockets, Webpacker, Vite, or similar, you need to compile the assets before the tests run. Because the work is split up, it would be wasteful to let every worker compile them. Instead, we run a separate job beforehand and distribute the compiled assets to the test workers using GitLab CI artifacts.
# .gitlab-ci.yml
rspec:
  needs: ["rspec:assets"]
  #...
  script:
    #...
    - rspec_booster --job $CI_NODE_INDEX/$CI_NODE_TOTAL

rspec:assets:
  # maybe not needed, depending on your asset build setup. Sometimes a database connection is required
  services:
    - 'postgres:15-bullseye'
    - "redis:4.0"
  needs: []
  variables:
    RAILS_ENV: production
    DATABASE_URL: postgresql://postgres:postgres@postgres:5432/mydbname
    SECRET_KEY_BASE_DUMMY: 1
  stage: test
  cache:
    # cache the whole node_modules folder by hashing the yarn.lock file,
    # if you are using npm, you can hash the package-lock.json file
    - key:
        files:
          - yarn.lock
      paths:
        - node_modules/
    # cache the installed gems, keyed on Gemfile.lock and the Ruby version
    - key:
        files:
          - Gemfile.lock
          - .ruby-version
      paths:
        - vendor/bundle
    # cache all the asset side effects, to speed up subsequent asset builds that do not change much
    - key: cache-$CI_COMMIT_REF_SLUG
      fallback_keys:
        - cache-$CI_DEFAULT_BRANCH
        - cache-default
      paths:
        - public/assets
        - public/packs
        - tmp/cache/assets
        - tmp/cache/packs
        # - tmp/cache/webpacker
        # - tmp/cache/webpacker-compile
        - tmp/cache/vite
        - vendor/assets
  # build normal rails assets and share as cache for gitlab
  script: "./bin/build_assets_for_production"
  artifacts:
    # generate artifacts - those are available for all subsequent jobs
    expire_in: 1 day
    name: "assets-$CI_COMMIT_REF_NAME"
    paths:
      - public/assets
      - public/packs
With an example ./bin/build_assets_for_production script:
#!/bin/bash
set -e

export RAILS_ENV=production
export NODE_ENV=production
export NODE_OPTIONS="--max-old-space-size=8192"
export SECRET_KEY_BASE_DUMMY=1

# if using RVM
# source /usr/local/rvm/scripts/rvm
# rvm install `cat .ruby-version`
# rvm use `cat .ruby-version`

# install gems into vendor/bundle, which is the path the CI cache stores
bundle config set --local path 'vendor/bundle'
bundle install --jobs "$(nproc)"

yarn config set cache-folder .yarn
yarn install --frozen-lockfile

# depending on the setup, a DB connection might be needed
rm -f config/database.yml
bundle exec rake db:create
bundle exec rake db:schema:load

bundle exec rails assets:precompile
Collecting test coverage
If you split up the work between workers, each worker will only see a part of the tests and will only generate a partial coverage report. To get a full coverage report, you need to collect all the partial reports and merge them into one in a new job that runs after all the tests have finished.
1. Rename coverage report in each worker
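In each test worker, rename the coverage report to a worker-specific file name so that the reports do not overwrite each other when they are collected. Each worker also has to publish the renamed file as a job artifact so that the merge job can download it.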
rspec_booster --job $CI_NODE_INDEX/$CI_NODE_TOTAL
mv coverage/.resultset.json coverage/.resultset-$CI_NODE_INDEX.json
Also, tell GitLab to ignore the coverage printout in the console output of each worker. Otherwise, the greedy coverage parser picks up the partial value and ruins the coverage that GitLab displays alongside each pipeline/MR. We do this with a pattern that never matches:
# .gitlab-ci.yml
rspec:
  coverage: '/COVERAGE DISABLED \(\d+.\)/'
  #...
2. Create a new CI task
# .gitlab-ci.yml
rspec:coverage:
  needs: ["rspec"]
  script:
    - ./bin/merge_coverage_reports
  coverage: '/\(\d+\.\d+\%\) covered/'
  artifacts:
    paths: ['coverage/coverage.xml']
    reports:
      coverage_report:
        coverage_format: cobertura
        path: coverage/coverage.xml
We also export the report in Cobertura format, because GitLab CI supports it and can display the line coverage in the diff view of each merge request.
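The two coverage patterns can be checked locally against a sample log line before pushing. The summary line below is written in the style SimpleCov prints, with made-up values, so treat it as an assumption and compare it with your own job log. The second pattern is the one from the rspec:coverage job, with the dot escaped and a capture group added only to print the number.

```ruby
# Sample summary line in the style of SimpleCov's console output (values made up).
summary = 'Coverage report generated for RSpec to /builds/app/coverage. ' \
          '812 / 1000 LOC (81.2%) covered.'

# Pattern set on the parallel rspec job: meant to never match real output.
disabled_pattern = /COVERAGE DISABLED \(\d+.\)/

# Pattern set on the rspec:coverage job: matches the merged summary.
merged_pattern = /\((\d+\.\d+)%\) covered/

puts summary.match?(disabled_pattern) # false
puts summary[merged_pattern, 1]       # 81.2
```

If the second line prints nothing for your real log output, the pattern needs adjusting to the exact wording your formatter prints.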
3. Create the merge script
#!/usr/bin/env ruby
# bin/merge_coverage_reports
#
require 'bundler/inline'

gemfile do
  source 'https://rubygems.org'
  gem 'simplecov'
  gem 'simplecov-cobertura'
  # gem 'rspec_junit_formatter'
end

require 'simplecov'
require 'simplecov-cobertura'

# write the merged result as Cobertura XML
SimpleCov.formatter = SimpleCov::Formatter::CoberturaFormatter
# merge all per-worker result sets into one report
SimpleCov.collate Dir['coverage/.resultset*.json']
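To illustrate why merging matters, here is a simplified sketch of what combining partial line coverage means. This is not SimpleCov's implementation; real result sets carry more data, and the file names, hit counts, and helper names below are made up. Each array holds one entry per source line: nil for a non-executable line, otherwise the number of hits.

```ruby
# Merge two line-hit arrays: nil stays nil only if both sides are nil,
# otherwise the hit counts are summed.
def merge_lines(left, right)
  length = [left.size, right.size].max
  Array.new(length) do |index|
    a = left[index]
    b = right[index]
    a.nil? && b.nil? ? nil : a.to_i + b.to_i
  end
end

# Merge per-file coverage hashes from several workers.
def merge_coverage(reports)
  reports.each_with_object({}) do |report, merged|
    report.each do |file, lines|
      merged[file] = merged.key?(file) ? merge_lines(merged[file], lines) : lines.dup
    end
  end
end

# Percentage of executable lines that were hit at least once.
def percent_covered(coverage)
  relevant = coverage.values.flatten.compact
  return 0.0 if relevant.empty?

  100.0 * relevant.count(&:positive?) / relevant.size
end

worker_1 = { 'app/models/user.rb' => [1, 1, 0, nil, 0] }
worker_2 = { 'app/models/user.rb' => [1, 0, 2, nil, 0],
             'app/models/order.rb' => [nil, 3, 0] }

merged = merge_coverage([worker_1, worker_2])
puts percent_covered(worker_1)        # 50.0
puts percent_covered(worker_2)        # 50.0
puts percent_covered(merged).round(2) # 66.67
```

Each worker on its own reports 50%, while the merged result is higher, which is why the per-worker numbers are hidden from GitLab and only the merged report is parsed.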