Gitlab + Hetzner S3: Migrate Omnibus Gitlab CE storages to new Hetzner S3

9th December 2024 – 1787 words

Recently, Hetzner announced their new S3-compatible storage service. We have been using Gitlab CE as our main development back-end since their early 1.0 versions - as an issue tracker, for customer inquiries, as a container registry, for CI/CD with runners, as company chat via Mattermost, and of course as a git repository. For the last 5 years or so, we have been running Gitlab as the Omnibus installation, meaning all Gitlab upgrades are done by the package manager. As long as you follow the upgrade path, upgrades are painless - we never had a single issue. One problem, though, was the ever-increasing disk space requirement, especially after slowly adopting Docker for deployment, like using Kamal in a recent project. Until now, our solution was to attach a Hetzner cloud volume to the box.

That works, but it gets a little more expensive than we’d hope, and handling the data and backups is not as easy. So we decided to move some of the data to Hetzner’s newly (2024/12) launched S3 storage service. Most of the steps should also apply to other S3-compatible storage services, like Minio, Backblaze, etc.

General Notes

  • Gitlab’s Omnibus administration config is done via the /etc/gitlab/gitlab.rb file. All settings in this post go there; afterwards, always run gitlab-ctl reconfigure to apply the changes (see the sketch after this list).
  • Create buckets and credentials in Hetzner’s S3 service - and make sure to read the pricing. Billing starts at $5/month for 1 TB, so you will be billed for 1 TB even if you use less. On the other hand, you can create any number of buckets.
  • The Hetzner S3 credentials can only be scoped at the project level, so it’s not possible to create different access keys for different buckets within the same project.
  • Gitlab allows configuring a different storage back-end for each of its data types - build artifacts, container registry, LFS storage, and more. That allows us to migrate each type of data individually and verify it before moving on.
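
The change cycle for everything below looks like this (a minimal sketch, assuming a root shell on the Gitlab box):

editor /etc/gitlab/gitlab.rb   # adjust the settings in question
gitlab-ctl reconfigure         # apply them
gitlab-ctl status              # verify all services came back up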

Moving the container registry

That was our biggest storage consumer, so we started here. If your container registry is very busy (write-wise), it might be a good idea to stop pushing to it during the migration to stay consistent (see the guide from Gitlab). If you are like us and know exactly what’s deploying at the moment, or you are a small team and can coordinate, you can do it live.
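
If you do want to freeze writes, the registry has a read-only maintenance mode you can enable in gitlab.rb for the duration of the copy (a sketch - the rootdirectory must match your local registry path, and Gitlab’s migration guide has the details):

# gitlab.rb - temporarily, while mirroring
registry['storage'] = {
  'filesystem' => { 'rootdirectory' => '/var/opt/gitlab/registry' },  # adjust to your path
  'maintenance' => { 'readonly' => { 'enabled' => true } }
}

Run gitlab-ctl reconfigure to apply, and remove the maintenance block again once you switch over to S3.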

1. Download mc + configure Hetzner credentials

We need a tool to move the existing data to S3. After wasting some time trying to shoehorn the AWS CLI into accessing non-AWS storage, we went with the Minio Client (mc). You can download it as a single dependency-free binary for the migration:

wget https://dl.min.io/client/mc/release/linux-amd64/mc -O /usr/local/bin/mc
chmod +x /usr/local/bin/mc

mc alias set default-nbg1 https://nbg1.your-objectstorage.com
# prompts for the access key and secret key

That will create a ~/.mc/config.json file with the credentials. You can also edit it manually:

{
  "version": "10",
  "aliases": {
    "default-nbg1": {
      "url": "https://nbg1.your-objectstorage.com",
      "accessKey": "xxxx",
      "secretKey": "xxx",
      "api": "S3v4",
      "path": "off"
    },
    ...
  }
}

Put in the accessKey and secretKey from Hetzner, use the endpoint URL for the region the bucket lives in, and do not put the bucket name into the URL. Set path to off.

# check if connection is working
mc ls default-nbg1
# should list all buckets available to the key

cd /var/opt/gitlab/registry/docker
mc mirror registry/ default-nbg1/mycompanies-gitlab-container-registry/docker/registry

Make sure the folder your existing registry data is uploaded to is BUCKET_NAME/docker/registry; inside, there should be a v2 folder.
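
Optionally, before switching the configuration over, let mc compare the local tree and the bucket; mc diff lists objects that differ in size or are missing on either side:

# still in /var/opt/gitlab/registry/docker
mc diff registry/ default-nbg1/mycompanies-gitlab-container-registry/docker/registry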

2. Adjusting gitlab.rb

Edit /etc/gitlab/gitlab.rb and change the registry storage settings. Make sure there is no other registry['storage'] assignment below it, as it would override this one.

# gitlab.rb
registry_external_url 'https://registry.mycompany.com'

### Settings used by GitLab application
gitlab_rails['registry_enabled'] = true
registry['storage'] = {
  's3' => {
    'accesskey' => 'xxx',
    'secretkey' => 'xxx',
    'bucket' => 'mycompanies-gitlab-container-registry',
    'region' => 'nbg1',
    'regionendpoint' => 'nbg1.your-objectstorage.com',
    'pathstyle' => false
  }
}
gitlab-ctl reconfigure

Check the container registry of any project that uses it to verify all the images are there.
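
A quick end-to-end test is pulling a known image through the registry (the group/project/image path here is hypothetical, use one of yours):

docker pull registry.mycompany.com/mygroup/myproject/app:latest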

LFS + Artifacts

The other storages are configured very similarly to each other; Gitlab even provides migrate tasks for them.

Here for lfs + artifacts:

# gitlab.rb
# gitlab.rb is plain Ruby, so we can define the shared Hetzner credentials
# once and reuse them in all the connection hashes below
hetzner_s3_config = {
  'accesskey' => 'xxx',
  'secretkey' => 'xxx',
  'regionendpoint' => 'nbg1.your-objectstorage.com'
}

gitlab_rails['lfs_enabled'] = true
gitlab_rails['lfs_storage_path'] = "/mount/lfs/lfs-objects"
gitlab_rails['lfs_object_store_enabled'] = true
gitlab_rails['lfs_object_store_proxy_download'] = false
gitlab_rails['lfs_object_store_remote_directory'] = "mycompany-gitlab-objects/lfs"
gitlab_rails['lfs_object_store_connection'] = {
   'provider' => 'AWS',
   'aws_access_key_id' => hetzner_s3_config['accesskey'],
   'aws_secret_access_key' => hetzner_s3_config['secretkey'],
   'path_style' => false,
   'endpoint' => "https://#{hetzner_s3_config['regionendpoint']}",
}


### Job Artifacts
gitlab_rails['artifacts_enabled'] = true
gitlab_rails['artifacts_object_store_enabled'] = true
gitlab_rails['artifacts_object_store_remote_directory'] = "mycompany-gitlab-objects/artifacts"
gitlab_rails['artifacts_object_store_connection'] = {
   'provider' => 'AWS',
   'aws_access_key_id' => hetzner_s3_config['accesskey'],
   'aws_secret_access_key' => hetzner_s3_config['secretkey'],
   'path_style' => false,
   'endpoint' => "https://#{hetzner_s3_config['regionendpoint']}",
}
gitlab-ctl reconfigure

gitlab-rake gitlab:lfs:migrate
gitlab-rake gitlab:artifacts:migrate
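
Gitlab also ships check tasks that verify the stored files; they make a nice sanity check after the migration:

gitlab-rake gitlab:lfs:check
gitlab-rake gitlab:artifacts:check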

Mattermost

We use Mattermost as our company chat. It’s part of the Omnibus installation as well, so we can move its data to S3, too.

cd /var/opt/gitlab/mattermost

mc mirror data/ default-nbg1/mycompany-gitlab-objects/mattermost-data

Then configure it in the System Console of Mattermost under File Storage.

  • File Storage System: Amazon S3
  • Amazon S3 Bucket: mycompany-gitlab-objects
  • Amazon S3 Path Prefix: mattermost-data
  • Amazon S3 Region: EMPTY
  • Amazon S3 Access Key: xxx
  • Amazon S3 Endpoint: nbg1.your-objectstorage.com
  • Amazon S3 Secret Key: xxx
  • Secure: Yes
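
The File Storage page offers a test-connection button; beyond that, upload a test file in any channel and verify it lands under the prefix (the path layout inside is Mattermost’s own date-based scheme):

mc ls --recursive default-nbg1/mycompany-gitlab-objects/mattermost-data/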

Backup

If you like, you can also store the daily backup in Hetzner’s S3. Just make sure to also replicate it somewhere else, to avoid scenarios where you lose access to Hetzner or their data center burns down.
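
For example, a nightly cron job could mirror the backup bucket to a second provider (other-s3 is a hypothetical second mc alias, configured like the Hetzner one):

# /etc/cron.d/gitlab-backup-replica (sketch)
0 6 * * * root /usr/local/bin/mc mirror --overwrite default-nbg1/mycompany-gitlab-backup other-s3/gitlab-backup-replica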

gitlab_rails['backup_upload_connection'] = {
  'provider' => 'AWS',
  'endpoint' => "https://#{hetzner_s3_config['regionendpoint']}",
  'aws_access_key_id' => hetzner_s3_config['accesskey'],
  'aws_secret_access_key' => hetzner_s3_config['secretkey'],
  'path_style' => false,
}
gitlab_rails['backup_upload_remote_directory'] = 'mycompany-gitlab-backup'

Test the backup upload by creating a smaller backup that skips the largest components:

gitlab-backup create SKIP=registry,artifacts,lfs,builds,repositories
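
The resulting tarball should then show up in the bucket:

mc ls default-nbg1/mycompany-gitlab-backup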

CI-runner cache

Depending on how you configure your runners:

# 1. as arguments to gitlab-runner

gitlab-runner register --executor docker \
  --cache-s3-server-address "${s3_server}" \
  --cache-s3-access-key "${s3_accesskey}" \
  --cache-s3-secret-key "${s3_secretkey}" \
  --cache-s3-bucket-name "${s3_bucketname}" \
  --cache-type s3 \
  --cache-path 'runners' \
  --cache-shared

  • where s3_server is again nbg1.your-objectstorage.com

Or by modifying the runner’s /etc/gitlab-runner/config.toml:

[[runners]]
  name = "hcworker-0"
  url = "..."
  id = 36
  token = "..."
  token_obtained_at = ...
  token_expires_at = ...
  executor = "docker"
  [runners.cache]
    MaxUploadedArchiveSize = 0
    Type = "s3"
    Path = "runner"
    Shared = true
    [runners.cache.s3]
      ServerAddress = "nbg1.your-objectstorage.com"
      AccessKey = "xx"
      SecretKey = "xx"
      BucketName = "mycompany-gitlab-worker-cache"
  [runners.docker]
    ...
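
The runner should pick up config.toml changes on its own, but restarting doesn’t hurt; after the next job that uses cache:, the archives should appear under the configured path prefix in the bucket:

gitlab-runner restart
mc ls --recursive default-nbg1/mycompany-gitlab-worker-cache/runner/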

Other storages

Gitlab has a couple more storage types, such as:

  • uploads (screenshots etc. in Issues)
  • merge request diffs
  • packages
  • pages
  • dependency proxy

Integrating them should work very similarly; just check the documentation for the correct settings.

If you are starting a completely new Gitlab CE server, it might be a good idea to configure all the storages from the beginning, so you don’t have to migrate them later. That can be done via a single configuration block, the so-called consolidated object storage configuration.
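
A hedged sketch of what that consolidated block looks like in gitlab.rb - the bucket names are examples, the consolidated form wants a separate bucket per object type, and the Gitlab docs list all available types:

# gitlab.rb - consolidated object storage (sketch)
gitlab_rails['object_store']['enabled'] = true
gitlab_rails['object_store']['connection'] = {
  'provider' => 'AWS',
  'endpoint' => 'https://nbg1.your-objectstorage.com',
  'region' => 'nbg1',
  'aws_access_key_id' => 'xxx',
  'aws_secret_access_key' => 'xxx',
  'path_style' => false
}
gitlab_rails['object_store']['objects']['artifacts']['bucket'] = 'mycompany-gitlab-artifacts'
gitlab_rails['object_store']['objects']['lfs']['bucket'] = 'mycompany-gitlab-lfs'
gitlab_rails['object_store']['objects']['uploads']['bucket'] = 'mycompany-gitlab-uploads'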

Bonus: Gitlab Backup encryption (and decryption)

You should probably encrypt the backup before uploading it to any provider. Gitlab supports AES256 SSE-C encryption (server-side encryption with a customer-provided key). You can set the key in gitlab.rb:

gitlab_rails['backup_encryption'] = 'AES256'
gitlab_rails['backup_encryption_key'] = 'someexactly32bytelengthkey' # must be exactly 32 bytes

If your key consists of binary data, you can also pass it Base64-encoded - the file is Ruby, after all:

require 'base64'
gitlab_rails['backup_encryption_key'] = Base64.decode64('somebase64encodedkeyof32bytelength')
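
A convenient way to generate such a key is openssl; 32 random bytes Base64-encode to exactly 44 characters:

openssl rand -base64 32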

Create a test backup. Afterwards, you should not be able to download it via the Hetzner console or directly via Minio without providing the key.

Now the interesting part: if you want to retrieve the file via the Minio mc client, you have to provide the key in a special format: either Base64- or hex-encoded. The key is 32 bytes long, so 64 characters in hex or 44 characters in Base64. Note: if using Base64, strip the = padding at the end! Otherwise you just get cryptic “SSE Error. SSE key is missing.” errors!

mc ls default-nbg1/company-gitlab-backup
mc cp --enc-c default-nbg1/company-gitlab-backup=Base64EncodedKeyWithoutEqualsAtTheEnd  "default-nbg1/company-gitlab-backup/1733820344_2024_12_10_17.6.1_gitlab_backup.tar" .
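
If you prefer the hex form instead, you can derive it from the Base64 key stored in gitlab.rb (assuming base64 and xxd are available on the box):

# decode the Base64 key and re-encode it as 64 hex characters
echo -n 'somebase64encodedkeyof32bytelength' | base64 -d | xxd -p -c 64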