TL;DR: I released a Ruby CLI tool that semi-automatically extracts raw strings from Ruby files, Slim and ERB views, and Vue-Pug templates. Read on for the background.
Recently, we started to internationalize the recruiter backend of our ATS (applicant tracking system) to also support English-speaking recruiters, a step that most growing Rails apps will face sooner or later. That’s why I want to share some tooling that I use (or built) to help you with this very first step: extracting ALL the strings in the app into the I18n space.
Additionally, we also use a bunch of Vue components (about 175) to add interactivity in many parts of the app. Having started lean with YAGNI in mind, internationalization for the internal users (the recruiters) was not a major concern in the first 2 years of running our ATS, so most strings around the app were plain German. In our case, I used some regex searches and found:
- 160 SLIM-Views with at least one string
- 175 Vue-Components
- a bunch of controller flash messages, page titles, breadcrumbs
- Models/Forms (custom error messages, enum-translations etc.)
- …and more business related classes with custom descriptions/titles (in our case, E-mail-template definitions with placeholders, states, events, actions etc.)
1. Slim - SlimKeyfy
Manually extracting keys from templates is tedious as well as error-prone: one will mix up keys quite easily, forget to save files, overwrite changes and so forth. This is why I went looking for tooling that helps with extraction. Luckily, I found SlimKeyfy first, which did an OK job of extracting most Slim keys. There were some problems with the tool, though: it could not parse parts of our Slim syntax, and some of the extraction regexps could be improved. This is why I forked the gem to add the missing functionality. Still, this leaves us with a bunch of Vue components and Ruby classes… But SlimKeyfy gave me the idea.
2. Ruby Classes - Using Ruby-parser
I had been reading a couple of posts about the awesome Parser gem, which allows rewriting Ruby code on the fly using hooks. This is what tools like RuboCop use under the hood. Using another great gem, tty-prompt, I cobbled together a small CLI that interactively asks the user about every “interesting” string in the given Ruby files and automatically extracts the chosen ones. The result was extremely satisfying, as all the code was still valid and the transformations were very precise. Even the extraction of heredocs and interpolated arguments worked perfectly!
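A minimal sketch of the rewriting idea, under stated assumptions: it uses Ripper from the Ruby standard library as a stand-in for the Parser gem so that it runs without dependencies, it rewrites every simple string without asking, and the method name `extract_strings` and the key naming scheme are hypothetical. The actual tool works on the Parser gem's AST and asks before each extraction.

```ruby
require "ripper"

# Rewrites simple quoted strings into I18n.t calls and collects the
# extracted texts. Interpolated strings, heredocs, symbols and hash
# labels do not match the three-token pattern and are left untouched.
def extract_strings(source, prefix)
  tokens = Ripper.lex(source) # [[line, col], type, text, state] per token
  translations = {}
  output = +""
  i = 0
  while i < tokens.size
    types = tokens[i, 3].map { |token| token[1] }
    if types == %i[on_tstring_beg on_tstring_content on_tstring_end]
      text = tokens[i + 1][2]
      # Hypothetical key scheme: prefix plus a slug of the text.
      slug = text.downcase.gsub(/[^a-z0-9]+/, "_").gsub(/\A_|_\z/, "")
      key = "#{prefix}.#{slug}"
      translations[key] = text
      output << %(I18n.t("#{key}"))
      i += 3
    else
      output << tokens[i][2] # copy every other token verbatim
      i += 1
    end
  end
  [output, translations]
end

source = <<~RUBY
  class UsersController
    def create
      flash[:notice] = "Benutzer gespeichert"
    end
  end
RUBY

rewritten, translations = extract_strings(source, "users")
puts rewritten
p translations
```

Working on tokens or AST nodes instead of plain regexes is what keeps the surrounding code valid: only the literal itself is swapped out, and everything else is copied through unchanged.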
3. Vue Extraction
This one is a little trickier… Most of the components are written in the Pug templating language, which is very similar to Slim. So I hacked the SlimKeyfy gem to also handle Vue-Pug files, which worked well, too. That means I could extract those in the same manner.
What about the Vue components that still use the default HTML-like templating? After fiddling (without success) with HTML parsers like Nokogiri, which had major showstoppers (Nokogiri will gulp the “@click” and similar attributes of Vue), I just decided to convert the remaining HTML templates to Pug. Luckily, there are several online APIs that can convert them easily and are about 98% correct, so I just built a small Ruby script to walk all the unconverted files, send them to the “API” and write the result back into a file. Afterwards, extraction worked the same as above.
The missing 2%: watch out if you use several self-closing custom tags next to each other. In one case, the converter nested them instead of keeping them as siblings, which a test picked up.
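Such a conversion script could look roughly like the following sketch. The converter is passed in as a callable because the online service is not named here; the lambda below is only a stand-in, and `convert_component`, `convert_all` and the `app/javascript` path are hypothetical names. The regex assumes the top-level template tag is written exactly as `<template>` on its own line.

```ruby
# Replaces a plain-HTML <template> block with a Pug one. Components that
# already declare lang="pug" do not match and are returned unchanged.
# The greedy match runs to the last </template>, so nested <template>
# tags (e.g. for slots) stay inside the converted body.
def convert_component(source, converter)
  source.sub(%r{<template>\n(.*)\n</template>}m) do
    pug = converter.call(Regexp.last_match(1))
    %(<template lang="pug">\n#{pug}\n</template>)
  end
end

# Walks all .vue files below dir and rewrites the ones that changed.
def convert_all(dir, converter)
  Dir.glob(File.join(dir, "**", "*.vue")).each do |path|
    source = File.read(path)
    converted = convert_component(source, converter)
    File.write(path, converted) unless converted == source
  end
end

# Stand-in converter; a real one would send the HTML to the conversion
# service and return the Pug it responds with.
fake_converter = ->(_html) { "div Hello" }

component = <<~VUE
  <template>
  <div>Hello</div>
  </template>
  <script>
  export default {}
  </script>
VUE

puts convert_component(component, fake_converter)
# Real run (path is an assumption): convert_all("app/javascript", real_converter)
```

Writing a file only when its content changed keeps the Git diff limited to the components that were actually converted.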
4. Manual effort required for the finishing touches
In the end, most apps have some exceptional usages that are not covered by the tooling above, but the tooling got me about 90% of the way. To find obvious omissions, I created a to-do list using a ripgrep/Ack/Ag regex search. Finding all strings that begin with an uppercase letter, plus heredocs, is a very good heuristic for untranslated strings:
rg '"[A-Z][a-z]' app/ >> todos.txt
rg "'[A-Z][a-z]" app/ >> todos.txt
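The same heuristic can be sketched in plain Ruby, for example when the check should also cover heredoc openers and German umlauts, which the two commands above do not. The pattern and the method name `candidates_in` are illustrative assumptions, and like any heuristic this will also flag harmless lines such as SQL heredocs.

```ruby
# A capitalized word right after a quote, or a heredoc opener like <<~TEXT.
CANDIDATE = /["'][A-ZÄÖÜ][a-zäöüß]|<<[~-]?[A-Z_]+/

# Returns [line_number, stripped_line] pairs that look untranslated.
def candidates_in(source)
  source.each_line.with_index(1).filter_map do |line, number|
    [number, line.strip] if line.match?(CANDIDATE)
  end
end

# Walk the app folder and print a to-do list in a grep-like format.
Dir.glob("app/**/*.{rb,slim,erb,vue}").each do |path|
  source = File.read(path, encoding: "UTF-8").scrub
  candidates_in(source).each do |number, line|
    puts "#{path}:#{number}: #{line}"
  end
end
```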
Putting it together: Introducing ExtractI18n Gem
After having used these different tools, I decided to put it all together and to optimize and streamline the interaction. The result is a gem that does all the work above and can also convert ERB.
ExtractI18n is a Ruby command-line tool that, given a path to Ruby/Slim/Vue-Pug/ERB files, walks those and asks you for every string whether you want to extract it into a given YAML file (e.g. config/locales/unsorted.de.yml). Before every file system change you will be shown a diff at the end, so you can decide whether the tool messed up.
extract-i18n --locale de --yaml config/locales/unsorted.de.yml app/views/user
It uses SlimKeyfy code as the basis for the Slim/Vue extraction, HTML-Extractor for ERB, and the Ruby Parser approach mentioned above, all combined into a unified interface.
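To illustrate what “extracting into a YAML file” amounts to, here is a minimal sketch of merging one dotted key into a nested locale file. This is not the gem's actual implementation; `store_translation` and `add_to_yaml` are hypothetical helpers, and conflicts (a key that is already a plain string) are not handled.

```ruby
require "yaml"
require "tmpdir"

# Merges one dotted key into a nested locale hash, e.g. "users.index.title"
# becomes { "de" => { "users" => { "index" => { "title" => text } } } }.
def store_translation(tree, locale, key, text)
  *parents, leaf = key.split(".")
  node = (tree[locale] ||= {})
  parents.each { |part| node = (node[part] ||= {}) }
  node[leaf] = text
  tree
end

# Loads the YAML file if it exists, adds the key and writes it back.
def add_to_yaml(path, locale, key, text)
  tree = File.exist?(path) ? (YAML.load_file(path) || {}) : {}
  File.write(path, store_translation(tree, locale, key, text).to_yaml)
end

# Demo in a temporary directory so no real locale file is touched.
Dir.mktmpdir do |dir|
  path = File.join(dir, "unsorted.de.yml")
  add_to_yaml(path, "de", "users.index.title", "Benutzer")
  add_to_yaml(path, "de", "users.index.empty", "Keine Benutzer gefunden")
  puts File.read(path)
end
```

Collecting everything in one “unsorted” file first keeps the extraction step simple; sorting the keys into their final files can happen later.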
Sorting it out: I18n-tasks
EVERY Rails app that has a non-trivial amount of I18n strings should make use of the awesome i18n-tasks.
I ran i18n-tasks normalize -p repeatedly during the internationalization process to sort the freshly extracted keys into the right files. In the end, every major view/part of the app has its own I18n file, like config/locales/email_templates.LOCALE.yml, config/locales/admin.LOCALE.yml etc.
We also use the Rails I18n backend for storing the Vue frontend’s translations under the js. namespace. I wrote an article about that integration some time ago. To make i18n-tasks find usages of those Vue strings, you can also download and require my Vue scanner for i18n-tasks.
It is also worth checking the “health” with i18n-tasks health -l de (just focus on the extracted language, as the other language is still missing) to see whether some keys were missed, overwritten etc.
Granted, that will produce a huge Git diff once, but afterwards the extraction and I18n are fully managed by i18n-tasks.
Then, I exported all missing keys for the target language, used the Google-Translate feature to make a basic translation and generated a CSV with the suggested translation by filtering for the missing keys. After getting those fixed by a non-technical person, I can reimport it, too! Great tool!
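Conceptually, “missing keys” are the keys present in the base locale but absent from the target locale. The following sketch shows that comparison and a CSV for a translator. It only illustrates the idea and is not how i18n-tasks implements it; the method names and the column layout are assumptions.

```ruby
require "csv"

# Flattens { "a" => { "b" => "x" } } into { "a.b" => "x" }.
def flatten_keys(tree, prefix = nil)
  tree.each_with_object({}) do |(key, value), flat|
    full = [prefix, key].compact.join(".")
    if value.is_a?(Hash)
      flat.merge!(flatten_keys(value, full))
    else
      flat[full] = value
    end
  end
end

# CSV of keys that exist in the base locale but not in the target locale.
# Both hashes are expected without the locale root ("de" / "en").
def missing_keys_csv(base, target)
  have = flatten_keys(target)
  CSV.generate do |csv|
    csv << %w[key base translation]
    flatten_keys(base).each do |key, text|
      csv << [key, text, nil] unless have.key?(key)
    end
  end
end

german  = { "users" => { "title" => "Benutzer", "empty" => "Keine Benutzer" } }
english = { "users" => { "empty" => "No users" } }
puts missing_keys_csv(german, english)
```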
Stats
In the end, I extracted about 2,000 keys totalling about 120,000 characters, which is not too much. The Google Translate API via i18n-tasks took about 3 seconds to translate those. During this process, I touched almost every file of the app and produced a giant diff; here it is for the app/ folder only:
git diff origin/master --shortstat -- app
512 files changed, 4641 insertions(+), 5096 deletions(-)
But… with the tooling in place, producing a gem along the way and learning about the Ruby Parser transformer, it was much more fun than anticipated! :) In the future, when building new views, I will still write them out in German first and run this tool at the end, as the result is more correct than me copy-pasting stuff and messing up interpolations.