Suffering is only suffering if you learn nothing. When you learn nothing and needlessly perpetuate suffering, that’s where misery comes in. We prefer to be misery-free around here.
In the spirit of helping others avoid needless pain, I launched this Lessons learned from localization series. In Part 1, we explored documentation pain and coping strategies.
For Part 2, I talked to Russ Sherk, a developer here at Klocwork, who works on our web tools and handles product licensing, to see if he was happy to share some of his survival strategies from our Japanese localization push.
Maybe “happy” wasn’t the correct word, but he’s still here, so they must have worked. Here are the questions he faced and his approaches (in his words) to each:
What to translate?
Mark what not to translate
Each team had thousands of strings in their sections of code. Each string had to be inspected individually to determine whether translation was required, a b-i-i-i-i-g task, and not one you want to repeat. We decided to use annotations to mark strings as non-translatable, so we wouldn’t have to re-analyze them later.
Some types of strings were easily identified as non-translatable (HashMap keys, CSS class names, etc.). Others were not so easy. Error messages were tricky to handle, but working through them left us with better error handling in the product.
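As a rough illustration of the annotation idea, here is a minimal Java sketch. The `@NonTranslatable` annotation, the `BuildView` class, and the `TranslationAudit` helper are all hypothetical names invented for this example; the post does not say which annotation the team used. (IntelliJ IDEA's inspection recognizes the `@NonNls` annotation from the JetBrains annotations library, which may be a more practical choice in a real project.)

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

// Hypothetical marker annotation: "this string has been reviewed and must not be translated".
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface NonTranslatable {}

// Hypothetical class with a mix of internal and user-visible strings.
class BuildView {
    @NonTranslatable
    static final String CSS_CLASS = "build-row";   // CSS class name, never shown to users

    @NonTranslatable
    static final String MAP_KEY = "buildId";       // HashMap key, never shown to users

    static final String EMPTY_LABEL = "No builds found.";  // user-visible, still needs extraction
}

class TranslationAudit {
    // Returns the names of static String fields that have not been marked as non-translatable.
    static List<String> unmarkedStringConstants(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Field field : type.getDeclaredFields()) {
            boolean isStringConstant = field.getType() == String.class
                    && Modifier.isStatic(field.getModifiers());
            if (isStringConstant && !field.isAnnotationPresent(NonTranslatable.class)) {
                names.add(field.getName());
            }
        }
        return names;
    }
}
```

Calling `TranslationAudit.unmarkedStringConstants(BuildView.class)` lists only `EMPTY_LABEL`, so strings that have already been reviewed drop out of later audits.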
Error message handling
There are generally two types of errors: those that are expected and handled in the code, and those that are unexpected. Our strategy was to draw a clear line between the two and translate only the user-visible messages. Unexpected error messages were not translated, but the user was still notified about the problem in their language. The other side of this is that we, being unable to read Japanese, needed error messages in our language, so we decided that all messages in the log files would be in English. Typically, the user sends you a screenshot or the error text during the troubleshooting process, which is not much use if you can’t match the message they send you with the logs. To solve this, we added a reference ID to all error messages:
“結果のビルドがありません。 (log entry id 1349102073932)”
In the log, the detailed error message and relevant stack traces, in English, have a matching ID.
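The reference-ID scheme might be sketched like this in Java. Everything here is an assumption made for illustration: the `ErrorReporter` class, the in-memory `log` list standing in for a real log file, the message table, and the choice to seed IDs from the current time (the example ID above merely looks like a millisecond timestamp).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

class ErrorReporter {
    // Hypothetical message table; a real product would load this from its localization files.
    private static final Map<String, String> UNEXPECTED_ERROR = Map.of(
            "en", "An unexpected error occurred.",
            "ja", "予期しないエラーが発生しました。");

    // Assumption: IDs start at the current time in milliseconds and then count upward.
    private final AtomicLong nextId = new AtomicLong(System.currentTimeMillis());

    // Stand-in for the English log file.
    final List<String> log = new ArrayList<>();

    // Logs the English detail under a fresh ID and returns a localized message carrying the same ID.
    String report(Throwable error, Locale userLocale) {
        long id = nextId.getAndIncrement();
        // A real logger would also write the stack trace here.
        log.add("ERROR [log entry id " + id + "] " + error);
        String text = UNEXPECTED_ERROR.getOrDefault(
                userLocale.getLanguage(), UNEXPECTED_ERROR.get("en"));
        return text + " (log entry id " + id + ")";
    }
}
```

A support engineer who cannot read the localized text can still search the log for the number in the user's screenshot and find the matching English entry.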
Once the lines were drawn, and the non-translatable stuff was annotated, the big task of doing the translation could begin.
How do I handle the scope and monotony?
Use a framework and git ‘er done
In a product with many components written in different languages, you can’t just use one strategy or framework to handle all the translations. Throw in interaction between these systems and it becomes more complex. When it comes down to it, once the frameworks are in place (and carefully engineered to work together), the monotonous task of extracting strings is just that–monotonous. It is best to take the advice of Larry the Cable Guy and “Git ‘er done!” Extract the strings to your framework (JSON, properties, annotated interface, whatever).
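For the Java side, extraction to a properties file could look something like the sketch below. The `Messages` class, the key name, and the `!key!` fallback are made up for this example and are not taken from the product; the properties text is passed in as a string only to keep the sketch self-contained.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.text.MessageFormat;
import java.util.Properties;

// Hypothetical lookup helper: code refers to keys, and the text lives in a per-language properties file.
class Messages {
    private final Properties strings = new Properties();

    // Takes the contents of a .properties file for one language.
    Messages(String propertiesText) {
        try {
            strings.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Looks up the pattern for a key and fills in {0}, {1}, ... placeholders.
    String get(String key, Object... args) {
        String pattern = strings.getProperty(key);
        if (pattern == null) {
            return "!" + key + "!";  // make missing keys obvious in the UI
        }
        return MessageFormat.format(pattern, args);
    }
}
```

With `new Messages("build.deleted=Build {0} was deleted.\n")`, the call `get("build.deleted", "nightly")` produces `Build nightly was deleted.`, and the hard-coded English string disappears from the source. Note that `MessageFormat` treats single quotes specially, so apostrophes in patterns need to be doubled.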
Use tools or create auditing scripts
Because of the scale of the initial localization task, I recommend using tools or creating auditing scripts to find all the strings and work through them. In our Java code, we used IntelliJ IDEA’s internationalization inspection to work through most of them. I also created some scripts to build an archive of all the translatable files that would be sent to the translator. These scripts were never used again, but the alternative (manually collecting files that need translation) is error-prone. The same goes for applying the translated files back to the source repository.
How do I know when I’m done?
Accept undoneness but use good systems
If you have an actively developed product, the localization process will never be done. Each new label, help description and error message will need to be translated. A good system needs to be in place to ensure that each new string is sent to the translator, translated, and makes it back into the product. As Kevin mentioned, we used a simple wiki table after the bulk translations were complete. This worked well for me. Add the new lines to the wiki table, wait for the translation-complete notification, and update the localization files with the translation. In this maintenance phase, I find it helpful to annotate the untranslated strings with ‘TODO’ items and periodically run a script to scan for untranslated items before milestone builds and release builds.
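A scan for untranslated items can be quite small. The sketch below assumes a convention in which a not-yet-translated value in a localized properties file starts with the text `TODO`; that convention, the class name, and the keys are hypothetical, since the post does not describe how the TODO items were actually written.

```java
import java.util.List;
import java.util.Properties;
import java.util.stream.Collectors;

class UntranslatedScan {
    // Assumed convention: untranslated values are prefixed with this marker.
    static final String MARKER = "TODO";

    // Returns the keys whose values still carry the marker, sorted for stable output.
    static List<String> untranslatedKeys(Properties localized) {
        return localized.stringPropertyNames().stream()
                .filter(key -> localized.getProperty(key).startsWith(MARKER))
                .sorted()
                .collect(Collectors.toList());
    }
}
```

Run against each localized file before a milestone or release build, a non-empty result is the signal that some strings have not yet made the round trip through the translator.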
Thanks, Russ! Stay tuned for our final installment where we learn how the test team faced localization torment and lived to tell the tale.