We moved all of our user documentation from Author-it to MediaWiki a few releases ago. At that point, we translated only a part of our documentation to Japanese – the help pages for detected issues. For these wiki pages, we used MediaWiki language templates to display language links at the bottom, and we copied-and-pasted the translated text.
For our most recent release, we expanded the translation effort. This meant more copy-and-paste – from the wiki to Microsoft Word, to send to the translator, and then from Word to the wiki, when we received the translated text.
We discovered that the language templates don’t work when the page title contains a backslash, so we had to change some page titles – for example, some of our page titles include “C/C++”. In MediaWiki, changing a page title means manually editing all pages that link to that page – not so fun.
Recently, one of our product managers (who’s been doing double-duty as localization coordinator – a.k.a. Copy-and-Paster Extraordinaire) said that he needed to get a quote on translating the entire wiki into Japanese. Warning bells went off in my head. I wondered:
- How do we ensure that shared content is translated only once?
- How do we get the material out of the wiki to send to the translator for a quote and translation?
- Where do we put the translated text?
- How well does our documentation lend itself to translation?
Handling shared content
We use MediaWiki templates to “single-source” documentation. Much as in traditional documentation platforms, wikis allow you to reuse content. For instance, we use a template for the current release number, so that we have to edit it only once per release. We also use templates for information that’s identical across multiple Klocwork tools – from phrases to multiple paragraphs.
Getting the content out of the wiki – attempt #1
Clearly, copy-and-paste for 435 wiki pages plus templates = carpal tunnel syndrome.
Given that we were in the last days of our release, my brain was perhaps not at its most efficient. We use the MediaWiki Book extension, so that we (and our users) can create collections of pages that can be downloaded as PDF. To get a translation quote, I decided to create a massive PDF.
This was no easy task. I don’t recommend it, so I won’t explain the torture involved. Plus, the resultant PDF did not handle the shared text, so the word count was not accurate.
Getting the content out of the wiki – attempt #2
Eventually, once our release was out the door and my head cleared, I did some more reading and investigated MediaWiki’s XML export feature.
First, I used the MediaWiki special page “All Pages” to display a list of all of the pages in the Main namespace. This list is displayed in three columns over several screens, so I pasted all of the page names into an Excel spreadsheet. Then I created a single column of page titles. Next, I copied this list from Excel into the MediaWiki page Special:Export. This page requires only a straight list of page titles, one per line, without the surrounding double square brackets, so the copy-and-paste from Excel worked perfectly. I chose to include only the current revision, not the full history. I chose to include templates (shared text). And I chose to save to file.
To my great relief, the XML file was very readable, and shared text was included only once. The translator provided a second quote and said that they felt more confident with the XML file than with the earlier PDF.
So, we’d taken care of points 1 and 2: We got the material out of the wiki for a quote and translation, and ensured that shared content would be translated only once.
A medium for translated content
Now, what to do with the translated text? The only sane option would be to use MediaWiki’s XML import feature. To do this, we’d need to abandon our current model of having the Japanese pages alongside the English in the same wiki. Instead, we’d need a separate Japanese wiki, where we could simply import the translated XML file. Changing our translation model also means changing how the Japanese documentation is integrated into our software.
Last and most painful: how well does our documentation lend itself to translation? My research and common sense told me that we have some work to do. For example, to make our documentation more engaging and user-friendly, we’ve adopted a casual, conversational, and (some might say) humorous style. This can make translation tricky, but it can also be problematic for ESL readers. And what we think is funny here in Canada may make no sense, or may be just plain annoying, across the pond. I’ll write more about style issues and translation in a future post.