Wiki translation woes

on Dec 7, 10 • by Helen Abbott • with 4 Comments

We moved all of our user documentation from Author-it to MediaWiki a few releases ago. At that point, we translated only part of our documentation into Japanese – the help pages for detected issues. For these wiki pages, we used MediaWiki language templates to display language links at the bottom, and we copied and pasted the translated text.

[Image: MediaWiki's language templates]

For our most recent release, we expanded the translation effort. This meant more copy-and-paste – from the wiki to Microsoft Word, to send to the translator, and then from Word to the wiki, when we received the translated text.

We discovered that the language templates don’t work when the page title contains a forward slash, so we had to change some page titles – for example, the ones that include “C/C++”. In MediaWiki, changing a page title means manually editing all pages that link to that page – not so fun.

Recently, one of our product managers (who’s been doing double-duty as localization coordinator – a.k.a. Copy-and-Paster Extraordinaire) said that he needed to get a quote on translating the entire wiki into Japanese. Warning bells went off in my head. I wondered:

  • How do we ensure that shared content is translated only once?
  • How do we get the material out of the wiki to send to the translator for a quote and translation?
  • Where do we put the translated text?
  • How well does our documentation lend itself to translation?

Handling shared content

We use MediaWiki templates to “single-source” documentation. Much as in traditional documentation platforms, wikis allow you to reuse content. For instance, we use a template for the current release number, so that we have to edit it only once per release. We also use templates for information that’s identical across multiple Klocwork tools – from phrases to multiple paragraphs.

Getting the content out of the wiki – attempt #1

Clearly, copy-and-paste for 435 wiki pages plus templates = carpal tunnel syndrome.

Given that we were in the last days of our release, my brain was perhaps not at its most efficient. We use the MediaWiki Book extension, so that we (and our users) can create collections of pages that can be downloaded as PDF. To get a translation quote, I decided to create a massive PDF.

This was no easy task. I don’t recommend it, so I won’t explain the torture involved. Plus, the resultant PDF did not handle the shared text, so the word count was not accurate.

Getting the content out of the wiki – attempt #2

Eventually, once our release was out the door and my head cleared, I did some more reading and investigated MediaWiki’s XML export feature.

First, I used the MediaWiki special page “All Pages” to display a list of all of the pages in the Main namespace. This list is displayed in three columns over several screens, so I pasted all of the page names into an Excel spreadsheet. Then I created a single column of page titles. Next, I copied this list from Excel into the MediaWiki page Special:Export. This page requires only a straight list of page titles, one per line, without the surrounding double square brackets, so the copy-and-paste from Excel worked perfectly. I chose to include only the current revision, not the full history. I chose to include templates (shared text). And I chose to save to file.

[Image: MediaWiki's XML export]
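
For anyone who would rather script the copy-and-paste step, here is a minimal Python sketch of the same choices. It only prepares the form data and sends nothing. The wiki address is hypothetical, and the field names (pages, curonly, templates, wpDownload) are the ones I understand Special:Export to accept, so treat them as an assumption and check them against your own MediaWiki version.

```python
from urllib.parse import urlencode

# Hypothetical wiki address, used only for illustration.
WIKI_INDEX = "https://wiki.example.com/index.php"


def flatten_titles(pasted: str) -> list[str]:
    """Turn pasted page names into a clean list, one title per entry."""
    titles = []
    for line in pasted.splitlines():
        # Cells copied out of a spreadsheet arrive tab-separated.
        for cell in line.split("\t"):
            title = cell.strip().removeprefix("[[").removesuffix("]]").strip()
            if title and title not in titles:
                titles.append(title)
    return titles


def build_export_form(titles: list[str]) -> dict[str, str]:
    """Form fields mirroring the export choices described above.

    The field names are assumed from the Special:Export form.
    """
    return {
        "title": "Special:Export",
        "pages": "\n".join(titles),  # one title per line, no [[ ]]
        "curonly": "1",              # current revision only, no history
        "templates": "1",            # include templates (shared text)
        "wpDownload": "1",           # offer the result as a file
    }


titles = flatten_titles("[[Installing]]\tUpgrading\n\nTroubleshooting ")
form = build_export_form(titles)
body = urlencode(form)  # what a POST to WIKI_INDEX would carry
print(len(titles), "titles,", len(body), "bytes of form data")
```

Sending the body as a POST to the wiki's index.php should return the same XML file that the special page offers for download, assuming the wiki allows export for your account.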

To my great relief, the XML file was very readable, and shared text was included only once. The translator provided a second quote and said that they felt more confident with the XML file than with the earlier PDF.
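
If you want a rough word count of your own before asking for a quote, the export file is easy to walk with a script. The sketch below is not what we did. It assumes the usual export layout of page elements that each hold a title and a revision with a text element, and the sample data is made up. Wiki markup such as template names is counted as words, so read the result as an estimate only.

```python
import re
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Drop the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def word_counts(export_xml: str) -> dict[str, int]:
    """Map each page title in an export file to a rough word count."""
    counts = {}
    root = ET.fromstring(export_xml)
    for page in root.iter():
        if local_name(page.tag) != "page":
            continue
        title, text = "", ""
        for node in page.iter():
            name = local_name(node.tag)
            if name == "title":
                title = node.text or ""
            elif name == "text":
                text = node.text or ""
        counts[title] = len(re.findall(r"\w+", text))
    return counts


# Made-up sample: one page that calls a template, plus the template itself.
SAMPLE = """<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.4/">
  <page>
    <title>Installing</title>
    <revision><text>Run the installer. {{Release}}</text></revision>
  </page>
  <page>
    <title>Template:Release</title>
    <revision><text>the current release</text></revision>
  </page>
</mediawiki>"""

counts = word_counts(SAMPLE)
print(counts, "total:", sum(counts.values()))
```

Because each template appears as its own page in the export, its words are counted once no matter how many pages call it, which is the behavior the PDF could not give us.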

So, we’d taken care of the first two questions: shared content would be translated only once, and we had a way to get the material out of the wiki for a quote and translation.

A medium for translated content

Now, what to do with the translated text? The only sane option would be to use MediaWiki’s XML import feature. To do this, we’d need to abandon our current model of having the Japanese pages alongside the English in the same wiki. Instead, we’d need a separate Japanese wiki, where we could simply import the translated XML file. Changing our translation model also means changing how the Japanese documentation is integrated into our software.

Documentation style

Last and most painful: how well does our documentation lend itself to translation? My research and common sense told me that we have some work to do. For example, to make our documentation more engaging and user-friendly, we’ve adopted a casual, conversational, and (some might say) humorous style. This can make translation tricky, and it can also be problematic for ESL readers. And what we think is funny here in Canada may make no sense, or may be just plain annoying, across the pond. I’ll write more about style issues and translation in a future post.

4 Responses to Wiki translation woes

  1. CH says:

    Nice article! I checked out your wiki and it looks awesome! It’s apparent that you guys have spent a lot of time configuring it. I’m a bit curious about how you solved the release of new software versions. I can see that you have divided your content into subpages and that you archive old versions of your manual in such subpages. But how do you copy your content? Special:Export/import in the same way as when you exported the XML files?

    • Kevin Welsh says:

      Hi CH,

      My name is Kevin and I’m part of the Documentation team here at Klocwork. Basically every time we have a new release, we create an entirely new wiki that is a clone of the previous version. Then we take that version and begin making edits that will be a part of the next release. This is likely the quickest way to handle it as opposed to importing all of the XML files. I give a lot of credit to our IT department for help with that part. The Japanese content is handled in a similar way. We start with a fresh wiki that has content from the previous release, then we only import content that has been changed and re-sent for translation.
