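Providing software for users in the whole world means providing it in multiple languages. This consists of two steps: internationalization (i18n), which prepares the software so that it can be translated, and localization (L10n), which is the actual translation into each language.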
The Open-Xchange platform offers several facilities to simplify both parts. The L10n part is mostly a concern for translators. Open-Xchange facilities for that consist mainly of using a well-established format for translations: GNU Gettext Portable Object (PO) files. This allows translators to use existing dedicated translation tools or a simple UTF-8-capable text editor.
The i18n part is what software developers will be mostly dealing with and is what the rest of this document describes.
The main part of i18n is the translation of various text strings which are
presented to the user. For this purpose, the Open-Xchange platform provides
the RequireJS plugin 'gettext'. Individual translation files are specified as
a module dependency of the form 'gettext!module_name'. The resulting module
API is a function which can be called to translate a string using the
specified translation files.
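A minimal sketch of such a module is shown here. The module name com.example/example matches the PO entries used in this section. The define stub and its pass-through gt function are hypothetical stand-ins, added only so that the snippet also runs outside the platform; in the UI, RequireJS and the gettext plugin provide both.

```javascript
// Hypothetical stand-in for RequireJS, so that the sketch runs on its own.
// In the platform, RequireJS resolves 'gettext!...' through the gettext plugin.
var modules = {};
function define(name, deps, factory) {
    var args = deps.map(function (dep) {
        // Assumption: without a translation file, gt returns its input unchanged.
        return dep.indexOf('gettext!') === 0
            ? function (text) { return text; }
            : undefined;
    });
    modules[name] = factory.apply(null, args);
}

define('com.example/example', ['gettext!com.example/example'], function (gt) {
    'use strict';
    // The string literal passed to gt is what becomes a msgid in ox.pot.
    return { greeting: gt('Hello, world!') };
});
```

The string is passed as a literal so that the build system can find it in the source code.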
After building the module, the file ox.pot in the project's root directory
will contain an entry for every call to one of the gettext functions:
#: apps/com.example/example.js:4
msgid "Hello, world!"
msgstr ""
After translation, the PO files in the directory i18n should contain the
translated entry:
#: apps/com.example/example.js:4
msgid "Hello, world!"
msgstr "Hallo, Welt!"
During the next build, the entries are copied from the central PO files
into individual translation files. In our example, this would be
apps/com.example/example.de_DE.js. Because of the added language ID,
translation files can usually have the same name as the corresponding main
module. Multiple related modules should use the same translation file to avoid
the overhead of too many small translation files.
Most modules will require more complex translations than can be provided by
a simple string lookup. To handle some of these cases, the gettext module
provides traditional methods in addition to being a callable function.
Other cases are handled by the build system.
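In most cases, the translated texts will not be static, but contain dynamic
values as parts of a sentence. The straightforward approach of translating
parts of the sentence individually and then using string concatenation to
compose the final text is a bad idea: word order and sentence structure differ
between languages, so separately translated fragments cannot always be
reassembled into a correct sentence.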
The solution is to translate entire sentences and then to use the gettext
function to insert dynamic values.
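A sketch of such a call, using the format string from the PO entry in this section, could look as follows. The gt function defined here is a hypothetical stand-in which skips the translation lookup and only substitutes positional specifiers, and the variables firstName and lastName are illustrative. The translator comments are assumed to be written as single-line // comments.

```javascript
// Hypothetical stand-in for the gettext module: it skips the catalog lookup
// and only substitutes positional specifiers such as %1$s or %2$d.
function gt(text) {
    var args = Array.prototype.slice.call(arguments, 1);
    return text.replace(/%(\d+)\$[sd]/g, function (match, position) {
        return String(args[Number(position) - 1]);
    });
}

var firstName = 'Ada', lastName = 'Lovelace'; // illustrative values

var message =
    //#. %1$s is the given name
    //#. %2$s is the family name
    //#, c-format
    gt('Welcome, %1$s, %2$s', firstName, lastName);
```

Because the specifiers are numbered, a translation can reorder %1$s and %2$s without any change to the code.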
Results in:
#. %1$s is the given name
#. %2$s is the family name
#, c-format
msgid "Welcome, %1$s, %2$s"
msgstr ""
As shown in the example, it is possible to add comments for translators by
starting a comment with "#. ". Such comments must be placed immediately before
the name of the variable which refers to the gettext module (in this case gt).
They can be separated by arbitrary whitespace and newlines, but not other
tokens. All such gettext calls should have comments explaining every format
specifier.
Comments starting with "#, " are meant for Gettext tools, which in the case of
"#, c-format", can ensure that the translator did not leave out or mistype any
of the format specifiers.
For the cases when the format string must be translated by one of
the functions described below, there is a dedicated format function
gettext.format which, except for debugging, is an alias for _.printf.
One of the most common i18n errors is forgetting to use a gettext function.
To catch this kind of error, the UI can be started with the hash parameter
"#debug-i18n=1". (Reloading of the browser tab is usually required for the
setting to take effect.)
In this mode, every translated string is marked with invisible Unicode
characters, and any DOM text without those characters gets reported on the
console. The gettext.format function then also checks that every parameter is
translated. This is the reason why _.printf should not be used for
user-visible strings directly.
Unfortunately, this method will also report any string which does not
actually require translation. Examples of such strings include user data,
numbers, strings already translated by the server, etc. To avoid filling
the console with such false positives, every such string must be marked by
passing it through the function gettext.noI18n.
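The following sketch imitates this behaviour. It is not the platform's implementation: makeGt is a hypothetical factory, and the zero-width space (U+200B) is only an assumed marker, since the invisible characters actually used are not named here.

```javascript
// Hypothetical stand-in for the debugging behaviour described above.
// Assumption: a zero-width space (U+200B) serves as the invisible marker;
// the characters used by the platform may differ.
var MARK = '\u200b';

function makeGt(debug) {
    function gt(text) {
        // A real implementation would look up the translation first.
        return debug ? MARK + text + MARK : text;
    }
    gt.noI18n = function (text) {
        // Marks the text as handled without changing its visible contents.
        return debug ? MARK + text + MARK : text;
    };
    return gt;
}

var userName = 'jdoe'; // user data, which must not be translated

var label = makeGt(false).noI18n(userName);     // normal mode: unchanged
var debugLabel = makeGt(true).noI18n(userName); // debug mode: marked
```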
This results in the strings being marked as 'translated' without actually
changing their visible contents. When not debugging, gettext.noI18n simply
returns its first parameter.
gettext Functions
Besides gettext.format and gettext.noI18n, there are several other functions
which are required to cover all typical translation scenarios.
Sometimes, the same English word or phrase has multiple meanings and must be
translated differently depending on context. To be able to tell
the individual translations apart, the method gettext.pgettext
('p' stands for 'particular') should be used instead of calling
gettext directly. It takes the context as the first parameter
and the text to translate as the second parameter.
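A sketch of two such calls, using the contexts and translations from the PO entries in this section, follows. The catalog object and the pgettext implementation are hypothetical stand-ins for a loaded German translation file, and the key format is an assumption made for illustration.

```javascript
// Hypothetical in-memory catalog standing in for a loaded German translation.
// Assumption: entries are keyed by context and msgid joined with a separator.
var catalog = {
    'description|Title': 'Beschreibung',
    'salutation|Title': 'Anrede'
};

function gt(text) { return text; }

gt.pgettext = function (context, text) {
    var key = context + '|' + text;
    // Fall back to the English text when no translation exists.
    return Object.prototype.hasOwnProperty.call(catalog, key) ? catalog[key] : text;
};

// The same English msgid yields different translations per context.
var columnHeader = gt.pgettext('description', 'Title');
var formField = gt.pgettext('salutation', 'Title');
```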
Results in:
msgctxt "description"
msgid "Title"
msgstr "Beschreibung"

msgctxt "salutation"
msgid "Title"
msgstr "Anrede"
In the case of numbers, the rules to select the proper plural form can be
very complex. With the exception of languages with no separate plural forms,
English is the second simplest language in this respect, having only two plural
forms: singular and plural. Other languages can have up to four forms, and
theoretically even more. The functions gettext.ngettext and gettext.npgettext
(for a combination of plural forms with contexts) can select the proper plural
form by using a piece of executable code embedded in the header of a PO file.
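In standard Gettext PO files, this code is the Plural-Forms header. The sketch below quotes the usual header values for English and Polish in comments and gives hand-written JavaScript equivalents. The function names are illustrative, and how the platform evaluates the header internally is not shown here.

```javascript
// Plural-Forms headers as they appear in standard Gettext PO files:
//   English: "Plural-Forms: nplurals=2; plural=(n != 1);\n"
//   Polish:  "Plural-Forms: nplurals=3; plural=(n==1 ? 0 : n%10>=2 && n%10<=4
//             && (n%100<10 || n%100>=20) ? 1 : 2);\n"
// The functions below are hand-written equivalents of those expressions.

// Returns 0 for the singular form and 1 for the plural form.
function pluralIndexEnglish(n) {
    return n !== 1 ? 1 : 0;
}

// Returns 0, 1 or 2, one index for each of the three Polish plural forms.
function pluralIndexPolish(n) {
    if (n === 1) { return 0; }
    var lastDigit = n % 10, lastTwo = n % 100;
    return lastDigit >= 2 && lastDigit <= 4 && (lastTwo < 10 || lastTwo >= 20) ? 1 : 2;
}
```

The returned index selects which msgstr[N] line of an entry is used.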
The function gettext.ngettext accepts three parameters: the English singular
and plural forms and the number which determines the chosen plural form.
The function gettext.npgettext adds a context parameter before the others,
similar to gettext.pgettext. They are usually used in combination with
gettext.format to insert the actual number into the translated string.
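A sketch of this combination, using the strings from the PO entry in this section, could look as follows. The gt object is a hypothetical stand-in without a loaded translation, so it applies the English plural rule directly, and mailCount is an illustrative helper.

```javascript
// Hypothetical stand-in for the gettext module without a loaded translation,
// so the English plural rule (n != 1) is applied directly.
function gt(text) { return text; }

gt.ngettext = function (singular, plural, n) {
    return n !== 1 ? plural : singular;
};

gt.format = function (text) {
    var args = Array.prototype.slice.call(arguments, 1);
    // Substitute positional specifiers such as %1$d.
    return text.replace(/%(\d+)\$[sd]/g, function (match, position) {
        return String(args[Number(position) - 1]);
    });
};

function mailCount(n) {
    return gt.format(
        //#. %1$d is the number of mails
        //#, c-format
        gt.ngettext('You have %1$d mail', 'You have %1$d mails', n), n);
}
```

The number is passed twice: once to ngettext to select the plural form, and once to format to fill in the specifier.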
The above example results in the following entry:
#. %1$d is the number of mails
#, c-format
msgid "You have %1$d mail"
msgid_plural "You have %1$d mails"
msgstr[0] ""
msgstr[1] ""
The number of msgstr[N] lines is determined by the number of plural forms in
each language. This number is specified in the header of each PO file,
together with the code to compute the index of the correct plural form for
the supplied number.