Context strings for localization are of uneven quality and purpose. They have been generalized for attributes (1395 context strings, which are actually attribute types), are commonly used for instruments (354 context strings, which are actually disambiguation comments), and are less often used elsewhere.
While addressing MBS-5380, it became clear that we need to create guidelines for developers who write context strings, and to document the reasoning behind them, in order to prevent unproductive back-and-forth changes to our i18n practices.
Mihkel Tõnnov
added a comment - Another reference: the KDE KI18n programmer's guide: https://api.kde.org/frameworks/ki18n/html/prg_guide.html - specifically the sections "Writing Good Texts" and "Writing Good Contexts".
Mihkel Tõnnov
added a comment - Things I'd like to see in MBz l12y guidelines for devs:
1. For new code, always add a short context about a string's role: "checkbox", "button", "menu header", "menu entry", "tab title", "in page title", "sidebar entry", "edit status", "release status", "election status", etc.
   - For one-word strings, see the additional note below.
2. Existing strings without any context should get it added upon request from a localization team (for example, "please split out ../root/user/UserProfile.js:nnn from string X", or "please add contexts to all occurrences of string Y").
   - When adding context, follow guideline (1) above.
3. One-word strings might need additional context, especially if the word can belong to several word classes in English (such as "Edit": noun/verb; "Open": verb/adjective). So add the intended word class to the context as well, e.g. "verb, button", "verb, tab title", "adjective, edit status", etc.
4. Avoid composite strings as much as possible, at least if the resulting strings are static at runtime. Many languages have obligatory gender/animacy/number/case/etc. marking - meaning that composite strings might be impossible to translate properly. So it's better to add 10 complete phrases where some part repeats between the strings than 1+9 strings which are then composed into a single phrase at runtime.
5. Concerning placeholders:
   - Prefer named placeholders.
   - Avoid lumping various types of entities (artist, person, group, label, recording, area, ...) into a single placeholder (I'm looking at you, {entity}). Same reason as for (4) above.
   - The location of a placeholder must be movable in translations, not hardcoded at the beginning/end.
   - In case of multiple placeholders, it must be possible to change their relative order in translations (so avoid things like "%s something-something %s").
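The points about role contexts, word classes and named placeholders could be sketched as follows. This is only an illustration, not MusicBrainz's actual i18n API: the `translate` helper and the in-memory `CATALOG_DE` dictionary are hypothetical, and the German strings are illustrative translations chosen for the example.

```python
# Hypothetical in-memory catalog keyed by (context, msgid). A real project
# would load translations from gettext .po/.mo files instead.
CATALOG_DE = {
    # Same English word, different word class and role, different translation.
    ("verb, button", "Edit"): "Bearbeiten",
    ("noun, tab title", "Edit"): "Bearbeitung",
    ("verb, button", "Open"): "Öffnen",
    ("adjective, edit status", "Open"): "Offen",
    # Named placeholders let the translation change their order freely.
    ("in page title", "{artist} added by {editor}"):
        "Von {editor} hinzugefügt: {artist}",
}


def translate(msgid, context, catalog, **values):
    """Look up msgid under the given context and fill named placeholders.

    Falls back to the English source string when the catalog has no
    entry for this (context, msgid) pair.
    """
    template = catalog.get((context, msgid), msgid)
    return template.format(**values)


button_label = translate("Edit", "verb, button", CATALOG_DE)
tab_title = translate("Edit", "noun, tab title", CATALOG_DE)
page_title = translate(
    "{artist} added by {editor}",
    "in page title",
    CATALOG_DE,
    artist="Some Artist",
    editor="some_editor",
)
```

With positional placeholders such as "%s added by %s", the translator in this sketch could not have moved the editor in front of the artist, which is why named placeholders are preferred.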
Mihkel Tõnnov
added a comment - The Mozilla best practices sound very reasonable.
The bulk of my translation work during the past 10 years has been for LibreOffice. Strangely enough, I can't find a concrete reference to their i18n/l12y principles - but from my experience, the general rule (with some exceptions and occasional oversight) is to minimize string re-use (that is, if the same string appears in several places in the UI, all of those occurrences should be independently translatable), and to avoid using composite strings where all the values of the individual components are known (again, with the occasional exception/oversight).
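To illustrate why composite strings are a problem when all component values are known, here is a minimal sketch comparing the two approaches. The catalogs and helper functions are hypothetical, and the German strings are illustrative: in German, the article and adjective ending depend on the noun's gender, so one translated template cannot fit every entity name.

```python
# Composite approach: one template plus separately translated fragments.
COMPOSITE_DE = {
    "Add a new {entity}": "Ein neues {entity} hinzufügen",  # fits neuter nouns only
    "label": "Label",         # neuter
    "recording": "Aufnahme",  # feminine, would need "Eine neue ..."
    "artist": "Künstler",     # masculine, would need "Einen neuen ..."
}

# Complete-phrase approach: one independently translatable string per case.
COMPLETE_DE = {
    "Add a new label": "Ein neues Label hinzufügen",
    "Add a new recording": "Eine neue Aufnahme hinzufügen",
    "Add a new artist": "Einen neuen Künstler hinzufügen",
}


def composite(entity_key):
    """Build the phrase at runtime from a template and a translated fragment."""
    template = COMPOSITE_DE["Add a new {entity}"]
    return template.format(entity=COMPOSITE_DE[entity_key])


def complete(msgid):
    """Look up a whole phrase; fall back to the English source if missing."""
    return COMPLETE_DE.get(msgid, msgid)


ok_by_luck = composite("label")            # grammatical only because "Label" is neuter
broken = composite("recording")            # wrong article and adjective ending
correct = complete("Add a new recording")  # translator controls the whole phrase
```

The complete-phrase catalog repeats some words across entries, but each entry can be translated correctly on its own.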
yvanzo
added a comment - Context strings may actually contain several kinds of context information: role/layout, disambiguation/meaning, ...
Our current documentation says nothing about context strings: https://musicbrainz.org/doc/Server_Internationalisation
We should probably take inspiration from the existing best practices of more experienced open source projects; references are welcome!
At Mozilla: https://developer.mozilla.org/en-US/docs/Mozilla/Localization/Localization_content_best_practices