Forum OpenACS Q&A: Globalization and package instances and communities

As most of you probably know, we're about to internationalize OpenACS and .LRN. So we're starting to think things through in more detail, and there are several things that we're not certain of. I'm going to post a couple of those here.

The background information is the specs that were used at our mini-conference in Amsterdam in June, at http://openacs.org/wp/display/326/409.wimpy.

The problem is: Say you're a university in Germany. Most of your courses are in German, but some are in English or Spanish or some other language, either to accommodate visitors, or because it's a language course. For those courses, you want the online .LRN community to be in English or Spanish as well. Or in Thai.

Say I'm a Swedish exchange student at this school, and I've set my browser or a cookie saying that my preferred language is Swedish, something that .LRN happens to support.

Will I get navigation in Swedish and content in German, English, Spanish, or Thai, depending on the course (i.e. my user preference always takes precedence over the community setting)?

Or will I get both navigation and content in German, English, Spanish, Thai (i.e. the community setting always takes precedence over my settings, meaning my user setting is useless)?

Will I have the option of choosing Swedish in one community, English in another, and Spanish in a third (i.e. there's a separate user preference per package instance or community or whatever)?

Isn't it going to be confusing to have the navigation keep changing languages? Isn't it going to be confusing for people to keep track of their locale preferences in each of several subcommunities? (Btw, I've used language and locale interchangeably in this post.)

On another note: What exactly does it mean to say that a certain package instance or community is in a certain language? Does it impact what charset is used? The database is supposedly always in Unicode, so that shouldn't change.

As you can see, there are still a number of issues that we don't fully understand. Perhaps it's us that are slow, or perhaps they're just not fully understood. If they're not well enough understood yet, I'd rather wait with implementing them until we've tried running a multi-lingual community for a little while. Thanks for your help.

/Lars

I'm very interested in internationalization of OpenACS and dotLRN.

I don't know how gp-lang was done or how it does its job (is this published somewhere?), but I haven't heard any mention of using something like GNU's gettext (i18n library).

gettext's documentation is very well done, and it covers many issues of i18n. Many free software developers are already familiar with it and many languages already support it.
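For anyone who hasn't seen the gettext style of marking strings, here is a minimal sketch using Python's standard `gettext` module (Python only for illustration; OpenACS itself is Tcl). The domain name "dotlrn" and the "locale" directory are made up for this example, and no translated catalogs are assumed to exist, so the lookup simply falls back to the original string.

```python
import gettext

# Hypothetical domain ("dotlrn") and catalog directory ("locale").
# With fallback=True, a missing catalog yields a NullTranslations object,
# which returns every message unchanged instead of raising an error.
catalog = gettext.translation(
    "dotlrn", localedir="locale", languages=["sv", "de"], fallback=True
)
_ = catalog.gettext

# Strings are wrapped in _() in the source; translators fill in catalogs later.
greeting = _("Welcome to the course")
print(greeting)
```

The point is that the source keeps the original-language string as the lookup key, and a missing translation degrades to that string rather than to an error.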

Interesting. Thanks, Roberto, we'll be sure to take a look.

/Lars

Roberto, here is an extremely brief description of how multilanguage support in Greenpeace Planet works.
I think it would be easiest to treat the components independently whenever possible. The components involved in your course example would map to navigational language (#10 in the wimpy point) and multi-language content (#12) (deliberately leaving out #11 to hide my confusion about that one).

While it might be somehow cool to be able to control the language used in the navigation depending on the language of the currently displayed content item, I don't see any realistic use case that would make that worth banging our heads against the wall to come up with a way to implement it.

Also I would suggest, for the sake of simplicity, to make the user's language selection site wide. E.g. if I choose German as my preferred language, it will be my preferred language on all subsites of the system, until I change it to something else. If I don't choose any language then it will be based on my browser's settings, and it's most likely that I'll visit different parts of the site with the same browser anyway. This way user preferences can be stored in the user record (or a specific cookie) and don't have to keep track of subsites.
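A rough sketch of that lookup order, in Python purely for illustration: a stored site-wide choice wins, otherwise the browser's Accept-Language header is used, otherwise a default. The function names and the "en" default are assumptions for this example, not anything that exists in OpenACS.

```python
from typing import List, Optional


def parse_accept_language(header: str) -> List[str]:
    """Return the language tags of an Accept-Language header, best first."""
    weighted = []
    for position, part in enumerate(header.split(",")):
        fields = part.strip().split(";")
        tag = fields[0].strip().lower()
        if not tag or tag == "*":
            continue
        quality = 1.0  # a tag without an explicit q-value counts as 1.0
        for field in fields[1:]:
            name, _, value = field.strip().partition("=")
            if name.strip() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        if quality > 0:
            weighted.append((-quality, position, tag))
    # Sort by descending quality, keeping header order among equal weights.
    return [tag for _, _, tag in sorted(weighted)]


def navigation_language(
    stored_choice: Optional[str], accept_language: str, site_default: str = "en"
) -> str:
    """Pick one site-wide navigation language for the current user."""
    if stored_choice:  # explicit choice from the user record or a cookie
        return stored_choice
    browser_languages = parse_accept_language(accept_language)
    return browser_languages[0] if browser_languages else site_default


print(navigation_language(None, "sv, de;q=0.8, en;q=0.5"))
```

Nothing here depends on the subsite being visited, which is what keeps the preference storage simple.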

Packages that are handling multi-language content might decide to serve specific content items in a different language than the one preferred by the user, most likely because it's not available in the preferred language. That is the situation in your example, with the user's language set to Swedish and most courses available in German only.

The only problems that I can see with this approach are possible character set clashes - for example when you want to view the course in Thai and have your navigational language set to Swedish. OTOH I doubt that this is a big issue in practice, and don't see an easy way to avoid it. (Also, users who deal with different languages are probably more likely to have Unicode-enabled browsers installed.)

Speaking of multiple languages for the same user, I think it would be beneficial to have an ordered list of language preferences instead of a single preferred language - the way that browser settings already implement them. This belongs to #3, language selection. Maybe this could include a proc that takes a list of possible (available) languages, and returns the one that is most applicable for the current request.

So a multi-language content package could tell the language selection mechanism: "hey, I got my item in these three languages: en, sv, de", and the mechanism would reply with: "the current user wants to see it in sv". The component handling the navigational language (gp-lang) could use this mechanism as well. This way all the knowledge about the current cookie setting, browser preferences, default language etc. could be in one central place.
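To make the idea concrete, here is a small sketch of such a selection proc, written in Python just for illustration. The name `pick_language`, the fallback to the primary subtag (e.g. "sv-se" to "sv"), and the choice to return the first available language when nothing matches are all assumptions of this example.

```python
from typing import Optional, Sequence


def pick_language(
    available: Sequence[str],
    preferences: Sequence[str],
    default: Optional[str] = None,
) -> Optional[str]:
    """Return the available language that best fits the ordered preferences."""
    # Compare case-insensitively but return the caller's original spelling.
    by_lowercase = {lang.lower(): lang for lang in available}
    for wanted in preferences:
        wanted = wanted.lower()
        if wanted in by_lowercase:
            return by_lowercase[wanted]
        primary = wanted.split("-")[0]  # "sv-se" falls back to "sv"
        if primary in by_lowercase:
            return by_lowercase[primary]
    # No preference matched: use the default if offered, else the first item.
    if default is not None and default.lower() in by_lowercase:
        return by_lowercase[default.lower()]
    return available[0] if available else None


# The content package offers three languages; the user prefers sv, then en.
print(pick_language(["en", "sv", "de"], ["sv", "en"]))  # sv
```

Both the content package and the navigation component could call the same function with their own list of available languages, so the cookie, browser, and default handling stays in one place.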

I'd say we should keep the navigation in the preferred language of the user (in order, as Til suggested). But as a community admin you should have the possibility to force the navigation to be in the community's language (opt-in), if the admin thinks there is a special reason for it.
From the point of view of avoiding confusion, the user should see a single-language interface. If she chooses to use English (which can also be a default/fall-back choice), even if she is Swedish and has set her browser settings accordingly, an English interface to all of the site's content should be provided.

I guess a tricky part may be when content, as well as UI elements, is available in multiple languages for a particular item, e.g. (purely hypothetical option) a bboard posting.

Otherwise content should be displayed in whatever language it is stored in, regardless of user preferences. HTTP headers should, however, be set in such a way that browser picks Unicode setting, to ensure that no 'bird signs' appear on a web page.
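As an illustration of the header part (Python only as a sketch; the function names are made up), the server would encode the page body as UTF-8 and declare that in the Content-Type header, so the browser does not have to guess:

```python
from typing import List, Tuple


def content_type_header(mime_type: str = "text/html") -> Tuple[str, str]:
    """Build a Content-Type header that declares UTF-8 explicitly."""
    return ("Content-Type", f"{mime_type}; charset=utf-8")


def encode_page(text: str) -> Tuple[List[Tuple[str, str]], bytes]:
    """Encode a page as UTF-8 and return matching headers plus the body."""
    body = text.encode("utf-8")
    headers = [
        content_type_header(),
        # Content-Length counts bytes, not characters.
        ("Content-Length", str(len(body))),
    ]
    return headers, body


headers, body = encode_page("Navigation på svenska, innehåll: ภาษาไทย")
print(headers[0])
```

With the charset stated, Swedish navigation and Thai content can share one page without either turning into 'bird signs'.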

...just a few fractions of a ¢2

Malte - I don't see any other reason why an administrator would want to _force_ the navigation of a subsite to be in another language than the one selected by the user, other than to somehow deal with the problem that you describe in this document: http://www.sussdorff-roy.com/resources/internationalization (Summary: character encoding of the navigation is not displayable in the character encoding of the content, thus the character encoding of the navigation must be changed.). Although in the solution you describe, the language is altered automatically by the system when needed, not by the admin, so maybe you had something else in mind.

I would agree though that the default language - the fallback in case the language preference of a user could not be determined - should be configurable on a subsite level, not site-wide as I suggested above.

The solution to the problem that's described in that document - character encoding clash between navigation and content - by switching the navigation to English because it's displayable in all character encodings is very cool. For a general OpenACS solution, though, I would want the server to alternatively be able to return the page in Unicode if the client supports it, especially because the availability of Unicode browsers is probably increasing, and because users who manage to navigate to a page that contains different languages are more likely to have a Unicode-enabled browser installed.
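Roughly what I have in mind, sketched in Python for illustration only: check the client's Accept-Charset header, serve UTF-8 when it is allowed, and only otherwise fall back to the content's own encoding with English navigation when the navigation text cannot be represented in it. The function names and the sample navigation string are assumptions of this sketch, not taken from the linked document.

```python
from typing import Tuple


def client_accepts_utf8(accept_charset: str) -> bool:
    """True if an Accept-Charset header allows UTF-8 (no header allows all)."""
    if not accept_charset.strip():
        return True
    qualities = {}
    for part in accept_charset.split(","):
        fields = part.strip().split(";")
        name = fields[0].strip().lower()
        quality = 1.0
        for field in fields[1:]:
            key, _, value = field.strip().partition("=")
            if key.strip() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        qualities[name] = quality
    # An explicit utf-8 entry takes precedence over the "*" wildcard.
    if "utf-8" in qualities:
        return qualities["utf-8"] > 0
    return qualities.get("*", 0.0) > 0


def plan_response(
    accept_charset: str, nav_language: str, nav_text: str, content_charset: str
) -> Tuple[str, str]:
    """Return (charset to serve, navigation language to use)."""
    if client_accepts_utf8(accept_charset):
        return "utf-8", nav_language
    try:
        nav_text.encode(content_charset)
    except UnicodeEncodeError:
        # Navigation cannot be shown in the content's charset: use English.
        return content_charset, "en"
    return content_charset, nav_language


print(plan_response("tis-620", "sv", "Hjälp", "tis-620"))
```

A client that only accepts a Thai encoding gets English navigation here, while a Unicode-capable client keeps Swedish navigation next to Thai content.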

Tilmann, I am not sure that _force_ is the right word... but being able to set a default language as the administrator is important (think of it as a linguistic recommendation). Two examples: A professor teaches Spanish in France... the dotLRN default is French but he wants everything in his Spanish course (group) to be in Spanish (even though some of his students might prefer parts of his course in French); an OpenACS web spinner (a.k.a. master) in Germany is really bad at "Englisch", but because his employer insists, he does his best to translate a couple of pages into something that resembles English that he would rather have disappear. To keep his embarrassment to a minimum he would like the initial default set to German so that users that come to his pages are confronted with the German version first (with a not so obvious link to the English version that can change the default display).
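One possible reading of the options discussed in this thread, as a Python sketch (the `Community` record, its field names, and the precedence order are assumptions made for illustration): a community can carry a default language that applies when the user has made no choice, and an opt-in flag that makes the community language win regardless.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Community:
    name: str
    default_language: Optional[str] = None  # the admin's "recommendation"
    force_default: bool = False  # opt-in: override the user's own choice


def effective_language(
    user_language: Optional[str], community: Community, site_default: str = "en"
) -> str:
    """Resolve the navigation language for one user in one community."""
    if community.force_default and community.default_language:
        return community.default_language
    # Otherwise: user's choice, then community default, then site default.
    return user_language or community.default_language or site_default


spanish_course = Community("Spanish course", default_language="es")
print(effective_language(None, spanish_course, site_default="fr"))  # es
print(effective_language("fr", spanish_course, site_default="fr"))  # fr
```

Whether a community default should outrank an explicit user preference is exactly the open question in this thread; the flag only shows that both behaviours can sit behind one lookup.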