Forum OpenACS Q&A: i18n work update: We'll soon be committing the first chunks

Here's the update: We've been cleaning up acs-lang, making it a
required part of ACS Core, and integrating it into the templating
system, the request-processor, and the APM parameters system as
required.

Shortly after the 4.6 branch has been cut, we'll commit this into
the OpenACS CVS tree on the trunk.

Once that has happened, you can all start participating by moving
language text out of the pages and procs and into the message
catalog. We'll provide you with a short document that tells you how
to do that.

In the meantime, we'll be adding more sophistication to the locale
negotiation, such as settings per site node, URL rewriting that
allows us to have the locale as the first part of a URL, and lots of
other goodies.

/Lars

Sorry if I'm chiming in too late, but having the locale as the first part of the URL makes linking between language versions more difficult, because such links would have to be absolute links starting with /.

If you had an option for locale extension, e.g. q-and-a-fetch-msg.locale.tcl instead of /locale/bboard/q-and-a-fetch-msg.tcl, then one could use relative links between same content in different languages.

...but that would screw linking within the language version (always need to add the proper .locale part), which I overlooked. Oops.
Explicitly linking to another language is at least possible by specifying an absolute URL like this: /ru/bboard/q-and-a-fetch-msg.

It should be easy to construct a language-switch page that redirects to the page it was referenced from with a different language prefix, possibly making it unnecessary to do relative links between different languages.
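A minimal sketch of such a language-switch redirect, written in Python purely for illustration (it is not OpenACS code). The function name `switch_language_target` and the prefix pattern are assumptions; the sketch treats any leading two-letter or xx-yy segment as a locale prefix and does not check whether that locale actually exists.

```python
import re
from urllib.parse import urlsplit

# Assumed prefix syntax: /xx or /xx-yy as the first path segment.
LOCALE_PREFIX = re.compile(r"^/[a-z]{2}(?:-[a-z]{2})?(?=/|$)")

def switch_language_target(referer, new_locale):
    """Build the redirect target for the referring page under another locale prefix."""
    parts = urlsplit(referer)
    # Drop the old locale prefix, if any; fall back to "/" for an empty path.
    path = LOCALE_PREFIX.sub("", parts.path, count=1) or "/"
    target = "/" + new_locale + path
    if parts.query:
        # Keep the query string so the same content is shown after the switch.
        target += "?" + parts.query
    return target
```

A switch page would read the Referer header, call this with the requested locale, and issue a redirect to the result.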

Some random questions / remarks:

Are you planning to enable either language _or_ locale information in the smart URL system, e.g. by supporting both the URLs /en/bboard/foo and /en-gb/bboard/foo? (I hope this is the correct distinction between the meaning of the words language and locale)

If yes, then I'd like to vote for the /en-gb/ syntax instead of the more common /en_GB/, just for aesthetic reasons and in order to conform with the URL style of the rest of OpenACS.

When working on acs-lang, did you base your work on gp-lang from Greenpeace, and if yes, did you consider splitting the key into two separate fields in the database? As far as I remember, the new-style message keys introduced in gp-lang always have two parts, and it occurred to me that having two separate fields would make a lot of operations on the messages easier (this is purely theoretical though).

I haven't looked into smart URLs yet. I was simply planning on using the changes that Don made over at Greenpeace.

Wouldn't there be problems with en-gb? How can the server tell that this is a locale-prefix and not part of the URL?

We'll get to it in a bit. This is not critical for committing these changes. The idea is to commit the stuff that's necessary for package translation. Then we can make the core locale negotiation process more sophisticated at a later point.

/Lars

Apologies as I'm not all that familiar with this whole effort but...

Increasingly it's looking like I may need to be.....

Is there a description/doc or anything describing what's already there and in particular what was done for GP?

Thanks

Well, it could define that anything that matches either the syntax /en/ or /en-gb/ is locale information and should be ripped out of the URL, which should be a simple regexp. This would only make it impossible to use top-level URLs with that syntax; I don't see a big problem with that.
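For illustration, the regexp idea could look roughly like this in Python (not the Greenpeace code, just a sketch). The pattern and the name `split_locale` are assumptions: two lowercase letters, optionally followed by a hyphen and two more, as the first path segment.

```python
import re

# Assumed syntax: /xx/ or /xx-yy/ at the start of the path.
LOCALE_PREFIX = re.compile(r"^/([a-z]{2}(?:-[a-z]{2})?)(/.*|$)")

def split_locale(url_path):
    """Return (locale, remaining_path); locale is None if no prefix matches."""
    match = LOCALE_PREFIX.match(url_path)
    if match is None:
        return None, url_path
    # An empty remainder (e.g. "/en") is normalized to the site root.
    return match.group(1), match.group(2) or "/"
```

Note that this purely syntactic check also claims any other two-letter top-level URL as a locale.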

I have a version of Don's code with the Greenpeace-specific stuff removed (so far it only looks for the /en/ syntax), which may be useful as a starting point.

Simon, check out Carl's Wimpy Point presentation at: https://openacs.org/wp/display/326/, and in there the case study.
I tend to use /gc for general comments and /ds for developer-support. Hm.

/Lars

The smart url detector should first see if it matches a pattern and if yes, see if that pattern actually contains an active language of the system, and only remove it from the URL in that case. So if /ds/ does not correspond to a real language that is also activated on the system then there shouldn't be a problem.
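A rough Python sketch of that two-step check, for illustration only. The name `strip_active_locale` and the `active_locales` set are hypothetical; a real system would presumably look up its enabled locales in the database.

```python
import re

# Step 1 pattern (assumed syntax): /xx/ or /xx-yy/ as the first path segment.
LOCALE_PREFIX = re.compile(r"^/([a-z]{2}(?:-[a-z]{2})?)(/.*|$)")

def strip_active_locale(url_path, active_locales):
    """Strip the prefix only if it matches the pattern AND names an active locale."""
    match = LOCALE_PREFIX.match(url_path)
    # Step 2: leave the URL untouched unless the candidate is really enabled.
    if match is None or match.group(1) not in active_locales:
        return None, url_path
    return match.group(1), match.group(2) or "/"
```

With active locales en, en-gb and ru, a request for /ds/index passes through unchanged, while /ru/bboard/q-and-a-fetch-msg is split into the locale ru and the path /bboard/q-and-a-fetch-msg.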

So this might still be a cause for confusion, but not as serious as it would be if all top-level two-letter URLs were disabled.

Wouldn't it be easier to use another prefix for smart URLs such as /_en-gb/? This would simply mean that you shouldn't use http://root.org/_*/ for normal purposes.

Or even say that something like "/_g" is used for smart URLs connected to globalization, and other underscore prefixes could be used for smart URLs in connection with other modules?

Steffen

Steffen,

Yes, I've thought about something like that, but it's ugly! :)

I don't know what the best solution is here.

/locale=en_US/foobar? ... we can have an equal sign be part of the URL, no?

/Lars

Yeah, Lars,

You're probably right about that. Putting a "=" won't help much though -- it'll still be ugly.

Another way might be to use the "~" as a prefix. It probably won't be used with OpenACS, but people are used to seeing it in URLs...

Steffen

We have committed our changes so far to the acs-lang package and the changes that we have made to OpenACS core. The CVS commit comment for acs-lang was:

adding namespaces to the TCL API, adding new procedures for extracting keys from adp pages and parsing keys embedded in text, adding a translation web UI that was used at Greenpeace (at www/admin) and making it work with PostgreSQL, moving the old pages under www to be under www/admin/test, making the lang_messages table use locale rather than language, added upgrade scripts

The most important changes to OpenACS core include substituting #package_key.message_key# occurrences in adp pages with message lookups and setting up ad_conn locale in the request processor.
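To give a feel for the substitution, here is a small Python sketch (the real implementation lives in the Tcl templating system and is not shown here). The `CATALOG` dictionary, its sample entries, and the function `substitute_keys` are all hypothetical stand-ins for the message catalog lookup.

```python
import re

# Hypothetical in-memory catalog keyed by (locale, "package_key.message_key").
CATALOG = {
    ("en_US", "bboard.post_reply"): "Post a reply",
    ("da_DK", "bboard.post_reply"): "Skriv et svar",
}

# Matches #package_key.message_key# markers embedded in template text.
KEY_PATTERN = re.compile(r"#([a-zA-Z0-9_-]+\.[a-zA-Z0-9_-]+)#")

def substitute_keys(template_text, locale):
    """Replace each #package_key.message_key# with its message for the locale."""
    def lookup(match):
        # Unknown keys are left as-is so missing translations stay visible.
        return CATALOG.get((locale, match.group(1)), match.group(0))
    return KEY_PATTERN.sub(lookup, template_text)
```

In this sketch the locale is passed in explicitly; in the request processor it would come from the connection, as with ad_conn locale.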

We are now working on committing Jeff Davis's magic script that can suck out translatable text from adp pages and replace it with message catalog lookups (yes I know, it's amazing, but it works!). As soon as we have committed this script we will encourage package owners to run it on their packages and then test and go through pages and manually do any missing message lookups. We have started writing an I18N developers guide and you can find a snapshot of it here.

Nice work guys! Look forward to you all pointing me to a usable translation UI for dotLRN so I can put people on our side on getting those keys translated (I hope Peter hasn't translated everything already 😉)