Forum OpenACS Development: Language codes in acs-lang

Posted by Emmanuelle Raffenne on
Hi,

I'm working on adding ISO-639-2 language codes (3 digits, http://www.loc.gov/standards/iso639-2/php/code_list.php) to acs-lang and ran into a problem that I suspect has been there since 5.1 or so (I'll TIP within a few days about adding iso-639-2).

My understanding is that acs-lang was originally designed to use iso-639-1 language codes (2 digits), and the Tcl API assumes this (lang::*::language does a [string range $locale 0 1]). However, the language column of the ad_locales table is three characters wide, and indeed there's an "ast_ES" locale there with language set to "ast". I didn't look for more locales with a 3 digits (iso-639-2) language, but at least for that one the Tcl API won't work.
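To make the problem concrete, here is a minimal sketch in plain Tcl. It only mimics the two-character assumption described above and is not the actual acs-lang code; the variable names are illustrative.

```tcl
# Illustration only: mimics the two-character assumption, not the real acs-lang proc.
set locale "ast_ES"

# Taking characters 0..1 cuts the three-letter language code short.
set language [string range $locale 0 1]

puts $language   ;# prints "as", whereas ad_locales stores "ast"
```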

I'm not sure how we should address this. Having a mix of iso-639-1 and iso-639-2 for locale languages doesn't seem right to me. We should stick to one representation for language (iso-639-1) and maybe add the iso-639-2 equivalent as separate data (that's what I'm doing, locally for now).

To fix this situation, we have 2 options:

1. Remove the ast_ES locale and any other that has an iso-639-2 language code
2. Hack the acs-lang Tcl API to accept languages of 2 and 3 digits

There is a third option: use iso-639-2 everywhere instead of iso-639-1, but that would require a lot of work and I'm not convinced it would be a good thing anyway.

Posted by Emmanuelle Raffenne on
"digits" should read "chars" in the previous post.
Posted by Emmanuelle Raffenne on
After doing some research...

According to section 2.1 of RFC 4646 "Tags for Identifying Languages" (http://tools.ietf.org/html/rfc4646#section-2.1), the language must be represented using the shortest ISO 639 code. If there's no ISO-639-1 (2 chars) code for a language, then the ISO-639-2 (3 chars) code must be used (if one exists). If the terminology code differs from the bibliographic one, then the terminology one should be used.

Bottom line: we need to fix the Tcl API to deal with 3-char language codes.
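
One possible direction, as a rough sketch only: instead of taking a fixed two characters, take everything before the first underscore. The proc name language_from_locale is hypothetical and not part of acs-lang, and the sketch assumes locales always look like language_TERRITORY (as "ast_ES" does).

```tcl
# Hypothetical helper, not part of the acs-lang API.
# Assumes a locale of the form <language>_<TERRITORY>, e.g. "en_US" or "ast_ES".
proc language_from_locale {locale} {
    # Position of the first underscore, or -1 if there is none.
    set idx [string first "_" $locale]
    if {$idx < 0} {
        # No territory part: treat the whole string as the language code.
        return $locale
    }
    # Everything before the underscore is the language, whether 2 or 3 chars.
    return [string range $locale 0 [expr {$idx - 1}]]
}

puts [language_from_locale "en_US"]   ;# en
puts [language_from_locale "ast_ES"]  ;# ast
```

Splitting on the separator rather than on a fixed width keeps existing 2-char locales working and handles the 3-char ones without a special case.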