Forum OpenACS Development: Re: cr_write_content and utf-8
Are you sure that, on your server system, tclsh and AOLserver were running with the same environment variables and linked against the same Tcl shared libraries? On both Mac OS X and lenny/sid, with AOLserver 4.5.1 and 4.0, I always see utf-8 for [encoding system] in tclsh and in ds/shell.
Background: during initialization, Tcl determines the default system encoding from the LC_* or LANG environment variables. If nothing can be found, it uses TCL_DEFAULT_ENCODING, which is set depending on the OS; under Mac OS X, for example, TCL_DEFAULT_ENCODING is utf-8. If configure can't determine anything, the final default system encoding is "iso8859-1". Later, Tcl's system encoding can be changed at the scripting layer via "encoding system ?XXX?" or from C via Tcl_SetSystemEncoding(). AOLserver 4.0.10/4.5.1 does not set it via Tcl or C; NaviServer has a config variable named "systemencoding" and sets the encoding in init.tcl (if nothing is specified, it defaults to utf-8).
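To compare the two environments, a small report like the following can be pasted into both tclsh and ds/shell. It is only a sketch: the proc name encoding_report is made up for this post, and the list of variables (LC_ALL, LC_CTYPE, LANG) is my assumption about which LC_* settings are relevant on your system.

```tcl
# Collect the system encoding, the locale-related environment variables
# and the Tcl patch level, so the output of tclsh and ds/shell can be
# compared line by line. "encoding_report" is a hypothetical helper name.
proc encoding_report {} {
    set report [list "encoding system: [encoding system]"]
    foreach name {LC_ALL LC_CTYPE LANG} {
        if {[info exists ::env($name)]} {
            lappend report "$name=$::env($name)"
        } else {
            lappend report "$name is not set"
        }
    }
    lappend report "Tcl version: [info patchlevel]"
    return [join $report \n]
}

# In tclsh, print the report; in ds/shell the return value of
# [encoding_report] alone is enough.
puts [encoding_report]
```

If the two reports differ in the LANG/LC_* lines, the server is not started with the environment you expect; if they only differ in the first line, something changed the encoding after startup.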
Note that when you load a library file or a www/*tcl script that sets the encoding via "encoding system ...", it is set for the whole server (all threads), because the system encoding is a global variable in the Tcl implementation. The only OpenACS package that sets the system encoding is lors-central (most likely not a good idea).
It is a good idea to check the LANG variable in your startup script for AOLserver and, when in doubt, to use something like LANG=en_US.UTF-8.
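As a complement to fixing the startup script, the server could verify the result itself. This is a minimal sketch, not part of OpenACS or AOLserver: check_system_encoding is a hypothetical proc, and where you call it from (for example a library file loaded at startup) is up to you.

```tcl
# Return a one-line status message telling whether the system encoding
# matches the expected one. "check_system_encoding" is a hypothetical
# helper; it only reads the encoding and never changes it.
proc check_system_encoding {{expected utf-8}} {
    set actual [encoding system]
    if {$actual ne $expected} {
        return "WARNING: system encoding is $actual, expected $expected; check LANG/LC_* in the startup script"
    }
    return "OK: system encoding is $actual"
}

puts [check_system_encoding]
```

It deliberately does not call "encoding system utf-8" to repair the situation, since that would change the encoding for all threads.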
Hope this helps and all the best
Thanks for your answer.
I am not sure about the configuration of the server at installation time; I need to check with Héctor on that. From what I can see, LANG is set to es_ES.UTF-8 or en_US.UTF-8 for all the users involved (the aolserver one, etc.), the default being es_ES.UTF-8.
Regarding setting "encoding system" from inside OpenACS, I already grep'd the whole tree when we first noticed the difference, and indeed the only package that sets it is lors-central. But in our case, 1. we don't use it, and 2. it sets it to utf-8 anyway.
Also, while trying to run Brian's test case on my Mac (so UTF-8 in all cases), I noticed that "fconfigure $channel -translation binary" uses iso8859-1 unless -encoding is set. I tested with a text file encoded in utf-8; the new file's encoding is iso8859-1. Note that the content contains Spanish-specific characters like "ñ".
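A small way to see this outside of OpenACS is to write the same string twice and compare the file sizes. This is only a sketch under my own assumptions: write_sample and the scratch file enc_sample.txt are made-up names, and it is not Brian's test case or the code in cr_write_content.

```tcl
# Write $text to $path and return the resulting file size in bytes.
# mode is either "binary" (only -translation binary is set) or the name
# of an encoding that is set explicitly on the channel.
# "write_sample" is a hypothetical helper for this illustration.
proc write_sample {path text mode} {
    set ch [open $path w]
    if {$mode eq "binary"} {
        fconfigure $ch -translation binary
    } else {
        fconfigure $ch -translation lf -encoding $mode
    }
    puts -nonewline $ch $text
    close $ch
    return [file size $path]
}

# "año", written with an escape so that the encoding of this script
# itself does not influence the result.
set text "a\u00f1o"
set path [file join [pwd] enc_sample.txt]   ;# hypothetical scratch file

set binary_size [write_sample $path $text binary]
set utf8_size   [write_sample $path $text utf-8]
file delete $path

puts "binary: $binary_size bytes, utf-8: $utf8_size bytes"
```

With the binary translation the "ñ" is written as a single byte, as in iso8859-1, so the file is 3 bytes long; with -encoding utf-8 it takes two bytes and the file is 4 bytes long.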
Yes, it's what I am saying :S.
Héctor and I just checked again, in case we were missing something, but got the same result. The user who runs AOLserver has LANG set to UTF-8 (we tried both en_US.UTF-8 and es_ES.UTF-8, just in case), and we still get iso8859-1 when running "encoding system" in the Tcl shell of dev-support, while we get "utf-8" from tclsh. Very strange.
In ds/shell, do you get "en_US.UTF-8" as the result? Which Tcl version are you using on the server in question?