Forum OpenACS Q&A: How can I make euc-kr(Korean) encoded .tcl and .adp work properly?

I tried both OpenACS 3.2 and 4.5 with Postgres 7.1.3 and AOLserver 3.2 with acs.
What I want is to build a Korean-only website using OpenACS.

After consulting all of the threads in this forum and documents on the internet, I still could not figure out how to make it work properly.

* What I could do:
  - .html files (euc-kr encoded) under my webroot work OK.
  - Data saved from the web browser is displayed properly, that is, in euc-kr Korean, so my db is working fine.

* What I could not do: .tcl and .adp files (saved euc-kr encoded) under my webroot display just a lot of ???????'s, even though the output charset is euc-kr. The English portion of the same file is displayed correctly. (For example, if I change one sentence of /index.tcl into Korean and save the file euc-kr encoded, it doesn't display correctly.)

Also, when the same euc-kr encoded file is saved under two names, foo.html and foo.adp, the .html version works and the .adp version does not.

I believe this problem has something to do with the encoding AOLserver assumes at the moment it reads scripts (.tcl and .adp files). But I cannot find where to configure it to convert euc-kr saved files to Unicode.

I have been struggling with this for more than 2 months. Could anyone help?

Did you try the AOLserver version that includes the i18n patches, 3.3+ad13? The download link is here: https://openacs.org/doc/openacs-4/aolserver.html

Reading euc-kr encoded tcl/adp files might work in this version (I know that it works with iso-8859-1 at least).

The patches are going to be included in newer versions of AOLserver, but I think the safest bet for now is still 3.3+ad13.

This sounds like the exact problem discussed at the beginning of Rob Mayoff's article:

http://dqd.com/~mayoff/encoding-doc.html

I would guess what's happening is that your .html files are simply being streamed out to the browser by AOLserver without any conversion. If your version of AOLserver is configured to use euc-kr as the output charset, and that's the encoding of the .html file, then all is well. (You set OutputCharset to euc-kr in your AOLserver config file, right?)

When you use .tcl and .adp files, the files are opened by AOLserver and read into the Unicode representation that Tcl uses internally, so a conversion occurs. The assumption about which encoding the file on disk uses is key. If Tcl assumes an encoding that the file is not actually in, you'll get garbage for all non-ASCII characters.
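To make the conversion step concrete, here is a small sketch you can paste into a plain tclsh. It is not AOLserver code; it only uses Tcl's built-in `encoding` command, and the sample text (the two Korean characters U+D55C U+AD6D) is my own choice for illustration. It decodes the same euc-kr bytes twice, once with the right encoding and once with a wrong one.

```tcl
# Two Korean characters as a Tcl (Unicode) string.
set text "\ud55c\uad6d"

# The bytes as they would sit on disk in an euc-kr encoded file.
set raw [encoding convertto euc-kr $text]

# Decoding with the correct encoding recovers the original string.
set right [encoding convertfrom euc-kr $raw]

# Decoding with a wrong assumption (iso8859-1 here) turns every byte
# into its own unrelated character.
set wrong [encoding convertfrom iso8859-1 $raw]

puts "bytes on disk:        [string length $raw]"
puts "chars, correct guess: [string length $right]"
puts "chars, wrong guess:   [string length $wrong]"
puts "round trip ok:        [string equal $right $text]"
```

The ASCII parts of a file survive either way because most encodings agree on those bytes, which would match the observation that the English portion displays correctly.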

I believe the procedure you want to look at is template::util::read_file

You can add the line:

    fconfigure $fd -encoding <charset_name>

matching charset_name to the encoding of your files on disk. Be careful here: the charset names Tcl uses are different from the ones used by AOLserver, meaning you may have to use a different name than the one you used for OutputCharset.
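For reference, a file-reading proc with that line in place might look roughly like the sketch below. This is not the actual body of template::util::read_file (I have not copied it here); the proc name `read_file_with_encoding` and its arguments are made up for illustration, and only standard Tcl commands are used.

```tcl
# Hypothetical helper, NOT the real template::util::read_file.
# Reads a file from disk, telling Tcl which encoding the bytes are in.
proc read_file_with_encoding {path encoding} {
    set fd [open $path r]
    # Without this line Tcl falls back to the system encoding,
    # which is where the garbage comes from if it does not match.
    fconfigure $fd -encoding $encoding
    set text [read $fd]
    close $fd
    return $text
}

# Example call (the path is a placeholder, adjust to your own webroot):
#   set page [read_file_with_encoding /path/to/www/index.adp euc-kr]
```

Passing the encoding as an argument is just one way to do it; hard-coding it, as in the one-line change above, is simpler if every file on the site uses the same encoding.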

Also check out this thread:

https://openacs.org/forums/message-view?message_id=52676  (has information on converting between Tcl and AOLserver names for encodings)
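To illustrate the naming difference, here is a minimal sketch of a lookup from the charset name you would put in the config file to the name Tcl expects. The table and the proc name `tcl_encoding_for_charset` are hypothetical and only cover a few charsets; check `encoding names` in your own Tcl for the full list.

```tcl
# Hypothetical lookup table: charset name as used in HTTP headers
# and the AOLserver config -> Tcl encoding name.
array set charset_to_tcl {
    euc-kr        euc-kr
    iso-8859-1    iso8859-1
    windows-1251  cp1251
    shift_jis     shiftjis
    utf-8         utf-8
}

proc tcl_encoding_for_charset {charset} {
    global charset_to_tcl
    set key [string tolower $charset]
    if {![info exists charset_to_tcl($key)]} {
        error "no Tcl encoding known for charset \"$charset\""
    }
    set enc $charset_to_tcl($key)
    # Confirm this Tcl installation actually provides the encoding.
    if {[lsearch -exact [encoding names] $enc] < 0} {
        error "Tcl encoding \"$enc\" is not installed"
    }
    return $enc
}

puts [tcl_encoding_for_charset windows-1251]
puts [tcl_encoding_for_charset EUC-KR]
```

Note that euc-kr happens to have the same name on both sides, while iso-8859-1 and windows-1251 do not.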

This fixed it for me (ACS 4.5):
  1. Applying the patches at packages/acs-lang/ACS4.1b-PATCHES/

  2. Editing <your_service>.tcl config file. The following lines may be relevant:
    ns_section ns/parameters
    <snip>
    ns_param HackContentType 1
    ns_param OutputCharset windows-1251
    ns_param HttpOpenCharset windows-1251
    ns_param DefaultCharset windows-1251
    

    ns_section ns/mimetypes
    <snip>
    ns_param .html "text/html; charset=windows-1251"
    ns_param .tcl "text/html; charset=windows-1251"
    ns_param .adp "text/html; charset=windows-1251"

    (replace windows-1251 with your charset, of course)
Let me know if this helps.