OpenACS Internationalization (i18n) HOWTO v0.1
-- Kenny Chan
2/24/2001
This document is a personal journal of my research on i18n under OpenACS. By sharing it, I hope people can save the time I spent experimenting and hitting walls. As Phil says, wasted time is not wasted anymore if we share the information and help others save their time. I also hope this doc leads to further investigation of OpenACS i18n, because it doesn't deal with problems such as how ns_sendmail would work with i18n.
There are two options that I have tested and found working as of today (2/24/2001). The current latest versions of all the related software are:
- PostgreSQL v7.0.3 (rpm version, from http://www.postgresql.org)
- Aolserver v3.2 (original, from aolserver.com)
- Aolserver v3.2 + ad12 (from ArsDigita)
- Pgdriver v1.1.0 (from http://www.openacs.org/; stop asking about Pgdriver not working if you are still using the one from aolserver.com :~))
- OpenACS v3.2.4 (from http://www.openacs.org/)
All tested configurations contain the following 3 common components:
- PostgreSQL v7.0.3 (rpm version, from http://www.postgresql.org)
- Pgdriver v1.1.0 (from http://www.openacs.org/)
- OpenACS v3.2.4 (from http://www.openacs.org/)
Note: the openacs database encoding is just SQL_ASCII.
Configurations found to fail for i18n:
- The 3 common components + Aolserver v3.2 (original) with tcl8x (nsd8x)
Working Configurations:
1. The 3 common components + Aolserver v3.2 (original) with tcl76 (nsd76)
2. The 3 common components + Aolserver v3.2 + ad12 with tcl76 (nsd76)
3. The 3 common components + Aolserver v3.2 + ad12 with tcl8x (nsd8x)
Details of how to make things work
Working configurations #1 and #2:
For installation procedures, please check the OpenACS installation documentation.
I am not going to go into detail for working configurations 1 and 2, since these two work pretty much out of the box (there is no detail to talk about :~)). There have been postings in the forum about not being able to make multi-byte characters work with nsd76. That's not true! Tcl76 (which is compiled into nsd76) handles strings in raw form, so it should work.
However, I do have some pointers for those who couldn't get it to work. The thing to pay attention to is that the submitting page and the outputting page MUST be set to the correct (and same) encoding.
All db inserts that require user input involve 2 or more tcl pages. The first one contains the html form (the filename is most likely pagename.tcl, per ArsDigita's convention), while the second and later ones contain the data validation and the actual db-inserting SQL (the filenames are most likely pagename-2.tcl, pagename-3.tcl, etc., again per ArsDigita's convention). For multi-byte characters to work correctly through input and output, we must set the encoding of the html page so that the client browser transfers the data in the correct encoding. This is kind of vague in text, so let me illustrate with some code:
Assumptions:
- We want to use big5 encoding for traditional Chinese.
- The database contains a table named i18n_test:
    create table i18n_test (id int, first_names varchar(1000));
myform.tcl:
    # Serve the form with an explicit big5 charset in the Content-Type.
    ns_return 200 "text/html; charset=big5" "
    <form method=get action='myform-2.tcl'>
    First Names: <input type='text' name='first_names'><br>
    <input type='submit' name='submit' value='Submit'>
    </form>
    "
myform-2.tcl:
    # Sets first_names and its quote-escaped copy QQfirst_names.
    set_the_usual_form_variables 0
    set insertion "insert into i18n_test (id, first_names) values (1, '$QQfirst_names')"
    set selection "select first_names from i18n_test where id = 1"
    set deletion "delete from i18n_test"
    set db [ns_db gethandle]
    ns_db dml $db $insertion
    set dbfirstnames [database_to_tcl_list $db $selection]
    set dbfirstnames [lindex $dbfirstnames 0]
    # Clean up the test row, then give the handle back.
    ns_db dml $db $deletion
    ns_db releasehandle $db
    # Output with the same charset the form was served in.
    ns_return 200 "text/html; charset=big5" "
    first name = $first_names
    <br>
    dbfirstname = $dbfirstnames
    "
If we don't explicitly set the character encoding of the data-submitting page (myform.tcl in this case), client browsers will most likely treat it as iso-8859-1. Users can still input big5-encoded characters in the form field and submit (e.g. if the client is using an external viewer like Njwin), but the resulting data passed to the data-processing page (myform-2.tcl in this case) will be junk.
Now comes the truly useful stuff, working configuration #3.
The 3 common components + Aolserver v3.2 + ad12 with tcl8x (nsd8x):
If we have to use the new features in tcl8x (e.g. non-greedy regexp), do we have to give up i18n? No! Just use the Aolserver with the ad patches. ArsDigita provides a patched version of Aolserver that contains security fixes, bug fixes and feature enhancements. It also contains patches that make i18n under tcl8x (nsd8x) easy.
Installation of the ad-patched version is pretty much the same as for the original version. Just untar, cd to the appropriate directory, and run make; make install, blah.
To use character encodings other than iso-8859-1 under nsd8x, we have to tell nsd8x how to interpret the submitted data. The new myform.tcl and myform-2.tcl look like this:
myform.tcl:
    set _charset "big5"
    # The hidden _charset field carries the page encoding to myform-2.tcl.
    ns_return 200 "text/html; charset=$_charset" "
    <form method=get action='myform-2.tcl'>
    <input type='hidden' name='_charset' value='$_charset'>
    First Names: <input type='text' name='first_names'><br>
    <input type='submit' name='submit' value='Submit'>
    </form>
    "
myform-2.tcl:
    # Name the form field that holds the charset, before reading the form.
    ns_formfieldcharset _charset
    set_the_usual_form_variables 0
    set insertion "insert into i18n_test (id, first_names) values (1, '$QQfirst_names')"
    set selection "select first_names from i18n_test where id = 1"
    set deletion "delete from i18n_test"
    set db [ns_db gethandle]
    ns_db dml $db $insertion
    set dbfirstnames [database_to_tcl_list $db $selection]
    set dbfirstnames [lindex $dbfirstnames 0]
    # Clean up the test row, then give the handle back.
    ns_db dml $db $deletion
    ns_db releasehandle $db
    ns_return 200 "text/html; charset=$_charset" "
    first name = $first_names
    <br>
    dbfirstname = $dbfirstnames
    "
The hidden form variable _charset is passed to myform-2.tcl, and by using the proc ns_formfieldcharset we can tell nsd8x how to interpret the data submitted from myform.tcl.
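To see why the charset has to be named at all under Tcl 8, here is a minimal standalone sketch for plain tclsh 8.x (no Aolserver needed). It only illustrates the idea of decoding the same bytes with different charsets; it is not how ns_formfieldcharset is implemented. The proc name decode_submitted is made up for this example and is not part of Aolserver or OpenACS.

```tcl
# Hypothetical helper (not an Aolserver or OpenACS proc): decode raw
# submitted bytes using the charset named by the form.
proc decode_submitted {bytes charset} {
    # Tcl spells iso-8859-1 as iso8859-1; big5 is spelled the same way.
    set enc [string map {iso-8859- iso8859-} [string tolower $charset]]
    return [encoding convertfrom $enc $bytes]
}

# Four bytes that are the big5 encoding of two Chinese characters
# (U+4E2D U+6587).
set raw [binary format H* a4a4a4e5]

set right [decode_submitted $raw big5]
set wrong [decode_submitted $raw iso-8859-1]

puts "decoded as big5:       [string length $right] characters"
puts "decoded as iso-8859-1: [string length $wrong] characters"
```

The same four bytes come out as two characters when decoded as big5 and as four unrelated Latin-1 characters when decoded as iso-8859-1, which is the "junk" you get when the server guesses the wrong encoding.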
Final note:
I just wanted to illustrate the very basics of how to make languages other than English work with OpenACS / Aolserver / PostgreSQL. To get true i18n in OpenACS under nsd8x, the OpenACS code has to be modified accordingly. My suggestion is to modify the proc set_the_usual_form_variables so that it runs ns_formfieldcharset internally, which would minimize the editing needed in the OpenACS tcl pages.
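A rough sketch of that suggestion, written as a wrapper proc instead of an edit to set_the_usual_form_variables itself. The name i18n_set_form_variables is hypothetical, the default of 1 for the argument is an assumption about the OpenACS proc, and the hidden field is assumed to be called _charset as in the examples above. Treat it as a starting point, not tested OpenACS code.

```tcl
# Hypothetical wrapper; i18n_set_form_variables is not part of OpenACS.
# Assumes pages pass the encoding in a hidden field named _charset.
proc i18n_set_form_variables {{error_if_not_found_p 1}} {
    # ns_formfieldcharset only exists in the ad-patched nsd8x,
    # so skip it quietly on servers that do not have it.
    if {[llength [info commands ns_formfieldcharset]] > 0} {
        ns_formfieldcharset _charset
    }
    # Run the real proc in the caller's scope so that the form
    # variables end up in the page, not inside this wrapper.
    uplevel 1 [list set_the_usual_form_variables $error_if_not_found_p]
}
```

A page would then call i18n_set_form_variables 0 where it now calls set_the_usual_form_variables 0, and the forms would only need the hidden _charset field added.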
And in case you are interested, I was an aD 3-week boot camper in Berkeley, CA in June 2000. Let me know if you already know me from the boot camp :P
Further reading: Aolserver3_2+ad12 i18n support