Forum OpenACS Q&A: Sort order for Japanese text

Posted by Henry Minsky on
We recently discovered that to get lexical sorting in Japanese to work properly, you need to create your database and run your Postgres server with the environment variable LANG=C. This causes sorting by hiragana and katakana (half and full width) to work properly.

We have the db configured to use Unicode; all translation from SJIS is done in AOLserver. Here's the configuration I used to init the database and to run the server:

export LANG=C
/usr/local/pgsql/bin/initdb  --encoding unicode  -D /usr/local/pgsql/data

We also have this set when we launch the Postgres server in rc.local:

su -l postgres -c 'export LANG=C; /usr/local/pgsql/bin/postmaster -N 200 -B 1000 -o "-S 2000" -S -D /usr/local/pgsql/data'
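
For anyone who wants to see what byte-order (C locale) collation does to kana, here is a minimal sketch using the standard `sort` utility rather than Postgres itself. It assumes the text is UTF-8 encoded, as in a Unicode database; the helper name `sort_c` is made up for illustration. With UTF-8, byte order follows code point order, so hiragana come before full-width katakana, which come before half-width katakana.

```shell
#!/bin/sh
# sort_c is a hypothetical helper name, not part of Postgres or AOLserver.
# It sorts lines from stdin by raw byte value, which is what the C locale does.
sort_c() {
    LC_ALL=C sort
}

# Sample input (assumed UTF-8): full-width katakana KA, hiragana A,
# half-width katakana A, hiragana KA.
printf 'カ\nあ\nｱ\nか\n' | sort_c
# Prints the two hiragana first, then the full-width katakana,
# then the half-width katakana.
```

This only shows the ordering rule; it says nothing about how a non-C locale on a given system would sort the same input.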

Posted by Henry Minsky on
Here are the configure options I used for this database and server:
./configure --enable-locale --enable-recode --enable-multibyte --enable-unicode-conversion \
             --with-maxbackends=64 --with-tcl  --with-perl --with-openssl --with-CXX --enable-syslog



export LANG=C
/usr/local/pgsql/bin/initdb  --encoding unicode  -D /usr/local/pgsql/data

Posted by Hans Gaasenbeek on
Interesting! But... what kind of LANG= should I use for European characters like é, è, ë, ê, etc.? Isn't there, in general, an installation option which renders all characters in languages like German, French, Norwegian, Swedish etc. correctly?

And... If it is needed to configure Postgresql again, is there a possibility to do this with the rpm installation as well? (Sorry, these rpms which work very well, especially on RH 7.1, have made me lazy. Not only that, mixing rpm with tarball installations does not work very well...)

Posted by Hans Gaasenbeek on
Of course, the characters I meant are not available here 😉
E 'accent grave', 'accent aigu', 'accent circonflexe', a Umlaut, etc., if my French and German are right...
Posted by Jonathan Marsden on
Hans, I posted a workaround that ensures the C locale is used for the PG database and server using PG 7.1.2 RPMs in another thread recently. Essentially you can do
  echo LANG=C >>/etc/sysconfig/i18n
and then install the Postgresql RPMs for the "first time" (i.e. you must NOT have a PG database already existing; rm -rf /var/lib/pgsql if necessary). Once the postgresql-server RPM is installed and you have started PG for the first time, you can remove the LANG=C line from the /etc/sysconfig/i18n file again.
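
The add-then-remove step can be sketched as two small shell helpers. This is only an illustration under assumptions: the function names `add_lang_c` and `remove_lang_c` are made up, the file path is passed as an argument so the helpers can be tried on a copy first, and the RPM install itself is left as a comment because the exact package files depend on your system.

```shell
#!/bin/sh
# Hypothetical helpers for the workaround; neither name comes from the RPMs.

# Append LANG=C to the given file (e.g. /etc/sysconfig/i18n).
add_lang_c() {
    echo 'LANG=C' >> "$1"
}

# Remove every line that is exactly LANG=C from the given file.
# Note: this would also drop a LANG=C line that was there beforehand.
remove_lang_c() {
    tmp="$1.tmp.$$"
    # Write the filtered copy, then overwrite in place to keep ownership and mode.
    sed '/^LANG=C$/d' "$1" > "$tmp" && cat "$tmp" > "$1"
    rm -f "$tmp"
}

# Intended order, run as root with no existing database in /var/lib/pgsql:
#   add_lang_c /etc/sysconfig/i18n
#   (install the Postgresql RPMs and start the server for the first time)
#   remove_lang_c /etc/sysconfig/i18n
```

Trying both helpers on a scratch copy of the i18n file first is a cheap way to confirm the file ends up unchanged afterwards.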