Tom mentions UTF-8 in an OpenACS site. I've been running a site with OpenACS 3.x in UTF-8 for the last 6-8 months (yad2yad.huji.ac.il); one of the requirements was that it work in Hebrew, Arabic, and English.
Getting PostgreSQL to work in Unicode wasn't hard at all; just pass the --encoding flag (to createdb, for example) when you create the database. And once I put in the HackContentType flags described in someone's posting, the entire site worked just fine without modification. We're using the news, bboard, and chat modules in UTF-8, and everyone is pleased and impressed.
The few problems that I had were:
- Getting ns_sendmail to work correctly with UTF-8. I ended up modifying modules/tcl/sendmail.tcl to encode e-mail in UTF-8.
- Making sure that HTML forms would work for input in UTF-8. Browsers normally submit form data in the encoding of the page that served the form, so this works if the encoding is set correctly, but testing this and double-checking that every page had the right content-type was tough.
- Here and there, people have been having weird problems with encoding that we can't easily duplicate. The data looks close enough to Windows-1255 (i.e., Hebrew and English) that I suspect user error, but that's a pretty lame excuse when your users are elementary school students.
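For the last problem, one way to check the Windows-1255 suspicion is to test whether the stored bytes fail to decode as UTF-8 but do decode as Windows-1255 into Hebrew letters. This is a minimal sketch in Python rather than the site's Tcl, and the function name looks_like_windows_1255 is made up for illustration. It is a heuristic, not a reliable detector.

```python
def looks_like_windows_1255(raw: bytes) -> bool:
    """Heuristic: True if raw is not valid UTF-8 but decodes as
    Windows-1255 into text containing Hebrew letters.

    Hypothetical helper for illustration; not part of OpenACS.
    """
    try:
        raw.decode("utf-8")
        return False  # valid UTF-8 (including plain ASCII), nothing suspicious
    except UnicodeDecodeError:
        pass

    try:
        text = raw.decode("cp1255")  # Python's codec name for Windows-1255
    except UnicodeDecodeError:
        return False  # not valid Windows-1255 either

    # Hebrew letters alef..tav occupy U+05D0..U+05EA.
    return any("\u05d0" <= ch <= "\u05ea" for ch in text)


if __name__ == "__main__":
    word = "\u05e9\u05dc\u05d5\u05dd"  # "shalom" in Hebrew
    print(looks_like_windows_1255(word.encode("cp1255")))
    print(looks_like_windows_1255(word.encode("utf-8")))
```

Running something like this over the suspicious rows would at least show whether the bad data arrived as Windows-1255 bytes, which would point at a browser or page encoding setting rather than at the database.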
I haven't yet had a chance to look into i18n and Unicode issues in OpenACS 4.x; that's one of next week's challenges. Now that I think about it, I wouldn't mind seeing a global parameter that sets the encoding for pages and for outgoing e-mail.
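To make the global-parameter idea concrete, here is a rough sketch of one setting driving both the page content-type and the charset of outgoing mail. It is written in Python with the standard email package purely as an illustration; SITE_CHARSET, page_content_type, and build_mail are hypothetical names, not anything that exists in OpenACS or AOLserver.

```python
from email.message import EmailMessage

# Hypothetical site-wide setting; in OpenACS this would live in the
# parameters file rather than in code.
SITE_CHARSET = "utf-8"


def page_content_type(charset: str = SITE_CHARSET) -> str:
    """Value for the Content-Type header of every HTML page."""
    return f"text/html; charset={charset}"


def build_mail(sender: str, recipient: str, subject: str, body: str,
               charset: str = SITE_CHARSET) -> EmailMessage:
    """Build a plain-text message whose body is declared in `charset`."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    # set_content records the charset in the message's Content-Type header
    # and picks a suitable transfer encoding for the body.
    msg.set_content(body, charset=charset)
    return msg


if __name__ == "__main__":
    mail = build_mail("admin@example.com", "user@example.com",
                      "\u05e9\u05dc\u05d5\u05dd", "\u05e9\u05dc\u05d5\u05dd")
    print(page_content_type())
    print(mail.get_content_charset())
```

The point of the design is only that pages and mail read the same value, so changing the encoding for the site is a single edit instead of a hunt through every template and the sendmail code.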
And of course, none of what I've written here is true i18n; it's just a matter of ensuring that the right text can potentially appear on the screen.