Forum OpenACS Q&A: Japanese patches yes, username w/ J. characters no

I am having a problem: the Japanese patches have been applied
successfully and Japanese pages display properly, but when trying to
add yourself as a user, Japanese characters in the username do not work.

Is this a design issue?

Should the user be able to enter Japanese characters?  My thinking is
that perhaps the user should be constrained to simple ASCII, since
OpenACS uses the username to send email.
Thoughts, experiences, suggestions most welcome.

From what I remember of the mail RFCs, aren't i18n email addresses legal?  Not the domains, just the <addr> part of <addr>@domain?  I remember lots of incompatibility with MTAs that don't hew to this, but the RFCs say go.
Let me look this up.
I would say that for Japanese, you should restrict the email
address to ASCII, or you'll have trouble. The user's name,
however, should allow kanji. In Japanese you also need to
add a field to the user info which contains the pronunciation (in
kana) of the user's name, since the kanji are ambiguous and
people sometimes don't know the obscure ones.

I have a note about how to convert half-width kana to full-width,
using a Tcl routine. Sometimes people enter the pronunciation in
half-width (aging Microsoft charset) and it screws up the
sort order unless you canonicalize it to full-width kana.
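
To illustrate the canonicalization step, here is a minimal sketch in Python rather than Tcl (this is not the Tcl routine mentioned above). It assumes the input is already a Unicode string and uses Unicode NFKC normalization, applied only to runs of half-width kana so that other characters are left alone. The function name to_fullwidth_kana is made up for this example.

```python
import re
import unicodedata

# Halfwidth katakana range U+FF61..U+FF9F, which also covers the
# halfwidth punctuation and the halfwidth (han)dakuten marks.
_HALFWIDTH_KANA_RUN = re.compile("[\uFF61-\uFF9F]+")


def to_fullwidth_kana(text: str) -> str:
    """Convert halfwidth kana in text to fullwidth kana.

    NFKC is applied per run of halfwidth kana, so a base character followed
    by a halfwidth voicing mark is composed into one fullwidth character.
    Everything outside those runs is returned unchanged.
    """
    return _HALFWIDTH_KANA_RUN.sub(
        lambda match: unicodedata.normalize("NFKC", match.group(0)),
        text,
    )
```

Running the stored pronunciation through something like this before saving it means half-width and full-width entries of the same reading compare equal and sort together.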

If you are supporting a Japanese "community" style site, you don't want to let them submit Japanese characters in the email field. As Todd mentioned, the mail RFC says double-byte usernames before the "@" are theoretically possible, but I've never actually seen this in practice... everyone just uses ASCII.
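
A rough sketch of the ASCII-only restriction on the email field, in Python for illustration (OpenACS itself is Tcl). The helper name is_ascii_email is hypothetical, and the check is deliberately loose: it only rejects non-ASCII input, whitespace, and anything that is not of the form local@domain. It is not a full RFC address parser.

```python
def is_ascii_email(address: str) -> bool:
    """Return True if address is pure ASCII and shaped like local@domain."""
    # Reject kanji, kana, fullwidth Latin and any other non-ASCII input.
    if not address.isascii():
        return False
    # Reject embedded spaces, tabs and newlines.
    if any(ch.isspace() for ch in address):
        return False
    local, sep, domain = address.rpartition("@")
    # Require exactly one "@" with something on both sides of it.
    return bool(sep) and bool(local) and bool(domain) and "@" not in local
```

The name fields would not go through this check, so kanji stays allowed there.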

As Henry said, you will want to add some additional fields for the katakana spellings (furigana) of people's names. I've found that you will end up adding the following to the users table:


Then stick the kanji names in first_names and last_name.

You probably want to sort on family_furigana_name.

Also, keep in mind that last name comes first, so you will want to adjust this in the UI.
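
A small sketch of both points, again in Python for illustration. The field names first_names, last_name and family_furigana_name come from this thread; the sample users and the helper names are made up. Sorting on the katakana reading by code point only approximates dictionary order, so treat it as a starting point.

```python
# Made-up sample rows; in practice these would come from the users table.
users = [
    {"last_name": "山田", "first_names": "太郎", "family_furigana_name": "ヤマダ"},
    {"last_name": "鈴木", "first_names": "一郎", "family_furigana_name": "スズキ"},
    {"last_name": "阿部", "first_names": "花子", "family_furigana_name": "アベ"},
]


def sort_by_family_reading(rows):
    """Sort users on the kana reading of the family name, not on the kanji."""
    return sorted(rows, key=lambda row: row["family_furigana_name"])


def display_name(row):
    """Format a name family-name-first, as is usual in Japanese."""
    return f'{row["last_name"]} {row["first_names"]}'


for row in sort_by_family_reading(users):
    print(display_name(row))
```

Sorting on last_name directly would order these by kanji code point, which has nothing to do with how the names are read.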

One other thing: for double-byte varchar and char fields, you will want to triple the size of the field. This is less of an issue for the name fields, as names are only 1-3 kanji.
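
To see where the factor of three comes from, here is a quick Python check. It assumes the database stores text as UTF-8 and counts column sizes in bytes; if your setup counts characters or uses a different encoding, the numbers will differ. The helper names are made up for this example.

```python
def utf8_len(text: str) -> int:
    """Number of bytes text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


def fits_column(text: str, byte_limit: int) -> bool:
    """True if text fits in a column limited to byte_limit bytes."""
    return utf8_len(text) <= byte_limit


print(utf8_len("abc"))     # ASCII: one byte per character
print(utf8_len("山田"))     # common kanji: three bytes per character
print(utf8_len("ヤマダ"))   # katakana: three bytes per character
```

The furigana fields are the ones to watch, since a reading in kana is usually longer than the kanji it spells out.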