Forum OpenACS Q&A: Response to i18n woes on a very simple site
Another Unicode encoding, known as UCS-2, uses two bytes for every character. The good news is that every character has the same fixed width. The bad news is that it's not backward-compatible with ASCII.
However, an ASCII document converted into UCS-2 is still readable by humans: It begins with two marker bytes (the byte-order mark), followed by the UCS-2 characters. In big-endian byte order, each UCS-2 character consists of a null (zero-value) byte followed by its ASCII equivalent; in little-endian order the two bytes are swapped. So "ABC" becomes "marker marker null A null B null C".
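To make the byte layout described above concrete, here is a small Python sketch. Python has no codec named "ucs-2", so this assumes the "utf-16-be" codec as a stand-in, which produces the same bytes for any character that fits in two bytes. The helper name to_ucs2_with_marker is made up for this illustration.

```python
import codecs

def to_ucs2_with_marker(text: str) -> bytes:
    """Encode text as big-endian two-byte units, prefixed by the two marker bytes.

    Assumption: "utf-16-be" stands in for UCS-2. The two agree for every
    character up to U+FFFF, so anything larger is rejected here.
    """
    if any(ord(ch) > 0xFFFF for ch in text):
        raise ValueError("character does not fit in two bytes")
    # BOM_UTF16_BE is the two marker bytes FE FF.
    return codecs.BOM_UTF16_BE + text.encode("utf-16-be")

ucs2 = to_ucs2_with_marker("ABC")
utf8 = "ABC".encode("utf-8")

print(ucs2.hex(" "))  # fe ff 00 41 00 42 00 43
print(utf8.hex(" "))  # 41 42 43
```

The first line of output is "marker marker null A null B null C" in hex. The second shows why UTF-8 is backward-compatible with ASCII: plain ASCII text encodes to exactly the same bytes.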
For what it's worth, Microsoft systems seem to have standardized on UCS-2. The open-source languages and systems that I use, such as Perl, Python, Tcl, and PostgreSQL, instead seem to have adopted UTF-8.