Forum OpenACS Q&A: Mysterious characters appearing: A-circumflex, A-tilde

I'm having problems with random characters (most often uppercase A-tilde and uppercase A-circumflex) appearing in text entered through certain form fields or upload operations.

I'm running OACS 3.2.5/AOLserver 3.4.2/Postgres 7.2.4 on BSD.

The only pattern I've definitely identified is that two spaces in a row seem to produce an uppercase A-circumflex character.  For example, this happens when I upload a CSV file with a field that contains two or more contiguous spaces.

These characters also appear at times when using the browser-based WYSIWYG editor, called HTMLArea, referred to in a recent post (https://openacs.org/forums/message-view?message_id=124773).  However, it doesn't seem to happen when using a regular textarea HTML tag.

I've even resorted to the ugly alternative of trying to filter out the characters after the text has already been inserted into the database.  But I've had very little luck matching or re-mapping the characters.

Has anyone else seen this?  Any suggestions?

Maybe they are "gremlins" (as defined by Bare Bones Software). Gremlins are strange characters that often appear when text is moved or transferred between different character standards, such as ASCII, UTF, ISO, etc., or, in the case of file transfers, between platforms such as Apple and MS Windows.
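
One plausible way this can happen, though nothing in this thread confirms it is the cause here: text encoded as UTF-8 gets read back as ISO-8859-1. A non-breaking space, which browser-based editors may insert for runs of spaces, is two bytes in UTF-8, and the first of them displays as A-circumflex in ISO-8859-1. Accented letters such as e-acute produce A-tilde the same way. A minimal Tcl sketch of that round trip (show_misread is a made-up helper name, not an OpenACS or AOLserver proc):

```tcl
# Sketch: how text encoded as UTF-8 looks when it is read back as ISO-8859-1.
# show_misread is a hypothetical helper name used only for this illustration.
proc show_misread {text} {
    # Encode the characters as UTF-8 bytes ...
    set bytes [encoding convertto utf-8 $text]
    # ... then decode those same bytes as if they were ISO-8859-1.
    return [encoding convertfrom iso8859-1 $bytes]
}

# A non-breaking space (U+00A0) comes back as A-circumflex (U+00C2) plus U+00A0.
set nbsp_misread [show_misread "\u00A0"]

# An e-acute (U+00E9) comes back as A-tilde (U+00C3) plus the copyright sign (U+00A9).
set eacute_misread [show_misread "\u00E9"]
```

If the stray characters always show up in pairs like these, a mismatch between UTF-8 and ISO-8859-1 somewhere between the browser, the server and the database is a reasonable thing to check first.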

best wishes for identifying and quashing the culprit originating characters! =)

AOLserver 3.4.2 does not have the OpenACS internationalization patches, 3.3+ad13 and 4.0 do. I don't recall exactly, but this may have something to do with your character set problems.

What character set is your PostgreSQL database using? What character set is your "HTMLArea" editor using?

Depending on your answers to the above, getting all your content straightened out may well be highly non-trivial. But I'm no expert there and don't have a good list of links. Some people here definitely have more in-depth experience in this...

I have seen EXACTLY this same problem.  By using AOLserver 3.3+ad13 the problem went away.  Thus I recommend you do the same if possible.

Thanks for the responses.  Torben's explanation sounds suitably mysterious for such a nebulous problem.  But I guess misery loves company, because it's a relief to know at least one other person has seen the same thing.

I'm using HTMLArea straight out of the box, and I had assumed that it would inherit the character set of the page it's embedded in, which is iso-8859-1.

I didn't set up the server and can't change it myself, but I'll see if we can add the patches to AOLserver or switch to a different version.  I'll also find out what character set Postgres is using.

Postgres is set up using the default character set and probably mostly default options.

Does anyone know if the OpenACS internationalization patches can be applied to AOLserver 3.4.2 and where I can get the most recent version?

Posted by Don Baccus on
I think your best bet is to try Patrick's approach and move to AOLserver 3.3+ad13.  ArsDigita addressed internationalization issues that AOL hadn't dealt with (the fixes are in the standard AOLserver 4.0, which is still in beta, though).

Can you set this up on a test server to verify if it fixes the problem, then get the production server changed?

Thanks, Don. Unfortunately, I have limited control over the server configuration, but setting up a test server is a good idea. If it works, then I can build my case to make the change.

In case anyone else needs a kludgy workaround for this problem, you can filter your input and re-map the characters (I couldn't get any kind of regsub to work).

This will give you a quick list of the codes needed to do the re-mapping (I couldn't find a list of these anywhere!):

for {set i 0} {$i <= 800} {incr i} {
    ns_write "Code $i = \\$i <br>"
}
For example, Code 755 = Ã

When you know the codes for the problem characters you can re-map them like so:

[string map [list \755 {}] $text]
This would replace the uppercase A-tilde characters in $text (the variable holding your input) with nothing.
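
If the octal codes don't behave the same way on your setup, here is a rough variant of the same idea using Tcl's \u escapes. It assumes the stray characters reach Tcl as U+00C2 (A-circumflex) and U+00C3 (A-tilde), which may not hold on every server build, and strip_gremlins is just a made-up name:

```tcl
# Sketch only. strip_gremlins is a hypothetical helper name.
# Assumes the stray characters arrive as U+00C2 and U+00C3.
proc strip_gremlins {text} {
    # string map tries the keys in list order, so the two-character
    # pair is listed before the single A-circumflex.
    return [string map [list \
        "\u00C2\u00A0" " " \
        "\u00C2"       ""  \
        "\u00C3"       ""  \
    ] $text]
}

# Example: A-circumflex plus non-breaking space collapses to one plain space.
set cleaned [strip_gremlins "two\u00C2\u00A0spaces"]
```

Keep in mind this is still a kludge: it also strips any A-circumflex or A-tilde that someone typed on purpose, and it leaves behind whatever character followed a stray A-tilde.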

Walter, yes, people have forward ported the internationalization patches from 3.3+ad13 to later 3.x servers, I think including 3.4.2 and 3.5.x. But very few people use those patches. And the AOLserver team decided to ignore them (a long time ago now) in order to focus their efforts on 4.0.

It should be both easier and better for you to simply switch to either 3.3+ad13 or 4.0. Note that OpenACS officially supports only 3.3+ad13, but will also support 4.0 soon (I think as soon as it's officially released, and has nsopenssl working).