I'm working on how Greenpeace Planet sends locale-specific character encodings, and was hoping for some feedback on one of the implementation details. Eventually, acs-lang will need to address the same problem, and I thought a shared solution would be better.
For the impatient: I need to make sure that every call to ns_return or the like (ns_returnnotice) supplies the correct mimetype/character encoding for a given locale, and I'm not sure what the best place to make the change is.
The gory details:
The basic problem is this: If your site only features a single character set, such as iso-8859-1 for Western European languages, it is easy enough to configure AOLserver to automatically return the right character encoding. If your site features a mix of languages that use different character sets, acs-lang / gp-lang records a character encoding for each locale and can tell you which one to use. Now, you want to send the right character set to the client. What does "sending" a given character encoding to the client mean? You want
- the HTTP headers to correctly specify the character encoding along with the mime type in the "Content-Type" line,
- the same mime type / character encoding to appear in the meta tag in the <head> portion of the HTML, and
- the bytes sent to the browser to be correctly encoded.
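As a rough illustration of those three pieces in plain Tcl (a sketch only: the gp_charsets table and the three gp_* procs are hypothetical names standing in for whatever acs-lang / gp-lang actually provides, and the locale entries are just examples):

```tcl
# Hypothetical table standing in for the per-locale data that
# acs-lang / gp-lang records: locale -> {IANA charset, Tcl encoding name}.
array set gp_charsets {
    en_US {iso-8859-1 iso8859-1}
    ja_JP {shift_jis shiftjis}
}

# 1. Value for the HTTP "Content-Type" header.
proc gp_content_type {locale {mime_type "text/html"}} {
    global gp_charsets
    set charset [lindex $gp_charsets($locale) 0]
    return "$mime_type; charset=$charset"
}

# 2. Matching meta tag for the <head> of the page.
proc gp_meta_tag {locale} {
    return "<meta http-equiv=\"Content-Type\" content=\"[gp_content_type $locale]\">"
}

# 3. Body converted from Tcl's internal Unicode string to the
#    bytes of the locale's encoding.
proc gp_encode_body {locale body} {
    global gp_charsets
    return [encoding convertto [lindex $gp_charsets($locale) 1] $body]
}
```

The third proc is only there to show what "correctly encoded" means; it uses Tcl's standard encoding command and is not something you would necessarily call yourself inside AOLserver.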
Assuming you've written a procedure that will return the locale-specific charset, number two is easy - you just need your templates to call the procedure when they write the meta tag. One and three are a little more complicated. Fortunately, Rob Mayoff wrote a document that explains the messy details. Unless you want to use ns_write to specify the headers yourself, what you need to do is specify the character encoding explicitly when you call ns_return or one of its cousins like ns_returnnotice. Thus
ns_return 200 "text/html; charset=shift_jis" "bla bla"
will include the character set in the header and tell AOLserver how to encode the data.
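To avoid repeating the charset in every call, one option is a thin wrapper. This is a minimal sketch, assuming a hypothetical gp_locale_charset proc that returns the charset name for the current request's locale; gp_return is likewise a made-up name, and ns_return is only available inside AOLserver:

```tcl
# gp_return: hypothetical wrapper around ns_return.
# gp_locale_charset is assumed to exist and to return the charset
# name (for example "shift_jis") for the current locale.
proc gp_return {status mime_type body} {
    # Only touch text types, and leave an explicit charset alone.
    if {[string match "text/*" $mime_type]
        && ![string match -nocase "*charset=*" $mime_type]} {
        append mime_type "; charset=[gp_locale_charset]"
    }
    ns_return $status $mime_type $body
}
```

A page would then call gp_return 200 text/html $page instead of ns_return, and images or other binary types pass through unchanged.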
It looked like there was another option: you can access the output headers as an ns_set through [ns_conn outputheaders] at any point in the thread before you return something to the browser. The problem is that ns_return appends a mimetype to the output headers... so if you try to stuff in the mime type ("Content-Type") beforehand, you'll wind up with a second "Content-Type" line created by ns_return.
So what's the best way to include the character set? The easy solution seems to be to just include a modified version of doc_return in the gp-lang package - doc_return seems to be what the templating system calls when it returns a normal page. Then, of course, you have to make sure non-templated pages call doc_return (regrettably an issue with Planet). At any rate, this seems like a partial solution at best, since the ACS core returns lots of error pages and so on that don't call doc_return (the error pages are hardcoded in English, of course). The only other thing that comes to mind is trying to hack ns_return and its cousins. Any thoughts?
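For what it's worth, hacking ns_return could look roughly like the following, using Tcl's standard rename command. This is an untested sketch: gp_wrap_ns_return, gp_orig_ns_return and gp_locale_charset are all hypothetical names, and it assumes the setup proc runs once during server initialization so that every interpreter picks up the wrapper, which I have not verified for AOLserver.

```tcl
# Hypothetical one-time setup: move the built-in ns_return aside and
# install a wrapper that adds the locale's charset to text/* types.
# gp_locale_charset is assumed to return the current locale's charset.
proc gp_wrap_ns_return {} {
    if {[llength [info commands gp_orig_ns_return]] > 0} {
        return  ;# already wrapped, do nothing
    }
    rename ns_return gp_orig_ns_return
    proc ns_return {status mime_type body} {
        if {[string match "text/*" $mime_type]
            && ![string match -nocase "*charset=*" $mime_type]} {
            append mime_type "; charset=[gp_locale_charset]"
        }
        gp_orig_ns_return $status $mime_type $body
    }
}
```

The appeal is that existing callers, including the ACS core error pages, would not need to change. The cousins of ns_return take different argument lists, so each would need its own wrapper, and replacing a built-in this way could easily surprise other packages.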