Forum OpenACS Q&A: ns_sendmail and utf-8

Posted by Reuven Lerner on
Is there a known problem with ns_sendmail and UTF-8 encoded strings?
I'm running OpenACS 3.x, and my .tcl page successfully received UTF-8
that a user submitted into an HTML form.

The problems begin when I try to send that UTF-8 data in e-mail using
ns_sendmail.  I use a Content-type of "text/plain; charset=utf-8".
Unfortunately, the letters get largely scrambled in transit, and when
they arrive on the other side, they're no longer in legal UTF-8.

I was convinced that the problem had to do with the HTML forms, but
that's clearly no longer the case.  Should I be encoding e-mail as
UTF-7 to avoid such problems in e-mail?  I'm using qmail as my
underlying MTA, and I thought that I could use UTF-8 without any
trouble -- but perhaps I'm wrong.  And this happens even when I send
to another user on the same host, so we're not dealing with any other
networks or MTAs.

It's possible that I need to tickle another configuration parameter
in nsd.tcl and/or my invocation of ns_sendmail.  But where?

Any ideas and insights will be welcomed!

Posted by MaineBob OConnor on

Hi Reuven,

This is not a direct answer, but it may get you headed down the right path. I've had problems with sendmail and line feeds in the output. See this discussion:

ns_sendmail CR/LF lost in some messages

In particular see where I solve the problem on Oct 15th in the last post.

Also, I have used this enhanced code for sending email:

set x_head [ns_set create]
ns_set update $x_head "Errors-To" "errors@rocnet.com"
ns_set update $x_head "MIME-Version" "1.0"
ns_set update $x_head "Content-Type" "text/plain;
     charset="iso-8859-1""
ns_set update $x_head "Content-Transfer-Encoding" "8bit"

...

  regsub -all "
" $body2 "" body2

  if {[catch {
      ns_sendmail $email $my_email $subject $body2 $x_head
  } errmsg]} {
      ns_log Notice "EMail Failure: $email \n Error: $errmsg"
  } else {
      ns_write "$count\n"
  }

Good Luck
-Bob

REF: https://openacs.org/bboard/q-and-a-fetch-msg.tcl?msg_id=0002jp

Posted by MaineBob OConnor on

Oops, the line above lost its backslashes.
It should be:

ns_set update $x_head "Content-Type" "text/plain;
 charset=\"iso-8859-1\""

-Bob

Posted by MaineBob OConnor on

And this correction... the preview shows it fine, but going into and out of the database, the backslash gets lost.

regsub -all "\r" $body2 "" body2

-Bob

Posted by Reuven Lerner on

Bob, your answer was indeed quite helpful! I now have UTF-8 e-mail working just fine.

Bob put me on track to modify the SMTP headers. I set the "Content-Transfer-Encoding" header to "8bit" in outgoing e-mail. This ensured that I was getting the right number of bytes on the receiving end, but they weren't quite the right ones.

Thanks to Google, I found an article by Henry Minsky in which he described how he had to rewrite part of ns_sendmail in order to set the encoding on the outgoing socket. I figured that I might as well try it, and modified modules/tcl/sendmail.tcl such that it now sets the encoding to utf-8.

Sure enough, things now work! Thanks again for the help.
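
For anyone following along, here is a minimal sketch of the idea. It is not the actual patch to modules/tcl/sendmail.tcl. It runs in a plain tclsh (Tcl 8.x) and uses stdout as a stand-in for the SMTP socket. The proc name send_line and the sample body are assumptions made up for illustration. In sendmail.tcl the same fconfigure call would presumably be applied to the write channel of the socket opened to the mail server.

```tcl
# Sketch only: plain Tcl 8.x, runnable in tclsh without AOLserver.
# send_line is a hypothetical stand-in for whatever helper in
# sendmail.tcl writes one line to the SMTP server socket.
proc send_line {chan line} {
    puts -nonewline $chan "$line\r\n"
    flush $chan
}

# Sample body with non-ASCII characters (made up for this example).
set body "caf\u00e9 \u05e9\u05dc\u05d5\u05dd"

# Declare the channel encoding explicitly, so that Tcl converts its
# internal Unicode string to UTF-8 bytes when writing.
# stdout stands in for the SMTP socket here.
fconfigure stdout -encoding utf-8 -translation lf
send_line stdout $body

# The same string yields different byte sequences depending on the
# encoding used on output, which is why the channel setting matters.
puts "characters:      [string length $body]"
puts "utf-8 bytes:     [string length [encoding convertto utf-8 $body]]"
puts "iso8859-1 bytes: [string length [encoding convertto iso8859-1 $body]]"
```

The Content-Type charset in the headers and the encoding set on the channel should agree. If the header says utf-8 but the channel writes in some other encoding, the receiving side will decode the bytes incorrectly.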