Forum OpenACS Q&A: ns_sendmail stalls out
Error: Expected a 220 status line; got: Expected a 220 status line; got: while executing _ns_sendmail ...etc etc
After this happens, for several minutes port 25 appears choked (or my MTA -- exim -- is choked) since subsequent attempts by the system to deliver mail will generate this in the aolserver log:
Error: could not connect to "localhost:25"
Otherwise, when sending individual messages or smaller groups of email, everything works fine.
I see in this thread https://openacs.org/bboard/q-and-a-fetch-msg.tcl?msg_id=0000Jd&topic_id=11&topic=OpenACS that ns_sendmail doesn't use the MTA but connects to port 25 directly. However, I also see in my server's log files (/var/log/exim/mainlog) logs of all the mail that ns_sendmail is sending, suggesting that ns_sendmail *does* engage the MTA.
Furthermore, in this thread https://openacs.org/bboard/q-and-a-fetch-msg.tcl?msg_id=0001DG&topic_id=11&topic=OpenACS qmail is described as a solution for high-throughput email needs. Why would this be if ns_sendmail doesn't use the MTA? Am I misunderstanding ns_sendmail or MTAs?
A similar sort of problem has been reported here before https://openacs.org/bboard/q-and-a-fetch-msg.tcl?msg_id=0001DT&topic_id=11&topic=OpenACS but that thread included no explanation or solution.
So what might be going on? At what level is the problem?
- Is this a problem with slow DNS responses during email delivery leading to gridlock at port 25? I'm not running BIND locally but rather getting DNS from my ISP.
- Is there some inherent limitation with how fast ns_sendmail can run? Do you have to put kludgey code in loops to slow them down? (I presume that sounds fairly idiotic; sorry.)
- Is this an exim configuration problem? (I see no error messages in the exim logs though.)
- Should this propel me to move to qmail? Or, if exim otherwise works OK, should this propel me to somehow use exim for outbound mail instead of having ns_sendmail talk to port 25 directly? And what is the best way to do that -- overload ns_sendmail to call qmail instead, or just brute-force rip out all the ns_sendmail calls in the code by hand?
It's definitely a problem with your MTA - after all, for several minutes afterwards it is refusing connections entirely.
I don't know whether exim's faster or slower than qmail etc etc. However I do know that a properly configured mail server can keep up with a lot of ns_sendmail traffic. Openacs.org, for instance, spams more than 30 users every time one of us posts to the bboard ...
I'm running this at the end of a DSL -- 784-1.5Mdown/128up. So maybe that's it.
Glad to know that ns_sendmail *does* use the default MTA. But if the problem is my uplink bandwidth, does qmail's queueing mechanism allow a better buffer against this limitation than does exim? And is the OpenACS site running on a big pipe in addition to using qmail?
I believe qmail has a good queueing mechanism. Also, if your MTA is running through tcp_wrappers, it might stop functioning: for some weird reason tcp_wrappers will treat multiple connections as a DoS attack. This has happened to me. I tried to spam using ns_sendmail, and tcp_wrappers killed port 25 because it thought it was under a DoS attack.
I started that thread (msg_id=0001DG) that mentions using qmail. We didn't. Instead, we use Postfix (postfix.org) and regularly send out over 12,000 messages in one loop through the db using ns_sendmail. Other than slowing down the server for a while, it works great and has never stopped in over 9 months (knock on wood!).
One, I recall that ns_sendmail does have some limit on the number of email addresses you can hand it in a single ns_sendmail call. Give it too many addresses, and it will break. Unfortunately I don't remember why it breaks, or what piece of the process was causing it to break. I think I last tangled with that problem more than a year ago, but I assume it still exists. If anybody remembers why that problem exists, or if it's been fixed, please chime in.
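If the per-call address limit mentioned above is what bites you, one workaround is to split a long recipient list into smaller batches and make one send call per batch. The sketch below is in Python purely for illustration (the real code in this thread would be Tcl); the helper name `batches` and the batch size of 3 are hypothetical, and the actual limit in ns_sendmail, if there still is one, is not known here.

```python
from typing import Iterator, List, Sequence


def batches(addresses: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` addresses."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(addresses), size):
        yield list(addresses[start:start + size])


# Hypothetical recipient list; in practice this would come from the db.
recipients = [f"user{i}@example.com" for i in range(7)]

# Each chunk would be handed to one send call instead of the whole list.
chunks = list(batches(recipients, 3))
```

Every address ends up in exactly one batch, and order is preserved, so a failure can be traced to a specific batch.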
Two, your use of the phrase "doesn't use the MTA" is confusing.
ns_sendmail is actually implemented by a handful of pretty simple Tcl procedures, and it always sends email by connecting to port 25 and talking to the MTA via SMTP on that port. So ns_sendmail always "uses the MTA". In most cases, ns_sendmail is configured to connect to port 25 on the local box, but you can configure it to try to connect to any MTA in the world, if you want. So it would be more precise to say that ns_sendmail always uses an MTA.
However, ns_sendmail definitely never sends email by calling any of the MTA unix commands, like qmail-inject. So ns_sendmail certainly does not use the MTA's unix command-line API in any way.
One interesting question is if there's any reason to prefer using a unix command like qmail-inject over ns_sendmail's more general method of SMTP to localhost on port 25. I know Bernstein's qmail docs discuss how hideously slow the SMTP protocol can be, so for large volumes of email, qmail-inject might be better, but I don't really know. And I'd bet that AOLserver and qmail running on the same box can each churn through a lot of ns_sendmail SMTP traffic on port 25 before the inefficiency of SMTP becomes a problem.
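To make "talking to the MTA via SMTP on port 25" concrete, here is a rough sketch of the command sequence a minimal SMTP client walks through, paired with the reply code the SMTP standard says to expect at each step. It is written in Python for illustration only and is not a transcript of ns_sendmail's actual Tcl code; the function names `smtp_dialogue` and `dot_stuff` are made up for this example. Note the 220 greeting at the top: that is the status line the error in the original post complains about not receiving.

```python
from typing import List, Sequence, Tuple


def dot_stuff(body: str) -> str:
    """Double a leading '.' on any line so it is not read as end-of-data."""
    lines = body.splitlines()
    stuffed = ["." + line if line.startswith(".") else line for line in lines]
    return "\r\n".join(stuffed)


def smtp_dialogue(helo_host: str, sender: str, recipients: Sequence[str],
                  body: str) -> List[Tuple[str, int]]:
    """Return (what the client sends, reply code it expects) for one message."""
    steps: List[Tuple[str, int]] = [("<connect>", 220)]  # server greeting
    steps.append((f"HELO {helo_host}", 250))
    steps.append((f"MAIL FROM:<{sender}>", 250))
    for rcpt in recipients:
        steps.append((f"RCPT TO:<{rcpt}>", 250))  # one round trip per address
    steps.append(("DATA", 354))                    # "go ahead, send the message"
    steps.append((dot_stuff(body) + "\r\n.", 250))  # lone "." ends the message
    steps.append(("QUIT", 221))
    return steps


for command, expected in smtp_dialogue(
        "localhost", "list@example.com",
        ["a@example.com", "b@example.com"], "Hello\n.signature"):
    print(expected, repr(command))
```

Each step is a separate round trip, which is the overhead being compared against handing the message straight to a local command like qmail-inject.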
Given your success with postfix, Bob, I'm going to persevere with exim. Maybe the fact that exim has been running from /etc/inetd.conf and not as a daemon is the problem. (I've gotta figure out why debian installed it that way this last time.)
Interesting point about tcp_wrappers, Jun, though I'm not doing that.
Thanks for all the helpful comments!!
ns_sendmail builds and tears down a socket connection every time you call it ...
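For contrast, here is a sketch of the alternative: open one SMTP connection and push many messages through it before closing. It uses Python's standard smtplib rather than AOLserver's Tcl API, so treat it as an illustration of the idea and not as a drop-in replacement for ns_sendmail. The function name `send_all` and the `connect` parameter are hypothetical; the default assumes an MTA listening on localhost port 25.

```python
import smtplib
from typing import Callable, Iterable, Tuple

# (envelope sender, recipient, full message text including headers)
Message = Tuple[str, str, str]


def send_all(messages: Iterable[Message],
             connect: Callable[[], "smtplib.SMTP"] =
             lambda: smtplib.SMTP("localhost", 25)) -> int:
    """Send every message over a single SMTP connection; return the count."""
    server = connect()  # one connection for the whole run
    sent = 0
    try:
        for sender, recipient, text in messages:
            server.sendmail(sender, [recipient], text)
            sent += 1
    finally:
        server.quit()  # close the connection even if a send fails
    return sent
```

With a few thousand recipients this means one connection to the MTA instead of a few thousand, which matters if something in front of the MTA limits how fast new connections may arrive.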
I think I must have been sleepy: it was not tcp_wrappers but inetd. That is your problem. Look at the logs and you will see something like "shutting down port 25 blah blah for 10 mins". You have to run exim as a daemon and not from inetd, or you will never get to spam a lot of emails. I am sure inetd is your problem. I also suggest that you look into qmail.