Forum .LRN Q&A: notifications keep pumping out messages

Posted by Ben Koot on
Hi folks,

ANYBODY FAMILIAR WITH MAIL LITE, PLEASE HELP!!!

Sorry for shouting, but this is becoming a nightmare...

One of my clients posted a message in his bugtracker, and from that point on he keeps receiving mail notifications every 2-3 minutes. Does anybody have any idea how to stop this?

I have closed the bug, unassigned the client from it, and stopped the server for half an hour, but the messages keep coming.

I have now temporarily closed the account, which should halt the sending of messages.

Thanks for your hints

Ben

Posted by Matthew Geddert on
I had something similar happen a while ago, and it didn't have anything to do with openacs/aolserver/etc. Especially since the server was stopped and the messages kept coming, I would seriously consider that the mail server might be at fault. What happened to my school's server is that the partition holding the mail queue had errors, and Linux (wisely?) attempted to fix itself and automatically remounted the partition read-only. The message therefore remained in the mail server queue: my MTA would read it, send it, attempt to delete it (which it couldn't), and then attempt to send it again. The same read-only remount problem later happened with the mail storage maildirs (which were accessed through POP). The email client (i.e. Outlook or whatever you use) would download via POP, tell the mail server to delete the message, which it couldn't do, and then download the same message again at whatever interval the client was set to check mail (and every 2-3 minutes sounds like that could be the case here).

To check whether this is the issue, you could look at the server logs or dmesg. To fix it, log into your server, stop the mail program, and unmount the partition that contains the mail queue and/or mail storage. Then run fsck (or whatever program you need for your partition type) and mount it again read-write. The other way of doing this, if you don't know what I'm talking about, is to restart your server, since this should automatically perform the same steps. Mind you, if there are serious hard drive partition problems, your server might not reboot. Good luck.
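
As a quick way to test the read-only theory without unmounting anything, here is a minimal Python sketch. It assumes a Linux host and text in the format of /proc/mounts (device, mount point, filesystem type, options); the function name is made up for illustration. Some pseudo filesystems are legitimately mounted read-only, so only a hit on the partition holding the mail queue or the maildirs would be interesting.

```python
def read_only_mounts(mounts_text):
    """Return the mount points whose option list contains 'ro'.

    mounts_text is expected to look like the contents of /proc/mounts:
    one mount per line, whitespace-separated fields, options in field 4.
    """
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point, options = fields[1], fields[3]
        # Options are comma-separated; match 'ro' exactly so that
        # 'errors=remount-ro' on a read-write mount is not a false hit.
        if "ro" in options.split(","):
            result.append(mount_point)
    return result
```

On the server you would pass it the result of open("/proc/mounts").read() and look for the spool or mail storage partition in the returned list.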

Posted by Ryan Gallimore on
This also happened to me once with a forum notification that was stuck in the queue. I think sendmail kept failing (but still sending) due to memory errors, so the item was never removed from the queue. I fixed it by clearing the queue but never came up with a real solution. Meanwhile, a user received over 1,000 copies of the same notification. Ouch.
Posted by Ben Koot on
My client received over 2000 mails. We checked the system and found an issue with the mail server itself, due to a change recently made to improve the anti-virus scanning. Fixed now ... beyond that, there doesn't seem to be anything else wrong with the system. We've shut down and checked the mail queues, and there is nothing stuck in there ...

Thanks for your help folks
Ben

Posted by Ben Koot on
EMERGENCY STOP

Client just confirmed receipt of another 700 mails. So still no solution. Is there no way to kill the mail loop?

Thanks
Ben

I think we're now down to about 3000 mails AAAUUCH

- Closing down the account does not help
- There's no way to fully remove the email from the system.

Logic would suggest that a simple action like that (overruling the user's involvement in the system) would kill the mail process.

Dr Spock 😉

Posted by Ben Koot on
Advice from my hosting service, based on the default info in the Mail Lite docs: "please note that the below changes to Postfix's main.cf file will break your mail system, so I wouldn't recommend doing them ..."

User Documentation for ACS Mail Lite

ACS Mail Lite handles sending of email via sendmail or SMTP and includes a
bounce management system for invalid email accounts.

When called to send a mail, the mail will either get sent immediately or placed
in an outgoing queue (changeable via parameter) which will be processed every
few minutes.

ACS Mail Lite uses either sendmail (you have to provide the location of the
binary as a parameter) or SMTP to send the mail. If the sending fails, the mail
will be placed in the outgoing queue again and be given another try a few
minutes later when processing the queue again.

Each email contains an X-Envelope-From address constructed as follows:
The address starts with "bounce" (can be changed by a parameter) followed by the
user_id, a hashkey and the package_id of the package instance that sent the
email, separated by "-". The domain name of this address can be changed with a
parameter.
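
To make the address layout concrete, here is a small Python sketch of building and splitting such an address. This is not the ACS Mail Lite code (which is Tcl); the function names are hypothetical, and it assumes the hashkey itself never contains a "-" (the hashkeys in the Postfix log later in this thread are plain hex).

```python
def build_bounce_address(user_id, hashkey, package_id, domain, prefix="bounce"):
    """Join prefix, user_id, hashkey and package_id with '-' and add the domain."""
    local_part = "-".join([prefix, str(user_id), hashkey, str(package_id)])
    return local_part + "@" + domain


def parse_bounce_address(address, prefix="bounce"):
    """Split a bounce address back into its parts, or return None if it does not fit.

    Assumption: none of the four parts contains '-', so a plain split is enough.
    """
    local_part, _, domain = address.partition("@")
    parts = local_part.split("-")
    if len(parts) != 4 or parts[0] != prefix:
        return None
    return {
        "user_id": parts[1],
        "hashkey": parts[2],
        "package_id": parts[3],  # may be empty, as in the log further down
        "domain": domain,
    }
```

Note that an address can legitimately end in "-@domain" when the package_id is empty; the parser above returns an empty string for it rather than rejecting the address.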

The system checks every 2 minutes (configurable) in a certain mail directory
(configurable) for newly bounced emails, so the mailsystem will have to place
every mail to an address beginning with "bounce" (or whatever the appropriate
parameter says) in that directory. The system then processes each of the bounced
emails, strips out the message_id and verifies the hashkey in the
bounce-address. After that the package-key of the package sending the original
mail is determined by using the package_id provided in the bounce address. With
that, the system then tries to invoke a callback procedure via a service
contract if one is registered for that particular package-key. This enables each
package to deal with bouncing mails on their own - probably logging this in
special tables. ACS Mail Lite then logs the event of a bounced mail of that
user.

Every day a procedure is run that checks if an email account has to be disabled
from receiving any more mail. This is done the following way:

* If a user received his last mail X days ago without any further bounced
mail then his bounce-record gets deleted since it can be assumed that his email
account is working again and no longer refusing emails. This value can be
changed with the parameter "MaxDaysToBounce".
* If more than Y emails were returned by a particular user then his email
account gets disabled from receiving any more mails from the system by setting
the email_bouncing_p flag to t. This value can be changed with the parameter
"MaxBounceCount".
* To notify users that they will not receive any more mails and to tell them
how to reenable the email account in the system again, a notification email gets
sent every 7 days (configurable) up to 4 times (configurable) that contains a
link to reenable the email account.
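
The first two rules of the daily check can be sketched in Python as below. This is only one reading of the documentation, not the actual procedure: the function name and return values are invented, the date is taken to be the user's most recent bounce, the order of the two checks is a guess, and the values for X ("MaxDaysToBounce") and Y ("MaxBounceCount") must come from your own parameter settings.

```python
from datetime import date, timedelta


def daily_bounce_action(last_bounce, bounce_count, today,
                        max_days_to_bounce, max_bounce_count):
    """Decide what the daily sweep would do with one user's bounce record.

    Returns "disable" (set email_bouncing_p to t), "delete_record"
    (the account is assumed to be working again) or "keep".
    """
    # More than Y bounces: stop sending mail to this account.
    if bounce_count > max_bounce_count:
        return "disable"
    # X days without a further bounce: forget the earlier bounces.
    if today - last_bounce >= timedelta(days=max_days_to_bounce):
        return "delete_record"
    return "keep"
```

The reminder mails with the re-enable link (every 7 days, up to 4 times) are left out of the sketch.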

To use this system, here is a quick guide on how to set it up with Postfix.

* Edit /etc/postfix/main.cf
    o Set "recipient_delimiter" to "-"
    o Set "home_mailbox" to "Maildir/"
    o Make sure that /etc/postfix/aliases is hashed for the alias database

* Edit /etc/postfix/aliases. Redirect all mail for "bounce" (if you leave the
parameter as it was) to "nsadmin" (in case you only run one server).

In case of multiple services on one system, create a bounce email for each of
them (e.g. changing "bounce" to "bounce_service1") and create a new user that
runs the aolserver process for each of them. You do not want to have service1
deal with bounces for service2.

Posted by Ryan Gallimore on
Okay, but I wasn't using postfix when I had a mail stuck in the queue... just using the sendmail binary.
Posted by Ben Koot on
I am just using default oacs 5.2.2 and only reposted the system docs explaining our mail system. I am really stuck right now. It could be that the documentation is out of date, I don't know.

Ben

Posted by Matthew Geddert on
Are you certain that it's openacs and not your mail server? I'm guessing your number one goal is to get the messages to stop for that user. To make sure, you can do the following:

1. Check the acs_mail_lite_queue and make sure there is no message in there that is being sent but not deleted. If there is one in there that is being sent and re-sent, you could manually delete it.
2. Change this user's email address to another one (possibly yours) and check that email address to see if messages are now routed there (in which case you can be pretty sure it is in fact openacs that is sending the messages). This way you will get the annoying messages instead of your client, which makes it less stressful to figure out the problem.
3. If you've done step two and that didn't fix it, you can do a full pg_dump of the database and then grep for that user's email address. This will let you find any other tables that might contain this person's email address in openacs, which you can then manually update to your address with psql.
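
A plain grep over the dump shows the matching rows but not which table they belong to. The Python sketch below adds that, assuming a plain-text pg_dump in the default COPY format (each table's data starts with a "COPY tablename (...) FROM stdin;" line and ends with a line containing only "\."). It will not work on a custom-format dump or one made with --inserts, and the function name is made up.

```python
import re

# Start of a table's data block in a plain-text dump.
COPY_RE = re.compile(r"^COPY\s+(\S+)")


def tables_mentioning(dump_lines, needle):
    """Return {table_name: number_of_rows_containing_needle}."""
    hits = {}
    current_table = None
    for line in dump_lines:
        line = line.rstrip("\n")
        if current_table is None:
            match = COPY_RE.match(line)
            if match and line.endswith("FROM stdin;"):
                current_table = match.group(1)
        elif line == "\\.":
            current_table = None  # end of this table's data
        elif needle in line:
            hits[current_table] = hits.get(current_table, 0) + 1
    return hits
```

Usage would be along the lines of tables_mentioning(open("dump.sql"), "client@example.com"), where both the file name and the address are placeholders.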

If this person's email address is NOT in the pg_dump, and they are still getting email messages after a restart of postgresql and aolserver, then there is clearly something wrong with your mail server setup. That's something different to tackle, but at least we will know what to look into... I have had mail stuck in the queue with sendmail, postfix and qmail (I haven't used other MTAs). I am guessing this is the problem, but the above steps should help us isolate where the problem is... In all seriousness, it COULD be that this user's email service is having problems, and not your server, but I wouldn't tell them that unless you are sure it's the case...

Posted by Carl Robert Blesius on
Had an OpenACS related mail bombardment almost a year ago, but I am pretty sure we committed a fix.

It had something to do with the contents of a forums notification not being quoted correctly and a period on a single line caused the bombardment (it just happened to be in the middle of the message, which was interpreted as “end of message” causing the first part of the notification to be sent out before OpenACS was finished writing it to the filesystem IIRC).

Although it does not sound like your problem, check if the message has a single period on a line.
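
For anyone wanting to check a message body for this, here is a minimal Python sketch. It is not the fix that went into OpenACS; it only shows how to spot a lone period and how the standard SMTP transparency rule ("dot-stuffing": a line that starts with a period gets one extra period prepended before it goes over the wire) keeps such a line from being read as end of message. The function names are made up.

```python
def lone_dot_lines(body):
    """Return the 1-based line numbers that consist of a single period."""
    return [number
            for number, line in enumerate(body.splitlines(), start=1)
            if line == "."]


def dot_stuff(body):
    """Prepend an extra '.' to every line that starts with '.'.

    The receiving side strips the extra period again, so the text arrives
    unchanged but a lone '.' can no longer terminate the message early.
    """
    return "\n".join("." + line if line.startswith(".") else line
                     for line in body.split("\n"))
```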

Ben, I hope you are not torturing this user as you figure this out. If you do not know how to get to the root of the problem, figure out a way to use your thumb (like http://en.wikipedia.org/wiki/Hansje_Brinker) in the meantime: e.g. change that user's email address in OpenACS to a temporary gmail account (this is assuming the problem is caused by OpenACS) so it acts as a big bucket.

Posted by Ben Koot on
Here's what's happening ...
http://www.timedeskblog.com/page/
Posted by Matthew Geddert on

As far as I know, if the message id:

Message-Id: <-1704821747.1144160402.oacs@timedeskblog.com>

is the same for every message being sent, you are sending one message from openacs (which is being handled incorrectly by your mail server). If the message id is different, you are sending multiple messages from openacs.
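
With thousands of copies it is easier to let a script do the comparison. A minimal Python sketch, assuming you have a handful of the received mails saved as raw text with their headers intact (the function names and result strings are invented for illustration):

```python
import re
from collections import Counter

# First "Message-Id: <...>" line in a raw message, case-insensitive.
MESSAGE_ID_RE = re.compile(r"^Message-Id:\s*<([^>]+)>",
                           re.IGNORECASE | re.MULTILINE)


def message_id_counts(raw_messages):
    """Count how often each Message-Id occurs in the given raw mails."""
    counts = Counter()
    for raw in raw_messages:
        match = MESSAGE_ID_RE.search(raw)
        if match:
            counts[match.group(1)] += 1
    return counts


def diagnose(raw_messages):
    """Apply the rule of thumb above to a sample of received mails."""
    counts = message_id_counts(raw_messages)
    if not counts:
        return "no Message-Id found"
    if len(counts) == 1:
        return "one message, delivered repeatedly"
    return "several distinct messages"
```

The first result points at the mail server or the recipient's side, the second at the application generating new mails.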

By the way, you should edit the page you posted, since it contains your user's email address and it's not good form to post somebody's email address on the internet with a link that can find it (because they will get WAY more spam).

Sadly this message doesn't help (me at least) to know exactly where the problem is. What have you tried from the suggestions above?

Posted by Ben Koot on
We have rerouted to a hotmail account, so the stress is gone. It does seem to look like a problem on the client's side. Will keep you posted.

Thanks so far
Ben

Posted by Malte Sussdorff on
The same happened to me with a notification from lars blogger after updating to the latest CVS checkout. Will see if a restart helps and get back to you later.
Posted by Malte Sussdorff on
Okay, as the message-ids and the bounce addresses are different, this definitely looks like an OpenACS problem. Will dig into this later.

==============

Apr 22 09:03:55 ipx10216 postfix/cleanup[24250]: 137752703B9: message-id=<-1681110056.1145689435.oacs@www.sussdorff.de>
Apr 22 09:03:55 ipx10216 postfix/qmgr[1052]: 137752703B9: from=<bounce-446-BB768B9094CC4BD4B19AD87C09A109263981E3FF-@www.sussdorff.de>, size=1268, nrcpt=1 (queue active)
Apr 22 09:03:55 ipx10216 postfix/smtp[24251]: 137752703B9: to=<sussdorff@sussdorff.de>, relay=in1.smtp.messagingengine.com[66.111.4.73], delay=0, status=sent (250 Ok: queued as 8CAEF2D10)
Apr 22 09:04:55 ipx10216 postfix/pickup[24176]: 137612703B9: uid=501 from=<bounce-446-1B913602AF0DF977F44848FD9F9094B443ACFC58-@www.sussdorff.de>
Apr 22 09:04:55 ipx10216 postfix/cleanup[24250]: 137612703B9: message-id=<-1621110986.1145689495.oacs@www.sussdorff.de>
Apr 22 09:04:55 ipx10216 postfix/qmgr[1052]: 137612703B9: from=<bounce-446-1B913602AF0DF977F44848FD9F9094B443ACFC58-@www.sussdorff.de>, size=1268, nrcpt=1 (queue active)
Apr 22 09:04:56 ipx10216 postfix/smtp[24251]: 137612703B9: to=<sussdorff@sussdorff.de>, relay=in1.smtp.messagingengine.com[66.111.4.72], delay=1, status=sent (250 Ok: queued as 28462ADBC)

Posted by Malte Sussdorff on
Found the error. I had an old version of mail-tracking installed and apparently the upgrade script was missing. Dropping the data model and recreating it helped a lot (and yes, I did not need the old data).
Malte, unless this is unique to your installation(s) please add an upgrade script or post a bug and mark it "fix for .LRN 2.2".