I'm seeing some very strange behavior on one of my OpenACS installations (http://www.usbakery.com).
First of all, the site itself seems to die at random, roughly every 24 hours. I'm running OpenACS 5.1.2, AOLserver 4.08, and the newest nsopenssl.
The extra packages I'm using are:
file-storage
edit-this-page
notifications
oacs-dav
postcard
survey
What is strange is that I don't see anything unusual in the error.log. The site is very simple, with no real customization apart from templates and the like.
The site just stops responding. The AOLserver processes are still running, however.
I should have telnetted to the port to see what it returns. I'll try that next time.
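For next time, here is a rough sketch of the check I have in mind, using curl instead of telnet so it can be scripted. The function names and the timeout values are my own choices for illustration, not anything from the keepalive package. The exit codes are curl's documented ones (7 means it could not connect, 28 means the operation timed out).

```shell
#!/bin/sh
# Sketch only: function names and timeout values are assumptions.

# Map a curl exit status to a rough diagnosis.
classify_curl_exit() {
    case "$1" in
        0)  echo "responding" ;;
        7)  echo "connection failed" ;;
        28) echo "timed out" ;;
        *)  echo "other error ($1)" ;;
    esac
}

# Probe a URL with bounded waits and print the diagnosis.
probe_site() {
    curl --silent --output /dev/null --connect-timeout 5 --max-time 15 "$1"
    classify_curl_exit "$?"
}

# Example (not run automatically): probe_site http://69.93.192.95
```

A "timed out" result while the processes are still running would suggest the server is hung rather than down, though that is only a guess until I can actually catch it in the act.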
I've got the etc/keepalive script running, and it doesn't seem to work as advertised either: it takes anywhere from 3 to 20 minutes to actually restart the server. I wonder whether the wget command, which retries a number of times, is being too cautious. An older version of the keepalive script didn't seem to have this problem.
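To see whether wget's retry behavior could account for the delay, here is a back-of-the-envelope sketch. I have not confirmed which options keepalive.sh actually passes to wget, so the values and function names below are assumptions. The flags themselves (--tries, --timeout, --spider, --quiet) are standard wget options.

```shell
#!/bin/sh
# Sketch only: values and names are assumptions, not taken from keepalive.sh.

# Rough estimate, in seconds, of how long an outage can go unnoticed:
# one full cron interval plus every wget attempt running to its timeout.
# Ignores wget's pause between retries.
# Arguments: cron interval, number of tries, per-try timeout (all in seconds
# except the try count).
worst_case_seconds() {
    echo $(( $1 + $2 * $3 ))
}

# Health check with bounded retries; returns wget's exit status.
bounded_check() {
    wget --quiet --spider --tries=2 --timeout=30 "$1"
}

# Example (not run automatically): worst_case_seconds 240 2 30
```

With cron firing every 4 minutes, 2 tries and a 30-second timeout, the estimate comes to 300 seconds. If the script relies on wget's defaults instead, the wait could be far longer, which might explain the upper end of what I'm seeing.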
Here's how I have it set up:
usb@www:~/usb-site/log$ crontab -l
3 1-23 * * * /usr/lib/postgresql/bin/vacuumdb --analyze usb > /dev/null 2>&1
3 0 * * * /usr/lib/postgresql/bin/vacuumdb --full --analyze usb > /dev/null 2>&1
30 0 * * * /usr/bin/pg_dump -f /var/lib/aolserver/usb/database-backup/backup.dmp usb
*/4 * * * * /bin/sh /var/lib/aolserver/usb/etc/keepalive/keepalive-cron.sh
30 0 * * * /usr/share/analog-5.32/analog -G -g/var/lib/aolserver/usb/etc/analog.cfg
keepalive-cron.sh uses the keepalive-config script:
# Config file for the keepalive.sh script
#
# @author Peter Marklund
# The servers_to_monitor variable should be a flat list with URLs to monitor
# on even indices and the commands to execute if the server doesn't respond
# on odd indices, like this:
# {server_url1 restart_command1 server_url2 restart_command2 ...}
set servers_to_monitor {http://69.93.192.95 "/home/usb/bin/restart-server"}
# How long the keepalive script waits until it attempts another restart
set seconds_between_restarts [expr 10*60]
And the restart-server script is as follows:
tail -n 100 /home/usb/usb-site/log/error.log | /usr/bin/mail -s "*** USB Server Restart" myemailaddress@myservername.com
/usr/local/bin/svc -t /service/usb
Is anyone else using the keepalive script exactly as shipped with 5.1.2 and finding that it works reliably?
Is anyone else seeing their servers stop responding like this? It reminds me of the problems we were having before AOLserver 4.08 and nsopenssl HEAD...
Any suggestions?