Forum OpenACS Q&A: Re: Openacs process disappearing

Posted by Andrew Piskorski on
Jason, FYI, to get the best help here on the Forums, you really should give us a lot more info up front, e.g.: What operating system and version? What version of AOLserver? (Apparently 4.0.9 from your log.) What AOLserver modules do you have loaded? (In particular, do you have any nonstandard modules?) What version of Tcl? How did you build and install AOLserver and Tcl? What threads and stacksize settings are you using in your AOLserver config file (in the ns/parameters and ns/server/$servername sections)?

In general there are only three ways an AOLserver process can go away: it shuts itself down (Tcl code calls ns_shutdown or the like), it crashes, or something else (the kernel, the user, whatever) purposely kills it.

Your AOLserver log has no messages at all from AOLserver about shutting down. It was running normally, and then bang, a minute or so later it's starting up again. That means AOLserver either crashed or was kill -9'd (SIGKILL). If it had been stopped with a plain kill (SIGTERM) or ns_shutdown, you'd see evidence of AOLserver doing an orderly shutdown in its log, which you don't.
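If you want to tell a crash from a kill -9 the next time it happens, one option is to have the start script record how the server process ended. The sketch below is only an illustration: describe_exit and run_and_report are made-up helper names, the signal numbers are the usual Linux ones, and it assumes the script runs the server in the foreground as its own child. A shell reports a child killed by signal N as exit status 128+N, so 137 suggests SIGKILL and 139 suggests a segfault (though a program could also exit with such a status on its own).

```shell
#!/bin/sh
# Hypothetical helper: turn a shell exit status into a short description.
# Signal numbers below are the common Linux values (an assumption).
describe_exit() {
    status=$1
    if [ "$status" -gt 128 ]; then
        sig=$((status - 128))
        case $sig in
            9)  echo "killed by signal 9 (SIGKILL)" ;;
            11) echo "killed by signal 11 (SIGSEGV, a crash)" ;;
            15) echo "killed by signal 15 (SIGTERM)" ;;
            *)  echo "killed by signal $sig" ;;
        esac
    else
        echo "exited on its own with status $status"
    fi
}

# Hypothetical wrapper: run the given command in the foreground,
# then report how it ended.
run_and_report() {
    status=0
    "$@" || status=$?
    describe_exit "$status"
}
```

You would call it as run_and_report followed by whatever command your script already uses to start the server in the foreground, and send the output to a file you check after the next disappearance.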

Do you have some form of process resource limits configured in your operating system? Memory limits or the like? I bet that's it - some web crawler hits your (otherwise low traffic) site, AOLserver grabs a bunch more memory, and then your OS shoots the process dead with kill -9. (If true, that isn't AOLserver's fault; your OS is misconfigured.) If you're on Linux, check with "ulimit -a". Ideally, run ulimit -a from the script which actually starts AOLserver, since the limits in your interactive shell may not be the ones the server gets.
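Something along these lines near the top of the start script would capture the limits the server actually inherits. This is a rough sketch, not a drop-in: log_limits and the LIMITS_LOG path are names I made up, so point it at wherever your script can write.

```shell
#!/bin/sh
# Hypothetical log location; override LIMITS_LOG or edit the default.
LIMITS_LOG=${LIMITS_LOG:-${TMPDIR:-/tmp}/aolserver-limits.log}

# Append a timestamped dump of this shell's resource limits to the given file.
log_limits() {
    {
        echo "=== resource limits at $(date) ==="
        ulimit -a
    } >> "$1"
}

log_limits "$LIMITS_LOG"
# ... the existing command that starts AOLserver would follow here ...
```

Because the server is started from this same shell, the logged values are the ones it inherits, which is the point of running ulimit -a here rather than at your login prompt.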

If on the other hand AOLserver is in fact crashing, you want it to leave a core file so that you can figure out why it crashed. If you're on Linux, make sure you do something like "ulimit -c unlimited" in the script which starts AOLserver. That sets no limit on the max size of core files - the default is often 0, which means you won't get any core file at all. You may want to set that as a system-wide default in "/etc/profile".
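In the start script that could look roughly like the following. It is a sketch under assumptions: enable_core_dumps is a made-up function name, and raising the limit only works if the hard limit allows it, so the function reports what it ended up with rather than assuming success.

```shell
#!/bin/sh
# Hypothetical helper: try to lift the core file size limit for this shell
# (and therefore for anything it starts afterwards).
enable_core_dumps() {
    if ulimit -c unlimited 2>/dev/null; then
        echo "core file size limit: $(ulimit -c)"
    else
        # The hard limit is probably lower; show what we are stuck with.
        echo "could not raise core limit; current value: $(ulimit -c)" >&2
        return 1
    fi
}
```

Call enable_core_dumps before the line that starts AOLserver. If it prints the "could not raise" message, the hard limit is being set somewhere earlier and needs fixing there.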