Forum OpenACS Q&A: More on Keep Aolserver Alive problem

I am referring here to a previous thread that is still unresolved.
I have been searching Google for a few days, and lots of pages talk about "daemontools" and "inittab".
None of them were useful to me: they are full of jargon and further links, and every new link leads to an endless cascade of new pages, new jargon, new links... Plain examples like this and this might show you what I mean.
Please, does somebody know what I am supposed to check? And how: what should I type at the console (as root, as nsadmin, as a normal user?) in order to accomplish the task kindly suggested by Jon?
I admit that I have done only limited practical experiments: I am afraid of issuing unknown commands as root on a system that, if broken, would take me months to rebuild. (What most of you do in half an hour takes me three weeks.)
Posted by Jeff Davis on
Luigi, when you see "nsd (defunct)" it means the nsd program started but then exited (and since it is nsd, exiting almost certainly means it exited with an error). That it started at all means that daemontools is probably ok and the real problem is with starting aolserver.

First thing to check is that the user and group (the arguments to the -u and -g flags) you passed to nsd do exist (look in /etc/passwd and /etc/group).
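A minimal sketch of that check, assuming the names nsadmin and web from the run script quoted in this post (substitute whatever your own -u and -g flags say). The helper name entry_exists is made up for illustration; it only reads the two files and changes nothing, so it is safe to run as any user.

```shell
#!/bin/sh
# Assumed names, taken from the -u and -g flags in the example run script.
NSD_USER=nsadmin
NSD_GROUP=web

# entry_exists FILE NAME
# Succeeds if NAME is the first colon-separated field of some line in FILE.
# (Hypothetical helper; /etc/passwd and /etc/group both use this layout.)
entry_exists() {
    grep -q "^$2:" "$1"
}

if entry_exists /etc/passwd "$NSD_USER"; then
    echo "user $NSD_USER exists"
else
    echo "user $NSD_USER is MISSING from /etc/passwd"
fi

if entry_exists /etc/group "$NSD_GROUP"; then
    echo "group $NSD_GROUP exists"
else
    echo "group $NSD_GROUP is MISSING from /etc/group"
fi
```

If either line says MISSING, nsd cannot switch to that user or group, which would explain it exiting right after it starts.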

Next, what I would do to try to track it down (all this as root):


    cd /service
    svc -d YOURSERVER   # tell daemontools to stop trying to restart the server

Then change the /service/YOURSERVER/run script from:

    exec /usr/local/aolserver/bin/nsd-postgres -it /usr/local/aolserver/YOURSERVER.tcl -u nsadmin -g web

to:

    exec /usr/local/aolserver/bin/nsd-postgres -ft /usr/local/aolserver/YOURSERVER.tcl -u nsadmin -g web

(The -f tells nsd to run in the foreground and log to the terminal rather than to a log file.) Now, still as root, try starting the service directly:

    /service/YOURSERVER/run

If it starts successfully this way, then the problem is likely the permissions on the log file or the log directory. Look in your configuration file (YOURSERVER.tcl) to find out where it is logging (typically ${homedir}/log/YOURSERVER.log and ${homedir}/log/YOURSERVER-error.log, where ${homedir} is a variable set in the config file). Then make sure that the directory exists, that the user you are running the server as (the -u flag passed to nsd) can create those files there, and that there is not already a YOURSERVER.log or YOURSERVER-error.log owned by another user.
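One way to test the directory part is sketched below. It assumes you save it under some name of your own (checklogs.sh here is just a placeholder) and run it as the server user rather than as root, because root can write almost anywhere and would hide the problem. The function name can_write_logs is hypothetical, and the directory you pass in must be the one you actually found in your config file.

```shell
#!/bin/sh
# can_write_logs DIR
# Succeeds if DIR exists and the current user can create a file in it.
# It creates a small probe file and removes it again straight away.
can_write_logs() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "missing directory: $dir"
        return 1
    fi
    probe="$dir/.write-test.$$"
    if touch "$probe" 2>/dev/null; then
        rm -f "$probe"
        echo "ok, files can be created in: $dir"
        return 0
    fi
    echo "cannot create files in: $dir"
    return 1
}

# Only run the check when a directory was given on the command line.
if [ $# -gt 0 ]; then
    can_write_logs "$1"
fi
```

As root you could then run something like `su nsadmin -c 'sh checklogs.sh /path/to/your/log'`, with the path replaced by your real log directory. A plain `ls -l` on that directory shows who owns any existing YOURSERVER.log or YOURSERVER-error.log.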

If it does not start directly that means the server is misconfigured and hopefully you can figure out from the messages you see printed to the screen what is wrong.

If you fix it then change the -f flag back to -i and then restart the server by:

cd /service
svc -u YOURSERVER
svc -t YOURSERVER
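
If you would rather not edit the run script by hand, the -it / -ft change could be scripted roughly like this. It is only a sketch: set_nsd_mode is a made-up helper name, and it assumes the run script contains the flag written exactly as " -it " or " -ft " with spaces around it, as in the example above. Make a copy of the run script before trying it.

```shell
#!/bin/sh
# set_nsd_mode RUNFILE MODE
# Rewrites the nsd flag in RUNFILE to -ft (MODE=f, foreground for debugging)
# or -it (MODE=i, the normal setting under daemontools).
# Hypothetical helper; it writes back into the same file with cat so that
# the file keeps its existing owner and execute permission.
set_nsd_mode() {
    runfile=$1
    mode=$2
    case $mode in
        f|i) ;;
        *) echo "mode must be f or i" >&2; return 1 ;;
    esac
    tmp="$runfile.tmp.$$"
    sed "s/ -[fi]t / -${mode}t /" "$runfile" > "$tmp" || return 1
    cat "$tmp" > "$runfile"
    rm -f "$tmp"
}

# Example use (as root), left commented out so nothing changes by accident:
#   set_nsd_mode /service/YOURSERVER/run f    # before debugging
#   set_nsd_mode /service/YOURSERVER/run i    # after the fix
```

After switching back to -it, the svc -u and svc -t commands above bring the server up under daemontools again.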