Forum OpenACS Q&A: Aolserver 3.5 and daemontools

Posted by Tilmann Singer on
I have a slightly off-topic problem: daemontools refuses to start an OpenACS installation after I switched from AOLserver 3.3ad13 to AOLserver 3.5.6. It starts a few AOLserver threads, which write nothing to the server log and cannot be stopped afterwards with svc -d.

This was the run file before the change:

#!/bin/sh
source /etc/shell-mods.sh
exec /usr/local/aolserver/bin/nsd -u aolserver -g web -it /web/mig_target/etc/mig_target.tcl

And this is the run file after the change:

#!/bin/sh
source /etc/shell-mods.sh
exec /usr/local/aolserver3.5/bin/nsd -u aolserver -g web -it /web/mig_target/etc/mig_target.tcl

Both run files work when executed directly from the shell. Only when the server is started with svc -u using the second run file does it hang forever. Even a full server restart doesn't change that behaviour.

Has anyone experienced this before?

Posted by Jon Griffin on
If you kill ALL the running instances, does it work then?
I have had that problem when AOLserver doesn't kill all of its processes.

Posted by Tilmann Singer on
You mean all instances of AOLserver? They have to be killed with kill -9, because otherwise they won't die, and issuing 'svc -u' after that has the same effect as described before.

I even played with killing the supervise instance, but without success.

I just set up aolserver 3.5 on another server with daemontools and it works fine. I have absolutely no clue what the difference on this server could be.

Posted by Tom Jackson on

Can you run 'ps axww|grep readproctitle' to see what the output of daemontools is? You probably have a permissions issue that doesn't show up. To get AOLserver to stop, you might need to run 'svc -k dir; svc -d dir;'.
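
A minimal sketch of that stop sequence, offered as an illustration rather than a confirmed fix. The function name force_down and the SVC/SVSTAT override variables are hypothetical; the commands themselves are the standard daemontools svc and svstat, assumed to be on the PATH. Note that the order here is reversed from the one quoted above: the service is marked down first, so that supervise does not restart it, and only then is KILL sent.

```shell
#!/bin/sh
# Hypothetical helper; the names and the override variables are illustrative.
# SVC and SVSTAT default to the real daemontools commands.
SVC=${SVC:-svc}
SVSTAT=${SVSTAT:-svstat}

force_down() {
    dir=$1
    # -d: bring the service down (TERM, then CONT) and do not restart it
    # -k: send KILL to whatever is still running
    "$SVC" -dk "$dir" || return 1
    # print what supervise now reports for the service
    "$SVSTAT" "$dir"
}
```

It would be called as 'force_down dir', where dir is the service directory that svscan is watching. If svstat still reports the service as up afterwards, the remaining processes are presumably not the one supervise is tracking.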

Posted by Tilmann Singer on
Grepping for readproctitle only returns this:

root      2097  0.0  0.0  1292  284 ?        S    10:11  0:00 readproctitle service errors: ...ile does not exist?supervise: fatal: unable to start log/run: file does not exist?supervise: fatal: unable to start log/run: file does not exist?supervise: fatal: unable to start log/run: file does not exist?supervise: fatal: unable to start log/run: file does not exist?supervise: fatal: unable to start log/run: file does not exist?supervise: fatal: unable to start log/run: file does not exist?

But that output also remains after switching back to AOLserver 3.3ad13 and killing all the hanging 3.5 processes, so I guess it is a different problem.

Issuing 'svc -k' before 'svc -d' has no effect; the processes just hang and need to be killed with kill -9.

Posted by Tom Jackson on

supervise is trying to start log/run, probably /service/yourbuggyservice/log/run. Do you have a log directory in the same directory as your main run file? daemontools thinks you want to direct the output of this process to a log process (I'm guessing). You might try a different directory for your main run file.

The readproctitle output will not change if you shut down the offending process; it just keeps the old messages so you can look at them.
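
A small sketch of a check for that situation, assuming the behaviour described in the daemontools documentation for svscan: when a service directory contains a log subdirectory, a second supervise is started for it and expects an executable log/run. The function name check_log_run and its messages are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: reports whether a service directory contains a log/
# subdirectory without an executable log/run file, which is the condition
# behind "supervise: fatal: unable to start log/run: file does not exist".
check_log_run() {
    svcdir=$1
    if [ ! -d "$svcdir/log" ]; then
        # no log subdirectory, so no log service is expected
        echo "ok: no log directory"
    elif [ -f "$svcdir/log/run" ] && [ -x "$svcdir/log/run" ]; then
        echo "ok: log/run is present and executable"
    else
        echo "problem: log directory without an executable log/run"
        return 1
    fi
}
```

Running 'check_log_run /service/yourbuggyservice' would print one of the three messages. If the log directory belongs to the application rather than to daemontools, keeping the run file in a separate directory, as suggested above, avoids the clash.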

Posted by Tilmann Singer on
Thanks for the explanation. There was indeed a /web/myalmostbugfreeservice/log directory, in which daemontools had created a supervise directory. It seems that a directory named 'log' has a special meaning for daemontools; I didn't know that.

The problem with starting up AOLserver 3.5.6 persists, though.