Forum OpenACS Q&A: Re: AOLserver 4.0 Install instructions

Posted by Gustaf Neumann on
Where I come from, the word "academic" has a positive connotation. Now at least I understand what you are after. You are right: when peeking around in the code, it is surprising that registering a callback that seems to be called nowhere helped ... but it did. NsRunAtReadyProcs() is declared extern, so a module might call it, but I don't see that either. My suspicion is that, since the problem looked like a race condition to me, registering the callback has the side effect of serializing some threads. The mutex in the callback registration might have this effect.
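To make the suspicion about the registration mutex concrete, here is a minimal sketch in C with pthreads. It is not the AOLserver code: the names register_at_ready, run_at_ready_procs, cb_lock and bump are hypothetical, chosen only for illustration. It shows how a registration routine that takes a global lock forces concurrent callers to take turns, even if the registered callbacks are never run.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical callback list; all names are illustrative, not AOLserver's. */
typedef void (ReadyProc)(void *arg);

typedef struct Callback {
    struct Callback *next;
    ReadyProc       *proc;
    void            *arg;
} Callback;

static pthread_mutex_t cb_lock = PTHREAD_MUTEX_INITIALIZER;
static Callback *cb_first = NULL;

/* Register a callback. Every caller passes through cb_lock, so threads
 * that register at the same time are serialized at this point. */
int register_at_ready(ReadyProc *proc, void *arg)
{
    Callback *cb;

    assert(proc != NULL);
    cb = malloc(sizeof(Callback));
    if (cb == NULL) {
        return -1;
    }
    cb->proc = proc;
    cb->arg = arg;

    pthread_mutex_lock(&cb_lock);
    cb->next = cb_first;
    cb_first = cb;
    pthread_mutex_unlock(&cb_lock);
    return 0;
}

/* Detach the list under the lock, then run and free each callback
 * outside the lock. Returns the number of callbacks that were run. */
int run_at_ready_procs(void)
{
    Callback *cb, *next;
    int n = 0;

    pthread_mutex_lock(&cb_lock);
    cb = cb_first;
    cb_first = NULL;
    pthread_mutex_unlock(&cb_lock);

    for (; cb != NULL; cb = next) {
        next = cb->next;
        cb->proc(cb->arg);
        free(cb);
        n++;
    }
    return n;
}

/* Sample callback for experiments: increments the int that arg points to. */
void bump(void *arg)
{
    ++*(int *)arg;
}
```

In this sketch the lock/unlock pair in register_at_ready is the only synchronization point. If nothing ever calls run_at_ready_procs, the registration still has that one observable effect, which is the kind of side effect suspected above.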

If you have time to investigate further, I would suggest taking out the patch and trying to reproduce the bug in a clean environment. It seems related to threads with a larger footprint (nobody saw it except in OpenACS applications).

Posted by Tom Jackson on
I agree with your use of the word "academic"; it appears to apply here. I don't have a crashing server or anything; I was just somewhat surprised that 4.0 and 4.5 could share a patch like this and have it actually work.

The fact that this helped out somehow (in a big way) is important to know.

I also noticed that NsRunAtReadyProcs() is extern, but it is only declared in nsd.h, not include/ns.h. This means, as I just learned, that the caller can't be in a module; it has to be compiled into libnsd. Btw, the only place I found any *Procs() called was in nsmain.c.

The registered callback is TriggerDriver, and the use of TriggerDriver has changed. The similar SockTrigger used to be conditional during Sock Close, but now TriggerDriver gets called any time there is a mutex unlock on the driver pointer structure (except in the ns_driver query, where it is called inside the mutex). Perhaps most important, TriggerDriver also gets called if you query the driver from Tcl! So some new Tcl diagnostic code, maybe a scheduled proc, could unstick the driver thread.
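For readers who have not looked at this kind of code, here is a rough sketch of how a trigger can unstick a thread that is blocked waiting for I/O. It assumes a trigger-pipe design, where the driver thread polls the read end of a pipe and any other thread wakes it by writing one byte. The names trigger_init, trigger_driver and driver_wait are hypothetical; this is not the AOLserver implementation.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical trigger pipe: [0] is polled by the driver thread,
 * [1] is written by whoever wants to wake it. */
static int trigger_fds[2] = { -1, -1 };

/* Create the pipe. Returns 0 on success, -1 on error. */
int trigger_init(void)
{
    return pipe(trigger_fds);
}

/* Wake the driver by writing a single byte to the pipe. */
int trigger_driver(void)
{
    char c = 0;

    assert(trigger_fds[1] >= 0);
    return write(trigger_fds[1], &c, 1) == 1 ? 0 : -1;
}

/* Driver side: wait up to timeout_ms for a trigger.
 * Returns 1 if woken by a trigger, 0 on timeout, -1 on error. */
int driver_wait(int timeout_ms)
{
    struct pollfd pfd;
    char c;
    int n;

    pfd.fd = trigger_fds[0];
    pfd.events = POLLIN;
    pfd.revents = 0;

    n = poll(&pfd, 1, timeout_ms);
    if (n <= 0) {
        return n;
    }
    /* Consume the trigger byte so the next wait blocks again. */
    if ((pfd.revents & POLLIN) && read(trigger_fds[0], &c, 1) != 1) {
        return -1;
    }
    return 1;
}
```

A real driver would poll its listening and connection sockets alongside the trigger descriptor. The point of the sketch is only that any code path that ends up calling the trigger, a Tcl-level query included, makes the blocked poll() return and gives the driver loop another pass over its state.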