Forum OpenACS Q&A: Re: carnage blender is live on aolserver 4 beta5

Posted by Andrew Piskorski on
Without looking more closely at the code, I suspect the correct fix may be to:
  1. Add something at the beginning of Ns_MutexDestroy that first locks the mutex if it is not already locked, in order to protect against a mistake in the user's Tcl code. No, I don't think that is guaranteed to be a complete solution, as Ns_MutexLock does not (and should not) call Ns_MasterLock, but it should greatly reduce the window during which a failure could occur. In practice I bet it would eliminate all instances of the server crashes Jonathan saw.

  2. Possibly add something like Jonathan's patch as well.
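
To make item 1 concrete, here is a rough sketch using plain pthreads rather than the real AOLserver internals, which I have not checked. The names demo_mutex, demo_mutex_create, demo_mutex_destroy and demo_run are all made up for illustration; none of them is an Ns_* function. The destroy routine waits for any current holder by taking the lock itself, then releases it and destroys the mutex. POSIX leaves destroying a locked mutex undefined, hence the unlock first, and that gap is exactly why this only narrows the window instead of closing it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for an Ns_Mutex; not AOLserver's real type. */
typedef struct {
    pthread_mutex_t lock;
} demo_mutex;

demo_mutex *demo_mutex_create(void) {
    demo_mutex *m = malloc(sizeof *m);
    if (m != NULL && pthread_mutex_init(&m->lock, NULL) != 0) {
        free(m);
        m = NULL;
    }
    return m;
}

/* Wait for any current holder, then destroy. Returns 0 on success. */
int demo_mutex_destroy(demo_mutex *m) {
    int rc;
    if (m == NULL) {
        return 0;
    }
    rc = pthread_mutex_lock(&m->lock);   /* blocks while another thread holds it */
    if (rc != 0) {
        return rc;
    }
    /* Must unlock before destroying. Another thread could still grab the
     * lock in this gap, so the race is narrowed, not eliminated. */
    pthread_mutex_unlock(&m->lock);
    rc = pthread_mutex_destroy(&m->lock);
    if (rc == 0) {
        free(m);
    }
    return rc;
}

/* Demo harness: a worker holds the mutex while the main thread destroys it. */
typedef struct {
    demo_mutex *m;
    pthread_mutex_t gate;
    pthread_cond_t held_cv;
    int held;   /* set once the worker owns m->lock */
    int done;   /* set by the worker just before it releases m->lock */
} demo_ctx;

static void *demo_worker(void *arg) {
    demo_ctx *c = arg;
    pthread_mutex_lock(&c->m->lock);
    pthread_mutex_lock(&c->gate);
    c->held = 1;
    pthread_cond_signal(&c->held_cv);
    pthread_mutex_unlock(&c->gate);
    c->done = 1;                         /* still inside the critical section */
    pthread_mutex_unlock(&c->m->lock);
    return NULL;
}

/* Returns 0 if destroy succeeded and only ran after the worker finished. */
int demo_run(void) {
    demo_ctx c;
    pthread_t t;
    int rc, saw_done;

    c.m = demo_mutex_create();
    if (c.m == NULL) {
        return -1;
    }
    pthread_mutex_init(&c.gate, NULL);
    pthread_cond_init(&c.held_cv, NULL);
    c.held = 0;
    c.done = 0;
    if (pthread_create(&t, NULL, demo_worker, &c) != 0) {
        return -1;
    }
    /* Do not attempt the destroy until the worker really holds the lock. */
    pthread_mutex_lock(&c.gate);
    while (!c.held) {
        pthread_cond_wait(&c.held_cv, &c.gate);
    }
    pthread_mutex_unlock(&c.gate);

    rc = demo_mutex_destroy(c.m);        /* waits for the worker's unlock */
    saw_done = c.done;
    pthread_join(t, NULL);
    pthread_cond_destroy(&c.held_cv);
    pthread_mutex_destroy(&c.gate);
    return (rc == 0 && saw_done) ? 0 : -1;
}
```

Calling demo_run() exercises the case where the destroy arrives while another thread is still inside its critical section. A thread that tries to lock the mutex after the destroy has finished would still be in trouble, which is the part this approach cannot fix.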

Um, actually, my previous post was probably somewhat in error. While I think it's true that there is a race condition in Ns_MutexDestroy (see above), I don't think calling Ns_MasterLock before NsLockFree would do anything to fix it. Ah, yep, Jonathan, you already figured that out.