Forum OpenACS Q&A: Killing an AOLserver thread?
Is there a way to kill an AOLserver thread without
shutting down the AOLserver process?
I searched the AOLserver docs without finding any command to do so.
Using /admin/monitoring I can see the running threads, and it would be good to be able to kill a looping or resource-consuming thread without a complete shutdown.
Pthreads (the underlying threading library that AOLserver uses) does
not have a thread kill command. It does have a call, pthread_cancel,
that asks a thread to quit.
I have been planning to look into this as I have some long-running page
requests. They are not consuming many resources, but they seem to run
forever. I'm hoping that a cancel command would do the job for them.
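To make the "asks a thread to quit" point concrete, here is a minimal C sketch of pthread_cancel at the raw pthreads level. It is not AOLserver code, and the names looping_worker and cancel_demo are made up for illustration. It assumes the default deferred cancellation mode, where the request only takes effect when the target thread reaches a cancellation point such as sleep().

```c
#include <assert.h>
#include <pthread.h>
#include <unistd.h>

/* Hypothetical worker that never returns on its own.
 * sleep() is a POSIX cancellation point, so a pending
 * cancel request is acted on there. */
static void *looping_worker(void *arg)
{
    (void)arg;
    for (;;) {
        sleep(1);
    }
    return NULL; /* not reached */
}

/* Returns 1 if the worker ended because of the cancel request. */
int cancel_demo(void)
{
    pthread_t tid;
    void *result = NULL;

    int rc = pthread_create(&tid, NULL, looping_worker, NULL);
    assert(rc == 0);

    /* This only queues a request; the thread itself decides
     * when (and whether) to act on it. */
    rc = pthread_cancel(tid);
    assert(rc == 0);

    /* Wait for the thread to actually terminate. */
    rc = pthread_join(tid, &result);
    assert(rc == 0);

    return result == PTHREAD_CANCELED;
}
```

Calling cancel_demo() from main should return 1 here. A worker that never reaches a cancellation point (a pure CPU loop, say) would simply ignore the request, which is why this is a request and not a kill.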
Even worse, if that thread has locked some mutex, the mutex will remain locked. Since only the thread which locked the mutex can unlock it, this means that the resource the mutex was guarding remains locked for the lifetime of the process. In other words, forever.
How bad that is depends on the nature of the resource, but it's definitely bad and might turn out to be a disaster.
So don't cancel threads. Pthreads does provide the pthread_cleanup_push and pthread_cleanup_pop mechanism, which is intended for cancellation cleanup work. But that's mainly for library authors who want to write decent software. Applications rarely bother with it (as there's not much point). As far as I know, AOLserver doesn't have that mechanism in place. And I don't think it should.
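For reference, this is roughly what the cleanup mechanism looks like when a thread is cancelled while holding a mutex. It is a sketch in plain C under assumed names (resource_lock, unlock_resource, locked_worker, cleanup_demo are all hypothetical), not anything AOLserver does.

```c
#include <assert.h>
#include <pthread.h>
#include <unistd.h>

/* Hypothetical resource guard, for illustration only. */
static pthread_mutex_t resource_lock = PTHREAD_MUTEX_INITIALIZER;

/* Cleanup handler: runs in the cancelled thread, so the
 * thread that locked the mutex is the one unlocking it. */
static void unlock_resource(void *arg)
{
    pthread_mutex_unlock((pthread_mutex_t *)arg);
}

static void *locked_worker(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&resource_lock);
    pthread_cleanup_push(unlock_resource, &resource_lock);
    for (;;) {
        sleep(1); /* cancellation point */
    }
    /* Never reached, but push and pop must pair up lexically. */
    pthread_cleanup_pop(1);
    return NULL;
}

/* Returns 0 if the mutex can be taken again after the cancel. */
int cleanup_demo(void)
{
    pthread_t tid;

    int rc = pthread_create(&tid, NULL, locked_worker, NULL);
    assert(rc == 0);
    rc = pthread_cancel(tid);
    assert(rc == 0);
    rc = pthread_join(tid, NULL);
    assert(rc == 0);

    /* Without the cleanup handler this would report EBUSY forever. */
    rc = pthread_mutex_trylock(&resource_lock);
    if (rc == 0) {
        pthread_mutex_unlock(&resource_lock);
    }
    return rc;
}
```

Every lock, allocation and handle taken anywhere in the call stack would need a handler like this for cancellation to be safe, which is why applications rarely bother.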
Again, don't just blindly cancel threads. What you want to do is cancel a certain Tcl interpreter. That's a completely different thing. The fact that it runs in a separate thread is just an implementation detail.
Now, I'm not aware of a mechanism which you could use to cancel the interpreter. The easiest and safest thing to do would be to add a piece of code to your long-running job that periodically checks some shared variable. If that variable is set, just exit.
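The shared-variable check can be sketched like this. It is written in C with a C11 atomic flag purely to show the pattern; in an AOLserver Tcl script the flag would presumably live in a shared variable instead. All names (stop_requested, job_state, long_job, stop_flag_demo) are invented for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical shared flag; static storage starts out false. */
static atomic_bool stop_requested;

struct job_state {
    long units_done;
    bool cleaned_up;
};

/* Long-running job that polls the flag between units of work. */
static void *long_job(void *arg)
{
    struct job_state *job = arg;

    while (!atomic_load(&stop_requested)) {
        job->units_done++; /* stand-in for one unit of real work */
    }

    /* Normal exit path: locks, handles, etc. are released here. */
    job->cleaned_up = true;
    return NULL;
}

/* Returns 1 if the job noticed the flag and exited cleanly. */
int stop_flag_demo(void)
{
    pthread_t tid;
    struct job_state job = { 0, false };

    int rc = pthread_create(&tid, NULL, long_job, &job);
    assert(rc == 0);

    /* Simulates an admin asking the job to stop. */
    atomic_store(&stop_requested, true);

    rc = pthread_join(tid, NULL);
    assert(rc == 0);

    return job.cleaned_up ? 1 : 0;
}
```

The job leaves through its ordinary exit path, so nothing stays locked. The catch is that it only works if the job reaches the check, so a request stuck inside a single blocking call will not see the flag.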
If you want to cancel a thread which takes too many resources because there's just a plain bug somewhere, I think you'll have to fix that bug. Multithreaded programs don't leave you many options there.
Conn #: 84127 Client IP: 10.1.12.132 State: running Method: ? URL: ? n
seconds: 1425576 bytes: 0
I have something somewhere that sometimes runs forever. Very likely a bug
in my code. Fixing it is a great idea. However, it takes anywhere from a day
to several days to happen. I would much prefer to get a "Script timed
out at ns_db gethandle within etc." or some such error.
Would this require rewriting Tcl to process an event loop while commands are
running? (Whether or not that is even a good idea is a separate question.)
The Mozilla project has a note on why killing threads is not a good idea: http://mozilla.org/projects/nspr/tech-notes/abruptexit.html
However, due to the stateless nature of HTTP and the lightning speed of AOLserver, we can live without it and simply shut down on demand.