Forum OpenACS Q&A: Killing an Aolserver thread?

Posted by Claudio Pasolini on
Hi all,

is there a way to kill an AOLserver thread without shutting down the AOLserver process?
I searched the AOLserver docs without finding any command to do so.
Using /admin/monitoring I can see the running threads, and it would be nice to be able to kill a looping or resource-consuming thread without a complete shutdown.

Posted by David Walker on
As far as I know, no.  From our discussion on the AOLserver list it appears
that pthreads (the underlying threading library that AOLserver uses) does
not have a thread kill command.  It does have a command, pthread_cancel I
think, that asks a thread to quit.

I have been planning to look into this as I have some long-running page
requests.  They are not consuming many resources but they seem to run
forever.  I'm hoping that a cancel command would do the job for them.

Posted by Drazen Kacar on
Pthreads has a pthread_cancel function, which can be used to cancel a thread. However, when you cancel a thread, all resources acquired by that thread remain acquired. Nothing will release its memory, so you effectively end up with a memory leak.

Even worse, if that thread has locked some mutex, the mutex will remain locked. Since only the thread which locked the mutex can unlock it, this means that the resource the mutex was guarding remains locked for the lifetime of the process. In other words, forever.

How bad that is depends on the nature of the resource, but it's definitely bad and might turn out to be a disaster.

So don't cancel threads. Pthreads does have a pthread_cleanup_push and pthread_cleanup_pop mechanism, which is intended for cancellation cleanup work. But that's mainly for library authors who want to write decent software. Applications rarely bother with it (as there's not much point). As far as I know, AOLserver doesn't have that mechanism in place. And I don't think it should.
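
For anyone curious what that cleanup mechanism looks like, here is a minimal sketch in plain C, not AOLserver code. The names `unlock_on_cancel`, `worker` and `cancel_with_cleanup` are made up for illustration. It assumes the default deferred cancellation, where the cancel request only takes effect at a cancellation point such as sleep().

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <unistd.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

/* Cleanup handler: called if the thread is cancelled (or at the pop below). */
static void unlock_on_cancel(void *arg)
{
    pthread_mutex_unlock((pthread_mutex_t *)arg);
}

/* Hypothetical worker: takes the mutex, then blocks for a long time. */
static void *worker(void *unused)
{
    (void)unused;
    pthread_mutex_lock(&lock);
    pthread_cleanup_push(unlock_on_cancel, &lock);
    sleep(60);               /* stand-in for a long job; a cancellation point */
    pthread_cleanup_pop(1);  /* 1 = also run the handler on normal exit */
    return NULL;
}

/* Cancels the worker, then checks whether the mutex was released.
 * Returns 1 if it was, 0 if it stayed locked, -1 on any other failure. */
int cancel_with_cleanup(void)
{
    pthread_t tid;
    void *result;

    if (pthread_create(&tid, NULL, worker, NULL) != 0)
        return -1;
    pthread_cancel(tid);     /* request is acted on when worker reaches sleep() */
    if (pthread_join(tid, &result) != 0 || result != PTHREAD_CANCELED)
        return -1;

    if (pthread_mutex_trylock(&lock) != 0)
        return 0;            /* still locked: nobody cleaned up */
    pthread_mutex_unlock(&lock);
    return 1;
}
```

Without the push/pop pair the mutex would stay locked after the cancel, which is the problem described above. Every resource taken anywhere in the thread would need a handler like this, which is why applications rarely bother.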

Again, don't just blindly cancel threads. What you want to do is cancel a certain Tcl interpreter. That's a completely different thing. The fact that it runs in a separate thread is just an implementation detail.

Now, I'm not aware of a mechanism which you could use to cancel the interpreter. The easiest and safest thing to do would be to add a piece of code to your long-running job that periodically checks some shared variable. If that variable is set, just exit.
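
The shared-variable idea can be sketched like this, again in plain C rather than Tcl, and only as an illustration. The flag `stop_requested`, the helper `pause_ms` and the functions `long_job` and `run_and_stop` are hypothetical names. In an AOLserver page the same pattern would presumably use a shared variable visible to all interpreters instead of a C atomic.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdatomic.h>
#include <time.h>

/* Hypothetical shared flag: set by whoever wants the job to stop. */
static atomic_int stop_requested;

/* Sleep for the given number of milliseconds. */
static void pause_ms(long ms)
{
    struct timespec ts = { ms / 1000, (ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);
}

/* Long-running job that checks the flag between small units of work. */
static void *long_job(void *unused)
{
    (void)unused;
    while (!atomic_load(&stop_requested)) {
        pause_ms(1);         /* stand-in for one unit of real work */
    }
    /* The job exits on its own terms, so it can release whatever it holds. */
    return NULL;
}

/* Starts the job, asks it to stop, and waits for it.
 * Returns 0 if the job exited cleanly, non-zero otherwise. */
int run_and_stop(void)
{
    pthread_t tid;

    atomic_store(&stop_requested, 0);
    if (pthread_create(&tid, NULL, long_job, NULL) != 0)
        return -1;
    pause_ms(50);            /* let the job run for a moment */
    atomic_store(&stop_requested, 1);
    return pthread_join(tid, NULL);
}
```

The job only stops at the points where it looks at the flag, so a single step that blocks forever still cannot be interrupted this way.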

If you want to cancel a thread which takes too many resources because there's just a plain bug somewhere, I think you'll have to fix that bug. Multithreaded programs don't leave you many options there.

Posted by David Walker on
Here is the information I get.
Conn #: 84127 Client IP: State: running Method: ? URL: ? n
seconds: 1425576 bytes: 0

I have something somewhere that runs forever sometimes.  Very likely a bug
in my code.  Fixing it is a great idea.  However it takes anywhere from a day
to several days to happen.  Would be much preferred to get a "Script timed
out at ns_db gethandle within etc." or some such error.

Would this require rewriting tcl to process an event loop while commands are
running?  (whether or not that is even a good idea is a separate question)

Posted by Tom Jackson on

The mozilla project has a comment on why thread killing is not a good idea. The link is:

Posted by Claudio Pasolini on
Thank you all for answering: now I understand better the problems behind threads, but I still think it would be nice to have a tool like those used to monitor and manage tasks in the CICS systems of the past.

However, due to the stateless nature of HTTP and the lightning speed of AOLserver, we can live without it and simply shut down on demand.

Posted by Kevin Lawver on
If you're inside that thread, you can use ns_thread yield, which closes it out almost like return.  I played around with using ns_thread yield plus a thread id, but never got it to work.

Posted by Drazen Kacar on
No, thread yield means that the thread will voluntarily give up the CPU. So the OS will remove it from the processor and schedule some other thread. But the thread which yielded remains runnable and will eventually be scheduled on some available CPU.
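
A small sketch at the pthreads level shows the same thing. It uses the POSIX call sched_yield(), on the assumption that ns_thread yield ends up doing something equivalent. The names `yielding_worker` and `count_steps_after_yield` are made up for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <sched.h>

/* Worker that yields before every step and counts the steps it completes. */
static void *yielding_worker(void *arg)
{
    int *steps = arg;
    for (int i = 0; i < 5; i++) {
        sched_yield();       /* give up the CPU; the thread stays runnable */
        (*steps)++;          /* execution resumes here after each yield */
    }
    return NULL;
}

/* Returns how many steps the worker completed, or -1 on failure. */
int count_steps_after_yield(void)
{
    pthread_t tid;
    int steps = 0;

    if (pthread_create(&tid, NULL, yielding_worker, &steps) != 0)
        return -1;
    if (pthread_join(tid, NULL) != 0)
        return -1;
    return steps;
}
```

The worker finishes all five steps, so yielding does not end the thread. It only lets other threads run first.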