I have been using an AOLserver 3.3 / Postgres 7.0.1 / pgdriver 2.0.1
combination for a while now and keep encountering a problem that I
can't diagnose.
Generally, the system runs well, and I can hit the web server /
database "hard" and there are no problems. After a period of idleness
though (possibly after some of the OpenACS scheduled functions have run
in the meantime), I find that any attempt to load a web page simply
hangs. (Cancelling the page load shows that a partial load has
actually taken place, and the only missing element seems to be style
information.) Further investigation shows that both AOLserver and
Postgres are still running, but there seem to be an excessive number
of threads for each. One (sometimes two) of the Postgres threads (not
the parent process) is always found to have used a ridiculously large
amount of CPU time.
Stopping AOLserver and running ipcclean has no effect. Killing any of
the Postgres threads leaves all the others running, and they can only
be cleared by killing them individually, deleting the pid file and
restarting Postgres. The quickest fix is to kill one of the Postgres
threads (not the parent) with AOLserver still running, which has the
result of collapsing Postgres back to its parent thread and killing
AOLserver. Restarting AOLserver then brings everything back to working
order again.
Any ideas on what might be causing this, or how I might go about
diagnosing the problem further, would be appreciated.
Thanks.