Forum OpenACS Q&A: Re: OpenACS Keepalive how to do it for windows based installation

What I can see from the log snippets are two things, most likely unrelated:

- The first error is triggered by your scheduled procedure 25. You can find out what this is from nsstats.tcl or, if you do not have nsstats installed, from ds/shell (or nscp) by running "lindex [ns_info scheduled] 25" there. Normally, "ad_return_complaint 1" should be followed by an error message. Identify sched proc 25 first; that makes it easier to find the right spot before digging deeper. Although this error is not intended behavior, it is most likely unrelated to the second error, which happened nearly 5 hours later.

- The second error happened in "im_company_internal_helper" in the package intranet-core, which is most likely a ]PO[ package doing some timesheet calculation. Ask the PO people why this fails. The snippet does not show when exactly this error happened. We see from the snippet that the server is stopping at 10:10:34. This is a strange time for a scheduled restart; most likely someone stopped the server manually. The server was restarted 2 min 16 sec later, also most likely manually. The problem, it seems to me, is that a second server instance was started before the first server instance had exited.

When the server stops, it shuts down the network connections (as shown in the log) and shuts down all other services such as scheduled procedures (it tries to finish these). Then you should see a message like "Notice: sched: shutdown complete", ... "Notice: driver: stopped: nssock", and finally "Notice: nsmain: AOLserver/4.5.1 exiting". The exact wording of the messages depends on the configuration and the logging detail.

As we can see from the log, at least the exit message is missing. It looks to me as if you have more or less two servers running: the first one hanging in shutdown state (it has already stopped accepting requests), and the second one waiting for the resources of the first and therefore unable to receive requests.
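If you want to check a longer log for the same pattern, a small script can do it. The following is only a rough sketch in Python, not something that ships with AOLserver or OpenACS: it reports every "stopping" message that is not followed by an "exiting" message before the next start (or before the end of the log). The word "exiting" is taken from the message quoted above; the words "starting" and "stopping", the function name find_unfinished_shutdowns, and the sample lines are assumptions for illustration, so adjust the pattern to what your log really contains.

```python
import re

# Matches lines such as "... Notice: nsmain: AOLserver/4.5.1 exiting".
# "exiting" is quoted in the post; "starting" and "stopping" are assumed wording.
NSMAIN_EVENT = re.compile(r"nsmain: \S+ (starting|stopping|exiting)\b")


def find_unfinished_shutdowns(lines):
    """Return the line numbers of 'stopping' messages that have no
    'exiting' message before the next 'starting' or the end of the log."""
    unfinished = []
    pending_stop = None  # line number of a stop not yet confirmed by an exit
    for number, line in enumerate(lines, start=1):
        match = NSMAIN_EVENT.search(line)
        if match is None:
            continue
        event = match.group(1)
        if event == "stopping":
            pending_stop = number
        elif event == "exiting":
            pending_stop = None
        elif event == "starting" and pending_stop is not None:
            # A new instance came up while the old one never reported its exit.
            unfinished.append(pending_stop)
            pending_stop = None
    if pending_stop is not None:
        # The log ends in the middle of a shutdown.
        unfinished.append(pending_stop)
    return unfinished


# Made-up sample lines (not copied from the real log), for illustration only.
sample = [
    "[10:10:34] Notice: nsmain: AOLserver/4.5.1 stopping",
    "[10:10:34] Notice: driver: stopped: nssock",
    "[10:12:50] Notice: nsmain: AOLserver/4.5.1 starting",
]
print(find_unfinished_shutdowns(sample))
```

To run it against a real log, pass the open log file instead of the sample list. If two instances write to the same log file, their messages may interleave, so treat a hit as a hint to look closer, not as proof.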

Since you are running under Windows, you are most likely using the version compiled by Maurizio, in which the Tcl shutdown details discussed in the thread Brian mentioned are already deactivated. Since this version already performs "an exit without running all pending operations", the worst that might happen is truncated operations, but no hangs.

Are you performing scheduled restarts? If not, you should figure out who or what is stopping your server (it does not do this by itself) and check the stopping/starting conditions there.