Forum OpenACS Q&A: Re: Re: Re: Re: Re: ad_schedule_proc seems to be failing

Let me try and explain:

During AOLserver startup, if MaxOpen or MaxIdle is a positive number (not zero, which means "forever"), AOLserver schedules a job to check and reset the database connections at [expr [clock seconds] + $MaxOpen] (as it were).

After May 12, 2006, the current time in seconds since the beginning of the epoch, plus a MaxIdle/MaxOpen setting of 1 billion seconds, resulted in a scheduled event time that overflowed a 32-bit signed integer. (It wrapped around and became a negative value.)
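The arithmetic is easy to check with a quick sketch. This is Python rather than anything from AOLserver itself, and the helper name wrap_int32 is made up for illustration; it only mimics how a value stored in a 32-bit signed integer wraps around.

```python
from datetime import datetime, timezone

INT32_MAX = 2**31 - 1        # largest value a 32-bit signed integer can hold
MAX_OPEN = 1_000_000_000     # the 1 billion second MaxOpen/MaxIdle setting


def wrap_int32(value: int) -> int:
    """Mimic two's-complement wraparound of a 32-bit signed integer."""
    return (value + 2**31) % 2**32 - 2**31


# Latest startup time (seconds since the epoch) for which
# now + MAX_OPEN still fits in a 32-bit signed integer.
last_safe_start = INT32_MAX - MAX_OPEN
last_safe_utc = datetime.fromtimestamp(last_safe_start, tz=timezone.utc)
print(last_safe_utc)  # 2006-05-13 01:27:27+00:00

# One second later the scheduled time wraps to a negative number.
wrapped = wrap_int32(last_safe_start + 1 + MAX_OPEN)
print(wrapped)  # -2147483648
```

The cutoff is 01:27:27 UTC on May 13, which is still the evening of May 12 in US time zones.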

From what I gather from the AOLserver list, on Solaris this leads to a hard crash in some pthread function call. On Linux it just seems to hang the processing of scheduled events forever (because it can't cope with a negative time, and every negative number is less than any positive number).

On Linux, people who don't have MaxIdle or MaxOpen set to 1000000000, or who haven't restarted AOLserver since May 12th, won't have experienced the problem. (For someone with a 1 billion setting who last restarted on May 11th, AOLserver is right now scheduled to reset the database connections in mid-January 2038...)

A setting of 100 million, instead of 1 billion, wouldn't have exposed this condition on AOLserver 3.x for another twenty-eight years or so. Zero is the right value to use now. (Apparently 1 billion was chosen, instead of zero, due to some bug in the Oracle driver or the Oracle client libraries... 1 billion being "effectively" forever... until this month!)
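For anyone who wants to check the "twenty-eight years" figure, here is a rough Python sketch. The function name first_overflow is hypothetical, and the only assumption is that the scheduled time is stored as a 32-bit signed count of seconds since the epoch.

```python
from datetime import datetime, timezone

INT32_MAX = 2**31 - 1
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # average year length, for a rough gap


def first_overflow(setting_seconds: int) -> datetime:
    """Earliest UTC startup time at which now + setting exceeds INT32_MAX."""
    return datetime.fromtimestamp(INT32_MAX - setting_seconds + 1, tz=timezone.utc)


billion = first_overflow(1_000_000_000)
hundred_million = first_overflow(100_000_000)

# How much later the smaller setting would start to overflow.
gap_years = (hundred_million - billion).total_seconds() / SECONDS_PER_YEAR
print(billion.date(), hundred_million.date(), round(gap_years, 1))
# 2006-05-13 2034-11-18 28.5
```

Either way a positive setting only postpones the overflow, so zero remains the safer choice.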

Thanks Michael! Very clear explanation.

Brian

Yay for this thread and OpenACS. My AOLserver restarted today for the first time since May 12th, 2006, and I saw the error. I found this thread and fixed it quickly.