Janine, since apparently you can't talk to the people maintaining
Oracle on the other end, and are stuck doing things solely on your
end, the main problem seems to be, "Once Oracle goes down, how do I
know when it's back up, reliably and ASAP?"
I would write a special-purpose script to do exactly that. I don't
know whether you could get AOLserver to work for answering that
question, but writing a script using Oratcl or even SQL*Plus to
repeatedly try to connect to Oracle is definitely feasible.
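To make that concrete, here is a rough sketch of such a polling script. It is written in Python rather than Oratcl purely for illustration, and it shells out to SQL*Plus. The function names, the polling interval, and the idea of requiring two consecutive successful logins before declaring Oracle "up" are all my own assumptions, not anything from your setup, so adjust to taste.

```python
import subprocess
import time
from typing import Callable, Optional


def oracle_is_up(connect_string: str, timeout_s: float = 30.0) -> bool:
    """Attempt one SQL*Plus login and a trivial query; True only if both work.

    connect_string is a placeholder of the form user/password@tns_alias.
    """
    script = "WHENEVER SQLERROR EXIT FAILURE\nSELECT 1 FROM dual;\nEXIT;\n"
    try:
        result = subprocess.run(
            # -S: silent mode, -L: try the logon once instead of re-prompting
            ["sqlplus", "-S", "-L", connect_string],
            input=script,
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except (OSError, subprocess.TimeoutExpired):
        # sqlplus missing, not executable, or hung: treat as "still down"
        return False
    return result.returncode == 0 and "ORA-" not in result.stdout


def wait_until_up(
    check: Callable[[], bool],
    interval_s: float = 10.0,
    required_successes: int = 2,
    sleep: Callable[[float], None] = time.sleep,
    max_attempts: Optional[int] = None,
) -> Optional[int]:
    """Poll check() until it succeeds required_successes times in a row.

    Returns the number of attempts made, or None if max_attempts ran out.
    """
    streak = 0
    attempts = 0
    while max_attempts is None or attempts < max_attempts:
        attempts += 1
        streak = streak + 1 if check() else 0
        if streak >= required_successes:
            return attempts
        sleep(interval_s)
    return None
```

You would call it along the lines of wait_until_up(lambda: oracle_is_up("user/password@tns_alias")) and trigger the restart once it returns. Requiring a short streak of successes is just a cheap guard against catching Oracle in the middle of coming up.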
I'd change your OpenACS code so that anytime the Oracle connection
breaks, you serve some sort of "Sorry, our Oracle database is down
for maintenance" message, then immediately fire off the script to
poll Oracle until it's back up. Once the script says Oracle is back
up, call ns_shutdown and let whatever supervises AOLserver (inittab,
daemontools, etc.) bring it back up with fresh db connections. (Of
course, if due to some bug or other problem the script always says
"Yes, Oracle is up" but AOLserver's db connection is still broken,
you'd go into an infinite loop of AOLserver restarts - ugh - so you
need to handle those sorts of cases.)
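One way to handle the restart-loop case is to cap how many restarts are allowed within some time window, and give up (and page a human) once the cap is hit. The sketch below keeps restart timestamps in a small state file so the count survives the restarts themselves. The function name, the file format, and the default limits of 3 restarts per 10 minutes are made up for illustration.

```python
import time
from pathlib import Path
from typing import Optional


def restart_allowed(
    state_file: Path,
    max_restarts: int = 3,
    window_s: float = 600.0,
    now: Optional[float] = None,
) -> bool:
    """Return True and record the restart if we are under the limit.

    state_file holds one Unix timestamp per line for each recent restart.
    """
    now = time.time() if now is None else now
    try:
        stamps = [float(tok) for tok in state_file.read_text().split()]
    except FileNotFoundError:
        stamps = []

    # Keep only restarts that happened inside the window
    recent = [t for t in stamps if now - t < window_s]
    if len(recent) >= max_restarts:
        # Too many restarts lately: something other than Oracle is broken
        return False

    recent.append(now)
    state_file.write_text("".join(f"{t:.3f}\n" for t in recent))
    return True
```

The polling script would check restart_allowed(...) right before triggering the shutdown, and send an alert instead of restarting when it returns False.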
I would change your external Keepalive to check only a *.tcl page
that does not need any database connectivity at all. (Or just turn
Keepalive off entirely; you probably don't really need it.)