Forum OpenACS Q&A: Oracle Ate my Server
Hi all,

I'm having a problem with my pool connections to Oracle. My setup is as follows:

    Oracle 8.1.7
    RedHat 7.2
    AOLserver/3.3.1+ad13
    ArsDigita Oracle Driver version 2.6
    OpenACS 4.5.1b

Basically, I started my web process and had it running for a few days. My site began to get very slow, so I ran "top" to see what was going on. I had an Oracle process which had been running for 1708 minutes and was taking up 99% of the CPU and 5% of memory. I downed the webserver to see if that would fix it. It didn't.

I checked my v$session table in Oracle to see what connections were being made to Oracle, and there were three entries for OpenACS, one of which was performing a high number of Physical Reads and a high number of Block Gets. This process had the same process ID as the one from my "top".

I did a few tests. I ran the following Tcl commands through a Tcl page on my server:

    set pool [db_nth_pool_name $db_state(n_handles_used)]
    ns_log Notice $pool
    set dba [ns_db gethandle pool1]
    ns_log Notice $dba
    set dbc [ns_db gethandle pool3]
    ns_log Notice $dbc
    #Notice I rearranged the pools because pool2 fails
    set dbb [ns_db gethandle pool2]
    ns_log Notice $dbb
    set output "Results: Pool: $pool, DB: $dba $dbb $dbc"
    ns_return 200 text/html $output

and had the following results:

    [05/Apr/2002:12:45:54][8196.7176][-conn4-] Notice: pooln
    [05/Apr/2002:12:45:54][8196.7176][-conn4-] Notice: nsdb1
    [05/Apr/2002:12:45:54][8196.7176][-conn4-] Notice: nsdb2
    [05/Apr/2002:12:45:54][8196.7176][-conn4-] Notice: RP (280.098 ms): error in rp_handler: serving GET /pooltest.tcl
        ad_url "/pooltest.tcl" maps to file "/web/demo/www/pooltest.tcl"
        errmsg is no access to pool: "pool2"
    [05/Apr/2002:12:45:54][8196.7176][-conn4-] Error: GET /pooltest.tcl
        no access to pool: "pool2"
        while executing
        "ns_db gethandle pool2"
        invoked from within
        "set dbb [ns_db gethandle pool2]"

Looking at top, the Oracle process was still there.

Next I edited my AOLserver Tcl file and changed my Tcl library to point to an empty directory, i.e. such that it would not source the OpenACS Tcl files, and switched on EnableTclPages. I reran the above code (without the set pool), and all handles were obtained without errors. This time the Oracle process disappeared when I checked it with "top", thus fixing the problem.

It looks like somewhere along the way, pool2 is being allocated and not released. I found it strange that when I downed my web server and restarted, the same process was there, and pool2 was still not available.
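For anyone repeating the pool test above, here is a minimal sketch of the same check that does not stop at the first failing pool and that hands every handle back when it is done. It assumes the three pool names from the post (pool1, pool2, pool3) and a hypothetical page such as /pooltest.tcl. It only reports what ns_db says, it does not explain why a pool is unavailable.

```tcl
# Hypothetical test page; the pool names are assumed from the post above.
set results [list]
set handles [list]

foreach pool {pool1 pool2 pool3} {
    # catch traps the Tcl error that ns_db gethandle raises
    # (e.g. no access to pool: "pool2") so the loop can continue.
    if {[catch {set db [ns_db gethandle $pool]} errmsg]} {
        ns_log Warning "pooltest: $pool failed: $errmsg"
        lappend results "$pool: FAILED ($errmsg)"
    } else {
        ns_log Notice "pooltest: $pool gave handle $db"
        lappend handles $db
        lappend results "$pool: $db"
    }
}

# Return every handle we obtained so the test itself cannot leak one.
foreach db $handles {
    ns_db releasehandle $db
}

ns_return 200 text/plain [join $results "\n"]
```

Note that without a timeout, ns_db gethandle waits until a handle becomes free, so a pool whose handles are all in use would make this page hang rather than fail.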
Nor could I replicate it later. It happened once then never again.
It really had the feel of being an Oracle problem.
Of course your problem could be very different. Do you have lots of users or content on your site? It may be that you hit a query that does an unqualified join on two large tables.
If your site's a development site with little content, though, as much as I hate to say it you may've run into an Oracle problem.
I had a similar problem, and I found that the answer to my problem was to uninstall the webmail module. It makes some calls to Java packages that must be downloaded from Sun's website and then loaded by hand (as is very well documented in the webmail module). If the classes aren't present, Oracle's internal JVM runs amok.
The problem is also solved if you install the appropriate Java classes: JavaMail and the Java Activation Framework.
Don't know if this will help you or not...
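One way to check for the situation described above might be to list Java classes in the OpenACS schema that Oracle has not marked VALID. This is only a sketch: it assumes pool1 connects as the schema owner, and that classes with unresolved dependencies (such as missing JavaMail classes) appear with a status other than VALID in user_objects, which I have not verified for webmail specifically.

```tcl
# Sketch: log Java classes in the current schema that are not VALID.
# Assumes pool1 (name taken from the first post) connects as the OpenACS user.
set db [ns_db gethandle pool1]

set row [ns_db select $db "
    select object_name, status
      from user_objects
     where object_type = 'JAVA CLASS'
       and status <> 'VALID'
     order by object_name"]

# ns_db getrow returns 1 while there are rows left and fills $row each time.
# Columns are read by position to avoid depending on column-name case.
while {[ns_db getrow $db $row]} {
    ns_log Notice "java class [ns_set value $row 0] has status [ns_set value $row 1]"
}

ns_db releasehandle $db
```

If nothing is logged, the query found no such classes, which would point away from this explanation.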
I've taken webmail off, and everything seems to be fine now.