Hi Gustaf,
Thank you for your answer. I've had this same tought as you, but I don't think this is the issue. I have 100, 50 and 50 connections por pool1, pool2 and pool3, and may maxthreads is setup to 35.
Checking the threads currently running on the system, I have the following list:
:throttle 1141053776 NS_THREAD_DETACHED 03:00:53 AM on 03/03/2011 NULL NULL
::bgdelivery 1164183888 NS_THREAD_DETACHED 08:15:22 AM on 03/03/2011 NULL NULL
-sched:idle8- -sched- 1166289232 NS_OK 08:25:04 AM on 03/03/2011 p:0x7ff42aa5c760 a:0x8
-sched:idle7- -sched- 1155791184 NS_OK 03:50:18 AM on 03/03/2011 p:0x7ff42aa5c760 a:0x7
-sched:idle6- -sched- 1153685840 NS_OK 03:12:50 AM on 03/03/2011 p:0x7ff42aa5c760 a:0x6
-sched:idle5- -sched- 1151580496 NS_OK 03:05:45 AM on 03/03/2011 p:0x7ff42aa5c760 a:0x5
-sched:idle4- -sched- 1149475152 NS_OK 03:04:45 AM on 03/03/2011 p:0x7ff42aa5c760 a:0x4
-sched:idle3- -sched- 1147369808 NS_OK 03:02:45 AM on 03/03/2011 p:0x7ff42aa5c760 a:0x3
-sched:idle1- -sched- 1143159120 NS_OK 03:01:45 AM on 03/03/2011 p:0x7ff42aa5c760 a:0x1
-sched:idle0- -sched- 1094764880 NS_OK 03:01:15 AM on 03/03/2011 p:0x7ff42aa5c760 NULL
-sched:22- -sched- 1145264464 NS_OK 03:01:45 AM on 03/03/2011 p:0x7ff42aa5c760 a:0x2
-sched- -main- 1084238160 NS_OK 03:00:05 AM on 03/03/2011 p:0x7ff42aa5c970 NULL
-nssock:driver- -main- 1130555728 NS_OK 03:00:52 AM on 03/03/2011 p:0x7ff42aa4beb0 a:0x6292b0
-main- 719976160 NS_THREAD_DETACHED 03:00:05 AM on 03/03/2011 NULL NULL
-default:9- -main- 1107396944 NS_OK 03:00:52 AM on 03/03/2011 ns:connthread
-default:8- -main- 1105291600 NS_OK 03:00:52 AM on 03/03/2011 ns:connthread
-default:7- -main- 1103186256 NS_OK 03:00:52 AM on 03/03/2011 ns:connthread
-default:6- -main- 1101080912 NS_OK 03:00:52 AM on 03/03/2011 ns:connthread
-default:5- -main- 1098975568 NS_OK 03:00:52 AM on 03/03/2011 ns:connthread
-default:4- -main- 1096870224 NS_OK 03:00:52 AM on 03/03/2011 ns:connthread
-default:3- -main- 1092659536 NS_OK 03:00:52 AM on 03/03/2011 ns:connthread
-default:20- -nssock:driver- 1132661072 NS_OK 03:00:52 AM on 03/03/2011 ns:connthread
-default:2- -main- 1090554192 NS_OK 03:00:52 AM on 03/03/2011 ns:connthread
-default:19- -main- 1128450384 NS_OK 03:00:52 AM on 03/03/2011 ns:connthread
-default:18- -main- 1126345040 NS_OK 03:00:52 AM on 03/03/2011 ns:connthread
-default:17- -main- 1124239696 NS_OK 03:00:52 AM on 03/03/2011 ns:connthread
-default:16- -main- 1122134352 NS_OK 03:00:52 AM on 03/03/2011 ns:connthread
-default:15- -main- 1120029008 NS_OK 03:00:52 AM on 03/03/2011 ns:connthread
-default:14- -main- 1117923664 NS_OK 03:00:52 AM on 03/03/2011 ns:connthread
-default:13- -main- 1115818320 NS_OK 03:00:52 AM on 03/03/2011 ns:connthread
-default:12- -main- 1113712976 NS_OK 03:00:52 AM on 03/03/2011 ns:connthread
-default:11- -main- 1111607632 NS_OK 03:00:52 AM on 03/03/2011 ns:connthread
-default:10- -main- 1109502288 NS_OK 03:00:52 AM on 03/03/2011 ns:connthread
-default:0- -main- 1086343504 NS_OK 03:00:52 AM on 03/03/2011 ns:connthread 7482 200.198.213.70 running GET /_stats/threads.adp 0.7389 0
When the cache problem happens however, the DB queries seem to last forever:
[03/Mar/2011:10:32:57][31076.1166289232][-sched:9-] Warning: db_exec: longdb 928 seconds nsdb0 select dbqd.acs-mail-lite.tcl.acs-mail-lite-procs.acs_mail_lit
e::sweeper.get_queued_messages
The problem doesn't happend with scheduled procs only. Regular DB calls show errors:
[03/Mar/2011:09:51:35][31076.1132661072][-default:20-] Warning: db_exec: longdb 929 seconds nsdb1 0or1row dbqd.acs-tcl.tcl.request-processor-procs.rp_lookup_
node_from_host.node_id
[03/Mar/2011:09:51:35][31076.1132661072][-default:20-] Error: Error in include template "/var/www/ct-gcie/packages/xowiki/www/portlets/portlet-minimo-comunid
ade": Database operation "0or1row" failed
(exception ERROR, "could not receive data from server: Connection timed out
")
The problem with caching is: when it grows to the limit, the cache flush operation seems to last forever and the DB call to reload the cache goes to a timeout.
I still have no clue about the reasons for this behaviour.