Forum OpenACS Q&A: Server melting down
serv: no free connections, dropping this one, total so far: 0
Is this a configuration error or what? I can't imagine a few thousand visitors killing OACS this easily. Hardware or the db shouldn't be the bottleneck, since there are plenty of other sites on the same server that have worked fine all along.
One thing I noted was that the size of nsd got quite high in a few minutes, somewhere up to 170MB. After restarting it was down to some 15MB.
Good values are reportedly very hardware and operating system specific, but since most of us are using Linux with a 2.4.x kernel, it would be nice to see AOLserver tuning reports there. I haven't seen any though.
Since you are sharing the server with other OpenACS instances, maybe setting minthreads 5, maxthreads 30, threadtimeout 600 would be good for you. If it were the only site on the box, perhaps minthreads 35, maxthreads 35 would be more appropriate. But I don't really know for sure; I just made up those numbers.
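For reference, the shared-server suggestion above could be written into the AOLserver config file roughly like this. It is only a sketch: the server name is hypothetical, and it assumes the AOLserver 3.x layout in which the thread parameters live in the ns/server/<name> section.

```tcl
# Hypothetical server name; use the one already set in your own config file.
set server "myserver"

ns_section "ns/server/${server}"
ns_param   minthreads    5    ;# threads started when the server comes up
ns_param   maxthreads    30   ;# upper limit on connection threads
ns_param   threadtimeout 600  ;# seconds a thread may sit idle before it exits
```

These lines would replace the existing values in that section, and nsd has to be restarted for them to take effect.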
After that the site started working perfectly well, serving pages with up to 100 simultaneous connections (tried it with ab).
Maybe I should put them a bit lower, though. I'll keep on testing.
We just put our new annual report online for our members to fill out. On a dual 800 MHz Xeon (AOLserver 3.5.6) with 2 GB of RAM and Linux kernel 2.4.22, we have:
    ns_param MaxThreads     40  ;# Maximum threads
    ns_param MinThreads     40  ;# Start this many on server start
    ns_param ThreadTimeout  60  ;# Shut down a thread if idle after 60 seconds
    ns_param MaxConnections 100 ;# Queue and service this many conns.
                                ;# If over this many conns, send 503 Server Busy
    ns_param MaxDropped     0   ;# Disable auto shutdown of server in case of
                                ;# too many Server Too Busy messages.
And each of the main/log/subquery database pools has 55 connections configured. I did it that way because I didn't want a thread to wait on a connection if by some chance all the threads hit the DB at the same time.
Our nsd RSS sizes are hovering around 719 MB and have been as high as 1.1 GB. That doesn't seem normal to me, but things are running OK, so I'm not (yet) worried.
We are seeing OK performance. I think it should be better but I'm having a hard time telling if the database or the webserver is the bottleneck. Suggestions for determining that will be appreciated.
I would not be surprised if performance improved after you dropped the threads to 20 and the db pools to, say, 15 apiece.
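In config-file terms, that suggestion might look something like the fragment below. Treat it as a sketch: the server name is hypothetical, setting minthreads equal to maxthreads is an assumption carried over from the posted config, and the driver, datasource, user and password lines of each pool section are left out.

```tcl
# Hypothetical server name; use the one already set in your own config file.
set server "myserver"

ns_section "ns/server/${server}"
ns_param   maxthreads 20      ;# down from 40
ns_param   minthreads 20      ;# assumption: kept equal to maxthreads as before

# One section per pool; only the connection count is shown here.
ns_section "ns/db/pool/main"
ns_param   connections 15     ;# down from 55

ns_section "ns/db/pool/log"
ns_param   connections 15

ns_section "ns/db/pool/subquery"
ns_param   connections 15
```

With 20 threads and 15 handles per pool, a thread could in principle wait briefly for a handle, so it is worth watching the server under load after making the change.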
You could try using the nstelemetry.adp and loading the page a few times when the server is busy to see how many threads are actually busy.
My guess: unless you are serving more than 5000 visitors per day, the lower counts I recommend will be more than adequate. A better number for RAM usage would be under 200 MB; one client serves 50+ GB per month of db-intensive pages and stays at about 100-115 MB for the nsd8x process size.
You should probably also bump up the amount of shared memory that Postgres uses, along with the buffers settings, etc.
I'll do some more tuning. BTW we are on Oracle, not PG. Why should there be fewer DB connections in each pool than there are threads?
And does anyone know a good way to tell whether our bottleneck is the webserver or the DB? They are on different boxes. The Oracle server is 126.96.36.199, 4 GB, dual 2.6 GHz Xeon, hyperthreading turned off. One RAID 10 array.
I have nstelemetry installed, but am having trouble interpreting it.
Presumably a connection thread ends up spending a large fraction of its time feeding content back to a slow modem user over TCP/IP, and the thread is not holding a database handle during that time (well, as long as you haven't botched your Tcl programming it's not). So it seems reasonable that most people use a number of threads substantially larger than their number of db handles.
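To make the point about not holding a handle while feeding the client concrete, here is a rough sketch of a page script using the plain ns_db API. The table and column names are hypothetical, and it assumes a default pool is configured for the server.

```tcl
# Build the whole page while holding the handle, then release it
# before any bytes go out to the (possibly slow) client.
set db [ns_db gethandle]                 ;# handle from the default pool

set html "<ul>\n"
# "reports" and "title" are made-up names for illustration only.
set row [ns_db select $db "select title from reports order by title"]
while { [ns_db getrow $db $row] } {
    append html "<li>[ns_quotehtml [ns_set get $row title]]</li>\n"
}
append html "</ul>\n"

ns_db releasehandle $db                  ;# handle goes back to the pool here

# The thread stays busy during the transfer, but no db handle is tied up.
ns_return 200 text/html $html
```

A script that called ns_return before ns_db releasehandle would keep the handle for the whole transfer, which is the kind of botched programming that would force the pool to be as large as the thread count.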
I suspect an Oracle connection is also pretty heavyweight, so you are probably sucking up lots of RAM for no good reason in database connections which you never use. Hopefully it will all just be swapped out most of the time, of course.
But all that is just my supposition. Some kind of nice Linux, AOLserver, Oracle, PostgreSQL, and OpenACS performance test suite, and good repeatable results from it, would be nice to have. :)