Hi Gustaf and Tom,
Thank you very much for your replies. Sorry for taking so long to post back, but I had some personal issues to deal with.
Based on our tests and observations, we are writing a kind of benchmarking howto that should go somewhere in the community, maybe in the XoWiki instance. Our tests had three branches of observation:
1 - AOLServer (OpenACS?) Tuning
2 - PostgreSQL Tuning
3 - OS Tuning
I'll try to give a brief description of the specific and general observations we were able to make:
<h4> AOLServer (OpenACS?) Tuning </h4>
Our first issue with AOLServer was the 4.5.0 parameters problem that was fixed in the file /acs-tcl/tcl/pools-init.tcl, as mentioned in Tom's message earlier in this thread. A simple update solved this issue.
The other problem was with the thread creation process which, as you said, is a mixed XOTcl + AOLServer matter. The most important thing we realized is that thread creation and destruction consume a lot of I/O operations. To improve I/O performance we tried changing the file system, but that had no effect, because of the most important thing we found out: DON'T EVER USE VMs IN PRODUCTION SITES.
Our server was based on Xen VMs, and it was impossible to get decent I/O performance on them. The point is that no virtualization process is able to split blocks and inodes completely between virtual machines, so all the I/O is shared among the VMs in the cluster. This is a bit different from the logical partitions available on some hardware, such as IBM's big servers: there the files are completely separated and each partition works without I/O contention.
Based on this observation, we moved the production environment to a dedicated server with the specifications described in the first post of this thread, and most of the problems with thread creation and destruction are gone.
The next step was to adjust the configuration file. I guess the biggest challenge for everybody using OpenACS is finding the best relation between number of users, maxconnections, requests, and maxthreads. This is where we are stuck right now. According to this post of Gustaf's on the AOLServer list:
the question certainly is, what "concurrent users" means (...) note, that when talking about "concurrent users", it might be the case that one user has many requests concurrently running. To simplify things, we should talk about e.g. 10 concurrent requests.
Then you need
- at least 10 threads
- at least 10 connections
The number of database connections is more difficult to estimate, since
this in an application(oacs) matter. In oacs, most dynamic requests need
1 or 2 db connections concurrently.
I know this is not a cookbook, but based on his observation we can figure one thread, one connection, and up to 2 db connections per request. With these parameters, and given that our tests fire all the configured requests at the same time, the results mostly follow this logic. When we set maxthreads to 150, there is some memory degradation while serving 100 simultaneous requests. The degradation goes away when we set minthreads to 80 and maxthreads to 120. What we take from this is that, with the best performance adjustments, one thread serves one request.
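To make this concrete, these are the knobs in the AOLServer config.tcl we are talking about. The values below are the ones from our best run, not a general recommendation, and parameter placement may differ slightly between setups:

```tcl
# config.tcl sketch: the thread settings that behaved best for us
# (80/120) under ~100 simultaneous requests; adjust for your own load.
ns_section ns/server/${server}
ns_param maxconnections 120   ;# connections queued + running
ns_param minthreads 80        ;# keep threads alive, creation is expensive
ns_param maxthreads 120       ;# upper bound on connection threads
ns_param threadtimeout 3600   ;# seconds before an idle thread exits
```

Keeping minthreads high matters for us precisely because thread creation and destruction were the expensive part.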
However, when you push these settings to production, there is maybe the most difficult problem: estimating the number of requests per user. Maybe a patch to xotcl-request-monitor could answer this question, but we are still thinking about the best way to do it. We could also see that the number of connections per user is very different from a 1:1 relation, and that is another thing we are trying to pin down.
<h4> PostgreSQL Tuning </h4>
All the tests we have performed so far have the DB and OpenACS in the same box. The goal is to find the performance limit of this setup, so we can then move PostgreSQL off the machine and measure how much it improves.
There is a lot of material on PostgreSQL tuning on the Internet, and I'm not a specialist myself, but I guess a few specific observations can be made.
Everybody says that Internet applications spend most of their time doing SELECT operations on the database. This is a myth. If you count the number of operations it may be true, but not if you consider execution time and resource usage. The number of TPS (transactions per second) in an application using OpenACS is very large, and that's a good thing. On the old Internet you were creating content for people to see. The new paradigm says the opposite: now the users create the content, and that's why I use OpenACS. There's simply no better option on the market for building social and collaborative networks.
Because of this, most of our PostgreSQL tuning involves transaction improvements. I still can't completely understand the pool mechanism that OpenACS uses for PostgreSQL, and I guess some improvement in this area could make my tests better.
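For what it's worth, the pools are defined in the same config.tcl; a minimal sketch follows. The pool name, sizes, and datasource here are illustrative, not our actual values:

```tcl
# Sketch of the OpenACS db pool definitions. With "2 db connections
# per request" in mind, the pools together should cover the worst case.
ns_section ns/db/pools
ns_param pool1 "Pool 1"

ns_section ns/db/pool/pool1
ns_param connections 15          ;# concurrent db handles in this pool
ns_param driver postgres
ns_param datasource localhost::openacs
ns_param user openacs
```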
The most important thing to adjust here is the shared memory mechanism. We've seen that if you set too large a value in PostgreSQL and in the OS, the memory shared between PostgreSQL and AOLServer can make the system crash under stress, and that's not a good thing. I/O also becomes a problem with a large number of INSERT and DELETE operations, mostly because the thread creation process is also heavy on the system.
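On the PostgreSQL side, the knobs involved live in postgresql.conf, and they have to fit inside the kernel shared memory limits shown in the OS section below. The values here are illustrative only, and the units for shared_buffers depend on your PostgreSQL version (8.1 on Etch takes 8kB pages):

```
# postgresql.conf sketch (illustrative values, not a recommendation);
# shared_buffers must fit under kernel.shmmax or the server won't start.
shared_buffers = 50000        # ~400MB in 8kB pages on 8.1
max_connections = 100         # must cover all AOLServer db pools together
checkpoint_segments = 16      # smooths I/O bursts from heavy INSERT/DELETE
```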
The conclusion is: if you want the best performance, you really have to split AOLServer and PostgreSQL into different boxes. The exact point at which to do it (DB size, number of users) is what we are trying to find out.
<h4> OS Tuning </h4>
Maybe this is the most difficult part, because Linux has a lot of options that can be changed. A better analysis of resource usage is necessary before we can have better numbers. Here is the list of parameters we are changing on a Debian GNU/Linux Etch system:
# echo 2 > /proc/sys/vm/overcommit_memory
# echo 27910635520 > /proc/sys/kernel/shmmax
# echo 32707776 > /proc/sys/kernel/shmall
# echo deadline > /sys/block/sda/queue/scheduler
# echo 250 32000 100 128 > /proc/sys/kernel/sem
# cat /proc/sys/fs/file-max
753884
# echo 16777216 > /proc/sys/net/core/rmem_default
# echo 16777216 > /proc/sys/net/core/wmem_default
# echo 16777216 > /proc/sys/net/core/wmem_max
# echo 16777216 > /proc/sys/net/core/rmem_max
# su - postgres
postgres@nodo406:~$ ulimit 753883
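Instead of hard-coding shmmax like we do above, a small helper can derive it from the installed RAM. This is just a sketch (the one-half fraction is our assumption, and it assumes a Linux /proc/meminfo):

```shell
#!/bin/sh
# Sketch: suggest kernel.shmmax as roughly half of physical RAM,
# so PostgreSQL's shared memory can't starve AOLServer of the rest.
mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
shmmax_bytes=$((mem_kb / 2 * 1024))
echo "suggested kernel.shmmax = $shmmax_bytes"
# To apply (as root):
#   echo $shmmax_bytes > /proc/sys/kernel/shmmax
```

Echoing into /proc is lost on reboot, so whatever value you settle on should also go into /etc/sysctl.conf.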
There are some other things, such as kernel changes and TCP/IP configuration, that we are also adjusting, but I don't think we have the best settings yet, so let's wait a little longer.
That's it. Sorry about the long post, but I guess I owe the community some qualified feedback, considering all the help you guys always give us.