Forum OpenACS Q&A: Can I run several AOLservers against one database?

Hi,

We use Pound as a reverse proxy and for load balancing. Currently I have one AOLserver connected to the database (dynamic) and a static server serving images, CSS and the like.

Has anyone successfully run several dynamic servers on different machines against a single database machine? Does OpenACS support that?

Greetings,
Nima

Posted by Malte Sussdorff on
Sure, that's what cluster mode is for (if I remember correctly). Ask AIESEC (by contacting Harish at azri.biz) or GPI for details; those two I know run it for sure.
Posted by Denis Roy on
That is no problem at all. From the database's point of view, it doesn't matter whether the incoming TCP connections come from one or multiple web servers (unless you have access restrictions, of course).
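For example, with PostgreSQL this just means listing every web server in pg_hba.conf on the database machine and letting it listen on the network. A minimal sketch; database name, user and addresses below are made up:

    # pg_hba.conf on the database host: one line per web server
    host    openacs    openacs    10.0.0.11/32    md5
    host    openacs    openacs    10.0.0.12/32    md5

plus listen_addresses = '*' (or the right interface) in postgresql.conf so the database accepts TCP connections from the network.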

A scalable, and probably also the most typical, configuration would be:

  • a load balancer
  • two or more (physical) web servers
  • one database

The load balancer takes care of distributing incoming requests across the web servers, which in turn share the same database. This solution scales, since you can add more web servers if need be. The big advantage, apart from balancing the load, is much higher availability for your application, since the web servers are redundant (i.e. a web server can go down or be taken down for maintenance while the remaining ones continue to serve requests).
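Since you already run Pound, the balancing part is just a matter of listing all dynamic servers as back ends. A minimal sketch (addresses and ports are made up):

    ListenHTTP
        Address 0.0.0.0
        Port    80
        Service
            BackEnd
                Address 10.0.0.11
                Port    8000
            End
            BackEnd
                Address 10.0.0.12
                Port    8000
            End
        End
    End

Pound will then spread requests across both AOLservers and skip a back end that stops responding.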

The problem you will face is the synchronization of the util_memoize cache across the different web servers. If some cached data gets modified on one web server, it somehow has to be updated on the others, too, which is what the cluster support in the OpenACS kernel parameters is for. If you turn it on, the caches of all web servers in the cluster get updated automatically.
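Roughly, this is driven by a handful of acs-kernel parameters. I am citing the names from memory, so double-check them on your installation (the addresses are made up):

    ClusterEnabledP      1
    CanonicalServer      10.0.0.11:8000
    ClusterPeerIP        10.0.0.11:8000 10.0.0.12:8000
    ClusterAuthorizedIP  10.0.0.*

ClusterAuthorizedIP matters: only the listed peers should be allowed to call the cache flush URLs.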

Since we have been running OpenACS clusters for some time now, let me share some more thoughts which might be helpful for you:

Load Balancer

With the introduction of a load balancer, you also introduce another single point of failure. If availability is important to you, you might want to look into redundant solutions (e.g. via Heartbeat from the Linux High Availability project). (Of course, the database is a single point of failure as well, but that is more complex to solve, via clusters or replication, than the load balancer issue.)

Web Servers

From your other posting I know that you are running quite a big machine for your web server. I personally think it is cheaper, and also better in terms of redundancy, to use more but smaller servers for your AOLserver instances.

Also, there are some things to take into consideration that I think were not really well covered in the original design of the OpenACS cluster system:

Content Repository

CR files in the file system: if you store your files in the file system, you potentially have a problem, since your users access different physical web servers. If a file gets uploaded to one server's file system, you have to make sure that users who hit the other web servers can access it, too. Some solutions include:

  • mounting the CR directory via NFS, which would again be a single point of failure and will also easily saturate one web server's disk throughput, so you would need really fast disks or a separate file server
  • using rsync to mirror files across all web servers, which becomes difficult to manage with a large amount of files (we have about 80+ GB), up to a point where it doesn't make sense anymore (see the sketch after this list)
  • modifying the application with a patch to the CR so that new files get stored not only in the server's own file system but also in those of the other servers.
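For the rsync option, the simplest form is a cron job on each server pushing to its peers, along these lines (the host name is made up; check the CR path on your install):

    # mirror new CR files to a peer; -a keeps permissions/times, -z compresses
    rsync -az /web/myservice/content-repository-content-files/ \
        www2:/web/myservice/content-repository-content-files/

With 80+ GB, even scanning the file list takes noticeable time per run, which is exactly where this approach stops scaling.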

util_memoize cache synchronization

If you have a large number of concurrent users, one of the bottlenecks is usually the maximum number of connection threads available. At the same time, high activity from many concurrent users means that cache updates happen frequently.
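The thread limits live in your AOLserver configuration file; a typical sketch (the values are made up and need tuning for your hardware):

    ns_section ns/server/${server}
        ns_param maxthreads     50   ;# upper bound on connection threads
        ns_param minthreads     5    ;# threads kept ready at all times
        ns_param maxconnections 100  ;# requests queued waiting for a thread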

The problem with the current design of the cache synchronization is that the source web server uses HTTP GETs to call a URL on the destination server. This means that one thread is blocked for each update, on both the source and the destination web server. Ideally, such an update call takes only a fraction of a second, but I have seen it take a lot longer under load.
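Conceptually, the propagation looks about like the following sketch. The proc name is hypothetical and the flush URL is cited from memory (the real code lives in acs-tcl's server-cluster procs); util_memoize_flush, ns_httpget and ns_urlencode do exist:

    # simplified sketch of cluster-wide cache flushing
    proc my_cluster_flush {script} {
        # flush the local util_memoize cache first
        util_memoize_flush $script
        # then call a flush URL on every peer; each ns_httpget blocks one
        # thread here and occupies one connection thread on the peer
        foreach peer [parameter::get -package_id [ad_acs_kernel_id] \
                          -parameter ClusterPeerIP] {
            ns_httpget "http://$peer/SYSTEM/flush-memoized-statement.tcl?statement_key=[ns_urlencode $script]"
        }
    }

Multiply that by the number of peers and by the cache update rate, and you can see how threads get eaten up.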

This can quickly become a vicious circle: high activity -> high load -> few threads available -> many cache updates -> even fewer threads available for incoming user requests.

All in all, I can just say that we are very happy with such a flexible and scalable setup. You can easily improve response times as well as availability.