Forum OpenACS Q&A: Re: OpenACS clustering setup and how it relates to xotcl-core.

Dear Marty,

You are raising many interesting questions! The short answer is that cluster support is only partly implemented, simply because with a high number of cores and a suitable NaviServer configuration one gets very high scalability (up to 1000 permission-checked OpenACS requests per second). Getting all of OpenACS with all its packages fit for clusters is some work, and using a cluster makes debugging more complex.

If one has many long-running connection requests, one should use multiple connection thread pools (have you set this up?), and one should consider reducing the blocking time of the requests (what times are you experiencing?). Are you using the xotcl request monitor? For example, its long-calls page gives you a good overview.
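To make the connection thread pool suggestion more concrete, here is a minimal sketch of how an extra pool can be declared in the NaviServer configuration file. The server name "openacs", the pool name "slow", the thread counts and the mapped URL "/reports" are only assumptions for illustration; adapt them to your setup.

```tcl
# Fragment of a NaviServer configuration file (sketch under assumptions).
set server openacs   ;# hypothetical server name, normally already set in the config

# Declare an additional connection thread pool named "slow".
ns_section ns/server/${server}/pools
ns_param slow "Pool for long-running requests"

# Configure the pool and map the (hypothetical) slow URLs to it, so that
# these requests no longer occupy threads of the default pool.
ns_section ns/server/${server}/pool/slow
ns_param minthreads 2
ns_param maxthreads 5
ns_param map "GET /reports"
ns_param map "POST /reports"
```

Requests that are not mapped to a named pool continue to be served by the default pool.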

> found that having the database on a separate VM than the naviserver instances

I do not think this is the case in general; it depends on the memory of the VM, on the number of cores given to the VM, etc. Local communication within a machine is always faster in terms of latency and throughput than communication via TCP/IP. When a VM is under-configured for the required load (e.g. it has only 2 cores, is running out of memory, or does not have enough shared buffers, ...), then maybe.

> how much of the cache flushing over the cluster has actually been implemented

Caching happens on many levels: per-request caching, per-thread caching, caching via ns_cache and caching via nsv. The first one is not relevant for cluster considerations. Per-thread caching is intentionally lock-free, but requires measures like a restart for flushing (it is used for things that "never change"). For the other ones, there is the special handling provided by util_memoize and the clusterwide operations. Using util_memoize for everything is not good, since it does not scale: there are many long locking operations that limit concurrency (check the lock times in nsstats and sort by "total wait"; it is not unlikely that the 'util_memoize' cache is at the top). Therefore, the caching work of the last years was devoted to defining multiple caches and to cache partitioning.
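To illustrate the idea of cache partitioning, here is a rough sketch in plain NaviServer Tcl. It is not the OpenACS implementation; the cache name "demo_cache", the helper procs, the partition count and the cache size are all hypothetical. The point is only that several smaller caches, each with its own lock, spread the lock contention that a single big cache would concentrate.

```tcl
# Sketch only: all names and numbers below are assumptions for illustration.
proc demo_partition_count {} { return 4 }

# Create one ns_cache per partition (typically done once at server startup).
for {set i 0} {$i < [demo_partition_count]} {incr i} {
    ns_cache_create demo_cache-$i 1000000   ;# maximum size in bytes
}

# Return the cached value for a key, computing it via $script on a miss.
# Numeric keys are assumed, so the partition can be chosen with a modulo.
# The script is evaluated inside this proc, so it must not refer to
# local variables of the caller.
proc demo_cache_eval {key script} {
    set partition [expr {$key % [demo_partition_count]}]
    return [ns_cache_eval demo_cache-$partition $key $script]
}
```

A call could look like `demo_cache_eval $user_id [list demo_expensive_lookup $user_id]`, where demo_expensive_lookup stands for whatever (hypothetical) command computes the value.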

The clusterwide support should work for you out of the box. Historically, it was developed in xotcl-core, but it has been partly moved into acs-tcl, since it is useful everywhere (some more generally useful stuff will be moved in future versions from xo:: to the core locations). Slightly simplified, it works as follows: instead of writing

nsv_unset $array $key

one can use

::acs::clusterwide nsv_unset $array $key

to execute a command on all machines of a cluster. So prefixing the cache clearing operations with ::acs::clusterwide should do the job.
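As a small sketch of this prefixing pattern, assume an application that keeps settings both in an nsv array and in an ns_cache. The proc name, the nsv array "demo_settings" and the cache "demo_cache" are hypothetical; ns_cache_flush is the NaviServer command for removing cache entries.

```tcl
# Hypothetical helper: invalidate one setting on all nodes of the cluster.
proc demo_flush_setting {key} {
    # Remove the nsv entry on every cluster node
    # (as with a plain nsv_unset, the entry is expected to exist).
    ::acs::clusterwide nsv_unset demo_settings $key

    # Remove the matching ns_cache entry on every cluster node.
    ::acs::clusterwide ns_cache_flush demo_cache $key
}
```

Without the ::acs::clusterwide prefix, the same two commands would clear the entries only on the server that executes them.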

Be aware that inter-server talk for cache clearing operations can become quite heavy, and that the communication as done right now is not how it should be done. ... many things can be done there, but first it has to be clear that it is worth the effort.

The first step is usually to figure out the local bottlenecks and to check the configuration; this is usually much less work than going towards a clustered solution.

Hope this answers some of your questions
all the best
-g