Forum OpenACS Forum Summaries: Re: OpenACS on Docker

9: Re: OpenACS on Docker (response to 5)
Posted by Frank Bergmann on
Hi Guys,

mildly disappointed

I just saw this thread today. The goal of this Docker container was to provide an easy one-step installer for relative noobs, and I think it serves its purpose. It's just a kind of lightweight VM.

I guess what Malte wants is a scalable installation.

I'm sympathetic with that thought, but I understand that scaling OpenACS/]project-open[ instances really depends on the database rather than the NaviServer CPU load.

But to my knowledge databases don't scale on massive parallel shared nothing architectures without sacrificing consistency.

So my thought is to move towards AWS or Google Cloud with their managed PostgreSQL services. I believe Brian showed me some impressive performance figures. Then just use Kubernetes or similar to spawn a few NaviServers behind an IP-based load balancer. That shouldn't be too difficult...

Cheers
Frank

10: Re: OpenACS on Docker (response to 9)
Posted by Gustaf Neumann on
OpenACS really depends on the database rather than the NaviServer CPU load.
It really depends. NaviServer is especially good at scaling over multiple cores, and since the number of available cores keeps increasing, this is usually not a bottleneck. But if one is running a server where 80k users log in per day, and one has to serve ~9k concurrent users (as we had in peak-corona times), then the CPU load is substantial.

But to my knowledge databases don't scale on massive parallel shared nothing architectures without sacrificing consistency.
This is true in general (especially for massive scalability) and is a consequence of the CAP theorem. There is constant progress on high-availability pg and load balancing (see e.g. [1]), but this mostly helps to improve availability (as always, it depends on the load pattern: it is much easier to scale with many readers than with many writers). But don't expect e.g. a pgpool-II installation to be any faster for smaller user numbers than a configuration where pg and nsd run on the same machine with high-bandwidth communication.
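
To make the reader/writer asymmetry a bit more concrete, here is a toy capacity model in Python. It is only a sketch and not a benchmark or anything taken from a pg tool: it assumes that every node can process the same number of operations per second, that reads are spread evenly over all nodes, and that every write has to be applied on every node. The function name and the numbers are made up for illustration.

```python
def max_throughput(node_capacity: float, nodes: int, write_fraction: float) -> float:
    """Upper bound on total ops/sec for a fully replicated setup (toy model).

    Assumptions (not measured): each node handles `node_capacity` ops/sec,
    reads are spread evenly over all nodes, every write hits every node.
    """
    if nodes < 1:
        raise ValueError("nodes must be >= 1")
    if not 0.0 <= write_fraction <= 1.0:
        raise ValueError("write_fraction must be between 0 and 1")
    # For a total throughput T, each node has to carry
    #   T * write_fraction               (all writes)
    # + T * (1 - write_fraction) / nodes (its share of the reads)
    # and that sum must not exceed node_capacity.
    per_node_cost = write_fraction + (1.0 - write_fraction) / nodes
    return node_capacity / per_node_cost


if __name__ == "__main__":
    for w in (0.0, 0.5, 1.0):
        print(w, max_throughput(1000.0, 4, w))
```

Under these assumptions, four nodes give four times the single-node throughput for a read-only load, 1.6 times when half of the operations are writes, and no gain at all for a write-only load.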

For large-scale applications we currently see bottlenecks rather with huge content repositories, containing double- and triple-digit TB of files that must be highly available, especially when no shared-disk fail-over or block-device replication is available. In case there is interest, I've done some work on integrating MongoDB for such purposes, which scales horizontally.

Concerning the Docker image: from my point of view, it is most useful for quick tryout scenarios and for easy standard deployments. I would not expect it to solve scalability issues right now. When using pg e.g. in the AWS cloud, I would not expect it to scale better. When using multiple nsds connecting to the same OpenACS instance, be aware that out of the box there will be no cache consistency. Running multiple instances (e.g. of po), each with one nsd, against one pg installation (what pg calls a "cluster") will work easily.
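
The cache consistency problem can be sketched in a few lines of Python. This is not how NaviServer or OpenACS caching is implemented; the classes below are hypothetical stand-ins (one shared in-memory "database", two "app servers" each with a private cache) that only show the general effect: a write through one server is not seen by the other until its cached entry is invalidated.

```python
class Database:
    """Hypothetical stand-in for the shared pg instance (in-memory dict)."""

    def __init__(self):
        self.rows = {}


class AppServer:
    """Hypothetical stand-in for one nsd process with its own local cache."""

    def __init__(self, db):
        self.db = db
        self.cache = {}

    def read(self, key):
        # Serve from the local cache; go to the database only on a miss.
        if key not in self.cache:
            self.cache[key] = self.db.rows.get(key)
        return self.cache[key]

    def write(self, key, value):
        # Update the database and this server's cache only.
        # Other servers are not told about the change.
        self.db.rows[key] = value
        self.cache[key] = value

    def invalidate(self, key):
        # What some cross-server invalidation mechanism would have to trigger.
        self.cache.pop(key, None)


def demo():
    db = Database()
    a, b = AppServer(db), AppServer(db)
    a.write("title", "old")
    first = b.read("title")   # b now caches "old"
    a.write("title", "new")   # the change goes through server a
    stale = b.read("title")   # b still answers from its cache
    b.invalidate("title")
    fresh = b.read("title")   # after invalidation b sees the new value
    return first, stale, fresh


if __name__ == "__main__":
    print(demo())
```

With several nsds against one OpenACS instance, something has to play the role of the `invalidate` call across servers; with one nsd per instance the problem does not arise.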

[1] https://wiki.postgresql.org/wiki/Replication,_Clustering,_and_Connection_Pooling