Forum OpenACS Development: Ideas to speed up lang::message::cache

This post is just untested musings, so feel free to ignore!

We use quite a lot of message keys (38,000 on our dev environment), and have noticed that loading these at server startup is, by far, the slowest part. For example, on my dev server, NaviServer currently takes 390 seconds to start up, 165 of which are spent in lang::message::cache. https://openacs.org/api-doc/proc-view?proc=lang::message::cache&source_p=1

lang::message::cache is quite a simple proc, so I'm wondering if anyone has any clever suggestions on how to speed this up?

A couple of thoughts that occur to me, which I haven't yet tested out:
Use something else instead of db_foreach - I seem to recall that db_foreach is not recommended for long-running queries; is this correct?
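To make the idea concrete, one variant I was imagining would fetch all rows in a single call with db_list_of_lists and only then set the nsvs. This is an untested sketch; the query name, table, and column layout are assumptions loosely based on what lang::message::cache does, not the actual implementation:

```tcl
# Hypothetical variant of lang::message::cache: pull the whole result
# set into memory with db_list_of_lists instead of iterating with
# db_foreach, then populate the nsvs from the in-memory list.
# Table/column names and the nsv layout are illustrative assumptions.
proc lang::message::cache_fast {} {
    set rows [db_list_of_lists select_messages {
        select locale, package_key, message_key, message
        from lang_messages
    }]
    foreach row $rows {
        lassign $row locale package_key message_key message
        nsv_set lang_message_$locale "${package_key}.${message_key}" $message
    }
}
```

Whether this helps would depend on whether the per-row overhead of db_foreach is actually the bottleneck, which I haven't measured.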

Split into separate threads per package_key - not quite sure how to implement this.

Is there any way of "bulk loading" nsvs from a list or array, which might be quicker than setting them one at a time? I don't see anything like that mentioned here https://naviserver.sourceforge.io/n/naviserver/files/nsv.html
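On the bulk-loading question: unless I'm misreading the docs, NaviServer's nsv_array set takes a flat key/value list and loads a whole nsv array in one operation (and, I believe, one lock acquisition), which might be what I'm after. An untested sketch of how that could look here, with the same caveat that the query and nsv names are assumptions:

```tcl
# Sketch: group messages per locale into a flat key/value list, then
# load each locale's nsv array with a single "nsv_array set" call
# instead of one nsv_set per message key.
# Note: nsv_array set replaces the array's existing contents, which
# should be fine at server startup.
db_foreach select_messages {
    select locale, package_key, message_key, message
    from lang_messages
} {
    lappend buffer($locale) "${package_key}.${message_key}" $message
}
foreach locale [array names buffer] {
    nsv_array set lang_message_$locale $buffer($locale)
}
```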

Any other ideas?

Brian

Posted by Gustaf Neumann on
Dear Brian,

hmm, i don't think loading the nsvs is the problem, but rather the db performance of that machine.

i did a quick check on our production site with ~60k message keys (about twice as many as on your site). Loading these keys with the current OpenACS takes 1.03 secs; without loading the nsvs, it takes 0.95 secs, so the difference is small. This is tested with current OpenACS and PostgreSQL 10.9 on the same Linux machine running CentOS 7.6 (Xeon 6154 CPU @ 3.00GHz). The full start time of the server is 37 secs with 126 packages (time between "NaviServer ... starting" and "NaviServer ... running").
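For reference, a measurement like this can be reproduced from the developer-support shell (ds/shell) or a one-off scheduled proc; this is a sketch, not the exact command i used:

```tcl
# Time a full run of lang::message::cache.
# Tcl's "time" returns "<N> microseconds per iteration".
set us [lindex [time { lang::message::cache }] 0]
ns_log notice "lang::message::cache took [expr {$us / 1000.0}] ms"
```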

Quick check on a different site (openacs.org): 10996 message keys, 142ms with nsvs set, 122ms without. (Ubuntu 16.04, Xeon 4114 CPU @ 2.20GHz, PostgreSQL 9.6.4), full startup time 13sec.

These times are better by orders of magnitude. Could it be that you have the DB and NaviServer on different machines, and there is a networking bottleneck?
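A quick way to check for a per-round-trip networking bottleneck is to time a trivial query many times; the statement name "ping" below is just illustrative:

```tcl
# Rough probe of db round-trip latency: run 1000 trivial queries and
# average. With db and server on the same host this is typically well
# under a millisecond per query; much more suggests network overhead.
set us [lindex [time { db_string ping "select 1" } 1000] 0]
ns_log notice "avg db round trip: [expr {$us / 1000.0}] ms"
```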

-g

Posted by Brian Fenton on
Hi Gustaf

thanks for the response, your information was very helpful. I did some timings, and yes, it does appear to be the database that is at issue here. The DB and NaviServer are indeed on different virtual machines, so this is something we can look into more closely.

Thanks as always for the insights!
Brian

Posted by Chris Kill on
Thanks so much for this post.