We're running into some scaling issues on a .LRN site with a large number (>33,000) of site nodes.
Each call to site_node::update_cache takes about 10 seconds. Creating a new .LRN class calls this proc 22 times, so roughly 220 seconds go to cache updates alone. Overall, creating a new class takes about 6 minutes, during which the nsd process maxes out the CPU.
Most of the time in site_node::update_cache is spent in the four calls to "array set" at the top and in the four calls to "nsv_array reset" at the bottom. This is not surprising given the size of the data structures involved.
The question that occurs to me is, do we have to deal with the entire site map atomically, or can we cache at the individual node/url level?
Looking back through old code, we actually did it this way up until late November 2003. Only with r1.48 of site-nodes-procs.tcl did we begin to deal with the site map as a whole. Timo's commit message says "populate site-nodes-cache by resolving urls in tcl instead of using the slow plsql". I'm not yet clear on whether this change strictly requires whole-map caching rather than individual-url caching. But from a scaling point of view, I think we want to go back to individual caching if possible.
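To make the per-url idea concrete, here is a rough sketch in Python rather than Tcl, just to show the shape of it. Everything in it is hypothetical: SITE_NODES_DB stands in for the site_nodes table, fetch_node_from_db for a single-row query, and none of the names are taken from site-nodes-procs.tcl. The point is only that a lookup or an update touches one cache entry (plus its ancestors on a miss) and never copies the whole map.

```python
# Hypothetical stand-in for the site_nodes table: url -> node_id.
SITE_NODES_DB = {
    "/": 1,
    "/dotlrn/": 2,
    "/dotlrn/classes/": 3,
}

_cache = {}                  # url -> node_id (or None), filled one entry at a time
stats = {"db_lookups": 0}    # counts simulated database hits


def fetch_node_from_db(url):
    """Simulated single-row query; returns None if no node is mounted at url."""
    stats["db_lookups"] += 1
    return SITE_NODES_DB.get(url)


def get_node(url):
    """Return the node_id for an exact url, loading it into the cache on a miss."""
    if url not in _cache:
        # Misses are cached too (as None), so repeated lookups stay cheap.
        _cache[url] = fetch_node_from_db(url)
    return _cache[url]


def resolve(url):
    """Find the closest mounted node for url by trimming trailing path segments."""
    # Start from the directory part of the url, e.g. "/a/b/index" -> "/a/b/".
    candidate = url if url.endswith("/") else url.rsplit("/", 1)[0] + "/"
    while True:
        node_id = get_node(candidate)
        if node_id is not None:
            return candidate, node_id
        if candidate == "/":
            raise KeyError(url)
        # Move up one level, e.g. "/a/b/" -> "/a/".
        candidate = candidate.rstrip("/").rsplit("/", 1)[0] + "/"


def update_cache(url):
    """Refresh a single entry instead of rebuilding the whole site map."""
    _cache[url] = fetch_node_from_db(url)
```

With this layout, mounting a new node only needs update_cache on the new url, which also replaces any cached miss for it. The sketch does not cover operations that need a node's list of children; those would need their own per-node entries or a query.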
Comments?