Forum OpenACS Development: Re: Scalability in site node initialization routine

Posted by Ola Hansson on
When I am running this command in the OpenACS shell...
time { site_node::get_children -all -package_key curriculum -node_id _main-site-node-id_ } 10
...I get the following timings on my Athlon XP2500 with 1 GB RAM:

- 10 nodes
existing proc: 1700 microseconds/iteration
changed proc: 800 microseconds/iteration

- 1123 nodes
existing proc: 170000 microseconds/iteration
changed proc: 85000 microseconds/iteration

- 2233 nodes
existing proc: 340000 microseconds/iteration
changed proc: 180000 microseconds/iteration

So the speed is only about doubled (roughly 2.1x, 2.0x and 1.9x for the three cases), and the gain seems to decrease as the number of nodes increases. This doesn't seem too worthwhile, but I don't know ...

One other thing that I think generally slows down "get_children" unnecessarily (although not in this case) is the second foreach loop over the child urls that passed the filter, which runs when you ask for a specific element in the returned list instead of the default url. It would be rather easy to move (or copy) that part of the code into the "if $passed_p" clause in the main filter loop, I think.
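To make the idea in the paragraph above concrete, here is a minimal sketch, not the actual site_node code. It assumes Tcl 8.5+ (for dict), uses a plain dict as a stand-in for the cached site node data, and the proc names children_two_pass and children_one_pass are hypothetical. It only shows the difference between filtering first and extracting the element in a second loop, versus extracting it as soon as the filter passes.

```tcl
# Hypothetical stand-in for the site node cache: url -> node properties.
# (Assumption: the real proc reads these from an nsv array, not a dict.)
set nodes [dict create \
    /a/ {node_id 1 package_key curriculum} \
    /b/ {node_id 2 package_key forums} \
    /c/ {node_id 3 package_key curriculum}]

# Two passes: collect the urls that pass the filter,
# then loop over them again to pull out the requested element.
proc children_two_pass {nodes package_key element} {
    set passed [list]
    dict for {url node} $nodes {
        if {[dict get $node package_key] eq $package_key} {
            lappend passed $url
        }
    }
    set result [list]
    foreach url $passed {
        lappend result [dict get $nodes $url $element]
    }
    return $result
}

# One pass: extract the requested element inside the filter loop,
# at the point where the node is known to have passed.
proc children_one_pass {nodes package_key element} {
    set result [list]
    dict for {url node} $nodes {
        if {[dict get $node package_key] eq $package_key} {
            lappend result [dict get $node $element]
        }
    }
    return $result
}

puts [children_two_pass $nodes curriculum node_id]
puts [children_one_pass $nodes curriculum node_id]
```

Both procs return the same list; the second simply avoids building and re-walking the intermediate list of urls.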

Also, is it intentional that "-exact" is not being used in the call to "get_children" in the last foreach, or is it a mistake? I fail to see why this call should differ from the one in the filter loop, at least.

Posted by Jeff Davis on
I committed a revised version of site_node::get_children to HEAD, which is about 3 times faster on my local install (with about 100 nodes).

The big win was not doing the array set twice, and not copying the full list of child_urls unless it is needed. And in a somewhat gratuitous optimization, I got rid of the regexp used to figure out whether a node is an immediate child (which I think saved 2 usec/node).
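For illustration only, here is one way an immediate-child test could be written without regexp. This is a sketch under assumptions, not the code that was committed: both proc names are hypothetical, and it assumes site node urls always end in a slash (for example "/a/" and "/a/b/"). The regexp variant is included only so the two can be compared.

```tcl
# Regexp variant: the child url must be the parent url followed by
# exactly one path segment and a trailing slash.
proc immediate_child_regexp_p {parent_url child_url} {
    return [regexp "^${parent_url}\[^/\]+/\$" $child_url]
}

# String-only variant: after the parent prefix, the first slash found
# must also be the last character of the child url.
proc immediate_child_p {parent_url child_url} {
    set plen [string length $parent_url]
    # The child url has to start with the parent url.
    if {![string equal -length $plen $parent_url $child_url]} {
        return 0
    }
    # Index of the first slash after the prefix (-1 if there is none).
    set idx [string first / $child_url $plen]
    # Require a non-empty segment and no further segments after it.
    return [expr {$idx > $plen && $idx == [string length $child_url] - 1}]
}

foreach child {/a/b/ /a/b/c/ /a/ /ab/} {
    puts "$child: [immediate_child_p /a/ $child]"
}
```

Whether this is actually faster than the regexp would need to be measured with the same "time" approach used earlier in the thread.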

The -exact did not really matter: the lookup tries an exact match first, and since the names came from the nsv in the first place there is always an exact match (although I guess if you deleted a node while this call was running you might get the wrong result there). In any case, I switched it to do a direct nsv_get.