I've added a proc "ds_profile" to acs-developer-support, which will support in profiling code to find performance hog.
You invoke it like this:
ds_profile start name_of_suspicious_block
...
ds_profile stop name_of_suspicious_block
The name_of_suspicious_block is simply used to identify which start/stop commands match. It'll count the number of invocations of the block, calculate the total number of miliseconds spent in during those invocations, and calculate the average time per invocation.
The results will be output at the bottom of the page, if you have developer-support turned on.
I've used it to tune logger, where I have a page with 140 entries that take 15 seconds to load.
I've already made some optimizations as a consequence of this. For example, lc_numeric, which does thousand-separator formatting of numbers, calls lc_get, which is used to get the information on locale-specific formatting from the message catalog.
By eliminating the -locale switch and replacing it with a normal Tcl-style optional argument - {locale ""} - that alone shaved about 1 whole second off of the page load. I also inlined the nsv lookups to the message catalog, which sped up things even further.
So beware of switches to Tcl procs that gets called all over hte place. Processing switches isn't all that expensive, but when the proc is invoked many times (like lc_get getting invoked about 1500 times), it does add up.
I'll keep profiling and tuning, and I'll encourage you all to do the same, to track down our main performance hogs.
The conventional wisdom used to be that if only you avoided DB calls, you were fine. But it's not quite like that anymore.
/Lars