Forum OpenACS Q&A: Re: Using list-builder and order by

Collapse
Posted by Richard Hamilton on
I think that it is worth noting on this thread that Don has done a lot of recent work (late 2004) on the paginator that is integrated into list-builder to address the performance issues whilst retaining the caching capabilities.

Here are his three most relevant posts:
____________________________________________________________
1)
Richard ... while openacs.org was down for upgrade this weekend I worked on pagination for the forums package, using list builder, and now understand it thoroughly.

I also understand how to make it paginate very quickly, and think I can hack it up to do so without having to modify any of the client code.

I think this will make use of Jon's pagination code unnecessary, i.e. we'll want to recommend use of list builder and deprecate use of the existing paginator.

The reason it's so slow now is that it uses the paginator but inefficiently. Rather than go into endless details I think I'll just fix it.

I would actually like to investigate writing a version of list builder that combines the multirow functionality afterwards. The current mechanism - call a list builder proc which then sets global structures for magic tcl procs you call embedded in .xql files to modify your query in ways that fit the needs of the listtemplate tag - seems like a gawdawful kludge to me. By combining the functionality it could, among other things, do reasonably efficient pagination using the display query itself wrapped appropriately in LIMIT/OFFSET rather than require a separate prepare-for-pagination query as is necessary now.
____________________________________________________________
Then later:

2)
The paginator, which listbuilder is using, requires you to first fill a cache of row ids for the entire data structure being displayed. You then query display information using specific row ids from that cache.

Filling the cache is very slow if you have a large datastructure, for instance 5,600 threads in a forum like we do in the openacs Q&A forum. The query itself's not too bad, it's pulling the rows out of the rowset and stuffing them into an nsv cache variable that's taking most of the time.

The design philosphy behind the paginator seems to be "pay a fairly steep up-front cost so accessing every page afterwards costs roughly the same".

This isn't good. Normally people are going to reference more recent threads, blog entries, bugs sorted by some criteria, etc. LIMIT/OFFSET and Oracle ROWNUM tricks are faster for early entries in the rowset than those at the end, but that's OK given common usage patterns. Besides both Oracle and PG implement these constructs quite well, we don't really care if accessing the very first thread in the openacs Q&A forum takes a couple of tenths of a second longer than accessing the most recent one.

To put it bluntly the paginator's a bit evil and I'll just rewrite list builder to not use it, except perhaps the bits that generate that nice navigation bar.
____________________________________________________________

and finally:

3)
I've commited my improvements to the list builder's pagination code. Essentially I've sped it up by creating a separate paginator instance for each PAGE GROUP rather than one for the ENTIRE LIST. This means that on openacs.org's Q&A forum, rather than filling a cache of 5500 entries whenever the cache is flushed, we're only filling 330, using LIMIT+OFFSET/ROWNUM tricks. We then get the benefit of having cached the keys for the current page group until there's another post in the case of forums (not all clients of the list builder enable caching, of course). I've also gotten rid of the Tcl-based sorting of the keys returned for use in the page display query's IN (keys) clause, a minor speedup. There's probably still a lot of room for improvement, i.e. I can make lighter-weight list builder specific code to execute and cache the pagination query, etc. I'll poke at it periodically. Currently the OpenACS Q&A forum takes about 1.6 seconds to display if the cache is flushed. It took about 18 seconds using the existing list builder. When the cache is available it takes about 750 milliseconds. Much of that seems to be running the listtemplate tag as the shorter the number of rows, the faster it goes. Notice that performance of this can be impacted by changing the number of rows per page, and page groups per navigation widget, so I'll probably add these as parameters to forums at some point. You get quicker page displays by shrinking those numbers. And of course I haven't added the performance improvements to forums itself that I have in mind ... those will help, too.
____________________________________________________________

So the current position is that List Builder pagination now works acceptably and is scaleable. However Don can think of ways to further improve things.