Forum OpenACS Improvement Proposals (TIPs): TIP #71 (Rejected): Add Jon Griffin's new Paginator Procs to Core Release
a) Add Jon Griffin's new paginator code to future releases of OpenACS. Information available here:
b) Add support for this new paginator into List Builder.
1) It works really well.
2) The current paginator does not.
3) It is a lightweight alternative to List Builder for large data sets where simple pagination and navigation are all that is required.
1) Pagination in List Builder at present does not work well.
2) Supporting this within List Builder would help to avoid divergence in 'look and feel' and general display code structure which would increase workload later.
...so that the procs and queries are sourced at server start-up.
Once someone gets it working with the existing code I can make sure all the parts are there.
The most important thing is to get my query procs, which do all the work in the DB. With them I can do paginated sorts of 20K pages in under one second on both PG and Oracle.
The rest of the stuff, while nice if it is easy to add, can be added later or just a switch.
Again, the DB procs for paginating result sets in the DB will make your paginations much quicker.
Also, you are volunteering to do the work, right?
Jade: I'm going to look into this with forums starting tomorrow, if all goes well (I'm busy today).
Everyone: why are e-mail responses broken on this box? I remember seeing some comments regarding qmail earlier but ignored them ...
How can we provide examples in the core packages of correct usage of the technique?
Who will write documentation showing developers when to use multirow vs pagination vs list-template?
In the pagination approach, how does one sort, filter, or do bulk operations? Who will add Jon's documentation of his procs to the standard docs?
To be complete the new pagination procs should replace the existing ones in acs-templating and work with list-builder.
Pagination in list-builder should be fixed, perhaps by integrating what makes Jon's pagination so good. Otherwise we'll fall into the confusing "use list-builder because it's nice, but use this other (Jon's pagination) way if you have a large data set".
I am proposing a two-stage process.
It is not a lack of willingness that makes me hesitate to volunteer to do the integration, purely a lack of knowledge! I will agree to do the work, but I have no idea at all how long it will take me. I have to learn how to actually use list builder first, then work out how the pagination works and then try to modify it in line with Jon's code (which looks elegant and simple at first sight).
But 'in at the deep end' is often the best way!
It needs to be done since the current situation is no good for anyone.
When merging Jon's code, why not rename it (or place it in a namespace) such that there will be no conflict between the existing code (not in the core) and the new core code? At the same time deprecate the non-core code so that developers are aware that it will be removed at some point. In the meantime, no code should break.
With the willingness of Rocal and Richard guaranteed to work on it for 5.2, I approve (for 5.2).
However, looking through the list builder it seems to me that its job is to poke around in the variable scope of the calling Tcl and the display template to facilitate user modification of the underlying query. It seems from the comments that the intention was always to paginate in the db and so I wonder whether our supposed problem stems from a misuse of list builder rather than anything wrong with the code. If so we need to establish whether list builder builds queries using the postgres 'offset' and 'limit' keywords and then progress accordingly.
If it does then we should try to find where the performance issues or bugs stem from.
If it does not we need to change the paginator procs to support this way of doing it.
So in essence I guess I am now recommending voting against my own proposal! Sorry.
I will do some more work and then propose something more suitable.
I also understand how to make it paginate very quickly, and think I can hack it up to do so without having to modify any of the client code.
I think this will make use of Jon's pagination code unnecessary, i.e. we'll want to recommend use of list builder and deprecate use of the existing paginator.
The reason it's so slow now is that it uses the paginator but inefficiently. Rather than go into endless details I think I'll just fix it.
I would actually like to investigate writing a version of list builder that combines the multirow functionality afterwards. The current mechanism - call a list builder proc which then sets global structures for magic tcl procs you call embedded in .xql files to modify your query in ways that fit the needs of the listtemplate tag - seems like a gawdawful kludge to me. By combining the functionality it could, among other things, do reasonably efficient pagination using the display query itself wrapped appropriately in LIMIT/OFFSET rather than require a separate prepare-for-pagination query as is necessary now.
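As a rough illustration of the idea of reusing the display query itself for pagination, here is a minimal sketch in Python with SQLite. It is not List Builder code. The helper names `page_query` and `count_query` are hypothetical, and the sketch assumes the display query already ends in an ORDER BY and carries no LIMIT of its own.

```python
import sqlite3


def page_query(display_sql, page, page_size):
    """Append LIMIT/OFFSET to an existing, already ordered display query.

    Hypothetical helper: assumes display_sql has an ORDER BY and no LIMIT.
    """
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    offset = (page - 1) * page_size
    base = display_sql.strip().rstrip(";")
    return f"{base} LIMIT {int(page_size)} OFFSET {int(offset)}"


def count_query(display_sql):
    """Wrap the same display query to get the total row count,
    which is what a navigation bar needs to know how many pages exist."""
    base = display_sql.strip().rstrip(";")
    return f"SELECT count(*) FROM ({base}) AS display_rows"
```

With this shape the calling code supplies one query, and both the page of rows and the page count are derived from it, so no separate prepare-for-pagination query is needed.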
I knew that you were working on this (secretly hoped you would post here!!).
I can't tell you how relieved I am to hear that you have worked through this issue because it would have taken me many days and a lot of experimentation before I fully understood it. I had concluded that there seemed to be a lot of surplus munching going on and was thinking along the lines of proposing to do what you have just suggested.
From my reading so far, even if you specified all the page parameters correctly, the paginator was still asking postgres to run through the entire unfiltered dataset (actually more than once) before arranging the required page.
It seems to me that the best way to paginate data is to ask postgres to do it (as in Jon's code). We don't need to import Jon's code to achieve this but as you say, we would need to make sure that the result of the user clicking on a url is a query that returns only the data required for the selected page, and in the correct order - and nothing more.
I have used 'Limit' and 'Offset' for paginating a 3000 row dataset and it is a really fast way to restrict the results. So any means of exposing this functionality through list builder gets my vote.
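For readers who have not used this approach, a minimal sketch of asking the database for exactly one page follows. It uses Python with an in-memory SQLite database purely so it can run anywhere. The `threads` table, its columns, and the function names are made up for the example and are not taken from OpenACS or Jon's procs.

```python
import sqlite3


def make_demo_db(n_rows=3000):
    """Build a throwaway table of n_rows fake threads (illustrative schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE threads (id INTEGER PRIMARY KEY, subject TEXT)")
    conn.executemany(
        "INSERT INTO threads (id, subject) VALUES (?, ?)",
        [(i, f"thread {i}") for i in range(1, n_rows + 1)],
    )
    return conn


def fetch_page(conn, page, page_size):
    """Return only the rows for the requested page, newest first.

    The ORDER BY fixes the row order; LIMIT/OFFSET then restricts the
    result to one page, so nothing else is sent back to the caller.
    """
    offset = (page - 1) * page_size
    cur = conn.execute(
        "SELECT id, subject FROM threads ORDER BY id DESC LIMIT ? OFFSET ?",
        (page_size, offset),
    )
    return cur.fetchall()
```

Calling `fetch_page(conn, 2, 25)` on the demo table returns 25 rows starting with id 2975, and a page past the end simply returns an empty list.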
If there is anything that I can do to help please let me know.
Filling the cache is very slow if you have a large datastructure, for instance 5,600 threads in a forum like we do in the openacs Q&A forum. The query itself's not too bad, it's pulling the rows out of the rowset and stuffing them into an nsv cache variable that's taking most of the time.
The design philosophy behind the paginator seems to be "pay a fairly steep up-front cost so accessing every page afterwards costs roughly the same".
This isn't good. Normally people are going to reference more recent threads, blog entries, bugs sorted by some criteria, etc. LIMIT/OFFSET and Oracle ROWNUM tricks are faster for early entries in the rowset than for those at the end, but that's OK given common usage patterns. Besides, both Oracle and PG implement these constructs quite well, and we don't really care if accessing the very first thread in the openacs Q&A forum takes a couple of tenths of a second longer than accessing the most recent one.
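The trade-off can be put into rough numbers with a deliberately simplified model, sketched here in Python. It only counts rows handed from the database to the application and ignores sorting and index costs. The page size of 30 is an assumption made for the example, and the function names are hypothetical.

```python
import math


def page_count(total_rows, page_size):
    """Number of pages needed to show every row."""
    return math.ceil(total_rows / page_size)


def rows_read_eager(total_rows):
    """Cache-everything-first model: every row is pulled out of the
    rowset before any page can be served."""
    return total_rows


def rows_read_limit_offset(total_rows, page, page_size):
    """Simplified LIMIT/OFFSET model: the database steps past the
    offset rows and then returns at most one page."""
    offset = (page - 1) * page_size
    return min(total_rows, offset + page_size)
```

Under these assumptions, for the 5,600-thread forum the eager model touches 5,600 rows before showing anything, while the LIMIT/OFFSET model touches 30 rows for page 1 and only reaches 5,600 on the last of the 187 pages.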
To put it bluntly the paginator's a bit evil and I'll just rewrite list builder to not use it, except perhaps the bits that generate that nice navigation bar.
I'm actually rather astonished that this thread seems to have generated so much debate. The List Builder's built-in pagination sucks, this has been well known for a long time. Lars wrote all of List Builder, and if I remember correctly, he himself has said in these Forums that its built-in pagination sucks and should be fixed.
Jon Griffin's pagination procs are not "new", they've been around for at least a year now (maybe two). Several people have used them (I have not), and everyone who has seems to agree that they are a major improvement over the existing lousy List Builder pagination.
Joel, your heart is in the right place, but you're putting up straw men. If the toolkit is broken, it needs to be fixed - and AFAICT everyone who's examined the issue seems to agree that the List Builder pagination is definitely broken. If there is insufficient, contradictory, or just plain misleading documentation, then that is also a breakage that needs to be fixed - but that second breakage is in no way a justification for stopping someone from fixing the first breakage.
By all means, change the proc names or do whatever else is necessary or desirable to integrate Jon's pagination implementation smoothly into OpenACS - but do it! Now, if Don has figured out an even better solution, way cool, but please, somebody pick one of the improvements available and actually put it into the toolkit.
List Builder pagination will be efficient, and we don't want another pagination method because people should use List Builder.
Malte, I'll try to look at the bugtracker in the next few days, still plan on concentrating on forums through the next couple of days.