Forum OpenACS Q&A: How do I cache a big database request?

I have a db_multirow that seems to take a while, or at least the page takes a long time to return.  I want to cache the database result, and just have it get updated when the contents of the database change.
First, I looked at util_memoize, but I don't see an easy way to get it to work with db_multirow.
Second, I thought to myself that perhaps the database (PostgreSQL 7.2.3) is smart and caches the results in RAM.  So I went to examine the log to see how much of the delay is due to the database, how much to Tcl and the RP, and how much to network transmission time.  But all of the log entries (in debug mode) corresponding to a request have exactly the same timestamp ([20/Feb/2003:18:17:44][15424.5126] - what exactly does the second block mean?).

So:
1) How do I use util_memoize, which caches the result of a function call, in conjunction with db_multirow, which works by stuffing a variable?

2) Does the database cache this sort of thing?  If so, how can I tell?

Posted by mark dalrymple on
[20/Feb/2003:18:17:44][15424.5126] - what exactly does the second block mean?

15424 is the process ID of the AOLserver that wrote the log statement, and 5126 is a thread ID.  The thread ID doesn't necessarily map to anything useful, but it can be handy for correlating log entries with the activity of a particular connection thread.

Posted by Dave Bauer on
You can wrap the query in a Tcl proc that takes the query criteria as parameters, then util_memoize the call to that proc. For example, from ETP:
    set code "etp::get_pa [ad_conn package_id] $name $content_type"
    array set pa [util_memoize $code $max_age]
This memoizes the attributes of an ETP page. It's from the etp::get_page_attributes proc.
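
To make that pattern concrete, here is a rough sketch of what such a proc could look like. The proc name my_pkg::get_pa, the table, and the query are made up for illustration (ETP's real proc differs); the point is that the proc returns a flat name/value list that util_memoize can cache and the caller can read back with "array set", exactly as in the call above.

    ad_proc my_pkg::get_pa { package_id name content_type } {
        Return page attributes as a flat name/value list so the result
        can be cached by util_memoize and read back with "array set".
    } {
        set result [list]
        db_foreach get_attributes {
            select attribute, value
            from   my_page_attributes
            where  package_id   = :package_id
            and    name         = :name
            and    content_type = :content_type
        } {
            lappend result $attribute $value
        }
        return $result
    }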
Posted by Peter Marklund on
Joel,
you can use util_memoize in conjunction with util_multirow_to_list (to stuff the query result into the cache) and util_list_to_multirow (to reconstruct the multirow). See this thread:

https://openacs.org/forums/message-view?message_id=77562

Basically, I created a proc that returns the list representation of the query result, wrapped the call to that proc in util_memoize, and then applied util_list_to_multirow to get the multirow back.
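
As a rough sketch of that approach (the proc, statement, and table names here are made up; the two util_* helpers are the ones named above):

    ad_proc my_pkg::items_as_list {} {
        Build the multirow locally and return its list representation,
        which is what util_memoize actually caches.
    } {
        db_multirow -local items items_query {
            select item_id, title
            from   my_items
            order by title
        }
        return [util_multirow_to_list items]
    }

    # On the page: hit the cache (or run the query on a miss), then
    # rebuild the multirow for the template.
    util_list_to_multirow items [util_memoize my_pkg::items_as_list]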

For cache control, see /acs-admin/cache.

Posted by Don Baccus on
I'm adding the db_* caching API used at Greenpeace and the Berklee School of Music to contrib/misc in the OpenACS 4 tree.

This allows you to cache a multirow like this:

db_multirow -cache foo foo {query}

The only caveat is that, as written, you must use "$var" rather than the bind variable syntax normally used (":var") for any Tcl variable in the query that changes the result.  This is because the query string itself (after Tcl variables are replaced with their values) is used as the cache key.
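
For example (a hypothetical notes query, mirroring the form above; note the "$package_id" substitution rather than a ":package_id" bind variable, so the fully expanded SQL text can act as the cache key):

    db_multirow -cache foo foo "
        select note_id, title
        from   notes
        where  package_id = $package_id
    "

String values would need quoting by hand, since they are interpolated directly into the SQL.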

It's this kludgey approach that makes it not quite ready for prime time in a general way, but it has worked really well for GP and Berklee and is extremely simple to use.

It has a sledgehammer approach to flushing: call db_flush and the given cache pool is flushed of all entries.
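
So invalidation looks something like this, assuming db_flush takes the pool name as its argument (check the contrib code for the exact signature):

    # after an insert/update that would change the cached results
    db_flush foo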

For a general caching db API we need a better way of specifying cache keys and a less obnoxious approach to killing cache entries.  This is why I'm putting it in contrib/misc for now.

But you may find it useful as is in the short term!