Forum OpenACS Development: Response to Caching of database queries ...

Posted by Michael Bryzek on
Don,

Your last post brought up a few new thoughts/questions:

It seems like you want to provide a generic user interface where the site wide admin can set global parameters related to caching. Is this realistic/desirable in practice or is it just a simple quick solution? I have found that the caching I've added tends to add confusion to the user experience. The only way I've been able to address this confusion is by explicitly providing controls for each thing that I've cached. Let's say the user is editing a live press release. If press releases are cached, I would want the cache controls directly available from the UI used to edit the press releases. Most of my users tend to create/edit items and then jump over to the public site to see how they look. If caching is enabled, they might be confused as to why their changes are not visible.

We could provide a more intuitive user experience by tying cache flushing into the user interface for the cached object. With these controls, I'm not sure we need global parameters like a default refresh time. Maybe your users have requested different functionality than mine - if so, please share :)

I agree that using composite keys is harder than it first seems. I'm also having a hard time understanding the real-life requirements for caching groups of objects together. In my experience, when I need to cache groups of objects together, I end up caching the end result of the top-level object only. For example, if I am going to cache my home page, and it contains a bunch of other expensive operations, I am perfectly happy caching the end result of building the home page (i.e. the string containing the page contents) and refreshing the entire string every N minutes. This strategy may address most of our needs.
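As a rough illustration of caching only the top-level result, here is a minimal sketch in plain Tcl. All names (`page_cache`, `expensive_fragment`, `build_home_page`, `home_page`) are hypothetical and not part of OpenACS; the fragments stand in for whatever expensive operations the real page performs, and the optional `now` argument exists only so the refresh logic can be exercised without waiting.

```tcl
# Hypothetical page cache: one entry holding the finished HTML string.
set ::page_cache(html) ""
set ::page_cache(built_at) 0
set ::fragment_calls 0

# Stand-in for an expensive operation (database query, portlet, ...).
proc expensive_fragment {name} {
    incr ::fragment_calls
    return "<div>$name</div>"
}

# Build the whole page from its expensive parts.
proc build_home_page {} {
    set html "<html><body>"
    foreach part {news calendar forums} {
        append html [expensive_fragment $part]
    }
    append html "</body></html>"
    return $html
}

# Return the cached page, rebuilding it only when it is older than
# refresh_minutes. Pass "now" (epoch seconds) to simulate the clock.
proc home_page {refresh_minutes {now ""}} {
    if {$now eq ""} { set now [clock seconds] }
    set age [expr {$now - $::page_cache(built_at)}]
    if {$::page_cache(html) eq "" || $age >= $refresh_minutes * 60} {
        set ::page_cache(html) [build_home_page]
        set ::page_cache(built_at) $now
    }
    return $::page_cache(html)
}
```

None of the individual fragments is cached or tracked here; they are simply rebuilt together whenever the page string expires.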

I'm also having a hard time understanding what dependencies you want to track between different cached queries. Everything I can remember caching has had pretty simple dependency trees and I have been perfectly happy tracking those dependencies myself. I would also be wary of implementing cache flushing based on package dependencies. It is perfectly acceptable, and quite likely, that my package will depend on acs-kernel as a package, but one individual query I cache will itself have no dependencies on acs-kernel.

For our Red Hat CCM project, we use OSCache from OpenSymphony as part of our caching strategy. Their concept is to cache a fragment of a JSP for some period of time or until an explicit flush is signalled. They also provide an events-based framework to receive notification when various cache events happen (e.g. notify me when this site node is flushed from the cache). If we added similar functionality to OACS, then as a developer, I could flush my own caches when something gets flushed from acs-kernel by registering a callback.
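To make the callback idea concrete, here is a minimal sketch in plain Tcl. It is not OSCache's API and not an existing OpenACS API; `cache_on_flush`, `cache_flush`, the `::cache` array, and the example keys are all hypothetical names chosen for illustration.

```tcl
# Hypothetical stores: cached values, and per-key lists of flush callbacks.
array set ::cache {}
array set ::flush_listeners {}

# Register a command to run whenever "key" is flushed.
proc cache_on_flush {key callback} {
    lappend ::flush_listeners($key) $callback
}

# Remove "key" from the cache and notify every registered listener.
proc cache_flush {key} {
    unset -nocomplain ::cache($key)
    if {[info exists ::flush_listeners($key)]} {
        foreach callback $::flush_listeners($key) {
            # Each callback receives the flushed key as its last argument.
            uplevel #0 $callback [list $key]
        }
    }
}

# Example: a package keeps its own entry in sync with a kernel-owned one.
set ::cache(site_node_42) "kernel-owned data"
set ::cache(my_package_menu) "derived from site node 42"

proc flush_my_menu {flushed_key} {
    unset -nocomplain ::cache(my_package_menu)
}

cache_on_flush site_node_42 flush_my_menu
cache_flush site_node_42
```

After the last call both entries are gone: the kernel flushed its own key, and the package's callback removed the entry that depended on it. The dependency is declared by the developer who owns the dependent cache, not inferred from package dependencies.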

This strategy has worked quite well for us so far. Here's a possible example of how this might work in OACS:

set user_name [util_cache -key user_name_cache_${user_id} -refresh 60 {
  return [db_string select_user_name {
    select display_name from acs_objects where object_id = :user_id
  }]
}]

The inner block could be as complicated as we need, but the end result would be a single cached string. The developer is responsible for any dependencies and key management. Key management might be a bit more work for the developer at first, but down the road it makes it explicit how to flush entries. Explicitly defined keys also make it clear how to flush multiple entries at one time.
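For discussion, here is one way the proposed `util_cache` could behave, written as a minimal in-memory sketch in plain Tcl. The implementation details are assumptions: `-refresh` is treated as seconds, the store is a simple global array (a real version would need a shared, thread-safe store), and `util_cache_flush` is a hypothetical helper that shows how explicit keys allow several entries to be flushed with one glob pattern.

```tcl
# Hypothetical in-memory store: key -> {expires_at value}
array set ::util_cache_store {}

proc util_cache {args} {
    # Expected form: util_cache -key K ?-refresh SECONDS? script
    set script [lindex $args end]
    array set opt {-refresh 60}
    array set opt [lrange $args 0 end-1]
    set key $opt(-key)
    set now [clock seconds]

    # Serve the cached value while it is still fresh.
    if {[info exists ::util_cache_store($key)]} {
        lassign $::util_cache_store($key) expires value
        if {$now < $expires} {
            return $value
        }
    }

    # Run the block in the caller's scope. Code 0 is a plain result,
    # code 2 means the block used "return"; anything else is re-raised.
    set code [catch {uplevel 1 $script} value]
    if {$code != 0 && $code != 2} {
        error $value
    }
    set ::util_cache_store($key) [list [expr {$now + $opt(-refresh)}] $value]
    return $value
}

# Flush every entry whose key matches a glob pattern.
proc util_cache_flush {pattern} {
    array unset ::util_cache_store $pattern
}
```

With keys named as in the example above, `util_cache_flush user_name_cache_42` would flush one user's entry and `util_cache_flush user_name_cache_*` would flush all of them.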

To summarize:

  • Do we need a generic "Cache administration" page or should this functionality be integrated with the user interface for each object we are caching?
  • Can we do away with auto-generated cache keys? They add complexity to the caching system and creating them explicitly shouldn't be too difficult.
  • Is "cache a group of objects and refresh together" a unique requirement or is it something we can support by saying "Cache the top-level object only"?
  • Is it really too much work for developers to track their own cache dependencies?
  • Should we add support for caching a portion of a generated HTML page?