Forum OpenACS Development: Should we time out the permissions cache?

I was just wondering if we should maybe timeout the permissions cache in a similar fashion as is done with client properties (ad_set_client_property)? After all, with lots of users, packages, and privileges quite a few permutations would be created. Say 10,000 users, 20 packages, and on average 3 privileges, that makes 600,000 entries. Of course, we are only caching a 1 or a 0 so maybe it's not too bad, I don't know.

The timeout could be a parameter that defaults to the session timeout. Or is this completely unnecessary? Is it sufficient that ns_cache gradually purges the oldest entries it has in the cache (this works, no?)

Posted by Jeff Davis on
The util_memoize cache is a size-limited cache, so it will not get too large, and it does LRU flushing. It might make sense to make this a separate cache from the util_memoize cache and make it a little smarter (and possibly think about just bulk flushing the entire cache on "dangerous" ops like revoking admin).
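A minimal Python sketch of the idea of a separate, size-limited cache with LRU flushing and a bulk flush for "dangerous" operations. This is an illustration only, not how util_memoize or ns_cache is implemented: the class name `LRUCache` and its methods are hypothetical, and the limit here counts entries, which is an assumption.

```python
from collections import OrderedDict


class LRUCache:
    """Hypothetical size-limited cache that evicts the least recently used entry."""

    def __init__(self, max_entries):
        # Assumption: the limit is a number of entries, not a byte size.
        self.max_entries = max_entries
        self._data = OrderedDict()

    def get(self, key, compute):
        """Return the cached value for key, calling compute() on a miss."""
        if key in self._data:
            # Mark the entry as most recently used.
            self._data.move_to_end(key)
            return self._data[key]
        value = compute()
        self._data[key] = value
        if len(self._data) > self.max_entries:
            # Drop the least recently used entry (the oldest in the ordering).
            self._data.popitem(last=False)
        return value

    def flush_all(self):
        """Bulk flush, e.g. after a "dangerous" operation such as revoking admin."""
        self._data.clear()

    def __len__(self):
        return len(self._data)
```

Keeping permissions in their own cache instance means a bulk flush does not throw away unrelated memoized results.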

Right now the permissions cache is really not ready for prime time and I would discourage people from using it unless they either a) don't care about security, b) don't really ever change permissions or c) are extremely careful to look at how their particular site works and think through the consequences of a permissions cache that does not really flush properly.

Posted by Don Baccus on
In my view the low-level permissions caching stuff is *extremely* fragile, and I would second everything Jeff has said.

For one thing if permissions change you're going to get different results from queries that look at permissions directly vs. calls to the Tcl API.  And it would be very inefficient to funnel all permissions checking through the Tcl API.

I think the permissions work I'm doing will speed permissions sufficiently so that the one-off perm checks cached by the Tcl API won't really be necessary.  I mentioned in another thread that I have safe and consistent caching of site nodes implemented at Greenpeace but simply forgot to migrate it over to OpenACS in time for 4.6 (bozo!)
This removes one perm check hit per page and helps a lot without sacrificing security.

At the higher query level - for instance a query that returns the contents of a folder - intelligent high-level caching is not only much more efficient but could be managed closely by the package in question to ensure consistent results.

Also my results with my partial rewrite of the permissions system are quite promising: 0.5 seconds to return a folder with 50 files fully checked for permissions (vs. about 33 seconds in 4.6).  That's on a P500 Celeron; on a more modern machine we'd be looking at a couple of tenths of a second for the full query, and that's without even trying to juice it further (which can be done, I want to attack file storage with a vengeance after 4.7 comes out).

File storage has historically been one of our worst performing packages, so if it can be made to be fast, I'd say anything we'll provide can be made to be fast.

OK, Peter, now that I'm scared by the low-level perm caching code that came from SloanSpace V2 I'll try to remember to e-mail you my db_* caching API soon!

Posted by Don Baccus on
On the size of the permissions table you're being somewhat pessimistic.  The inheritance system and group-wide granting of permissions help to cut the size down considerably.

SloanSpace V2 (dotLRN) has about 11,500 users, 16,000 mounted package instances (all those classes and forums inside classes etc), almost 550,000 objects and only 211,000 rows in the permissions table.

"only" 211,000 :) Hmmm... still much, much lower than your estimate, though.

In the rewrite I've done all queries are able to now take full advantage of indexes, so for SloanSpace V2 we're looking at 18 index probes to find a particular row - log2(211,000) rounds off to 18 I think if my mind's in order today.  We've got a join against one other large table and a couple of small ones, it's not that bad.
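A quick Python check of the probe estimate above. It treats the index as a plain sorted list searched by halving, which is a simplification of how a database index actually works; the function name `index_probes` is hypothetical.

```python
import math


def index_probes(rows):
    """Worst-case number of halving steps to find one row among `rows` sorted rows."""
    return math.ceil(math.log2(rows))


# 2**17 = 131,072 and 2**18 = 262,144, so 211,000 rows need 18 steps.
print(index_probes(211_000))  # prints 18
```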

Posted by Peter Marklund on
Don,
thanks for the statistics from SloanSpace and the promising figures for the new optimized permissions API. I look forward to using it!

Jeff,
I guess that for my homepage at least b) and c) apply, and maybe even a) 😊

I would have to test the optimized API to really tell; however, it still seems to make sense to me to cache permission_p. For instance, permission_p is invoked at least once per request by the request processor on the requested package instance. I like the fact that on my homepage I now have zero overhead queries invoked by the request processor.

I'm maybe missing something, but I don't see why caching is so fragile. Aren't packages using the Tcl API to grant and revoke permissions? If not, how hard would it be to make them do that?

Posted by Jeff Davis on
Peter, the thing you are missing is that in a lot of cases (in fact almost every case) the permission is not granted directly: a given user has or doesn't have it by virtue of being in the right group, and the object generally inherits the permission from another object.

To get it right, if you add or remove someone from a group you would have to flush all permissions for that user (strictly speaking you might be able to hold on to some but I think that would cost more than just flushing them all). If you change permission on an object you would need to flush all the ones on that object or any object that inherits permissions from that object.
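The two flushing rules described above can be sketched in Python as follows. This is a toy model, not the OpenACS implementation: the class `PermissionCache`, its methods, and the in-memory parent/child map are all hypothetical, and the inheritance tree is assumed to be known to the cache.

```python
from collections import defaultdict


class PermissionCache:
    """Hypothetical cache of (user_id, object_id, privilege) -> bool."""

    def __init__(self):
        self._cache = {}
        # Assumption: parent object_id -> ids of objects inheriting from it.
        self._children = defaultdict(set)

    def set_parent(self, child_id, parent_id):
        self._children[parent_id].add(child_id)

    def put(self, user_id, object_id, privilege, allowed):
        self._cache[(user_id, object_id, privilege)] = allowed

    def get(self, user_id, object_id, privilege):
        # None means "not cached"; the caller must then ask the database.
        return self._cache.get((user_id, object_id, privilege))

    def flush_user(self, user_id):
        """Group membership changed: drop everything cached for this user."""
        for key in [k for k in self._cache if k[0] == user_id]:
            del self._cache[key]

    def flush_object(self, object_id):
        """Permission changed on an object: drop it and all of its inheritors."""
        affected = set()
        stack = [object_id]
        while stack:
            current = stack.pop()
            if current in affected:
                continue
            affected.add(current)
            stack.extend(self._children[current])
        for key in [k for k in self._cache if k[1] in affected]:
            del self._cache[key]
```

Both flushes scan the whole cache, which hints at why a correct scheme gets expensive; permission changes made outside this API would still leave stale entries.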

As it stands, granting and revoking direct permissions via the Tcl API will flush properly, and I made a couple of other operations flush as well, but really getting it right is quite hard.

Posted by Peter Marklund on
Jeff,
you are right of course. I see now how caching permissions is really quite complex.

It may not be feasible to come up with a flushing scheme that will keep db and cache in sync at all times. However, I believe it is the case that permissions change very seldom. I also believe that for the majority of sites it is perfectly acceptable to have the cached permissions lag the db permissions by some time (say one or a couple of hours). If I'm not mistaken this approach was used by Don for both SloanSpace and Greenpeace.

How about adding another parameter for the timeout of the permissions cache? I don't know what the default value should be, maybe 60 minutes?
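A minimal Python sketch of such a timeout, assuming a single configurable value (here defaulting to 60 minutes) applied to every entry. The class `TimedPermissionCache`, the `check` method, and the `lookup` callback standing in for the database query are hypothetical names.

```python
import time


class TimedPermissionCache:
    """Hypothetical permission cache whose entries expire after a timeout."""

    def __init__(self, timeout_seconds=3600, clock=time.monotonic):
        # Assumption: one timeout parameter for the whole cache.
        self.timeout_seconds = timeout_seconds
        self._clock = clock
        self._entries = {}  # key -> (expires_at, allowed)

    def check(self, user_id, object_id, privilege, lookup):
        """Return the cached answer if still fresh, else ask lookup() again."""
        key = (user_id, object_id, privilege)
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        allowed = lookup(user_id, object_id, privilege)
        self._entries[key] = (now + self.timeout_seconds, allowed)
        return allowed
```

With this scheme a revoked permission can stay visible in the cache for up to `timeout_seconds`, which is the lag being traded against query load.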

/Peter

Posted by Dirk Gomez on
Peter,

A permissions cache would make most sense as long as you are "near" a certain object while surfing. I think I'm near an object for about 5 minutes  - maybe.

We are talking about a default value anyway, right? A programmer can set a timeout of 60 minutes for a subsite index page which I am likely to hit way more often.

Make the default small.

-- Dirk

Posted by Jun Yamog on
Hi,

Is there a possibility to cache for the duration of a single request only?  What I mean is this: the browser asks for a page, the server processes a query and caches the result.  If it encounters the same query again while processing the page, it gets the result from the cache.  Once the server finishes the response, the cache gets flushed.  The life span seems small, but for permissions it may be useful.  Or maybe not.
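A request-scoped cache of this kind might look like the following Python sketch. The names `RequestContext`, `cached_query`, and `handle_request` are hypothetical and only model the lifecycle; they are not part of any OpenACS API.

```python
class RequestContext:
    """Hypothetical per-request state; lives only while one page is served."""

    def __init__(self):
        self.cache = {}


def cached_query(ctx, key, run_query):
    """Run the query once per request; repeat calls reuse the stored result."""
    if key not in ctx.cache:
        ctx.cache[key] = run_query()
    return ctx.cache[key]


def handle_request(render):
    """Create a fresh context, render the page, then flush the cache."""
    ctx = RequestContext()
    try:
        return render(ctx)
    finally:
        # Nothing survives the request, so no stale permissions can leak.
        ctx.cache.clear()
```

Because the cache dies with the request, no flushing logic is needed; the saving is limited to queries repeated within one page.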

Posted by Dirk Gomez on
Why would one page ask the same permissioning query twice - Isn't that a flaw in the programming logic?

If not, then ad_conn could be the place.

Posted by Don Baccus on
Well ... I *know* how to make most of the request processor permissions checks go away for busy public sites ...

Cache whether or not a site node is "world" or "registered user" readable, check those first in the RP and only issue the explicit user check if they fail and the user_id is non-zero (i.e. they're logged in), kicking them out if the cached check fails and they're not logged in.

The only question is cache flushing.  There are a couple of possibilities:

1. hook it up to "performance mode" and tell people any perm changes on a site node will require a server restart or give them a "resynch cache" admin link (this is what's required if you change other attributes of a site node)

2. Have the site node map's "set permissions" link include a return_url to the perm page that tells it to flush the sitenode cache after a site node's perms have been changed.  This would make caching safe as long as people change perms using the site map admin UI, which is the logical place to do so.  One would have to assume things like changing security_inherit_p would also be done from there.
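The check order and the explicit flush described above can be sketched in Python. This is a hedged illustration rather than the request processor's actual code: `SiteNodeReadCache`, `may_read`, and the `load_flags` and `explicit_check` callbacks are hypothetical, and the only detail taken from the thread is that a user_id of 0 means "not logged in".

```python
ANONYMOUS_USER_ID = 0  # from the thread: a non-zero user_id means logged in


class SiteNodeReadCache:
    """Hypothetical cache of (world_readable, registered_readable) per site node."""

    def __init__(self, load_flags):
        # Assumption: load_flags(node_id) asks the database for both flags.
        self._load_flags = load_flags
        self._flags = {}

    def flags(self, node_id):
        if node_id not in self._flags:
            self._flags[node_id] = self._load_flags(node_id)
        return self._flags[node_id]

    def flush(self):
        """Called after a site node's permissions change (e.g. from the admin UI)."""
        self._flags.clear()


def may_read(cache, node_id, user_id, explicit_check):
    """Try the cached public flags first; fall back to a per-user check."""
    world_readable, registered_readable = cache.flags(node_id)
    if world_readable:
        return True
    if user_id == ANONYMOUS_USER_ID:
        # Cached check failed and nobody is logged in: kick them out.
        return False
    if registered_readable:
        return True
    return explicit_check(user_id, node_id)
```

On a busy public site most requests would end at the first or third branch, so the explicit per-user query is only issued for restricted nodes.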

Posted by Jun Yamog on
Hi Dirk,

I am thinking of queries like the ones Peter mentioned in another thread, such as the site nodes query, which I think is run twice.  Anyway, I think your suggestion of ad_conn is a good place to put those small cached items.  Or ad_set_client_property.