Forum OpenACS Development: Should we time out the permissions cache?
The timeout could be a parameter that defaults to the session timeout. Or is this completely unnecessary? Is it sufficient that ns_cache gradually purges the oldest entries it holds in the cache (this works, no?)
Right now the permissions cache is really not ready for prime time and I would discourage people from using it unless they either a) don't care about security, b) don't really ever change permissions or c) are extremely careful to look at how their particular site works and think through the consequences of a permissions cache that does not really flush properly.
For one thing if permissions change you're going to get different results from queries that look at permissions directly vs. calls to the Tcl API. And it would be very inefficient to funnel all permissions checking through the Tcl API.
I think the permissions work I'm doing will speed up permissions sufficiently so that the one-off perm checks cached by the Tcl API won't really be necessary. I mentioned in another thread that I have safe and consistent caching of site nodes implemented at Greenpeace but simply forgot to migrate it over to OpenACS in time for 4.6 (bozo!)
This removes one perm check hit per page and helps a lot without sacrificing security.
At the higher query level - for instance a query that returns the contents of a folder - intelligent high-level caching is not only much more efficient but could be managed closely by the package in question to ensure consistent results.
Also, my results with my partial rewrite of the permissions system are quite promising: 0.5 seconds to return a folder with 50 files fully checked for permissions (vs. about 33 seconds in 4.6). That's on a P500 Celeron; on a more modern machine we'd be looking at a couple of tenths of a second for the full query, and that's without even trying to juice it further (which can be done, I want to attack file storage with a vengeance after 4.7 comes out).
File storage has historically been one of our worst performing packages, so if it can be made fast, I'd say anything we provide can be made fast.
OK, Peter, now that I'm scared by the low-level perm caching code that came from SloanSpace V2 I'll try to remember to e-mail you my db_* caching API soon!
SloanSpace V2 (dotLRN) has about 11,500 users, 16,000 mounted package instances (all those classes and forums inside classes etc), almost 550,000 objects and only 211,000 rows in the permissions table.
"only" 211,000 :) Hmmm... still much, much lower than your estimate, though.
In the rewrite I've done all queries are able to now take full advantage of indexes, so for SloanSpace V2 we're looking at 18 index probes to find a particular row - log2(211,000) rounds off to 18 I think if my mind's in order today. We've got a join against one other large table and a couple of small ones, it's not that bad.
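The probe count above can be checked with a couple of lines. This is only a sanity check of the arithmetic, assuming the index lookup behaves like a binary search over the rows; a real B-tree index has a much higher fan-out per page, so the number of page reads would be smaller.

```python
import math

# Row count of the SloanSpace V2 permissions table, as quoted in the post above.
rows = 211_000

# Comparisons a binary search needs to locate one key among `rows` sorted keys.
probes = math.ceil(math.log2(rows))
print(probes)  # 18
```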
thanks for the statistics from SloanSpace and the promising figures for the new optimized permissions API. I look forward to using it!
I guess that for my homepage at least b) and c) apply, and maybe even a)
I would have to test the optimized API to really tell; however, it still seems to make sense to me to cache permission_p. For instance, permission_p is invoked at least once per request by the request processor on the requested package instance. I like the fact that on my homepage I now have zero overhead queries invoked by the request processor.
I'm maybe missing something, but I don't see why caching is so fragile. Aren't packages using the Tcl API to grant and revoke permissions? If not, how hard would it be to make them do that?
To get it right, if you add or remove someone from a group you would have to flush all permissions for that user (strictly speaking you might be able to hold on to some but I think that would cost more than just flushing them all). If you change permission on an object you would need to flush all the ones on that object or any object that inherits permissions from that object.
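The two flushing rules described above can be sketched as follows. This is a toy in-memory model, not the OpenACS implementation: the class name, the `parent_of` inheritance map, and the method names are all hypothetical, and it ignores complications such as nested groups or switching inheritance on and off.

```python
class PermissionCache:
    """Toy cache keyed by (user_id, object_id, privilege). Hypothetical names."""

    def __init__(self, parent_of):
        # parent_of maps an object to the object it inherits permissions from
        # (None for a root object). This stands in for the real object hierarchy.
        self.parent_of = parent_of
        self.entries = {}

    def put(self, user_id, object_id, privilege, allowed):
        self.entries[(user_id, object_id, privilege)] = allowed

    def get(self, user_id, object_id, privilege):
        # Returns None on a cache miss.
        return self.entries.get((user_id, object_id, privilege))

    def flush_user(self, user_id):
        """Group membership changed: drop everything cached for that user."""
        self.entries = {k: v for k, v in self.entries.items() if k[0] != user_id}

    def _inherits_from(self, object_id, ancestor_id):
        # Walk up the inheritance chain; `seen` guards against cycles.
        seen = set()
        while object_id is not None and object_id not in seen:
            if object_id == ancestor_id:
                return True
            seen.add(object_id)
            object_id = self.parent_of.get(object_id)
        return False

    def flush_object(self, object_id):
        """Permissions on an object changed: drop entries for that object
        and for every object that inherits from it."""
        self.entries = {
            k: v for k, v in self.entries.items()
            if not self._inherits_from(k[1], object_id)
        }


# Usage: "file" inherits from "folder"; "other" stands alone.
cache = PermissionCache({"folder": None, "file": "folder", "other": None})
cache.put(1, "file", "read", True)
cache.put(2, "file", "read", True)
cache.put(1, "other", "read", True)

cache.flush_user(2)           # user 2 was removed from a group
cache.flush_object("folder")  # permissions on the folder were changed
```

Even in this toy form, `flush_object` has to scan every cached entry and walk its inheritance chain, which gives a feel for why getting the flushing right is both costly and easy to get wrong.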
As it stands, granting and revoking direct permissions via the Tcl API will flush properly, and I made a couple of other operations flush as well, but really getting it right is quite hard.
you are right of course. I see now how caching permissions is really quite complex.
It may not be feasible to come up with a flushing scheme that will keep db and cache in sync at all times. However, I believe it is the case that permissions change very seldom. I also believe that for the majority of sites it is perfectly acceptable to have the cached permissions lag the db permissions by some time (say one or a couple of hours). If I'm not mistaken this approach was used by Don for both SloanSpace and Greenpeace.
How about adding another parameter for the timeout of the permissions cache? I don't know what the default value should be, maybe 60 minutes?
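A timeout of this kind could look roughly like the sketch below. It is a generic illustration, not the ns_cache API: the class name `TimedCache`, the `ttl_seconds` parameter, and the injectable `clock` are all assumptions made for the example.

```python
import time


class TimedCache:
    """Cache whose entries expire ttl_seconds after they were stored."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock      # injectable so the expiry can be tested
        self.entries = {}       # key -> (value, expires_at)

    def get(self, key, compute):
        now = self.clock()
        hit = self.entries.get(key)
        if hit is not None and hit[1] > now:
            return hit[0]       # still fresh: no database round trip
        value = compute()       # expired or missing: ask the database again
        self.entries[key] = (value, now + self.ttl)
        return value


# Usage with a fake clock and a stand-in for the real permission query.
now = [0.0]
db_calls = []

def check_permission():
    db_calls.append(1)
    return True

cache = TimedCache(ttl_seconds=3600, clock=lambda: now[0])
key = (42, 1001, "read")        # (user_id, object_id, privilege)
cache.get(key, check_permission)
cache.get(key, check_permission)    # served from the cache
calls_before_expiry = len(db_calls)

now[0] = 3601.0                     # just over 60 minutes later
cache.get(key, check_permission)    # entry expired, the check runs again
calls_after_expiry = len(db_calls)
```

With this scheme a revoked permission can stay visible for at most one timeout period, which is the lag being traded for fewer queries.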
A permissions cache would make most sense as long as you are "near" a certain object while surfing. I think I'm near an object for about 5 minutes - maybe.
We are talking about a default value anyway right? A programmer can set a timeout of 60 minutes for a subsite index page which I am likely to hit way more often.
Make the default small.
Is there a possibility to cache for the duration of a single request only? What I mean is this: the browser asks for a page, the server processes a query and caches the result. If the same query comes up again while the page is being processed, the result comes from the cache. Once the server finishes the response, the cache is flushed. The life span is short, but for permissions it may be useful. Or maybe not.
If not, then ad_conn could be the place.
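The request-scoped idea can be sketched like this. It is a language-neutral illustration in Python, not ad_conn itself: the `Request` class, its `memo` dictionary, and `cached_query` are hypothetical names standing in for whatever per-connection storage the server provides.

```python
class Request:
    """Stand-in for per-connection state; it lives only as long as one request."""

    def __init__(self):
        self.memo = {}


def cached_query(request, key, run_query):
    # Run the query the first time it is needed in this request,
    # then reuse the stored result for the rest of the request.
    if key not in request.memo:
        request.memo[key] = run_query()
    return request.memo[key]


# Usage: the same permission check is needed twice while building one page.
db_calls = []

def check_read_permission():
    db_calls.append(1)
    return True

first_request = Request()
cached_query(first_request, ("read", 1001), check_read_permission)
cached_query(first_request, ("read", 1001), check_read_permission)
calls_in_first_request = len(db_calls)

# A new request starts with an empty memo, so nothing stale carries over.
second_request = Request()
cached_query(second_request, ("read", 1001), check_read_permission)
calls_after_second_request = len(db_calls)
```

Because the memo is discarded with the request, no flushing logic is needed at all; the saving is limited to checks repeated within one page.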
Cache whether or not a site node is "world" or "registered user" readable, check those first in the RP, and only issue the explicit user check if they fail and the user_id is non-zero (i.e. they're logged in), kicking them out if the cached check fails and they're not logged in.
The only question is cache flushing. There are a couple of possibilities:
1. hook it up to "performance mode" and tell people any perm changes on a site node will require a server restart or give them a "resynch cache" admin link (this is what's required if you change other attributes of a site node)
2. Have the site node map's "set permissions" link include a return_url to the perm page that tells it to flush the sitenode cache after a site node's perms have been changed. This would make caching safe as long as people change perms using the site map admin UI, which is the logical place to do so. One would have to assume things like changing security_inherit_p would also be done from there.
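The check order and the flush hook discussed above might look roughly like this. It is a sketch under assumptions: the cache is taken to be fully loaded for every site node (for example at server start), and the names `public_read`, `may_read`, `flush_site_node_cache`, and `explicit_check` are hypothetical, not part of the request processor.

```python
# node_id -> (world_readable, registered_users_readable)
# Assumed to be loaded for every site node before requests are served.
public_read = {}


def may_read(node_id, user_id, explicit_check):
    world_ok, registered_ok = public_read.get(node_id, (False, False))
    if world_ok:
        return True                 # anyone may read, no query needed
    if user_id == 0:
        return False                # cached check failed and not logged in
    if registered_ok:
        return True                 # any logged-in user may read
    # Only now pay for the per-user permission query.
    return explicit_check(user_id, node_id)


def flush_site_node_cache():
    # What a "resynch cache" admin link, or the return_url of the
    # site map permissions page, would trigger before a reload.
    public_read.clear()


# Usage
public_read.update({
    "home": (True, True),
    "members": (False, True),
    "admin": (False, False),
})
explicit_calls = []

def explicit_check(user_id, node_id):
    explicit_calls.append((user_id, node_id))
    return user_id == 7             # pretend only user 7 has been granted read
```

The explicit per-user query is reached only for logged-in users on nodes that are neither world nor registered-user readable, which matches the ordering described in the post.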
I am thinking of queries like the one Peter mentioned on another thread, such as the site nodes query, which I think is run twice. Anyway, I think your suggestion of ad_conn is a good place to put those small cached items. Or ad_set_client_property.