Forum OpenACS Development: permission_p revisited

Collapse
Posted by Andrew Grumet on
There is a lot of good thinking in this memorable thread about acs_permission.permission_p: https://openacs.org/ bboard/q-and-a-fetch-msg.tcl?msg_id=0001Sm&topic_id=12& topic=OpenACS%204%2e0%20Design.

I'd like to revive that discussion since I'm tuning permission_p with real data (87,000 objects, many are 5 generations deep etc etc).

Right now permission_p is a series of 5 queries. The proc returns when it hits its first 't' or else returns 'f' after running all 5. Here is a quick summary with an indication of relative cost:

  1. Group permission: expensive
  2. Rel segment permission: expensive
  3. Direct permission: cheap
  4. Public permission: expensive
  5. Public-like permission: cheap
The cheap queries are cheap for the simple reason that they access just acs_object_grantee_priv_map without extra joins. Just for fun, here are some measurements I made using tkyte's timestamping code:
    GROUP_    REL_SEG     DIRECT      PUBIC PUBLIC_LIKE
---------- ---------- ---------- ---------- -----------
.281416667 .200166667    .003875 .272708333  .002166667
These are the average number of seconds spent on each query over about 24 calls.

Moving on...what can we do to make things better? One option is to go after the individual queries. This may yet bear some fruit.

[Aside: One thing I don't grok is the join with the users table in the "public" query. When a non-registered member of the public makes a request, the auth system will return 0 as their user_id. But the users table doesn't have an entry for user_id=0, so this query really is checking for registered users, right? But if that's the case, why is it called "public" and why is the check for grantee_id=-1 (the public) ?]

Another thing we can do is re-arrange the queries for maximum efficiency. Since the public-like query is cheap and will very often return 't' (most sites have a lot of public pages), there may be a general benefit from putting it first. We might as well also put the other cheap query---direct permissions---near the top. After that the results would probably be mixed.

A third approach would be to either add extra functions or parametrize permission_p to give developers more control. If for example, I am willing to say that a particular permission should never be granted to the public, can I use that information to my benefit by calling a different function or skipping some of the queries?

I wanted to put these out ideas out here because I know there has been a lot of good thinking and this may not be reflected in CVS. I hope to generate discussion so we can recapture the good thinking, make some calls and get the changes in.

We have a big database of migrated content so I'm happy to do some trial and error and report back here.

Any takers?

Collapse
Posted by Barry Books on
I've got 600,000+ objects with lots of indirect permissions. I don't have any problems with the current permision_p function unless I put it in a where clause. I think the correct solution to that is don't put any plsql functions in a where clause unless you have a function index on it and creating a function index on permission_p is probably a bad idea.

I've found that acs_object.name and site_node.url are slower than permission_p in many cases.

Collapse
Posted by Andrew Grumet on
Hmm...interesting.  In our case we don't have permission_p in
the where clause, we're just getting a list of id's and
permission_p for each one.  Example: we display on the
homepage a list of classes I am taking.  We want to display a
little "admin" link to the administration page for each clas I am a
professor for.  So we compute whether or not I have admin
privilege for each class.  This can can maybe half a second per
row, bogging things down if I happen to be a professor for a
number of classes.

How long does an average permission check take on your
system?

Collapse
Posted by Yonatan Feldman on
here at openforce we have played around with reordering the individual queries and have noticed significant improvement in performance. we didn't commit this because we noticed differences in the oracle and postgresql versions of the function and i didn't
have time to try to figure out which one was correct.

the ordering we chose was direct (very fast), then public, then the rest.

Collapse
Posted by Barry Books on
just looking at a few I'd say about between 25 and 120ms. Even in a select clause that returns a lot of rows you can get in trouble. It might be better to join with all_party_privilege_map (I think that's the name)
Collapse
Posted by Don Baccus on
<i>all_object_party_privilege_map</i>
<p>
Yes, joining with this as part of your query that selects classes may well be faster.  It's worth a shot and since you're in measuring mode already, Andrew, you're a good candidate to take that shot :)
<p>
If you have time and want to report back it would add to our knowledge of how to work around permissions performance issues, regardless of the experiment being a failure or success.
<p>
As far as reordering goes ... you have some good, solid data and it sounds like OpenForce is getting some, too.  Jon Griffin and I both played around with ordering many moons ago and settled on what you see now.  I find it interesting that OpenForce appears to be discovering that the optimal order may differ between PG and Oracle.
<p>I don't have an answer for your question about the join with the users table in the public query.  It looks very strange, doesn't it?
Collapse
Posted by Vadim Nasardinov on
Hey Andrew,

Are you using the Java or the C version of Tom Kyte's code?

http://asktom.oracle.com/pls/ask/f?p=4950:8:::::F4950_P8_DISPLAYID:341817437103

Collapse
Posted by Andrew Grumet on
Re: joining directly with the big view.  Thanks for the suggestion!  Running the join is slower for one case at least, and only returns rows where a permission record exists (we also want rows if the user doesn't have permission).

I'm tied up with other stuff today, but I'll try to hit this again over the weekend.

Collapse
Posted by Andrew Grumet on
Vadim -- java.
Collapse
Posted by Yonatan Feldman on
Just wanted to clarify that we didn't notice a performance difference between Oracle and PG, we never tested PG. I wanted to commit the reordering changes but decided not to when I saw that the Oracle and PG versions of permission_p were different. The Oracle version has one more check (5 instead of 4) and they are already in different order. I didn't want to mess around with this back then so we just left it alone.

If someone wants to look into which one is correct or what is going on that would be great.

Collapse
Posted by Andrew Grumet on
Okay, so I just had the brilliant idea to look at the hardware I'm running my dev server on. It turns out that the machine has only 500Megs ram and one cpu. In other cases where performance was acceptable we had gigs of ram and multiple cpus, and this will certainly be the case with our prod environment. So I think we're out of emergency territory.

Nevertheless there is probably still room to improve permission_p:

-- rip out "public" check or figure out why it's there
-- move fast checks to top

Don, how do you suggest we proceed?

Collapse
Posted by Andrew Grumet on
The pg code for the public-like check is a little more transparent. Here is the logic in pseudo-code:
   IF (permission is granted to "the public") AND
      ( (person is logged in) OR (person is not logged in) )
   THEN
     return "true"
The pg code and oracle code compute the same results, the former with one block and the latter with two.

This is moderately confusing because we have an odd mix of data model logic and application logic. The auth system uses user_id=0 to designate someone who is not logged in*. The data model uses party_id=-1 to designate "the public" which includes everyone, logged-in or not.

Whether "non-logged-in person" and "the public" are semantically equivalent is an interesting debate but besides the point. Anyone with a rudimentary understanding of logic will quickly see that "a AND (b OR (not b))" is just "a". Thus we can elminate this part of the logic from both incarnations.

The only other significant difference between oracle and pg versions is ordering. Pg puts the fast stuff up front

  1. public
  2. direct
  3. group
  4. relseg
...which I think is the way to go.

*this despite the fact that select count(*) from users where user_id=0 returns no rows

Collapse
Posted by Andrew Grumet on
I glossed over a subtle detail above that deserves mention. The check for logged-in users implicitly assumes that public permissions can only be granted to parties with a user_id. Following my suggestion above could change the input/output behavior of permission_p in the following case:
  1. the party_id argument represents something other than a user
  2. the application calling permission_p assumes that only user parties should get permission via grants to "the_public"; groups and other types of parties should not get permission
In the existing system, public permissions to not extend to these non-user parties, so this part of the permission_p query would return "false". In the modified system, "the public" is equivalent to "any party" (which probably makes more sense).

Regarding the impact of such a change, though we often *grant* permission to non-user parties, I don't know of any cases where we *query* whether these types of parties have permission. Does anybody else?