As I had a long commute today, I had some time to think about
it and here's the result. Consider it food-for-thought.
For those who did not RTFA :) - there are (at least) two issues
identified in the referenced threads:
a) There is no clean way to make sure that e.g. the original
poster of a forum-message gets admin rights to his/her own
message, other than explicitly assigning the privilege
on the specific message
b) Checking permissions is relatively slow, despite all the
optimalizations. And, mostly because of those very same
optimalizations, the system is relatively complex.
To avoid flamewar-like responses like I saw in other permissionsystem-related threads: I'm *not* saying it is *too* slow or *too* complex, as I've done no benchmarks myself and have not found any good benchmarks to support such a claim. I'm just kicking around ideas. Feel free to ignore me.
If you read the forums, the topic gets a fair share of
attention and a lot of good ideas are tossed up. By combining several ideas in a clever manner I
think we can simplify the permission system and substantially gain on performance at the same time.
The basic problems are that:
* there is no concept of ownership, requiring quite a lot
of direct, explicit privilege assignments. Beside being
a pain in itself, it bloats the permissions system.
* the all_object_party_privileges_map depends on
three huge flattened trees: context, parties and privileges.
The size of this beast grows polynomial with the number
of users & security contexts (read: objects). This means that the performance
of the permissions system severly declines with
increasing numbers of users and security contexts -
or more in general the size of the site.
Basically my idea would be to:
1) Introduce the concept of ownership
- reduces the need for explicit, direct permissions
2) Switch to an exclusively role-based access control system
- de-couples the permissions check from the parties hierarchy
3) Switch to a limited set of 'horizontally stored' privileges
(one-row-per-object, I think I found a solution to Don's
implicit privileges problem).
- de-couples the permissions check from both the privilege
hierarchy and the context tree
The end result should be a permissions system where the cost of
CHECKING permissions is (almost) unrelated to the size of the site
or the number of users. Of course, no free lunches, the cost probably
moves to a certain extend to the ASSIGNING & CHANGING of permissions.
But this happens less frequently, is generaly not noticeable by the
end-user and I think it will not be too bad once implemented decently.
Each of those topics is explained in more detail below.
Introduction of ownership
By introducing object ownership, the issue as raised by dave
in the mentioned thread disappears and it will significantly
cut back on the need for directly assigned (non-inherited)
We can then just assign privileges to 'the owner' - whomever
that happens to be(come).
Proposal: add an 'owner' field to acs_object, which defaults
I'm aware of the fact that this has a 4 byte (on most platforms)
penalty per object, which is potentially a big number.
However, modifying_ip & modifying_user do not seem to have a lot
of serious usage (besides the kernel there are only 6 packages
that refer to them, and mostly just because the fields exist -
not because they're really used).
Besides that, I believe that those belong in an audit-log rather
than as attributes on an object - but that's just my opinion.
Anyway may be we can drop those - the nett-saving would be
somewhere around 12-20 bytes per object.
Switch to RBAC
Step two would be to never assign privileges directly to users,
but to roles (or groups - roles are essentially groups of course).
You may have 20k users, you'll never have 20k roles. I think that
even in a very large organisation this number would not be more
than a few hundreds (and that would be a very bad case).
BTW: if I say roles are groups, I think of very simple
flat groups, not the complex things the oacs group system
can accomodate. I want just a simple mapping table that maps
users & roles.
Which roles a user can act in can easily be cached in a user/roles mapping table. It only
changes when a user's group-membership gets changed or if
he gets assigned other roles. Most users will have
only one or two roles and they normally do not change often.
This would also solve Dave's original problem
from the mentioned thread - there is always a standard role
Smart use of subsites (which I'm sure large site-maintainers
already do), combined with the use of roles rather than
assigning privileges directly has two advantages:
* it decouples the performance of the privilege system
to a very large extend from the number of users (coupling it
instead to the average number of different roles used per object,
which will probably be near-constant even for large sites).
* cut back on the number of privileges to assign in general
- which eases administration as a bonus
Worst case is that you're sometimes forced to create a role that
will only be used by one person. Imo that's not a big deal.
(pay attention, this is a big deal)
Then step 3, both Don and Lars had ideas with respect to
"horizontal" privileges (storing the actual privileges
in the same row as the object_id/grantee combination with
one row per object).
For my line of thought using Don's bitstring idea (1 bit per priv)
is the best. Don's problem was the implicit and
I guess we can solve that by having three privilege columns per
object/grantee pair. All identical bitstrings. One containing the
direct and explicitly set privileges, one containing the inherited
privileges and one containing all privileges (calculated by or'ing
the explicit & inherited privileges and determining which privileges
are implied by the privileges then found). This 'all_privs' field
then becomes the 'inherited_privs' field of the direct descending
object (or security context).
There is a penalty of course when permissions are changed for a
given object, as the implicit & inherited privileges are to be
recalculated. I've not yet looked into how hard that would be but
I can't imagine that it is really huge as we essentially do it now on
every perm check.
The permissions table then would have rows (for each and every
(object_id, role_id, explicit_privs, inherited_privs, all_privs)
** This would decouple the effort of checking for permissions
from the size of the context tree **
However, the size of the permissions table would grow linear with
respect to the number of objects and also linear with the
average number of different roles that have privileges assigned
on a certain object. This is by no means polynomial anymore, but
still far from constant.
Therefor, I would like to store a real security 'context' field for every
object - the context where that particular object actually inherits its
E.g. a message in a forum has its context_id set to
the object_id of the forum package, which has its context_id set
to (most likely) the object_id of the subsite. Now, if the subsite
grants 'read' privileges to everyone with the role 'member', and
both the forum and the message don't have direct privileges assigned,
the 'real' security context for the forum message should be
the object_id of the subsite, not of the forum package
(Because of the 'owner' role, there is probably no need to
assign direct privileges to individual forum posts.)
I'd say, rename the current context_id to parent_id (as it is mostly
(ab)used in that way already and in HEAD there is already a parent_id defined for this purpose) and use the context_id to really indicate
the security context. (see also http://openacs.org/forums/message-view?message_id=162608
By doing so, we could eliminate all objects from the
permissions table that don't have directly assigned privileges WITHOUT
having to join against the context hierarchy (I'm not even sure
there is a use for an explicit context hierarchy anymore in this case).
With respect to custom privileges: afaict custom privileges
only have meaning locally to a package. So we could use
a 64 bit bitstring to store privileges, with the first
32 reserved for global, standardized privileges and the other
32 for local package use. That way each package can define up
to 32 custom privileges without any penalty. The kernel
just stores & retrieves them without having to know their
meaning (you effectively ask the kernel "is bit 34 set or not?").
Putting it all together
At the end we would have a table with rows like:
(object_id, role_id, explicit_privs, inherited_privs, all_privs)
In this table you have only objects that have set
direct privileges (as opposed to inherited privs).
Figuring out which privileges a certain user has on
a certain object would then involve the following steps:
- figure out which roles a user has (fast, can be cached by
storing the user/role mappings in a seperate table which is
kept up to date by triggers)
- figure out in which security context the object lives
(easy, defined as an attribute on the object itself)
- get the 'all_privs' columns for the relevant object/context
- check if the correct bit is set or not
The size and performance of this system would only depend on two factors:
- the total number of objects that have direct privileges assigned
- the average number of different roles per object where privileges
are assigned to.
This means that the relation between performance of the permissions system and the size of the site should be near constant (or log2(N) in worst case, as Don mentioned earlier "Huge tables are more a space problem, though, not necessarily a significant performance problem given the log2(N) performance characteristics of queries on simple tables with proper indexes.")
This new query would replace the acs_object_party_privilege_map and should
be both much simpler and faster. Without sacrificing any of the functionality
in existance today.
Migration from current situation
The good news is that the actual permissions checking mechanism
would remain the same. Just the way we compute
the answer will differ. So most application code does not have
to be touched at all.
The worse news is that the way privileges are assigned
is substantially different, as you can no longer
assign privileges to users - only to roles. The better news is that
this is highly centralized (if I'm correctly informed).
The only real issue I see (besides the work itself :) is the
implementation of the 'ownership' concept, as that is
sprinkled throughout the code of several packages.
Gentlemen, load your guns and enjoy yourself by shooting big gaping holes in my little thought-experiment ...