Forum OpenACS Development: Bug in .LRN/openacs group memberships
I have found a problem in .LRN (at least)
where when an administrator from the group is removed as administrator the rows in party_approved_member_map are not removed.
This may exist in OpenACS also. I have not tested yet. (Most likely it would not appear in OpenACS since it doesn't use rel-segments heavily by default).
If you run this query
select r.rel_id,m.segment_id,m.group_id,m.* from (select m.*, s.* from party_approved_member_map m, rel_segments s where s.segment_id = m.party_id and s.rel_type='dotlrn_admin_rel') m left join acs_rels r on (r.object_id_one = m.group_id and r.object_id_two = m.member_id) where r.rel_id is null;
If you get any rows it shows there are rows in party approved member map for a group membership without a row in acs_rels for that membership relationship.
The result is a user is still part of the admin relational segment while not a member of the group.
The query returns 0 rows on openacs.org.
select r.rel_id,m.segment_id,m.group_id,m.* from (select m.*, s.* from party_approved_member_map m, rel_segments s where s.segment_id = m.party_id and s.rel_type='dotlrn_admin_rel') m, acs_rels r where r.object_id_one(+) = m.group_id and r.object_id_two(+) = m.member_id and r.rel_id is null;
How are the admin rels added/deleted in this case, is it a special pl procedure for dot_lrn? Maybe post them if this might cause the problem.
Also, could you print out the lines showing the results, not just the count, I'm having trouble figuring out your query. How do you select an acs_rel with rel_id null? Is there some magic in this query?
I built the query from the ground up to show the error. Thanks for catching that. I'll try it across all rel_types to see what happens.
To print out the acs_rels with null rel_id, it does a left join. The point is, there is a row in party_approved_member_map but the user is not a member of the group according to acs_rels. The rel_id is null just filters out the ones that have a missing acs_rel.
I will post more when I look into how the rels are added and deleted and see if I can reproduce this.
Here is the function that is having the problem:
create or replace function party_approved_member__remove_one(integer, integer, \ integer) returns integer as ' declare p_party_id alias for $1; p_member_id alias for $2; p_rel_id alias for $3; begin delete from party_approved_member_map where party_id = p_party_id and member_id = p_member_id and tag = p_rel_id; return 1; end;' language 'plpgsql';
There are rows in party_approved_member_map where tag is 1 instead of the rel_id. I have not figured out where these came from yet.
we just looked into the problem on our learn server. Although the original query (https://openacs.org/forums/message-view?message_id=1172094) returns a few thousand entries, we seem to have a different situation, since we have there a value for tag of 0 and not 1 as in your case. For us, all these entries have party_id = member_id and tag == 0. These entries come most probably from parties_in_tr(), which adds identity rows when parties are created. These parties are not users in our case. Since we delete only seldomly parties, these entries are there, they would be removed trough parties_del_tr(). So, I tend to believe that for our site, everything seems fine (but we have many differences from dotlrn, even in the data model).
Offhand I would say that there is some confusion with tag.In principle agreed. The name "tag" is bad, it is as well bad, that there is no fk. Others might know better than me (i am not a veteran from the ad days), but from looking at the code, i got the impression, the basic idea might have been, that "tag" is some arbitrary "client_data" (for maybe multiple purposes). In todays version it is used only for "0" or a rel_id.
.... the basic problem is that the field does not have a reference constraint ..... data model problem which was somehow abused by certain applications ...
Since the key of party_approved_member_map consists of (party_id, member_id, tag), tag can't be a NULL value, therefore a rel_id for the identity rows should be created (an identity rel) and added instead of the 0 when parties are added. Renaming tag to rel_id would help to avoid confusions. These are just my 2 cents.
This does most probably not help Dave with his original problem. Dave, do you have any idea, where the entries with tag=1 are added? Is there anything special about the party_ids and member_ids (are they equal, ancient, etc.)?
select distinct object_type from acs_objects o, acs_rels r, party_approved_member_map m where m.tag = 1 and m.party_id=r.object_id_one and m.member_id = r.object_id_two and o.object_id = r.rel_id; object_type ------------------------------ dotlrn_admin_profile_rel dotlrn_admin_rel dotlrn_ca_rel dotlrn_cadmin_rel dotlrn_external_profile_rel dotlrn_guest_rel dotlrn_instructor_rel dotlrn_member_rel dotlrn_non_guest_rel dotlrn_professor_profile_rel dotlrn_student_profile_rel dotlrn_student_rel dotlrn_ta_rel membership_rel (14 rows)
Reviewing all the triggers they look ok to me. Everything passes around the actual rel_id from the membership_rel row the trigger runs on.
So partly this looks like a data model problem which was somehow abused by certain applications. If the data model problem was removed, developers would have discovered that things didn't work very quickly, and would have fixed their code. But the API that was listed above is not necessarily in error. It might be the one correct piece in this, hard to say just yet.
I ran these queries with an old 5.0... version of OpenACS and didn't find any tags = 1, no object_id = 1. So I can't add anything to the meaning of the 1.
That is how its "supposed" to work as far as I can tell.
I have many rows besides those also that have tag == 1. Again the problem is the trigger to delete from party_approved_member_map relies on the tag matching the rel_id that is deleted.
I have tried to reproduce the probem using the user interface to add/remove people in a community and change their role, but I haven't been able to reproduce it yet.
create or replace function parties_in_tr () returns opaque as ' begin insert into party_approved_member_map (party_id, member_id, tag) values (new.party_id, new.party_id, 0); return new;
end;' language 'plpgsql';
The more you look at this code, the more questions arise. What is the purpose of the trigger here? the party is a member of itself. Is that why the rel_id is bogus? Maybe there needs to be an identity_rel. party = member, or we are me?
But I still haven't seen any obvious bug, confusion, but it looks more like a mismatch between new and delete.
But the fact that it is a primary key should be a big flag that this needs to be an fk from (I guess) acs_relsNeither 0 or 1 are rel_ids (in acs_rels). So, if there would be a FK for tag, then there should be a different value chosen for the tag in case of the identity (the hypthical "identity-rel" mentioned above, which might be another magic negative value).
The purpose of the trigger is to establish the identity relation (party is approved member of party), which is there for achieving transitive closure (my interpretation, only interesting for transitve memberships).
However, this just source for confusion, not the solution for dave's problem, since we still don't know where the "1"s are from. Are there other dotlrn sites, with
select count(*) from party_approved_member_map where tag = 1;or is dave's site the only one?
Okay, here is an idea. Given the imaginary identity relationship which causes the row to be inserted into party_approved_member_map, I don't think that the original statement of the problem matches up this. The party_in_tr is inserting the identity, who knows why it is there, since it could be assumed that every party is that party. But the only way to remove it is to delete the party. The party_del_tr deletes all rows in party_approved_member_map regardless of tag. But the relational segment triggers only delete the rows which have a actual acs_rels.rel_id. So the imaginary tag prevents the removal of this identity row. Obviously this should be necessary. This is the trigger statement that leads me to this idea:
delete from party_approved_member_map where party_id = old.segment_id and member_id in (select element_id from group_element_index where group_id = old.group_id and rel_type = old.rel_type);
Anyway, in this case, the row is added by a trigger on parties, but 'not removed' by a trigger on rel_segments. This seems correct. But if someone uses a tag which corresponds to a real acs_rel, the identity row may be removed (maybe, not sure, maybe the tag should be changed to the party_id, since we know that id will never be an acs_rels.rel_id, or is there a guarantee that 0 will never be an object_id?)
So my question is: the data exists, but what problem is being caused by it? We never got any actual data rows, so I'm still not sure what would need to be fixed.
On a side note, Dave...the hardest bug to find is one which doesn't exist. But these types of bugs are easier to fix if you consider the possibility! Let's reconsider the actual issue.
It looks like there should be rows in party_approved_member_map without a corresponding membership_rel. The number should equal the number of parties. If it does not then that is where the problem is occuring.
On this one install
There are rows in
which are not identity rows
which do not correspond to any group or relational segement.
That is. A user is a member of the dotlrn_admin_rel relational segment (according to the party_approved_member_map) without being a member of the community.
what happened was
1) user was an administrator of community, but for unknown reason the row in party_approved_member_map has an invalid tag of 1.
2) The user was later removed from the community, and the membership_rel was removed. The row in party_approved_member_map was not removed because the plsql procedure did not remove the row because the tag did not match the rel_id that was deleted.
The actual problem is that the user appears as a member of the relational segment.
This affects permissions as the dotlrn_admin_rel relational segment have admin over the community.
If I'm just confusing the situation, ignore my comments.
No there was not any additional row.
Each dotlrn user has one acs_rel, either a dotlrn_student_rel or dotlrn_admin_rel for each community membership.
A community is a group. This means the user gets one row in acs_rels regarding their membership in the group.
The membership in the group triggers an entry in party_approved_member_map with (group_id,user_id,rel_id) and the trigger also loops through all the relevenat relational_segments. So a user also gets a row with (rel_segment_id,user_id,1). At least that is what happened in the past. Now the row is (rel_segmetn_id,user_id,rel_id).
There isn't a problem since the rows are unique.
The membership_rels_in_tr function (insert trigger)
calls party_approved_member_add function which calls
for the party in the acs_rel.object_id_one
and loops through the relational segments and calls party_approved_member_add_one for each relational segment.
Its only the relational segment rows that have the invalid tag.
I looks like this is legacy data from an old bug and it just needs to be cleaned up. I can't produce a new acs_rel with this problem.