Forum OpenACS Development: tree_sortkey race condition, with fix

I've discovered (well, rediscovered -- see this thread) a race condition in the code that is used to generate tree sortkeys. The result of the race condition is that it is very easy to end up with duplicate tree_sortkeys. The problem can be demonstrated with two psql sessions. Just start a transaction in each, then create a new acs_object under the same parent in each session. The new objects will share the same sortkey. It also crops up in bboards, where posts can sometimes appear in the wrong thread.

The good folks working on dotlrn have come up with a solution that involves storing the maximum child sortkey in each parent row. It's not general, however, in that it doesn't work for root nodes. I've spent the day working on a new version of the tree_sortkey generation that prevents the race condition, based around the dotlrn code. I've also written procs that regenerate the entire tree_sortkey hierarchy in a table to repair duplicates (it could probably be reduced to the damaged nodes) and create the max_child_sortkey field and fill it in appropriately. This code is at

The code I've posted is a first run, and not very well tested at all (plus it takes a long time to run). It will attempt to fix the acs_objects, acs_object_types, and cr_items table (there are several other tables affected as well). The kernel.sql script loads a proc that wasn't in my old checkout of OACS (tree_increment_key), and creates a table used to store the maximum root-level tree_sortkeys for each table. The other three files are each targeted at fixing one table. The first proc in these files, repair_hierarchies, uses a BFS type scan to regenerate the tree_sortkeys. If yours aren't broken, just delete the proc from the file. Next it adds the max_child_sortkey field and populates it with another proc. Finally, it drops the insert trigger and recreates it to avoid the race condition. For ease of review, here's the acs_objects proc:

create or replace function acs_objects_insert_tr () returns opaque as '
    v_parent_sk             varbit default null;
    v_max_child_sortkey     varbit;
    if new.context_id is null then
        v_parent_sk := '''';

        select max_sortkey into v_max_child_sortkey
        from brk_max_root_keys
        where table_name = ''acs_objects''
        for update;

        v_max_child_sortkey := tree_increment_key(v_max_child_sortkey);

        update brk_max_root_keys
        set max_sortkey = v_max_child_sortkey 
        where table_name = ''acs_objects'';
        select max_child_sortkey into v_max_child_sortkey
        from acs_objects
        where object_id = new.context_id
        for update;
        v_max_child_sortkey := tree_increment_key(v_max_child_sortkey);

        update acs_objects
        set max_child_sortkey = v_max_child_sortkey
        where object_id = new.context_id;
    end if;

    new.tree_sortkey = v_parent_sk || v_max_child_sortkey;

    return new;
end;' language 'plpgsql';

Thoughts/feedback/comments/flames welcome.