Forum OpenACS Development: Re: acs_object__new slowness...

Collapse
Posted by Don Baccus on
When I ran my experiments I went all the way up to "LOCK TABLE".

However, now that I think about it ... doing a "SELECT FOR UPDATE" in two separate parallel calls to a function may actually work correctly IN THIS CASE, because we know the key exists at the beginning of each transaction that fires the trigger.  Thus both see (and lock) the same tuple.  Good news - I think :)

When I found the locking problem I was looking at another problem - the denormalized party members table I added in 4.6 that helped speed up permissions checking so much.

Check out the Oracle and Postgres implementations - you'll see that in Oracle I keep one party/membership pair and count the number of times a row's generated through group membership (multiple instances are a natural result of the groups datamodel and implementation).

In Postgres I don't count, I add the duplicate rows, adding to the space burden that results from this denormalization (but not materially slowing down permissions checking).

The reason is that in PG you can't solve the following problem entirely within PL/pgSQL:

If a tuple with key "k" exists in table "t", UPDATE its counter; otherwise INSERT a new tuple with key "k" and counter set to zero.

Logically you can do this (no exception handling in PG, remember):

LOCK t;
UPDATE t SET counter = counter + 1 WHERE key = 'k';
IF (no rows updated) THEN
  INSERT INTO t VALUES ('k', 0);
END;
UNLOCK t;

(in practice you only need a write lock of course)

Buried in PL/pgSQL this code will generate duplicate rows in "t" with key "k".  Done at PSQL it will not (of course!).

The problem is that a statement like "SELECT foo()" is executed atomically - statements inside PL/pgSQL don't change tuple visibility in other transactions because they're not considered "real commands".

So two calls to the above snippet can result in the following (call the two threads "A" and "B"):

A. locks t
B. waits for t because A has locked it
A. tries update, no tuples modified (row doesn't exist)
A. inserts row with key 'k'
A. unlocks table
B. resumes execution
B. tries update - NO TUPLES ARE MODIFIED because it doesn't
see the row inserted by A above!  Because the tuple set visible to the function is computed when its parent "SELECT foo()" begins execution, changes by other transactions aren't visible anywhere within the statement.

F**K!

Of course from PSQL each statement bumps the command counter (counts as a "real command") and doing the above from two PSQL sessions results in B "seeing" the tuple inserted by A, once A releases its lock and B resumes execution.

Oracle, on the other hand, makes tuples from other transactions visible within PL/SQL in such cases, in triggers and procs at least (maybe not functions - see below).

Now one can argue that PG is doing the correct - though damned inconvenient - thing here because a SELECT statement is *supposed* to operate atomically on a set of tuples.  We don't have procedures in PG, just functions, and we don't have triggers distinct from functions (in other words, PG triggers call functions rather than contain their on block of code).

One could argue that functions shouldn't have side effects that bring up issues like tuple visibility (since functions with side effects are "unclean" in theory anyway) but since PG procs don't exist and PG triggers don't have their own block of code we're *forced* to write "unclean" functions where we very, very much want side effects to break the atomic nature of the calling SELECT statement ...

Now at the application level we could call LOCK before such function calls but it is impractical to tell developers LOCK all tables used by all functions called directly or indirectly by triggers that are modified in ways that could cause race conditions within that code.  Right? :)

I haven't raised this issue with the PG crowd because, as I've said, from one point of view the behavior's correct and because we have, well, philosophical differences in some areas and I've gotten tired of trying to explain language implementation issues to them in a way that they can understand.  They tend to be inconsistent and unpredictable.