Forum OpenACS Q&A: Re: RFC: Avoid exposing object_ids in permanent URLs

Collapse
Posted by Jeff Davis on
Ola,

there is not really a performance issue with the constraint, the issue is that if you had two people simultaneously submitting bugs that you could end up with the same bug number for both inserts and so one would work and the other would cause the constraint to fire and the insert to fail. As it stands now, that will throw an error back to the user rather than attempt to insert the row again.

Retrying the insert on that when that fires is ok and would not be a problem up to a point but yeah, it makes writing code harder since you need to know how to handle exceptions properly and the constraint names need to be known (and you are relying on the constraint being there for things to work right etc.)

The point where it would break down would be when the rate of inserts is high enough that the query to find the max number took longer than the interval between inserts. Inserts would then back up and at some point the service just would not work.

Using a table with package_id,type,number and select for update would be better since even with a lot of rows the select for update bit would be fast so you could support a higher transaction rate before it broke down. At some point even that would be a problem and you would have to go to having a sequence, and even sequences can have problems so what you go to is a sequence that can hand out more at a time and cache sequences which comes at the expense of making sequences more gappy and no longer even necessarily sequential. To get an idea of why it matters, take a look at all those wierd options for oracle sequences; they are there for a reason and when you think about how fast getting a number from a sequence is, think also about sites for which that is not in fact fast enough.

The think to keep in mind though, is that these kinds of transaction rates are well beyond what the vast majority of DB's see.

My real concern with things like the code in bug tracker is less that of possible performance problems, it is that it will outright fail for some users in practice, it will be hard to reproduce, and it is buried down inside a plpgsql proc so is likely to be overlooked.