Forum OpenACS Q&A: Thoughts on reworking double-click protection

This is kind of a cross-post from a response at ars.
I have been thinking about this lately, and this seems like a good time to bounce an idea off the community. To prevent double clicks on pages, incorporate these steps:
1) On the preceding form page, create an entry in an nsv array with a key/value pair of (unique id) / $time.
2) Export the unique id to the page with double-click protection.
3) As the first thing in the protected page, check whether the nsv entry still exists.
IF NOT: either the page has been hit already, or nsv cache purging has removed the allowed action.
IF SO: unset the nsv array entry and continue on to process the transaction.

Add a scheduled proc to periodically clean out old nsv entries whose time value is beyond a limit (set by a configuration variable).

I think this approach:
1) Saves a call to the db on the preceding page, and avoids having to pre-reserve an id which may or may not be used ... not all people hit that submit button.
2) Ties the page to its preceding page, so abandoned submissions don't block other threads with the same intent.
3) Can be a more generic approach to the same problem for the several places where this may be desired.

Someone pointed out that there may be problems with clustered servers, but I think that would be limited to servers that are not in a round-robin setup and are using a best-available approach. This brings up another item for discussion: an approach to clustered-server communication that would be generic enough for this and other issues (session mgmt, etc.).

Just some thoughts on how I will probably implement this, but I'm open to suggestions / comments.

Posted by Don Baccus on
Well, let's see...

#1. Unused ids are aesthetically ugly, I suppose, but in practice, before we use up 2 billion of them we'll all be running 64-bit processors, and we'll be approaching planetary death by the sun doing a supernova gig on our ass before we run out of (2^64 - 1) keys.

#2. Not clear what you mean here. Who is blocking the thread today? The thread's done when the page is returned to the user; hitting "submit" does another POST/GET and initiates another thread.

#3. If you need unique keys not tied to the DB, yes, such an approach might be useful.  I've never needed such keys myself, though.

What is broken that needs solving, here?

Posted by carl garland on
> What is broken that needs solving, here?

Double-click protection is something that has implications beyond the ecommerce model and orders. I was hoping to come up with a more generic approach that might be used on registration pages, message-submission pages, etc. The current implementation proposed by PG in http://www.arsdigita.com/books/panda/ecommerce is:

1) Serve the user a dynamically generated order form that includes a unique order ID, pointing to an "insert-order" page.
2) When the user hits submit, the insert-order page will run and insert a row into a database table with a primary key constraint on the order id. If the insert fails (Oracle won't allow two rows with the same primary key), catch it and look to see if there is already an order in the database with the same id. If so, serve the user an order status page instead.
3) If the insert succeeded, proceed as above.

This implementation could be extended, but it does create rows and rely on querying the db. In my ongoing attempt to adhere to DCI's attitude of avoiding DB queries as much as possible (i.e., the AOLserver tuning notes at http://www.aolserver.com/documentation/tuning.adp: "Beware the Database: databases are a bottleneck"), I think that where possible it is desirable to move functionality out of the database and into the AOLserver process, where it can be done in a reliable and efficient manner. I did not mean to imply that the current implementation blocks the thread of execution, just that the new one wouldn't either.
Posted by Don Baccus on
As far as avoiding the database goes, the greatest leverage will come from caching select queries (i.e., memoizing them using the appropriate ACS utility routines) used to build pages.

This is something that happens a LOT, and these queries often include joins, which are relatively expensive.
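The memoization pattern Don describes can be sketched as a timed cache, in Python for illustration (the decorator name and TTL interface are hypothetical, not the actual ACS utility's API):

```python
import functools
import time

def memoize_for(seconds):
    """Cache a function's results for a fixed interval, roughly in the
    spirit of memoizing expensive page-building SELECTs so repeated
    page hits skip the database entirely."""
    def decorator(fn):
        cache = {}   # args -> (expiry time, value)
        @functools.wraps(fn)
        def wrapper(*args):
            now = time.time()
            hit = cache.get(args)
            if hit is not None and hit[0] > now:
                return hit[1]            # fresh cached value: no query runs
            value = fn(*args)            # stale or missing: run the query
            cache[args] = (now + seconds, value)
            return value
        return wrapper
    return decorator
```

The trade-off, as with any such cache, is that a page may serve results up to `seconds` old.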

Insertion of new stuff into the database is relatively rare, and doing  a select nextval() is one of the fastest things you can do in Postgres.  No table is referenced (unless the code's still using our "dual" view which was added to ease porting of Oracle code).  No "where clause" means no poking at an index to extract rows, etc.  It is really a very fast database operation that in essence doesn't really "hit" the database.

And the time spent doing the "select nextval()" is going to be absolutely swamped by the time to do the "insert" that follows.

So while I agree that avoiding the DB is a good thing, in this particular case the cost is low, and I don't think that avoiding it would noticeably increase throughput on working sites.

The thing to do is to concentrate on more costly queries and avoid them (again, usually by judicious caching). There's a LOT to be gained there, particularly if you're doing queries to summarize recent content, do personalization, manage portals, etc. on user pages - such SELECTs tend to be fairly complex and therefore expensive.

Posted by Don Baccus on
Well, it turns out that aD itself is re-thinking double-click protection, but not for the reason raised by Carl. They've run into a couple of interesting cases where the simple protection utilized by the ACS falls apart.
The article is <a href="http://muc.arsdigita.com/secure-doubleclick-protection.html">here</a>.