Forum OpenACS Q&A: Everything in the database? (plus Rete, triggers, etc.)

Jonathan, while your argument sounds fine in principle, it is so general that I'm not sure whether it really means anything or not in practice. What specific stuff was being forced into the database and thus killing scalability? And what would you recommend instead?

(And don't just say files for file storage, we all know that stuffing them into the RDBMS as LOBs is silly and always was silly. File vs. LOB is not a useful example of anything.)

I can think of some examples, btw, but they are fairly esoteric. E.g., I worked on a project which shoved real-time options pricing data (from an electronic feed; on only a small number of options however) into Oracle almost solely so that Oracle could fire triggers based on that data.

That actually worked adequately in that case, but handling a wider variety of options may have strained it severely, and larger numbers of (possibly user-defined...) triggers might well have collapsed it entirely. (The production system never saw such loads, unfortunately, and I don't remember what our scalability testing said about data feed and trigger-related loads, or after all these years, whether we even checked that at all.)

Note however, that the "stuff real-time options prices into Oracle and fire triggers" design decision had nothing at all to do with some sort of belief in an "everything in the database" principle. It was simply that Oracle's triggers were the only tool we knew of at the time for doing "active database" type stuff - and indeed, are unfortunately almost the only common such tool.

Oracle's Advanced Queuing software might (or might not) have been better than simply inserting prices into tables. And Jerry Asher later pointed out the massive superiority of the Rete algorithm for such things over Oracle's (naieve) trigger implementation. Stonebraker's Streambase stuff is no doubt also relevent.

So yes, if I was designing a similar major application all over again, I would definitely try to get the "hard active" data-driven aspects out of the (poorly implemented) RDBMS native triggers into some more efficient and more powerful system. (Perhaps I would even write a PostgreSQL module to do so.) Plus at least two other major architectural differences which I haven't mentioned here at all.

But somehow I doubt that's what you meant in your "everything in the database" comments...