Jonathan, while your argument sounds fine in principle, it is so
general that I'm not sure whether it really means anything or not in
practice. What specific stuff was being forced into the database and
thus killing scalability? And what would you recommend instead?
(And don't just say files for file storage, we all know that stuffing
them into the RDBMS as LOBs is silly and always was silly. File
vs. LOB is not a useful example of anything.)
I can think of some examples, btw, but they are fairly esoteric.
E.g., I worked on a project which shoved real-time options pricing
data (from an electronic feed; on only a small number of options
however) into Oracle almost solely so that Oracle could fire triggers
based on that data.
That actually worked adequately in that case, but handling a wider
variety of options may have strained it severely, and larger numbers
of (possibly user-defined...) triggers might well have collapsed it
entirely. (The production system never saw such loads, unfortunately,
and I don't remember what our scalability testing said about data feed
and trigger-related loads, or after all these years, whether we even
checked that at all.)
Note however, that the "stuff real-time options prices into Oracle and
fire triggers" design decision had nothing at all to do with some sort
of belief in an "everything in the database" principle. It was simply
that Oracle's triggers were the only tool we knew of at the time for
doing "active database" type stuff - and indeed, are unfortunately
almost the only common such tool.
Oracle's
Advanced Queuing
software might (or might not) have been better than simply inserting
prices into tables. And Jerry Asher later pointed out the massive
superiority of the
Rete algorithm
for such things over Oracle's (naieve) trigger implementation.
Stonebraker's Streambase
stuff is no doubt also relevent.
So yes, if I was designing a similar major application all over again,
I would definitely try to get the "hard active" data-driven aspects
out of the (poorly implemented) RDBMS native triggers into some more
efficient and more powerful system. (Perhaps I would even write a
PostgreSQL module to do so.) Plus at least two other major
architectural differences which I haven't mentioned here at all.
But somehow I doubt that's what you meant in your "everything in the
database" comments...