Naturally, all caching schemes that keep the cache outside the RDBMS
- NSV, memcached, whatever - are inherently non-transactional. This
is fine in many cases, and it keeps the NSV/cache implementation (but
not necessarily the application that uses it!) simple.
What if, instead of the memory cache layer existing as a completely
independent application, you turned it into part of the RDBMS? In
essence, let the RDBMS keep its own in-memory cache of query results,
but let it keep it on many other machines across the network,
not just in its own RAM. Obviously that means the cache servers and
the master RDBMS have to exchange lots of short messages frequently in
order to keep the network memory cache coherent and transactional, and
getting that right would be critical for performance and scalability.
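
To make the idea a bit more concrete, here is a rough single-process
sketch in Python. Everything in it is hypothetical: CacheNode, Master,
and their methods are names I made up for illustration, not any real
RDBMS or memcached API. The "remote" cache nodes are just local objects,
and the message counter stands in for network round trips. The only point
is the ordering: the master invalidates the affected cache entries
before a commit is applied, so a later read cannot return the old value.

```python
class CacheNode:
    """Stands in for a remote cache server (hypothetical); here just a dict."""

    def __init__(self):
        self.entries = {}   # key -> cached result
        self.messages = 0   # counts simulated network messages

    def get(self, key):
        self.messages += 1
        return self.entries.get(key)

    def put(self, key, value):
        self.messages += 1
        self.entries[key] = value

    def invalidate(self, key):
        self.messages += 1
        self.entries.pop(key, None)


class Master:
    """Stands in for the master RDBMS that owns the authoritative data."""

    def __init__(self, nodes):
        self.nodes = nodes
        self.rows = {}      # authoritative storage

    def _node_for(self, key):
        # Assumption: simple hash placement of keys onto cache nodes.
        return self.nodes[hash(key) % len(self.nodes)]

    def read(self, key):
        node = self._node_for(key)
        cached = node.get(key)
        if cached is not None:
            return cached
        value = self.rows.get(key)
        if value is not None:
            node.put(key, value)   # populate the cache on a miss
        return value

    def commit(self, writes):
        # Invalidate every affected entry first, then apply the writes.
        for key in writes:
            self._node_for(key).invalidate(key)
        self.rows.update(writes)


nodes = [CacheNode(), CacheNode()]
db = Master(nodes)
db.commit({"a": 1})
first = db.read("a")    # miss: fetched from the master, then cached
second = db.read("a")   # hit: served from the cache node
db.commit({"a": 2})     # invalidates the cached entry
third = db.read("a")    # miss again, sees the new value
total_messages = sum(n.messages for n in nodes)
```

Even this toy version sends one message per key per commit plus one or
two per read, which is the "lots of short messages" cost mentioned above.
A real version would also have to deal with concurrent transactions,
node failures, and isolation levels, none of which this sketch attempts.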
At least without thinking about it in much more detail, it
sounds feasible to me. There are also obvious potential
synergies with the stuff that the HPC cluster/Beowulf folks work on:
optimized MPI libraries, low-latency high-bandwidth networks and
network drivers, etc. (Thus see also the somewhat related thread
discussing Clusgres, Postgres-R, and the like.)
But, this seems like such an obvious R&D/thesis opportunity that I
wonder whether it's already been done. Anyone know of any such
projects?