James, what RDBMS are you doing this with? Oracle or PostgreSQL?
Presumably the testing box for your 2 hour load used a single RAID 1
volume (2 disks) for everything on that machine: the whole
Unix install, database transaction logs, AOLserver log files, etc.?
If so, I think it would be quite interesting, on that same box, to put
in an identical 2nd RAID 1 volume, move the product database tables
(and only those tables) to the new empty volume, and re-run the same
test. Is it now a lot faster? It might be.
There might be cleverer ways to infer the same info, by profiling your
IO numbers or something. Hm, perhaps turn the write back cache on/off
on your test disks. (Normally you want it off so you don't corrupt
your data in a power failure.) I'm guessing that the write back cache
gives a much bigger win for random IO than for sequential IO, so if
during your nightly product table update test, you see a big win from
turning the disk write back cache on, that might suggest that you have
too much non-sequential IO. And of course, we assume that moving the
product tables to their own disk volume would decrease the amount of
non-sequential IO. That's all just a guess on my part though.
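One rough way to check the random vs. sequential guess directly, without
touching the database at all, is a small micro-benchmark run on the
volume under test, once with the write back cache on and once with it
off. The following is only a sketch under assumptions: the names
timed_writes and compare, the 8 KB block size, and the block count are
all made up for illustration, and a file this small may fit entirely in
a controller cache, so the sizes would need to be scaled up well past
the cache size before the numbers mean anything.

```python
import os
import random
import tempfile
import time

# Assumed block size (hypothetical); set it to match your database's block size.
BLOCK_SIZE = 8192


def timed_writes(path, offsets, block_size=BLOCK_SIZE):
    """Write one block at each offset, fsync after each write, return elapsed seconds."""
    payload = b"x" * block_size
    fd = os.open(path, os.O_RDWR)
    try:
        start = time.perf_counter()
        for offset in offsets:
            os.lseek(fd, offset, os.SEEK_SET)
            os.write(fd, payload)
            os.fsync(fd)  # force each block out so the OS page cache hides less
        return time.perf_counter() - start
    finally:
        os.close(fd)


def compare(directory, num_blocks=256, block_size=BLOCK_SIZE, seed=0):
    """Return (sequential_seconds, random_seconds) for the same set of block offsets."""
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        # Pre-size the scratch file so both passes overwrite existing blocks.
        with os.fdopen(fd, "wb") as scratch:
            scratch.truncate(num_blocks * block_size)
        sequential = [i * block_size for i in range(num_blocks)]
        shuffled = sequential[:]
        random.Random(seed).shuffle(shuffled)  # same blocks, non-sequential order
        seq_time = timed_writes(path, sequential, block_size)
        rnd_time = timed_writes(path, shuffled, block_size)
        return seq_time, rnd_time
    finally:
        os.remove(path)


if __name__ == "__main__":
    # Point this at a directory on the volume you want to measure.
    seq, rnd = compare(tempfile.gettempdir())
    print(f"sequential: {seq:.3f}s  random: {rnd:.3f}s")
```

If the random pass speeds up far more than the sequential pass when the
write back cache is switched on, that would be consistent with the guess
above, though it says nothing by itself about what Oracle or PostgreSQL
actually does during the nightly update.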
I skimmed Oracle's "SAME configuration" article briefly. It all sounds
like good advice, but it doesn't even attempt to answer the most
important lower-level question: When striping across "all disks", how
do you get the best price/performance for those "all disks"?
Also, SAME notes that in the general case, Oracle's IO behavior is
very complicated, and assumes that you can't figure out anything
really useful about it a priori. This is a good safe
assumption in general, but it is not true for your particular
application! You know that you have a very specific, very
special bottleneck in your nightly product update job, and believe
that there are no other significant bottlenecks, so in your case, the
right question to ask is probably, "What's the most economical way to
greatly speed up this one special bottleneck?"
Of course, the cost in your time to figure that out could easily be
higher than the hardware cost of just slapping in a big fat RAID 10
array. But if you were very constrained on hardware costs, those are
probably the questions to ask.