Forum OpenACS Q&A: Would copy speed PG installs of reference data?
INSERT statements. A couple have 30000 or 40000 INSERTs. As a
result, these modules take several minutes to load, and generate a LOT
of screen output.
I'm wondering whether storing the data as delimited text files and
using the psql copy command would speed things up significantly? It
seems to in my ad hoc testing... 42K records are copied into
us_zipcodes in under ten seconds from a file, but 42K INSERT INTOs
take 2 or 3 minutes (and this generates 42K lines of junk to the
screen in the process, unless you use the -q switch, which the
OpenACS4 package installer does not seem to do). Even with -q it
takes 100 seconds,
ten times slower than the copy command.
Is there any reason not to switch to using copy for reference data
files with more than (say) 1000 records in them? The equivalent bulk
load utility for Oracle could also be used, unless Oracle is plenty
fast enough already doing this sort of thing?
So we're trading off speed of loading for easier maintainability and of course an easier porting task if we adopt a third RDBMS in the future.
So there is (I think) no need to store the comma delimited form in CVS or tarballs or RPMs. Just generate it at install time, before doing the copy or Oracle equivalent. That way the editable, developer-preferred, format of the data remains the SQL statements as they are now.
It's not the end of the world if we can't do this, but if I automate
things further, so that non-core packages get installed automatically by my RPMs, the delay during the install will grow, and may worry some novice installers enough that they abort it thinking something "must have gone wrong". I noticed that just a very few packages were responsible for the bulk of the package install time, so I took a quick look, and this seems like a useful way to shorten that delay.