Forum OpenACS Q&A: Re: referencing a specific db pool
There was a post by Markus Schaber on benchmarking various insert methods on the pgsql-performance list. The very short answer is
that COPY is the fastest by a large margin and individual inserts are very slow:
Here are the results of some benchmarks we did inserting 30k rows into a table, using "time psql -f blubb.sql -q dbname":

    File             kingfisher            skate
    30kinsert.sql    39.359s    762r/s     335.024s    90r/s
    30kcommonce.sql  11.402s   2631r/s       7.086s  4233r/s
    30kwithids.sql   10.138s   2959r/s       6.936s  4325r/s
    30kprepare.sql    8.173s   3670r/s       5.189s  5781r/s
    30kdump.sql       1.286s  23328r/s       0.785s 38216r/s
    30kdumpseq.sql    1.498s  20026r/s       0.927s 32362r/s

Kingfisher is the single-processor machine I mentioned yesterday night; skate is the SMP machine.

The table has five columns (bigint, bigint, double, double, timestamp without time zone). The first of them has a "default nextval('sequence'::text)" definition, and there are no further constraints or indices.

30kinsert.sql uses simple insert statements with autocommit on, and we insert all but the first column, which is filled by the default sequence. With this test, kingfisher seems to have an unrealistically high number of commits (syncs) per second (see what I wrote yesterday) [1]; skate has a more realistic value.

30kcommonce.sql, as suspected, gives a rather large boost by wrapping everything in a single transaction with one commit.

30kwithids.sql gives a small boost by inserting pre-calculated sequence numbers, so it seems not worth the effort to move this logic into the application.

30kprepare.sql prepares the insert statement and then issues 30k EXECUTE statements within one transaction; the speedup is noticeable.

30kdump.sql simply inserts the 30k rows as a dump via COPY FROM STDIN. (As with 30kwithids.sql, the first column is contained in the insert data, so the default sequence is not used.) Naturally, this is by far the fastest method.

30kdumpseq.sql uses COPY too, but omits the first column and thus exercises the sequence generation again. This gives a noticeable 15% slowdown, but still seems fast enough for our purposes. Sadly, COPY is not available within JDBC.

Thanks for your patience.
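For readers who want to reproduce something like these benchmark files, the methods above can be sketched roughly as follows. The table, column, and sequence names here are hypothetical (the original post does not show its schema); only the column types and the default-nextval definition come from the post:

```sql
-- Hypothetical schema matching the description in the post:
-- five columns, the first filled by a sequence default.
CREATE SEQUENCE mytab_id_seq;
CREATE TABLE mytab (
    id  bigint DEFAULT nextval('mytab_id_seq'),
    ref bigint,
    x   double precision,
    y   double precision,
    ts  timestamp without time zone
);

-- 30kinsert.sql style: one autocommitted INSERT per row (slowest).
INSERT INTO mytab (ref, x, y, ts) VALUES (1, 0.1, 0.2, now());
-- ... repeated 30k times ...

-- 30kcommonce.sql style: the same inserts inside a single transaction,
-- so only one commit (and one sync) is needed.
BEGIN;
INSERT INTO mytab (ref, x, y, ts) VALUES (1, 0.1, 0.2, now());
-- ... 30k rows ...
COMMIT;

-- 30kprepare.sql style: prepare once, then 30k EXECUTEs in one transaction.
BEGIN;
PREPARE ins (bigint, double precision, double precision, timestamp) AS
    INSERT INTO mytab (ref, x, y, ts) VALUES ($1, $2, $3, $4);
EXECUTE ins (1, 0.1, 0.2, now());
-- ... 30k EXECUTEs ...
COMMIT;

-- 30kdump.sql style: COPY with explicit ids in the data (fastest).
-- Data rows are tab-separated and terminated by a line containing \.
COPY mytab (id, ref, x, y, ts) FROM STDIN;

-- 30kdumpseq.sql style: COPY omitting the id column, so the default
-- sequence fires per row (about 15% slower than the dump with ids).
COPY mytab (ref, x, y, ts) FROM STDIN;
```

This is a sketch, not the author's actual files; the relative ordering of the methods, not the exact statements, is the point.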
Footnotes:

[1] We suspect this is some strange interaction between IDE, cryptoloop, and ext3fs, such that the sync() call somehow does not really wait for the data to be physically written to the disk. (I really can't imagine a crypto-looped notebook hard disk doing more syncs per second than a SCSI-based RAID in a server machine. We did some small benchmarks of the sync()/fsync() calls that seem to support this conclusion.)

-- Markus Schaber | Dipl. Informatiker
logi-track ag | Rennweg 14-16 | CH 8001 Zürich