Forum OpenACS Q&A: Response to Current state of Postgres v. Oracle

Posted by Don Baccus on
The answer to the last question is "no".  There is no such database backup API.  pg_dump gives a consistent backup, so for your average web/db site it's adequate.  My Tcl script for dumping and ftp'ing the result to another machine is included as an example in the documentation.  My site with important data (birdnotes.net) is ftp'd from the server to my box at home and to another box on the east coast, so I feel safe enough.
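The skeleton of such a script is only a few lines of Tcl.  This isn't the actual script, just a minimal sketch — the database name, paths, and hostname are made up, and scp is standing in for the ftp step:

  # Dump the whole database in pg_dump's compressed "custom" format.
  set dumpfile "/backup/mydb-[clock format [clock seconds] -format %Y%m%d].dmp"
  exec pg_dump -Fc -f $dumpfile mydb

  # Copy the dump to an offsite box; "backuphost" is a placeholder.
  exec scp $dumpfile backup@backuphost:/backups/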

However ... it's not a huge database.  A million or so data records but no large documents, photos, etc.  Quite modest.  PG falls down when you have huge amounts of data stored because there's no way to do consistent incremental backups.

PG performs worse when you have frequent updates or paired insert/delete operations, because old rows stick around until they're removed by a VACUUM process.  This hasn't changed.  PG 7.2 has a "lazy vacuum" that runs in the background and knocks off the low-hanging fruit, if you will.  Because it doesn't need to lock things down nearly so tightly, it's much less disruptive to a running system.
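Concretely, in 7.2 a plain VACUUM is the lazy variant and can run against a live table, while VACUUM FULL still does the old compact-and-lock pass.  A sketch, with made-up database and table names:

  # Plain VACUUM in 7.2 reclaims space without the old exclusive lock.
  exec psql -d mydb -c "VACUUM ANALYZE bboard_posts;"

  # VACUUM FULL still compacts the table and locks it while it works.
  exec psql -d mydb -c "VACUUM FULL bboard_posts;"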

Though this is an improvement, it's not a 100% solution.  On the other hand, keeping old rows around is part of what makes PG's high-concurrency row-level locking and cheap transactions easy to implement.

In practice this isn't a big deal for most sites: most have off-peak periods and aren't updating the database contents at full speed 24/7, so tables don't grow massively large and the evening vacuum doesn't become exceedingly slow.
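If you're running under AOLserver, scheduling that nightly vacuum is a one-liner.  A sketch — the proc name, database name, and the 3:30 am slot are just for illustration:

  proc nightly_vacuum {} {
      # Vacuum and re-analyze the whole database during the quiet hours.
      exec psql -d mydb -c "VACUUM ANALYZE;"
  }

  # Run it every night at 3:30 am in its own thread.
  ns_schedule_daily -thread 3 30 nightly_vacuum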

Other DBA-ish stuff, such as support for named schemas and the ability to place them on various disk drives rather than fiddling with symbolic links, is coming in PG 7.3.  This will get rid of one of the biggest complaints about PG from folks with large-scale applications: the difficulty of scattering data and indexes across various spindles in order to decrease the time spent waiting for physical I/O.
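For reference, today's workaround is the symlink trick: shut the postmaster down, move a table's underlying file to another drive, and leave a symbolic link behind.  A sketch, with invented OID-based file names:

  # With the postmaster stopped, relocate one table's file to a second
  # disk and point a symlink at it.  16556/16603 are made-up OIDs.
  exec mv /usr/local/pgsql/data/base/16556/16603 /disk2/pgdata/16603
  exec ln -s /disk2/pgdata/16603 /usr/local/pgsql/data/base/16556/16603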

Again, though, most web/db sites aren't anywhere near busy enough to worry all that much about this, given that machines with 2GB or 4GB of RAM have more RAM than most database systems had in disk space not all that long ago!