Forum OpenACS Q&A: Current state of Postgres v. Oracle

We all know by now that PostgreSQL has grown up and continues to evolve
in a positive direction. Within that context, it would be helpful to me
(and probably to others) to get a frank evaluation of its current
robustness relative to Oracle, particularly in terms of reliability and
scalability. I'd appreciate feedback on the following questions:

-How scalable has Postgres proven to be so far? What is the largest
site currently running it in an application similar to OpenACS? What,
if any, scalability hurdles are anticipated?

-How stable has Postgres proven to be, relative to Oracle, in an
OpenACS-like environment? Assuming that having a qualified DBA on-hand
is not an issue, is one less likely to go down than the other?

-How disaster-proof is Postgres, relative to Oracle? How is it in terms
of crash recovery? How good are the back-ups?

Thanks in advance.

Collapse
Posted by Don Baccus on
I'll take a shot:

1. PG is proving to be extremely scalable.  Tim Perdue's planning to
move SourceForge from MySQL to Postgres 7.1 based on performance under
load alone.  It kicked MySQL's butt when he tested with a simulated
load of several users, and throughput stayed steady on his target
server (a dual-processor linux machine with 256MB or 512MB RAM) up to
the 50 or so simultaneous user range, doing a mix of select, insert
and update queries.  MySQL crashed under his stress testing, PG just
chugged away.

That represents excellent scalability.

Note that insert/update performance of Postgres 7.1 will be very much
higher than in earlier versions, for reasons I won't go into here.

2. Stability improves with each release, this much is clear.  PG 7.0
stability seems very good to me, PG 7.1 should be even better.
However, the PG team still finds and fixes potentially nasty,
database-corrupting bugs.  They're becoming exceedingly rare and I
*think* with PG 7.0.3 we're out of the woods, but could be wrong.

I've run into nasty and disturbing Oracle bugs while working on client
sites (or, since I'm a Postgres fan, "encouraging and spirit-lifting
bugs", perhaps?).  But none that threaten corruption of the database -
Oracle's more mature in that respect.

On the other hand, I've been running birdnotes.net for over a year
now, and though it gets little traffic it has accumulated thousands of
census reports containing nearly a million rows of per-species data,
with no problem whatsoever.  The server stays up for months at a time.
Tcl 8x, with its regexp memory leaks, is a bigger threat to stability
than Postgres judging from my experience.

3. Pg_dump backups in version 7.0 require hand-massaging at times
before you can reload them, due to misordering of the definitions of
some elements leading to illegal forward references.  Other than
that, they're fine for databases whose dumps don't exceed the
compressed 2GB limit of a Linux file.  The problem should be fixed in
PG 7.1.

aD and I believe most others who use ACS Classic knock down AOLserver
nightly and do a database export rather than depend on the fancy
backup-and-restore tools available for Oracle.  So this supposed
advantage to Oracle is unused by the average ACS Classic admin.  The
same approach could be taken on a Postgres site if one doesn't trust
pg_dump for some reason: take down AOLserver, then Postgres, dump the
database directory to a backup device, bring everything back up.

Still, backup and recovery is a weakness in PG at the moment, one that
requires a bit of a stone-age mentality to cope with.

OK, now for the record, Michael has asked me this question privately,
and for his particular customer I've recommended they go with Oracle,
based in part on the his timeframe.  His needs would be better suited
by PG 7.1, but that's not due for general release until just about the
time Michael needs to fund work on the customer site.  ACS 4.0's a
potential issue, too.

For other some other scenarios I would heartily recommend PG 7.0, and
if PG 7.1 proves to be a solid as I expect, in six months I'll be
recommending PG 7.1 for just about any project I'm asked about,
particularly after OpenACS 4.0 is released (and ACS 4.0 Classic
matures, for that matter).

Collapse
Posted by Michael Feldstein on
Now that we're up to PostgreSQL 7.1.x, I thought it might be
useful for the community to revisit this issue. How does
PostgreSQL compare to Oracle for ACS-like applications? When
is Postgres the clear winner? When is Oracle the clear winner?
When is it a toss-up?
Collapse
Posted by Torben Brosten on
Another related question:

Does PostgreSQL have a backup api similar to Oracle's EBU or RMAN (preferrably) for applications such as Veritas' NetBackup?

Collapse
Posted by Don Baccus on
The answer to the last question is "no".  There is no such database backup API.  pg_dump gives a consistent backup so for your average web/db site it's adequate.  My tcl script for dumping and ftp'ing the result to another machine is included as an example in the documentation.  My site with important data (birdnotes.net) is ftp'd from the server to my box at home, and another box on the east coast, so I feel safe enough.

However ... it's not a huge database.  A million or so data records but no large documents, photos, etc.  Quite modest.  PG falls down when you have huge amounts of data stored because there's no way to do consistent incremental backups.

PG performs worse when you have frequent update or paired insert/delete operations as old rows stick around until they're removed by a VACUUM process.  This hasn't changed.  PG 7.2 has a "lazy vacuum" that runs in the background and knocks off low-hanging fruit, if you will, and as a consequence is much less disruptive to the running system as it doesn't need to lock things down nearly so tightly.

But though this is an improvement it's not a 100% solution.  The keeping of old rows is one thing that makes PG's high-concurrency row-level locking and cheap transactions easy to implement, though.

In practice this isn't a big deal for most sites, since generally most do have off-peak periods and most aren't updating the database contents at full speed 24/7 and causing tables to grow massively large and the evening vacuum exceedingly slow.

Other DBA-ish type stuff such as support for named schemas and the ability to place them on various disk drives and the like rather than do symbolic links are coming in PG 7.3.  This will get rid of one of the biggest complaints about PG by folks with large-scale applications, the difficulting of scattering data and indexes on various spindles in order to decrease the time spent waiting for physical I/O.

Again, though, most web/db sites aren't anywhere near busy enough to worry all that much about this, given that machine with 2GB or 4GB of RAM have more RAM than most database systems had in available disk space not all that long ago!

Collapse
Posted by Andrew Piskorski on
Don, so Postgres doesn't have anything like Oracle's RMAN utility. But, is there any way at all to do hot physical backups using archived redologs (the "write ahead log" for Postgres, I think) the way you can with Oracle? Or any word on when/if the Postgres folks plan to support that?

(RMAN makes doing hotbackups easier for Oracle, but it's not necessary - for example, Unix Backup & Recovery describes the oraback.sh script which does hot backups without using RMAN.)

Collapse
Posted by Gary Long on
EDB has created an enhanced version of PostgreSQL 8 with Oracle compatibility as a major theme. See www.enterprisedb.com.
Collapse
Posted by Jose Marinez on
Hi Gary,

Does EnterpriseDB run on OS X Tiger? Can I use the same development tools from Oracle and use it to develop for EnterpriseDB?

Thanks.

Collapse
Posted by Gary Long on
Definitely runs on OSX. Depends on what development tools: PS/SQL yes, Oracle Forms no.

Sorry my reply is so late! I missed your post.

Gary

Collapse
Posted by Silviu Dascal on
Hi Guys ,
does anybody know how u can migrate from Postgres to Oracle without third party tools ? Or without using flat files :) ?
Tnks ,
Silviu
Collapse
Posted by Jade Rubick on
I think your answer is also going to depend on how much you're willing to use the full powers of the database.

If you're really using the database to its fullest abilities, Oracle is going to be a winner for you hands down.

If you're looking for easy maintainability, Postgres will win hands down.

If you're competing on price, well, do I even have to mention which will win?

Oracle has some powerful functionality that is unbeatable in certain circumstances. With 11g, you'll now have the ability to pretty much always keep your database up. You'll be able to upgrade pl/sql pacakges without downtime or invalid db objects.

And the performance of Oracle on large large databases is pretty hard to beat, especially when you start talking about analytic functions. Oracle in general latches much less than other databases, which gives it much more scalability. But most sites really won't run into these issues unless they're huge or really pushing the envelope.

Oracle is definitely more mature than Postgres, but Postgres is improving quickly. It's no longer as easy a choice as it used to be.

If you want to understand where Oracle shines, read any book by Tom Kyte.