Forum OpenACS Q&A: Response to Postgres database recovery

Collapse
Posted by Don Baccus on
I suggest you take this question over to the appropriate postgres mailing list and ask for help.  You may be out of luck.  Bullet-proofing of the database against this kind of unexpected shutdown is a major goal of PG 7.1, which will implement a write-ahead logging system (in essence, you get two copies of the database at all times, last backup + write-ahead log == current database + transaction  log).

I'm surprised you did so much harm by shutting down the system, you weren't running with "-F" (no fsynch), were you?  In another thread I pointed out that Linux can only guarantee that blocks make it to the disk drive, which in practice means the drive's cache, and platter writing can lag, causing problems in the case of a crash.  If the system was busy when you shut it down, something like this might have caused the corruption you're seeing.

Normally, unexpected shutdown just causes the loss of uncompleted transactions, not corruption like this.

You can help avoid problems by:

1. Using the postgres script in /etc/rc.d/init.d to start and stop the  postmaster (assuming you're running RH, the scripts move around according to distro).  System startup and shutdown also uses this script, gently killing the postmaster so it shuts down cleanly when you stop the system.  Then you don't have to worry about remembering to do so manually.

2. Doing backups.  The documents have a script I've written for my own  production site that you can modify for your own site.