Forum OpenACS Q&A: database recovery: how does postgres unix work?

We were hacked last Friday. Rackspace informed us that one of our servers had denial of service script running in /var/tmp owned by the user root:

- -rw-r--r-- 1 root cgi 1128 Mar 14 05:26 flood

RPM verification showed that the size, timestamps, and MD5 checksums have changed on multiple essential system files! Rackspace tech support basically removed the hard drive, gave us a new hard drive with a fresh Redhat Linux running on it, and mounted the old hard drive on that.

We tracked the way they got in. At one point we were testing some proprietary software, which used a licence manager called FlexLM -- FlexLM was listening on some strange port and running as root. (BTW, this really bugs me about using proprietary software. The fact that they would use something that has a widely known hack -- presumably to protect them from having us giving away their lame software -- and then not tell their customers about it! In the future, I would attemt to investigate any software we put on the system...But that's another story.)

I was using the server to run -- a OpenACS community site for film makers. Much of the data was in postgres, which wasn't getting backed up. I have a week to get our data off the mounted hard drive before Rackspace removes it. My questions run along the lines of: Where does the data exist when it's not backed up? Is it possible to recover from this? I couldn't find an mention of this sort of situation on google.

Eric Wolfram

Posted by Patrick Giagnocavo on
Eric, you are probably in good shape and will not lose any data.

The postgres data files can all be found under the postgres data directory.

Try running "pg_config --configure" (the old binary, assuming that it matches the MD5 checksums and is not therefore compromised); or if you have exactly the same setup on the new system, just look at where the data files are there. The data directory usually holds postgresql.conf and pg_hba.conf as well - so typing "locate pg_hba.conf" should work.

How to recover: assuming that both Postgres instances on the old and new drives are the same version and revision of the Postgres code:

make a backup of the old data directory, e.g. cp -R /old/data /root/pgbackup-data

su - postgresql (or whatever the postgres user is called on your system)

export PGDATA=/old/path/to/files
pg_ctl start (after a few lines you should see that Postgres is running)

then run psql -l to see if your databases are there.

At this point, the first thing to do is make a backup using pg_dump ; then, if that is successful, start up Postgres using the "new" file locations that reference the new drive and load your pg_dump backups.

Posted by Eric Wolfram on
First of all, I'm glad you think the data is recoverable. And thanks for the tips. I followed this advice. I didn't want to run any executables from the compramised system. I couldn't rpm verify the old postgres because I had installed it from a tar file and I didn't have the checksum number.

So I went to download Postgres 7.2.3. When I tried to RPM postgres, there were too many dependency errors. So I grabbed the tar file and did a gmake and gmake install. I followed the old postgres manual to install. It was all going great and I could set up a database called 'test'. Then I followed your instructions:

bash-2.05$ export PGDATA=/mnt/oldboot/usr/local/pgsql/data
bash-2.05$ pg_ctl start
postmaster successfully started
bash-2.05$ /usr/bin/postmaster does not find the database system.
Expected to find it in the PGDATA directory "/mnt/oldboot/usr/local/pgsql/data",
but unable to open file "/mnt/oldboot/usr/local/pgsql/data/global/pg_control": Permission denied


bash-2.05$ psql -l
List of databases
Database | Owner | Encoding
template0 | postgres | SQL_ASCII
template1 | postgres | SQL_ASCII
test | postgres | SQL_ASCII
(3 rows)

My databases weren't there. Do you think the 'Permission denied' error is a permission problem or something more complicated?

Posted by Eric Wolfram on
Thanks for helping out. This message makes me wonder if it is permissions:

but unable to open file "/mnt/oldboot/usr/local/pgsql/data/global/pg_control": Permission denied

Because don't we want it to look for that file in /usr/local/pgsql/global/pgcontrol and not on the old mount? It's strange, I ran the chown as root and it basically repeats this for everything.

chown: changing ownership of `/mnt/oldboot/usr/local/pgsql/data/postmaster.opts': Read-only file system
chown: changing ownership of `/mnt/oldboot/usr/local/pgsql/data/postgresql.conf~': Read-only file system
chown: changing ownership of `/mnt/oldboot/usr/local/pgsql/.psql_history': Read-only file system
chown: changing ownership of `/mnt/oldboot/usr/local/pgsql/.bash_history': Read-only file system

here's the ls -l

[root@rhumba pgsql]# ls -l
total 28
drwxr-xr-x 2 506 507 4096 Feb 17 2003 bin
drwx------ 6 506 507 4096 May 13 10:06 data
drwxr-xr-x 4 506 507 4096 Feb 17 2003 doc
drwxr-xr-x 4 506 507 4096 Feb 17 2003 include
drwxr-xr-x 2 506 507 4096 Feb 17 2003 lib
drwxr-xr-x 4 506 507 4096 Feb 17 2003 man
drwxr-xr-x 3 506 507 4096 Feb 17 2003 share
[root@rhumba pgsql]#

I don't know users 506 or 507? Hmmmmm. I then cp the data directory to /tmp/data and I was able to chown it to postgres. But then it still gave the similar message:

but unable to open file "/tmp/data/global/pg_control": No such file or directory

Any thoughts?

Posted by Janine Ohmer on
A few things come to mind:

First, is the file actually there? If not, try running

find /mnt/oldboot -name pg_control

and see if it's there anywhere. You might be using the wrong PGDATA directory.

If it is there, what are it's permissions? You probably created a new postgres user on the new drive and it doesn't have the same user id as the old one, which can result in the user no longer owning it's files. If that's the case you'll see a number in the output from "ls -l" where you expected the name of the postgres user to be (or possibly some other username altogether if you happen to have a user on the new system that got assigned the same id).

Another possibility is that the rackspace folks mounted the drive in read-only mode.

Either way, the best fix is probably to copy the whole pgdata directory to your local drive, and then try again (pointing PGDATA at the new location, of course). That way you can chown and do any other fiddling that may be required without making irreversable changes to your only copy of the files. You might need to do the copy as root and then "chown -R postgres:postgres" (or whatever you called your user and it's group) on the copy.

Good luck!

Posted by Janine Ohmer on
Oops, we posted at the same time.

I think the Rackspace folks did mount the drive as read-only, which was probably smart.

It's looking for the file in the right place, based on what you set PGDATA to. But I'm confused by the last error; using a copy should have worked.

What do you get from "ls -l /tmp/data/global", and what did you set PGDATA to on your last attempt?

Posted by Eric Wolfram on
Thanks for the help and reality check. I needed to set it right -- but that brings up a new issue.

bash-2.05$ export PGDATA=/tmp/data_bak/data
bash-2.05$ pg_ctl start
postmaster successfully started
bash-2.05$ FATAL 1: The data directory was initialized by PostgreSQL version 7.2, which is not compatible with this version 7.1.3.

I'm sure I installed 7.2.3, ie:

[root@rhumba local]# ls
bin doc etc games include lib libexec pgsql postgresql-7.2.3 postgresql-7.2.3.tar sbin share src

Why would the system think it's 7.1.3?

[root@rhumba local]# ps auxww | grep sql
postgres 28668 0.0 0.5 5300 1328 pts/1 S 16:43 0:00 /usr/local/pgsql/bin/postmaster -D /usr/local/pgsql/data
root 29572 0.0 0.1 1476 460 pts/2 R 19:38 0:00 grep sql
[root@rhumba local]#

I get the feeling that I have enough sql-rope to hang myself. :-) I'm just remembering all the RPM dependency errors I was getting that made me try the 'download the tar version and gmake gmake install' way...

I guess no one said it would be easy. Why does the system think I have 7.1.3 when I just installed 7.2.3?

Posted by Janine Ohmer on
Try a couple of things:

"psql -V" - will give you the version. I'll bet you a quarter it says 7.1.3.

"which psql" - tells you which psql executable is in your path, which is therefore the one that gets executed. I'll bet another quarter it's not the one you just installed.

I'm guessing the box already had PG on it, probably from RPM, and you've added another copy. This is not necessarily bad, but it means you've got to set environment variables to make sure you're using the right one. I don't have a good example handy but I believe they're listed in the OpenACS installation guide. The most important thing is to make sure that the directory containing the right PG executables is ahead of the one containing the wrong ones in your PATH variable.

Another option is to try removing the rogue RPMs but you might get caught in another web of dependencies doing that. It's worth a try, though, since you don't seem to need the old version.

Posted by Eric Wolfram on
Okay, you're right you're right. I had another version going. After messing with the env .bash_profile and .bachrc files for awhile, with no progress, I decided to use RPM -e postgresql -- of course I had to do that to about 4 or 5 things, like postres-server and postgres-lib stuff -- but eventually it was all off. Then I did a gmake install again and recreated the postgres user and deleted the /usr/local/pgsql/data director and restarted the server...after a few things with changing permissions with the lockfiles...LAAAAAAAAA...I can see the old databases!!!!

[postgres@rhumba pgsql]$ export PGDATA=/tmp/data_bak/data
[postgres@rhumba pgsql]$ pg_ctl start
postmaster successfully started
[postgres@rhumba pgsql]$ DEBUG: database system is ready

[postgres@rhumba pgsql]$ psql -l
List of databases
Name | Owner
birdnotes | postgres
film | postgres
mytlc | postgres
new | postgres
roxbox | postgres
scanster | postgres
template0 | postgres
template1 | postgres
test | postgres
(9 rows)

I'm all ready to try a pg_dump and I'm wondering what options to use. One post in this forum says something like:

pg_dump -Ft -b mytlc > mytlc.tar

or would a simple think work like:

pg_dump mytlc mytlc.dump

Posted by Janine Ohmer on
The simple form is what we always use, but I'd be interested to hear if anyone thinks there is a better way.

Just FYI, now that it works you don't actually have to do a dump and restore. Just shut it down, move that data directory to where you want it's final resting place to be, and adjust PGDATA accordingly. Voila. Of course you need the dump command anyway so you can start doing nightly backups, but there's no need to go through a restore cycle unless you just want to.

Posted by Eric Wolfram on
Thank you everyone! I finally got these system restored and the web site is working again. All the data is there. I had to find old versions of postgres, aolserver and friggin libxml files...make install stuff....cvs complications...what a nightmare -- I should have made some docs when I originally installed everythign (but I was just learning.) I ended up looking for old tar files on the internet and finding old docs for my particular installation in cached google pages and (because the sites that used to hold the docs were down or gone like the dodo bird.) After all the frustration, I was very excited when it actually started -- I almost couldn't believe my eyes...

Couldn't have done it without many of you -- reading your old forum posts, docs and chat logs...thanks. I even found one of my old forum posts, which was surprising.