Forum OpenACS Q&A: Backup content-repository-content-files

Due to the sheer amount of data (around 20GB per month) I would like to change the backup behaviour from plain rsync (which is running out of disk space on the target server) to something a little different. Here is what I envision:

For each month I will have a directory on the target server, which a cron job fills every couple of hours (using rsync) with the changes that happened in that month.

As rsync does not seem to allow me to say something like "only files changed in May 2006", I thought about using find -mtime, dropping the resulting list into an include.txt and having rsync use it. But that was not entirely successful either and is a PITA to use.
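One way the find-plus-rsync idea could be wired together is sketched below. It is only a sketch under stated assumptions: GNU find (for -newermt) and GNU date (for the "+1 month" arithmetic) are available, and the function names list_month_files and backup_month are made up for illustration. It swaps -mtime, which counts days back from now, for explicit date bounds, and pipes the list straight into rsync instead of going through an include.txt.

```shell
#!/bin/sh
# Sketch only: assumes GNU find (-newermt) and GNU date (-d "... +1 month").
# list_month_files and backup_month are hypothetical helper names.

# list_month_files SRC YYYY-MM
# Prints NUL-separated paths, relative to SRC, of regular files whose
# mtime is after the first midnight of the month and not after the
# first midnight of the following month.
list_month_files() {
    lm_src=$1
    lm_start="$2-01"
    lm_end=$(date -d "$lm_start +1 month" +%Y-%m-%d)
    ( cd "$lm_src" && find . -type f -newermt "$lm_start" ! -newermt "$lm_end" -print0 )
}

# backup_month SRC DEST_ROOT YYYY-MM
# Copies that month's files into DEST_ROOT/YYYY-MM, keeping the
# directory layout below SRC. rsync reads the file list from stdin.
backup_month() {
    bm_src=$1
    bm_dest="$2/$3"
    mkdir -p "$bm_dest"
    list_month_files "$bm_src" "$3" |
        rsync -a --from0 --files-from=- "$bm_src/" "$bm_dest/"
}
```

From cron this might be called as `backup_month /content-repository-content-files /backup/cr 2006-05`. Every run lists the whole month again, but rsync skips files that are already up to date in the destination, so repeated runs every couple of hours should only transfer what is new.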

Which brings me to two questions:

a) Has someone solved this problem already?

b) Would using OpenACS to do this be better, and if yes, in which way?

-- Use a scheduled procedure to get a list of content-items that have been created since the last call of the procedure and copy them to the appropriate directory on the backup server

-- Change the content repository to do the same thing upon content upload (so the file is written to /content-repository-content-files and /backup/cr/2006-05)

-- something else ?
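The "since the last call" idea from the first option can also be tried at the filesystem level, without touching OpenACS. The sketch below is a stand-in for the scheduled procedure, not an OpenACS implementation: it compares file mtimes against a stamp file rather than querying content-items. The function name copy_since_last_run and the stamp file are hypothetical, and the stamp path is assumed to be absolute and outside the source tree.

```shell
#!/bin/sh
# Sketch only: a filesystem-level stand-in for "items created since the
# last call". copy_since_last_run and the stamp file are hypothetical.

# copy_since_last_run SRC DEST STAMP
# Copies regular files under SRC modified after STAMP's mtime into DEST.
# STAMP must be an absolute path outside SRC. On the first run (no stamp
# yet) everything is copied.
copy_since_last_run() {
    cs_src=$1
    cs_dest=$2
    cs_stamp=$3
    cs_next="$cs_stamp.next"
    mkdir -p "$cs_dest"
    # Record the start time before scanning, so files written while the
    # copy is running are picked up again by the next run.
    touch "$cs_next"
    if [ -f "$cs_stamp" ]; then
        ( cd "$cs_src" && find . -type f -newer "$cs_stamp" -print0 )
    else
        ( cd "$cs_src" && find . -type f -print0 )
    fi | rsync -a --from0 --files-from=- "$cs_src/" "$cs_dest/" &&
        mv "$cs_next" "$cs_stamp"   # advance the stamp only if rsync succeeded
}
```

Because the stamp only advances after a successful rsync run, a failed run is simply retried with the old stamp next time.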

Posted by Nis Jørgensen on
I am not really sure how your proposed solution saves disk space, unless you have previously been storing multiple copies of the same file (in different backups). If that is the case, you may want to look at using hardlinks, for instance using rsync's '--link-dest' option.
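For reference, a minimal sketch of the hardlink approach with '--link-dest'. The function name snapshot and the directory layout are assumptions for illustration. Files that are unchanged since the previous snapshot are hard-linked instead of copied, so each snapshot looks complete but only changed files take up new space.

```shell
#!/bin/sh
# Sketch only: snapshot is a hypothetical helper name.

# snapshot SRC BACKUP_ROOT NAME [PREVIOUS_NAME]
# Creates BACKUP_ROOT/NAME as a full copy of SRC. If PREVIOUS_NAME is
# given and exists, unchanged files are hard-linked to that snapshot.
snapshot() {
    sn_src=$1
    sn_root=$2
    sn_new="$sn_root/$3"
    mkdir -p "$sn_new"
    if [ -n "${4:-}" ] && [ -d "$sn_root/$4" ]; then
        # A relative --link-dest is resolved against the destination
        # directory, so pass an absolute path to avoid surprises.
        sn_prev=$(cd "$sn_root/$4" && pwd)
        rsync -a --link-dest="$sn_prev" "$sn_src/" "$sn_new/"
    else
        rsync -a "$sn_src/" "$sn_new/"
    fi
}
```

A second run such as `snapshot /content-repository-content-files /backup/cr run2 run1` (snapshot names made up here) would then only store the files that changed since run1.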
Posted by Malte Sussdorff on
Ah, I omitted one very important fact. Every month the monthly backup directory on the target system will be written to tape and then deleted. So on the 1st of May, the April backup is written to tape and the directory containing the April backups is deleted from the backup server.