Forum OpenACS Development: RFC: Separate code and data directories by default in 5.2

1: RFC: Separate code and data directories by default in 5.2

Posted by Joel Aufrecht on 04/16/04 04:12 PM

I propose that we change OpenACS 5.2 so that (as many have suggested) the executable code directories are separated from the data directories (content repository files, logs, etc.).

Reasons

1. Better security. With an enforced distinction, it's harder to put data in through the web interface that OpenACS will then execute.
2. Simplifies production backups. In any system with multiple webservers, you want to back up the shared data only once.
3. Simplifies cvs. Usually you want different cvs regimes for the volatile data vs. the code. This puts all the data in one place. (And probably means the data directory shouldn't be a subdir of the code dir.)

Approach

1. New recommended directory in /service0/: data. Should this be inside the service0 directory or parallel (like /service0-data)?
2. Change the default config.tcl to have two root directories, one for data and one for code. Put log and content-repository-content-files directories in the data directory. (Others? Photos? Can we do all this in config.tcl or do we have to touch the packages too?)
3. Add a flag in config.tcl, unset by default. If set, the flag prevents OpenACS from writing anything at all anywhere but the data directories (and tmp?). This blocks upgrades (which could be malicious), dirty code writes, mistaken overwrites, etc.
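
To make the write-restriction flag concrete, here is a minimal sketch of the check it implies, written in Python purely for illustration (OpenACS itself is Tcl). The function name is_write_allowed, the enforce parameter, and the example paths are all hypothetical; nothing here is an existing OpenACS or AOLserver API.

```python
import os


def is_write_allowed(target, data_roots, enforce=True):
    """Return True if `target` may be written when the (hypothetical) flag is set.

    `data_roots` lists the directories treated as data (and tmp, if desired).
    With `enforce=False` the flag is unset and every write is allowed.
    """
    if not enforce:
        return True
    # Resolve symlinks and ".." so a path cannot escape the data root.
    real_target = os.path.realpath(target)
    for root in data_roots:
        real_root = os.path.realpath(root)
        # commonpath equals the root only if target is the root or below it.
        if os.path.commonpath([real_target, real_root]) == real_root:
            return True
    return False
```

Comparing whole path components rather than string prefixes means a sibling such as /service0-data-old is not mistaken for something inside /service0-data. A check like this would still have to be called from every place that writes files, which is where most of the real work would lie.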

There are a lot of details missing before this can be a TIP. What else needs to go in here? Where does /etc belong? I think /etc should be in the code area.

3: Re: RFC: Separate code and data directories by default in 5.2 (response to 1)

Posted by Cathy Sarisky on 04/16/04 05:24 PM

This would be very useful. It can be a bit confusing in the current file structure to figure out where all the data is.

1- My preference for directories would be to have a data AND a code directory inside service0.

I agree about etc living in the code area. :)

2: Re: RFC: Separate code and data directories by default in 5.2 (response to 1)

Posted by Jeff Davis on 04/16/04 05:40 PM

I don't necessarily want logs in the same place as data, since generally on a prod site I like them to be written to a separate device from where the content repository data is, but otherwise I agree it's a good idea to isolate the instance-specific data from the codebase.

Another pet peeve is the existence of the directories in cvs. I hate seeing:

? content-repository-content-files/10
? content-repository-content-files/11
...
? content-repository-content-files/99
? log/error.log
? log/error.log.000
...
etc.

when I do a cvs update.

I don't see how you could enforce #3 without a serious overhaul of the code though.

It would be good if it were possible to have the code on a read-only partition, which argues against having the data directory in the same place as the code.

I like having the content repository and the webserver tmp dir on the same partition so that uploads are just renames (and for the anal that partition could be noexec).
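
The point about renames can be illustrated with a small sketch, again in Python for illustration only. A rename is only a cheap metadata operation when source and destination are on the same filesystem; across partitions the file has to be copied. The helper names same_filesystem and store_upload are hypothetical.

```python
import os
import shutil


def same_filesystem(path_a, path_b):
    """True if both existing paths live on the same device (partition)."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


def store_upload(tmp_file, dest_file):
    """Move an uploaded temp file into the content repository.

    Returns "renamed" when a plain rename was possible and "copied"
    when the file had to be copied across filesystems.
    """
    tmp_dir = os.path.dirname(tmp_file) or "."
    dest_dir = os.path.dirname(dest_file) or "."
    if same_filesystem(tmp_dir, dest_dir):
        os.rename(tmp_file, dest_file)  # no data is copied
        return "renamed"
    shutil.move(tmp_file, dest_file)  # falls back to copy + delete
    return "copied"
```

With the tmp dir and the content repository on one partition, every upload takes the first branch.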

5: Re: RFC: Separate code and data directories by default in 5.2 (response to 2)

Posted by Steve Manning on 04/16/04 09:59 PM

Jeff

A .cvsignore file is your friend. Just pop one of these little beauties into the root of your service dir and stick the following in it:

apm-workspace
content-repository-content-files
database-backup
etc
log
.*

And cvs will ignore those directories (and files starting '.')

If the directories are already known to CVS you may have to make it forget first - probably removing the CVS directories would do it.
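
As a rough way to see which names the list above covers, the sketch below approximates the matching with Python's fnmatch module. This is an assumption for illustration: cvs uses sh-style filename patterns, and fnmatch is close to, but not guaranteed identical to, what cvs does. The helper name is_ignored is hypothetical.

```python
from fnmatch import fnmatchcase

# The patterns from the .cvsignore shown above.
CVSIGNORE_PATTERNS = [
    "apm-workspace",
    "content-repository-content-files",
    "database-backup",
    "etc",
    "log",
    ".*",
]


def is_ignored(name, patterns=CVSIGNORE_PATTERNS):
    """Approximate check: does `name` match any ignore pattern?"""
    return any(fnmatchcase(name, pattern) for pattern in patterns)
```

Under this approximation "log" and ".cvsignore" are ignored, while "packages" and "www" are not.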

Steve

4: Re: RFC: Separate code and data directories by default in 5.2 (response to 1)

Posted by Peter Alberer on 04/16/04 06:45 PM

Hi Joel,

Is this just an issue of changing the definition of the "standard" production environment? In my environment, for example, the content-repository is the only "data"-like store under the openacs-root. All the other things that an openacs-instance needs (database-dir, database-logs, aolserver-access-log, aolserver-error-log, aolserver-bindir, supervise-scripts, ...) live somewhere else on the disks, maybe symlinked to the openacs-dir, but not necessarily. As far as point 3 is concerned: my cvs approach is to create a checkout of the whole openacs tree and symlink the necessary directories to my real openacs-root (this way I can use "cvs update -Pd" later on to update all levels of dirs).
Did I misunderstand the reasons for your RFC?

6: Re: RFC: Separate code and data directories by default in 5.2 (response to 1)

Posted by Don Baccus on 04/16/04 10:56 PM

Yes, this would be a very good thing to do ...