Forum OpenACS Q&A: Content Repository: Pros & Cons

Posted by Adam Farkas on
Can anyone give me a clear explanation of the advantages and
disadvantages of the "content repository", versus the old (3.x) system
of each module generating its own tables, etc.?

I don't really understand the advantages of the new system or,
in any concrete sense, its drawbacks.


Posted by Don Baccus on
The biggest advantage is that it already does a bunch of the work for you, which means you have to do less wheel reinventing.

For instance, the file storage package, CMS and photo album are all built around a hierarchical folder/file paradigm. This comes more or less for free if you use the content repository (which includes file and folder content objects).

The CR also implements revision tracking for client packages, and in the Oracle case hooks into content processing stuff provided by the RDBMS.

Now ... the toolkit as inherited doesn't leverage the CR as much as it should.  For instance each of the above mentioned packages allows for file uploading and display, but the CR gave no help there.  That's something I've been working on but have yet to incorporate throughout various packages.  Where I have, though, big chunks of code have shrunk to little itty-bitty calls to my additions to the CR's Tcl API.

Another advantage of having it all in one place is that central services like searching, categorization, etc can all work on a common datamodel.  They don't have to "know" if package A implements revision tracking while package B doesn't, etc etc.

It's all about writing less code when you invent a new package, and not having to deal with so many special cases when you invent a central service.

At the moment in some respects, at least, it's more in theory than in practice, but with practice it will hopefully come to equal theory :)

Posted by Alex Sokoloff on
I looked at the Content Repository last fall to support a sort of one-off content management application for a website, one where the site had a fixed page structure, but various elements of the pages - images, articles, and page layouts could be updated. At first the CR seemed perfect for the job - that's one of the main types of functions it's supposed to support, judging by the documentation.

As I got more into the details, though, it didn't seem like there was a very clean way to use it. As far as I could tell, each page needed to be modelled as a container content item, and images and text elements could be child content items. This was necessary because a given page could contain a variable number of images and so on. Working this way, I couldn't see a non-ugly way to preserve/reproduce the history of revisions. When you use Edit This Page, for example, you can look at past versions of a page. But ETP probably doesn't use parent and child items. When you do use hierarchies of cr_items, only the parent and child content items are related to one another; there's no relationship between their respective revisions.

This felt like a design limitation at the time, but I didn't get to pursue the problem as much as I'd have liked to (project canceled) and maybe I was missing something important.

Posted by Don Baccus on
No, the CR doesn't provide that service for you.  You'd have to build on top of it.  I think doing so would be easier than starting from scratch, though ...
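
As a rough illustration of what "building on top of it" could mean here, the following is a minimal sketch in Python, not OpenACS code. It assumes a page version is recorded as a snapshot of which child revision was current at that moment. The `Item` and `PageHistory` classes and all their methods are hypothetical stand-ins; only the idea of parent and child items with their own revisions comes from the discussion above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Item:
    """Hypothetical stand-in for a content item; revisions are kept oldest first."""
    item_id: int
    parent_id: Optional[int] = None
    revisions: List[str] = field(default_factory=list)

    def add_revision(self, content: str) -> int:
        """Store a new revision and return its index."""
        self.revisions.append(content)
        return len(self.revisions) - 1


class PageHistory:
    """Hypothetical layer that remembers which child revisions made up each page version."""

    def __init__(self, page: Item, children: List[Item]) -> None:
        self.page = page
        self.children = children
        # One dict per page version: child item_id -> revision index.
        self.snapshots: List[Dict[int, int]] = []

    def snapshot(self) -> int:
        """Record the latest revision of every child that has one; return the page version."""
        self.snapshots.append(
            {c.item_id: len(c.revisions) - 1 for c in self.children if c.revisions}
        )
        return len(self.snapshots) - 1

    def render(self, version: int) -> List[str]:
        """Reproduce the child contents as they were at the given page version."""
        by_id = {c.item_id: c for c in self.children}
        return [by_id[i].revisions[r] for i, r in self.snapshots[version].items()]


if __name__ == "__main__":
    page = Item(1)
    image = Item(2, parent_id=1)
    text = Item(3, parent_id=1)
    history = PageHistory(page, [image, text])

    image.add_revision("logo-v1.png")
    text.add_revision("Welcome")
    v0 = history.snapshot()

    text.add_revision("Welcome back")
    v1 = history.snapshot()

    print(history.render(v0))  # ['logo-v1.png', 'Welcome']
    print(history.render(v1))  # ['logo-v1.png', 'Welcome back']
```

The snapshot table is the extra relationship between parent and child revisions that the CR itself does not keep; in a real package it would presumably live in a package-specific table.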

I'm not sure the CR's really the right level to do this.  It would be nice to have a central service that did, but shoving more and more higher-level functionality into the CR makes it less and less approachable if all you need is the simple stuff.

Posted by Stephen . on
Con:  Some time must be invested to learn the content repository
data model and procs etc. The prevalence of 'lite' packages
suggests that it is too much trouble to learn how to use the
centralised services at the moment. The porting effort has made
more example code available as packages are moved to the
content repository, which will help.

Posted by Jun Yamog on
A great data model once you get used to it. It gives you somewhere to start and an initial mindset for how to store content. I especially like the ability to store content in the file system. I have used this ability to reduce the storage needs of binary files. Changes to the title and description can be made without copying the binary data, since you can just copy the file name from the previous version, a la "ln -s" in Unix. Thanks to Luke P for pointing this out to me.
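
The file name trick can be sketched roughly as follows. This is a simplified Python model, not the CR's actual API: the `Revision` and `FileBackedItem` names, the `rev-N.bin` naming scheme, and the storage directory are all assumptions made for illustration. The point it shows is that a metadata-only revision reuses the previous revision's file name, so no binary data is written again.

```python
import os
import tempfile
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Revision:
    title: str
    description: str
    filename: str  # name of the file holding the binary content


class FileBackedItem:
    """Hypothetical item whose revisions point at files in a storage directory."""

    def __init__(self, storage_dir: str) -> None:
        self.storage_dir = storage_dir
        self.revisions: List[Revision] = []

    def add_revision(
        self, title: str, description: str, data: Optional[bytes] = None
    ) -> Revision:
        """Add a revision; when data is None, reuse the previous revision's file."""
        if data is None:
            if not self.revisions:
                raise ValueError("the first revision needs binary data")
            # Metadata-only change: point at the existing file, much like "ln -s".
            filename = self.revisions[-1].filename
        else:
            filename = f"rev-{len(self.revisions)}.bin"
            with open(os.path.join(self.storage_dir, filename), "wb") as fh:
                fh.write(data)
        revision = Revision(title, description, filename)
        self.revisions.append(revision)
        return revision


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as storage:
        photo = FileBackedItem(storage)
        photo.add_revision("Beach", "Holiday photo", data=b"\x89PNG...")
        photo.add_revision("Beach at sunset", "Holiday photo, retitled")
        # Two revisions, but only one file on disk.
        print(len(photo.revisions), os.listdir(storage))
```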

Other packages, such as search, can use your data.


Poor documentation, but I believe Roberto Mello is on the way to the rescue. The lack of documentation shows in how file-storage, news, etp, etc. each insert and fetch data differently. Some insert directly into cr_items and cr_revisions, while others use the CR db APIs.

You have to be careful not to touch the content of other packages unintentionally. An unqualified delete from cr_items/cr_revisions will wipe out the content of other packages too.
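
To make the risk concrete, here is a small sketch using Python's built-in sqlite3 module with a toy table. It is not the real OpenACS schema: the `owner_package` column and both function names are hypothetical, introduced only to show a delete that is scoped to one package's rows rather than the whole shared table.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build a toy shared table holding rows from two different packages."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE cr_items ("
        " item_id INTEGER PRIMARY KEY,"
        " owner_package TEXT NOT NULL,"  # hypothetical ownership marker
        " name TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO cr_items VALUES (?, ?, ?)",
        [
            (1, "news", "launch"),
            (2, "news", "update"),
            (3, "file-storage", "report.pdf"),
        ],
    )
    conn.commit()
    return conn


def delete_package_items(conn: sqlite3.Connection, package_key: str) -> int:
    """Delete only the rows owned by one package; return how many were removed."""
    cur = conn.execute(
        "DELETE FROM cr_items WHERE owner_package = ?", (package_key,)
    )
    conn.commit()
    return cur.rowcount


if __name__ == "__main__":
    conn = make_demo_db()
    print(delete_package_items(conn, "news"))  # 2
    print(conn.execute("SELECT name FROM cr_items").fetchall())  # [('report.pdf',)]
```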


Regarding the parent and child relationship, I think it's possible to use the parent_id column of cr_items. Right now parent_id is used to reference the folder_id of cr_folders, but from what I remember there is no constraint against using it for another reference. For example, the parent_id of one cr_items row can point to the item_id of another, thereby grouping cr_items. Or you can just create a cr_folder and treat each folder as a group of items. The CR is generic enough for you to use it the way you need, which is a great advantage.
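
A minimal sketch of the grouping idea, again using Python's sqlite3 module and a deliberately simplified table. Only the `item_id` and `parent_id` column names come from the discussion; the rest of the table, the sample rows, and the `descendants` helper are assumptions for illustration. Here parent_id points at another item rather than a folder, and a recursive query walks the resulting group.

```python
import sqlite3
from typing import List, Tuple


def make_demo_db() -> sqlite3.Connection:
    """Build a toy cr_items table where parent_id points at another item."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE cr_items ("
        " item_id INTEGER PRIMARY KEY,"
        " parent_id INTEGER,"
        " name TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO cr_items VALUES (?, ?, ?)",
        [
            (1, None, "home-page"),
            (2, 1, "banner-image"),
            (3, 1, "lead-article"),
            (4, 3, "article-photo"),
        ],
    )
    conn.commit()
    return conn


def descendants(conn: sqlite3.Connection, root_id: int) -> List[Tuple[str, int]]:
    """Return (name, depth) for every item grouped under root_id, nearest first."""
    return conn.execute(
        """
        WITH RECURSIVE tree(item_id, name, depth) AS (
            SELECT item_id, name, 1 FROM cr_items WHERE parent_id = ?
            UNION ALL
            SELECT c.item_id, c.name, t.depth + 1
            FROM cr_items c JOIN tree t ON c.parent_id = t.item_id
        )
        SELECT name, depth FROM tree ORDER BY depth, item_id
        """,
        (root_id,),
    ).fetchall()


if __name__ == "__main__":
    conn = make_demo_db()
    for name, depth in descendants(conn, 1):
        print("  " * depth + name)
```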

Having generic tables like cr_items, cr_revisions, and acs_objects may be confusing at first. I think it's the right step, though, except for the performance penalty, since these generic tables handle many packages. The negative effects are more rows of data in a single table and joins to package-specific tables.