Forum OpenACS Development: Ideas for Content Management
I've written a proposal for a content management package I'm planning to build with OpenACS 4. It's posted at http://www.museatech.net/etp/editthispage.html.
The requirements it presents are sufficient to build many of the websites we've been getting RFPs for lately. These are information-providing organizations that have reached the limit of what they can do with static pages, and are looking for a system that helps them get organized and work more efficiently. Since I see this package as something that's potentially useful to the whole community, I'd appreciate it if you could use this thread to tell me what you think.
With regard to comments, they could either be retrieved from the database or written to the published file.
With this type of system the pages could actually be pre-rendered into static HTML whenever no part of the page needs to be queried from the database on each read.
I don't have time to go back and look at the paper at the moment, but when I did read it I had the impression that the edited pages were stored in the filesystem, with auditing provided by CVS.
Requirements 20.10.1 and 20.10.2 could be eliminated given 20.10.5. I've read that Netscape 6 will make the DHTML for "wysiwyg editing in a form" much easier (compatible with IE).
We discussed some of this in this thread.
There are tools like WebEditPro that make this possible. How about adding a requirement that makes plugging these tools in easier? I'm thinking of something where we could add WebEditPro (or whatever we write later) to all the forms just by turning on a parameter.
The final requirement (for performance) says that there needs to be some caching, so that if you request the same page multiple times, we're not sending many identical queries to the database to get the content for that page. That probably involves writing a scheduled proc that goes through the cache and culls anything that has expired - we don't want all the content requested in the lifetime of the server process hanging around in memory.
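For what it's worth, here is a rough sketch of what such a sweep might look like under AOLserver, assuming the cache lives in an nsv array named etp_cache whose entries are {expire_time html} pairs. The array name and entry layout are my invention, not anything specified by ETP:

```tcl
# Hypothetical sweep for an nsv-based content cache.  Entries are assumed
# to be stored as:
#   nsv_set etp_cache $page_id [list $expire_time $html]

proc etp_cache_sweep {} {
    set now [ns_time]
    foreach page_id [nsv_array names etp_cache] {
        set entry [nsv_get etp_cache $page_id]
        if { [lindex $entry 0] < $now } {
            # expired: drop it so the server process doesn't slowly
            # accumulate every page ever requested
            nsv_unset etp_cache $page_id
        }
    }
}

# run the sweep every 10 minutes in its own thread
ad_schedule_proc -thread t 600 etp_cache_sweep
```

If I remember right, util_memoize in acs-tcl already accepts a max_age argument, so a hand-rolled sweep like this may only be worth it if ETP wants finer control over eviction.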
With cached content from the db for each page, performance benchmarks should be competitive with any other templated page. And since the templating system compiles ADPs into bytecode, they perform pretty well.
"I don't have time to go back and look at the paper at the moment, but when I did read it I had the impression that the edited pages were stored in the filesystem, with auditing provided by CVS."
ETP will be fully stored in the DB. For UPO, our CMS hack used CVS and the file manager because that was the easiest and fastest way to build it, considering we were using 3.2.4.
You should probably be using the content repository, no?
Michael Slater's post today over <a href="http://openacs.org/bboard/q-and-a-fetch-msg.tcl?msg_id=0001jz&topic_id=13&topic=OpenACS%20CMS">here</a> mentions a technique for creating a categorization taxonomy for content items, and that alone is enough to make me want to use those tables.
I also need to store images in the database - what is the general opinion on using the cr_revisions table for this as opposed to rolling your own? Again, the possibilities for categorization are compelling to me. Has anyone had experience with creating a huge image catalog using the CR data model?
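If ETP does go with the CR for images, one way to keep the package simple would be to hide the item/revision plumbing behind a single Tcl proc. A minimal sketch, assuming the query dispatcher supplies the actual content_item / content_revision calls; the proc name, statement names, and bind variables below are all hypothetical:

```tcl
# Hypothetical wrapper: create a CR item for an uploaded image and store
# the binary as its first revision.  Every name here (proc, statement
# names) is illustrative; the real SQL would live in a query file and
# call the content_item / content_revision packages.
proc etp_image_add { name tmp_file mime_type } {
    db_transaction {
        # create the content item (query file would call content_item.new)
        set item_id [db_exec_plsql create_image_item {}]
        # create an initial revision (content_revision.new)
        set revision_id [db_exec_plsql create_image_revision {}]
        # stream the uploaded file into cr_revisions.content
        db_dml save_image_data {} -blob_files [list $tmp_file]
    }
    return $item_id
}
```

The win of a wrapper like this is that callers never see the 28 files in acs-content-repository/sql; if the CR turns out to be the wrong choice, only this one proc has to change.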
Originally I was trying to keep the ETP package as simple as possible, so naturally I avoided relying on the 28 files and 10000 lines of code in the acs-content-repository/sql directory. Of course I also recognize the benefits to be had from code reuse, and perhaps using the CR doesn't necessarily mean more complexity. If anyone can list the other benefits it brings here, that would be useful.
It also allows storage in the file system rather than in the db, and packages might want to (optionally) expose that choice to system administrators when the package is mounted, etc.
And we'll be building our sitewide search facility on top of the CR (in PG's case), so clients will get that for free.
<li>Complexity. As you've mentioned, the CR comes complete with folders and the like. Not sure I would've made that decision myself.<p>
<li>Unknown Scalability. In theory, indexing helps a lot here: access cost is O(log n), so every time you double the number of items, the cost of accessing one grows only by a constant. For large binary content the cost of delivering it is high regardless of any overhead in accessing it. The LRU cache employed by any reasonable RDBMS should keep localized references in RAM, and if the usage pattern is such that you start getting disk I/O, this would be true even if various packages were rolling their own.
<p>So, I don't see any particular reason why the CR won't scale reasonably.
So at this point the CR seems to be a win rather than a loss. As time goes on we may learn otherwise, but that's the current thinking.