Content Repository Design

I. Essentials

Feature Requirements Document

II. Introduction

Serving content is a basic function of any web site. Common types of content include:

Journal articles and stories
Documentation
News reports
Product reviews
Press releases
Message board postings
Photographs

Note that the definition of content is not limited to what is produced by the publisher. User-contributed content such as reviews, comments, or message board postings may come to dominate active community sites.

Regardless of its type or origin, it is often useful for developers, publishers and users to handle all content in a consistent fashion. Developers benefit because they can base all their content-driven applications on a single core API, thereby reducing the need for custom (and often redundant) development. Publishers benefit because they can subject all types of content to the same management and production practices, including access control, workflow, categorization and syndication. Users benefit because they can enjoy a single interface for searching, browsing and managing their own contributions.

The content repository itself is intended only as a common substrate for developing content-driven applications. It provides the developer with a core set of content-related services:

Defining arbitrary content types.
Common storage of content items (each item consists of text or binary data, with additional attributes as specified by the content type).
Establishing relationships among items of any type.
Versioning
Consistent interaction with other services included in the ACS core, including permissions, workflow and object relationships.
Categorization
Searching

As a substrate layer, the content repository is never intended to have its own administrative or user interface. ACS modules and custom applications built on the repository remain responsible for implementing an appropriate interface. (Note that the ACS Content Management System provides a general interface for interacting with the content repository.)

III. Historical Considerations

The content repository was originally developed in the spring of 2000 as a standalone data model. It was based on an earlier custom system developed for an ArsDigita client. Many of the principal design features of the original data model were also reflected in the ACS Objects system implemented in the ACS 4.0 core. The content repository was subsequently rewritten as an extension of ACS Objects.

V. Design Tradeoffs

The content repository is a direct extension of the core ACS Object Model. As such, the same design tradeoffs apply.

The content repository stores all revisions of all content items in a single table, rather than maintaining separate tables for "live" and other revisions. The single-table approach dramatically simplifies most operations on the repository, including adding revisions, marking a "live" revision, and maintaining a full version history. The drawback of this approach is that accessing live content is less efficient. Given the ID of a content item, it is not possible to directly access the live content associated with that item. Instead, an extra join to the revisions table is required. Depending on the production habits of the publisher, the amount of live content in the repository may be eclipsed by large numbers of infrequently accessed working drafts. The impact of this arrangement is minimized by storing the actual content data in a separate tablespace (preferably on a separate disk) from the revisions table itself, which reduces the size of the revisions table and allows the database server to scan and read it more efficiently.
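
The tradeoff above can be illustrated with a minimal sketch. It assumes a deliberately simplified, hypothetical schema (the tables `items` and `revisions`, their columns, and all function names are illustrative and are not the repository's actual data model), and it uses Python with an in-memory SQLite database purely for demonstration. Every revision lives in one table, marking a revision live is a single update, and reading live content requires the extra join described above.

```python
import sqlite3


def build_demo_repository():
    """Create an in-memory database with a hypothetical two-table layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- One row per content item; live_revision stays NULL until published.
        CREATE TABLE items (
            item_id       INTEGER PRIMARY KEY,
            name          TEXT NOT NULL,
            live_revision INTEGER
        );
        -- All revisions, live or draft, share this single table.
        CREATE TABLE revisions (
            revision_id INTEGER PRIMARY KEY,
            item_id     INTEGER NOT NULL REFERENCES items(item_id),
            title       TEXT,
            content     TEXT
        );
        """
    )
    return conn


def add_revision(conn, item_id, title, content):
    """Adding a revision is a plain insert; returns the new revision ID."""
    cur = conn.execute(
        "INSERT INTO revisions (item_id, title, content) VALUES (?, ?, ?)",
        (item_id, title, content),
    )
    return cur.lastrowid


def set_live(conn, item_id, revision_id):
    """Marking a revision live only updates a pointer on the item row."""
    conn.execute(
        "UPDATE items SET live_revision = ? WHERE item_id = ?",
        (revision_id, item_id),
    )


def get_live_content(conn, item_id):
    """Fetching live content needs the extra join to the revisions table."""
    row = conn.execute(
        """
        SELECT r.content
        FROM items i
        JOIN revisions r ON r.revision_id = i.live_revision
        WHERE i.item_id = ?
        """,
        (item_id,),
    ).fetchone()
    return row[0] if row else None


def revision_history(conn, item_id):
    """The full version history is a single-table query."""
    return conn.execute(
        "SELECT revision_id, title FROM revisions "
        "WHERE item_id = ? ORDER BY revision_id",
        (item_id,),
    ).fetchall()
```

In this sketch, publishing a different revision never moves or copies content; only the `live_revision` pointer changes, which is the simplification the single-table design buys at the cost of the join in `get_live_content`.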

VI. Further Reading

The Object Model provides a graphic overview of how the content repository is designed. The model links to pages of the API Guide that describe individual objects. The Developer Guide describes how to address common development tasks using the content repository.

karlg@arsdigita.com
Last Modified: $‌Id: design.html,v 1.3 2018/07/03 18:19:14 hectorr Exp $