Forum OpenACS Q&A: Extensibility in OpenACS - Four Architecture Options

Hi,

We're currently facing an old question again: How can you build a sizable OpenACS application (composed of several modules) that can be customized so that it suits more than one customer? Sounds easy, but it isn't if you want to avoid a copy-paste-modify approach.

Let's take a simple example to explain the requirements: user management. OpenACS provides several standard user management screens with the fields "first_names" and "second name". However, people in Spain have two first names and two second names, such as "Juan José Ruiz Martínez". And this time we are working for a demanding customer who requires us to do it "their way" and to use their design standards. So we actually have to include the four pieces of the name in one line, so that the user screen needs to look like:

Name: [First1] [First2] [Second1] [Second2]
Username: [Username]
Email: [Email]
Password: [Password]
URL: [Url]
However, another customer from the US may require us to add a field for a middle name such as in "Frank W. Bergmann", and a third customer requires us to add a second email address for the private email (just to give an example).

Option 1: The Current Copy-Paste-Modify Approach

The standard approach in the OpenACS community (and also in many other Web-based content & community tools) for such a situation is to take the OpenACS code as a base and to extend it, adding the necessary fields "manually".

This works pretty well for the first and maybe for the second customer, but after that you're getting a holy mess of different versions that are difficult to maintain. Imagine that you need to upgrade to the next version of OpenACS or that you have developed an improvement for one of the customers that might be useful for the others as well.

Extensible Architecture Requirements

But if you start thinking about how to unify the user management code for all customers, you immediately get to the question of how to extend the unified code to accommodate the different requirements, and you end up with a list of quite ugly requirements:
  • Adding new fields to a business object:
    We want to be able to add any number of new fields to a user or another object without touching the "core" code. These new fields should support validation and referential integrity just like all other fields.
  • Integrating new packages:
    We want to be able to add new packages to the system, so that they are integrated with the rest of the system. Let's consider adding a "bookmark list". We may want to be able to show a list of bookmarks on the user's main page, even though the user's page didn't "know" about bookmarks before. And please remember, we don't want to touch the TCL or ADP code, because they are common to all of our customers.
    Also, we want to add a link "add a bookmark" in another part of the page and we want to add a new item in the global site menu such as "Bookmark Management".
  • Customized layout and design:
    Customers are picky, so we want to be able to adapt to all of their design preferences, particularly in terms of form layout. Colors and stuff are covered by CSS stylesheets anyway.
Taking into account the overall TCL/ADP structure of OpenACS pages, we can translate these requirements into technical issues that we have to tackle:
  • Customizing ADPs:
    How can we dynamically add new pieces of code to an ADP page to display new contents or links?
    How do we dynamically add new fields to a form or new columns to a list view?
  • Customizing TCLs:
    How can we dynamically add business logic to TCLs?
  • Customizing SQLs:
    How can we patch SQL statements to include new fields from new "extension tables" or dynamic attributes? How do we sort the results according to an extension field that didn't exist at the time when we wrote the SQL?
  • Menus and Navigation:
    How can we dynamically adapt the navigation to reflect the presence of new packages?
  • Links and References:
    How do we link from "core" pages to pages in new add-on packages that didn't exist at the time of writing the "core" pages?

Option 2: Extensibility Using User Exits

So let's come back to our user registration example in order to explore how "User Exits" could help us to build a single page that would serve all of our fictitious customers.

The ADP Page

Here we could add several "user exits" to the ADP page that would look like this:
<%=[ad_call_proc_if_exists TCL_library_routine] %>
We could then write a TCL_library_routine implementation for a specific customer that would add the right HTML code in order to create the new fields. Also, we could call ADP includes on a similar "if exists" basis to include pieces of content.
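To make the "user exit" idea concrete, here is a minimal sketch that runs in a plain tclsh. The helper call_if_exists is a hypothetical stand-in for ad_call_proc_if_exists (so the example works outside OpenACS), and the extension proc name and its HTML are invented for illustration only:

```tcl
# Hypothetical stand-in for ad_call_proc_if_exists, so that the sketch
# runs without OpenACS: call the proc if it is defined, otherwise
# return the empty string.
proc call_if_exists {proc_name args} {
    if {[llength [info commands $proc_name]] > 0} {
        return [uplevel 1 [list $proc_name {*}$args]]
    }
    return ""
}

# Customer-specific extension, loaded only on the "Spanish" installation.
# The proc name and the HTML are assumptions made for this example.
proc user_form_extra_name_fields {} {
    return {<input name="first_name2"> <input name="second_name2">}
}

# In the "core" page: yields the extra fields if an extension defines
# them, and an empty string otherwise.
set extra_html [call_if_exists user_form_extra_name_fields]
set no_html    [call_if_exists proc_that_nobody_defined]
```

The core page never names a customer; it only names the exit point, and an installation without the extension simply renders nothing there.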

The TCL Page

The TCL page has to provide the ADP page with additional business logic for the new fields. So we could use the same "user exits" trick and call a TCL library routine at the end of the TCL page if it exists.

The SQL

This is more complicated. Let's imagine that the new user name fields are implemented via a "user_extension_table". How do we "join" the contents of this table into our existing SQL?

One option is to use SQL "views". The TCL page would do a "select * from my_users" where "my_users" is an SQL view that by default only performs a "select * from cc_users". However, our extension module could now overwrite this view with a new version that joins cc_users with the user_extension_table. This approach may cause problems when there is more than one package adding fields to a user, but it's simple and straightforward.

Menus, Navigation, Links and References

We could again use the "user exits" to implement flexible menus and references.

Pros & Cons

The advantage of this "architecture" is that it's quite simple, transparent and easy to understand. It is actually already being used in the request processor via the ad_call_proc_if_exists routine. Also, it provides a simple "migration path" to move an existing hard-coded system towards a more flexible one without rewriting the whole code. However, there may be "extension conflicts" between different modules that extend the same business object, and the code may become very ugly ("spaghetti") over time.

Option 3: Store Everything in the Database

The current Project/Open architecture stores all variable elements in the database, such as menus, links, "components" (ADP includes), table columns and form fields. Table columns include the TCL code to render a table cell's content, and they include the "order by" clause used when a user wants to sort a list by a specific column. Here is the complete documentation: http://www.project-open.org/doc/intranet-core/

Pros & Cons

This is a very straight-forward approach that allows for great flexibility and performance. An extension module can just add a new column to a table and define some extra_select, extra_from and extra_where pieces for the SQL clause. However, the approach requires a considerable initial effort and storing TCL code in the database isn't really an elegant solution. So this is why we are considering alternatives in a project that is not related to Project/Open.
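As a rough illustration of how the extra_select, extra_from and extra_where pieces could be combined, here is a small sketch in plain Tcl. The proc build_user_query, the column names and the way the extension is passed in as a dict are all assumptions for this example; it is not the Project/Open implementation, which reads these pieces from database tables:

```tcl
# Assemble one SQL statement from a fixed "core" query plus the
# fragments contributed by any number of extension modules.
# Each extension is a dict that may contain the keys extra_select,
# extra_from and extra_where (names taken from the text above).
proc build_user_query {extensions} {
    set select [list "u.user_id" "u.first_names" "u.last_name"]
    set from   [list "cc_users u"]
    set where  [list "1 = 1"]
    foreach ext $extensions {
        if {[dict exists $ext extra_select]} { lappend select [dict get $ext extra_select] }
        if {[dict exists $ext extra_from]}   { lappend from   [dict get $ext extra_from] }
        if {[dict exists $ext extra_where]}  { lappend where  [dict get $ext extra_where] }
    }
    return "select [join $select {, }] from [join $from {, }] where [join $where { and }]"
}

# Hypothetical extension module that adds a second surname.
set spanish_names [dict create \
    extra_select "e.second_name2" \
    extra_from   "user_extension_table e" \
    extra_where  "e.user_id = u.user_id"]

set sql [build_user_query [list $spanish_names]]
```

Note that a plain join like this one drops users that have no row in the extension table, so a real implementation would probably want an outer join there.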

Option 4: Extending ad_form and Templating

The last option that we explored is based on the OpenACS templating system and ad_forms. These modules use a list of fields in order to control the rendering of forms and tables. Normally, these lists of fields are defined statically as part of the TCL page as in the following example:
ad_form -form {
    menu_id:key
    {name:text(text) {label Name} {html {size 40}}}
    {label:text(text) {label Label} {html {size 30}}}
    {url:text(text) {label URL} {html {size 100}}}
    {sort_order:text(text) {label "Sort Order"} {html {size 10}}}
} [...]
However, the definition of these fields could be moved out of the ad_form procedure call into a variable. And once it is in a variable, we could overwrite this variable in case an extension module has added more fields in a database table:
set field_list {
    menu_id:key
    {name:text(text) {label Name} {html {size 40}}}
    {label:text(text) {label Label} {html {size 30}}}
    {url:text(text) {label URL} {html {size 100}}}
    {sort_order:text(text) {label "Sort Order"} {html {size 10}}}
}
if {[check_the_database]} {
    set field_list [get_field_list_from_the_database]
}
ad_form -form $field_list [...]
This "architecture" would allow for a simple and convenient default configuration defined in the TCL page, while allowing for full extensibility by extension modules.
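A variation on the snippet above would be to append the extension fields to the default list instead of replacing it, so that the TCL page keeps control over its core fields. The following sketch runs in a plain tclsh; get_extension_fields is a hypothetical stand-in for a database lookup, and the "icon" field is invented for the example:

```tcl
# Default field list, defined statically in the TCL page.
set field_list {
    menu_id:key
    {name:text(text) {label Name} {html {size 40}}}
}

# Hypothetical stand-in for a database lookup: returns the ad_form
# element specs that extension modules registered for an object type.
proc get_extension_fields {object_type} {
    if {$object_type eq "menu"} {
        return [list {icon:text(text),optional {label Icon} {html {size 20}}}]
    }
    return [list]
}

# Append instead of overwrite: core fields first, extensions after.
set field_list [list {*}$field_list {*}[get_extension_fields menu]]

# Inside OpenACS the merged list would then be handed to ad_form:
# ad_form -form $field_list [...]
```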

Another shortcoming of ad_form is its current HTML layout inflexibility. ad_form renders the form fields as a vertical list by default. There is no easy way to say that first_name and second_name should go together into the first line of the form. However, ad_form allows for custom rendering "form templates", so that we could tackle this issue by introducing new field parameters for field positioning (absolute horizontal/vertical or relative line/column) and by creating a customized version of a form template to implement something similar to a "layout manager" in Java.
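The "relative line/column" positioning could, for example, boil down to grouping fields by a row number before the form template renders them. This is only a sketch under that assumption; the {name row} pair format and the proc group_fields_by_row are made up here and are not part of ad_form:

```tcl
# Each entry is a {field_name row_number} pair. The row number is the
# hypothetical new field parameter discussed above.
set fields {
    {first1 1} {first2 1} {second1 1} {second2 1}
    {username 2}
    {email 3}
}

# Group field names by row, keeping their order within each row, and
# return one list of field names per row, sorted by row number.
proc group_fields_by_row {fields} {
    set rows [dict create]
    foreach field $fields {
        lassign $field name row
        dict lappend rows $row $name
    }
    set result [list]
    foreach row [lsort -integer [dict keys $rows]] {
        lappend result [dict get $rows $row]
    }
    return $result
}

set layout [group_fields_by_row $fields]
```

A custom form template could then loop over the rows and emit all fields of one row on the same line, which would cover the four-part name from the example at the top.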

Also, there are facilities in ad_form to handle dynamic fields via acs_attributes and the OpenACS SQL metadata system. However, the implementation of the acs_attributes feature is not very "transparent" (you don't easily understand what is happening) and doesn't seem to be commonly used. The only place where I have seen it is group_type maintenance, and this is an incomplete implementation with an error when trying to use default values.

Pros & Cons

ad_form and templating could allow for a flexible architecture without storing TCL code in the database. It would provide a very elegant solution if the integration with acs_attributes worked in real-world applications.

However, I personally don't like the "hide as much as possible" philosophy of ad_form, and I have lost many hours debugging relatively simple issues due to the lack of transparency and documentation.

How to Proceed?

How should we go forward? I currently believe that the fourth option (extending ad_form with or without acs_attributes) is the most promising one, but I see a lot of risks related to it. Is there an interest from the community and the package maintainers to support our efforts? How should we organize the development so that our modifications could be incorporated into OpenACS?
We are under some pressure to come up with a viable plan within a few days...

Bests,
Frank

Posted by Jade Rubick on
Frank, you should really look into the AMS; it is almost exactly what you're looking for in terms of extending ad_form. If you're doing any custom development, I'd also strongly recommend looking at a better version control strategy than CVS. Arch currently has my vote. You can see my docs on an Arch setup at: http://www.rubick.com/openacs/arch

You are probably always going to want to do some per-customer customization, and the bigger and more "enterprise" the application and the customer, the more so. (Consider things like SAP deployments, for example.)

Whether that customization takes place in the source code or not is, to some degree, an implementation detail. But just hacking the code is always the quickest easiest way to do it - at least at first. So, the obvious first step is to get the best version control tool for your code you can. E.g., ditch CVS and investigate Arch immediately.

That way, at least you won't be artificially pushed into non code hacking methods of customization. And to me, that suggests that your ultimate designed solutions will be superior, as you will be able to learn from, e.g., refactoring 2 to 4 custom customer deployments into one unified whole, once you've learned what's important from experience.

However, I would never call code customization "Copy-Paste-Modify". This implies all the wrong stuff... You should probably call it the "Customize Code and Refactor" approach, or something like that.

Frank, as far as the rest of your question about approach or approaches to use... I don't know. You seem to have a good start on thinking about possible ways to do it, and I lack hands-on experience in this. Other helpful steps might be to:

Briefly describe all the known per-customer customizations you've ended up doing in practice.

Find out who else is using OpenACS to do SAP-like customized independent deployments of vertically integrated applications to multiple customers. In other words, not fully custom functionality, but (minor?) variations for each customer on a specific application theme (e.g., "company intranet"). The only examples I can think of offhand are the various dotLRN sites running at different universities (except, they all tend to have OpenACS hackers on staff, which changes things). Perhaps also that semiconductor thin films equipment company that uses OpenACS - if they deploy OpenACS to customer sites; I'm not sure whether they do or not.

Conceptually speaking, you are talking about both customization
and subclassing. While customization uses the component approach
to configure a package at anticipated spots, subclassing is oriented
towards inheriting (reusing) preexisting pieces and extending them in various places
(like "user exits", i haven't heard this term for a while). Both approaches
should in general allow instantiating the base
and extended definitions (e.g. the same system should
be usable for "spanish" apps with two family names as well
as for traditional western and other kinds of names).
While the configuration approach
favors the "black box" model, the subclassing model
requires more knowledge about the internal structure (white box)
and is used in class based inheritance in oo programming.

As the author of an OO Tcl extension (xotcl), i certainly
believe that such a language can help in various ways
to design and implement flexible systems (but it is not
more than a help, oo programming does not require oo
languages). But the answer is certainly not "use oo
and the problems are gone". The problem you are addressing
has to be covered on three layers: the storage layer,
the application logic layer and the presentation layer.
For flexible solutions, one requires sufficient
flexibility from all layers and the base components.
And for most composition issues, it is not sufficient
to stick pieces together in an arbitrary order (e.g. on
a form); one has to provide a certain control for the
composition (the same is true for the database side).

I am writing here since you mentioned that you are
already using an implementation that stores pieces of
the code in the database. An approach with a reasonable
effort seems to be to take these pieces, put them into
an oo framework and reuse them via inheritance. This
would be at least performance-wise much better and
can lead to higher flexibility, better encapsulation
and a better way to express the dependencies and
interactions.  If you are interested, have a look at
the xotcl tutorial at www.xotcl.org.

best regards
-gustaf neumann

Posted by Frank Bergmann on
Hi,

thanks for the qualified feedback. Very interesting, Gustaf is proposing a fifth architecture option with OO Tcl, and we have forgotten to mention a sixth option by customizing only specific modules and creating different, customized versions of them. And AMS in fact already contains a lot of the stuff that we were looking for.

However, there are only three replies to this subject until now which probably means that there aren't many people/companies suffering from the "extensibility issue". Let's face it...

"Customize Code and Refactor" against "Copy-Paste-Modify": That's indeed an interesting point. However, refactoring seems difficult if you are working on a project-by-project basis. It's considerably cheaper to start from scratch again, compared to refactoring the not always clean customizations from the last project. Arch may definitely be able to improve this. Maybe there is a way to propagate changes in the trunk (let's say an upgrade from OpenACS 5.1.1 to 5.1.2) to all of the branches. Very interesting idea. Implement extensibility by means of a version control system. Sounds strange, but it may actually work in certain organizations, particularly if they are suffering from strong time/resource pressure...

Briefly describe all the known per-customer customizations

A very helpful and straightforward recommendation. We'll try. Do you know of a better tool to do this apart from "diff"?

Concerning OO TCL: It's a very interesting idea, proven, and could basically be seen as a way to "patch" your code a posteriori and as an alternative to "user exits" (yes, that's SAP speak I guess. They are successful, though...). Has anybody used this in the OpenACS context before? I personally had to go through piles of obscure OO code in several large projects, and inheritance just doesn't seem to work when there are many different developers patching the code during several years of the lifecycle of an application. Maybe it's not the fault of OO though, but of the freedom that OO gives to users...

I would be very happy to receive more feedback from other users/companies who face the extensibility problem. I know that there are at least the following companies:

- Competitiveness.com (Spain)
- Quest (Ireland)
- Project/Open (Spain)

Thanks a lot!
Frank

Hi Frank, 

i had similar experiences when i saw my first c++ program and
could not figure out what was going on when the
software crashed after each small change. oo is no guarantee
for good software at all. refactoring is certainly a key
aspect, as well as determining hot-spots and stable elements.

From the software engineering point of view, i would
recommend to have a look at software product lines
  http://www.sei.cmu.edu/productlines/index.html
  http://www.awprofessional.com/bookstore/product.asp?isbn=0201703327&redir=1
  http://www.awprofessional.com/bookstore/product.asp?isbn=0201674947&redir=1

which address the problem where a common software core
is tailored for several customers, where fixes and
improvements should propagate to the tailored versions.
The idea of product lines is taken from the automobile
industry, where more or less the same base-line components
are used under various brand names (e.g. VW, Audi, Skoda, Seat)

best regards
-gustaf

I am working toward more of a 'code generation' option, which I think presents a variation not mentioned here. In short, you use a specification language to generate the data model and forms - there are some examples of similar things out there, including for the web - ERW http://erw.dsi.unimi.it/ , written in Java which produces PHP code, and Xdobry, written in TCL which produces forms on an X windows box.

I think there are enough intermediate APIs to make our acs-objects and CR, search, etc available as tcl scripts generated from a schema definition (well, we're getting there).

The extensibility would then arise by adding custom data elements to the schema def, and re-running the generator.

Hi,

thanks for the different hints.

It looks as if we will finally go in the direction of plain acs_attributes and/or the AMS system. There is a discussion in another thread:

https://openacs.org/forums/message-view?message%5fid=239510

The main reason is that there are a lot of existing ("legacy") objects in our systems that would need migration. And the extension table approach from acs_objects combines relative simplicity (an extension table is easy to debug) and an upgrade path for the objects.

I had a look at Xdobry, and that's quite impressive. However, customizing such a system is going to be quite complex, compared with a plain SQL / TCL combination. Creating something like this for OpenACS would probably be a big hit...

Bests, and thanks for the comments,
Frank

Posted by Frank Bergmann on
Hi,

just to let you know about our progress: We got a lot of feedback over different channels, and based on this feedback we have decided to go ahead with Option 4 (extending ad_form and using acs_attributes) plus some Option 2 (user exits) for the hard cases. A "feasibility study prototype" is already working. It mixes the "widgets" idea from AMS with the procedures from "attribute-procs" and adds a few maintenance screens.

We'll post the code as soon as the basic functionality is working.

Bests,
Frank

Along the code generation line of thinking, Laetitia Duby (working with Rafael Calvo) has made a good start on extending ArgoUML to turn UML-like diagrams into OpenACS database schemas and tcl/adp pairs.

You can read about the project and download the current code here:

http://www.weg.ee.usyd.edu.au/people/laetitia/

I'm imagining some kind of base package meta-design that you could open in a tool like this, play around with the user interaction by clicking and dragging objects (you could easily do this with the client there), have it generate the bulk of code for you and tidy it up by hand.

It's not quite there yet - but tantalisingly close.

Frank, since the original posting here, have you made any headway on this topic?

Just curious as to any insights you may have gained since then.

The same goes for anyone that has either posted or is interested in this topic.

Posted by Tom Jackson on
I think ad_form is still the way to go if you want to maintain something familiar with most developers here. I have never used ad_form for several reasons, but it works and there are lots of examples of how to use it. That is important, and so is the ability to get help here if you have trouble.

I've tried a different approach than ad_form. I use the .tcl page just to set up variables/multirows. The template has all the HTML. One reason is that I find it difficult to handle complex cases with ad_form. I also don't like to tie together the SELECTion and CREATE/UPDATE/DELETE functions, because I might want to do several of these at one time, and even handle multiple objects at one time.

Here are some examples of how I have approached the problem:

http://rmadilo.com/files/merchant-system/examples/

Each form has multiple possible operations. The edit (purchase order) is a very good example of a typical situation I need to handle.

The template files have hand written forms, although it would be possible to create reusable sub-templates to do the same thing. But one interesting feature is to surround the form with a decoration which can be replaced without much editing of the template. If you look at the forms you will see why they can't be easily handled by ad_form.

One thing I disagree with in the above problem description is that you can automatically add in a different data model or code to the .tcl page. This pretty much breaks the application, and the developer should admit that it is different from the original. I do have ways to limit the fields and field values available depending on the role of the user, but these are not shown in the examples.

Posted by Frank Bergmann on
Hi Robert,

Here is a quick update on the topic: Both Quest.ie and project-open.com are using FlexBase/DynFields (basically the same code, just with small adaptations to the respective environment) in production. DynFields allow SysAdmins to extend OpenACS objects with new fields. These fields are then appended to ad_form or the ]po[ "DynViews" (our alternative to ListBuilder).

DynFields are one of the most important features in ]po[ for users in order to adapt ]po[ to their needs and for us in order to "keep the product clean".

DynFields are now used in most (important) objects/ad_forms. In addition we have integrated DynFields in several other areas:

- "Advanced Filtering" in various "ListPages":
For example: http://pcdemo.dnsalias.com/intranet/projects/ (please log in as "Ben Bigboss" first at http://pcdemo.dnsalias.com/): The "Advanced Filtering" allows you to generically filter projects according to values from DynFields. You add a new DynField to a project and you automatically get a new Advanced Filter.

- Reporting:
For example: http://pcdemo.dnsalias.com/intranet-reporting/: You can now generically select DynField variables to be shown in a report

- Data-Warehouse:
For example: http://pcdemo.dnsalias.com/intranet-reporting-cubes/timesheet-cube: DynFields are automatically available as "Dimensions" in our data-warehouse cubes.

To summarize:

The introduction of DynFields was one of the most important milestones in the life of ]po[. Only with DynFields have we been able to maintain a single "product" code tree while serving >2000 customers in 6 different "vertical" markets.

The future:

- We are going to reduce the number of hard-coded fields in new objects and rely mostly on DynFields to define forms and lists.
- We'd need a two-dimensional ad_form layout. No idea yet when and how, but it's going to come. Then we're actually getting closer to Oracle SQL Forms and other form based systems.
- We may revive the idea of an "application generator" for ]po[. This meta-package would generate new packages with one object type each, based on a definition of a new object type and its DynFields. This should actually be quite easy now that all the necessary infrastructure is there.

Thanks to Matthew Geddert:

I just want to say Thanks! to Matthew for his AMS. DynFields is basically a version of AMS with table columns for storage, plus a few extensions now. The most important contribution of AMS is the introduction of the "widget" concept that refers to a TCL widget + a definition of the data range (the values for a drop-down box). These "widgets" solve the problem of defining the exact semantics for a field, how to display it and how to restrict its value range. Great stuff!

Cheers,
Frank

mailto:frank_dot_bergmann@project_dash_open.com
http://www.project-open.com/