Forum OpenACS Q&A: Utility of XSLT?

Posted by Talli Somekh on
Hey guys,

I don't mean this to be either a flame or a questioning of anyone's
work, just an honest question.

What is the advantage of using XSLT, other than the "portability" of
content across DB's? It seems that it adds much more complexity to a
system. Why would someone want to use XSLT rather than templates and
HTML files?

Essentially, I am asking: under what circumstances is XSLT appropriate?
How much will future versions of OpenACS be tied to using XSLT? Do
people have visions of using it as the core method for developing and
presenting content?

talli

Posted by Talli Somekh on
Yeah, I know about those, Bryan. But I didn't find them convincing at all. And my question is essentially a direct response to the complexity of ACS Java's content system.

talli

Posted by Bryan Che on
If your question is really about the complexity of the ACS presentation/content system, then the issue goes deeper than merely the use of XML/XSLT.

As I see it, there are two dramatic differences between ACS 4.6's presentation layer and that of previous versions of the ACS and OpenACS:

  • A design/architecture change from a system that generated, for each request, a corresponding Web page to a system that generates pre-instantiated UI-components that process page request information.
  • A change in implementation technology from ADP/Tcl (or ADP/AJP) to XSLT/XML. Note, though, that this change in implementation technology is driven in large part by the design and architecture changes.

I would suspect that a great deal of the complexity you find in the ACS 4.6 presentation layer stems more directly from the design change than the use of XML/XSLT.

In previous versions of the ACS, a request would go to a certain page that processed that request. The page would generate all the HTML for the request based upon some logic and db queries and then return the output.

In ACS 4.6, the architecture changed to resemble that of a GUI toolkit like Swing or MFC (ACS 4.6 was designed with the Model/View/Controller (MVC) architecture in mind). I haven't programmed in Swing before, but I have done a fair amount of work with MFC. In the GUI toolkit paradigm, I would write GUI components which would respond to various events. For example, consider a simple Windows dialog box. When building a dialog box in MFC, I would create a new dialog box which subclassed the CDialogBox class (or whatever it's called). Then, I would add event handlers like an OnOK() method, which would do the appropriate thing when the user clicked on the "OK" button.

Bebop, ACS 4.6's presentation components, works in a similar manner. If you are writing a Web page with a form, you subclass the bebop form component and add event handlers for the appropriate events, like the user hitting the "submit" button. Then, you add the new form object to your page. This page will be instantiated on server startup, and every request to that page will be handled by the one, already-instantiated instance of that page.
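As a rough illustration of the model described above, here is a minimal Python sketch. It is not Bebop's actual API (Bebop is a Java framework); the class names `Form`, `SignupForm`, and `Page`, the `on_submit` handler, and the `dispatch` function are all hypothetical. It only shows the shape of the idea: subclass a component, add an event handler, and build the page once so that every request goes to the same instance.

```python
class Form:
    """Hypothetical stand-in for a Bebop-style form component."""

    def __init__(self, name):
        self.name = name

    def on_submit(self, request):
        # Event handler; subclasses override this.
        raise NotImplementedError

    def process(self, request):
        # Fire the submit handler if the request carries that event,
        # otherwise just render the empty form.
        if request.get("event") == "submit":
            return self.on_submit(request)
        return f"<form name='{self.name}'>...</form>"


class SignupForm(Form):
    def on_submit(self, request):
        return f"Welcome, {request['email']}"


class Page:
    def __init__(self, *components):
        self.components = components

    def handle(self, request):
        # Per-request data is passed in, never stored on the shared page.
        return [c.process(request) for c in self.components]


# Built once at "server startup"; every request reuses these instances.
PAGES = {"/signup": Page(SignupForm("signup"))}


def dispatch(url, request):
    return PAGES[url].handle(request)
```

Because the page object is shared across requests, the sketch keeps all request-specific data in the `request` argument rather than on the components themselves.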

Programming the UI for ACS 4.6 applications requires a fundamental change in how one thinks about building Web UIs. Before, we could essentially treat each Web page request as a single program that would run in isolation from other page requests. With ACS 4.6, the Web server is now a system, and page requests are no longer so isolated. But doing things this way buys several advantages, including increased code reuse, easier maintainability in larger applications and Web sites, and flexibility in modifying the site.

I've gone to great lengths to describe these design differences because I personally found that this is where the biggest confusion regarding ACS 4.6's presentation layer came from. Once I saw that using Bebop was like programming with a traditional GUI toolkit, though, things seemed much less complicated than before. I apologize if this distinction was already clear to you.

Now, regarding XSLT/XML: It's certainly possible to write a presentation layer similar to the old ADP/Tcl templating system with XML/XSLT. Indeed, XSLT is a Turing-complete language, so you could write XML pages that resembled ADP pages and XSLT code like Tcl code. So, XSLT is not necessarily more complex than what we used before--it's just new and different. But when you combine XSLT/XML with the design/architecture of ACS 4.6's presentation system, you can do some powerful things.

Because you are using the same UI components for various pages, and each page is instantiated just once, you can truly separate the way you present your Web page from the way you constructed the Web page. For example, say that you are building a page that lists bboard threads. Say that you want to switch between a threaded bboard view and a non-threaded bboard view. In either case, the threads that you list are the same, but the presentation of those threads is different. So, you could write a page that generated XML listing all the threads to display. Then, you could write two different XSL stylesheets: one for doing threaded displays, and one for doing normal displays. Changing the way the bboard threads are displayed is just a matter of swapping stylesheets--no coding changes necessary when generating the XML. This is not something you could have readily done with the old ADP/Tcl templating system.
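The bboard example above can be sketched in a few lines. Python's standard library has no XSLT processor, so in this sketch two plain functions stand in for the two stylesheets; it is an analogy, not XSLT. The element and attribute names (`threads`, `msg`, `subject`) are assumptions made up for the illustration. The point is only that the XML is generated once and the presentation is swapped without touching it.

```python
import xml.etree.ElementTree as ET

# The XML the page would generate: the same threads in both views.
THREADS_XML = """
<threads>
  <msg subject="Utility of XSLT?">
    <msg subject="Re: Utility of XSLT?">
      <msg subject="Thanks"/>
    </msg>
    <msg subject="Disadvantages?"/>
  </msg>
</threads>
"""


def render_flat(root):
    # Non-threaded view: every message in document order, no indentation.
    return [m.get("subject") for m in root.iter("msg")]


def render_threaded(node, depth=0):
    # Threaded view: indent each reply under its parent.
    lines = []
    for m in node.findall("msg"):
        lines.append("  " * depth + m.get("subject"))
        lines.extend(render_threaded(m, depth + 1))
    return lines


root = ET.fromstring(THREADS_XML)
flat_view = render_flat(root)
threaded_view = render_threaded(root)
```

Switching views means choosing a different renderer; nothing about how `THREADS_XML` is produced has to change.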

Now, extending this example just a bit, a Web site can drop in any number of XSLT stylesheets to change the way certain content is displayed or to change the overall look and feel of a site without having to modify any of the application logic that generated the content to be presented.

Because you did not like the documents I listed and your question "is essentially a direct response to the complexity of ACS Java's content system," I have attempted to explain the benefits and design rationale behind ACS 4.6's presentation layer. This post is getting long and is not really on-topic for an OpenACS forum, though. If you want to get more feedback on ACS 4.6-specific questions, I suggest you post at http://developer.arsdigita.com/acs-java/bboard/forum?forum_id=26112 instead. I don't want to turn this thread into a discussion of ACS 4.6 versus other versions of ACS or OpenACS. But, since you asked about ACS Java, I have attempted to answer your question as well as I can. I hope this helps...

Posted by Talli Somekh on
Bryan, that's very helpful, thanks. I found the docs at ASJ to be a bit too specific to ACS Java and couldn't make the switch to a more general understanding. But your post was very helpful in getting me to understand the use of XSLT.

Thanks a lot.

talli

Posted by Andrew Spencer on
I believe one of the chief advantages of using XSLT is excellently demonstrated by Bryan's bboard example. Another advantage is the simplicity of XSL stylesheets. As an example, for this page I could have walked through the XML documents from Slashdot.org and Tomalak.org in Tcl and generated the HTML. However, it was much quicker and IMO simpler to transform the XML with XSL stylesheets. Include the results in a template and away you go.

Posted by Michael Feldstein on
OK, now what are the disadvantages of XSLT? Is there a price you pay in terms of complexity?
Posted by Yon Derek on
I found XSLT to be more complex than, say, ADP. However, in the long term, I think I would still switch to XSLT since it's a one-time fee to master it and I believe in the advantages of the ability to independently develop multiple stylesheets as described by Bryan. And XSLT is a resume enhancer.

I've heard that dealing with really large XSLT stylesheets can be painful, but then again I've heard that there are good tools that alleviate the problem (visual debuggers for XSLT). But then again the good ones will cost you.

Also it is quite possible that XSLT transformation can be more costly than ADPs, but:

  • it's pure speculation; Tcl is not the fastest language
  • it can be alleviated with good caching

And finally, the more data becomes available in XML out there, the more useful XSLT will become.

Posted by Stephen van Egmond on
At work, I'm developing a system that uses XSLT and XML (in this case, using Perl and Apache with the AxKit XML processor). So far, it kind of looks okay, but there haven't been any big-bang improvements in productivity, or "phew, I'm glad we're using XSL" moments.

XML/XSL looks great in theory: look, this is how you format a phone book entry. Look: a book. Look: a list of books. Notice how there is a distinct lack of dynamic information there. How do you generate a list of books from an RDBMS? The last I saw, the ACS 4.6/5 design calls for generating DOM trees "by hand". Oh, the humanity...

When it comes to form handling, XML/XSL is pretty much a disaster. It's difficult to achieve a separation of concerns: programmer needs vs. designers' needs. How do you represent the pageflow? How do you represent the validation? What about forms that have multiple targets? How do you cram the form variables into your procedural language, and ultimately the DB? These are sticky, nasty issues that most shiny happy XSLT tutorials don't get around to even looking at. We're trying to deal with them, but even using a flexible language like Perl, it's still a challenge. We've come up with a good approach to the impedance mismatch problem between XML and set-oriented RDBMS.

Other random complaints:

  1. Tool support for designers using XSL is pathetic.
  2. To most designers, the syntax defies explanation and understanding.
  3. The designers of the W3C spec should have exercised more restraint: it's too large, written in a very opaque style, and mixes trivial, unimportant details with things that you'll end up using every day. Just look at where they describe apply-templates.
  4. XML databases are, and appear to be fated to continue to be, sad and inefficient.

Posted by Talli Somekh on
Stephen, thanks a lot for your comments. That was the kind of thing I was looking for and wanted to know more about. I had a sense that XML/XSLT could get really ugly given the "abstraction" each one promises. I think the most striking thing that you said is that it's very difficult to mix dynamic data and XSLT. I wonder if anyone (outside of aD, please) has had such experience, similar or otherwise?

I guess my question was based on whether XSLT was another buzzword or there was something actually behind it. I imagine that only time will tell for this.

talli

Posted by Stephen van Egmond on
Talli -

The abstraction sounds really good in theory, but I have yet to see it pay off. One is supposed to be able to retarget XML for different, non-HTML devices (WAP and reduced-HTML devices like Blackberry come to mind), but I think that will only result in

  1. severely inelegant user interfaces for those devices, as they try to represent XML that was seen as appropriate for a web browser, or
  2. great inefficiencies for the application servers as they generate, then throw out, vast amounts of XML that can't be shown on the limited devices.

I get the feeling that you'd be better off building two sets of templates for two different kinds of targets.

It turns out that we've managed to solve the problem of getting dynamic information from a procedural language to XML-land, and it doesn't suck. We basically built adapters between Perl datatypes and XML structures. It's not a 1-to-1 mapping, but it's good enough and the code you write remains concise.
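The post does not show the Perl adapters, so the following is only a guess at the general shape of such a thing, written in Python for illustration. The function name `to_xml` and the mapping rules (dicts become child elements, lists become repeated `<item>` elements, everything else becomes text) are assumptions, not the poster's design.

```python
import xml.etree.ElementTree as ET


def to_xml(tag, value):
    """Adapt a native data structure to an XML element (assumed mapping)."""
    elem = ET.Element(tag)
    if isinstance(value, dict):
        # Each key becomes a child element named after the key.
        for key, child in value.items():
            elem.append(to_xml(key, child))
    elif isinstance(value, (list, tuple)):
        # Sequences become repeated <item> children.
        for child in value:
            elem.append(to_xml("item", child))
    else:
        # Scalars become text content.
        elem.text = str(value)
    return elem


# Rows as they might come back from a database query.
rows = [{"title": "Dune", "year": 1965}]
xml_text = ET.tostring(to_xml("books", rows), encoding="unicode")
```

As in the post, the mapping is not one-to-one (lists and tuples both collapse to `<item>` elements, and numbers come back as text), but the calling code stays short.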

Posted by Dave Bauer on
You are supposed to have two different XSLT stylesheets for different output, aren't you? That should be similar to using two different templates. At least, that is how I can provide the same XML output in HTML and RSS (or will when I get around to it.)

Posted by Stephen van Egmond on
You are supposed to have two different XSLT stylesheets for different output, aren't you? That should be similar to using two different templates.

This is true, except that building XML/XSLT templates has, so far, proven to be more difficult than building pages using a templating system, as measured by its high WTF factor, and poor tool support.

HTML and RSS

I think this is a different scenario than most web services find themselves in. They usually have a vast number of HTML pages, and a small number of RSS documents. But if your dominant medium is RSS, then it certainly makes sense to use XSL to let them be viewable in an HTML browser. But I don't know how this would support any concept of page flow, and so is probably an exception.

FWIW, I think Karl's design for ACS4 templating was pretty close to ideal. But that's an armchair analysis, not one from experience. For those who have worked with it, what was it like?

Posted by John Sequeira on
I've worked with XML/XSLT and many, many different templating systems, and I've yet to see a compelling example of XSLT at work.

It is *so* easy to write different templates that run off the same application logic to target different devices, simply using procedural abstractions for code re-use, that I don't understand why people keep bringing this up as an argument in favor of XML.

I'm sure bebop is a cool framework for hiding you from the nuances of stateless web UI programming, and while that's potentially an admirable goal, it has nothing to do with XML. Other server-side Java frameworks have done a lot of what ACS's framework does, without using XML (Java Struts, Enhydra Barracuda). My point is not that their framework isn't good, but rather, how does its use of XML make it better than other, similar frameworks? My guess is that it's slower because of parsing, and it's got a bigger learning curve because you've forced people to learn and debug two more languages.

XML is great when you're talking to a system that you don't control (e.g. ebXML for B2B), or a different platform, where speed of implementation is more important than raw speed. I love XML-RPC and WDDX, because they're simple and they just work. But when I've had the need to use XML in the past, it's been so easy to bolt it on that I see very little advantage to baking it into your framework.

Posted by Henry Minsky on
I also think that the last ACS 4 (Tcl) templating system was a very simple yet useful templating model. Just simple row-structured data sources, and a couple of conditionals and an iterator in the templating language, seemed sufficient for almost everything. More complex pages could be formatted as straight procedural code.

The first ACS 4 Java release had a straight port of the ACS 4 Tcl template system, which I think worked quite nicely.

In fact, it seemed like it would be easy to extend that template system to handle XML, in a lot of cases,  by converting the XML to a data source. I wrote a simple XML DataSource class, which could take a
block of XML and convert it into a data source. I have a pointer to
an older writeup of this at http://domokun.wem.sfc.keio.ac.jp/xmldemo/

I actually improved it some more after that writeup, adding in a hook to the XPATH parser, so you could define a data source that would specify an arbitrary XPATH expression to grab any chunk of XML out of a raw XML source.

The advantage was again that this handled almost every common case of
wanting to use XML data, and you never had to touch a DOM tree by hand. For the more complex cases, you can always manually pick apart the XML if you need to. But the common case was to treat data as
basically row structured, or to coerce it to that with some easy to use tools.
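To make the idea concrete, here is a small Python sketch of an XML-backed data source. It is not the Java XML DataSource class described above; the function name `xml_datasource`, the feed layout, and the rule that attributes and simple child elements both become columns are assumptions. It relies on the limited XPath subset that Python's `xml.etree.ElementTree` supports.

```python
import xml.etree.ElementTree as ET


def xml_datasource(xml_text, row_path):
    """Return one dict (row) per element matched by the XPath row_path."""
    root = ET.fromstring(xml_text)
    rows = []
    for elem in root.findall(row_path):
        # Attributes become columns...
        row = dict(elem.attrib)
        # ...and so does the text of each direct child element.
        for child in elem:
            row[child.tag] = (child.text or "").strip()
        rows.append(row)
    return rows


FEED = """
<feed>
  <channel name="news">
    <story id="1"><title>First</title><url>http://example.com/1</url></story>
    <story id="2"><title>Second</title><url>http://example.com/2</url></story>
  </channel>
</feed>
"""

# The path picks an arbitrary chunk of the document to treat as rows.
rows = xml_datasource(FEED, "./channel[@name='news']/story")
```

A template can then loop over `rows` exactly as it would over the result of a database query, and nobody has to walk the DOM tree by hand for the common case.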

XSLT is interesting because it is so many different things, but that is also part of its problem. It is really best as a tree-reordering language. It is a very compact way to transform one XML tree into another. But it is a crappy procedural language. And it is a verbose and clumsy templating language. But still, some people believe that using a single language will ultimately save costs somehow, in user training or system maintenance. I am not convinced from what I have seen so far, though.

Anyway, beyond just the XSLT and templating question, I think it's kind of strange watching how ArsDigita threw out almost everything they had working and started over. The new application model they are using, of persistent user interface objects and Model-View-Controller, etc., was tried by a number of other web application framework products over the years, such as the original NeXT WebObjects product (admittedly, that was in Objective-C). I think in many cases those systems proved to be difficult to customize and maintain, because of the extra layers of abstraction above the database and the extra layer of abstraction between the application and the HTML pages.

I also think the bboard example is something to be wary of, because, perhaps not coincidentally, the bboard functionality of successive releases of the ArsDigita tools has been getting steadily less functional and efficient.
The argument that it's easier to reformat the bboard in lots of different ways is again ignoring the efficiency issue. I would worry that on top of an inefficient underlying representation, adding the overhead of XSLT processing is probably going to hurt the performance (although the
XSL template processor supposedly compiles the templates, still I don't know how efficient the whole thing is compared to straight procedural Java as compiled by the original ACS4 templating system).

Looking at the ACS 5 stuff reminds me of something Greg Haverkamp used to say, "Too many moving parts..."

Posted by Don Baccus on
Henry ... your comment about bboards and their evolution in the ACS has got to be the understatement of the century.
I've just had such a pleasurable couple of days whacking on the ACES general-comments based
one for a client, why oh why did aD throw out that tried-and-true bboard datamodel which was fully capable of dealing with tree-threaded
presentation?  The threaded UI in the traditional bboard module sucked, but layering a new UI over the old datamodel would've been simple (and the ACES threaded bboard UI sucks almost as bad anyway!)

As far as your comments about XML as a datasource, my thinking has been that using ns_xml we should be able to do something like that for
slurping and using data represented in XML.  Why not, eh?  We could fill portlets with stuff from remote sites this way, for instance, since that's already set up to recognize a variety of datasources (tcl
scripts, html, etc).  We'd just need some scripting pieces to pull apart the XML doc and point the hose at the portlet afterwards.

I'm finding this discussion about XSLT very enlightening, as I've never played with it and have really had a hard time coming up with a reason for wanting to.

Posted by Stephen . on
I think even the ACS4Tcl templating system is cumbersome and would be better scrapped for straight Tcl. There's a lot more presentation logic in any page than HTML, which to achieve with the ACS' not-nearly-XML syntax is a pain!

Tcl has to be one of the simplest scripting languages available. I don't think the addition of <angle brackets> makes it any easier for designers or any one else.

Thank you...

Posted by Richard Li on
The choice of XSLT in ACS Java 4.6 was driven by two perceived requirements: global styling and componentization.

By global styling, I mean that we wanted to provide a way to consistently style an entire site or subsite by changing stylesheet rules in one place. Since the XSLT transforms in 4.6 are built hierarchically, changing the appearance of a subsite or the entire site generally requires you to modify one rule in a single place, rather than many rules in many places. (This also makes a site easier to upgrade, since customizations are isolated.)

Componentization was the idea that we wanted to build reusable widgets that could be plopped onto a page -- and look like they were part of the page. In other words, if the page had specific styling, or was part of a subsite that had subsite-specific styling, the component would automatically adopt the appearance of that page or subsite. You could do this, sort of, in the ACS 4 templating world. One issue that came up, though, was when you started to nest components: you would go crazy trying to make sure that the HTML would nest properly. By using XML as an abstraction layer, we avoid that problem.

We're not sure yet that this is the solution of the future (tm). That's one reason why 4.6 provides full support for JSPs and regular ol' Javabeans. I think there are compelling benefits to using XSLT, but there are also a number of challenges we'll have to meet in order to really make the system great (as some of the posts above clearly indicate).
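The global-styling idea (override a rule once for a subsite, inherit everything else from the site) can be sketched without any XSLT at all. The Python below is a plain lookup table, not how ACS 4.6 actually resolves stylesheet rules; the `STYLE_RULES` table, the rule names, and the `resolve_style` function are all hypothetical.

```python
# One rule table per site or subsite; a subsite lists only what it overrides.
STYLE_RULES = {
    "/": {"table": "site-table", "form": "site-form"},
    "/intranet": {"table": "intranet-table"},
}


def resolve_style(path, widget):
    """Find the rule for widget, walking from the page up to the site root."""
    parts = [p for p in path.split("/") if p]
    for depth in range(len(parts), -1, -1):
        prefix = "/" + "/".join(parts[:depth])
        rule = STYLE_RULES.get(prefix, {}).get(widget)
        if rule is not None:
            return rule
    raise KeyError(widget)
```

A table widget dropped onto `/intranet/bboard` picks up the intranet rule, while a form on the same page falls back to the site-wide one; changing the intranet look means editing a single entry.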

I wonder if anyone (outside of aD, please) has had such experience, similar or otherwise?

Talli, as someone from aD, I was wondering if you could clarify what you mean by the "outside of aD, please" part 😊. Are you implying that the credibility of a poster on the merits of XSLT is suspect when s/he is from aD? I don't understand why you made this comment, since I feel that, even if you don't like the company that the person works for, a post (especially on a technical issue) should be discussed on its merits. Suppose, hypothetically, that I knew someone from Musea was posting nasty comments under a pseudonym. I wouldn't automatically stop reading all Musea posts because of that one person...
Posted by Talli Somekh on
Richard, I wanted to hear from people that used XSLT in various instances, not from people (at aD) offering arguments validating their choices. I was not trying to attack the ACS 4.6-5 system's choice to use XSLT. I was only wondering whether it was worth implementing in a system. Bryan's post helped me understand what you were trying to do. Your post, unfortunately, muddled it further with more abstraction. Oh well.

I wanted to know about XSLT not only because of ACS4.6 but because it's either the Next Big Thing or a really bad buzzword. I had been studying it for a while and after reading aD's documents I had no idea whether they were good or not.

In most of my posts, I have mentioned that while I support aD, I am constantly dismayed that they choose to ignore a strong community that used to be an adoring fan base.

But there have been enough OpenACS vs. aD arguments around here lately. I want to say that I support and wish aD the best. They are doing something rare and very cool by trying to build a new product that is open source. I hope that aD succeeds and multiplies because I think their business model is unique and the right way to do it. I also hope, though, that aD begins to recognize that the value of open source is the community that gathers around it. I believe that aD has made huge strides on the latter point since it was very transparent throughout the development of ACS4.6, something that I appreciated a great deal. Also, I do appreciate the development and release of ACS4 Tcl, which OpenACS has picked up and is doing a great job with.

When I posted announcements for the first OpenACS social, I made some silly comments poking fun at aD. I meant it in fun, and I apologize for that.

But I am always happy and pleased when aD'ers show up at the socials because there is some important cross talk that occurs at a much higher and amiable level than on these boards. Lars came to the first one, a bunch of aD'ers came to the second (Paul Hubers, Andrew Piskorski, David Tropiano and probably another 5 or so) and at the last social Dennis Gregoravic, Andrew Piskorski and Eve were there. This was important because both sides were able to share perspectives on the development of their systems.

I'm usually one to fight back whenever someone challenges me, but no one is benefitting from the silly wars that have been going on lately. It's becoming more and more clear that the aD people should stay on their side and we should stay on ours.

talli

Stephen, since you ask:
FWIW, I think Karl's design for ACS4 templating was pretty close to ideal. But that's an armchair analysis, not one from experience. For those who have worked with it, what was it like?

Well, I've had moderate experience using the ACS 4 templating system, and it is fine. Quite useful, and really fairly simple.

At one point, I dug into how it actually worked, as I had a registered filter doing some special-purpose access control checking across thousands of URLs, which needed to tie into the templating system in order to serve nicely templated "access denied" pages and the like. Hm, here, I saved a code snippet - I ended up doing something like this:

# These all give the correct absolute unix pathname to the
# denied-access template:

#set file_root "[ns_info pageroot]/../templates/denied-access"
#set file_root "[get_server_root]/templates/denied-access"
set file_root [template::util::url_to_file {/templates/denied-access}]

set args [list title $title context_bar $context_bar page_content $page]
set output [template::adp_parse $file_root $args]
db_release_unused_handles
ns_return 403 text/html $output

When I was digging into the templating code in order to really understand how to do the above, my feeling was that the code could have been somewhat better commented, and that the piece that does the walking up and down the tree of included master templates could probably have been made more general - I seem to vaguely remember that, as written, it works only for the usual case where you start off with a *.tcl / *.adp pair in the file-system - not when you call into the templating system procs from some Tcl code that you're writing, like I was doing in the example above.

But those are trivial quibbles, irrelevant to probably 99.9% of the uses to which the templating system is put. So basically, I don't know 'nothin about XML or XSLT, but the ACS 4 templating system struck me as an excellent tool.

Oh yeah, and back when I still worked at aD, I remember hearing one of the aD London developers talk about the graphical-design process used on one of their client projects. Apparently, the client's graphic designers all used some fancy Mac-based HTML editor which has hooks of some sort for foreign (non-HTML) tags.

So somebody (one of the aD developers, I believe) wrote some stuff for those hooks which accomplished two things: One, the Mac program now "knew about" the templating system tags and would not bash them up when editing .adp files. And two, these hooks would insert fake stub data, so for example the <multiple> tag would show with five rows of bogus data, to give the graphic designers an idea of how the final product would look when live and database-backed.

From what I heard, this was a wildly successful example of separating the graphic design from the programming - frequently a source of great friction on web projects. And these graphic designers loved it too - not just the programmers.

Posted by Don Baccus on
Boy, it would be nice to know which Mac tool was being used (Dreamweaver?) and to get the code that the hooks hooked!

Posted by Stephen . on
The templating system works well for the simplest cases, but in the real world that doesn't happen too often.

For example...

The multiple tag is typically used to display tabular information, but it does nothing to help you enforce a common style for tables across the whole site. Is this to be done page by page with Dreamweaver? What happens when change is required?

The form system does provide a mechanism to control the look and feel from one centralised place (that's why all those forms are blue..) but it doesn't allow any way to override this for subsites or subsections etc.

The template master system is a great idea, but you can only hard code the location of the master in the filesystem, or a single master can be set for an entire subsite. Normal pages within a package can be overridden by placing files under the web root, but master templates are different.

Anyway, I'm wondering how Dreamweaver displays a page with any similarity to reality when that page is composed of 10 or so components, considering the various schemes ACS uses for choosing those components?

A modern ACS page is no longer the product of the designer's fancy with some dynamic data filling the hole in the middle. It is composed of a header and footer, perhaps a login widget, some dynamic navigation display on the side, maybe a higher level tab display along the top, some user options, maybe they have admin privileges on this object, user feedback at the bottom of the page, some data pulled from a CMS, an ad chosen by user demographics etc. etc... There are no chunks of HTML which can be conveniently churned out in a page designing tool. We're defining procs which are called with arguments, over and over again. It's nothing new, we were just confused because for a while we built 'pages'.

The twist that the web world seems to put on this is that of organization. I do think it's useful to place the procs/includes not in one or more big files, but within the filesystem corresponding somewhat to the URL hierarchy. URL hierarchy inheritance seems more useful than the traditional OO form...

Okay that's enough, looks like everyone else is happy with it...

Posted by John Sequeira on
The last big ACS project I was on was for a vertical market ASP. Their funding ran out before we took any sites live, but we had about 5 in QA. These were simple sites, with about 15 public-viewable pages, and 40 or so admin pages. The whole thing was built on ACS 4.0 using templates and subsite functionality.

I should note that I prototyped a solution for them using XML/XSLT to spit out the template pages using Userland's Frontier OODB/Content Management System. The client didn't like the fact that you couldn't modify pages in real-time (without doing a site republish), and since I was really reluctant to force them to learn the OODB I readily agreed to scrap this.

[Aside: This approach had previously served me well on a 200 page Cold Fusion intranet project - managing the web pages and navigation in an OODB (with look and feel inheritance,  and a strong macro language), but programmatically generating scripted web pages. The best of both (OO & script) worlds]

Anyway, for the ASP the templates were constructed so the designers could pull them up in Dreamweaver. They *loved* this. Dreamweaver has XHTML/XML hooks that can be coded to, and an initiative was underway to do something with it, but it didn't get out of R&D.

The learning curve of this templated approach allowed me to bring in part-time programmers (moonlighting from their cash-bleeding .com) who I did not have to train at all. I did a few via-email code reviews on their stuff and showed them some 4.0 shortcuts, but these somewhat junior programmers were productive from day 1. aD really did a great job balancing approachability and power in ACS 4.0.

Stephen makes some good points about when you would reach the limits of templating (components/sub-pages/inheritance), but I consider that even now to be the exception cases for pages that go into real sites. Phil had this really amazing passage about how you really shouldn't design your software to solve 100% of your problems in the Guide to Web Dev. I won't try to say it better here.