Forum OpenACS Development: Re: Moving to XHTML

4: Re: Moving to XHTML (response to 1)

Posted by Gustaf Neumann on 09/13/07 09:12 AM

I agree that moving towards XHTML will not be an easy path, especially when IE6 compatibility is desired. Note that the (great) analysis by Ian Hickson (of Opera, now Google) mostly concerns delivering XHTML with the media type text/html. There are alternatives (http://www.w3.org/TR/xhtml-media-types/), but they do not always work (see e.g. the W3C recommendation for IE http://www.w3.org/MarkUp/2004/xhtml-faq.html#ie and the arguments in http://lists.w3.org/Archives/Public/www-html-editor/2004JulSep/0027.html).

However, in the long run, XHTML is the way to go: newer HTML editors provide support for XHTML (e.g. TinyMCE, WYMeditor), and widely used systems such as WordPress support XHTML today (via a plugin that essentially checks what the web agent accepts and sets the MIME type accordingly). We should be able to do at least as well.

As a side note: by using tDOM as an HTML generator (as in the XOTcl templating), it is actually possible to output either HTML or XML. However, this would require a complete rewrite of acs-templating (also, not all of xowiki uses the tDOM generator). As WordPress shows, dynamic switching does not seem necessary.
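
To illustrate the idea of generating markup from a DOM tree and choosing the output syntax at serialization time, here is a minimal sketch. It uses Python's standard xml.etree.ElementTree as a stand-in, not tDOM or the XOTcl templating, and the helper names build_fragment and render are made up for this example.

```python
import xml.etree.ElementTree as ET


def build_fragment():
    """Build a small tree once, independent of the output syntax."""
    p = ET.Element("p")
    p.text = "first line"
    br = ET.SubElement(p, "br")  # an empty element
    br.tail = "second line"
    return p


def render(node, as_xml):
    """Serialize the same tree either with HTML or with XML syntax."""
    method = "xml" if as_xml else "html"
    return ET.tostring(node, encoding="unicode", method=method)


fragment = build_fragment()
print(render(fragment, as_xml=False))  # HTML syntax: <br> has no end tag
print(render(fragment, as_xml=True))   # XML syntax: <br /> is self-closed
```

The tree is built only once; only the serializer differs, which is why code that already goes through a DOM generator could switch formats without touching the page logic.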

5: Re: Moving to XHTML (response to 4)

Posted by Tom Jackson on 09/13/07 04:45 PM

I'm not sure what the templating system has to do with XHTML? Template engines should output whatever is asked of them. One failure of acs-templating is adding whitespace in place of removed tags, but otherwise can't it output XHTML right now? Maybe you mean that templates need to look like valid XHTML?

How do html editors fit into OpenACS development? Will future work require their use?

The thing I don't understand about XHTML is how it fits into the needs of a site which allows users to provide dynamic content. Users would never be able to put in anything which wasn't perfect. That means a very smart input filter, and I can't see how the filter would be able to distinguish errors from intentional user input.
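
A rough sketch of the kind of input check in question, assuming a filter that only tests well-formedness with Python's standard XML parser (the function name is_well_formed is hypothetical). It shows the difficulty: the check can reject input, but it cannot tell a typo from intentional text.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment):
    """Return True if the fragment parses as XML inside a wrapper element."""
    try:
        ET.fromstring("<div>%s</div>" % fragment)
        return True
    except ET.ParseError:
        return False


print(is_well_formed("<b>bold</b>"))    # balanced markup
print(is_well_formed("<b>bold"))        # forgotten end tag
print(is_well_formed("if a < b then"))  # plain text with a literal "<"
```

The second and third inputs both fail to parse, although one is a markup error and the other is ordinary text that merely needs escaping.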

But maybe there is a good reason for using XHTML. What is it?

Are there any pages on OpenACS which could be delivered as application/xhtml+xml? If not, are there any pages which are close? Has anyone tried to convert them to see what they look like?

My guess is that you would have to rewrite every page and proc which doesn't know about XHTML, and you would never be able to have a template that had any HTML tags in it, or every page would need to pass through a smart output filter with the same problems as the input filter. Then you have to deal with javascript and css differences, not just quoting the code, but the scripts themselves would need change.

I still think the fastest way to explore this is to find another site/toolkit which takes dynamic input and see how they handle the issues.

6: Re: Moving to XHTML (response to 5)

Posted by Don Baccus on 09/13/07 08:15 PM

Some systems use a different tagging scheme for user input, like [quote] rather than blockquote, and then transform it to the proper HTML or XHTML.
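
As a sketch of that approach (not taken from any particular forum package; the tag set and the name render_markup are assumptions for illustration), the converter below escapes all user text and only emits elements for bracket tags that nest properly, so its output is always balanced:

```python
import html
import re

# Hypothetical mapping from bracket tags to (X)HTML elements.
TAGS = {"quote": "blockquote", "b": "strong", "i": "em"}
TOKEN = re.compile(r"(\[/?(?:%s)\])" % "|".join(TAGS))


def render_markup(text):
    """Convert [tag]...[/tag] markup to balanced, escaped (X)HTML."""
    out = []
    stack = []  # names of currently open bracket tags
    for token in TOKEN.split(text):
        if not token:
            continue
        if TOKEN.fullmatch(token):
            name = token.strip("[]/")
            if not token.startswith("[/"):
                stack.append(name)
                out.append("<%s>" % TAGS[name])
                continue
            if stack and stack[-1] == name:
                stack.pop()
                out.append("</%s>" % TAGS[name])
                continue
            # A stray or mis-nested close tag falls through as literal text.
        out.append(html.escape(token))
    while stack:  # close anything the user left open
        out.append("</%s>" % TAGS[stack.pop()])
    return "".join(out)


print(render_markup("[quote]a < b[/quote]"))
```

Because the user never types angle-bracket markup directly, the same stored text can be rendered for either HTML or XHTML output.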

There's no plan to require HTML WYSIWYG editors to be used, but we already support Xinha out of the box. Sites that want to make sure pages are correct might want to enforce the use of a WYSIWYG editor.

Browsers are still going to render malformed XHTML as well as they can, just as they do with malformed HTML. Smaller devices like phones might not, but then again by the time we get done with our transition, phones won't be "small devices" in the capacity sense.

As to why to go this direction ... the whole world's going this direction. HTML 4.01 is the last HTML standard W3C will put out. Everything else will be XHTML.

Here's a page written in XHTML strict:

http://w3c.org

Does IE6 render it incorrectly????

7: Re: Moving to XHTML (response to 6)

Posted by Tom Jackson on 09/13/07 11:36 PM

Okay, so you are not going to be serving it as application/xhtml+xml, but as plain ol' text/html.

But I did notice that the static home page at w3.org is treated by Firefox as application/xhtml+xml even though a meta tag and the server headers indicate text/html.

I guess there isn't going to be an option of which to use in OpenACS: the home page here is already not HTML 4.01 Transitional, since it uses /> to close empty tags.

9: Re: Moving to XHTML (response to 7)

Posted by Gustaf Neumann on 09/14/07 01:54 PM

It looks to me as if, for http://www.w3.org/, the Apache server at W3C evaluates the "Accept" request header field from the user agent; if it contains application/xhtml+xml, it serves the page (XHTML 1.0) with this type (no meta tag with http-equiv). If I open this page with Safari, I get the meta tag http-equiv for the content type with text/html. We should be able to do something similar, depending on the capabilities of the browser.
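
A minimal sketch of this kind of negotiation, written in Python for illustration rather than as Apache or OpenACS configuration. The function names are hypothetical, and ignoring wildcards such as */* is an assumption made here so that browsers like IE6 fall back to text/html.

```python
XHTML = "application/xhtml+xml"


def accepts_xhtml(accept_header):
    """True if the Accept header names application/xhtml+xml with q > 0."""
    for item in accept_header.split(","):
        fields = [field.strip() for field in item.split(";")]
        if fields[0].lower() != XHTML:
            continue  # wildcards like */* are deliberately not counted
        quality = 1.0  # default when no q parameter is given
        for param in fields[1:]:
            if param.lower().startswith("q="):
                try:
                    quality = float(param[2:])
                except ValueError:
                    quality = 0.0  # treat an unparsable q as not acceptable
        return quality > 0
    return False


def choose_content_type(accept_header):
    """Pick the media type to send for an XHTML 1.0 page."""
    if accepts_xhtml(accept_header or ""):
        return XHTML
    return "text/html"


print(choose_content_type("text/html,application/xhtml+xml,*/*;q=0.8"))
print(choose_content_type("*/*"))
```

A server that varies its response this way should also send "Vary: Accept" so that caches keep the two variants apart.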

If we just want to stick with rendering traditional web pages, there is no great need to move towards XHTML (although it will make styling simpler). But look at developments like microformats http://microformats.org/, or GRDDL http://www.w3.org/TR/2007/REC-grddl-20070911/, and check out the use cases http://www.w3.org/TR/grddl-scenarios/ that show how to extract semantic information (RDF) from XHTML web pages. This opens many new perspectives, especially for systems built around a rich data model such as OpenACS.

-gustaf
PS: by "templating" in my earlier posting I was referring to the automatically generated HTML.

10: Re: Moving to XHTML (response to 9)

Posted by Tom Jackson on 09/14/07 11:37 PM

My version of Mozilla Firefox gets the w3 home page with a meta tag 'http-equiv="Content-Type" content="text/html; charset=utf-8"'. It is hard to see, smashed up against the head tag. However, I think you must be correct: the page is being sent to Firefox as application/xhtml+xml, but when I use wget, it is text/html. When I move the identical file to my server and send it, Firefox detects it as text/html, so the only difference must be the server headers.

So how do we solve the problem of being able to serve both? Simply making an additional template for each page wouldn't work. Some markup is produced in the Tcl pages, some in procs, and some is in the data. Somehow all these sources which make up the page that gets sent to the user have to be in sync.

My own method is to try to separate data, code and templating, roughly model-view-controller, but markup is difficult to handle generally. One idea is to have a data model which resembles RDF and to be able to apply templating to tiny chunks of data to create markup. As long as the two remain separate and allow for switching out the template, many different opportunities for reuse will open up. The RDF-type model would allow the possibility of browsing the elements which were used to create a web page. (This could be a simple HTML browser or something more complex, but either could operate on such a page just by using a different template. That is, the editor/browser would use the same technology on an expanded scale.) The RDF-type linking between objects provides the semantic hints needed to pull this off.
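
A toy sketch of that separation, under loud assumptions: the triples, the predicate names, and both template sets below are invented for illustration and are not an OpenACS data model. The data stays the same, and only the template set is swapped.

```python
from html import escape

# Hypothetical statements about one post: (subject, predicate, object).
TRIPLES = [
    ("post:10", "title", "Re: Moving to XHTML"),
    ("post:10", "author", "Tom Jackson"),
    ("post:10", "separator", ""),
]

# One small template per predicate; only the empty element differs.
HTML_TEMPLATES = {
    "title": "<h2>{value}</h2>",
    "author": "<p>Posted by {value}</p>",
    "separator": "<hr>",
}
XHTML_TEMPLATES = dict(HTML_TEMPLATES, separator="<hr />")


def render(subject, triples, templates):
    """Apply the matching template to each statement about the subject."""
    parts = []
    for s, predicate, value in triples:
        if s == subject and predicate in templates:
            parts.append(templates[predicate].format(value=escape(value)))
    return "\n".join(parts)


print(render("post:10", TRIPLES, HTML_TEMPLATES))
print(render("post:10", TRIPLES, XHTML_TEMPLATES))
```

A browser or editor for the underlying elements would, in this picture, just be a third template set over the same statements.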

13: Re: Moving to XHTML (response to 6)

Posted by Tom Jackson on 09/16/07 04:20 AM

"As to why to go this direction ... the whole world's going this direction. HTML 4.01 is the last HTML standard W3C will put out. Everything else will be XHTML."

Until yesterday I never heard of it, but there is an upcoming HTML 5.0. http://www.w3.org/html/wg/html5/

Relationship to HTML 4.01, XHTML 1.1, DOM2 HTML
This specification represents a new version of HTML4 and XHTML1, along with a new version of the associated DOM2 HTML API. Migration from HTML4 or XHTML1 to the format and APIs described in this specification should in most cases be straightforward, as care has been taken to ensure that backwards-compatibility is retained.
This specification will eventually supplant Web Forms 2.0 as well.
Relationship to XHTML2
XHTML2 defines a new HTML vocabulary with better features for hyperlinks, multimedia content, annotating document edits, rich metadata, declarative interactive forms, and describing the semantics of human literary works such as poems and scientific papers.
However, it lacks elements to express the semantics of many of the non-document types of content often seen on the Web. For instance, forum sites, auction sites, search engines, online shops, and the like, do not fit the document metaphor well, and are not covered by XHTML2.
This specification aims to extend HTML so that it is also suitable in these contexts.
XHTML2 and this specification use different namespaces and therefore can both be implemented in the same XML processor.