Upgrading existing ADPs to noquote templating
Introduction.
The variable substitution in the templating system has been changed to be friendlier towards quoting. The rationale for the change and the definitions of terms like quoting are given in the quoting article. As it discusses these concepts in some depth, we see no reason to repeat them here. Instead, we will assume that you have read that article and focus on the topic of this one: the changes you need to apply to make your module conform to the new quoting rules.
This text was written as a result of our efforts to make the ACS installation for the German Bank project work; it is therefore based on field experience rather than academic discussion. We hope you will find it useful.
Recap of the Theory.
The change to the templating system can be expressed in one sentence:
    All variables are now quoted by default, except those explicitly protected by ;noquote or ;literal.
This means that the only way your code can fail is if the new code quotes a variable which is not meant to be quoted, which is where ;noquote needs to be added. That is all the porting effort that is required.
More precisely, variables are subject to both HTML-quoting and internationalization. The suffix ;noquote means that the variable's content will be internationalized but not HTML-quoted, while ;noi18n means quote but don't internationalize. Finally, ;literal means don't quote and don't internationalize.
This is not hard because most variables will not be affected by this change. Most variables either need to be quoted (those containing textual data that comes from the database or from the user) or are unaffected by quoting (numerical database IDs, etc.) The variables where this behavior is undesired are those that contain HTML which is expected to be included as part of the page, and those that are already quoted by Tcl code. Such variables should be protected from quoting by the ;noquote modifier.
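As a rough illustration of the rule above, the following Tcl sketch mimics what the template does with a plain @var@ compared with a protected one. The helper render_var is hypothetical and not part of the templating system, internationalization is ignored, and the fallback definition of ns_quotehtml is a simplified stand-in (it escapes only &, < and >) so that the sketch can also be tried outside the server.

```tcl
# Stand-in so the sketch also runs in a plain tclsh, where the server's
# ns_quotehtml is not available. Simplified: only &, < and > are escaped.
if {[info commands ns_quotehtml] eq ""} {
    proc ns_quotehtml {text} {
        return [string map {& &amp; < &lt; > &gt;} $text]
    }
}

# Hypothetical helper: returns what the page would contain for @var@
# (empty modifier), @var;noi18n@, @var;noquote@ or @var;literal@.
# Only the quoting side is modelled; internationalization is ignored.
proc render_var {value {modifier ""}} {
    if {$modifier in {noquote literal}} {
        return $value
    }
    return [ns_quotehtml $value]
}

set user_text "Tom & Jerry <tj@example.com>"
set widget    "<b>bold</b>"

puts [render_var $user_text]        ;# Tom &amp; Jerry &lt;tj@example.com&gt;
puts [render_var $widget]           ;# &lt;b&gt;bold&lt;/b&gt;
puts [render_var $widget noquote]   ;# <b>bold</b>
```

The second line of output is the failure mode this article is about: HTML that was meant to be part of the page is shown to the user as text.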
The Most Common Cases.
The most common cases where you need to add ;noquote to the variable name are easy to recognize and identify.
Hidden form variables.
Also known as "hidden input fields", hidden form
variables are form fields with pre-defined values which are not
shown to the user. These days they are used for transferring
internal state across several form pages. In HTML, hidden form
variables look like this:
    <form>
      <input type="hidden" name=var1 value="value1">
      <input type="hidden" name=var2 value="value2">
      ... real form stuff ...
    </form>
ACS has a convenience function for creating hidden form variables, export_form_vars (spelled export_vars -form in the example below). It accepts a list of variables and returns the HTML code containing the hidden input tags that map variable names to variable values, as found in the Tcl environment. In that case, the Tcl code would assign the HTML code to a variable:
    set form_vars [export_vars -form {var1 var2}]
The ADP will simply refer to the form_vars variable:
    <form>
      @form_vars@ <!-- WRONG! Needs noquote -->
      ... real form stuff ...
    </form>
This will no longer work as intended because form_vars will be quoted like any other variable, and the user will end up seeing the raw HTML text of the hidden variables. Even worse, the browser will not be aware of these form fields, and the page will not work. After protecting the variable with ;noquote, everything works as expected:
    <form>
      @form_vars;noquote@
      ... real form stuff ...
    </form>
Snippets of HTML produced by Tcl code, aka widgets.
Normally we try to fit all HTML code into the ADP template and have
the Tcl code handle the "logic" of the program. And yet,
sometimes pieces of relatively convoluted HTML need to be included
in many templates. In such cases, it makes sense to generate the
widget programmatically and include it into the template
as a variable. A typical widget is a date entry widget which
provides the user the input and selection boxes for year, month,
and day, all of which default to the current date.
Another example of a widget is the context bar often found at the top of ACS pages.
Obviously, all widgets should be treated as HTML and therefore adorned with the ;noquote qualifier. This also assumes that the routines that build the widget are correctly written and that they will quote the components used to build the widget.
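To make the last point concrete, here is a minimal sketch of a widget builder that quotes its components itself, so that the finished widget can safely be emitted with ;noquote. The procedure month_widget and the variable month_select are hypothetical names chosen for illustration, and the fallback ns_quotehtml is a simplified stand-in for use outside the server.

```tcl
# Stand-in so the sketch also runs in a plain tclsh, where the server's
# ns_quotehtml is not available. Simplified: only &, < and > are escaped.
if {[info commands ns_quotehtml] eq ""} {
    proc ns_quotehtml {text} {
        return [string map {& &amp; < &lt; > &gt;} $text]
    }
}

# Hypothetical widget builder: the markup is written by the code, while
# every label that may come from the database or the user is quoted here,
# because the template will not quote the finished widget again.
proc month_widget {labels} {
    set html "<select name=\"month\">"
    foreach label $labels {
        append html "<option>[ns_quotehtml $label]</option>"
    }
    append html "</select>"
    return $html
}

set month_select [month_widget {January "R&D <draft>"}]
puts $month_select
# The ADP would then use: @month_select;noquote@
```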
Pieces of text that are already quoted.
This quoting is usually part of a more general preparation for HTML
rendering of the text. For instance, a bboard posting can be either
HTML or text. If it is HTML, we transmit it as is; if not, we
perform quoting, word-wrapping, etc. In both cases it is obvious
that quoting performed by the templating system would be redundant,
so we must be careful to add ;noquote to the ADP.
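A minimal sketch of such a preparation step is shown below, assuming a hypothetical procedure prepare_posting and a boolean flag that says whether the posting is HTML. Word-wrapping is left out, and the fallback ns_quotehtml is a simplified stand-in for use outside the server.

```tcl
# Stand-in so the sketch also runs in a plain tclsh, where the server's
# ns_quotehtml is not available. Simplified: only &, < and > are escaped.
if {[info commands ns_quotehtml] eq ""} {
    proc ns_quotehtml {text} {
        return [string map {& &amp; < &lt; > &gt;} $text]
    }
}

# Hypothetical preparation step: HTML postings pass through untouched,
# plain-text postings are quoted and get explicit line breaks.
proc prepare_posting {body is_html} {
    if {$is_html} {
        return $body
    }
    return [string map [list "\n" "<br>\n"] [ns_quotehtml $body]]
}

puts [prepare_posting "<i>hello</i>" 1]
puts [prepare_posting "1 < 2\nfine" 0]
# Either way the result is ready for display, so the ADP would use
# @body;noquote@ (assuming the result was stored in a variable named body).
```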
The property and include Gotchas.
Transfer of parameters between included ADPs often requires manual addition of ;noquote. Let's review why.
The property tag is used to pass a piece of information to the master template. This is used by the ADP whose writer consciously chose to let the master template handle a variable given by the Tcl code. Typically page titles, headings, and context bars are handled this way. For example:
master:
    <head>
      <title>@title@</title>
    </head>
    <body bgcolor="#ffffff">
      <h1>@heading@</h1>
      <slave>
    </body>
slave:
    <master>
    <property name="title">@title@</property>
    <property name="heading">@title@</property>
    ...
The obvious intention of the master is to allow its slave templates to provide a "title" and a "heading" of the page in a standardized fashion. The obvious intention of our slave template is to allow its corresponding Tcl code to set a single variable, title, which will be used for both title and heading. What's wrong with this code?
The problem is that title gets quoted twice, once by the slave template, and once by the master template. This is the result of how the templating system works: every occurrence of @variable@ is converted to [ns_quotehtml $variable], even when it is used only to set a property and you would expect the quoting to be suppressed.
Implementation note: Ideally, the templating system should avoid this pitfall by quoting the variable (or not) only once, at the point where the value is passed from the Tcl code to the templating system. However, no such point in time exists because what in fact happens is that the template gets compiled into code that simply takes what it needs from the environment and then does the quoting. Properties are passed to the master so that all the property variables are shoved into an environment; by the time the master template is executed, all information on which variable came from where and whether it might have already been quoted is lost.
This occurrence is often referred to as over-quoting. Over-quoting is sometimes hard to detect because things seem to work fine in most cases. To notice the problem in the example above (and in any other over-quoting example), the title needs to contain one of the characters <, > or &. If it does, they will appear quoted to the user instead of appearing as-is.
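The effect is easy to reproduce by applying the quoting step twice by hand. The sketch below only imitates the two templates with direct calls; the variable names in_slave and in_master are made up for illustration, and the fallback ns_quotehtml is a simplified stand-in for use outside the server.

```tcl
# Stand-in so the sketch also runs in a plain tclsh, where the server's
# ns_quotehtml is not available. Simplified: only &, < and > are escaped.
if {[info commands ns_quotehtml] eq ""} {
    proc ns_quotehtml {text} {
        return [string map {& &amp; < &lt; > &gt;} $text]
    }
}

set title "R&D <beta>"

# First pass: @title@ inside the property tag of the slave.
set in_slave [ns_quotehtml $title]
# Second pass: @title@ in the master, applied to the already quoted text.
set in_master [ns_quotehtml $in_slave]

puts $in_slave    ;# R&amp;D &lt;beta&gt;          correct HTML for the title
puts $in_master   ;# R&amp;amp;D &amp;lt;beta&amp;gt;  over-quoted
```

A browser renders the first line as R&D <beta>, but the second as R&amp;D &lt;beta&gt;. A title without any of the three special characters passes through both steps unchanged, which is why the defect is easy to miss.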
Over-quoting is resolved by suppressing the quoting of one of the variables. We strongly recommend that you add ;literal inside the property tag rather than in the master. First, the master is conceptually the one that "shows" the variable, so it makes sense that it gets to quote it. Second, a property tag is supposed to merely transfer a piece of text to the master; it is much cleaner and more maintainable if this transfer is defined to be non-lossy. This becomes important in practice when there is a hierarchy of master templates, e.g. one for the package and one for the whole site.
To reiterate, a bug-free version of the slave template looks like this:
slave sans over-quoting:
    <master>
    <property name="title">@title;literal@</property>
    <property name="heading">@title;literal@</property>
    ...
The exact same problem occurs when the include statement passes some text. Here is an example:
Including template:
    <include src="user-kick-form" id=@kicked_id@ reason=@default_reason@>
Included template:
    <form action="do-kick" method=POST>
      Kick user @name@.<br>
      Reason: <textarea name=reason>@reason@</textarea><br>
      <input type="submit" value="Kick">
    </form>
Here an include statement is used to include an HTML form widget, parts of which are defined with the Tcl variables $kicked_id and $default_reason, whose values presumably come from the database.
What happens is that reason that prefills the textarea is over-quoted. The reasons are the same as in the last example: it gets quoted once by the includer, and the second time by the included page. The fix is also similar: when you transfer non-constant text to an included page, make sure to add ;literal.
Including template, sans over-quoting:
    <include src="user-kick-form" id=@kicked_id;literal@ reason=@default_reason;literal@>
Upgrade Overview.
Upgrading a module to handle the new quoting rules consists of applying the process mentioned above to every ADP in the module. Using the knowledge gained above, we can specify exactly what needs to be done for each template. The items are sorted approximately by frequency of occurrence of the problem.
- Audit the template for variables that export form variables and add ;noquote to them.
- More generally, audit the template for variables that are known to contain HTML, e.g. those that contain widgets or HTML content provided by the user. Add ;noquote to them.
- Add ;literal to variables used inside the property tag.
- Add ;literal to textual variables whose values are passed as attributes of the include tag.
- Audit the template for occurrences of <%= [ns_quotehtml @variable@] %> and replace them with @variable@.
- Audit the Tcl code for occurrences of ns_quotehtml. If it is used to build an HTML component, leave it, but take note of the variable the result gets saved to. Otherwise, remove the quoting.
- Add ;noquote to the "HTML component" variables noted in the previous step.
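Part of this audit can be mechanized. The following sketch covers only the property tag item, under the assumption that the template text has already been read into a string. The procedure unprotected_properties is a hypothetical helper, and the regular expression is deliberately simple: it reports only property tags whose whole body is a single variable without any modifier.

```tcl
# Hypothetical audit helper: returns every <property> element whose body
# is a bare @variable@, i.e. one that carries no ;literal or ;noquote.
proc unprotected_properties {adp_text} {
    set pattern {<property[^>]*>\s*@[[:alnum:]_.]+@\s*</property>}
    return [regexp -all -inline $pattern $adp_text]
}

# Sample template text; in practice this would be read from an .adp file.
set adp {<master>
<property name="title">@title@</property>
<property name="heading">@title;literal@</property>}

foreach hit [unprotected_properties $adp] {
    puts "needs ;literal: $hit"
}
```

For the sample text only the first property tag is reported. Hits still need to be reviewed by hand, since the helper knows nothing about what the variable contains.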
Testing.
Fortunately, most of the problems with automatic quoting are very easy to diagnose. The most important point about testing is that it should cover as many cases as possible: ideally, testing should cover all the branches in all the templates. But regardless of the quality of your coverage, it is important to know how to conduct proper testing for the quoting changes. Here are the cases you need to watch out for.
- HTML junk appearing in the page. Literal HTML visible to the user typically comes from an "export_form_vars" or a widget variable that lacks ;noquote. To fix the problem, simply add ;noquote to the variable.
- Over-quoting and under-quoting. To detect quoting defects, you need to assume an active role in naming your objects. The best way to do it is to create objects (bboard forums, messages, news items, etc.) with names that contain the representation of an entity, e.g. "&copy;". This looks like the copyright SGML entity, and intentionally so. The testing consists of checking that the browser prints exactly what you typed in as the name. Thus if your forum/message/etc. is listed as "&copy;", everything is OK. If it is listed as "&amp;copy;", it means that the string was quoted twice, i.e. over-quoted. Finally, if the entity is interpreted (shown as ©), it means that the string lacks quoting, i.e. it is under-quoted.
To get rid of over-quoting, make sure that the variables don't get quoted in transport, such as in the property tag or as an attribute of the include tag. Also, make sure that your Tcl code is not quoting the variable as well.
To get rid of under-quoting, make sure that your variable gets quoted exactly once. This can be achieved either by removing a (presumably overzealous) ;noquote or by quoting the string from Tcl. The latter is necessary when building HTML components, such as a context bar, from strings that come from the database or from the user.
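The same check can be applied to the HTML source of a page instead of its rendered appearance. The sketch below assumes a hypothetical helper quoting_state that receives the name exactly as typed and the page source as a string. The fallback ns_quotehtml is a simplified stand-in for use outside the server. Note that the source of a correctly quoted page contains the name quoted once, which the browser then displays as typed.

```tcl
# Stand-in so the sketch also runs in a plain tclsh, where the server's
# ns_quotehtml is not available. Simplified: only &, < and > are escaped.
if {[info commands ns_quotehtml] eq ""} {
    proc ns_quotehtml {text} {
        return [string map {& &amp; < &lt; > &gt;} $text]
    }
}

# Hypothetical test helper: classifies how a typed name appears in the
# HTML source of a page. The most heavily quoted form is checked first.
proc quoting_state {typed html_source} {
    set once  [ns_quotehtml $typed]
    set twice [ns_quotehtml $once]
    if {[string first $twice $html_source] >= 0} { return over-quoted }
    if {[string first $once $html_source] >= 0}  { return ok }
    if {[string first $typed $html_source] >= 0} { return under-quoted }
    return absent
}

puts [quoting_state "&copy;" "<li>&amp;copy;</li>"]       ;# ok
puts [quoting_state "&copy;" "<li>&amp;amp;copy;</li>"]   ;# over-quoted
puts [quoting_state "&copy;" "<li>&copy;</li>"]           ;# under-quoted
```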
Hrvoje Niksic Last modified: Fri Nov 1 14:11:00 CET 2019