ad_dom_sanitize_html (public)

 ad_dom_sanitize_html -html html [ -allowed_tags allowed_tags ] \
    [ -allowed_attributes allowed_attributes ] \
    [ -allowed_protocols allowed_protocols ] \
    [ -unallowed_tags unallowed_tags ] \
    [ -unallowed_attributes unallowed_attributes ] \
    [ -unallowed_protocols unallowed_protocols ] [ -no_js ] \
    [ -no_outer_urls ] [ -validate ] [ -fix ]

Defined in packages/acs-tcl/tcl/text-html-procs.tcl

Sanitizes HTML according to the specified criteria by removing unallowed tags and attributes, JavaScript, and references to external URLs. The proc can also act purely as a validator to enforce markup policies.

Switches:
-html
(required)
the markup to be checked.
-allowed_tags
(optional)
list of tags we allow in the markup.
-allowed_attributes
(optional)
list of attributes we allow in the markup.
-allowed_protocols
(optional)
list of protocols we allow in links.
-unallowed_tags
(optional)
list of tags we don't allow in the markup.
-unallowed_attributes
(optional)
list of attributes we don't allow in the markup.
-unallowed_protocols
(optional)
list of protocols we don't allow in the markup. Protocol-relative URLs are allowed, but only if the proc is called from a connection thread, because the current connection protocol must be determined.
-no_js
(boolean) (optional)
this flag tells the proc to strip every script tag, inline event handler, and use of the javascript: pseudo-protocol from the markup.
-no_outer_urls
(boolean) (optional)
this flag tells the proc to remove every reference to external addresses. The proc tries to distinguish between external URLs and acceptable, fully specified internal ones. Acceptable URLs are transformed into absolute local references; all others are stripped together with their attribute. Absolute URLs referring to our host are allowed, but the proc must then be called from a connection thread in order to determine the current URL.
-validate
(boolean) (optional)
With this flag, the proc does not build the stripped markup and only reports whether the original markup meets all the specified requirements.
-fix
(boolean) (optional)
When parsing fails on the markup as it is, try to fix it by, for example, closing unclosed tags or normalizing attribute specifications. This operation removes most plain whitespace from the text content of the original HTML, together with every comment and any DOCTYPE declaration that may be present.
Returns:
sanitized markup or a (0/1) truth value when the -validate flag is specified
Author:
Antonio Pisano
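
The following is a minimal usage sketch based only on the switches documented above, not an excerpt from the OpenACS sources. The markup in the html variable and the variable names (html, clean, ok_p) are hypothetical, the call must run inside an OpenACS installation, and the exact form of the sanitized output is not shown because it depends on the implementation. It assumes that a return value of 1 under -validate means the markup satisfies the policy.

```tcl
# Hypothetical user-submitted markup (illustration only).
set html {<p onclick="alert(1)">Hello <a href="javascript:void(0)">link</a></p><script>alert(2)</script>}

# Sanitize: strip script tags, inline event handlers and javascript: URLs.
set clean [ad_dom_sanitize_html -html $html -no_js]

# Validate only: no stripped markup is built, a 0/1 truth value is returned.
# Assumption: 1 means the markup respects the requested policy.
set ok_p [ad_dom_sanitize_html -html $html -no_js -validate]
if {!$ok_p} {
    ns_log notice "submitted markup violates the no-JavaScript policy"
}
```

Adding -no_outer_urls or relying on protocol-relative URLs would, per the switch descriptions above, require calling the proc from a connection thread.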

Partial Call Graph (max 5 caller/called nodes):
Callers: test_ad_dom_sanitize_html (test acs-tcl), packages/general-comments/www/comment-add-2.tcl, packages/general-comments/www/comment-edit-2.tcl
Calls: ad_conn (public), ad_dom_fix_html (private), ad_log (public), dom, parameter::get (public)

Testcases:
ad_dom_sanitize_html