util_expand_entities_ie_style (public)
util_expand_entities_ie_style html
Defined in packages/acs-tcl/tcl/text-html-procs.tcl
Replaces all occurrences of &#111; and &#x0f; type HTML character entities with their ASCII equivalents. It also handles lt, gt, quot, ob, cb and amp.
This proc does the expansion in the style of IE and Netscape, which is to say that it doesn't require the trailing semicolon on the entity in order to replace it. We do this because the proc was designed for checking HTML for security issues: since entities can be used to hide malicious code, we had better simulate the liberal interpretation that browsers apply, even though it complicates matters.
Unlike its sister proc, util_expand_entities, it also expands numeric entities (#999 or #xff style).
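The liberal handling of numeric entities described above can be illustrated with a minimal, self-contained sketch. The proc name expand_numeric_entities_sketch is hypothetical and is not part of OpenACS; it uses a single regular expression instead of the character-by-character scan of the real proc, and it covers only the numeric forms, so it is an approximation rather than the actual implementation.

```tcl
# Hypothetical helper, not part of OpenACS: expands decimal (&#60) and
# hexadecimal (&#x3e) entities whether or not a trailing semicolon follows.
proc expand_numeric_entities_sketch {html} {
    set out ""
    set pos 0
    # Group 1 captures hex digits, group 2 decimal digits; ";" is optional.
    set re {&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));?}
    while {[regexp -indices -start $pos -- $re $html all hexIdx decIdx]} {
        lassign $all from to
        # Copy the untouched text that precedes the entity.
        append out [string range $html $pos [expr {$from - 1}]]
        if {[lindex $hexIdx 0] != -1} {
            scan [string range $html {*}$hexIdx] %x code
        } else {
            # scan %d always reads decimal, so leading zeros are harmless.
            scan [string range $html {*}$decIdx] %d code
        }
        append out [format %c $code]
        # Continue after the entity so that expanded text is not rescanned.
        set pos [expr {$to + 1}]
    }
    append out [string range $html $pos end]
    return $out
}

puts [expand_numeric_entities_sketch {&#60script&#x3e; &#0038; &#x41}]
# prints: <script> & A
```

Note that the first entity has no semicolon and is still expanded, which is the browser-like behavior that a security check needs to anticipate.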
- Parameters:
- html
- Author:
- Lars Pind <lars@pinds.com>
- Created:
- October 17, 2000
- Testcases:
- No testcase defined.
Source code:

    array set entities { lt < gt > quot \" ob \{ cb \} amp & }

    # Expand HTML entities on the value
    for { set i [string first & $html] } { $i != -1 } { set i [string first & $html $i] } {

        set match_p 0
        switch -regexp -- [string index $html $i+1] {
            # {
                switch -regexp -- [string index $html $i+2] {
                    [xX] {
                        regexp -indices -start [expr {$i+3}] {[0-9a-fA-F]*} $html hex_idx
                        set hex [string range $html [lindex $hex_idx 0] [lindex $hex_idx 1]]
                        set html [string replace $html $i [lindex $hex_idx 1] \
                                      [subst -nocommands -novariables "\\x$hex"]]
                        set match_p 1
                    }
                    [0-9] {
                        regexp -indices -start [expr {$i+2}] {[0-9]*} $html dec_idx
                        set dec [string range $html [lindex $dec_idx 0] [lindex $dec_idx 1]]
                        # $dec might contain leading 0s. Since format evaluates $dec as expr
                        # leading 0s cause octal interpretation and therefore errors on e.g. &#0038;
                        set dec [string trimleft $dec 0]
                        if {$dec eq ""} {set dec 0}
                        set html [string replace $html $i [lindex $dec_idx 1] \
                                      [format "%c" $dec]]
                        set match_p 1
                    }
                }
            }
            [a-zA-Z] {
                if { [regexp -indices -start $i {\A&([^\s;]+)} $html match entity_idx] } {
                    set entity [string tolower [string range $html [lindex $entity_idx 0] [lindex $entity_idx 1]]]
                    if { [info exists entities($entity)] } {
                        set html [string replace $html $i [lindex $match 1] $entities($entity)]
                    }
                    set match_p 1
                }
            }
        }
        incr i
        if { $match_p } {
            # remove trailing semicolon
            if {[string index $html $i] eq ";"} {
                set html [string replace $html $i $i]
            }
        }
    }
    return $html

XQL Not present: Generic, PostgreSQL, Oracle
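The leading-zero handling in the decimal branch of the source can be isolated as a small sketch. The proc name decimal_entity_char_sketch is hypothetical and not part of OpenACS. It assumes Tcl 8.x, where an integer string with a leading 0 is parsed as octal, so a value such as 0038 would be rejected by format unless the zeros are trimmed first.

```tcl
# Hypothetical helper, not part of OpenACS: turns the digit string of a
# decimal entity into the corresponding character.
proc decimal_entity_char_sketch {digits} {
    # Strip leading zeros so the value is never read as octal.
    set dec [string trimleft $digits 0]
    # A string consisting only of zeros trims to "", so restore a single 0.
    if {$dec eq ""} { set dec 0 }
    return [format %c $dec]
}

puts [decimal_entity_char_sketch 0038]
# prints: &
```

Trimming the string rather than forcing a numeric conversion keeps the logic independent of how a given Tcl version interprets leading zeros.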