Forum OpenACS Q&A: Another solution to Problems with Microsoft char set (eg 'smart quotes') in form input
6:
(response to 1)
Posted by Stan Kaufman on 07/22/02 11:02 PM
Rich, thanks for pointing to your code. I like your use of the Tcl string map command rather than regsub (more elegant, and besides, Brent Welch uses it in a similar example for getting rid of "smart quotes" in his Tcl book).
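To make the comparison concrete, here is a minimal sketch of the two approaches on a made-up input string. It is not Rich's code or Welch's example; the variable names are only for illustration, and it assumes the smart quotes arrive as the characters 0x93 and 0x94.

```tcl
# Sample input containing Windows-1252 "smart" double quotes (0x93, 0x94).
set input "He said \x93hello\x94"

# regsub rewrites one pattern per call, so each character needs its own pass.
regsub -all "\x93" $input "\"" step1
regsub -all "\x94" $step1 "\"" via_regsub

# string map takes every key/value pair at once and makes a single pass.
set via_map [string map [list \x93 "\"" \x94 "\""] $input]

puts $via_regsub
puts $via_map
```

Both calls should print the same cleaned string; the difference is that the string map version grows by one pair per character rather than one extra command per character.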
We've found it useful to include a few more mappings for Microsoft's "smart fractions" etc. Here's our version of this "decrufting" proc:
proc_doc decruft { cruft } {
    Takes a string and removes all the cruft introduced by Microsoft apps,
    such as their 'smart quotes'. Brute-force approach suggested by John
    Walker's Demoronizer, a Perl script which does a few other things that
    aren't germane here. This proc could get called lots of places, but to
    make it automatically run against all user input, we call it from
    ad_page_variables and (for backward compatibility since this still lurks
    in the code) set_the_usual_form_variables. It should be trivial to add it
    to page_contract or whatever OACS 4.5+ uses.
} {
    # ns_log Notice "Before De-Cruft: $cruft"
    set cruft [string map [list \
        \x82 ,    \x83 f    \x84 ,,   \x85 ... \
        \x86 t    \x87 I    \x88 ^    \x89 { */**} \
        \x8a S    \x8b <    \x8c Oe   \x8d {} \
        \x8e Z    \x8f {}   \x90 {}   \x91 ` \
        \x92 '    \x93 {"}  \x94 {"}  \x95 * \
        \x96 -    \x97 --   \x98 ~    \x99 tm \
        \x9a S    \x9b >    \x9c oe   \x9d {} \
        \x9e Z    \x9f Y    \xbd 1/2  \xbc 1/4 \
        \xbe 3/4 \
    ] $cruft]
    # ns_log Notice "After De-Cruft: $cruft"
    return $cruft
}
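For anyone who wants to try the mapping in a plain tclsh (where proc_doc is not available), here is a small sketch that applies a subset of the pairs above to a made-up sample. The sample string and the variable names are assumptions for illustration only.

```tcl
# Hypothetical sample: a few Windows-1252 characters as they might arrive from a form.
set sample "\x93Mix \xbd cup\x94 \x96 Acme\x99\x85"

# A subset of the pairs from the proc above: ellipsis, double quotes,
# dash, trademark sign, and the one-half fraction.
set subset [list \x85 ... \x93 {"} \x94 {"} \x96 - \x99 tm \xbd 1/2]

# string map replaces every key it finds in one left-to-right pass.
set cleaned [string map $subset $sample]
puts $cleaned
```

Note that this only matches if the input still carries the raw 0x80-0x9f values. If something upstream has already decoded the text from Windows-1252 into Unicode, the left double quote would be U+201C rather than 0x93, and the keys would need to change accordingly.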
In addition, instead of calling this proc within modules like bboard and news, we find it useful to push the call back into ad_page_variables (and set_the_usual_form_variables, since that still gets called some places). That way it always gets called regardless of the destination of the form data. FWIW, here's how we do it:
proc_doc ad_page_variables {variable_specs} {
    Current syntax:

        ad_page_variables {var_spec1 [varspec2] ... }

    This proc handles translating form inputs into Tcl variables, and
    checking to see that the correct set of inputs was supplied. Note that
    this is mostly a check on the proper programming of a set of pages.

    Here are the recognized var_specs:

        variable; means it's required

        {variable default-value}
            Optional, with default value. If the value is supplied but is
            null, and the default-value is present, that value is used.

        {variable -multiple-list}
            The value of the Tcl variable will be a list containing all of
            the values (in order) supplied for that form variable.
            Particularly useful for collecting checkboxes or select
            multiples. Note that if required or optional variables are
            specified more than once, the first (leftmost) value is used,
            and the rest are ignored.

        {variable -array}
            This syntax supports the idiom of supplying multiple form
            variables of the same name but ending with a "_[0-9]", e.g.,
            foo_1, foo_2.... Each value will be stored in the array
            variable variable with the index being whatever follows the
            underscore.

    There is an optional third element in the var_spec. If it is "QQ", "qq",
    or some variant, a variable named "QQvariable" will be created and given
    the same value, but with single quotes escaped suitable for handing to
    SQL. Other elements of the var_spec are ignored, so a documentation
    string describing the variable can be supplied.

    Note that the default value form will become the value form in a "set".

    Note that the default values are filled in from left to right, and can
    depend on values of variables to their left:

        ad_page_variables { file {start 0} {end {[expr $start + 20]}} }
} {
    # ns_log Notice "ad_page_variables"
    set exception_list [list]
    set form [ns_getform]
    if { $form != "" } {
        set form_size [ns_set size $form]
        set form_counter_i 0

        # first pass -- go through all the variables supplied in the form
        while {$form_counter_i<$form_size} {
            set variable [ns_set key $form $form_counter_i]
            set value [ns_set value $form $form_counter_i]
            check_for_form_variable_naughtiness $variable $value
            set found "not"
            # find the matching variable spec, if any
            foreach variable_spec $variable_specs {
                if { [llength $variable_spec] >= 2 } {
                    switch -- [lindex $variable_spec 1] {
                        -multiple-list {
                            if { [lindex $variable_spec 0] == $variable } {
                                # variable gets a list of all the values
                                upvar 1 $variable var
                                lappend var $value
                                set found "done"
                                break
                            }
                        }
                        -array {
                            set varname [lindex $variable_spec 0]
                            set pattern "($varname)_(.+)"
                            if { [regexp $pattern $variable match array index] } {
                                if { ![empty_string_p $array] } {
                                    upvar 1 $array arr
                                    set arr($index) [ns_set value $form $form_counter_i]
                                }
                                set found "done"
                                break
                            }
                        }
                        default {
                            if { [lindex $variable_spec 0] == $variable } {
                                set found "set"
                                break
                            }
                        }
                    }
                } elseif { $variable_spec == $variable } {
                    set found "set"
                    break
                }
            }
            if { $found == "set" } {
                upvar 1 $variable var
                if { ![info exists var] } {
                    # take the leftmost value, if there are multiple ones
                    set var [ns_set value $form $form_counter_i]
                }
            }
            incr form_counter_i
        }
    }

    # now make a pass over each variable spec, making sure everything required is there
    # and doing defaulting for unsupplied things that aren't required
    foreach variable_spec $variable_specs {
        set variable [lindex $variable_spec 0]
        upvar 1 $variable var
        if { [llength $variable_spec] >= 2 } {
            if { ![info exists var] } {
                set default_value_or_flag [lindex $variable_spec 1]
                switch -- $default_value_or_flag {
                    -array {
                        # don't set anything
                    }
                    -multiple-list {
                        set var [list]
                    }
                    default {
                        # Needs to be set.
                        uplevel [list eval set $variable "[subst [list $default_value_or_flag]]"]
                        # This used to be:
                        #
                        #   uplevel [list eval [list set $variable "$default_value_or_flag"]]
                        #
                        # But it wasn't properly performing substitutions.
                    }
                }
            }
        } else {
            if { ![info exists var] } {
                lappend exception_list "\"$variable\" required but not supplied. Bummer."
            }
        }

        # modified by rhs@mit.edu on 1/31/2000
        # to QQ everything by default (but not arrays)
        if {[info exists var] && ![array exists var]} {
            # Begin De-Cruft stuff here
            # ns_log Notice "Before De-Cruft: $var"
            set var [decruft $var]
            # ns_log Notice "After De-Cruft: $var"
            # End De-Cruft stuff here
            upvar QQ$variable QQvar
            set QQvar [DoubleApos $var]
        }
    }

    set n_exceptions [llength $exception_list]
    # this is an error in the HTML form
    if { $n_exceptions == 1 } {
        ns_returnerror 500 [lindex $exception_list 0]
        return -code return
    } elseif { $n_exceptions > 1 } {
        ns_returnerror 500 "<li>[join $exception_list " <li>"] "
        return -code return
    }
}
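The part that matters for the cruft problem is the last block of the loop: each value is cleaned first, and only then is the QQ copy made. Here is a rough standalone sketch of that ordering that runs in a plain tclsh (8.5 or later, for dict). All three procs are hypothetical stand-ins, not ACS code: decruft_demo uses only two mappings, and double_apos_demo is an assumed equivalent of what DoubleApos does (doubling single quotes for SQL).

```tcl
# Stand-in for the decruft proc, reduced to two mappings for brevity.
proc decruft_demo {text} {
    return [string map [list \x92 ' \x97 --] $text]
}

# Assumed stand-in for DoubleApos: double each single quote for SQL literals.
proc double_apos_demo {text} {
    return [string map [list ' ''] $text]
}

# Hypothetical helper: clean every value once, then derive the QQ copy
# from the cleaned value, mirroring the order used in the loop above.
proc clean_form_demo {form} {
    set result [dict create]
    dict for {name value} $form {
        set clean [decruft_demo $value]
        dict set result $name $clean
        dict set result QQ$name [double_apos_demo $clean]
    }
    return $result
}

set form [dict create title "Stan\x92s post" body "plain text"]
set vars [clean_form_demo $form]
puts [dict get $vars title]
puts [dict get $vars QQtitle]
```

The order is the point of the sketch: the smart apostrophe (0x92) turns into a plain ' during cleaning, so the quote-doubling step has to run afterwards or the new apostrophe would reach SQL unescaped.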
For amusement value, here's a demo we created that shows the problem and the fix: http://www.epimetrics.com/demos/decrufter?demo_id=7