Forum OpenACS Development: Language lawyering

Collapse
Posted by Alfred Werner on
I have been following a few of the "let's clean up threads" and looked at some pages, like http://www.infiniteinfo.com/openacs/things2remember.html along with many threads on this site.

Curiosity got the better of me, so I ran some greps on the latest 4.6.3 beta2 I just downloaded.

There were a few surprises - for example, why doesn't this break?
./packages/notifications/tcl/sweep-init.tcl:    if { $n_seconds == 24 * 60 * 60 } {
./packages/notifications/tcl/sweep-init.tcl:    } elseif { $n_seconds == 7 * 24 * 60 * 60 } {

I thought you had to use [expr 24 * 60 * 60] ??
What is actually being compared here?

Collapse
2: Re: Language lawyering (response to 1)
Posted by Jeff Davis on
It does an expr on the if statement (well, maybe there are some subtleties but that is more or less true).
$ tclsh
% if { 4 == 2 * 2} { puts true}
true
% if { 4 / 0 } { puts true }
divide by zero
Collapse
3: Re: Language lawyering (response to 1)
Posted by Andrew Piskorski on
It's still probably bad style to depend on the implicit expr in the if statement to do your multiplication. I would have written it "if { $n_seconds == [expr {24 * 60 * 60}] }".
Collapse
4: Re: Language lawyering (response to 1)
Posted by Don Baccus on
I don't think it's bad style at all ... it is clear that "if" is intended to be used without the explicit call to "expr".

From the Tcl doc:

The if command evaluates expr1 as an expression (in the same way that expr evaluates its argument).
Collapse
6: Re: Language lawyering (response to 4)
Posted by Alfred Werner on
Welcome back Don. That explains it for me - I missed that line in the TCL docs.

I was really looking to see how many cases of

if {$somestring == 'some literal' }  were in the code and that was just a surprise to me.

Seems like the following things happen with some regularity:

$somestring == 'some literal'  happens quite a bit and could be cleaned up pretty easily

There seems to be some confusion / debate over whether

$somestring eq 'some literal'
[string equal $somestring 'some literal']
[string compare $somestring 'some literal']

are equivalent.

FAIK, [string compare] and eq have been around for ages - [string equal] was JUST introduced.

[string equal] does EXACTLY the same thing as eq - compares two STRINGS. The reason for introducing [string equal] when eq already exists is to allow the other [string] parameters to come into play, and also because eq only exists within the context of an [expr] construct, which as you pointed out is implicitly invoked in an if statement (and is the reason I always assumed it was a first order operator, rather than only existing within the expr context).

[string compare] thinks of the two operands as less,equal, or greater,
[string equal] and eq only says whether they are identical strings. If you want to be case insensitive, both [string equal] and [string compare] have the -nocase option.

I had always missed the following:
if {$a eq $b}
is implicitly if [expr {$a eq $b}]
I wrongly assumed it was made to be [expr $a eq $b] - which just doesn't work in the TCL interpreter.

So back to looking at the 'things to look out for' threads and pages:
$somestring eq 'some literal' vs $somestring == 'some literal'.

Those will NOT return a bad result if the literal can not be interpreted as a number. The worst case there is that the interpreter always tries them as a number first, so they are more inefficient because the literal will always be a string. The only time it may be actually WRONG to use the == is if both arguments are variables that the author intends/expects to be strings, but may conceivably be numbers during some invocation.

Collapse
5: Re: Language lawyering (response to 1)
Posted by Don Baccus on
Note that your rewrite still depends on the implicit "expr". The correct way of writing it to not depend on this implicit call would be:
if { [expr { $n_seconds == 24 * 60 * 60} ] }

ain't none of us goin' to start writing:

if { [expr { [string is true $foo_p] } ] }
in all of our "if" statements ...
Collapse
9: Re: Language lawyering (response to 5)
Posted by Andrew Piskorski on
Don, of course a Tcl if statement does an explicit expr, I never said that was bad. Most of the code I've read or written tends to use that solely to check boolean truth values though, not to do a much math, which is why I figured maybe it was a style faux paux. [shrug] Wouldn't be the first time I've been wrong.
Collapse
7: Re: Language lawyering (response to 1)
Posted by Don Baccus on
The latter was the reasoning and in fact apparently aD and OpenACS folk were unaware of "eq"/"ne".  I know I was until you pointed this out - I've been using "string equal" ...

Sounds like we could clean up our code a bit.  Also our general way of writing "set" expressions is sub-optimal, I know folks in general seem to realize it's good practice to surround your "if" expr in curly-braces, but I'd never thought about the fact that the same is true of "set".  We aren't getting full speed from the bytecode compiler it appears:

"Enclose expressions in braces for the best speed and the smallest storage requirements. This allows the Tcl bytecode compiler to generate the best code.

As mentioned above, expressions are substituted twice: once by the Tcl parser and once by the expr command. For example, the commands

set a 3
set b {$a + 2}
expr $b*4

return 11, not a multiple of 4. This is because the Tcl parser will first substitute $a + 2 for the variable b, then the expr command will evaluate the expression $a + 2*4.

Most expressions do not require a second round of substitutions. Either they are enclosed in braces or, if not, their variable and command substitutions yield numbers or strings that don't themselves require substitutions. However, because a few unbraced expressions need two rounds of substitutions, the bytecode compiler must emit additional instructions to handle this situation. The most expensive code is required for unbraced expressions that contain command substitutions. These expressions must be implemented by generating new code each time the expression is executed."

The "gotcha" seems awful until you realize that AFAIK we've never been bit by it in "if" statements ...

Collapse
8: Re: Language lawyering (response to 7)
Posted by Alfred Werner on
Don, it gets worse :)

Following your example from the docs -

$tclsh
% info tclversion
8.4
% set a 3
3
% set b {$a + 2}
$a + 2
% expr $b * 4
11
% if {$b * 4 == 11} {puts "yep"}
can't use non-numeric string as operand of "*"
% if {11 == $b * 4} {puts "yep"}
can't use non-numeric string as operand of "*"
% if {[expr $b * 4] == 11} {puts "yep"}
yep

Yikes! Damned if you do - damned if you don't. Or at least if you do - you have to do more than what might seem obvious at first reading.

It is worth noting that if you try:
set b $a + 2  it is an error - set takes only two operands/parameters
set b [expr $a + 2] is the correct form. So if you were INTENDING to defer the math and just pass the formula in some recursive lisp-ish type of way then you'd need to remember to later [expr] it.

I guess this is just a cautionary note for anyone introducing braces in their code without first really understanding the deep voodoo at work. OTOH, I think it would be safe to change a lot of == to eq within if clauses, and should the code break, there was probably something wrong-ish to begin with.

Collapse
10: Re: Language lawyering (response to 1)
Posted by David Walker on
In the tcl documentation the first reference to "string equal" is in version 8.2.3 (under the "string command", of course).  The first reference to "eq" is 8.4.2 under the "expr" command.

So I wouldn't fault the Arsdigita folks too heavily not using "eq".

Collapse
11: Re: Language lawyering (response to 10)
Posted by Alfred Werner on
Doh - must have been thinking of perl for the eq command.

Looks like eq appeared in 8.4.0 actually - here's the CHANGE file for the release:
http://sourceforge.net/project/shownotes.php?release_id=109968 - came out in Nov 2000, so that's probably why I've seen it around.

8.4.0 I think was only just recently compatible with AOLSserver (3.5.1 in Nov 2002 ?? ), which explains why nobody's really using it - it being [expr eq].

Collapse
12: Re: Language lawyering (response to 1)
Posted by Don Baccus on
Oh, this is good to know, sounds like we shouldn't encourage use of "eq" until we switch over to AOLserver 4/Tcl 8.4 ...

And now I don't feel so stupid for not knowing about "eq" :)

It's a good thing we don't do much math in OpenACS, ain't it?  How confusing.  If you don't curly brace stuff the expression gets recompiled each time, as the quote above implies - I dug into the Tcl source just to make sure and yes, indeed, it's true.  It pushes the tokenized string and an EVAL bytecode op, and EVAL bytecompiles and executes the result.  If you bracket the bytecode itself is saved instead.

Which is why

if { ... }

is recommended even if the expression is a proc call.

Collapse
13: But, what is truth? (response to 1)
Posted by Alfred Werner on
% if 0 {puts true}
% if {0} {puts true}
% if true {puts true}
true
% if 1 {puts true}
true
% if {1} {puts true}
true
% if {true} {puts true}
true
% if {"true"} {puts true}
true
% set blah true
true
% if {$blah} { puts true}
true
% if {t} {puts true}
true
% if {yes} {puts true}
true

% if {randomstring} {puts true}
syntax error in expression "randomstring": variable references require preceding $
% set blah randomstring
randomstring
% if $blah {puts true}
syntax error in expression "randomstring": variable references require preceding $
% if {$blah} {puts true}
expected boolean value but got "randomstring"

% set blah -1.2
-1.2
% set blah -4323.937
-4323.937
% if $blah {puts true}
true
%  if {$blah} { puts true}
true

TCL is surprisingly generous in its ability to handle TRUTH. 1, t, true, and yes are all true as well as any number value that is non-zero. To be truly false you have to be equal to 0, f, false, no.

% set blah no
no
% if {$blah} { puts true}
%
% set blah false
false
% if {$blah} { puts true}
%
% if {!$blah} {puts true}
true

So, I'm not sure when all these versions of the truth showed up, but constructs like:

if {[string compare $html_p "t"] == 0} can be changed to
if {$html_p}
in /packages/ecommerce/www/admin/tools/spell.tcl
AND
if { [string compare $unique_users_p "t"] == 0 }
if {$unique_users_p}
in /packages/simple-survey/www/admin/responses.tcl

Ecommerce seems the most heavily 'protected' as far as using [string] and invoking == 0 against the results.

In most cases, people use 1 and 0 to store true and false, and then compare the result with == 0 anyway. In some cases, they actually store the result backwards (0 as true) but they are saved by the == 0.

find . -name *.tcl | xargs grep "&&"
find . -name *.tcl | xargs grep "||"
find . -name *.tcl | xargs grep "==0"
find . -name *.tcl | xargs grep "== 0"

Will find the majority of candidates.

A conversion to using boolean names - including in package configuration parameters, would lead to much more readable code - I'm willing to bet a lot of people often forget which is true and false - 1 and 0, which is why so many == 0 instances in the code. Strongly recommending 'true' and 'false' (or pick one of the three pairs) will help clean things up.

Ultimately [anything] == 0 should only be used in if statements when people are actually comparing numbers.

To that end, we should discourage the practice of _p variable names if people are actually storing counts.

Ultimately {[string compare $a $b] == 0} should be globally replaced with {$a eq $b}.

That's enough fun for a rainy Sunday afternoon :)

Collapse
14: Re: Language lawyering (response to 1)
Posted by Don Baccus on
This is good stuff.  I think the generousity of Tcl truth may date from Tcl 8 as I *think* that's when "string is true" was implemented, and clearly the "if" command is using "string is true" on the result (I say this because "string is true" is documented to accept <>0,t,true,yes, the very semantics you've uncovered)

So I suspect the old coding style you've uncovered may be due to Tcl 7.x semantics.

I've switched to using "string is true" because I didn't know that "if" was doing this automatically so now we can clean these out, too, can't we? :)

And, yes, we can start using "eq" once we require AOLserver 4.0 (not until after OpenACS 5.0, unfortunately, because AOLserver 4.0's still beta and though it should be finalized before OACS 5.0 is finalized it would be rather harsh to force folks to switch right away)

Until we know we can retire aol3.3+ad13 we can't really make this switch ...