Forum OpenACS Development: ad_page_contract optional default will clober a local var value

Hi,

I have applied a patch on this one:

https://openacs.org/bugtracker/openacs/bug?filter%2estatus=open&bug%5fnumber=891

Lars did point out that I should have solicited comments.  Anyway I thought it was not a change but a fix, I did not think about posting it.  I did post it the bugtracker to have it documented other than just a CVS log.  I do normally ask opinions when there are changes like the other issue that I posted about ad_page_contract.

Anyway to answer Lars question. The motivation behind it when a var is present before ad_page_contract either by ad_return_template, include, or plain tcl code.  The default value will override the current value.  Of course the form value will always be followed if its supplied.  So I did not see any major change here.  In my honest opinion is was a bug.  I apologize if I applied it without posting to the forum.

I think an optional form var should not overwrite an existing tcl var value if no form var is present.  Of course  if the optional form var is present it will be followed.

The old was:

- get the value from the form var if present
- get the value from the default if present

The fix was:

- get the value from the form var if present
- get the value from the tcl var if present
- get the value from the default if present

btw,

if the form var is present and tcl var is present.  it will complain that there is 2 values.  instead of clobering the tcl var silently.

Jun

Did you change the behavior of an api: yes. Is _that_ a bug: yes. For this you should not even have to ask: don't do it.

Lars is being nice. Unfortunately even if you did post, you might not get any reponse, and you go ahead and make the change. Before you know it we are going to have a toolkit full of junk. And we will end up having to pay folks to use it.

Jun, nothing personal, but we really need to get this habit under control. This is not a place to experiment and see what happens. Many folks have real code that relies on the core apis, and you can't just change the way they work because you want to, for your special situation. This is a very basic principle.

Think about this: we have been using these same procedures for years. What is the chance, after all this time, that what you found is a bug? Maybe at least ask around. And then even if it might be a bug, is it possible someone relies on this behavior?

Example. the ATS 'if' tag has an 'in' operator. To me this is a list, unfortunately it is implemented as a regexp, which gives it some funny behaviors. But in some cases these behaviors are useful, and I have relied on them. But because of the way it works, you can't just pass in a list as a variable. Is that a bug? Yes and no. You can't break the old behavior just to get the thing to work the way you want.

I posted a recommendation to add something like an 'inlist' operator that would work for lists, or lists of lists. No one responded. I assume it is unimportant. I don't put it in. Finished. I still use this on my sites as a convenience to me, but I'm not going to burdon everyone else with my fancy programming abilities.

Example: Lars suggested adding a -unclobber switch to db_multirow. I complained, saying it changed the behavior of the way db_multirow worked. Actually it turns out that db_multirow was changed before, adding the loop found in the db_foreach api. Unfortunately when this was added care was not taken to ensure that db_multirow worked the same way regardless of whether the code block was included or not. When the code block is included, variables get clobbered in the calling environment. When there is no code block this doesn't happen. This is dumb, because you might write a page one day, and everything works. Tomorrow, you add a foreach loop and you have to then go around changing the names of things that get clobbered. It also might not be apparent what is causing the problem, because only the value is incorrect, you don't really get an error with this kind of bug. To me this really was a bug in db_multirow. Lars probably would have been right to completely remove this behavior, but he only suggested a switch to adjust it when needed.

This last example points out something else that is important when making these kinds of changes. Providing background as to the problem you have found helps others evaluate the changes you are suggesting. Not many of us have the time to go look at the code in the detail you have had to to make the change. What you are suggesting could be the perfect solution, but you have to tell us how you got there. There was a reasoning process involved, but the rest of us are starting from scratch.

The behavior of getting the value from the existing tcl var if the formvar is optional and not present in the form seems very confusing and a potential source of silent errors to me.

If we are to change this behavior I'd rather see it give an error because, well, it probably is an error.  At least I wouldn't recommend depending on this behavior as being good programming practice.

I agree with Tom, though - we need to take great care when we change core APIs because real live production code lives and dies with our toolkit.  Just silently making a change is a big no-no.

Ok, I apologize for not soliciting comments for this.  I did not make the right judgement in committing this.

Can I ask an opinion which one to do:

- reject the patch and revert back to old.  Wherein the tcl var gets overridden by the default value of the ad_page_contract.

- check if the value exists and throw an error.

- check if the value exists and throw an error.  And add a switch not to clobber, so no error is thrown.

Thanks.  I apologize of the lack of judgement on this one.  Also I don't think I have shown any habits of changing this toolkit for my own good.

Jun, I know what it is like to search through a bunch of code to figure out why my code isn't working. If the problem happens to be in the core, I usually change it, because that is what I need for my stuff to work.

I think the hard decision comes when you want to apply your logical and working fix to the cvs copy.

Possibly a temporary fix to avoid changing your core copy is to create a package which loads after the core and replaces the proc which has the offending logic. I did this the other day for ns_getform.

It seems to be a fine line between improving the toolkit and frustrating other developers that don't have the time to keep up with all the changes, and the reasons for the change.

But here is what I understood from reading your original post. Somehow a variable exists before ad_page_contract is called. Now, what is the purpose of ad_page_contract? The purpose is to enforce the interface to that page. It's a contract. The behavior you describe seems to support the intended behavior of ad_page_contract. It should be clear from reading the tcl page which contains ad_page_contract that a variable is either supplied in the query, or a specified default is used. Now, if you start adding in your own stuff and sourcing this page in some other way, or including it in another page, why would anyone expect it to work the way they wanted it to. So if ad_page_contract was changed to use some other value, hunting down bugs would be very difficult. Future developers would have no way of knowing, by looking at the page, what values might end up there, without the current behavior.

Also, if this is changed, what is ad_page_contract supposed to do: validate the data already present in the variable with the filters specified?

An additional feature of ad_page_contract is to complain if you supply two copies of a variable. I'm not sure you will find an easy solution to determine when a variable came from some other location (the place you wanted it to) or from a buggy form. If it just uses the current value, that means ad_page_contract no longer complains when a form submits two values for one variable. The result is, again, a very hard to find bug, this time in a form.

I really think we need to try to limit the use of ad_page_contract for its intended purpose, which is to document and enforce the interface to the page.

Hi Tom,

I do make use of the practice you have given.  Normally I create a zzz-something-procs.tcl to change things that I want to change, but doesn't seem to be applicable to the general public. I guess its a fine line that I have misjudged.  Normally when I see a bug, I put a patch for it.  To my judgement at that time and testing, this patch will not create big changes.  And I am able to patch, so I did patch it after filing on the bug tracker.  Thanks for Lars for pointing this out.

Anyway I will just revert back.  Although I think we need to keep the bug open.  I think its a bug, maybe another solution can be more acceptable.

The backgrounder how I found this bug was.  I was using ad_return_template behaviour that the same level is being used.

For example I have generic.tcl which ad_return_template to specific.tcl.  I then added an optional form var, so I can override the value to specific.tcl.  For example /foo/bar generic.tcl will get some vars and decide which specific.tcl it needs to return.  Then I created something like this /foo/bar?usethis=value.  I think I first found this when passing some optional values to a preview page.  I could pass it ok, but then after the values with default of ad_return_template was not being accepted.  When it actuality its being clobbered.

Jun,

Have you reverted all that needs to be reverted?

I updated the whole code base and run into this problem:

[05/Oct/2003:14:10:04][2636.16384][-main-] Notice: Scheduling proc auth::sync::sweeper
[05/Oct/2003:14:10:04][2636.16384][-main-] Error: Error sourcing /web/oacsdev/packages/acs-content-repository/tcl/acs-content-repository-init.tcl:
invalid command name "template::filter"
    while executing
"template::filter add content::init"
    (file "/web/oacsdev/packages/acs-content-repository/tcl/acs-content-repository-init.tcl" line 1)
    invoked from within
"source $__file "
[05/Oct/2003:14:10:04][2636.16384][-main-] Notice: Scheduling proc lang::catalog::import_from_all_files_and_cache
[05/Oct/2003:14:10:04][2636.16384][-main-] Notice: lang::message::cache - Initializing message cache ...
The colored diff from the changes four days ago is here

I'll keep digging but if someone finds it before I do, please post here. Thanks.

Hi Ola,

Yes I have reveted.  Its basically just an and condition to check the var. e.g([info exists]).

The diff you have presented I believe is unrelated to this.  That diff is this patch by Brad.

https://openacs.org/bugtracker/openacs/com/acs-templating/bug?filter%2estatus=resolved&bug%5fnumber=651

I helped out Lars in looking for Brad.  I got Brad's diff from the head.  I added a small one liner documentation.  Other than I applied that patch.

I also looked at tcl/acs-content-repository-init.tcl  I am not sure if its related to it.  I also did some quick test on template::foward, it does work with simple and "t" call that was added by Brad.

Can you give me some more info, how you encountered the error?

Forgive me, Jun,

I have myself to blame for this error ... HEAD works without problem.

Sorry about jumping to conclusions.

/Ola

Jun Please refer to our Previous discussion from Sept. 21 on this subject. I think I and several other folks made it clear that this is a big deal that should be condensed into a TIP. However, you have yet to provide one example of why this is a bug. ad_page_contract is doing something you don't like, but it is doing exactly what it was designed to do.

Jun,

FYI, I just found out that it was the .info file in acs-templating I had corrupted by using double quotes in the parameter description ... that was enough to hose that package. Grrr! I'll have to make do with single quotes :-)

This will teach me to always make a totally fresh installation before reporting errors.

/Ola

Ola,

Its ok.  Also thanks to you I retested things up.  I found some 2 bugs on wizard which dependend on the template::foward for Brad.  I have fixed one (some lines got messed up on Brad's diff), still trying to fix another (I will likely need Brad's help this one).

Tom,

Thanks for pointing that out.  But I have to say its not really related to this.  I have also indicated on my last post on thread that the alternative solution are not successful.  And I will not post a tip anymore.  As you guys call it...  Beating a dead horse.  You can beat that dead horse on the other thread. The other thread is not a bug, its a change so I can not provide an example that shows its a bug.  So I posted it, did not file a bug on the bugtracker and discussed.  Got consensus its not good, took the suggested alternative, unable to provide a solution in time, concluded the thread.

As opposed to this thread, I have judge that it was a bug, so I posted on the bug tracker.  And my mistake to misjudge not to post it.  You can beat this horse too.  And I think clearly that I am not mucking around and changing things all over to my liking.  I filled a bug, made a patch and volunteered to apply it.  I have used that patch for a couple of weeks, and before applying did some testing and I did not find any drawbacks.

I have reverted so any further discussion and testing can be done.  Sorry Tom but is so darn frustrating.  I am just trying to help out here.  Maybe you can help out and apply the patch to your instance and see if it breaks something. I am pertaining to the patch on thread, not the other thread.

Anyway I better sleep, its like my am not thinking straight anymore.

Jun, I don't think Tom's accusing you of making changes willy-nilly to the core APIs.  This is not the first time the subject of changing the core API without notice has come up - just the first time in awhile because we've educated ourselves not to do so.

But a bunch of us have made this particular error in the past, don't feel like you're being singled out other than of course you happened to make this particular change.

As far as ad_page_contract itself goes, I agree with Tom and would personally prefer that it flag as an error any existing Tcl var with the same name as one of the vars listed as potential GET or POST values being passed to the page. I don't feel strongly enough about this to post a TIP but I'm not supportive of your suggested change, either.

And to make things clear, that's in no way personal, I just disagree with your thinking in this particular case.

Absolutely, Jun, this is nothing personal, at all. I know I am less than diplomatic at times (like most of the time), so take that into account as well.

I did a little more digging on the effect of the patch. It actually turns out that the effect would be that any existing variable would skip validation/filtering completely, although post filters will still run. But the main filters are skipped because the variable isn't in the form.

I think Jun can get the effect he wants by doing the following:

  • Remove the variable from the environment that calls ad_page_contract.
  • Put the variable into the request form as a query var.
  • Call ad_page_contract
Hi,

Thanks to Tom for further testing this.  And catching a problem that I did not catch from my test.

I have put up 3 options on top.  One of it is already done, since I reverted already.  Don does suggest to flag the error which one of the option.

How would we like to flag this error?  Do we throw a:

- error "blah blah"
- use ns_log to give the error?

I am thinking that the error should be more visible to the developer.  But what if this happens on run time, maybe its a bit too much for the user?  Use ns_log?