Forum OpenACS Development: Proposal for Tcl/SQL contract

Collapse
Posted by Dan Chak on
I have been trolling the bboards, and I am impressed at how well OpenACS-4 is coming along! I have one concern about something that might impede future development though, and would like to get some feedback on a possible solution I have been thinking about.

The old style of writing a db query (but using the db-abstraction) in a tcl page went something like:
db_1row query_name "
       select u.first_names,
              u.last_name,
              s.something
         from users u,
              something_tbl s
        where s.user_id = u.user_id"
And similarly for db_foreach, etc. In these statements, the variables first_names, last_name, and something are all implicitly being declared and defined by the db-abstraction proc. One is free to use them later in the tcl page.

However, what we are going to see soon is something like:
db_1row query_name
The tcl code following the query will still use the variables that are implicitly declared, but by examining a tcl page, there is really no way to know what variables have been declared for use by the db proc. We could encourage people to put comments around their db calls, but comments may become incorrect if someone updates a sql file but forgets to update the comments in the tcl file.

I read that Don Baccus was planning to clean sweap out the sql portions of the tcl files once the porting is completed. I would like to instead propose that the old sql code be replaced with a small contract, so instead of leaving only:
db_1row query_name
which describes nothing of the implicity declarations taking place in the db proc, we instead can have
db_1row query_name {first_names last_name something}
This way, there is still something left in the tcl file that both describes what variables are availabe after the db proc executes, and also describes what is expected of the sql code in the .xql file.

Additionally, one could invision somewhere down the road a code checking tool that compares the sql contract in tcl code and sql code in a .xql file and alerts you when they are out of sync.

Some things I would need other's input on:
  • Should the contract to go both ways? IE, should bind variables also be part of the contract?
  • Does this make programming more tedious, or more clear? Personally, I see it as writing more self-documenting code, much as ad_page_variables was much better at self documenting than set_the_usual_form_variables was.
  • anything else someone can think of.
Comments please.
Collapse
Posted by Domingo Alvarez duarte on
Well I was thinking about the code obscurity that removing the sql queries from the source files will promote as well.
And I'm not so happy with the amount of overhead the new database API is adding to the program, one of my thoughts was to preserve the actual style :

db_1row query_name "
      select u.first_names,
              u.last_name,
              s.something
        from users u,
              something_tbl s
        where s.user_id = u.user_i

and based on the query name run at installation time a script that will parse the code and based on the query name will replace the query string based on the database in use.

That way we can work confortable with the code and run a utility script that will extract the query strings and add then to a repository.

And as a bonus there isn't so much overhead.

Collapse
Posted by Dan Chak on
Domingo, your solution addresses the obscurity of variables, but I don't think it's a good solution for a couple reasons.

  • The query dispatcher is already a reality. Questioning its existance seems, at least to me, to be a moot point already. My proposal will still work with the query dispatcher as-is (it already ignores the second variable passed to a db-proc, which it can continue to do unless we want contract enforcement), whereas your script solution calls for a large change to the acs-core.
  • Having vestigial sql code in tcl files will be extremely error prone. For your solution to be viable, programmers should not be allowed to edit the tcl files to fix sql code. They would still have to edit the .xql file, and then run a script that would update the tcl file. I can envision people accidentally editing the tcl file and then obliterating their changes when they later run the script.

So I'm still keeping the same proposal of a short contract in the db proc that in essance is little more than an enforced comment. The decision to keep tcl and sql code separate is a good one, but there does need to be some level of obviousness in the tcl logic code as to where variables are coming from.

The system as-is is fine for porting, but it will be a nightmare when writing and debugging new modules, which I think we can all agree, is what will determine the longevity of OpenACS. Having to ask questions like "which file am I allowed to edit?" or "where did that variable come from?" will make it very unapealing for people to develop new modules.
Collapse
Posted by Ed Avis on

Personally I think that having mysterious SQL code in the .tcl files as well as in .xql files is a really bad idea. You'd end up with duplicate code in two places, and not know which one to change (or maybe you'd have to change both). Also you don't want code in your .tcl file which is never actually executed!

(Unless I have misunderstood Domingo Alvarez duarte's proposal.)

If you really really want to have a reminder of the SQL appearing in your .tcl file, make it into a comment! I'd advise against it, but if you like that sort of thing, using a comment gives the same effect as having a string of inline SQL code which is never used, but without the confusion.

The idea for a contract specifying what variables are used is a good idea. IMHO the best way to do it would be to 'call' SQL code like a procedure:

# Get the thingy with id 12345 and store its details into the 
# variables name and height.
# 
db_1row get_thingy {12345} {name height}

And in the .xql file, something like this (just pseudocode really, I'm not pretending that this is what you'd actually write):

get_thingy {
    # Input parameters
    id
} {
    # Output variable names
    nm ht
} {
    select thingy_name as nm,
           thingy_height as ht
    from thingies
    where thingy_id = :id
}

Then there wouldn't really be a possibility of getting out of sync because the variables must get matched up when the query is called. But this is something a bit further than what Dan Chak suggested.

With the original proposal of
db_1row query_name {first_names last_name something}

I'd be in favour _if_ these variable names are checked automatically somehow. If they are not checked and not actually used for anything, they shouldn't be in the code - better to put them as a comment explaining your intentions.

Collapse
Posted by Stephen . on
Oracle 9i supports standard SQL join syntax, case expressions, timestamp etc...

My guess is that most queries can now be made identical for both Oracle and Postgresql, and it would make sense to keep 'standard' queries in the Tcl files with any remaining database specific queries in xql files.

Collapse
Posted by Rob Wright on
I agree with Ed, that a signature for the queries would be nice if some type of validation was performed.  Nevertheless, as Dan described, the fundamental problem is that a query name will probably not give you all the information you want.  This can be the case with any method, parameter, or class.  The best solution may be just good software practice and thoroughly document the code.  But this is one area where a good IDE could can be of great benefit.  So while it may not be the best solution, another alternative may be to write an Emacs or Vi module so that with a keystroke, the query is looked up when the query name is highlighted/selected/active.  This could be added to the file manager or SDM, and the documentation include a reference to it as a useful tool.

just my thoughts...

Collapse
Posted by Domingo Alvarez duarte on
Ed and Rob idea sounds good and that will permit query reuse so no need for duplicate queries. And using even any IDE or editor extension will facilitate code writing or  revision.
Collapse
Posted by Ben Adida on
I think Dan's idea is quite good. Remember the important tenets of the query dispatcher:
  • we need to support multiple DBs. Possibly more than just Oracle and PG in the future. This means giving up some coding convenience. Such is life: give up short-term convenience for long-term maintainability.
  • we should make it possible for DBAs to optimize queries without digging into Tcl code.
  • we should cleanly separate Tcl from SQL so that the query dispatcher can be ported to other languages (for those Java and C# enthusiasts) without any dependence on the Tcl language.

Dan's proposal significantly improves the maintainability of the code with minimal programmer hassle, and without breaking the above tenets. I'd love to hear more comments on this particular proposal, as I am seriously considering implementing it in some form.

Collapse
Posted by Jonathan Marsden on
I like Dan's idea too.  Ed's enhancement of separating input and
output variables may also have merit, assuming it can be coded without
significant extra overhead.

Having db_1row somequery just magically set up several TCL variables would be ... hard to read and debug.  The proposed change solves that with minimal programmer hassle.

Go for it!