Forum OpenACS Development: Database API & Query Replacement

Posted by Ben Adida
The most important issue in this port is, of course, DB-related. With the new DB API and the huge shift of functionality implementation from Tcl to PL/SQL, the job should be much easier.

However, there are issues to address:

  • How do we handle bind variables? Can we fake this in the driver until PG has real bind vars? That would be great for generality.
  • How do we organize the "replacement" queries that PG will use instead of the default Oracle ones? Do we use a parallel file hierarchy? Remember that a query is uniquely identified by its name, package, and containing file name for Tcl pages, and by its name, package, and utility function for Tcl functions.
  • How do we handle queries that are "somewhat dynamic," meaning the selected fields and ANSI-compliant where clauses are variable? Can we build a DB-independent API for building up these statements? (A rough sketch follows this list.)
  • How do we handle queries that are "way too dynamic," meaning where the SQL is just fully dynamically generated? One idea is to mark the Tcl blocks that dynamically generate this SQL in order to simply swap out the whole code chunk, again without editing the original code files.
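
Here is a minimal sketch of what a DB-independent builder for the "somewhat dynamic" case might look like; the proc name and interface are purely illustrative:

proc db_build_select { table select_fields where_clauses } {
    # Build a SELECT from a variable field list and a variable set of
    # where clauses; each clause is assumed to be ANSI-compliant SQL.
    set sql "select [join $select_fields {, }] from $table"
    if { [llength $where_clauses] > 0 } {
        append sql " where [join $where_clauses { and }]"
    }
    return $sql
}

# Example: both the fields and the clauses vary at runtime.
set sql [db_build_select users {user_id email} \
    {"email is not null" "user_id > :min_id"}]
# => select user_id, email from users where email is not null and user_id > :min_id
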
The goal here is to build a new core that could potentially remain Oracle-compatible. Using the same codebase for both databases, with one simple switch between Oracle and Postgres, would be a serious win.
Posted by Dan Wickstrom
I've been looking at the bind vars and it seems that we should be able to fake it either in a Tcl proc or in the PG driver. It appears that all of the new DB API calls go through db_exec, which is shown below:
ad_proc -private db_exec { type db statement_name sql args } {

    A helper procedure to execute a SQL statement, potentially binding
    depending on the value of the $bind variable in the calling environment
    (if set).

} {
    set start_time [clock clicks]

    set errno [catch {
	upvar bind bind
	if { [info exists bind] && [llength $bind] != 0 } {
	    if { [llength $bind] == 1 } {
                # case 1 -> already an ns_set - pass it to the oracle driver with the -bind flag
		return [eval [list ns_ora $type $db -bind $bind $sql] $args]
	    } else {
                # case 2 -> we have a list - so convert to ns_set and pass to oracle driver
		set bind_vars [ns_set create]
		foreach { name value } $bind {
		    ns_set put $bind_vars $name $value
		}
		return [eval [list ns_ora $type $db -bind $bind_vars $sql] $args]
	    }
	} else {
            # case 3 -> go up to the calling scope and let the driver access the variables directly
	    return [uplevel 2 [list ns_ora $type $db $sql] $args]
	}
    } error]

    ad_call_proc_if_exists ds_collect_db_call $db $type $statement_name $sql $start_time $errno $error
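    # errno == 2 is TCL_RETURN: the catch body executed one of the
    # "return"s above, so $error holds the statement's actual result.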
    if { $errno == 2 } {
	return $error
    }

    global errorInfo errorCode
    return -code $errno -errorinfo $errorInfo -errorcode $errorCode $error
}

As far as I can tell there are three methods for using bind variables:

  1. Pass an ns_set with the -bind flag.
  2. Pass a flat list of name/value pairs with the -bind flag (case 2 above converts it to an ns_set).
  3. Use Tcl variables in the calling scope that match the bind variable names.
To get the porting effort going, we should be able to modify the db_exec proc to handle any of these three cases and substitute in any bind variables that are required. It also seems that we could modify the PG driver at a later date to handle the bind variable substitution itself.
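
A first cut at the substitution could be a plain Tcl proc along these lines; the proc name, the quoting, and the assumption that values arrive in an ns_set are all mine, and a real version would need to worry about datatypes and about colons inside string literals:

proc db_bind_var_substitution { sql bind_vars } {
    set result ""
    # Scan for :name tokens and splice in the quoted values.
    while { [regexp {^(.*?):([a-zA-Z_][a-zA-Z0-9_]*)(.*)$} $sql match pre name post] } {
        set value [ns_set get $bind_vars $name]
        # Double embedded single quotes before quoting the literal.
        regsub -all {'} $value {''} value
        append result "$pre'$value'"
        set sql $post
    }
    append result $sql
    return $result
}

# For example:
set bind_vars [ns_set create]
ns_set put $bind_vars user_id 42
db_bind_var_substitution {select * from users where user_id = :user_id} $bind_vars
# => select * from users where user_id = '42'
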

I haven't looked at clob/blob with bind variables, but it probably can be handled in a similar manner.

Has anybody seen any situations where we might run into trouble doing this?

Posted by Sebastian Skracic

Here's how I planned to implement bind variables for the InterBase driver:

InterBase does not support bind variables à la Oracle. Instead, it allows SQL statements with positional parameters, like:

SELECT col1, col2 FROM table1 WHERE col3 = ? AND col4 > ?

Positional parameters are denoted with question marks, and they have to be described using the appropriate InterBase API calls after the statement is prepared but prior to its actual execution. Once prepared, a statement can be reused (re-executed) with another set of positional parameter values.

My idea is to write a thin Tcl layer (ns_ib) that will transform ACS-friendly constructs like:

SELECT col1, col2 FROM table1 WHERE col3 = :tclvar1 AND col4 > :tclvar2
into the IB-friendly form shown above, grabbing $tclvar1 and $tclvar2 and passing them to the driver as the textual representations of the positional parameter values. Fortunately, when filling positional parameters, InterBase can convert a textual representation into any IB native datatype, although this conversion would not be too hard to write ourselves *if* we know which datatype to convert into.
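
To make the idea concrete, here is a rough sketch of just the transformation step (ns_ib itself and the InterBase calls are out of scope, and the proc name is invented); it naively assumes no colons appear inside string literals:

proc ib_transform_sql { sql } {
    set out ""
    set values [list]
    # Replace each :name token with "?" and record its value, pulled
    # from the caller's scope just as the Oracle driver does.
    while { [regexp {^(.*?):([a-zA-Z_][a-zA-Z0-9_]*)(.*)$} $sql match pre name post] } {
        lappend values [uplevel 1 [list set $name]]
        append out "$pre?"
        set sql $post
    }
    append out $sql
    return [list $out $values]
}

# With tclvar1 and tclvar2 set in the calling scope:
set tclvar1 foo
set tclvar2 10
set result [ib_transform_sql {SELECT col1, col2 FROM table1 WHERE col3 = :tclvar1 AND col4 > :tclvar2}]
# [lindex $result 0] => SELECT col1, col2 FROM table1 WHERE col3 = ? AND col4 > ?
# [lindex $result 1] => foo 10
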

I suspect that Postgres provides similar functionality.

Posted by Don Baccus
PG supports similar functionality with stored SQL functions (using standard SQL syntax, I believe???) but AFAIK not standalone SQL statements. I should dig into this a bit to see what's available...

I'm just catching up on ACS v4.0 and the porting effort, but here's a thought about organizing "replacement" (ironic quotes added) queries:

The ACS4 DB API document suggests (near the bottom) three possible strategies for storing queries that are not in the appropriate SQL dialect:

  1. The "db_implement_statement" approach - On server startup, explicitly call this API with the path of each logical SQL statement in the entire ACS. Cache the results (where there is a dialect mismatch) in: (a) an NSV array, (b) a pinned DB table, or (c) a special file (DBM?). A minimal sketch of the NSV variant follows this list.
  2. The "file-based" approach - Use parallel files (e.g. "foo.tcl" and "foo.pg7") written in the corresponding SQL dialect.
  3. The "magic file" approach - One massive file with translations for all the logical SQL statements in the ACS.
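
For approach 1 with the NSV array, a minimal sketch could look like this (the proc names, the "sql_translations" array, and the key scheme are all hypothetical):

proc db_implement_statement { path default_sql translated_sql } {
    # Called once per statement at server startup; only cache an entry
    # when the dialects actually differ.
    if { ![string equal $translated_sql $default_sql] } {
        nsv_set sql_translations $path $translated_sql
    }
}

proc db_lookup_statement { path default_sql } {
    # Fast path: most statements have no translation and fall through,
    # so the common case stays cheap.
    if { [nsv_exists sql_translations $path] } {
        return [nsv_get sql_translations $path]
    }
    return $default_sql
}

# At startup, e.g. from a package's *-init.tcl:
db_implement_statement /packages/foo/www/index.select_users \
    {select sysdate from dual} {select now()}

# At query time:
set sql [db_lookup_statement /packages/foo/www/index.select_users \
    {select sysdate from dual}]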

I would like to see a solution that:

  1. Makes the transitions from any DB to any other DB as seamless as possible.
  2. Doesn't make the common case slow for this flexibility. How often does one switch DBs?
  3. Allows for the possibility of _multiple/mixed db_ ACS releases.
  4. Avoids having multiple source files for every page to sync.
How about using the ACS 4.0 Object Model to do this?

Define a "SQL" object type with subtypes for each SQL dialect. The SQL for each dialect would be stored in separate tables, identified by the path-like scheme already suggested. Where dialect translation is not needed, the "SQL" type would not be subtyped, resulting in less redundancy (and fewer syncing problems). Yes, storing the translation table in the DB will add to the DB overhead. Can this table be explicitly cached, or pinned like option (b) of approach 1 above?
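
As a very rough sketch of the lookup side (every name here is invented, and it leans on the DB API we are in the middle of defining):

proc db_object_model_lookup { statement_path dialect default_sql } {
    # Look for a dialect-specific subtype row; fall back to the
    # untranslated statement when the type was never subtyped.
    return [db_string statement_lookup "
        select statement_text
        from   sql_statements
        where  statement_path = :statement_path
        and    dialect = :dialect
    " -default $default_sql]
}

The result could then be memoized in an NSV array, which would address the overhead concern for the common case.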

This is my "first-pass" at the idea. I'm not sure if this is a viable solution. Comments?