Tcl procedures: /packages/acs-kernel/10-database-procs.tcl
Tcl initialization: /packages/acs-kernel/database-init.tcl
One of OpenACS's great strengths is that code written for it is very close to the database. It is very easy to interact with the database from anywhere within OpenACS. Our goal is to develop a coherent API for database access which makes this even easier.
There were four significant problems with the way OpenACS
previously used the database (i.e., directly through the
ns_db
interface):
Handle management. We required code to pass database handles around, and for routines which needed to perform database access but didn't receive a database handle as input, it was difficult to know from which of the three "magic pools" (main, subquery, and log) to allocate a new handle.
-
Nested transactions. In our Oracle driver,
begin transaction
really means "turn auto-commit mode off" andend transaction
means "commit the current transaction and turn auto-commit mode on." Thus if transactional code needed to call a routine which needed to operate transactionally, the semantics were non-obvious. Consider:proc foo { db args } { db_transaction { ... } } db_transaction { db_dml unused {insert into greeble(bork) values(33)} foo $db db_dml unused {insert into greeble(bork) values(50)} }
This would insert greeble #33 and do all the stuff in
foo
transactionally, but theend transaction
infoo
would actually cause a commit, and greeble #50 would later be inserted in auto-commit mode. This could cause subtle bugs: e.g., in the case that the insert for greeble #50 failed, part of the "transaction" would have already have been committed!. This is not a good thing. Unorthodox use of variables. The standard mechanism for mapping column values into variables involved the use of the
set_variables_after_query
routine, which relies on an uplevel variable namedselection
(likewise forset_variables_after_subquery
andsubselection
).Hard-coded reliance on Oracle. It's difficult to write code supporting various different databases (dynamically using the appropriate dialect based on the type of database being used, e.g., using
DECODE
on Oracle andCASE ... WHEN
on Postgres).
The Database Access API addresses the first three problems by:
making use of database handles transparent
wrapping common database operations (including transaction management) in Tcl control structures (this is, after all, what Tcl is good at!)
It lays the groundwork for addressing the fourth problem by assigning each SQL statement a logical name. In a future version of the OpenACS Core, this API will translate logical statement names into actual SQL, based on the type of database in use. (To smooth the learning curve, we provide a facility for writing SQL inline for a "default SQL dialect", which we assume to be Oracle for now.)
To be clear, SQL abstraction is not fully implemented in OpenACS 3.3.1. The statement names supplied to each call are not used by the API at all. The API's design for SQL abstraction is in fact incomplete; unresolved issues include:
how to add
WHERE
clause criteria dynamicallyhow to build a dynamic
ORDER BY
clause (Ben Adida has a proposed solution for this)how to define a statement's formal interface (i.e., what bind variables it expects, what columns its
SELECT
clause must contain if it's a query) without actually implementing the statement in a specific SQL dialect
So why is the incremental change of adding statement naming to the API worth the effort? It is worth the effort because we know that giving each SQL statement a logical name will be required by the complete SQL abstraction design. Therefore, we know that the effort will not be wasted, and taking advantage of the new support for bind variables will already require code that uses 3.3.0 version of the API to be updated.
set_variables_after_query
is
gone! (Well, it's still there, but you'll never need to use
it.) The new API routines set local variables automatically. For
instance:
db_1row select_names "select first_names, last_name from users where user_id = [ad_conn user_id]" doc_body_append "Hello, $first_names $last_name!"
Like ns_db 1row
, this will
bomb if the query doesn't return any rows (no such user
exists). If this isn't what you want, you can write:
if { [db_0or1row select_names "select first_names, last_name from users where user_id = [ad_conn user_id]"] } { doc_body_append "Hello, $first_names $last_name!" } else { # Executed if the query returns no rows. doc_body_append "There's no such user!" }
Selecting a bunch of rows is a lot prettier now:
db_foreach select_names "select first_names, last_name from users" { doc_body_append "Say hi to $first_names $last_name for me!<br>" }
That's right, db_foreach
is now like ns_db select
plus a
while
loop plus set_variables_after_query
plus an
if
statement (containing code
to be executed if no rows are returned).
db_foreach select_names "select first_names, last_name from users where last_name like 'S%'" { doc_body_append "Say hi to $first_names $last_name for me!<br>" } if_no_rows { doc_body_append "There aren't any users with last names beginnings with S!" }
The new API keeps track of which handles are in use, and automatically allocates new handles when they are necessary (e.g., to perform subqueries while a select is active). For example:
doc_body_append "<ul>" db_foreach select_names "select first_names, last_name, user_id from users" { # Automatically allocated a database handle from the main pool. doc_body_append "<li>User $first_names $last_name\n<ul>" db_foreach select_groups "select group_id from user_group_map where user_id = $user_id" { # There's a selection in progress, so we allocated a database handle # from the subquery pool for this selection. doc_body_append "<li>Member of group #$group_id.\n" } if_no_rows { # Not a member of any groups. doc_body_append "<li>Not a member of any group.\n" } } doc_body_append "</ul>" db_release_unused_handles
A new handle isn't actually allocated and released for every
selection, of course - as a performance optimization, the API keeps
old handles around until db_release_unused_handles
is invoked (or
the script terminates).
Note that there is no analogue to ns_db gethandle
- the handle is always
automatically allocated the first time it's needed.
Introduction
Most SQL statements require that the code invoking the statement pass along data associated with that statement, usually obtained from the user. For instance, in order to delete a WimpyPoint presentation, a Tcl script might use the SQL statement
delete from wp_presentations where presentation_id = some_presentation_id
where some_presentation_id
is a
number which is a valid presentation ID of the presentation I want
to delete. It's easy to write code handling situations like
this since SQL statements can include bind variables, which represent
placeholders for actual data. A bind variable is specified as a
colon followed by an identifier, so the statement above can be
coded as:
db_dml presentation_delete { delete from wp_presentations where presentation_id = :some_presentation_id }
When this SQL statement is invoked, the value for the bind
variable :some_presentation_id
is pulled from the Tcl variable $some_presentation_id
(in the caller's
environment). Note that bind variables are not limited to one per
statement; you can use an arbitrary number, and each will pull from
the correspondingly named Tcl variable. (Alternatively, you can
also specify a list or ns_set
providing bind variables' values; see Usage.)
The value of a bind variable is taken literally by the database
driver, so there is never any need to put single-quotes around the
value for a bind variable, or to use db_quote
to escape single-quotes contained
in the value. The following works fine, despite the apostrophe:
set exclamation "That's all, folks!" db_dml exclamation_insert { insert into exclamations(exclamation) values(:exclamation) }
Note that you can use a bind variable in a SQL statement only
where you could use a literal (a number or single-quoted string).
Bind variables cannot be placeholders for things like SQL keywords,
table names, or column names, so the following will not work, even
if $table_name
is set
properly:
select * from :table_name
Why Bind Variables Are Useful
Why bother with bind variables at all - why not just write the Tcl statement above like this:
db_dml presentation_delete " delete from wp_presentations where presentation_id = $some_presentation_id "
(Note the use of double-quotes to allow the variable reference
to $some_presentation_id
to be
interpolated in.) This will work, but consider the case where some
devious user causes some_presentation_id
to be set to something
like '3 or 1 = 1'
,
which would result in the following statement being executed:
delete from wp_presentations where presentation_id = 3 or 1 = 1
This deletes every presentation in the database! Using bind
variables eliminates this gaping security hole: since bind variable
values are taken literally. Oracle will attempt to delete
presentations whose presentation ID is literally '3 or 1 = 1'
(i.e., no
presentations, since '3 or 1 =
1'
can't possibly be a valid integer primary key for
wp_presentations
. In general,
since Oracle always considers the values of bind variables to be
literals, it becomes more difficult for users to perform URL
surgery to trick scripts into running dangerous queries and
DML.
Usage
Every db_*
command accepting
a SQL command as an argument supports bind variables. You can
either
specify the
-bind
switch to provide a set with bind variable values, orspecify the
-bind
switch to explicitly provide a list of bind variable names and values, ornot specify a bind variable list at all, in which case Tcl variables are used as bind variables.
The default behavior (i.e., if the -bind
switch is omitted) is that these
procedures expect to find local variables that correspond in name
to the referenced bind variables, e.g.:
set user_id 123456 set role "administrator" db_foreach user_group_memberships_by_role { select g.group_id, g.group_name from user_groups g, user_group_map map where g.group_id = map.user_id and map.user_id = :user_id and map.role = :role } { # do something for each group of which user 123456 is in the role # of "administrator" }
The value of the local Tcl variable user_id
(123456) is bound to the
user_id
bind variable.
The -bind
switch can takes
the name of an ns_set
containing keys for each bind variable named in the query,
e.g.:
set bind_vars [ns_set create] ns_set put $bind_vars user_id 123456 ns_set put $bind_vars role "administrator" db_foreach user_group_memberships_by_role { select g.group_id, g.group_name from user_groups g, user_group_map map where g.group_id = map.user_id and map.user_id = :user_id and map.role = :role } -bind $bind_vars { # do something for each group in which user 123456 has the role # of "administrator" }
Alternatively, as an argument to -bind
you can specify a list of alternating
name/value pairs for bind variables:
db_foreach user_group_memberships_by_role { select g.group_id, g.group_name from user_groups g, user_group_map map where g.group_id = map.user_id and map.user_id = :user_id and map.role = :role } -bind [list user_id 123456 role "administrator"] { # do something for each group in which user 123456 has the role # of "administrator" }
When processing a DML statement, Oracle coerces empty strings
into null
. (This coercion does
not occur in the
WHERE
clause of a query, i.e.
col = ''
and
col is null
are not
equivalent.)
As a result, when using bind variables, the only way to make
Oracle set a column value to null
is to set the corresponding bind
variable to the empty string, since a bind variable whose value is
the string "null" will be interpreted as the literal
string "null".
These Oracle quirks complicate the process of writing clear and abstract DML difficult. Here is an example that illustrates why:
# # Given the table: # # create table foo ( # bar integer, # baz varchar(10) # ); # set bar "" set baz "" db_dml foo_create {insert into foo(bar, baz) values(:bar, :baz)} # # the values of the "bar" and "baz" columns in the new row are both # null, because Oracle has coerced the empty string (even for the # numeric column "bar") into null in both cases
We now require that each SQL statement be assigned a logical name for the statement that is unique to the procedure or page in which it is defined. This is so that (eventually) we can implement logically named statements with alternative SQL for non-Oracle databases (e.g., Postgres). More on this later.
Normally, db_foreach
,
db_0or1row
, and db_1row
places the results of queries in
Tcl variables, so you can say:
db_foreach users_select "select first_names, last_name from users" { doc_body_append "<li>$first_names $last_name\n" }
However, sometimes this is not sufficient: you may need to
examine the rows returned, to dynamically determine the set of
columns returned by the query, or to avoid collisions with existing
variables. You can use the -column_array
and -column_set
switches to db_foreach
, db_0or1row
, and db_1row
to instruct the database routines
to place the results in a Tcl array or ns_set
, respectively, where the keys are
the column names and the values are the column values. For
example:
db_foreach users_select "select first_names, last_name from users" -column_set columns { # Now $columns is an ns_set. doc_body_append "<li>" for { set i 0 } { $i < [ns_set size $columns] } { incr i } { doc_body_append "[ns_set key $columns $i] is [ns_set value $columns $i]. \n" } }
will write something like:
first_names is Jon. last_name is Salz.
first_names is Lars. last_name is Pind.
first_names is Michael. last_name is Yoon.
Note that you never have to use ns_db
anymore (including ns_db gethandle
)! Just start doing stuff,
and (if you want) call db_release_unused_handles
when you're
done as a hint to release the database handle.
db_null
-
db_null
Returns a value which can be used in a bind variable to represent the SQL value
null
. See Nulls and Bind Variables above. db_foreach
-
db_foreach statement-name sql [ -bind bind_set_id | -bind bind_value_list ] \ [ -column_array array_name | -column_set set_name ] \ code_block [ if_no_rows if_no_rows_block ]
Performs the SQL query
sql
, executingcode_block
once for each row with variables set to column values (or a set or array populated if-column_array
orcolumn_set
is specified). If the query returns no rows, executesif_no_rows_block
(if provided).Example:
db_foreach select_foo "select foo, bar from greeble" { doc_body_append "<li>foo=$foo; bar=$bar\n" } if_no_rows { doc_body_append "<li>There are no greebles in the database.\n" }
The code block may contain
break
statements (which terminate the loop and flush the database handle) andcontinue
statements (which continue to the next row of the loop). db_1row
-
db_1row statement-name sql [ -bind bind_set_id | -bind bind_value_list ] \ [ -column_array array_name | -column_set set_name ]
Performs the SQL query
sql
, setting variables to column values. Raises an error if the query does not return exactly 1 row.Example:
db_1row select_foo "select foo, bar from greeble where greeble_id = $greeble_id" # Bombs if there's no such greeble! # Now $foo and $bar are set.
db_0or1row
-
db_0or1row statement-name sql [ -bind bind_set_id | -bind bind_value_list ] \ [ -column_array array_name | -column_set set_name ]
Performs the SQL query
sql
. If a row is returned, sets variables to column values and returns 1. If no rows are returned, returns 0. If more than one row is returned, throws an error. db_string
-
db_string statement-name sql [ -default default ] [ -bind bind_set_id | -bind bind_value_list ]
Returns the first column of the result of SQL query
sql
. Ifsql
doesn't return a row, returnsdefault
(or throws an error ifdefault
is unspecified). Analogous todatabase_to_tcl_string
anddatabase_to_tcl_string_or_null
. db_nextval
-
db_nextval sequence-name
Returns the next value for the sequence sequence-name (using a SQL statement like
SELECT
sequence-name
.nextval FROM DUAL
). If sequence pooling is enabled for the sequence, transparently uses a value from the pool if available to save a round-trip to the database. db_list
-
db_list statement-name sql [ -bind bind_set_id | -bind bind_value_list ]
Returns a Tcl list of the values in the first column of the result of SQL query
sql
. Ifsql
doesn't return any rows, returns an empty list. Analogous todatabase_to_tcl_list
. db_list_of_lists
-
db_list_of_lists statement-name sql [ -bind bind_set_id | -bind bind_value_list ]
Returns a Tcl list, each element of which is a list of all column values in a row of the result of SQL query
sql
. Ifsql
doesn't return any rows, returns an empty list. (Analogous todatabase_to_tcl_list_list
.) db_list_of_ns_sets
-
db_list_of_ns_sets statement-name sql [ -bind bind_set_id | -bind bind_value_list ]
Returns a list of ns_sets with the values of each column of each row returned by the
sql
query specified. db_dml
-
db_dml statement-name sql \ [ -bind bind_set_id | -bind bind_value_list ] \ [ -blobs blob_list | -clobs clob_list | -blob_files blob_file_list | -clob_files clob_file_list ]
Performs the DML or DDL statement
sql
.If a length-n list of blobs or clobs is provided, then the SQL should return n blobs or clobs into the bind variables
:1
,:2
, ... :n
.blobs
orclobs
, if specified, should be a list of individual BLOBs or CLOBs to insert;blob_files
orclob_files
, if specified, should be a list of paths to files containing the data to insert. Only one of-blobs
,-clobs
,-blob_files
, and-clob_files
may be provided.Example:
db_dml insert_photos { insert photos(photo_id, image, thumbnail_image) values(photo_id_seq.nextval, empty_blob(), empty_blob()) returning image, thumbnail_image into :1, :2 } -blob_files [list "/var/tmp/the_photo" "/var/tmp/the_thumbnail"]
This inserts a new row into the
photos
table, with the contents of the files/var/tmp/the_photo
and/var/tmp/the_thumbnail
in theimage
andthumbnail
columns, respectively. -
db_write_clob
,db_write_blob
,db_blob_get_file
-
db_write_clob statement-name sql [ -bind bind_set_id | -bind bind_value_list ] db_write_blob statement-name sql [ -bind bind_set_id | -bind bind_value_list ] db_blob_get_file statement-name sql [ -bind bind_set_id | -bind bind_value_list ]
Analogous to
ns_ora write_clob/write_blob/blob_get_file
. db_release_unused_handles
-
db_release_unused_handles
Releases any allocated, unused database handles.
db_transaction
-
db_transaction code_block [ on_error { code_block } ]
Executes
code_block
transactionally. Nested transactions are supported (end transaction
is transparentlyns_db dml
'ed when the outermost transaction completes). Thedb_abort_transaction
command can be used to abort all levels of transactions. It is possible to specify an optionalon_error
code block that will be executed if some code in code_block throws an exception. The variableerrmsg
will be bound in that scope. If there is noon_error
code, any errors will be propagated.Example:
proc replace_the_foo { col } { db_transaction { db_dml delete {delete from foo} db_dml insert {insert into foo(col) values($col)} } } proc print_the_foo {} { doc_body_append "foo is [db_string "select col from foo"]<br>\n" } replace_the_foo 8 print_the_foo ; # Writes out "foo is 8" db_transaction { replace_the_foo 14 print_the_foo ; # Writes out "foo is 14" db_dml insert_foo {insert into some_other_table(col) values(999)} ... db_abort_transaction } on_error { doc_body_append "Error in transaction: $errmsg" } print_the_foo ; # Writes out "foo is 8"
db_abort_transaction
-
db_abort_transaction
Aborts all levels of a transaction. That is if this is called within several nested transactions, all of them are terminated. Use this instead of
db_dml "abort" "abort transaction"
. db_multirow
-
db_multirow [ -local ] [ -append ] [ -extend column_list ] \ var-name statement-name sql \ [ -bind bind_set_id | -bind bind_value_list ] \ code_block [ if_no_rows if_no_rows_block ]
Performs the SQL query
sql
, saving results in variables of the formvar_name:1
,var_name:2
, etc, settingvar_name:rowcount
to the total number of rows, and settingvar_name:columns
to a list of column names.Each row also has a column, rownum, automatically added and set to the row number, starting with 1. Note that this will override any column in the SQL statement named 'rownum', also if you're using the Oracle rownum pseudo-column.
If the
-local
is passed, the variables defined by db_multirow will be set locally (useful if you're compiling dynamic templates in a function or similar situations).You may supply a code block, which will be executed for each row in the loop. This is very useful if you need to make computations that are better done in Tcl than in SQL, for example using ns_urlencode or ns_quotehtml, etc. When the Tcl code is executed, all the columns from the SQL query will be set as local variables in that code. Any changes made to these local variables will be copied back into the multirow.
You may also add additional, computed columns to the multirow, using the
-extend { col_1col_2 ... }
switch. This is useful for things like constructing a URL for the object retrieved by the query.If you're constructing your multirow through multiple queries with the same set of columns, but with different rows, you can use the
-append
switch. This causes the rows returned by this query to be appended to the rows already in the multirow, instead of starting a clean multirow, as is the normal behavior. The columns must match the columns in the original multirow, or an error will be thrown.Your code block may call
continue
in order to skip a row and not include it in the multirow. Or you can callbreak
to skip this row and quit looping.Notice the nonstandard numbering (everything else in Tcl starts at 0); the reason is that the graphics designer, a non-programmer, may wish to work with row numbers.
Example:
db_multirow -extend { user_url } users users_query { select user_id first_names, last_name, email from cc_users } { set user_url [acs_community_member_url -user_id $user_id] }
db_resultrows
-
db_resultrows
Returns the number of rows affected or returned by the previous statement.
db_with_handle
-
db_with_handle var code_block
Places a database handle into the variable
var
and executescode_block
. This is useful when you don't want to have to use the new API (db_foreach
,db_1row
, etc.), but need to use database handles explicitly.Example:
proc lookup_the_foo { foo } { db_with_handle db { return [db_string unused "select ..."] } } db_with_handle db { # Now there's a database handle in $db. set selection [ns_db select $db "select foo from bar"] while { [ns_db getrow $db $selection] } { set_variables_after_query lookup_the_foo $foo } }
db_name
-
db_name
Returns the name of the database, as returned by the driver.
db_type
-
db_type
Returns the RDBMS type (i.e. oracle, postgresql) this OpenACS installation is using. The nsv ad_database_type is set up during the bootstrap process.
db_compatible_rdbms_p
-
db_compatible_rdbms_p db_type
Returns 1 if the given db_type is compatible with the current RDBMS.
db_package_supports_rdbms_p
-
db_package_supports_rdbms_p db_type_list
Returns 1 if db_type_list contains the current RDMBS type. A package intended to run with a given RDBMS must note this in it's package info file regardless of whether or not it actually uses the database.
db_legacy_package_p
-
db_legacy_package_p db_type_list
Returns 1 if the package is a legacy package. We can only tell for certain if it explicitly supports Oracle 8.1.6 rather than the OpenACS more general oracle.
db_version
-
db_version
Returns the RDBMS version (i.e. 8.1.6 is a recent Oracle version; 7.1 a recent PostgreSQL version.
db_current_rdbms
-
db_current_rdbms
Returns the current rdbms type and version.
db_known_database_types
-
db_known_database_types
Returns a list of three-element lists describing the database engines known to OpenACS. Each sublist contains the internal database name (used in file paths, etc), the driver name, and a "pretty name" to be used in selection forms displayed to the user.
The nsv containing the list is initialized by the bootstrap script and should never be referenced directly by user code. Returns the current rdbms type and version.