By Pete Su and Jon Salz. Modified by Roberto Mello.
One of OpenACS's great strengths is that code written for it is very close to the database. It is very easy to interact with the database from anywhere within OpenACS, and we have a coherent API for database access which makes this even easier.
More detailed information about the DB API is available at Database Access API.
The OpenACS database API is meant to save developers from making common mistakes and to provide a more structured syntax for specifying database operations, including transactions. Here's an example of the API.
set count 0 set tcl_var "foo" set sql { SELECT foo, bar, baz FROM some_table, some_other_table WHERE some_table.id = some_other_table.id and some_table.condition_p = :tcl_var } db_transaction { db_foreach my_example_query_name $sql { lappend rows [list $foo $bar $baz] incr count } foreach row $rows { call_some_proc $foo $bar $baz } }
There are several things to note here:
No explicit code for grabbing and releasing handles. Usage of the Database API implicitly deals with all handle management issues.
The
db_transaction
command makes the scope of a transaction clear;db_transaction
takes the code block argument and automatically runs it in the context of a transaction. If you use something like db_foreach though, you need to make sure that there are no calls in the code block which would take a second db handle since the transaction is only valid for one handle (that's why we build up a list of returned values and call a second proc outside the db_foreach loop).The command
db_foreach
writes our old while loop for us.Every SQL query has a name, which is used in conjunction with .XQL files to support multiple databases.
Finally and most importantly, there API implements bind variables, which we will cover next.
Bind variables are placeholders for literal values in an SQL query being sent to the server. In the old way, data was generally passed to directly to the DB backend, via Tcl string interpolation. In the example above, the query would look like:
select foo, bar, baz from some_table, some_other_table where some_table.id=some_other_table.id and some_table.condition_p = '$foo'
There are a few problems with this:
If the value of $foo is a huge string, then we waste a lot of time in the database server doing useless parsing.
Second, if the literal value contains characters like single quotes, we have to be careful to properly escape them, because not quoting them will lead to surprising errors.
-
Third, no type checking occurs on the literal value. Finally, if the Tcl variable is passed in or between web forms or otherwise subject to external modification, there is nothing keeping malicious users from setting the Tcl variable to some string that changes the query textually. This type of attack, called SQL smuggling, can be very damaging - entire tables can be exposed or have their contents deleted, for example.
Another very important reason for using bind variables is performance. Oracle can cache previously parsed queries. If there are values in the where clause, that is how the query is cached. It also performs bind variable substitution after parsing the SQL statement. This means that SQL statements that use bind variables will always match (assuming all else is the same) while SQL statements that do not use bind variables will not match unless the values in the statement are exactly the same. This will improve the query cache considerably, which can make the server much more efficient.
What the DB API (in conjunction with the database drivers implemented for AOLserver) do is send the SQL statement to the server for parsing, then bind values to the variables and sends those values along separately as a second step. This separate binding step is where the term bind variable comes from.
This split has several advantages. First, type checking happens on the literal. If the column we are comparing against holds numbers, and we send a string, we get a nice error. Second, since string literals are no longer in the query, no extra quoting is required. Third, substitution of bind variables cannot change the actual text of the query, only the literal values in the placeholders. The database API makes bind variables easy to use by hooking them smoothly into the Tcl run time so you simply provide :tclvar and the value of $tclvar is sent to the backend to actually execute the query.
The database API parses the query and pulls out all the bind variable specifications and replaces them with generic placeholders. It then automatically pulls the values of the named Tcl vars out of the run time environment of the script, and passes them to the database.
Note that while this looks like a simple syntactic change, it really is very different from how interpolated text queries work. You use bind variables to replace what would otherwise be a literal value in a query, and Tcl style string interpolation does not happen. So you cannot do something like:
set table "baz" set condition "where foo = bar" db_foreach my_query { select :table from some_table where :condition }
SQL will not allow a literal to occur where we've put the bind variables, so the query is syntactically incorrect. You have to remember that while the bind variable syntax looks similar to variable interpolation in Tcl, It is not the same thing at all.
Finally, the DB API has several different styles for passing bind variable values to queries. In general, use the style presented here because it is the most convenient.
Every db_*
command accepting
a SQL command as an argument supports bind variables. You can
either
Specify the
-bind
switch to provide a set with bind variable values, orSpecify the
-bind
switch to explicitly provide a list of bind variable names and values, orNot specify a bind variable list at all, in which case Tcl variables are used as bind variables.
The default behavior (i.e., if the -bind
switch is omitted) is that these
procedures expect to find local variables that correspond in name
to the referenced bind variables, e.g.:
set user_id 123456 set role "administrator" db_foreach user_group_memberships_by_role { select g.group_id, g.group_name from user_groups g, user_group_map map where g.group_id = map.user_id and map.user_id = :user_id and map.role = :role } { # do something for each group of which user 123456 is in the role # of "administrator" }
The value of the local Tcl variable user_id
(123456) is bound to the
user_id
bind variable.
The -bind
switch can takes
the name of an ns_set
containing keys for each bind variable named in the query,
e.g.:
set bind_vars [ns_set create] ns_set put $bind_vars user_id 123456 ns_set put $bind_vars role "administrator" db_foreach user_group_memberships_by_role { select g.group_id, g.group_name from user_groups g, user_group_map map where g.group_id = map.user_id and map.user_id = :user_id and map.role = :role } -bind $bind_vars { # do something for each group in which user 123456 has the role # of "administrator" }
Alternatively, as an argument to -bind
you can specify a list of alternating
name/value pairs for bind variables:
db_foreach user_group_memberships_by_role { select g.group_id, g.group_name from user_groups g, user_group_map map where g.group_id = map.user_id and map.user_id = :user_id and map.role = :role } -bind [list user_id 123456 role "administrator"] { # do something for each group in which user 123456 has the role # of "administrator" }
When processing a DML statement, Oracle coerces empty strings
into null
. (This coercion does
not occur in the
WHERE
clause of a query, i.e.
col = ''
and
col is null
are not
equivalent.)
As a result, when using bind variables, the only way to make
Oracle set a column value to null
is to set the corresponding bind
variable to the empty string, since a bind variable whose value is
the string "null" will be interpreted as the literal
string "null".
These Oracle quirks complicate the process of writing clear and abstract DML difficult. Here is an example that illustrates why:
# # Given the table: # # create table foo ( # bar integer, # baz varchar(10) # ); # set bar "" set baz "" db_dml foo_create "insert into foo(bar, baz) values(:bar, :baz)" # # the values of the "bar" and "baz" columns in the new row are both # null, because Oracle has coerced the empty string (even for the # numeric column "bar") into null in both cases
The database library can transparently maintain pools of
sequence values, so that each request for a new sequence value
(using db_nextval
) does not
incur a roundtrip to the server. For instance, this functionality
is very useful in the security/sessions library, which very
frequently allocates values from the sec_id_seq
sequence. To utilize this
functionality for a particular sequence, register the sequence to
be pooled, either using the db_register_pooled_sequence
procedure at
server startup time, or by including a configuration parameter of
the form
PoolSequence.sequence_name_seq=count
in any configuration
section in the yourservername.ini
file, e.g.,
[ns/server/yourservername/acs/security]
PoolSequence.sec_id_seq=20
The database library will allocate this number of sequence
values at server startup. It will periodically scan pools and
allocate new values for sequences which are less than half-full.
(This normally occurs every 60 seconds, and is configurable via the
PooledSequenceUpdateInterval
parameter in the [ns/server/
yourservername
/acs/database]
configuration section.)
The Database API has several functions that wrap familiar parts of the AOLserver database API.
Note that you never have to use ns_db
anymore (including ns_db gethandle
)! Just start doing stuff,
and (if you want) call db_release_unused_handles
when you're
done as a hint to release the database handle.
db_abort_transaction
-
db_abort_transaction
Aborts all levels of a transaction. That is if this is called within several nested transactions, all of them are terminated. Use this instead of
db_dml "abort" "abort transaction"
. db_multirow
-
db_multirow [ -local ] [ -append ] [ -extend column_list ] \ var-namestatement-namesql \ [ -bind bind_set_id | -bind bind_value_list ] \ code_block [ if_no_rows if_no_rows_block ]
Performs the SQL query
sql
, saving results in variables of the formvar_name:1
,var_name:2
, etc, settingvar_name:rowcount
to the total number of rows, and settingvar_name:columns
to a list of column names.Each row also has a column, rownum, automatically added and set to the row number, starting with 1. Note that this will override any column in the SQL statement named 'rownum', also if you're using the Oracle rownum pseudo-column.
If the
-local
is passed, the variables defined by db_multirow will be set locally (useful if you're compiling dynamic templates in a function or similar situations).You may supply a code block, which will be executed for each row in the loop. This is very useful if you need to make computations that are better done in Tcl than in SQL, for example using ns_urlencode or ns_quotehtml, etc. When the Tcl code is executed, all the columns from the SQL query will be set as local variables in that code. Any changes made to these local variables will be copied back into the multirow.
You may also add additional, computed columns to the multirow, using the
-extend { col_1col_2 ... }
switch. This is useful for things like constructing a URL for the object retrieved by the query.If you're constructing your multirow through multiple queries with the same set of columns, but with different rows, you can use the
-append
switch. This causes the rows returned by this query to be appended to the rows already in the multirow, instead of starting a clean multirow, as is the normal behavior. The columns must match the columns in the original multirow, or an error will be thrown.Your code block may call
continue
in order to skip a row and not include it in the multirow. Or you can callbreak
to skip this row and quit looping.Notice the nonstandard numbering (everything else in Tcl starts at 0); the reason is that the graphics designer, a non-programmer, may wish to work with row numbers.
Example:
db_multirow -extend { user_url } users users_query { select user_id first_names, last_name, email from cc_users } { set user_url [acs_community_member_url -user_id $user_id] }
You can also iterate over a multirow after it has been created - check the documentation for template::multirow
For example,
db_multirow assets assets { select asset_id, from ... } .. set asset_id_l [list] multirow foreach assets { lappend asset_id_l $asset_id }
Technically it's equivalent to using a code block on the end of your db_multirow.
db_foreach
-
db_foreach statement-name sql [ -bind bind_set_id | -bind bind_value_list ] \ [ -column_array array_name | -column_set set_name ] \ code_block [ if_no_rows if_no_rows_block ]
Performs the SQL query
sql
, executingcode_block
once for each row with variables set to column values (or a set or array populated if-column_array
orcolumn_set
is specified). If the query returns no rows, executesif_no_rows_block
(if provided).Example:
db_foreach select_foo "select foo, bar from greeble" { doc_body_append "<li>foo=$foo; bar=$bar\n" } if_no_rows { doc_body_append "<li>There are no greebles in the database.\n" }
The code block may contain
break
statements (which terminate the loop and flush the database handle) andcontinue
statements (which continue to the next row of the loop). db_1row
-
db_1row statement-namesql [ -bind bind_set_id | -bind bind_value_list ] \ [ -column_array array_name | -column_set set_name ]
Performs the SQL query
sql
, setting variables to column values. Raises an error if the query does not return exactly 1 row.Example:
db_1row select_foo "select foo, bar from greeble where greeble_id = $greeble_id" # Bombs if there's no such greeble! # Now $foo and $bar are set.
db_0or1row
-
db_0or1row statement-namesql [ -bind bind_set_id | -bind bind_value_list ] \ [ -column_array array_name | -column_set set_name ]
Performs the SQL query
sql
. If a row is returned, sets variables to column values and returns 1. If no rows are returned, returns 0. If more than one row is returned, throws an error. db_nextval
-
db_nextval sequence-name
Returns the next value for the sequence sequence-name (using a SQL statement like
SELECT
sequence-name
.nextval FROM DUAL
). If sequence pooling is enabled for the sequence, transparently uses a value from the pool if available to save a round-trip to the database (see Sequence Pooling). db_register_pooled_sequence
-
db_register_pooled_sequence sequence-namepool-size
Registers the sequence sequence-name to be pooled, with a pool size of pool-size sequence values (see Sequence Pooling).
db_string
-
db_string statement-namesql [ -default default ] [ -bind bind_set_id | -bind bind_value_list ]
Returns the first column of the result of SQL query
sql
. Ifsql
doesn't return a row, returnsdefault
(or throws an error ifdefault
is unspecified). Analogous todatabase_to_tcl_string
anddatabase_to_tcl_string_or_null
. db_list
-
db_list statement-namesql [ -bind bind_set_id | -bind bind_value_list ]
Returns a Tcl list of the values in the first column of the result of SQL query
sql
. Ifsql
doesn't return any rows, returns an empty list. Analogous todatabase_to_tcl_list
. db_list_of_lists
-
db_list_of_lists statement-namesql [ -bind bind_set_id | -bind bind_value_list ]
Returns a Tcl list, each element of which is a list of all column values in a row of the result of SQL query
sql
. Ifsql
doesn't return any rows, returns an empty list. (Analogous todatabase_to_tcl_list_list
.) db_dml
-
db_dml statement-namesql \ [ -bind bind_set_id | -bind bind_value_list ] \ [ -blobs blob_list | -clobs clob_list | -blob_files blob_file_list | -clob_files clob_file_list ]
Performs the DML or DDL statement
sql
.If a length-n list of blobs or clobs is provided, then the SQL should return n blobs or clobs into the bind variables
:1
,:2
, ... :n
.blobs
orclobs
, if specified, should be a list of individual BLOBs or CLOBs to insert;blob_files
orclob_files
, if specified, should be a list of paths to files containing the data to insert. Only one of-blobs
,-clobs
,-blob_files
, and-clob_files
may be provided.Example:
db_dml insert_photos { insert photos(photo_id, image, thumbnail_image) values(photo_id_seq.nextval, empty_blob(), empty_blob()) returning image, thumbnail_image into :1, :2 } -blob_files [list "/var/tmp/the_photo" "/var/tmp/the_thumbnail"]
This inserts a new row into the
photos
table, with the contents of the files/var/tmp/the_photo
and/var/tmp/the_thumbnail
in theimage
andthumbnail
columns, respectively. -
db_write_clob
,db_write_blob
,db_blob_get_file
-
db_write_clob statement-namesql [ -bind bind_set_id | -bind bind_value_list ] db_write_blob statement-namesql [ -bind bind_set_id | -bind bind_value_list ] db_blob_get_file statement-namesql [ -bind bind_set_id | -bind bind_value_list ]
Analogous to
ns_ora write_clob/write_blob/blob_get_file
. db_release_unused_handles
-
db_release_unused_handles
Releases any allocated, unused database handles.
db_transaction
-
db_transaction code_block [ on_error { code_block } ]
Executes
code_block
transactionally. Nested transactions are supported (end transaction
is transparentlyns_db dml
'ed when the outermost transaction completes). Thedb_abort_transaction
command can be used to abort all levels of transactions. It is possible to specify an optionalon_error
code block that will be executed if some code in code_block throws an exception. The variableerrmsg
will be bound in that scope. If there is noon_error
code, any errors will be propagated.Example:
proc replace_the_foo { col } { db_transaction { db_dml delete {delete from foo} db_dml insert {insert into foo(col) values(:col)} } } proc print_the_foo {} { doc_body_append "foo is [db_string "select col from foo"]<br>\n" } replace_the_foo 8 print_the_foo ; # Writes out "foo is 8" db_transaction { replace_the_foo 14 print_the_foo ; # Writes out "foo is 14" db_dml insert_foo {insert into some_other_table(col) values(999)} ... db_abort_transaction } on_error { doc_body_append "Error in transaction: $errmsg" } print_the_foo ; # Writes out "foo is 8"
db_resultrows
-
db_resultrows
Returns the number of rows affected or returned by the previous statement.
db_with_handle
-
db_with_handle varcode_block
Places a database handle into the variable
var
and executescode_block
. This is useful when you don't want to have to use the new API (db_foreach
,db_1row
, etc.), but need to use database handles explicitly.Example:
proc lookup_the_foo { foo } { db_with_handle db { return [db_string unused "select ..."] } } db_with_handle db { # Now there's a database handle in $db. set selection [ns_db select $db "select foo from bar"] while { [ns_db getrow $db $selection] } { set_variables_after_query lookup_the_foo $foo } }
The database API allows for direct caching of query results. Repeated calls will return the cached value until it is either explicitly flushed using db_flush_cache, times out (configured the ns_cache is called to create the cache), or another cached query fills the cache, causing older entries to be flushed.
Values returned by a query are cached if you pass the "-cache_key" switch to the database procedure. The switch value will be used as the key in the ns_cache eval call used to execute the query and processing code. The db_flush proc should be called to flush the cache when appropriate. The "-cache_pool" parameter can be used to specify the cache pool to be used, and defaults to db_cache_pool. The size of the default cache is governed by the kernel parameter "DBCacheSize" in the "caching" section.
Currently db_string, db_list, db_list_of_lists, db_1row, db_0or1row, and db_multirow support caching.
For caching to be effective, one must carefully design a cache_pool and cache_key strategy that uniquely identifies a query within the system, including the relevant objects being referenced by the query. Typically a cache_key should include one or more object_ids and a name that identifies the operation being done.
Here is an example from the layout-manager package:
# Query to return the elements of a page as a list. The prefix "page_" is used to denote # that this is a page-related query, page_id is used to uniquely identify the query # by object, and the suffix uniquely defines the operation being performed on the # page object. db_list -cache_key page_${page_id}_get_elements get_elements {} # When the contents of a page are changed, we flush all page-related queries for the given # page object using db_flush_cache. db_flush_cache -cache_key_pattern page_${page_id}_*