Forum OpenACS Development: driver support for multi-db db_api

I just commited the basic multi-db support for the OpenACS db_* API to the head. Read the comments at the top of "packages/acs-tcl/tcl/00-database-procs.tcl" for more info, but basically, it should work exaclty the way it always has, unless you pass the "-dbn" switch to specify a different database name.

So first of all, I don't have any regression tests for the db_* api and I know for a fact that my manual testing didn't hit everything. So perhaps I missed a typo somewhere that will break something - if so please let me know.

More interestingly, the current multi-db implementation isn't really finished. It should currently be completely backwards compatible with the old (non-multi-db) db_* api, however, right now it will only work with databases using the same AOLserver db driver as your default OpenACS database! Which is not super useful... This limitation exists because currently, there are completely separate Oracle and PostgreSQL implementations of db_exec, and db_exec is ultimately called for basically everything.

I think what's really going on here, is that for historical reasons, two different concepts have been confounded in the OpenACS database support code:

  1. The OpenACS code needs to know, "Which database am I using for the OpenACS data model?". This must be exactly one database, currently the choices are either Oracle or PostgreSQL, and this is what drove the entire "database independence" design of OpenACS.

  2. The db_* api code needs to know, "For features not part of the standard ns_db api, how do I call those special features of the AOLserver driver for this database?" Note that this has actually has nothing to do with what database the OpenACS data model is using, as the db_* api is utility code, and does not care at all what data model is in the database it's talking to.

  3. There is also a third uglier mixed case that I didn't think of before. A good example is db_table_exists. This proc has two separate definitions, one for Oracle, one for Postgres, and currently only one version can ever exist in an OpenACS system at a time. However, the difference is not because of the AOLserver db driver, and it's not because of the OpenACS data model either. It's because of the RDBMS system data model. So that's a third case, also counfounded, and I'd appreciate suggestions on how to best approach it... (Fortunately, this issue exists for very few procs, so it could be kludged around adequately even without a nice clean solution.)

So, in order to make the multi-db support work right, we need to un-confound those two issues. My question is, how? I can basically think of two ways:

  1. Merge all the db_* api code together, and use switch statements to distinguish between the different AOLserver db drivers. This is actually pretty clean; when I did it for 4.2, only 2 procs (db_exec and db_resultrows) had a switch statement in them, including the blob procs, etc.

  2. Instead use some kind of fancy Tcl service contract. Perhaps the new stuff Lars just released with Workflow and Bug Tracker would be relevent here, but I don't know, I haven't looked at it.

I favor the switch based approach. It's simple, I've done it before, and presumably most of the work necessary (reviewing and rationalizing the existing independent Oracle and Postgres procs) would be reused anyway if we later want to move to some fancier method. Let me know what you think...

Posted by Peter Alberer on
Hmm, seems when running the arg parser of db_list_of_ns_sets there are problems now:

No value specified for argument statement_name
    while executing
    (procedure "db_list_of_ns_sets" line 1)
    invoked from within
"db_list_of_ns_sets select_notifications {}"
    (procedure "notification::sweep::sweep_notifications" line 7)
    invoked from within
"notification::sweep::sweep_notifications -interval_id 487 -batched_p 0"
    ("eval" body line 1)
    invoked from within
"eval [concat [list $proc] $args]"
    (procedure "ad_run_scheduled_proc" line 43)
    invoked from within
"ad_run_scheduled_proc {t f 60 notification::sweep::sweep_notifications {-interval_id 487 -batched_p 0} 1048243725 0 t}"

Posted by Jeff Davis on
I think I fixed that, can you update and see if it works now.
(the proc had {args ""} but args is special and should not be
Posted by John Sequeira on

You might want to look at how nstcl solves the multiple db problem.  I'm not quite sure that its multi-db support will be directly compatible with yours - I've peeked at it but spent most of my time working on templating - but MC's code is extremely well organized, mixes OO and tcl idioms well, and might serve as a good basis for what you're trying to do.

I confess to suggesting this for self reasons:  this would save me from writing more glue code.

Posted by Andrew Piskorski on
John, when I last looked at the nstcl implementation it's db pool handling was decidedly inferior (in terms of multi-db support I mean). I think it only supported using named AOLserver databae pools (rather than something like my -dbn switch), which doesn't mesh nicely with the rest of the (Open)ACS db_* API at all. But that was a long time ago, probably back in 2001 when I originally made this multi-db stuff work on ACS 4.2, so I'll take a look.

Also, I don't recall at all how nstcl handled the multiple database driver issue. (Which is a bigger issue! Making the -dbn switch type thing work is really somewhat trivial.) So good suggestion, I should definitely take a look at it for at least that reason.

8: nstcl vs. OpenACS db_* api (response to 5)
Posted by Andrew Piskorski on
John, I looked at nstcl a bit, but AFAIK, no ideas of immediate use to the OpenACS multi-db db_* api work there. I did the straightforward thing - switch statements.

One, with the issue of supporting multiple drivers, nstcl is just completely different. Not bad (as far as I saw), just very different.

Two, nstcl still only allows you to specify a pool name to access a different (non-default) database, there is nothing like my -dbn switch. nstcl offers only, "To specify an alternate pool, other than the default, preface the statement_name with the name of the pool, followed by a colon.", e.g. "db_dml oracle:old_customers $sql".

In the OpenACS db_* api, if you call a db_ function while you already have a db handle in use (e.g., inside a db_foreach loop), the api will automagically open a second handle (and a second transaction) from a different database pool, if you have any pools left. (See the use of db_nth_pool_name in db_with_handle.) Whether or not this is a good feature or not is a different question (and one that I'm not entirely sure about), but AFAIK the db_* api has always done it that way; it certainly has since before ACS 4.2, anyway.

For the default database, I'm not really sure whether nstcl supports automagic multiple handles at once like that or not - I strongly suspect it does not (see: "::nstcl::database::api_get_dbhandle"). But nstcl definitely doesn't support it for any other database. I don't know if Cleverly did that on purpose, or by accident. Since there are no notes or comments about it, and his nstcl package seems otherwise well documented, I suspect the latter.

Posted by Michael A. Cleverly on
nstcl does support "automagic multiple handles" at once.

The only reason OpenACS has multiple pools (all to the same database, traditionally being main, subquery, and log) is due to the restriction in AOLserver that you have to allocate as many handles as you want upfront.  Once you have allocated handle(s) you have to release them all before you can get more from the same pool.

I didn't feel any need to impose this restriction with nstcl.  Rather, you can define a "pool" to have multiple connections and freely use them all up.  In OpenACS if you have the standard 3 pools you can't nest more four-levels of db_foreach statements.

In nstcl, for any given pool (default or otherwise) you can have as many db_foreach statements nested as you have defined connections. With nstcl there really is no need to define multiple pools for any reason other than to connect to multiple databases, be they two different Oracle databases (production & development perhaps) or two completely different types of databases (Oracle & Postgres).

In this sense, a single nstcl "pool" encompases the main, subquery, and log pools of AOLserver.  As for -dbn syntax; if that's the route OpenACS takes I'll likely add support to nstcl for it.  Lately I've found myself using nstcl more for work utterly unrelated to (Open)ACS (and I regret I haven't had time to track and participate in OpenACS development), but since I'll be converting a site from ACS 4.2 to OpenACS shortly, it'll be time to catch up on things.

Posted by Andrew Piskorski on
Ah, interesting, thanks for filling us in, Michael!

Hm, so the current workings of nstcl depend on the fact that all the drivers nstcl currently supports do not share AOLserver's "each thread must allocate all handles from a single pool at one time" limitation? So if down the road nstcl decided to support db drivers based on the AOLserver C code, you'd want to add in something like the db_nth_pool_name type stuff that OpenACS has?

On eventually adding the -dbn switch to nstcl, cool! I've yet to receive much of any feedback on it good or bad, but presumably others will gradually start using it. :)

Also, for the OpenACS db_* API, the -dbn switch definitely was the right way to go, but I don't see any harm in also adding your nstcl style "use this pool and no other" syntax, if people ask for it.

Posted by Andrew Piskorski on
I just committed the rest of the multi-db support on the Head. You can now use the db_* API to access both Oracle and PostgreSQL at the same time!

There are still a few little bits of code I want clean up, but none of them should cause any visible changes. And I definitely need to add some more info to the default OpenACS AOLserver config file.

Again, no regression tests, so any further testing and feedback by others would be much appreciated. LOBs were king of annoying to make work, but File Storage at least seems to work. (That is, it seems to work just as well as it did before these changes; it has bugs.)

Posted by Andrew Piskorski on
Oh, when I checked yesterday, Head was pretty hard to test because it didn't seem to work at all. I worked around that by using the very latest acs-tcl and acs-bootstrap-installer packages, and everything else from 2003-03-21. E.g., I did something like:
cd /web/acs47/
cvs up -D 2003-03-21
cd /web/acs47/packages/acs-tcl
cvs up -A
cd /web/acs47/packages/acs-bootstrap-installer
cvs up -A