Forum OpenACS Development: Big honkin' commit now available ...

Collapse
Posted by Don Baccus on
OK, I've made my first round of commits that implement a multi-db
aware APM and includes a merge with Ben's initial Query dispatcher work.

It's avaiable for testing.  Some quick notes:

1. I've not actually ported the APM to PostgreSQL at this point, and
of course only a bit of datamodel porting has been done for other
acs-kernel submodules.

2. We still require Oracle EE but for simple poking around and testing
the APM changes SE will work fine if you don't mind ignoring the
errors due to creating bitmap indices and other EE features.  A
freshly installed system's mostly empty and the missing indices don't
hurt.

3. If you want to install this under Oracle (and I'd appreciate
someone doing so to double-check that I got everything in) check out
the APM.  Some core packages - acs-kernel, for instance - are now
labelled as supporting both PG and Oracle, while others - acs-mail,
for instance - are labelled as Oracle only.  Yet others say "none
specified" (not all of the core modules use the database).

4. Install under PG makes it past acs-kernel and tries to load other
core modules.  The APM doesn't yet give an error when you try to load
an Oracle-only module in a PG environment, but it will soon - no need
to report this to me!

5. The APM still has other rough edges, I'll be working on it next
week.  We're OK for the moment now, I think, the needed clean-ups
aren't critical-path issues.

I'll post information about what's up next separately so folks can
comment if they want.

Collapse
Posted by Tom Jackson on

I know everyone is busy with this great effort. Since I have EE installed, I'd love to volunteer to test it. Where is the source? Are there any quick and dirty instructions on installing and testing? I am very familiar with ACS install, and I'd like to commit to maintaining simple install instructions if possible.

I have PG on the same machine. When PG is enabled, which exact version is required? Seems like everytime I move my OpenACS3.x db around I have to run the pgplsql script to install pl. Is this going to be required as well.

Sorry if I missed the instructions somewhere else.

Collapse
Posted by Don Baccus on
The source can be gotten from cvs at openacs.org:/cvsroot.  Copy (co)
openacs-4.  Anonymous pserv is supposd to work...

The installation for Oracle is no different than for ACS Classic 4.x.
I'd appreciate your trying it out under EE to make sure there are no
errors (I'll grab a copy of EE at some point to install I guess).

For PostgreSQL, you need to use version 7.1, which is in release
candidate form at the moment.  You can get it from ftp.postgresql.org
in pub/dev - the biggest files are the complete system for each
release (RC2 is the latest there).

You do need to install PL/pgSQL.  It can be done easily after you do
your createdb, i.e.:

createdb pgtest;
createlang plpgsql pgtest

and you're set.

Then you need to tell OpenACS where you've installed the binaries so
it can find PSQL (I'll make it smart enough to do this from
environment variables if they're set later on):

ns_section "ns/db/driver/postgres"
ns_param pg_bin "/home/pgtest/install/bin"

works for me as I installed PG7.1 as my pgtest user using
/home/pgtest/install rather than /usr/local/pgsql.

After you've done this just fire things up like you would for Oracle
4.x and off you go.  The only differences will be in your AOLserver
config file - you need three PG pools rather than three Oracle pools.
Use the same pageroot in both cases.

Only the core is included in our source tree at the moment.  If you
want to load other modules you'll have to go to the aD repository
(unless you have them yourself, of course).

Oh - you'll get lots and lots of "notice" lines in your AOLserver log,
from Ben's very preliminary version of the query processor.  They'll
go away soon.

Collapse
Posted by Edmund Lian on
Way to go!
Collapse
Posted by Vinod Kurup on
I ran into a problem when loading this under PG (7.1rc1). The datamodel loads, but acs-kernel fails :
Failed to install ACS Kernel, version 4.2. The following error was generated: 

     Database operation "0or1row" failed (exception NSDB, "Query was not a statement returning rows.")

Package enabled. 
The error log shows:
[05/Apr/2001:18:45:12][8255.25605][-conn4-] Error: dbinit: error(localhost::openacs-4,ERROR: ExecAppend: rejected due to CHECK constraint apm_package_vers_desc_for_ck ): ' select apm_package_version__new( '', 'acs-kernel', '4.2', 'http://www.arsdigita.com/acs-repository/download/apm/acs-kernel-4.2.apm', 'Routines and data models providing the foundation for ACS-based Web services.', '', '', '2001-03-06', 'ArsDigita Corporation', 'http://www.arsdigita.com/', 't', 't'); '
There's a check constraint on apm_package_versions that the description_format should be either "text/html" or "text/plain", but we're trying to set it to '' (the fifth argument). I think this is the old '' vs NULL issue, because if I enter the above command with NULL's instead of empty strings, it works. If I fix this issue, the same issue comes up later in apm_package_version__add_file (can't enter '' for db_type, because of foreign key violation). Am I missing something?

I then tried to load via Oracle and everything works beautifully!!! (8.1.6 EE). Really, REALLY cool!

Collapse
Posted by Don Baccus on
Oops - my fault.  I forgot that we changed the driver again and you need to download the latest from CVS at sourceforge, recompile it for PG 7.1 and take off from there.

We decided that we would change empty strings to NULL in the driver FOR BIND VARIABLE EMULATION ONLY.  This helps a lot.  Since the bind variable hack isn't part of SQL but purely a database API thing, it's not as awful a hack as it first appear (awful enough, though).  Without this hack you would have no way to insert NULL at all using bind variable emulation, clearly wrong (because the driver quotes all non-empty strings, which would cause NULL to become the literal string 'NULL', not all that useful).

We need to get our documentation act together...

Anyway, I do believe this will help you get to the "official" point of failure with PG 7.1.

Collapse
Posted by Tom Jackson on

My installation into Oracle 8.1.6EE went fine with a few exceptions.

Here is my mini-list of the steps involved.

This seemed to allow AOLserver to startup and stay running. I connected to the pageroot and was able to install OpenACS4 into Oracle 8.1.6EE.

Collapse
Posted by Tom Jackson on

I was able to install the data model into pg up to the "Completing Install" message, right after the skins package. I'm not sure if this is the official failure point, but here is how I got to that point.

  • Download and install the most recent PostgreSQL Release Candidate (today: RC3) from ftp://ftp.postgresql.org/pub/dev/.
  • Refer to the steps in the previous post on downloading openacs-4 and installing the nsxml module.
  • Download the pgdriver from SourceForge:
    cvs -d:pserver:anonymous@cvs.acs-pg.sourceforge.net:/cvsroot/acs-pg login
    (press enter at the password prompt)
    cvs -z3 -d:pserver:anonymous@cvs.acs-pg.sourceforge.net:/cvsroot/acs-pg co driver-2.3
    
  • Edit the acs/aolserver configuration file to include the parameter for the psql binary file. This was mis-stated in a recent bboard posting. Here is what I used:
    [ns/db/driver/postgres]
    datestyle=iso
    pgbin=/usr/local/pgsql/bin/
    

    So pgbin is the full path to your just installed psql binary. Remember to end the path with a slash.

  • Startup your aolserver and visit the pageroot.
Collapse
Posted by Don Baccus on
Yeah, that sounds like the right place.  Ideally you'd get an error earlier in the process saying "hey! acs-whatever only supports Oracle!" but that error check's not there yet.

Instead if it can't find a datamodel it just keeps chuggin' along at the moment.

The pgbin param shouldn't be so picky, I'll try to look at that and also the default to use PG environment variables by default if they're  available next week...

Collapse
Posted by Edmund Lian on
Hmmm... I get to http://www.mydomain.com:8080/install-data-model? and then get an ACS installation error:
Database operation "0or1row" failed (exception NSDB, "Query was not a statement returning rows.")
    while executing
"ns_pg_bind 0or1row nsdb0 {
    select case when count(*) = 0 then 0 else 1 end from apm_package_db_types}"
    ("uplevel" body line 1)
    invoked from within
"uplevel 2 [list ns_pg_bind $type $db $sql] $args"
    invoked from within
"db_exec 0or1row $db $full_name $sql"
    invoked from within
"set selection [db_exec 0or1row $db $full_name $sql]"
    ("uplevel" body line 2)
    invoked from within
"uplevel 1 $code_block "
    invoked from within
"db_with_handle db {
	set selection [db_exec 0or1row $db $full_name $sql]
    }"
    (procedure "db_string" line 7)
    invoked from within
"db_string db_types_exists "
    select case when count(*) = 0 then 0 else 1 end from apm_package_db_types""
    invoked from within
"set apm_db_types_exists [db_string db_types_exists "
    select case when count(*) = 0 then 0 else 1 end from apm_package_db_types"]"
    (file "/web/openacs-4/packages/acs-tcl/installer/install-data-model.tcl" line 40)
    invoked from within
"source $path"
Collapse
Posted by Don Baccus on
Database operation "0or1row" failed (exception NSDB, "Query was not a statement returning rows.")
         while executing
     "ns_pg_bind 0or1row nsdb0 {
         select case when count(*) = 0 then 0 else 1 end from apm_package_db_types}"
Well, this is weird. Are you sure you have the latest driver?

On the surface, the message is particulary weird because of course it returned no rows - and it is a 0or1row query so shouldn't fail!

Any more information you can provide, the better. I hate to say this, but my first response when this kind of thing happens on my machine is to seek a virgin environment, even if it means deleting a lot of stuff...

Collapse
Posted by Edmund Lian on
Yes, I'm using the 2.3 driver from Sourceforge. This is running on AOLserver 3.2+ad12, on a Debian potato system. PostgreSQL is 7.1RC3 On visually scanning the server logs, I see:
[09/Apr/2001:01:42:20][15832.4101][-conn0-] Notice: Querying '
    select case when count(*) = 0 then 0 else 1 end from apm_package_db_types;'
[09/Apr/2001:01:42:20][15832.4101][-conn0-] Error: Ns_PgExec: result status: 7 message: ERROR:  Relation 'apm_packag
e_db_types' does not exist

[09/Apr/2001:01:42:20][15832.4101][-conn0-] Error: dbinit: error(localhost::pgtest,ERROR:  Relation 'apm_package_db_
types' does not exist
): '
    select case when count(*) = 0 then 0 else 1 end from apm_package_db_types'
On checking for the apm_package_db_types table, I see it does not exist. I'll blow away the test DB tomorrow and see what happens. It's a bit late to track anything down (1:52 am) BTW, I noticed the following message in the logs:
[09/Apr/2001:01:42:16][15832.1024][-main-] Notice: Multiple definition of util_memoize in /web/openacs-4/packages/ac
s-tcl/tcl/memoize-procs.tcl and /web/openacs-4/packages/acs-tcl/bootstrap/bootstrap.tcl
We should blow away one of the definitions of util_memoize...
Collapse
Posted by Don Baccus on
Your datamodel didn't get loaded, that's the problem.  The code doing the select thinks that an earlier install attempt got further along than it really did.  It thinks the acs-kernel datamodel installed without error and is checking to see if any packages are enabled in the APM.

Could be a weak point in the bootstrap/installer's checks to see if the kernel datamodel's been installed.  aD wrote in some sanity checking but not a lot (nor should it need a lot).

The acs-kernel datamodel will load on my laptop ... let me know what happens when you blow away the database and try again.

Collapse
Posted by Edmund Lian on
<blockquote>>Your datamodel didn't get loaded, that's the problem.<<
</blockquote>

Yup, you're right. But in fact, nothing seems to have loaded. I tried blowing away the DB and starting again, but I get the same problem.

The server logs don't contain anything I would recognize as an error since I'm not familiar with the query dispatcher. Should I email you my logs?

Collapse
Posted by Don Baccus on
Are you sure you've got the right path to PSQL specified in your
.tcl init file that's read by OpenNSD on startup?  Are you sure that the command will execute if you run it as the same user you're running OpenNSD as?

I ask because at one point I had silent non-loading of the datamodel due to filtering out all but "ERROR" lines from running PSQL in db_source_sql_file (or whatever it is called).  Turns out that was not  a very good idea given the fact that PSQL will also return FATAL error lines :)

I filter these out so you don't get a bazillian NOTICE and messages along with other PSQL stuff dumped  on your screen as error messages ...

Come to think of it, I'm not certain I've committed that change as I've been saving some APM changes to submit as one fell swoop.  I'll look into this later today ... for now you can look at 10-database-procs-postgresql.tcl and strip out the filtering of PSQL output and you'll see everything PSQL emits.

Collapse
Posted by Edmund Lian on
<blockquote>>Are you sure you've got the right path to PSQL specified in your .tcl init file that's read by OpenNSD on startup?<<
</blockquote>

Ahhh!!! This is sort of the problem... Thank you!!! It turns out that the debs for PG do not modify the systemwide /etc/profile scripts to set the PG variables for users by default.

Been having some correspondence with Elphick about this. It's deliberate since he feels it is not appropriate to force this on all users. All PG users on Debian systems will have to source /etc/postgresql/postgresql.env

As an aside for other Debian users, the identd daemon must be installed on a Debian system to allow defined PG users (including the postgres superuser) access to the db without password authentication.

The kernel data model loaded, but PL/pgSQL doesn't seem to have be installed even though it should have been. Will post an update for others once I figure it out.

Collapse
Posted by Don Baccus on
You need to do a "createlang plpgsql database_foo" after you do "createdb".  It's never added by default ...
Collapse
Posted by Edmund Lian on
Thanks... the kernel data model and packages loaded! Things failed at http://www.mydomain.com/create-administrator with the error message:
Database operation "0or1row" failed (exception NSDB, "Query was not a statement returning rows.")
    while executing
"ns_pg_bind 0or1row nsdb0 {select email from cc_users where rownum = 1}"
    ("uplevel" body line 1)
    invoked from within
"uplevel 2 [list ns_pg_bind $type $db $sql] $args"
    invoked from within
"db_exec 0or1row $db $full_statement_name $sql"
    invoked from within
"set selection [db_exec 0or1row $db $full_statement_name $sql]"
    ("uplevel" body line 2)
    invoked from within
"uplevel 1 $code_block "
    invoked from within
"db_with_handle db {
	set selection [db_exec 0or1row $db $full_statement_name $sql]
    }"
    (procedure "db_0or1row" line 22)
    invoked from within
"db_0or1row user_exists "select email from cc_users where rownum = 1""
    (file "/web/openacs-4/packages/acs-tcl/installer/create-administrator.tcl" line 1)
Is this the official failure point now? 😊
Collapse
Posted by Dan Wickstrom on
That's it.