Forum OpenACS Q&A: Request Error: Server startup failed

Posted by Tom Brown on
AOLserver returns the following error:

Request Error
Server startup failed: Error during bootstrapping

command "ns_db" is not enabled
    while executing
"ns_db pools"
    (procedure "db_bootstrap_set_db_type" line 72)
    invoked from within
"db_bootstrap_set_db_type database_problem"

Here is the first warning from freegeek-error.log.

Warning: modload: failed to load '/usr/local/aolserver/bin/nspostgres.so': 'libpq.so.2: cannot open shared object file: No such file or directory'
[01/Jul/2003:04:00:57][792.1024][-main-] Error: dbdrv: failed to load driver 'postgres'
[01/Jul/2003:04:00:57][792.1024][-main-] Error: dbinit: no such default pool 'pool1'

/usr/local/aolserver/bin/nspostgres.so does exist. The owner is nsadmin.nsadmin, permissions 755.

/usr/local/pgsql/lib/libpq.so.2 also exists. The owner is postgres.web, permissions 755. libpq.so.2 is a symlink to libpq.so.2.2.

Later the log reports a series of errors beginning with the warning at the top of this post. The later errors seem to derive from the modload error.

Tom

Posted by russ m on

(assuming this is on a solaris box)

run ldd /usr/local/aolserver/bin/nspostgres.so and take a look at the output. if there's a line that says libpq.so.2 => (file not found) then this is your problem -

that error message means that when the dynamic loader is loading nspostgres.so it can't find the required libpq.so.2 - this would be because /usr/local/pgsql/lib is in neither the compiled-in RPATH nor the LD_LIBRARY_PATH environment variable...

the solutions are to recompile nspostgres.so with LD_OPTIONS='-R/usr/local/pgsql/lib' set in your environment (this fixes your nspostgres.so so that it knows where to find its dependencies) or alternatively to use your existing binary but make sure that /usr/local/pgsql/lib is included in aolserver's LD_LIBRARY_PATH environment variable...
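a rough sketch of the ldd check described above, in case it helps. the helper name missing_libs is made up for illustration; it just filters ldd output for lines containing "not found", which should cover both the "(file not found)" wording mentioned above and the "not found" wording that Linux ldd prints (that second wording is an assumption about your platform):

```shell
#!/bin/sh
# Hypothetical helper: print the names of shared libraries that ldd
# could not resolve. Reads ldd output on stdin.
missing_libs() {
    # The first field of each ldd line is the library name.
    awk '/not found/ { print $1 }'
}

# Intended use (path taken from the error log in this thread):
#   ldd /usr/local/aolserver/bin/nspostgres.so | missing_libs
```

if that prints libpq.so.2, the loader's search path is the problem rather than the file's permissions.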

cheers

russell

Posted by Tom Brown on
Contents of nsd-postgres:

#!/bin/bash

export PATH=$PATH:/usr/local/pgsql/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/pgsql/lib

exec /usr/local/aolserver/bin/nsd $*

--------------------------------------

btw, the start command I am using is /usr/local/aolserver/bin/nsd-postgresql -u nsadmin -t /web/freegeek/freegeek.tcl.

If I start oacs as user nsadmin (rather than root), the nsd-postgres script above appears to work. At least postgresql loads. But other errors are exposed.

As the various packages load, the log fills with query messages for some of them, particularly ones that begin with "NO FULLQUERY FOR dbqd".

Example:

[01/Jul/2003:08:54:26][1583.1024][-main-] Debug: NO FULLQUERY FOR dbqd..NULL --> using default SQL
[01/Jul/2003:08:54:26][1583.1024][-main-] Notice: Querying '
    select package_key from apm_packages where package_id = '0';'
[01/Jul/2003:08:54:26][1583.1024][-main-] Notice: dbinit: sql(localhost::freegeek): '
    select package_key from apm_packages where package_id = '0'

There are also errors from the scheduled search_indexer proc.

Example:

[01/Jul/2003:08:31:28][1379.2051][-sched-] Notice: Running scheduled proc search_indexer...
[01/Jul/2003:08:31:28][1379.2051][-sched-] Error: invalid command name "parameter::get"
invalid command name "parameter::get"
    while executing
"parameter::get -package_id $package_id -parameter $name -default $default"
    (procedure "ad_parameter" line 6)
    invoked from within
"ad_parameter -package_id [apm_package_id_from_key search] FtsEngineDriver"
    (procedure "search_indexer" line 3)
    invoked from within
"search_indexer"
    ("eval" body line 1)
    invoked from within
"eval [concat [list $proc] $args]"
    (procedure "ad_run_scheduled_proc" line 43)
    invoked from within
"ad_run_scheduled_proc {f f 60 search_indexer {} 1057064728 0 t}"

Finally, AOLserver no longer returns an error page. Now it returns a blank screen.

Tom

Posted by russ m on

has this setup worked in the past, has the configuration been changed, or is it a new install of aolserver and postgres? am I right to guess you're using solaris?

as far as I know the NO FULLQUERY notices are harmless - they just mean the query dispatcher is using an inline query from the .tcl file because there's nothing appropriate in a .xql

the other errors you mention remind me of something I saw ages ago when I had either homedir or serverroot set incorrectly in the sitename.tcl config file.

to see exactly what's going on with finding/loading libpq, you can add

export LD_DEBUG=detail,basic,libs
export LD_DEBUG_OUTPUT=ld-debug-output-file

to your nsd-postgres, and check the output file for the section where it's trying to load libpq... in my setup I see

19888: 1: find object=libpq.so.2; searching
19888: 1: search path=/opt/pgsql/lib:/usr/local/lib (RPATH from file /opt/aolserver/bin/postgres.so)
19888: 1: trying path=/opt/pgsql/lib/libpq.so.2

where /opt/pgsql/lib/libpq.so.2 is the correct path and all is well... you will see everywhere it's trying to look, which should help track down why it's not looking in the right place (assuming it used to)...

cheers

russell

Posted by Tom Brown on
Sorry to keep you guessing on the OS -- Slackware 8.1 on an aging IBM Aptiva.

Yes, the system has worked in the past, very recently in fact. I wish I could tell you that I changed a config file or a set of permissions, anything. But I can't think of a single change. So it has to be something unremarkable.

I set "export LD_DEBUG=libs" in nsd-postgres and set an output file. The other options (detail and basic) returned errors on Slackware, so I dropped them, but the libs option worked. I searched the debug output file for occurrences of libpq.so.2, but it never turned up. So I searched for "libpq" and "so.2" separately: plenty of each, but never combined as libpq.so.2.

Tom

Posted by russ m on

I don't know a whole lot about Linux's dynamic loader (or anything else), so I'm getting into speculation here... perhaps someone else is better placed to help...

but, regarding your loader debug output, I'd expect there's a section there where it's talking about trying to find libpq (perhaps without the .so or library version extensions), and iterating through some series of directories before giving up and calling it a day... it might be interesting to compare loader debug output when nsd is run as root and as nsadmin to see what difference there is in searching for shared objects...
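one way to do that comparison, as a sketch only: the helper below (loader_lines_for is an invented name) pulls out the loader debug lines mentioning a library fragment and strips the leading PID column, so two runs can be diffed. it assumes the glibc behaviour of prefixing each debug line with the process id and appending the pid to the LD_DEBUG_OUTPUT file name; the /tmp file names in the comments are placeholders:

```shell
#!/bin/sh
# Hypothetical helper: show loader debug lines that mention a library
# name fragment, with the leading "PID:" column removed.
# $1 = fragment to look for (e.g. libpq); remaining args = debug files
loader_lines_for() {
    frag=$1
    shift
    grep -h -- "$frag" "$@" | sed 's/^[[:space:]]*[0-9][0-9]*:[[:space:]]*//'
}

# Intended use, after one run as root and one as nsadmin, each with
# LD_DEBUG=libs and a different LD_DEBUG_OUTPUT prefix (placeholders):
#   loader_lines_for libpq /tmp/ld-root.*    > /tmp/root.txt
#   loader_lines_for libpq /tmp/ld-nsadmin.* > /tmp/nsadmin.txt
#   diff /tmp/root.txt /tmp/nsadmin.txt
```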

another thought regarding your nsd-postgres - if LD_LIBRARY_PATH is undefined when nsd-postgres is called, the final value that aolserver will inherit is ":/usr/local/pgsql/lib"... it's possible that the empty path component at the start of that is confusing the loader... it's a stab in the dark, but you might try changing that to just

export LD_LIBRARY_PATH=/usr/local/pgsql/lib

and see what happens...
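if you'd rather keep whatever LD_LIBRARY_PATH is already set, a variant that avoids the empty component might look like the sketch below. append_path is an invented helper name; the ${var:+...} expansion is standard sh and only emits the separator when the existing value is non-empty. (as I understand the glibc ld.so documentation, an empty path component is treated as the current directory, which is probably not what you want.)

```shell
#!/bin/sh
# Hypothetical helper: append a directory to a colon-separated path
# without producing an empty component when the path starts out empty.
# $1 = directory to add, $2 = existing path value (may be empty)
append_path() {
    printf '%s\n' "${2:+$2:}$1"
}

LD_LIBRARY_PATH=$(append_path /usr/local/pgsql/lib "${LD_LIBRARY_PATH:-}")
export LD_LIBRARY_PATH
```

with LD_LIBRARY_PATH unset this gives plain /usr/local/pgsql/lib, the same as the hard-coded line above; with an existing value it gives that value followed by :/usr/local/pgsql/lib.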