Forum OpenACS Q&A: AOLserver won't start

Posted by Ola Hansson on
Hi all!

Please help me with this one:
(Debian Testing, kernel 2.4.17, PG 7.1.3, AOLserver 3.3.1+ad13, latest virtual Jerry's.., OpenACS 4.5B)

One of my three virtual servers suddenly died on me for no apparent reason. I'm using daemontools to manage the servers, but the *bad* server can no longer be started with "svc -u /service/nsd-infogettable-dev/". I can still start it by typing "./bin/nsd-postgres -ft infogettable-dev.tcl", but then it takes around 90 seconds to start instead of the 10 seconds it used to take.

Oddly enough, it cannot be started in the background (which is what svc does, I think), only in the foreground. Odder still, if I redirect stderr to a file (for debugging purposes) with "./bin/nsd-postgres -ft infogettable-dev.tcl 2> STARTUP_LOG", it takes the normal 10 seconds to start!

Whichever way I start the server, I get a "Segmentation fault" (the server dies) when a nonexistent page is requested. I have never seen this before:
Notice: Error from etp::get_pa was:
 Query did not return any rows.
Notice: index.vuh: request for kfgkfdkfjdfsl
Notice: Edit This Page index.vuh: serving /web/infogettable-dev/packages/edit-this-page/templates/article-content
Warning: APM: cache_max_age does not exist
Segmentation fault
nsadmin@hal:~$ 

Sometimes I get this too: pq_recvbuf: unexpected EOF on client connection

(This started while, or after, the system was indexing a large number of large static pages for search. They are now perfectly searchable, BTW.)

Any help is, as always, much appreciated.

Posted by Dave Bauer on
Ola,

Check the stacksize in your Tcl config file. It is listed in two places in the sample config, and it has to be at least 500000 for OpenACS 4. I ran into this problem myself.

I have stacksize set in ns/threads.
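
One way to find both places is to search the config file for the parameter name. This is only a minimal sketch: the helper name `show_stacksize` is hypothetical, the file name is the one from the original post, and it assumes the parameter appears literally as "stacksize" in the file.

```shell
#!/bin/sh
# Hypothetical helper: print every line of a config file that mentions
# stacksize, prefixed with its line number, so each occurrence can be
# checked against the 500000 minimum mentioned above.
show_stacksize() {
    grep -n -i 'stacksize' "$1"
}

# Usage (file name taken from the original post; adjust to your setup):
# show_stacksize infogettable-dev.tcl
```

Each line printed is one place where the value may need raising.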

Posted by Ola Hansson on
Thanks Dave!

Increasing the stack size to 512*1024 (four times the default value...) makes the segfault disappear when I request pages that don't exist. Good.

It doesn't help the server start-up problem, though.

(Obviously, I've restarted the master and the other servers, but I never rebooted the whole box... I don't think I dare do that remotely :-b)

Posted by Ola Hansson on
I get this when I try to start the server with "-it" from the command line.

Does anyone know which file size this refers to?

(All the other servers start; only this one fails.)

nsadmin@hal:~$ ./bin/nsd-postgres -it infogettable-dev.tcl
[24/May/2002:13:29:13][5528.1024][-main-] Notice: nsd.tcl: starting to read config file...
[24/May/2002:13:29:13][5528.1024][-main-] Warning: nsd.tcl: nsssl not loaded because key/cert files do not exist.
[24/May/2002:13:29:13][5528.1024][-main-] Notice: nsd.tcl: finished reading config file.
File size limit exceeded
nsadmin@hal:~$

Posted by Ola Hansson on
Found the answer through Google!

The server log had reached the 2 GB file size limit (ouch!)

nsadmin@hal:~$ ls -l log/infogettable-dev-error.log
-rw-r--r--    1 nsadmin  web      2147483647 May 22 14:21 log/infogettable-dev-error.log

I guess I need Arjun's log roll script...
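
The size in the listing above, 2147483647 bytes, is 2^31 - 1, which is the largest size a file can reach when the program writing it lacks large-file support. That is most likely why the write failed with "File size limit exceeded". Below is a minimal sketch of a check that could run periodically, before the log gets that far. The function name `log_too_big`, the 1 GB threshold, and the rotation steps in the comments are all assumptions for illustration. This is not Arjun's script.

```shell
#!/bin/sh
# Threshold in bytes: 1 GB, well below the 2^31 - 1 (2147483647) byte limit.
# The value is an assumption chosen for illustration.
MAX_BYTES=1073741824

# Hypothetical helper: succeeds (exit status 0) if the file exists and its
# size is at least the given limit (default: MAX_BYTES).
log_too_big() {
    file=$1
    limit=${2:-$MAX_BYTES}
    [ -f "$file" ] || return 1
    # wc -c may pad its output with spaces on some systems; strip them.
    size=$(wc -c < "$file" | tr -d '[:space:]')
    [ "$size" -ge "$limit" ]
}

# Example use, with the paths from this thread (commented out on purpose):
# if log_too_big log/infogettable-dev-error.log; then
#     svc -d /service/nsd-infogettable-dev/   # ask the server to stop
#     # (svc returns immediately; wait for nsd to exit before moving the log)
#     mv log/infogettable-dev-error.log log/infogettable-dev-error.log.old
#     svc -u /service/nsd-infogettable-dev/   # start again with a fresh log
# fi
```

Stopping the server before moving the file is the conservative choice here, since a running process would otherwise keep writing to the old file.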