Forum OpenACS Q&A: Mozilla 4.0 kills openacs?

Posted by Koala Yeung on
I am not sure if this is true.

The OpenACS service instance on my server has been dying quite frequently for months. I had to write a bash script to monitor the service and restart it every time it dies. The script also records the last few lines of AOLserver's error_log and access_log so that I can track down the error.

It seems that the service often dies when Googlebot visits. I blocked Googlebot (yes, that's crazy) for a few months and the downtime dropped dramatically, but I still get some. I examined the logs and found that Mozilla 4.0, or things based on it (like Googlebot), seem to sometimes kill the OpenACS service instance.

I'm not sure if anyone else has run into the same situation. Maybe it is because of badly written code or something else. I'm not sure what to look at. Please tell me what to do.

Thanks,
Koala
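
For readers following along, a watchdog of the kind described in this post could be sketched roughly as below. This is not Koala's actual script: the probe URL, the log directory, the report file, and the restart command are all hypothetical placeholders, and only the log file names (error_log, access_log) and the use of wget come from the thread.

```shell
#!/bin/sh
# Watchdog sketch. All paths, the URL, and the restart command are
# hypothetical placeholders, not values from the original post.

# Append the last N lines of each AOLserver log to a report file.
# usage: snapshot_logs LOG_DIR REPORT_FILE [LINES]
snapshot_logs() {
    log_dir=$1
    report=$2
    lines=${3:-20}
    {
        echo "=== service down at $(date) ==="
        for name in error_log access_log; do
            echo "--- last $lines lines of $name ---"
            tail -n "$lines" "$log_dir/$name" 2>/dev/null
        done
    } >> "$report"
}

# Succeed if the site answers an HTTP request within 10 seconds.
# usage: service_is_up URL
service_is_up() {
    wget -q -O /dev/null --tries=1 --timeout=10 "$1"
}

# One watchdog pass: if the probe fails, save the log tails, then restart.
# usage: check_once URL LOG_DIR REPORT_FILE RESTART_COMMAND
check_once() {
    if ! service_is_up "$1"; then
        snapshot_logs "$2" "$3"
        sh -c "$4"
    fi
}
```

A loop such as `while true; do check_once URL LOG_DIR REPORT "RESTART COMMAND"; sleep 60; done` would run it once a minute. Saving the log tails before restarting matters, because the restart itself appends new lines to the logs.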

Posted by Torben Brosten on
It's been a couple of years.. so maybe I'm remembering wrong..

I believe AOLserver with nsopenssl crashed frequently when it tried to serve mixed http/https content over the same connection/request to early browsers, such as NN4 or IE3.

Posted by Koala Yeung on
Thanks for your reply.

That may not be the case here, since we do not serve mixed http/https content on our server. We do not even use nsopenssl. We do, however, use apache2 with mod_ssl and mod_proxy to serve our site over https. Do you think that is related?

Also, Mozilla is a successor of Netscape Navigator. Maybe that has something to do with this, too.

How can I prove whether this assumption is right or wrong?

Koala

Posted by Torben Brosten on
I tested two ways:

1. Create an HTML page that contains img tags whose src attributes are fully-qualified https URLs on the same site, and then access the page via http. For example, create test-page-1.html:

...some html...
>img src="https://mysite.foobar.com/image.gif">;
...some html...
some more image tags with https references
...more html..

then reference the page from the browser:
http://mysite.foobar.com/test-page-1.html

2. Create an HTML page that contains img tags whose src attributes are fully-qualified http URLs on the same site, and then access the page via https. For example, create test-page-2.html:

..some html..
>img src="http://mysite.foobar.com/image.gif">;
..some html..
some more image tags with http references
..more html..

then reference the page from the browser:
https://mysite.foobar.com/test-page-2.html

Note: Request the page using the http/https protocol *not* used in the image references.
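
The two test pages above could be generated with a small script like the following sketch. The function names, the surrounding HTML, and the choice of three img tags per page are illustrative assumptions; the host name and image path are the placeholders used in this post.

```shell
#!/bin/sh
# Writes the two mixed-content test pages described above.
# The host name, image path, and number of img tags are placeholders.

# usage: write_test_page FILE SCHEME HOST
# SCHEME is the protocol used in the image references (http or https).
write_test_page() {
    file=$1
    scheme=$2
    host=$3
    cat > "$file" <<EOF
<html>
<body>
<p>Mixed-content test: images are referenced via $scheme.</p>
<img src="$scheme://$host/image.gif">
<img src="$scheme://$host/image.gif">
<img src="$scheme://$host/image.gif">
</body>
</html>
EOF
}

# usage: write_test_pages DOCROOT HOST
write_test_pages() {
    # Page 1 references https images; request it via http.
    write_test_page "$1/test-page-1.html" https "$2"
    # Page 2 references http images; request it via https.
    write_test_page "$1/test-page-2.html" http "$2"
}
```

For example, `write_test_pages /path/to/docroot mysite.foobar.com` writes both files into the given directory, after which each page is requested with the protocol not used in its image references.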

Posted by Torben Brosten on
Oops. Just to clarify any ambiguity: I meant to write <img instead of >img
Posted by Andrew Piskorski on
Koala, btw, there is no such thing as an "OpenACS service instance"; I presume you mean the AOLserver process which you use with OpenACS.

You need to find out why your AOLserver process is going down. Guessing is useless; you need to know. Is it crashing, or is something else on your system purposely killing it?

If AOLserver is crashing, make sure you have core files enabled (usually done with "ulimit -c unlimited" under Linux), and then examine the core file with gdb. At a minimum, show us the stack backtrace ("bt"). Hopefully it will give a clue as to why the process crashed.
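
As a rough sketch of the first step, the following checks and raises the core file limit in the shell that will start AOLserver. The helper name describe_core_limit is made up for illustration; only "ulimit -c unlimited" comes from this post.

```shell
#!/bin/sh
# Sketch: make sure core files can be written before starting AOLserver.

# Describe a value reported by `ulimit -c`.
# usage: describe_core_limit VALUE
describe_core_limit() {
    case "$1" in
        0)         echo "core files disabled" ;;
        unlimited) echo "core files enabled, no size limit" ;;
        *)         echo "core files enabled, limited to $1 blocks" ;;
    esac
}

# Raise the soft limit for this shell and everything started from it.
# This can fail if the hard limit is lower, hence the fallback message.
ulimit -c unlimited 2>/dev/null || echo "could not raise the core file limit" >&2
describe_core_limit "$(ulimit -c)"
```

The limit is inherited by child processes, so it has to be set in the same shell or startup script that launches nsd. Where the core file ends up, and whether a process that changes its user ID is allowed to dump core at all, depends on system settings, so it is worth checking those if no core file appears.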

Also, I see no reason why you needed to write any bash script to "monitor the service and restart it every time it dies". Why did you do this? Getting AOLserver to restart any time it is killed is easy: just use /etc/inittab or Daemontools, as described in the OpenACS docs.

Posted by Koala Yeung on
I know it is a stupid question, but how do I use gdb to get a backtrace? When I type "bt" before the process dies, it shows me this:

(gdb) bt
#0 0x003e67a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1 0x0065451e in do_sigwait () from /lib/tls/libpthread.so.0
#2 0x006545bf in sigwait () from /lib/tls/libpthread.so.0
#3 0x00b26056 in ns_sigwait ()
from /usr/local/AOLserver4.0.9/lib/libnsthread.so
#4 0x00dff7f6 in NsHandleSignals ()
from /usr/local/AOLserver4.0.9/lib/libnsd.so
#5 0x00de67a9 in Ns_Main () from /usr/local/AOLserver4.0.9/lib/libnsd.so
#6 0x0804861d in main ()

After the process dies, it gives me this:
(gdb) bt
#0 0x003e67a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
Error accessing memory address 0xbff687b4: No such process.

Is that enough? What did I miss? How can I obtain what is missing?

"Also, I see no reason why you needed to write any bash script to 'monitor the service and restart it every time it dies'. Why did you do this? Getting AOLserver to restart anytime it is killed is easy, just use /etc/inittab or Daemontools, as is described in the OpenACS docs."

First, I tried the inittab method, but it brought my machine down. Second, I know bash better than Daemontools. Third, I need to monitor the service status through the web interface with wget.

Anyway, my bash script is working fine and I have all that I need.

Posted by Andrew Piskorski on
Koala, it's not particularly stupid; if you've never used it before, there is definitely some learning curve to gdb. (Using it is actually pretty easy, but it is not obvious; you have to learn how.)

I think you are attaching gdb to the running process, which then gets killed. You can probably make that work too (I don't remember, it's been a long time since I used gdb much), but for just getting a backtrace that's not necessary. Instead, just run gdb on the core file which is written out after the process crashes:

In Emacs, do: M-x gdb-core and answer its questions. It will ask you for the path to the AOLserver executable, and for the path to the core file. Once gdb starts up, just try "bt". Some additional gdb commands that might help give you more info are:

(gdb) dir /web/aol4-src/aolserver
(gdb) dir /web/aol4-src/aolserver/nsd
(gdb) set height 0
(gdb) show directories
Of course, change the path names above to point to your source trees for AOLserver.
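
Outside Emacs, the same backtrace can be pulled from a core file straight from the shell. The sketch below uses gdb's -batch and -ex options to run the commands non-interactively; the function name core_backtrace and every path passed to it are placeholders to adapt.

```shell
#!/bin/sh
# Sketch: print a backtrace from a core file without an interactive session.
# The executable, core file, and source directory are placeholders.

# usage: core_backtrace EXECUTABLE CORE_FILE [SOURCE_DIR]
core_backtrace() {
    exe=$1
    core=$2
    src=$3
    if [ ! -r "$core" ]; then
        echo "core file not readable: $core" >&2
        return 1
    fi
    if [ -n "$src" ]; then
        # "dir" tells gdb where to look for the source files.
        gdb -batch -ex "dir $src" -ex "set height 0" -ex "bt" "$exe" "$core"
    else
        gdb -batch -ex "set height 0" -ex "bt" "$exe" "$core"
    fi
}
```

A call would look like `core_backtrace /path/to/nsd /path/to/core /web/aol4-src/aolserver`. Since AOLserver is multi-threaded, replacing "bt" with "thread apply all bt" may be more informative, because it prints the stack of every thread rather than only the current one.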
Posted by Brian Fenton on
Hi Koala,

what are your values for the following AOLserver parameters?

maxconnections
maxthreads
minthreads

Brian