Forum OpenACS Development: Re: Calls to server itself are ok on command line, but hang in /ds/shell

What are your thread settings?

There is a case where nsd will not actually spawn a new thread for a new connection even if there is no thread available, and this can lead to a blocking http call to the server itself hanging everything until it times out.

In naviserver, you can fix this by setting

ns_section "ns/server/${servername}"
ns_param lowwatermark 0 ;# eager thread creation at low concurrency

I don't recall the corresponding setting for AOLserver without checking, but the thread creation behavior is a bit different so it might not have the same issue.

The symptoms you described look just like mine, but still can't figure out my cause. Here is the relevant part of my conf on the infamous instance

# Scaling and Tuning Options
# ns_param maxconnections 100 ;# 100; number of allocated connection stuctures
# ns_param maxthreads 10 ;# 10; maximal number of connection threads
# ns_param minthreads 1 ;# 1; minimal number of connection threads
ns_param connsperthread 1000 ;# 10000; number of connections (requests) handled per thread
# ns_param threadtimeout 120 ;# 120; timeout for idle theads
# ns_param lowwatermark 10 ;# 10; create additional threads above this queue-full percentage
ns_param highwatermark 100 ;# 80; allow concurrent creates above this queue-is percentage
;# 100 means to disable concurrent creates

It is not subjected to heavy load, because we use it for development. I have checked on other instances where I don't have the problem and saw that configurations are the same about threads.

Right, this is what I expected. Since minthreads=1 (generally quite reasonable on a development server) a new thread won't spawn until needed. The watermarking logic defaults to not "needing" a new thread until the queue is 10% full, i.e., there are 10 connections waiting to be services.

In your test, you have one connection currently running (the ds request), and a second waiting (the loopback request), so only 2 connections is not busy enough to warrant spawning a new thread, and the server doesn't (AFAIR) have any logic to specify that a thread should be created if connections have simply been queued for a certain amount of time without progressing. (The benefit of such logic would be dubious IMO; connections are generally expected to complete quickly unless there is a specific reason they should be acting differently).

So changing the lowwatermark setting to 0 will make the server always create a new thread on a new request if there's not already a thread for it, so the loopback request will be handled.

Alternately, you could set minthreads to something more than 1, or while waiting for the loopback request to time out hit the server 10 or more times using curl to queue up connections and get it to create a new thread, and then things will progress (until a restart, at least).

Changing minthreads value solved the issue!

Thanks Jeff and Gustaf!

small addon:
- newer versions of NaviServer and OpenACS show a more modest growth of memory per additional connection thread. E.g. uses minthread 5, and uses after 4 weeks of continuous running < 2GB RSS
- NaviServer checks at the end of each request if there are requests queued and/or sufficient threads available. During a request is running in a connection thread, incoming requests are either handle by other connection requests or they are queued. This is perfectly fine on real-world systems with many small requests, but bad for your kind of application, where the only connection thread deadlocks, since it waits for the completion of a queued request.
- Allowing concurrent connection thread creates won't help, since this means that thread creates are allowed at a time while threads are being created (many tcl version had/have problems with this)