Handling out of memory on "exec" calls
Created by Gustaf Neumann, last modified by Gustaf Neumann 13 Jul 2020, at 02:24 PM
On sites with a high number of configured threads and a high number of activated packages, the VM size of nsd might become large. One measure is to compile NaviServer and Tcl with the flag SYSTEM_MALLOC, and to run nsd with a memory-efficient malloc library such as TCMalloc (by setting LD_PRELOAD=/usr/lib/libtcmalloc.so in the service file).
However, even then memory might run short when "exec" is performed, even via the nsproxy module. The nsproxy module was developed to reduce memory consumption by running a separate process which communicates via pipes with nsd. The nsproxy module is designed to run multiple worker processes (the number can be configured), but when the running workers do not suffice (more such worker processes are needed), a new nsproxy worker process has to be created. This happens via a fork() system call under Linux, which can result in an out-of-memory message like the following:
Error: exec: ns_fork() failed: Cannot allocate memory
Error: exec failed: not enough memory
What should one do in such cases? In general, the first rule is to reduce the "exec" calls as far as possible, since these are relatively slow and resource-intensive, and NaviServer/Tcl provide a large number of built-ins and library functions, which should be used wherever possible.
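As a minimal sketch of this first rule, the following procs show how some common "exec" calls could be replaced by core Tcl commands. The proc names (today, ensure_dir, list_files, remove_tree) are hypothetical and only serve as illustration; which "exec" calls a site actually issues is an assumption.

```tcl
# Hypothetical helper procs; only core Tcl commands are used,
# so no external process has to be forked.

# Instead of: exec date +%Y-%m-%d
proc today {} {
    return [clock format [clock seconds] -format %Y-%m-%d]
}

# Instead of: exec mkdir -p $dir
proc ensure_dir {dir} {
    file mkdir $dir
}

# Instead of: exec ls $dir
proc list_files {dir} {
    return [lsort [glob -nocomplain -tails -directory $dir *]]
}

# Instead of: exec rm -rf $path
proc remove_tree {path} {
    file delete -force -- $path
}
```

Each of these runs inside the calling thread and therefore needs neither fork() nor an nsproxy worker.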
Secondly, it is preferable to start many nsproxy workers early in the lifetime of nsd (e.g. at startup, when it has a small footprint) and to ensure that the nsproxy module keeps these worker processes alive for a long time (by setting "idletimeout" to a high value):
ns_section ns/server/${server}/module/nsproxy {
    # ns_param recvtimeout 5000
    # ns_param waittimeout 1000
    # ns_param idletimeout 300000
    ns_param idletimeout 700000000
}
When all configured nsproxy worker processes are running all the time, there is no need to fork later, and the error above will not occur anymore. The following snippet shows how to start all nsproxy worker processes by creating an ns_job queue with sufficient threads, and by queuing a simple command, executed via nsproxy, asynchronously (using the "-detached" flag) so that the jobs run in parallel.
set concurrency [ns_proxy configure ExecPool -maxslaves]
if {"q1" ni [ns_job queues]} {
    ns_job create q1 $concurrency
}
#
# Queue a sufficient number of jobs to be executed in parallel
#
time {ns_job queue -detached q1 {exec sleep 1}} [expr {$concurrency * 2}]
Additionally, versions of NaviServer beyond 4.99.20 show via
ns_proxy stats ExecPool
the number of running worker processes of the ExecPool (also included in the process page of nsstats).
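To check from a script whether the workers are actually up, the statistics could be written to the system log. The following is a sketch only: it assumes that "ns_proxy stats" returns a flat list of key/value pairs and makes no assumption about the key names. The proc name format_pool_stats is hypothetical.

```tcl
# Turn a flat key/value list into one readable line per entry.
# Assumption: "ns_proxy stats" returns such a list; the key names
# are not assumed and are passed through unchanged.
proc format_pool_stats {pool stats} {
    set lines {}
    foreach {key value} $stats {
        lappend lines "nsproxy pool $pool: $key = $value"
    }
    return $lines
}

# Only meaningful inside NaviServer, where ns_proxy and ns_log exist.
if {[llength [info commands ns_proxy]] > 0} {
    foreach line [format_pool_stats ExecPool [ns_proxy stats ExecPool]] {
        ns_log notice $line
    }
}
```

Running this shortly after the warm-up jobs have been queued shows in the log whether the expected number of worker processes has been started.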