Forum OpenACS Q&A: OpenACS clustering setup and how it relates to xotcl-core.
We are looking to get some feedback and direction on the implementation of naviserver clustering.
Our growing company is very interested in implementing NaviServer clustering behind an nginx load-balancing server. We have been testing this setup for a little over a week now, and the nginx load balancing between the NaviServers appears to work great. The main reason for looking into clustering is to scale up our capacity in terms of NaviServer connections, because during high loads we have at times run out of available connections.
Not surprisingly, in our initial testing we have found that having multiple NaviServers behind nginx (served round robin) gives us much more connection capacity during our load tests. We have also found that putting the database on a separate VM from the NaviServer instances provides a great speed-up during extreme loads.
In our testing we also use Docker to manage the nginx and backend NaviServer containers. We are using an up-to-date oacs-5.10 development version and the latest xotcl-core.
In our testing we have set the kernel-package parameters that enable NaviServer clustering (CanonicalServer, ClusterAuthorizedIP, ClusterEnabledP, ClusterPeerIP, EnableLoggingP) in order to properly implement cache flushing across the NaviServer cluster.
In testing this configuration we have found that the code does call the cluster servers for anything that is util_memoize cached, by requesting the pages/SYSTEM/util_memoized_statement.tcl script on each server. However, for things like cached parameters it does not call the cluster servers. See the example below:
Broader overall Question #1:
How much of the cache flushing over the cluster has actually been implemented? We are asking this question along with the following two more specific questions.
Question #2:
In your opinion, what would be the pluses and minuses of us making parameters, user info, etc., flushable across all servers in the cluster?
Example: under acs-subsite/www/shared/parameters.tcl
- calls parameter-procs.tcl -> parameter::set_value package_id $package_id -parameter $name
- acs-content-repository-callback-procs.tcl -> callback subsite::parameter_changed
- if {$package_key eq "acs-content-repository" && $parameter eq "CRFileLocationRoot" && $value ne ""} Then
- a) nsv_unset CR_LOCATIONS CR_FILES
- b) nsv_set CR_LOCATIONS CR_FILES "$file_location"
- acs.tcl - ad_parameter_cache -delete -set $value $package_id $parameter
- nsv_unset ad_param_$key $parameter_name
- NOTE: We could call code to call each server in the cluster to do the same nsv_unset here...
We also noticed that xotcl-core appears to have some caching set up for site nodes when UserXotclSiteNodes is set to true. There also appears to be a method for handling acs::clusterwide flushing of the generic cache in the base generic caching class (::acs::Cache), although there is a comment saying that it may not be fully implemented yet.
Question #3:
In your opinion, if we wanted to implement server clustering in the next two months, would it be easier and faster to implement cluster cache flushing across most caches using the util_memoize_flush way of doing things,
or
would we be better off implementing the xotcl caching for parameters, util_memoized functions, etc.?
If the latter, do you have any tips on how to start caching/flushing across the servers in the cluster this way for most of our caches.
Thanks for your assistance and direction, Marty
Dear Marty,
you are raising many interesting questions! The short answer is that the cluster support is only partly implemented - simply because with a high number of cores and a suitable NaviServer configuration one gets very high scalability (up to 1000 permission-checked OpenACS requests per second), getting all of OpenACS with all its packages fit for clusters is some work, and using a cluster makes debugging more complex.
If one has many long-running connection requests, one should use multiple connection thread pools (have you set this up?), and one should consider reducing the blocking time of the requests (what times are you experiencing?). Are you using the xotcl request monitor? E.g., the long-calls page gives you a good overview.
found that having the database on a separate VM than the naviserver instances
I do not think this is the case in general; it depends on the memory of the VM, on the number of cores given to the VM, etc. The local communication within a machine is always faster in terms of latency and throughput than via TCP/IP. When a VM is under-configured for the required load (has e.g. only 2 cores, or is running out of memory, not enough shared buffers, ...), then maybe.
how much of the cache flushing over the cluster has actually been implemented
Caching happens on many levels: per-request caching, per-thread caching, caching via ns_cache, and caching via nsv. The first one is not relevant for cluster considerations; per-thread caching is intentionally lock-free, but requires measures like a restart (it is used for things that "never change"). For the other ones, there is the special handling provided by util_memoize and the clusterwide operations. Using util_memoize for everything is not good, since it does not scale: there are many long locking operations that limit concurrency (check the lock times in nsstats and sort by "total wait"; it is not unlikely that the util_memoize cache is at the top). Therefore the caching work of the last years was devoted to defining multiple caches and cache partitioning.
The clusterwide support should work for you out of the box. Historically, it was developed in xotcl-core, but it has been moved partly into acs-tcl, since it is useful everywhere (some more generally useful stuff will be moved in future versions from xo:: to the core locations). Slightly simplified, it works as follows: instead of writing
nsv_unset $array $key
one can use
::acs::clusterwide nsv_unset $array $key
to execute the command on all machines of the cluster. So prefixing the cache-clearing operations with ::acs::clusterwide should do the job.
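For example, a minimal sketch of such cluster-aware flushing for the parameter case mentioned above (the array and cache names are taken from the thread; the exact call sites are an assumption, not the actual acs-tcl source):

# Sketch only, assuming ::acs::clusterwide is available (as in current acs-tcl):
# run the same flush commands locally and on all configured cluster peers.
::acs::clusterwide nsv_unset ad_param_$key $parameter_name
::acs::clusterwide ns_cache_flush util_memoize $statement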
Be aware that intra-server talk for cache clearing operations can become quite heavy, and that the communication as done right now is not how it should be done. ... many things can be done there, but first it has to be clear that it is worth the effort.
The first check is usually to figure out the local bottlenecks and the configuration, this is usually much less work than going towards the clustered solution.
Hope this answers some of your questions
all the best
-g
Sorry for the late reply. I was out on vacation.
We appreciate your help and expertise. Your point is well taken that we should try to fine tune our configuration as much as we can. I will try to describe our setup and testing on our end so we can try to make sense of what we see. I work on the same naviserver instance as Tony Kirkham. I know you have worked with him on some of our setup so that may help you, to some degree, know what type of setup we have.
We did, at one time, have an extra connection thread pool set up for slow runners. However, after recently updating to the newest NaviServer with xotcl we do not have one currently. It was our understanding that the new xotcl may take care of some of the long runners automatically without us defining a specific thread pool and telling xotcl about it. Is this not a good assumption? How would we go about telling the new core where our new long-running pool(s) reside? Our current pool config is below.
ns_section ns/server/${server}/pools
ns_param monitor "Monitoring actions to check the health of the system"
ns_param image "Image file requests"

ns_section ns/server/${server}/pool/monitor
ns_param minthreads 3
ns_param maxthreads 5
ns_param map "GET /admin/nsstats"
ns_param map "GET /SYSTEM"
ns_param map "GET /ds"
ns_param map "POST /ds"

ns_section ns/server/${server}/pool/image
ns_param minthreads 3
ns_param maxthreads 5
ns_param map "GET /*.png"
ns_param map "GET /*.jpg"
ns_param map "GET /*.PNG"
ns_param map "GET /*.JPG"
ns_param map "GET /*.JPEG"
ns_param map "GET /*.jpeg"
ns_param map "GET /*.mp4"
ns_param map "GET /*.MP4"
ns_param map "GET /*.pdf"
ns_param map "GET /*.PDF"
ns_param map "GET /*.gif"
ns_param map "GET /*.GIF"

Our server setup:
We have one VM with 24 cores and 80 GB of RAM on which our live system resides. We also have a staging VM with 16 cores and 200 GB of RAM.
The tests below were run against our staging server. I have been using a load-testing tool called k6 (see: https://k6.io/) to establish a baseline test for our server and to run a quick 4-minute test.
The base test had the following ramp-up options:

export const options = {
  stages: [
    { target: 25, duration: '30s' },  // Ramp up to 25 concurrent users over 30 seconds
    { target: 100, duration: '30s' }, // Then ramp up to 100 concurrent users for the next 30 seconds
    { target: 300, duration: '1m' },  // Then ramp up to 300 concurrent users for the next minute
    { target: 100, duration: '1m' },  // Then ramp down to 100 concurrent users for the next minute
    { target: 0, duration: '1m' },    // Finally ramp down to 0 concurrent users for the next minute
  ]
};
Each k6 session does the following things:
1. Go to the login page
2. Verify that it got a status 200 from the login page
3. Sleep 1 second
4. Submit the login form
5. Verify that it got a status 200
6. Sleep 3 seconds
7. Go to a quick-loading page (~ under 1 second)
8. Verify that it got status 200
9. Sleep 3 seconds
10. Go to our highly dynamic, server-side rendered, commonly used effort reporting page (normally takes a full 2 seconds to load)
11. Verify that it got status 200
12. Sleep 3 seconds
Test #1 on our Staging server
Nsstats over time ------------------------ No Load on system: 3:56pm Connection Threads: min 2 max 10 current 2 idle 1 stopping 0 waiting 0 Request Handling: requests 7, queued 0 (0.00%), spooled 6 (85.71%) Request Timing: avg queue time 98.57µs, avg filter time 183.63ms, avg run time 12.91ms avg trace time 3.92ms Active Requests: cns6 172.31.6.115 running GET /z1 0.014718 0 . Starting the test. ~3:58pm 140 VUs (Virtual Users) Connection Threads: min 2 max 10 current 10 idle 0 stopping 0 waiting 73 Request Handling: requests 1.7K, queued 1K (61.48%), spooled 1K (62.24%) Request Timing: avg queue time 660.56ms, avg filter time 5.44ms, avg run time 218.25ms avg trace time 1.31ms Active Requests: cns1639 172.31.207.40 running GET /z2 5.408894 0 . ~3:59pm 300 VUs Connection Threads: min 2 max 10 current 10 idle 0 stopping 0 waiting 90 Request Handling: requests 3.6K, queued 3K (77.81%), spooled 2K (64.06%) Request Timing: avg queue time 1.39s, avg filter time 5.51ms, avg run time 224.59ms avg trace time 1.6ms Active Requests: cns3397 172.31.207.40 running GET /z3 0.815387 0 . ~4:00pm 200 VUs Connection Threads: min 2 max 10 current 10 idle 0 stopping 0 waiting 75 Request Handling: requests 4.9K, queued 4K (82.60%), spooled 3K (66.17%) Request Timing: avg queue time 1.95s, avg filter time 5.77ms, avg run time 268.55ms avg trace time 1.71ms Active Requests: cns4809 172.31.207.40 running GET /z4 0.932554 0 . 4:01pm 49 VUs Connection Threads: min 2 max 10 current 10 idle 3 stopping 0 waiting 0 Request Handling: requests 6.1K, queued 5K (81.76%), spooled 4K (66.21%) Request Timing: avg queue time 2.03s, avg filter time 5.84ms, avg run time 289.01ms avg trace time 1.74ms Active Requests: cns6038 172.31.207.40 running GET /z5 7.99834 0 . K6 summary ------------------ running (4m09.0s), 000/300 VUs, 964 complete and 86 interrupted iterations default ✓ [======================================] 000/300 VUs 4m0s ✗ status is 200 ↳ 99% — ✓ 4051 / ✗ 7 ✗ page succeeded ↳ 99% — ✓ 4041 / ✗ 17 █ setup █ teardown checks.........................: 99.70% ✓ 8092 ✗ 24 << Note: We would like be 100% on our page check for status 200 and page succeeded. data_received..................: 922 MB 3.7 MB/s data_sent......................: 5.2 MB 21 kB/s http_req_blocked...............: avg=2.6ms min=1.48µs med=7.01µs max=41.1ms p(90)=8.31ms p(95)=12.44ms http_req_connecting............: avg=180.19µs min=0s med=0s max=34.22ms p(90)=419.21µs p(95)=465.91µs ✗ http_req_duration..............: avg=3.89s min=441.05µs med=1.15s max=35.08s p(90)=11.28s p(95)=17.46s << Note: It would be nice to be under 8 seconds in the 95th percentile. { expected_response:true }...: avg=3.89s min=4.6ms med=1.15s max=35.08s p(90)=11.28s p(95)=17.4s http_req_failed................: 0.11% ✓ 7 ✗ 6143 << Note: We would like to have 0% failed http requests. 
http_req_receiving.............: avg=1.16ms min=51.84µs med=178.08µs max=41.42ms p(90)=2.66ms p(95)=4.52ms http_req_sending...............: avg=42.78µs min=8.58µs med=40.67µs max=513.41µs p(90)=76.4µs p(95)=88.37µs http_req_tls_handshaking.......: avg=2.38ms min=0s med=0s max=39.85ms p(90)=7.75ms p(95)=11.64ms http_req_waiting...............: avg=3.89s min=294.73µs med=1.15s max=35.08s p(90)=11.28s p(95)=17.46s http_reqs......................: 6150 24.695368/s iteration_duration.............: avg=32.34s min=2.31µs med=31.28s max=1m6s p(90)=54.78s p(95)=56.5s iterations.....................: 964 3.870949/s vus............................: 1 min=1 max=300 vus_max........................: 300 min=300 max=300 Error.log file shows we are out of available connections during this test. -------------------- ... [13/Sep/2021:16:00:16][1.7f1a15c5a700][-driver:nsssl_v4:0-] Notice: [east_intrastage pool ] All available connections are used, waiting 90 idle 0 current 10 [13/Sep/2021:16:00:17][1.7f1a15c5a700][-driver:nsssl_v4:0-] Notice: [east_intrastage pool ] All available connections are used, waiting 90 idle 0 current 10 [13/Sep/2021:16:00:18][1.7f1a15c5a700][-driver:nsssl_v4:0-] Notice: [east_intrastage pool ] All available connections are used, waiting 90 idle 0 current 10 [13/Sep/2021:16:00:18][1.7f1a15c5a700][-driver:nsssl_v4:0-] Notice: [east_intrastage pool ] All available connections are used, waiting 90 idle 0 current 10 [13/Sep/2021:16:00:19][1.7f1a15c5a700][-driver:nsssl_v4:0-] Notice: [east_intrastage pool ] All available connections are used, waiting 90 idle 0 current 10 [13/Sep/2021:16:00:20][1.7f1a15c5a700][-driver:nsssl_v4:0-] Notice: [east_intrastage pool ] All available connections are used, waiting 90 idle 0 current 10 [13/Sep/2021:16:00:21][1.7f1a15c5a700][-driver:nsssl_v4:0-] Notice: [east_intrastage pool ] All available connections are used, waiting 90 idle 0 current 10 ...Some of the longer calls from request monitor during this time. request-monitor/long-calls?lines=1000
... 14.371059 0.003934 1.550067 15.925060 2021 Sep 13 16:00:30 14.290415 0.006291 1.668398 15.965104 2021 Sep 13 16:00:30 14.062884 0.003386 0.007183 14.073453 2021 Sep 13 16:00:29 14.189928 0.003912 1.581149 15.774989 2021 Sep 13 16:00:29 14.176448 0.004008 0.018152 14.198608 2021 Sep 13 16:00:29 14.314284 0.003587 0.010584 14.328455 2021 Sep 13 16:00:29 14.265956 0.004304 1.653035 15.923295 2021 Sep 13 16:00:29 14.447920 0.005010 1.477352 15.930282 2021 Sep 13 16:00:29 14.803571 0.004733 1.380080 16.188384 2021 Sep 13 16:00:29 14.458517 0.004058 1.835248 16.297823 2021 Sep 13 16:00:29 14.572207 0.006749 1.628151 16.207107 2021 Sep 13 16:00:29 14.449219 0.003139 0.006589 14.458947 2021 Sep 13 16:00:28 14.402088 0.007025 1.514557 15.923670 2021 Sep 13 16:00:28 14.547796 0.005566 1.703462 16.256824 2021 Sep 13 16:00:28 14.278776 0.012663 1.880569 16.172008 2021 Sep 13 16:00:28 14.470381 0.019008 1.793124 16.282513 2021 Sep 13 16:00:28 14.950281 0.003875 1.816067 16.770223 2021 Sep 13 16:00:28 14.547664 0.004792 1.457710 16.010166 2021 Sep 13 16:00:28 14.418724 0.019254 0.008703 14.446681 2021 Sep 13 16:00:28 ...
From the data above it looks like we are running out of connections in our connection pool, which I believe causes requests to be put onto the queue to wait until another connection thread is available.
Note: When we put 5 servers behind an nginx server and the DB on a separate VM, our status checks for page success were 100% and our wait times in the 95th percentile were around 7 seconds.
Any suggestions you have that would help us fine tune our configuration for one server would be greatly appreciated.
Thanks for your help, Marty

It was our understanding that the new xotcl may take care of some of the long runners automatically without us defining a specific thread pool and telling xotcl about it. Is this not a good assumption?
Well, newer versions of the xotcl-request-monitor can automate some steps, but it needs configuration. The xotcl-request-monitor can map requests to a "slow" pool when such a pool has been defined by the site admin. The site admin also has to configure this connection pool (e.g. min/max threads, rejectoverrun, queue size, ...). From the snippet of the configuration file it seems that you have no "slow" pool defined, so no automatic mapping will happen.
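For illustration, a minimal sketch of such a pool definition (the thread counts are placeholders, not a recommendation):

ns_section ns/server/${server}/pools
ns_param slow "Slow requests"

ns_section ns/server/${server}/pool/slow
ns_param minthreads 15
ns_param maxthreads 15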
From the output I see that the default pool has minthreads 2 and maxthreads 10. I would recommend defining 25/25 on your testing instance, based on:
live 24 cores and 80 GB of RAM, staging 16 cores and 200 GB.
For a rough estimate, one can run with 24 cores about 96 connection threads (assuming typical OpenACS loads), and with 16 cores about 64.
When looking at the 140 VUs output i see
Request Handling: requests 1.7K, queued 1K (61.48%), spooled 1K (62.24%)
The number of queued requests is way too high. On our production system (typically between 600 and 3000 concurrent users) we typically have less than 0.01% queued requests in the default pool.
Request Timing: avg queue time 660.56ms, avg filter time 5.44ms, avg run time 218.25ms avg trace time 1.31ms
You see here that the avg queue time (660ms, time waiting to be served) is much longer than the processing time (avg run time). On our production system (currently 650 users) we have now
avg queue time 465.04µs, avg filter time 4.18ms, avg run time 104.08ms avg trace time 4.27ms
the output
[13/Sep/2021:16:00:16][1.7f1a15c5a700][-driver:nsssl_v4:0-] Notice: [east_intrastage pool ] All available connections are used, waiting 90 idle 0 current 10
tells us that the queue is filled up (100 entries). The queue size is defined per pool via "maxconnections"; on our live system we have this value set to 600. One can fine-tune the behavior on queue overruns via "rejectoverrun". A larger queue makes the system more forgiving when there are occasional large request bursts.
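In the configuration file this could look like the following sketch (the values are placeholders, not a recommendation for your machine):

ns_section ns/server/${server}
ns_param maxconnections 600   ;# queue size (allocated connection structures) per pool
ns_param rejectoverrun true   ;# return "503 service unavailable" on queue overruns instead of queuing further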
While the values above are average values, the values below (long-calls) are from individual requests:
14.371059 0.003934 1.550067 15.925060 2021 Sep 13 16:00:30
The columns are: Queuetime Filtertime Runtime Totaltime.
The requests above were waiting 14 secs in the queue before being served, whereas the actual run time was about 1.5 secs. When you get rid of the queuing time, the performance will improve by a factor of 10. Increasing the number of threads as indicated above should help significantly.
For this test, setting up the "slow" pool will not improve the benchmark times; the pools make sure that "slow" requests do not kill the "fast" requests on your server. This is important to ensure liveliness of the server when the majority of requests is "fast" and only a few are "slow", and you do not want someone firing e.g. 100 "slow" requests to bring your server to its knees.
Increasing maxthreads from 10 (default) to e.g. 50 should increase the throughput by a factor of 5.
Hope, this helps
-gn
We appreciate your expertise in helping us tune our system. It has helped a lot!
I was successful in setting up the "slow" pool. I was wondering if there is a setting to only map a request to the slow pool if it takes over 30 or 40 seconds. I found that it started to put our effort reporting page on the slow pool when the system was a little stressed, so I removed the slow pool for the following tests for now.
I have been playing around with the configuration and we have decided to go up to 48 cores on our staging system with the 200 Gig of RAM.
Here is what I found to be somewhat optimal for the 48 cores.
We are not running out of available connections anymore, and the queued requests are a lot lower, as are the avg queue times. The k6 tests show that all http requests are status 200 as well.
I was wondering if you see any problems with the configuration and if you had suggestions for changes.
Again, thanks for your help
Marty

ns_section ns/server/${server}
ns_param maxconnections 2000 ;# 100; number of allocated connection structures
ns_param maxthreads 200 ;# 10; maximal number of connection threads
ns_param minthreads 200 ;# 1; minimal number of connection threads
ns_param connsperthread 10000 ;# 10000; number of connections (requests) handled per thread
  ;# Setting connsperthread to > 0 will cause the thread to
  ;# graciously exit, after processing that many
  ;# requests, thus initiating kind-of Tcl-level
  ;# garbage collection.
# ns_param threadtimeout 120 ;# 120; timeout for idle threads.
  ;# In case, minthreads < maxthreads, threads
  ;# are shutdown after this idle time until
  ;# minthreads are reached.
# ns_param lowwatermark 10 ;# 10; create additional threads above this queue-full percentage
ns_param highwatermark 100 ;# 80; allow concurrent creates above this queue-is percentage
  ;# 100 means to disable concurrent creates
ns_param rejectoverrun true ;# If set to true, NaviServer sends on queue overruns an error "503 service unavailable".
  ;# The request is not added to the waiting list.

# Compress response character data: ns_return, ADP etc.
# ns_param compressenable off ;# false, use "ns_conn compress" to override

ns_section ns/server/${server}/pools
ns_param monitor "Monitoring actions to check the health of the system"
ns_param image "Image file requests"
#ns_param slow "Slow requests"
ns_param fast "Fast requests"

ns_section ns/server/${server}/pool/monitor
ns_param minthreads 3
ns_param maxthreads 5
ns_param map "GET /admin/nsstats"
ns_param map "GET /SYSTEM"
ns_param map "GET /ds"
ns_param map "POST /ds"

ns_section ns/server/${server}/pool/image
ns_param minthreads 3
ns_param maxthreads 5
ns_param map "GET /*.png"
ns_param map "GET /*.jpg"
ns_param map "GET /*.PNG"
ns_param map "GET /*.JPG"
ns_param map "GET /*.JPEG"
ns_param map "GET /*.jpeg"
ns_param map "GET /*.mp4"
ns_param map "GET /*.MP4"
ns_param map "GET /*.pdf"
ns_param map "GET /*.PDF"
ns_param map "GET /*.gif"
ns_param map "GET /*.GIF"

#ns_section ns/server/${server}/pool/slow
#ns_param minthreads 15
#ns_param maxthreads 15

ns_section ns/server/${server}/pool/fast
ns_param minthreads 6
ns_param maxthreads 6
# Test w 48 CPU's 200 Gig RAM # maxconnections 2000 # maxthreads 200 # minthreads 200 # connsperthread 10000 # highwatermark 100 # compressenable off # rejectoverrun true 150 VUs Connection Threads: min 200 max 200 current 200 idle 151 stopping 0 waiting 0 Request Handling: requests 1.8K, queued 0 (0.00%), spooled 1K (63.55%) Request Timing: avg queue time 161.33µs, avg filter time 78.53ms, avg run time 212.77ms avg trace time 972.63µs 300 VUs Connection Threads: min 200 max 200 current 200 idle 157 stopping 0 waiting 0 Request Handling: requests 5.4K, queued 508 (9.46%), spooled 4K (66.81%) Request Timing: avg queue time 295.87ms, avg filter time 1.74s, avg run time 251.96ms avg trace time 1.1ms 150 VUs Connection Threads: min 200 max 200 current 200 idle 189 stopping 0 waiting 0 Request Handling: requests 7.1K, queued 627 (8.80%), spooled 5K (65.20%) Request Timing: avg queue time 260.7ms, avg filter time 2s, avg run time 255.49ms avg trace time 1.13ms 70 VUs Connection Threads: min 200 max 200 current 200 idle 141 stopping 0 waiting 0 Request Handling: requests 7.9K, queued 627 (7.98%), spooled 5K (65.70%) Request Timing: avg queue time 236.39ms, avg filter time 1.94s, avg run time 264.67ms avg trace time 1.12ms 10 VUs Connection Threads: min 200 max 200 current 200 idle 199 stopping 0 waiting 0 Request Handling: requests 8.4K, queued 627 (7.48%), spooled 6K (66.52%) Request Timing: avg queue time 221.7ms, avg filter time 1.85s, avg run time 275.66ms avg trace time 1.13ms K6 Summary running (4m09.3s), 000/300 VUs, 1359 complete and 38 interrupted iterations default ✓ [======================================] 000/300 VUs 4m0s ✓ status is 200 ✗ page succeeded ↳ 99% — ✓ 5542 / ✗ 8 █ setup █ teardown checks.........................: 99.92% ✓ 11092 ✗ 8 data_received..................: 1.3 GB 5.2 MB/s data_sent......................: 7.0 MB 28 kB/s http_req_blocked...............: avg=1.46ms min=1.27µs med=6.61µs max=51.42ms p(90)=4.1ms p(95)=4.87ms http_req_connecting............: avg=147.33µs min=0s med=0s max=48.09ms p(90)=376.22µs p(95)=442.16µs ✗ http_req_duration..............: avg=2.32s min=4.86ms med=224.41ms max=24.15s p(90)=8.5s p(95)=13.05s { expected_response:true }...: avg=2.32s min=4.86ms med=224.41ms max=24.15s p(90)=8.5s p(95)=13.05s http_req_failed................: 0.00% ✓ 0 ✗ 8355 http_req_receiving.............: avg=689.98µs min=39.82µs med=179.82µs max=70.57ms p(90)=1.71ms p(95)=1.89ms http_req_sending...............: avg=39.42µs min=8.56µs med=35.11µs max=15.19ms p(90)=66.52µs p(95)=76.12µs http_req_tls_handshaking.......: avg=1.28ms min=0s med=0s max=42.24ms p(90)=3.62ms p(95)=4.51ms http_req_waiting...............: avg=2.32s min=4.66ms med=223.99ms max=24.15s p(90)=8.5s p(95)=13.05s http_reqs......................: 8355 33.512465/s iteration_duration.............: avg=23.56s min=1.64µs med=22.73s max=51.14s p(90)=36.62s p(95)=38.12s iterations.....................: 1359 5.45104/s vus............................: 1 min=1 max=300 vus_max........................: 300 min=300 max=300
Hi Marty,
wondering ... there is a setting to only map a request to the slow pool if it is only over 30 or 40 seconds
Well, there should be absolutely NO foreground request taking longer than, say, 10 secs. Currently, the various thresholds are hard-coded in the request monitor (mapping to the slow pool, warnings for queue times and filter times, ..., mapping to color schemes) based on best practice from several large systems, but it is certainly possible to make these configurable in the future. The threshold for slow-pool mapping is 3 seconds. Note that in case your server has many such requests (which can't be made faster), one should define a larger "slow" pool. Requests in the "slow" pool are processed with the same efficiency as requests in a "fast" pool ("slow" and "fast" are just names); one can give the slow pool as many threads as you like.
I was wondering if you see any problems with the configuration and if you had suggestions for changes.
You have currently:
min max
default 200 200
monitor 3 5
image 3 5
slow 15 15
fast 6 6
200 min threads and 200 max threads for the default pool is a lot, but certainly, the right values depend on the application and on the machine you are using. When your jobs are CPU-intense, it can happen that concurrent jobs slow each other down more than the extra threads help. Many processors have a degressive clock rate (when many CPUs are busy, the temperature in the CPU goes up and the clock rate is reduced). Choosing the right CPU type is not so easy; often deactivating hyper-threading can speed things up (don't believe your vendor), etc. ... but this is a long story and takes us away from your question.
When the DB and nsd are on the same machine, don't forget that the DB also needs a lot of resources (CPU and memory). Every (active) connection thread will require a DB connection, so the resources available per SQL query (postgres process) shrink, the SQL queries become slower, etc.
My recommendation is: if you have multiple connection pools, and the default pool is responsible for most of the hard work, don't make it much larger than the number of cores you have. When you have long-running requests, check whether you can handle them via spooling or background processing.
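For illustration only (the queue and proc names below are made up), background processing could look like pushing long-running work into an ns_job queue instead of blocking a connection thread:

# Sketch: run a long report in the background (names are hypothetical).
if {"background" ni [ns_job queues]} {
    ns_job create background 2          ;# queue with at most 2 worker threads
}
ns_job queue -detached background {
    ::myapp::generate_effort_report     ;# hypothetical long-running proc
}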
From your configuration, I see a pool "fast", which is probably unused (and can be removed), since there is no mapping defined for it. I would also recommend using minthreads == maxthreads for the "image" pool. You should see from the per-pool statistics how many threads you need. Here is the output from our "fast" pool, which is used for images (with the given load, 6 threads can serve 210 requests per second in this pool); a minimal sketch of such a pool definition follows the output.
Connection Threads: min 6 max 6 current 6 idle 6 stopping 0 waiting 0 started 23 maxconnections 600
Request Handling: requests 194.5K, queued 149 (0.08%), spooled 124K (63.55%), dropped 0 possible-max-reqs 210.5rps
Request Timing: avg queue time 239.75µs, avg filter time 4.69ms, avg run time 22.1ms avg trace time 1.72ms
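For reference, a minimal sketch of an image pool with minthreads equal to maxthreads (the thread count is an assumption, to be derived from your own pool statistics):

ns_section ns/server/${server}/pool/image
ns_param minthreads 6
ns_param maxthreads 6   ;# min == max avoids thread create/exit churn under load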
We have on our production environment currently:
min max
default 60 60
bots 3 3
monitor 2 4
fast 6 6
slow 15 15
What I can see from your figures is that the filter time is very high, already for 150 VUs:
Request Timing: avg queue time 161.33µs, avg filter time 78.53ms, avg run time 212.77ms avg trace time 972.63µs
Typically, the filters are responsible for authorization and setup ... but seeing 78ms there, and 1.7s for 300 VUs, makes me wonder what's going on. Maybe you are running out of DB connections (as configured in the NaviServer configuration file), or you are running out of SQL connections on the DB side.
Check out the section DB-Pools in nsstats. It should look like the snippet below.
pool1: statements 99M gethandles 2.8M handles 170 connected 64 used 135 waittime 3.872729 sqltime 82712.668212 avgwaittime 1.4µs avgsqltime 835.8µs
pool2: statements 0 gethandles 2 handles 2 connected 0 used 1 waittime 0.002325 sqltime 0.000000 avgwaittime 1.2ms
pool3: statements 0 gethandles 2 handles 2 connected 0 used 1 waittime 0.002555 sqltime 0.000000 avgwaittime 1.3ms
If avgwaittime is high, you have found the problem. Newer OpenACS versions don't use pool2 and pool3 (they were used for nested SQL calls, which were troublesome for rollbacks).
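The number of handles per pool is set in the NaviServer configuration file; a minimal sketch (the value is just an example and should roughly match the number of active connection threads):

ns_section ns/db/pool/pool1
ns_param connections 70   ;# number of DB handles available to this pool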
Hope this helps!
-g
You were right, the avgwaittime on pool1 was high:
pool1: statements 1.1M gethandles 8.6K handles 15 connected 0 used 15 waittime 14693.460994 sqltime 1724.604281 avgwaittime 1.7s avgsqltime 1.5ms
The pool1 connections were still at 15. I increased them to 70 and brought our maxthreads down to 60.
Here is my test with this configuration
# 48 cores 200 Gig RAM
# maxconnections 2000
# maxthreads 60
# minthreads 60
# connsperthread 10000
# highwatermark 100
# compressenable off
# rejectoverrun true
# ns_section ns/db/pool/pool1 ns_param connections 70

150 VUs
Connection Threads: min 60 max 60 current 60 idle 33 stopping 0 waiting 0
Request Handling: requests 2.4K, queued 0 (0.00%), spooled 2K (64.39%)
Request Timing: avg queue time 167.94µs, avg filter time 4.87ms, avg run time 231.84ms avg trace time 996.02µs
pool1: statements 288.5K gethandles 2.6K handles 70 connected 5 used 34 waittime 0.477450 sqltime 447.052551 avgwaittime 186.3µs avgsqltime 1.5ms
300 VUs Connection Threads: min 60 max 60 current 60 idle 18 stopping 0 waiting 0 Request Handling: requests 7.4K, queued 2K (21.37%), spooled 5K (64.86%) Request Timing: avg queue time 87.8ms, avg filter time 8.85ms, avg run time 333.63ms avg trace time 3.23ms
pool1: statements 923.4K gethandles 7.6K handles 70 connected 17 used 61 waittime 0.998778 sqltime 1997.236181 avgwaittime 132.1µs avgsqltime 2.2ms
150 VUs Connection Threads: min 60 max 60 current 60 idle 7 stopping 0 waiting 0 Request Handling: requests 11.3K, queued 3K (29.84%), spooled 7K (66.01%) Request Timing: avg queue time 324.46ms, avg filter time 9.84ms, avg run time 416.1ms avg trace time 3.87ms
pool1: statements 1.5M gethandles 11.5K handles 70 connected 7 used 62 waittime 1.052769 sqltime 3914.273814 avgwaittime 91.8µs avgsqltime 2.5ms
70 VUs Connection Threads: min 60 max 60 current 60 idle 57 stopping 0 waiting 0 Request Handling: requests 13K, queued 3K (25.79%), spooled 9K (66.52%) Request Timing: avg queue time 280.5ms, avg filter time 9.11ms, avg run time 414.92ms avg trace time 3.51ms
pool1: statements 1.8M gethandles 13.2K handles 70 connected 57 used 62 waittime 1.054629 sqltime 4429.716496 avgwaittime 79.6µs avgsqltime 2.4ms
K6 Summary running (4m09.2s), 000/300 VUs, 2225 complete and 0 interrupted iterations default ✓ [======================================] 000/300 VUs 4m0s ✗ status is 200 ↳ 99% — ✓ 8889 / ✗ 1 ✗ page succeeded ↳ 99% — ✓ 8886 / ✗ 4 checks.........................: 99.97% ✓ 17775 ✗ 5 data_received..................: 2.1 GB 8.3 MB/s data_sent......................: 11 MB 44 kB/s http_req_blocked...............: avg=2.35ms min=1.27µs med=4.83µs max=1.85s p(90)=4.9ms p(95)=7.77ms http_req_connecting............: avg=762.21µs min=0s med=0s max=1.84s p(90)=340.97µs p(95)=392.74µs ✓ http_req_duration..............: avg=699.68ms min=4.73ms med=43.58ms max=5.8s p(90)=2.63s p(95)=3.23s { expected_response:true }...: avg=699.74ms min=4.73ms med=43.59ms max=5.8s p(90)=2.63s p(95)=3.23s http_req_failed................: 0.00% ✓ 1 ✗ 13342 http_req_receiving.............: avg=927.2µs min=42.66µs med=167.63µs max=1.63s p(90)=1.68ms p(95)=2.29ms http_req_sending...............: avg=32.23µs min=7.95µs med=28.46µs max=472.34µs p(90)=58.33µs p(95)=66.47µs http_req_tls_handshaking.......: avg=1.56ms min=0s med=0s max=37.52ms p(90)=4.48ms p(95)=7.27ms http_req_waiting...............: avg=698.72ms min=4.45ms med=43.41ms max=5.8s p(90)=2.63s p(95)=3.23s http_reqs......................: 13343 53.54756/s iteration_duration.............: avg=14.18s min=1.74µs med=12.87s max=22.24s p(90)=18.83s p(95)=19.99s iterations.....................: 2225 8.929275/s vus............................: 1 min=1 max=300 vus_max........................: 300 min=300 max=300
These numbers look a lot better! If you see anything that seems amiss let me know. We appreciate all of your help and expertise.
Sincerely,
Marty
It would be nice to be under 8 seconds in the 95th percentile....
NGINX setup .... 95th percentile were around 7 seconds.
Now the http_req_duration has p(95)=3.23s.
I see for 150 VUs no queued requests, but queuing starts with 300 VUs. With 300 VUs, the filter time has nearly doubled, and the runtime is 50% higher. So there is a resource bottleneck somewhere.
The avgwaittime seems stable now (enough connections), but interestingly it is much higher than in the numbers posted from our system ... and the avgsqltime goes up quickly from 150 to 300 VUs.
I would also recommend setting log_min_duration_statement to e.g. 1 sec and checking the SQL log file for slow SQL calls to identify potential DB bottlenecks.
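For example, in postgresql.conf (note that the value is in milliseconds, so 1000 corresponds to 1 second; treat this as a sketch, not a tuning recommendation):

log_min_duration_statement = 1000   # log every statement that runs longer than 1s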
Do you experience locks during the tests? Check in nsstats [1].
all the best
-gn
[1] admin/nsstats?@page=locks&reversesort=1&col=8
I did set the log_min_duration_statement to 1 second and found some queries to optimize. I then ran our K6 load test and we are down to 1.01 seconds in the 95th percentile! :)
I very much appreciate your master class on tuning naviserver.
Now that we have optimized one instance, I ran the nginx-with-3-NaviServers test and found that we are getting the same times with nginx serving 3 NaviServers, with the following config on each of the 3 NaviServers. However, I had not put in the clustering yet for that test.
# 48 CPU's 200 Gig RAM
# Nginx with 3 Naviservers
# maxconnections 1000
# maxthreads 20
# minthreads 20
# connsperthread 10000
# highwatermark 100
# compressenable off
# rejectoverrun true
# image pool 6/6
# ns_section ns/db/pool/pool1 ns_param connections 23
# DB on Same VM
There were three main reasons for us looking into nginx at the beginning of this (a rough sketch of the nginx side follows the list).
- To get over the 'All available connections are used up' errors that we would see from time to time. This is now solved! thank you :)
- If one naviserver fails (core dumps with signal 11) and restarts, nginx just marks that server as down and skips it in the round robin and we see no downtime
- Nginx can be set up to allow non-schema-change upgrades to happen in the middle of the working day without us restarting the server, and therefore with no downtime for our users. See: https://www.nginx.com/faq/how-does-zero-downtime-configuration-testingreload-in-nginx-plus-work/. Our developers and users would like this because we could make a bug fix and push it out during the day without restarting NaviServer, which takes about 1 minute.
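For illustration, the nginx side can look roughly like the following sketch (hostnames, ports, and fail-over values are placeholders, not our actual config):

upstream naviservers {
    server 10.0.0.11:8000 max_fails=3 fail_timeout=30s;
    server 10.0.0.12:8000 max_fails=3 fail_timeout=30s;
    server 10.0.0.13:8000 max_fails=3 fail_timeout=30s;
}
server {
    listen 443 ssl;
    location / {
        proxy_pass http://naviservers;            # round robin by default
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}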
Let me explain reason 2 a bit more, in case you have seen this error before. If you want me to open a new discussion thread I can. We have for some years now been getting a signal 11 core dump from time to time. It was not too big of a deal because NaviServer would restart by itself and we would only be down for 30 seconds or so back then. However, over the past year we have noticed it happening more frequently (probably 3 or 4 times a week), but we have not been able to track it down. In the log file the core dump looks like the following:
... [16/Sep/2021:08:59:39][1.7f36daffd700][-socks-] Fatal: received [16/Sep/2021:08:59:39][1.7f36e3fff700][-driver:nsssl_v4:0-] Notice: ... sockAccept accepted 2 connections [16/Sep/2021:08:59:39][1.7f36e3fff700][-driver:nsssl_v4:0-] Notice: ... sockAccept accepted 2 connections [16/Sep/2021:08:59:39][1.7f36daffd700][-socks-] Fatal: received fatal signal 11 ORA-24550: signal received: [si_signo=6] [si_errno=0] [si_code=-6] [si_int=0] [si_ptr=(nil)] [si_addr=0x1] kpedbg_dmp_stack()+396 -kpeDbgCrash()+204 -kpeDbgSignalHandler()+113 -skgesig_sigactionHandler()+258 -__sighandler() -gsignal()+203 -abort()+299[16/Sep/2021:08:59:48][1.7efe2e4f6800][-main:conf-] Notice: nsmain: NaviServer/4.99.21 (tar-4.99.21) starting ...
As you can see it is an oracle error from what looks like the oracle driver. Our main DB is postgres (99%) but we also connect up to an oracle DB for certain queries.
Again I can open another discussion for this item if you like. Just wanted you to know the three reasons we were looking into nginx.
Thanks again, for your willingness to share your expertise and for helping us scale/tune our server!
Sincerely, Marty

and we are down to 1.01 seconds in the 95th percentile!

Sounds good. Hopefully this holds not only for this synthetic test, but also for the real application.
The messages "all connection are used" is a condition, which can be used as well with all other servers, but naviserver tells the user what is going on.
NaviServer crashes are seldom (on our production site very few times a year, i see the last one from jan 18, most times in explainable cases). In case, you see many of these (3 cores per week qualifies as many), you should compile NaviServer and Tcl with -g an keep the cores for determining the cause.
For the snipped you are showing, It would be easy for me to say "it the problem of the oracle driver", ... but probably it is not. Oracle just tell the system that there was an abort() happening. It would be certainly interesting, where this abort() is coming from (via core).
bug fix and push it out during the day without restarting naviserver which can take about 1 minute to restart
When using OpenACS, you can use the reload feature of the package manager to reload packages without a restart. Why are you not using this?
I am not saying that horizontal scaling is a bad idea, but - depending on your application - it might be a longer way, especially for cache coherency. One has in NaviServer per-request caches, per-thread caches, and caches via ns_cache* and nsv, where ns_cache supports transaction semantics (if a complex transaction with API calls is performed, where the API calls cache some content, but the transaction fails, the cached content has to follow the first three letters of the ACID properties as well). Getting this correct for all packages is a longer way to go. The easiest approach would be to deactivate all caching in OpenACS, but this has some performance impact as well.
I will look into compiling naviserver with the -g option to see if we can generate a core file that can generate a backtrace with gdb.
We had started a standard some time ago to always restart the server after an upgrade because of some oddities some developers were seeing. We believe it was when a developer would instantiate a proc in the shell (for debug purposes) that the issue would arise. After talking to Tony about it we think that you are right and we should just use the "Reload" option during day releases
Your point is well taken about caching. It would be interesting to disable the caching for testing purposes. Is there a parameter that does this or a compile option? Or would we need to disable it for each type of cache by modifying the code?
Thanks for your help, Marty

I can run the install-ns.sh script to build NaviServer, but I do not see that it has a -g option. I can see that "make" is run inside the script with --enable-symbols though. Does this mean it has the debug symbols by default?
if [ $with_mongo = "1" ] ; then
    echo "------------------------ WITH MONGO"
    ./configure --enable-threads --enable-symbols \
        --prefix=${ns_install_dir} --exec-prefix=${ns_install_dir} --with-tcl=${ns_install_dir}/lib \
        --with-nsf=../../ \
        --with-mongoc=/usr/local/include/libmongoc-1.0/,/usr/local/lib/ \
        --with-bson=/usr/local/include/libbson-1.0,/usr/local/lib/
else
    ./configure --enable-threads --enable-symbols \
        --prefix=${ns_install_dir} --exec-prefix=${ns_install_dir} --with-tcl=${ns_install_dir}/lib
fi
${make}
${make} install

Marty
gcc -DPACKAGE_NAME=\"tdom\" -DPACKAGE_TARNAME=\"tdom\" -DPACKAGE_VERSION=\"0.9.1\" -DPACKAGE_STRING=\"tdom\ 0.9.1\" -DPACKAGE_BUGREPORT=\"\" -DPACKAGE_URL=\"\" -DHAVE_GETRANDOM=1 -DXML_DEV_URANDOM=1 -DXML_POOR_ENTROPY=1 -DBUILD_tdom=/\*\*/ -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1 -DHAVE_MEMMOVE=1 -DHAVE_BCOPY=1 -DXML_POOR_ENTROPY=1 -DUSE_THREAD_ALLOC=1 -D_REENTRANT=1 -D_THREAD_SAFE=1 -DTCL_THREADS=1 -DUSE_TCL_STUBS=1 -DUSE_TCLOO_STUBS=1 -DMODULE_SCOPE=extern\ __attribute__\(\(__visibility__\(\"hidden\"\)\)\) -DHAVE_HIDDEN=1 -DHAVE_CAST_TO_UNION=1 -D_LARGEFILE64_SOURCE=1 -DTCL_WIDE_INT_IS_LONG=1 -DUSE_TCL_STUBS=1 -DXML_DTD=1 -DXML_NS=1 -DTDOM_NO_UNKNOWN_CMD=1 -DUSE_NORMAL_ALLOCATOR=1 -I../expat -I../generic -I"/usr/local/ns/include" -I. -g -O2 -pipe -O2 -fomit-frame-pointer -DNDEBUG -Wall -fPIC -c `echo ../generic/domxpath.c` -o domxpath.o
The line you are showing is from compiling tdom. The install-ns script does not try to enforce the same compiler flags for all packages; one has to take care of this separately (already the Tcl packages have different flag definitions).
Assuming you have a recent install-ns configured for naviserver 4.99.22 and tcl8.6.11, the following commands make sure that Tcl + NaviServer are compiled with "-g":
cd /usr/local/src/naviserver-4.99.22
make clean all install "CFLAGS_OPTIMIZE=-O2 -g"
cd /usr/local/src/tcl8.6.11/unix
make clean all "CFLAGS_OPTIMIZE=-O2 -g"
(cd /usr/local/src/tcl8.6.11/unix/pkgs/thread2.8.6; make clean all "CFLAGS_DEFAULT=-O2 -g -DNS_AOLSERVER=1 -I/usr/local/ns/include")
make install
Is there a parameter that does this or a compile option?
No, ... and given the different caching mechanisms it would not be feasible in general (how should a call to nsv_* know by itself whether it is used for caching or not). For caching via ns_cache_* it would be relatively simple. Depending on the application, deactivating caching is still the lower-hanging fruit...
-g
By this you can estimate the performance impact when caching via ns_cache is deactivated for your site, which is relevant when considering horizontal scaling. Of course, the other forms of caching are not addressed by this.
Small update: the parameter for controlling the caching mode is now called "cachingmode", since this leaves more room for potential other future caching modes (e.g. some improved cluster mode or special caching nodes)
ns_section ns/parameters {
...
ns_param cachingmode none ;# experimental caching mode
...
}
The acs-cache-procs are updated such that the per-thread-cache works like a per-request-cache when the "cachingmode" is configured to "none".
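For illustration, a minimal sketch (not the actual acs-cache-procs code) of how application code can branch on this setting; the cache name and the compute proc are made up, and the cache is assumed to have been created elsewhere via ns_cache_create:

set cachingmode [ns_config ns/parameters cachingmode]   ;# empty string if not configured
if {$cachingmode eq "none"} {
    set value [compute_value $key]                                ;# no caching at all
} else {
    set value [ns_cache_eval mycache $key { compute_value $key }] ;# normal cached path
}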
Sorry for the late reply, I had been pulled away on migrating some servers. I did compile up the newest version back on the 22nd and set the nocache parameter. We did get acceptable speeds without caching from this particular test. I had not setup the cluster parameters yet though. Below are the results of the tests with and without nocache.
Note: These tests were run with the optimized queries that we fixed in our effort reporting package. When I ran the tests with the non-optimized, slower queries, the non-cached test was almost 1 second slower than the cached test in the 95th percentile.
with nocache = false
# Test #23 48 CPU's 200 Gig RAM # Nginx with 3 Naviservers # maxconnections 1000 # maxthreads 20 # minthreads 20 # connsperthread 10000 # highwatermark 100 # compressenable off # rejectoverrun true # image pool 6/6 # ns_section ns/db/pool/pool1 ns_param connections 23 # DB on Same VM # ns param nocache = false 150 VUs Connection Threads: min 20 max 20 current 20 idle 16 stopping 0 waiting 0 Request Handling: requests 858, queued 0 (0.00%), spooled 554 (64.57%) Request Timing: avg queue time 105.18µs, avg filter time 4.24ms, avg run time 161.33ms avg trace time 1.03ms pool1: statements 107.5K gethandles 970 handles 23 connected 7 used 13 waittime 0.165362 sqltime 99.706249 avgwaittime 170.5µs avgsqltime 927.5µs 300 VUs Connection Threads: min 3 max 5 current 3 idle 2 stopping 0 waiting 0 Request Handling: requests 3, queued 0 (0.00%), spooled 2 (66.67%) Request Timing: avg queue time 74.33µs, avg filter time 490.51ms, avg run time 19.45ms avg trace time 7.14ms pool1: statements 370.2K gethandles 2.9K handles 23 connected 7 used 20 waittime 0.246876 sqltime 371.667285 avgwaittime 85.4µs avgsqltime 1ms 150 VUs Connection Threads: min 20 max 20 current 20 idle 17 stopping 0 waiting 0 Request Handling: requests 4.7K, queued 0 (0.00%), spooled 3K (66.18%) Request Timing: avg queue time 130.8µs, avg filter time 4.27ms, avg run time 201.81ms avg trace time 1.15ms pool1: statements 672.1K gethandles 4.9K handles 23 connected 15 used 20 waittime 0.248934 sqltime 686.242544 avgwaittime 51.2µs avgsqltime 1ms 70 VUs Connection Threads: min 3 max 5 current 3 idle 2 stopping 0 waiting 0 Request Handling: requests 5, queued 0 (0.00%), spooled 4 (80.00%) Request Timing: avg queue time 73.2µs, avg filter time 296.22ms, avg run time 16.18ms avg trace time 4.63ms pool1: statements 764.7K gethandles 5.4K handles 23 connected 15 used 20 waittime 0.249513 sqltime 773.483103 avgwaittime 46µs avgsqltime 1ms K6 summary running (4m09.2s), 000/300 VUs, 2757 complete and 0 interrupted iterations default ✓ [======================================] 000/300 VUs 4m0s ✗ status is 200 ↳ 99% — ✓ 11022 / ✗ 2 ✗ page succeeded ↳ 99% — ✓ 11022 / ✗ 2 █ setup █ teardown checks.........................: 99.98% ✓ 22044 ✗ 4 data_received..................: 2.5 GB 10 MB/s data_sent......................: 9.7 MB 39 kB/s http_req_blocked...............: avg=80.69µs min=0s med=4.3µs max=32.55ms p(90)=6.31µs p(95)=7.37µs http_req_connecting............: avg=11.55µs min=0s med=0s max=28.93ms p(90)=0s p(95)=0s ✓ http_req_duration..............: avg=207.03ms min=0s med=34.16ms max=1.45s p(90)=946.89ms p(95)=1s { expected_response:true }...: avg=207.05ms min=5.21ms med=34.17ms max=1.45s p(90)=946.89ms p(95)=1s http_req_failed................: 0.01% ✓ 2 ✗ 16535 http_req_receiving.............: avg=519.41µs min=0s med=115.54µs max=63.53ms p(90)=1.37ms p(95)=1.56ms http_req_sending...............: avg=26.43µs min=0s med=23.84µs max=855.82µs p(90)=43.09µs p(95)=50.97µs http_req_tls_handshaking.......: avg=63µs min=0s med=0s max=29.88ms p(90)=0s p(95)=0s http_req_waiting...............: avg=206.48ms min=0s med=33.99ms max=1.45s p(90)=945.49ms p(95)=1s http_reqs......................: 16537 66.369066/s iteration_duration.............: avg=11.23s min=1.61µs med=11.24s max=13.28s p(90)=11.35s p(95)=11.39s iterations.....................: 2757 11.064855/s vus............................: 1 min=1 max=300 vus_max........................: 300 min=300 max=300
nocache = true
# Test #22 48 CPU's 200 Gig RAM # Nginx with 3 Naviservers # maxconnections 1000 # maxthreads 20 # minthreads 20 # connsperthread 10000 # highwatermark 100 # compressenable off # rejectoverrun true # image pool 6/6 # ns_section ns/db/pool/pool1 ns_param connections 23 # DB on Same VM # ns param nocache = true 150 VUs Connection Threads: min 20 max 20 current 20 idle 15 stopping 0 waiting 0 Request Handling: requests 836, queued 0 (0.00%), spooled 539 (64.47%) Request Timing: avg queue time 100.37µs, avg filter time 6.93ms, avg run time 157.66ms avg trace time 1.25ms pool1: statements 117.6K gethandles 938 handles 23 connected 6 used 13 waittime 0.170185 sqltime 96.248608 avgwaittime 181.4µs avgsqltime 818.6µs 300 VUs Connection Threads: min 20 max 20 current 20 idle 10 stopping 0 waiting 0 Request Handling: requests 2.8K, queued 9 (0.33%), spooled 2K (64.85%) Request Timing: avg queue time 166.06µs, avg filter time 7.66ms, avg run time 187.04ms avg trace time 1.45ms pool1: statements 407.8K gethandles 2.9K handles 23 connected 8 used 21 waittime 0.267667 sqltime 375.288688 avgwaittime 93.4µs avgsqltime 920.3µs 150 VUs Connection Threads: min 20 max 20 current 20 idle 12 stopping 0 waiting 0 Request Handling: requests 4.7K, queued 10 (0.21%), spooled 3K (65.63%) Request Timing: avg queue time 167.3µs, avg filter time 7.76ms, avg run time 202.97ms avg trace time 1.51ms pool1: statements 739K gethandles 4.8K handles 23 connected 12 used 21 waittime 0.269563 sqltime 694.840813 avgwaittime 56.1µs avgsqltime 940.3µs 70 VUs Connection Threads: min 20 max 20 current 20 idle 18 stopping 0 waiting 0 Request Handling: requests 5.3K, queued 10 (0.19%), spooled 3K (65.96%) Request Timing: avg queue time 160.41µs, avg filter time 7.7ms, avg run time 205.11ms avg trace time 1.5ms pool1: statements 846.7K gethandles 5.4K handles 23 connected 17 used 21 waittime 0.270101 sqltime 787.473052 avgwaittime 50.2µs avgsqltime 930.1µs K6 Summary running (4m10.2s), 000/300 VUs, 2736 complete and 0 interrupted iterations default ✓ [======================================] 000/300 VUs 4m0s ✓ status is 200 ✓ page succeeded █ setup █ teardown checks.........................: 100.00% ✓ 21888 ✗ 0 data_received..................: 2.5 GB 10 MB/s data_sent......................: 9.6 MB 38 kB/s http_req_blocked...............: avg=81.66µs min=1.6µs med=4.39µs max=39.67ms p(90)=6.49µs p(95)=7.68µs http_req_connecting............: avg=12.47µs min=0s med=0s max=35.3ms p(90)=0s p(95)=0s ✓ http_req_duration..............: avg=218.86ms min=7.34ms med=37.47ms max=1.55s p(90)=978.96ms p(95)=1.04s { expected_response:true }...: avg=218.86ms min=7.34ms med=37.47ms max=1.55s p(90)=978.96ms p(95)=1.04s http_req_failed................: 0.00% ✓ 0 ✗ 16418 http_req_receiving.............: avg=526.47µs min=33.71µs med=121.17µs max=22.53ms p(90)=1.39ms p(95)=1.58ms http_req_sending...............: avg=27.35µs min=9.2µs med=24.76µs max=188.41µs p(90)=45.25µs p(95)=53.62µs http_req_tls_handshaking.......: avg=62.98µs min=0s med=0s max=30.12ms p(90)=0s p(95)=0s http_req_waiting...............: avg=218.3ms min=7.26ms med=37.27ms max=1.55s p(90)=977.52ms p(95)=1.04s http_reqs......................: 16418 65.62951/s iteration_duration.............: avg=11.3s min=1.7µs med=11.31s max=13.26s p(90)=11.45s p(95)=11.49s iterations.....................: 2736 10.936919/s vus............................: 1 min=1 max=300 vus_max........................: 300 min=300 max=300
Also, thanks for adding the cachingmode. It would be good to run some tests on that as well. We would appreciate any help and insights you can give on how to set up a special caching mode for clustering. One thing we wanted to test was adding ::acs::clusterwide to the nsv_unset in ad_parameter_cache. Are there other places you would recommend adding ::acs::clusterwide to for our testing of the cluster? Or any other directions you might suggest?
Thanks for all your help, -Marty

We got a signal 11 core today on our live system. I was able to connect up to it with gdb and get a backtrace. Here is the backtrace.
Type "apropos word" to search for commands related to "word"... Reading symbols from /usr/local/ns/bin/nsd... ... Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". --Type RET for more, q to quit, c to continue without paging-- Core was generated by `/usr/local/ns/bin/nsd -u root -g web -i -t /web/etc/config.tcl'. Program terminated with signal SIGSEGV, Segmentation fault. #0 __GI_abort () at abort.c:107 107 abort.c: No such file or directory. [Current thread is 1 (Thread 0x7fcc74ff9700 (LWP 37))] (gdb) bt #0 __GI_abort () at abort.c:107 #1 0x00007fccba4b3064 in skgdbgcra () from /opt/oracle/instantclient_21_1/libclntsh.so.21.1 #2 0x00007fccba480693 in skgesigCrash () from /opt/oracle/instantclient_21_1/libclntsh.so.21.1 #3 0x00007fccba4809de in skgesig_sigactionHandler () from /opt/oracle/instantclient_21_1/libclntsh.so.21.1 #4 signal handler called #5 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50 #6 0x00007fccc2292859 in __GI_abort () at abort.c:79 #7 0x00007fccc24cf27f in Panic (fmt=optimized out) at log.c:943 #8 0x00007fccc215bf8d in Tcl_PanicVA () from /usr/local/ns/lib/libtcl8.6.so #9 0x00007fccc215c0ff in Tcl_Panic () from /usr/local/ns/lib/libtcl8.6.so #10 0x00007fccc25143ca in Abort (signal=optimized out) at unix.c:1119 #11 signal handler called #12 0x00007fccc24b825c in NsTclConnChanProc (UNUSED_sock=optimized out, arg=0x7fcbf581d460, why=1) at connchan.c:694 #13 0x00007fccc24e279d in SockCallbackThread (UNUSED_arg=optimized out) at sockcallback.c:531 #14 0x00007fccc2210fd4 in NsThreadMain (arg=optimized out) at thread.c:232 #15 0x00007fccc221169f in ThreadMain (arg=optimized out) at pthread.c:870 #16 0x00007fccc1b72609 in start_thread (arg=optimized out) at pthread_create.c:477 #17 0x00007fccc238f293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95Any Ideas on what could be causing the signal 11 issue? If you want me to open a new discussion on it I can.. Thanks! Marty
We will work on getting on this latest release for our live system.
In looking closer at the docker image that I build with naviserver it appears that we are built with naviserver-4.99.22. In my /usr/local/src/ directory the 'wget' from 'install-ns.sh build' pulled down naviserver-4.99.22.tar.gz and unpacked it and compiled it.
Once this was done I ran the commands you gave me to clean and compile again with debug symbols
cd /usr/local/src/naviserver-4.99.22
make clean all install "CFLAGS_OPTIMIZE=-O2 -g"
cd /usr/local/src/tcl8.6.11/unix
make clean all "CFLAGS_OPTIMIZE=-O2 -g"
(cd /usr/local/src/tcl8.6.11/unix/pkgs/thread2.8.6; make clean all "CFLAGS_DEFAULT=-O2 -g -DNS_AOLSERVER=1 -I/usr/local/ns/include")
make install
Also when we hit our live site with Chrome browser it reports in the HTML headers that it is using Server: NaviServer/4.99.22.
In our docker image (which has starting base of Ubuntu 20.04) I also brought down gdb debugger and that is what I used to get the backtrace along with the nsd that was compiled and installed in /usr/local/ns/bin/nsd
gdb /usr/local/ns/bin/nsd /usr/local/ns/core_1/core.1
Was this the right nsd executable to run with the core file? Any idea how we could have gotten the wrong debug symbols?
My doubts come from the fact that the traceback shows that the signal is generated from connchan.c in line 694
#12 0x00007fccc24b825c in NsTclConnChanProc (UNUSED_sock=optimized out, arg=0x7fcbf581d460, why=1) at connchan.c:694
but this can't be from 4.99.22 (see [1]); it rather looks to be from 4.99.19 [2] or older. I've double-checked: the tar file delivered from sourceforge is correct.
[1] https://bitbucket.org/naviserver/naviserver/src/naviserver-4.99.22/nsd/connchan.c#lines-694
[2] https://bitbucket.org/naviserver/naviserver/src/naviserver-4.99.19/nsd/connchan.c#lines-694
I double-checked our connchan.c and it is indeed the right file for 4.99.22. Line 694 is the "return success" in the NsTclConnChanProc function on our system, too.
I believe I have the right executable and core file for it
I found that I can do a 'bt full' with gdb and it gives you the variables as well. It looks like servPtr is set to a memory location that it does not own in the NsTclConnChanProc function.
It looks like gdb is not returning the exact line number of the NsTclConnChanProc function where servPtr had the problem, but it does look like all of the other information is correct.
#10 0x00007fccc25143ca in Abort (signal=optimized out) at unix.c:1119 No locals. #11 signal handler called No locals. #12 0x00007fccc24b825c in NsTclConnChanProc (UNUSED_sock=optimized out, arg=0x7fcbf581d460, why=1) at connchan.c:694 whenBuffer = "r\000\314\177\000" servPtr = error reading variable servPtr (Cannot access memory at address 0x646e65732d203140) cbPtr = 0x7fcbf581d460 success = true #13 0x00007fccc24e279d in SockCallbackThread (UNUSED_arg=optimized out) at sockcallback.c:531 stop = optimized out pollTimeout = optimized out nfds = optimized out now = {sec = 1632772110, usec = 116003} diff = {sec = 0, usec = 0} c = 0 '\000' when = {32716, 2, 20} events = {1, 4, 2} n = optimized out i = 0 isNew = 0 maxPollfds = 100 cbPtr = 0x7fcbf539f940 nextPtr = optimized out hPtr = optimized out search = {tablePtr = 0x7fccc2565640 activeCallbacks, nextIndex = 12, nextEntryPtr = 0x0} pfds = 0x7fcc640012b0 #14 0x00007fccc2210fd4 in NsThreadMain (arg=optimized out) at thread.c:232 thrPtr = 0x7fcc94f231e0 #15 0x00007fccc221169f in ThreadMain (arg=optimized out) at pthread.c:870 --Type RET for more, q to quit, c to continue without paging--1 No locals. #16 0x00007fccc1b72609 in start_thread (arg=optimized out) at pthread_create.c:477 ret = optimized out pd = optimized out unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140516112963328, 1098100496289630482, 140517183382942, 140517183382943, 140517183383088, 140516112961280, -1106243481039927022, -1106548523360057070}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}} not_first_call = 0 #17 0x00007fccc238f293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
It looks like gdb is not returning the exact line number of the NsTclConnChanProc
Probably. Maybe the function pointer and the optimization contribute to the confusion. One should turn off optimization and turn on assertions, so it becomes easier to identify the root cause.
Since the flag "-DNDEBUG" is inherited from Tcl, the easiest thing is to remove it manually from the Makefile, e.g. with sed, as shown below. Note that I have also adjusted the optimization flags.
All the best
-g
cd /usr/local/src/naviserver-4.99.22
sed -i 's/-DNDEBUG//' include/Makefile.global
make clean all install "CFLAGS_OPTIMIZE=-O0 -g"
Concerning your original question about clustering: I've done some work over the weekend. If one is using a recent version of NaviServer (at least 4.99.20), the parameter::* interface should now be cluster-proof (I added acs::clusterwide for cleaning and callbacks), and it does no caching when the caching mode is set to "none". I've also cleaned up xo::clusterwide vs. acs::clusterwide (the former is gone). When running the regression test with cachingmode "none", some errors show up, but these are probably just due to tests that check whether caching works (not sure about the client-property errors, but I am running out of time right now).
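To make this concrete: the caching mode is switched in the NaviServer configuration file. The section path and parameter name in the sketch below are from memory, so please verify them against the current sample openacs-config.tcl before relying on them:

    # Sketch only: run OpenACS without node-local caching.
    # The section path ns/server/${server}/acs and the parameter name
    # "cachingmode" are assumptions; check the sample openacs-config.tcl.
    ns_section ns/server/${server}/acs
    ns_param cachingmode none    ;# instead of the default node-local caching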
Concerning cluster modes, I see the following directions:
Classical OpenACS clustering: The basic idea is to set up intra-server talk to achieve cache coherency. When some data is flushed in the cache of one server, it communicates this to the other servers in the cluster (via HTTP requests). This is a multi-master mode, where every server talks to the same database. The NaviServers are symmetrical; one of them is responsible for scheduled procedures. When files are stored in the content repository, the content repository directory has to be shared between the instances. OpenACS 5.10 contains experimental support for storing content repository files in GridFS (which scales horizontally, based on MongoDB). Some older packages that bypass the content-repository API still have to be adjusted to use the API.
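As an illustration of what this intra-server talk amounts to, here is only a sketch, not the actual OpenACS code; the proc name and the /SYSTEM/cluster-flush endpoint are made up. A flush executed on one node is replayed on every peer via an HTTP request:

    # Hypothetical sketch of the classical cluster-flush idea.
    # "peers" is a list of host:port entries (cf. ClusterPeerIP),
    # "cmd" is the flush command, e.g. {nsv_unset -nocomplain ad_param_123 foo}.
    proc cluster_flush_demo {peers cmd} {
        {*}$cmd                       ;# perform the flush locally first
        foreach peer $peers {
            # short timeout so a slow or dead node does not stall this request
            catch {
                ns_http run -timeout 1.0 \
                    http://$peer/SYSTEM/cluster-flush?cmd=[ns_urlencode $cmd]
            }
        }
    }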
However, the classical OpenACS clustering mode was never fully implemented, but it might still work for certain applications. One could extend this clustering mode and make it "work as designed". Several cluster-flush operations are missing, and probably some callbacks have to be executed cluster-wide as well (e.g., parameter changes), etc.
Known weaknesses: there is a potential scalability issue. When, e.g., a single request requires 100 flush operations (not unusual) and 10 nodes are defined in the cluster, a single incoming request will cause 1,000 additional HTTP requests for flushing. This can probably be improved via batching (bundling multiple flush requests into one HTTP request; see the sketch below). Since the flushes are HTTP requests, they need very short timeouts; otherwise they will block for a while when some nodes are down or slow, some connections might "hang" in the middle of a request, etc. After a network separation, a full cache flush is required (but this is not implemented).
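A batched variant could look roughly like the following (again just a sketch with made-up names): flush commands are collected during the request and sent to each peer in a single HTTP call at the end of the request.

    # Hypothetical batching sketch: queue flush commands per request and
    # forward them to each peer in one POST, e.g. from a trace that runs
    # at the end of the request.
    proc cluster_flush_queue {cmd} {
        lappend ::cluster_pending_flushes $cmd
    }
    proc cluster_flush_send {peers} {
        if {![info exists ::cluster_pending_flushes]} {
            return
        }
        set payload [join $::cluster_pending_flushes \n]
        foreach peer $peers {
            catch {
                ns_http run -timeout 1.0 -method POST -body $payload \
                    http://$peer/SYSTEM/cluster-flush-batch
            }
        }
        unset ::cluster_pending_flushes
    }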
Classical OpenACS clustering via UDP: This can reduce the clusterwide operation latency (no TCP setup times); one could consider multicast operations in the cluster to reduce the intra-server traffic substantially - the basic architecture will be the same.
Quick test of intra-server talk to a server on the same host (my notebook, localhost, mean value from 1000 requests):

    TCP  0.273 ms
    UDP  0.171 ms
    UDP  0.059 ms (no reply)

So, flushing operations can be made faster by a factor of 4 or more by using UDP instead of TCP.
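For reference, such figures can be reproduced with Tcl's built-in "time" command; the URL below is only a placeholder for whatever endpoint the peer exposes:

    # Sketch: mean round-trip time over 1000 HTTP requests to a peer;
    # /SYSTEM/cluster-flush is a placeholder endpoint.
    set us [lindex [time {
        ns_http run -timeout 1.0 http://127.0.0.1:8101/SYSTEM/cluster-flush
    } 1000] 0]
    ns_log notice "mean round trip: [format %.3f [expr {$us / 1000.0}]] ms"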
Clustering with no-caching: if nothing is cached (except maybe per-request caches), there will be no problem with cache-coherency, but there will be more load on the DB. The big advantage of no-caching is that the intra-server talk to preserve cache coherency is zero. For several types of applications, this will be the simplest setup.
Clustering via caching server: One could set up a system where the caching is done via a (central) caching server (e.g., Redis, memcached, or a special-purpose NaviServer) instead of NaviServer's native caching. If there is a central cache, no intra-server talk is needed to achieve cache coherency. Disadvantages: NaviServer's native caching is significantly more efficient than memcached/Redis, and it supports cache transactions (rollback and commit of cache settings). One could get this back by using one NaviServer instance as a purely caching instance ... but this requires deeper analysis. Some figures:
NaviServer native cache performance: 0.33 µs == 0.00033 ms

    time {ns_cache_eval ns:memoize 1 {set x 1}} 100000

Redis (https://k6.io/blog/benchmarking-redis-with-k6/):

    iteration_duration...: avg=782.57µs min=67.35µs med=732.92µs max=15.86ms p(90)=1.1ms p(95)=1.3ms
    iterations...........: 506755 50660.636169/s
    redis_latency........: avg=764.8µs min=0s med=1ms max=16ms p(90)=1ms p(95)=1ms

    with 100 VUs:
    iteration_duration...: avg=4.39ms min=46.8µs med=3.32ms max=87.24ms p(90)=9.5ms p(95)=12.51ms
Assuming 1 ms latency for an external cache, NaviServer's internal cache is about 3000x faster than going to an external cache. So, as long as the intra-server talk is not overwhelming, one should get the best performance from OpenACS clustering via UDP - unless more options pop up ... but getting the majority of the OpenACS packages "cluster-safe" will take some time and debugging.
I am very interested in testing out the "clustering with cachingmode=none" option. From what you have mentioned above, it sounds like this option exists in 4.99.22 now. Is this correct? If so, I could pull down the latest 4.99.22, run a k6 test with cachingmode=none, and set up the clustering parameters to establish a baseline. Then maybe compare that to running with no cachingmode set.
The other options also would be interesting to test out.
Thanks for taking the time to implement the cachingmode=none
Sincerely, Marty

it sounds like this option currently exists in 4.99.22 now. Is this correct?
No-caching (cachingmode=none) requires the head version of NaviServer from the Bitbucket repository. To get this, set "version_ns=HEAD" in install_ns.sh.
The other options also would be interesting to test out.

I'll keep you updated when this progresses. For us the winter semester starts right now, so this is a very busy time.
Thanks for taking the time to implement the cachingmode=none

Note that you also need the newest OpenACS code from the oacs-5-10 branch. Let me know how this works out for you. I would not be surprised if there is some more caching that should be bypassed, but such cases should be relatively easy to fix.
ns_cache operations saved since the start of the server: 18m 39s on 286.69K requests (3.9 ms per request on average)
3.9 ms is the average over all requests, including static pages, for which the impact is most likely lower than for content pages that cache the results of DB requests. Some particular caches, which are not used on every request, might yield much larger savings on a hit, so there will be some spread.
Thanks for adding the nsstats feature for average time saved with caching. I will have to check it out
I found that the proc ad_canonical_server_p in server-cluster-procs.tcl did not work for us, as we are using https (port 443). Included below is a diff of how I made it work. Feel free to modify as needed. Note: we use nsssl_v4 in our config.tcl.
Thanks, Marty

    diff --git a/packages/acs-tcl/tcl/server-cluster-procs.tcl b/packages/acs-tcl/tcl/server-cluster-procs.tcl
    index 5cafb058d..882bd562a 100644
    --- a/packages/acs-tcl/tcl/server-cluster-procs.tcl
    +++ b/packages/acs-tcl/tcl/server-cluster-procs.tcl
    @@ -120,13 +120,26 @@ ad_proc -private ad_canonical_server_p {} {
             set canonical_ip $canonical_server
         }
     
    -    set driver_section [ns_driversection -driver nssock]
    +    if { $canonical_port eq "443" } {
    +        set driver_section [ns_driversection -driver nsssl_v6]
    +    } else {
    +        set driver_section [ns_driversection -driver nssock_v6]
    +    }
    +
    +    if { $driver_section eq "" && $canonical_port eq "443" } {
    +        set driver_section [ns_driversection -driver nsssl_v4]
    +    } else {
    +        set driver_section [ns_driversection -driver nssock_v4]
    +    }
    +
         if { [ns_config $driver_section Address] == $canonical_ip
    -         && [ns_config $driver_section Port 80] == $canonical_port
    +         && [ns_config $driver_section Port] == $canonical_port
         } {
    +        ns_log notice "Is CanonicalServer"
             return 1
         }
    +    ns_log notice "Not CanonicalServer"
         return 0
     }
Concerning the driver names (nssock_v4, nssock_v6, nsssl_v4, nsssl_v6): the *_v4 and *_v6 names are deprecated and not used in the recommended configuration file; the plain communication driver is just called "nssock", the secure one "nsssl" [1]. The necessity for that somewhat hacky suffix was removed with the introduction of NaviServer 4.99.19 (1.5 years ago), which supports multiple IP addresses registered to a single driver [2]. Unless you have to support older versions of NaviServer, I would recommend switching to the suffix-less names (see the sketch below).
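For readers following along, a rough sketch of the suffix-less driver sections (the parameter names follow the sample configuration referenced in [1]; addresses, ports, and the certificate path are placeholders, and the exact section layout may differ in your setup):

    # Sketch only: suffix-less nssock/nsssl driver sections for NaviServer
    # >= 4.99.19, where a single driver can listen on several addresses.
    ns_section ns/server/${server}/module/nssock
    ns_param address {127.0.0.1 ::1}   ;# IPv4 and IPv6 on one driver
    ns_param port    8000

    ns_section ns/server/${server}/module/nsssl
    ns_param address {127.0.0.1 ::1}
    ns_param port    8443
    ns_param certificate /usr/local/ns/etc/server.pem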
When I have some more time, I should set up a cluster for testing; I see many things in the old server-cluster-procs.tcl that should be streamlined and modernized. The code with the driver-name suffixes should not go into the common repository.
In case you want to run the intra-server talk via SSL, this will slow down the communication relative to the figures I've posted above, so I would not recommend it. But certainly, if a company policy requires that all communications (including intranet) be secure, the code should support it. Otherwise, restricting access to the HTTP port via a firewall to a few selected hosts should be OK.
-g
[1] https://bitbucket.org/naviserver/naviserver/src/naviserver-4.99.22/openacs-config.tcl
[2] https://bitbucket.org/naviserver/naviserver/src/naviserver-4.99.22/NEWS#lines-489
I was not aware of those changes to config.tcl. Thanks for pointing out the configuration changes and providing an example of an up to date config.tcl configuration.
Our company does require https for our web server communications so we will keep that in mind with our testing.
We appreciate whatever help you can give, knowing that this is your busy time.
Thanks,
Marty
Note that you also need the newest OpenACS code from the oacs-5-10 branch. Let me know how this works out for you.
Do I need to download it directly from CVS somehow? I am unsure how to download the latest oacs-5-10 from CVS.
I went to https://openacs.org/projects/openacs/download/ and downloaded the OpenACS 5.10 and 5.10 beta tar files, but it does not look like the changes you were talking about are there.
parameter::* interface should be now cluster proof (added acs::clusterwide for cleaning and callbacks)

I was expecting to see an acs::clusterwide inside the ad_parameter_cache proc. Where exactly did you make your changes? Marty
[1] https://cvs.openacs.org/changelog/OpenACS?cs=oacs-5-10%3Agustafn%3A20211004093700
Just now, I have committed an overhauled and streamlined version of the caching infrastructure to oacs-5-10 in CVS. Everything now uses the new infrastructure, and the code should now be more in line with modern NaviServer capabilities (IPv6, multiple IP addresses and multiple ports per server, global vs. local network drivers, etc.).
Since the cluster servers are provided as IP addresses, there is no configuration of the protocol (HTTP vs. HTTPS); so for the time being, all servers have to listen (in addition to the public port) on a plain HTTP port for cluster operations. This restriction can probably be lifted in the future, but first the basics should work correctly.
In my setup, I have essentially set CanonicalServer to "127.0.0.1:8100" and ClusterPeerIP to "127.0.0.1:8100 127.0.0.1:8101 127.0.0.1:8102", and this now works for the most part. When you test the cluster configuration, please update to the newest code in oacs-5-10 to avoid the problems I faced.
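In case it helps for your testing, the cluster-related kernel parameters can also be set from Tcl (e.g., via ds/shell or an install script). This is just a sketch using the values from my setup; adjust hosts and ports for your cluster:

    # Sketch: setting the cluster kernel parameters programmatically.
    set kernel_id [ad_acs_kernel_id]
    parameter::set_value -package_id $kernel_id -parameter ClusterEnabledP -value 1
    parameter::set_value -package_id $kernel_id -parameter CanonicalServer -value "127.0.0.1:8100"
    parameter::set_value -package_id $kernel_id -parameter ClusterPeerIP \
        -value "127.0.0.1:8100 127.0.0.1:8101 127.0.0.1:8102"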
all the best
-g
I will pull the latest and try it. I did test the previous version with cachingmode=none against the latest OpenACS 5.10 changes as of ~Oct 10th. It worked great in my testing: 1.59 s at the 95th percentile, which I thought was great.
I also verified that there was no caching of parameters.
I will let you know what I find with the latest pull.
Thanks! Marty