Forum OpenACS Q&A: Re: NaviServer stops serving connections when slow external api calls pile up.

After finding out that we had turned on the http logging a while ago, I went through and pulled out all of the http_log entries that correspond to the error_log entries I posted earlier and have listed them here along side the error_log entries. When comparing these entries note that the timing in the error_log is reported in milliseconds and the timing in the http_log is in seconds.

A couple of things of interest.

  • Among the entries that Marty posted above, most of them show that the request did time out but, long after the specified expire time of 500ms. This isn't expected behavior, is it?
  • The requests that show as succeeding in the error_log show a status of ok in the http_log (even the one that took over 3 seconds, weird) and those that show a timeout in the error_log show a socktimeout in the http_log. So the two logs seem to match well.
  • The request that errored-out with the ssl error did not have an entry in the http_log which makes sense if it is expected that the log entry occurs after a request completes, whether or not it was successful.
######  -- Correct timeout
## error_log
[04/Jan/2023:05:07:44][1.7fb883fff700][-conn:east_intra:default:19:4500-] Notice: pkg-notices: in_box_notices; run_num: 03A965C1; end   run time: 2071 ms; call: wt::jira::notice::get_notice; dur: 534 ms
## http_log
[04/Jan/2023:05:07:44 -0700] -conn:east_intra:default:19:4500- 408 GET https://target.site.url 0.518629 250 0 socktimeout

######  -- Correct timeout
## error_log
[04/Jan/2023:10:28:12][1.7fb8cfc7e700][-conn:east_intra:default:4:64408-] Notice: pkg-notices: in_box_notices; run_num: D9D43629; end   run time: 2162 ms; call: wt::jira::notice::get_notice; dur: 532 ms
## http_log
[04/Jan/2023:10:28:12 -0700] -conn:east_intra:default:4:64408- 408 GET https://target.site.url 0.515892 249 0 socktimeout

######  -- Successful run NO timeout
## error_log
[04/Jan/2023:13:18:42][1.7fb8cdc7a700][-conn:east_intra:default:8:108041-] Notice: pkg-notices: in_box_notices; run_num: DDF11C73; end   run time: 2034 ms; call: wt::jira::notice::get_notice; dur: 428 ms
## http_log
[04/Jan/2023:13:18:42 -0700] -conn:east_intra:default:8:108041- 200 GET https://target.site.url 0.412023 255 1022 ok

######  -- Bad run, does not appear to timeout but has a duration of longer than 500 ms and no timeout  It is possible that the http call timed out but, something else got stuck. Wat!
## error_log
[04/Jan/2023:13:18:55][1.7fb8a6ffd700][-conn:east_intra:default:21:108096-] Notice: pkg-notices: in_box_notices; run_num: 4F6A7C4C; end   run time: 5552 ms; call: wt::jira::notice::get_notice; dur: 3844 ms {{no timeout?}}
## http_log
[04/Jan/2023:13:18:55 -0700] -conn:east_intra:default:21:108096- 200 GET https://target.site.url 3.828102 251 796 ok

######  -- Bad run timed out but long after the 500 ms
## error_log
[04/Jan/2023:13:19:55][1.7fb8cec7c700][-conn:east_intra:default:5:108393-] Notice: pkg-notices: in_box_notices; run_num: 2C241021; end   run time: 11185 ms; call: wt::jira::notice::get_notice; dur: 9623 ms
## http_log
[04/Jan/2023:13:19:55 -0700] -conn:east_intra:default:5:108393- 408 GET https://target.site.url 9.607222 248 0 socktimeout

######  -- Bad run timed out but long after the 500 ms
## error_log
[04/Jan/2023:13:20:02][1.7fb8cdc7a700][-conn:east_intra:default:8:108439-] Notice: pkg-notices: in_box_notices; run_num: D9370223; end   run time: 13709 ms; call: wt::jira::notice::get_notice; dur: 12061 ms
## http_log
[04/Jan/2023:13:20:02 -0700] -conn:east_intra:default:8:108439- 408 GET https://target.site.url 12.045099 250 0 socktimeout

###### -- SSL connect failed and does not timeout for 30 seconds!
## error_log
[04/Jan/2023:13:20:23][1.7fb8a7fff700][-conn:east_intra:default:11:108445-] Notice: pkg-notices: in_box_notices; run_num: B881E1A9; end   run time: 32666 ms; call: wt::jira::notice::get_notice; dur: 30991 ms
## http_log
-- no entry was found in the http log to match with this request

######  -- Super bad run no ssl error, timeout but appears to be stuck for 4 minutes!
## error_log
[04/Jan/2023:13:25:30][1.7fb8a7fff700][-conn:east_intra:default:11:108951-] Notice: pkg-notices: in_box_notices; run_num: E5E4A969; end   run time: 242778 ms; call: wt::jira::notice::get_notice; dur: 241139 ms
## http_log
[04/Jan/2023:13:25:30 -0700] -conn:east_intra:default:11:108951- 408 GET https://target.site.url 241.122625 249 0 socktimeout

Sorry, if it feels like we are bombarding this thread with a bunch of information. We are just hoping that something we are finding can help make sense of what we are experiencing.

As always, thanks for your great help,

-Tony