Forum OpenACS Q&A: Re: NaviServer stops serving connections when slow external api calls pile up.
Posted by
Tony Kirkham
on 01/19/23 11:32 PM
After finding out that we had turned on the http logging a while ago, I went through and pulled out all of the http_log
entries that correspond to the error_log
entries I posted earlier and have listed them here along side the error_log
entries.
When comparing these entries note that the timing in the error_log
is reported in milliseconds and the timing in the http_log
is in seconds.
A couple of things of interest.
- Among the entries that Marty posted above, most of them show that the request did time out but, long after the specified expire time of
500ms
. This isn't expected behavior, is it? - The requests that show as succeeding in the
error_log
show a status ofok
in thehttp_log
(even the one that took over 3 seconds, weird) and those that show a timeout in theerror_log
show asocktimeout
in thehttp_log
. So the two logs seem to match well. - The request that errored-out with the ssl error did not have an entry in the
http_log
which makes sense if it is expected that the log entry occurs after a request completes, whether or not it was successful.
###### -- Correct timeout
## error_log
[04/Jan/2023:05:07:44][1.7fb883fff700][-conn:east_intra:default:19:4500-] Notice: pkg-notices: in_box_notices; run_num: 03A965C1; end run time: 2071 ms; call: wt::jira::notice::get_notice; dur: 534 ms
## http_log
[04/Jan/2023:05:07:44 -0700] -conn:east_intra:default:19:4500- 408 GET https://target.site.url 0.518629 250 0 socktimeout
###### -- Correct timeout
## error_log
[04/Jan/2023:10:28:12][1.7fb8cfc7e700][-conn:east_intra:default:4:64408-] Notice: pkg-notices: in_box_notices; run_num: D9D43629; end run time: 2162 ms; call: wt::jira::notice::get_notice; dur: 532 ms
## http_log
[04/Jan/2023:10:28:12 -0700] -conn:east_intra:default:4:64408- 408 GET https://target.site.url 0.515892 249 0 socktimeout
###### -- Successful run NO timeout
## error_log
[04/Jan/2023:13:18:42][1.7fb8cdc7a700][-conn:east_intra:default:8:108041-] Notice: pkg-notices: in_box_notices; run_num: DDF11C73; end run time: 2034 ms; call: wt::jira::notice::get_notice; dur: 428 ms
## http_log
[04/Jan/2023:13:18:42 -0700] -conn:east_intra:default:8:108041- 200 GET https://target.site.url 0.412023 255 1022 ok
###### -- Bad run, does not appear to timeout but has a duration of longer than 500 ms and no timeout It is possible that the http call timed out but, something else got stuck. Wat!
## error_log
[04/Jan/2023:13:18:55][1.7fb8a6ffd700][-conn:east_intra:default:21:108096-] Notice: pkg-notices: in_box_notices; run_num: 4F6A7C4C; end run time: 5552 ms; call: wt::jira::notice::get_notice; dur: 3844 ms {{no timeout?}}
## http_log
[04/Jan/2023:13:18:55 -0700] -conn:east_intra:default:21:108096- 200 GET https://target.site.url 3.828102 251 796 ok
###### -- Bad run timed out but long after the 500 ms
## error_log
[04/Jan/2023:13:19:55][1.7fb8cec7c700][-conn:east_intra:default:5:108393-] Notice: pkg-notices: in_box_notices; run_num: 2C241021; end run time: 11185 ms; call: wt::jira::notice::get_notice; dur: 9623 ms
## http_log
[04/Jan/2023:13:19:55 -0700] -conn:east_intra:default:5:108393- 408 GET https://target.site.url 9.607222 248 0 socktimeout
###### -- Bad run timed out but long after the 500 ms
## error_log
[04/Jan/2023:13:20:02][1.7fb8cdc7a700][-conn:east_intra:default:8:108439-] Notice: pkg-notices: in_box_notices; run_num: D9370223; end run time: 13709 ms; call: wt::jira::notice::get_notice; dur: 12061 ms
## http_log
[04/Jan/2023:13:20:02 -0700] -conn:east_intra:default:8:108439- 408 GET https://target.site.url 12.045099 250 0 socktimeout
###### -- SSL connect failed and does not timeout for 30 seconds!
## error_log
[04/Jan/2023:13:20:23][1.7fb8a7fff700][-conn:east_intra:default:11:108445-] Notice: pkg-notices: in_box_notices; run_num: B881E1A9; end run time: 32666 ms; call: wt::jira::notice::get_notice; dur: 30991 ms
## http_log
-- no entry was found in the http log to match with this request
###### -- Super bad run no ssl error, timeout but appears to be stuck for 4 minutes!
## error_log
[04/Jan/2023:13:25:30][1.7fb8a7fff700][-conn:east_intra:default:11:108951-] Notice: pkg-notices: in_box_notices; run_num: E5E4A969; end run time: 242778 ms; call: wt::jira::notice::get_notice; dur: 241139 ms
## http_log
[04/Jan/2023:13:25:30 -0700] -conn:east_intra:default:11:108951- 408 GET https://target.site.url 241.122625 249 0 socktimeout
Sorry, if it feels like we are bombarding this thread with a bunch of information. We are just hoping that something we are finding can help make sense of what we are experiencing.
As always, thanks for your great help,
-Tony