Forum OpenACS Q&A: Re: NaviServer stops serving connections when slow external api calls pile up.
Posted by
Tony Kirkham
on 01/19/23 11:32 PM
After finding out that we had turned on the http logging a while ago, I went through and pulled out all of the http_log entries that correspond to the error_log entries I posted earlier and have listed them here along side the error_log entries.
When comparing these entries note that the timing in the error_log is reported in milliseconds and the timing in the http_log is in seconds.
A couple of things of interest.
- Among the entries that Marty posted above, most of them show that the request did time out but, long after the specified expire time of
500ms. This isn't expected behavior, is it? - The requests that show as succeeding in the
error_logshow a status ofokin thehttp_log(even the one that took over 3 seconds, weird) and those that show a timeout in theerror_logshow asocktimeoutin thehttp_log. So the two logs seem to match well. - The request that errored-out with the ssl error did not have an entry in the
http_logwhich makes sense if it is expected that the log entry occurs after a request completes, whether or not it was successful.
###### -- Correct timeout
## error_log
[04/Jan/2023:05:07:44][1.7fb883fff700][-conn:east_intra:default:19:4500-] Notice: pkg-notices: in_box_notices; run_num: 03A965C1; end run time: 2071 ms; call: wt::jira::notice::get_notice; dur: 534 ms
## http_log
[04/Jan/2023:05:07:44 -0700] -conn:east_intra:default:19:4500- 408 GET https://target.site.url 0.518629 250 0 socktimeout
###### -- Correct timeout
## error_log
[04/Jan/2023:10:28:12][1.7fb8cfc7e700][-conn:east_intra:default:4:64408-] Notice: pkg-notices: in_box_notices; run_num: D9D43629; end run time: 2162 ms; call: wt::jira::notice::get_notice; dur: 532 ms
## http_log
[04/Jan/2023:10:28:12 -0700] -conn:east_intra:default:4:64408- 408 GET https://target.site.url 0.515892 249 0 socktimeout
###### -- Successful run NO timeout
## error_log
[04/Jan/2023:13:18:42][1.7fb8cdc7a700][-conn:east_intra:default:8:108041-] Notice: pkg-notices: in_box_notices; run_num: DDF11C73; end run time: 2034 ms; call: wt::jira::notice::get_notice; dur: 428 ms
## http_log
[04/Jan/2023:13:18:42 -0700] -conn:east_intra:default:8:108041- 200 GET https://target.site.url 0.412023 255 1022 ok
###### -- Bad run, does not appear to timeout but has a duration of longer than 500 ms and no timeout It is possible that the http call timed out but, something else got stuck. Wat!
## error_log
[04/Jan/2023:13:18:55][1.7fb8a6ffd700][-conn:east_intra:default:21:108096-] Notice: pkg-notices: in_box_notices; run_num: 4F6A7C4C; end run time: 5552 ms; call: wt::jira::notice::get_notice; dur: 3844 ms {{no timeout?}}
## http_log
[04/Jan/2023:13:18:55 -0700] -conn:east_intra:default:21:108096- 200 GET https://target.site.url 3.828102 251 796 ok
###### -- Bad run timed out but long after the 500 ms
## error_log
[04/Jan/2023:13:19:55][1.7fb8cec7c700][-conn:east_intra:default:5:108393-] Notice: pkg-notices: in_box_notices; run_num: 2C241021; end run time: 11185 ms; call: wt::jira::notice::get_notice; dur: 9623 ms
## http_log
[04/Jan/2023:13:19:55 -0700] -conn:east_intra:default:5:108393- 408 GET https://target.site.url 9.607222 248 0 socktimeout
###### -- Bad run timed out but long after the 500 ms
## error_log
[04/Jan/2023:13:20:02][1.7fb8cdc7a700][-conn:east_intra:default:8:108439-] Notice: pkg-notices: in_box_notices; run_num: D9370223; end run time: 13709 ms; call: wt::jira::notice::get_notice; dur: 12061 ms
## http_log
[04/Jan/2023:13:20:02 -0700] -conn:east_intra:default:8:108439- 408 GET https://target.site.url 12.045099 250 0 socktimeout
###### -- SSL connect failed and does not timeout for 30 seconds!
## error_log
[04/Jan/2023:13:20:23][1.7fb8a7fff700][-conn:east_intra:default:11:108445-] Notice: pkg-notices: in_box_notices; run_num: B881E1A9; end run time: 32666 ms; call: wt::jira::notice::get_notice; dur: 30991 ms
## http_log
-- no entry was found in the http log to match with this request
###### -- Super bad run no ssl error, timeout but appears to be stuck for 4 minutes!
## error_log
[04/Jan/2023:13:25:30][1.7fb8a7fff700][-conn:east_intra:default:11:108951-] Notice: pkg-notices: in_box_notices; run_num: E5E4A969; end run time: 242778 ms; call: wt::jira::notice::get_notice; dur: 241139 ms
## http_log
[04/Jan/2023:13:25:30 -0700] -conn:east_intra:default:11:108951- 408 GET https://target.site.url 241.122625 249 0 socktimeout
Sorry, if it feels like we are bombarding this thread with a bunch of information. We are just hoping that something we are finding can help make sense of what we are experiencing.
As always, thanks for your great help,
-Tony