Forum OpenACS Q&A: How to correctly configure the reverse proxy.

I have been using the reverse proxy module to have NaviServer/OACS perform authentication and serve some of the URL space from a Node instance. My configuration has been working fairly well, but we have run into a problem with uploading some files. Most files upload fine, but with some we receive 504 Gateway Timeout responses. These responses seem to be size-related. I am trying to figure out whether this is an issue with the NaviServer reverse proxy configuration or with limits set in the Node instance.

Focusing specifically on the ::revproxy::upstream call, I have configured and understand the -url_rewrite_callback argument (see here), and I can see that there are options for three timeouts, -timeout, -sendtimeout, and -receivetimeout, as well as other arguments.

What does each of these timeouts affect, and where can I go to better understand the other arguments to this call?

Thanks,

-Tony

Posted by Tony Kirkham on

I can't seem to figure out how to edit my question so I will respond to it. I have some more details that I would like to add.

I have attempted to set the -timeout and -sendtimeout arguments to 60 and 120 seconds, and the upload just spins for the specified amount of time and then closes with the error. By setting -timeout to 1 second I can make most of the uploads fail, so I know that it is affecting the request. The files that upload without a problem are on the order of 30 KB, whereas the file that has never succeeded is on the order of 740 KB.

Posted by Gustaf Neumann on
Which version of NaviServer are you using? There was a change in timeout semantics this year [1], making the behavior more useful.

How does the backend behave? Does it send data incrementally, or does it send nothing for a long time and then all the data at once?

To answer these questions, you can set the "verbose" flag in revproxy-procs and run "ns_logctl severity Debug(connchan) on" (e.g. in ds/shell) to get low-level debug information.
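
For example, in ds/shell you could toggle the debug severity around a test run (a minimal sketch; the boolean form of ns_logctl severity is assumed, analogous to the "on" form shown above):

ns_logctl severity Debug(connchan) on   ;# enable low-level connchan logging
# ... reproduce the failing upload and inspect the server log ...
ns_logctl severity Debug(connchan) off  ;# switch it off again afterwards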

[1] https://bitbucket.org/naviserver/naviserver/commits/177e4e66a54810ed0bf4434ed0e60510584e3b90

Posted by Tony Kirkham on

I'm just getting back to this problem. We have been limping along with only the ability to upload smaller files. I wanted to thank you very much for your reply a couple of months ago. I believe we have the changes you are referring to, because we are running NaviServer/4.99.18 and, from the commit you linked, it appears those changes were included in this version. Also, I am running revproxy version 0.12.

I appreciate the suggestions (and directions) to increase the logging level, and I am including the results below. I added validation and backend-reply callbacks that make a simple ns_log call so I could see just where they are called in the process. The uploads that succeed contain smaller files; sizes are noted with the log segments below. Here is the call that I am currently making, which includes some timeouts with which I am trying to make sure there is plenty of time to upload the files before timing out. This has not helped.

lappend upstream_args -timeout 10
lappend upstream_args -sendtimeout 60
lappend upstream_args -receivetimeout 60
lappend upstream_args -url_rewrite_callback pmd::rewrite_url
lappend upstream_args -validation_callback pmd::val_call
lappend upstream_args -exception_callback pmd::exc_call
lappend upstream_args -backend_reply_callback pmd::rep_call

ns_register_filter \
    postauth * http://server_name_path/pmd/* \
    ::revproxy::upstream {*}$upstream_args
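
For reference, the two logging callbacks mentioned above are along these lines (a sketch only; the argument lists are whatever the revproxy module passes, so they are accepted generically via args):

namespace eval pmd {}

proc pmd::val_call {args} {
    # validation callback: only log that (and where) it was invoked
    ns_log notice "Inside validation call back pmd::val_call"
}

proc pmd::rep_call {args} {
    # backend reply callback: same idea, log only
    ns_log notice "Inside backend reply callback call back pmd::rep_call"
}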

This segment of the log is a successful attempt where the file is uploaded and stored correctly. The file size is ~ 40 kB. I have had files up to ~ 170 kB succeed.

[29/Jan/2021:22:39:45][1.7f570aacf700][-conn:server_name:1:8-] Notice: Inside validation call back pmd::val_call
[29/Jan/2021:22:39:45][1.7f570aacf700][-conn:server_name:1:8-] Notice: PROXY: upstream: send max 16000 bytes from string to conn0 (length 38094)
[29/Jan/2021:22:39:45][1.7f570aacf700][-conn:server_name:1:8-] Notice: PROXY: upstream: send max 16000 bytes from string to conn0 (length 38094)
[29/Jan/2021:22:39:45][1.7f570aacf700][-conn:server_name:1:8-] Notice: PROXY: upstream: send max 16000 bytes from string to conn0 (length 38094)
[29/Jan/2021:22:39:45][1.7f570aacf700][-conn:server_name:1:8-] Notice: PROXY: backendChan conn0 frontendChan conn1 method POST version 1.0 http://server_name_path/pmd/api/program/25689/logo?user_id=1234
[29/Jan/2021:22:39:45][1.7f56f1ffb700][-socks-] Notice: socks: starting
[29/Jan/2021:22:39:46][1.7f56f1ffb700][-socks-] Notice: update interpreter to epoch 1, trace none, time 0.216013 secs
[29/Jan/2021:22:39:46][1.7f56f1ffb700][-socks-] Notice: PROXY: backendReply: send 223 bytes from conn0 to conn1 (http://server_name_path/pmd/api/program/25689/logo?user_id=1234)
[29/Jan/2021:22:39:46][1.7f56f1ffb700][-socks-] Notice: PROXY: backendReply: first <HTTP/1.1 200 OK> HEAD <X-Powered-By: Express
Content-Type: application/json; charset=utf-8
Content-Length: 16
ETag: W/"10-oV4hJxRVSENxc/wX8+mA4/Pe4tA"
Date: Fri, 29 Jan 2021 22:39:46 GMT
Connection: close>
[29/Jan/2021:22:39:46][1.7f56f1ffb700][-socks-] Notice: Inside backend reply callback call back pmd::rep_call
[29/Jan/2021:22:39:46][1.7f56f1ffb700][-socks-] Notice: PROXY: backendReply: from http://server_name_path/pmd/api/program/25689/logo?user_id=1234
HTTP/1.1 200 OK
X-Powered-By: Express
Content-Type: application/json; charset=utf-8
Content-Length: 16
ETag: W/"10-oV4hJxRVSENxc/wX8+mA4/Pe4tA"
Date: Fri, 29 Jan 2021 22:39:46 GMT
Connection: close

[29/Jan/2021:22:39:46][1.7f56f1ffb700][-socks-] Notice: PROXY: backendReply: from conn0 to conn1 towrite 223 written 223 spooled(conn1) 16
[29/Jan/2021:22:39:46][1.7f56f1ffb700][-socks-] Notice: PROXY: spool from conn0 (exists 1) to conn1 (exists 1): condition r
[29/Jan/2021:22:39:46][1.7f56f1ffb700][-socks-] Notice: PROXY: ... auto closing conn0 manual conn1: http://server_name_path/pmd/api/program/25689/logo?user_id=1234 (suspended 0)
[29/Jan/2021:22:39:46][1.7f56f1ffb700][-socks-] Notice: PROXY: cleanup channel conn1, spooled 16 bytes (to spool 16)

This next segment is where the upload failed. The file size is 789 kB. (The upload also fails with a file size of 453 kB.)

[29/Jan/2021:22:39:46][1.7f56f1ffb700][-socks-] Notice: PROXY: cleanup channel conn5, spooled 37844 bytes (to spool 37844)
[29/Jan/2021:22:42:49][1.7f570aacf700][-conn:server_name:1:11-] Notice: Inside validation call back pmd::val_call
[29/Jan/2021:22:42:49][1.7f570aacf700][-conn:server_name:1:11-] Notice: PROXY: upstream: send max 16000 bytes from string to conn6 (length 808993)
[29/Jan/2021:22:42:49][1.7f570aacf700][-conn:server_name:1:11-] Notice: PROXY: upstream: send max 16000 bytes from string to conn6 (length 808993)
... There are 51 of these lines in total; I have eliminated them here for brevity ...
[29/Jan/2021:22:42:49][1.7f570aacf700][-conn:server_name:1:11-] Notice: PROXY: upstream: send max 16000 bytes from string to conn6 (length 808993)
[29/Jan/2021:22:42:49][1.7f570aacf700][-conn:server_name:1:11-] Notice: PROXY: upstream: send max 16000 bytes from string to conn6 (length 808993)
[29/Jan/2021:22:42:49][1.7f570aacf700][-conn:server_name:1:11-] Notice: PROXY: backendChan conn6 frontendChan conn7 method POST version 1.0 http://server_name_path/pmd/api/program/25689/logo?user_id=1234
[29/Jan/2021:22:42:59][1.7f56f1ffb700][-socks-] Notice: PROXY: spool from conn7 (exists 1) to conn6 (exists 1): condition t
[29/Jan/2021:22:42:59][1.7f56f1ffb700][-socks-] Notice: PROXY: timeout occurred while spooling conn7 to conn6
[29/Jan/2021:22:42:59][1.7f56f1ffb700][-socks-] Notice: PROXY: ... auto closing conn7 manual conn6: client (suspended 0)
[29/Jan/2021:22:42:59][1.7f56f1ffb700][-socks-] Notice: PROXY: cleanup channel conn6, spooled 0 bytes (to spool ???)

Here is the last part of another run that started the same way but, after the 51 "send max ..." lines, ended slightly differently:

...
[29/Jan/2021:17:28:03][1.7fe179e74700][-conn:server_name:1:6-] Notice: PROXY: upstream: send max 16000 bytes from string to conn12 (length 808991)
[29/Jan/2021:17:28:03][1.7fe179e74700][-conn:server_name:1:6-] Notice: PROXY: backendChan conn12 frontendChan conn13 method POST version 1.0 http://server_name_path/pmd/api/program/25689/logo?user_id=1234
[29/Jan/2021:17:28:13][1.7fe1697fa700][-socks-] Notice: PROXY: timeout occurred while waiting for backend reply conn12 to conn13
[29/Jan/2021:17:28:13][1.7fe1697fa700][-socks-] Notice: PROXY: backendReply: ... auto closing conn12 http://server_name_path/pmd/api/program/25689/logo?user_id=1234

I am not sure which side is timing out, but given the more verbose logging I believe it is the Node side. You can see in the first failed output that there are 10 seconds between the "Notice: PROXY: backendChan..." line and the next line, with condition t. When I set the three timeout parameters to 60 seconds, the time difference extends to 60 seconds. So all of the "upstream: send max..." lines happen immediately, then things just stop until the timeout has passed, and then the remaining lines get logged. However, if I upload the file directly from the client browser to the Node server, there is no timeout, and the upload succeeds for the files that fail when going through NaviServer as the reverse proxy. Also, the commit you linked is an update to ns_http run|queue. I am not sure how that applies. Is it something that is used automatically by the revproxy, or is it something I can manipulate from within the ::revproxy::upstream call?

From this information, does it appear that I have configured the revproxy correctly and the problem is with the upstream server, or have I misconfigured the revproxy in some way?

Thank you very much for any help you can give.

-Tony

Posted by Gustaf Neumann on

First of all, I would recommend upgrading to NaviServer 4.99.20, since there are many changes in the code, also affecting low-level I/O, connchan handling, etc. It is much more robust in connection with partial reads/writes and addresses changes in the behavior of newer versions of OpenSSL.

Anyhow, what I can read from your lines above is the following:

  1. setting "-timeout 10"
  2. start of request at "29/Jan/2021:22:42:49"
  3. timeout at "29/Jan/2021:22:42:59" while spooling from conn7 to conn6

Well, between (2) and (3) there are 10 seconds, the timeout value. Since it spools some data to conn6, I would assume that conn6 is the client (browser) and conn7 is the backend (Node). Since there is no reply from conn7 within this time span, the timeout occurs.

One question is what happens on the Node side. Maybe there are some entries in its log file?

You can intensify the logging by turning on more log levels on the NaviServer side, especially for connchan. When you add, for example, the following line to the beginning of revproxy-procs, you get more detailed information (be careful: if this is a busy site, you might get too much). One can certainly turn on more detailed logging dynamically (maybe only for revproxy PUT requests), but you have to add the activation/deactivation to the proc yourself.

ns_logctl severity Debug(connchan) on
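
A sketch of what such a dynamic activation could look like inside ::revproxy::upstream (illustrative only; it assumes the proc runs in the connection thread, where ns_conn is available):

# turn on connchan debugging only for the requests of interest,
# e.g. the POST/PUT uploads discussed in this thread
if {[ns_conn method] in {POST PUT}} {
    ns_logctl severity Debug(connchan) on
}
# ... and switch it off again (e.g. from ds/shell) once debugging is done
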
Posted by Tony Kirkham on

Another few weeks have gone by, and I am back to working on this.

After turning up the logging with "ns_logctl severity Debug(connchan) 1", I can now see the following when I attempt to upload a larger file:

... ; #
[18/Mar/2021:12:39:06][1.7f799d60a700][-conn:server_name:3:3424-] Notice: Inside validation call back pmd::val_call
[18/Mar/2021:12:39:06][1.7f799d60a700][-conn:server_name:3:3424-] Debug(connchan): DriverSend sent 1325 bytes <Success>
[18/Mar/2021:12:39:06][1.7f799d60a700][-conn:server_name:3:3424-] Debug(connchan): ns_connchan open http://server-name/package/api/program/12345/logo?user_id=555 returns 0
[18/Mar/2021:12:39:06][1.7f799d60a700][-conn:server_name:3:3424-] Notice: PROXY: upstream: send max 16000 bytes from string to conn90 (length 808989)
[18/Mar/2021:12:39:06][1.7f799d60a700][-conn:server_name:3:3424-] Debug(connchan): ns_connchan write conn90 returns 0
[18/Mar/2021:12:39:06][1.7f799d60a700][-conn:server_name:3:3424-] Notice: PROXY: upstream: send max 16000 bytes from string to conn90 (length 808989)
[18/Mar/2021:12:39:06][1.7f799d60a700][-conn:server_name:3:3424-] Debug(connchan): ns_connchan write conn90 returns 0

... ; # it starts out okay with 10 of these lines that appear to show good sends

[18/Mar/2021:12:39:06][1.7f799d60a700][-conn:server_name:3:3424-] Notice: PROXY: upstream: send max 16000 bytes from string to conn90 (length 808989)
[18/Mar/2021:12:39:06][1.7f799d60a700][-conn:server_name:3:3424-] Debug(connchan): ns_connchan write conn90 returns 0
[18/Mar/2021:12:39:06][1.7f799d60a700][-conn:server_name:3:3424-] Notice: PROXY: upstream: send max 16000 bytes from string to conn90 (length 808989)
[18/Mar/2021:12:39:06][1.7f799d60a700][-conn:server_name:3:3424-] Debug(connchan): ns_connchan write conn90 returns 0
[18/Mar/2021:12:39:06][1.7f799d60a700][-conn:server_name:3:3424-] Notice: PROXY: upstream: send max 16000 bytes from string to conn90 (length 808989)
[18/Mar/2021:12:39:06][1.7f799d60a700][-conn:server_name:3:3424-] Debug(connchan): DriverSend conn90: partial write operation, sent 14120 instead of 16000 bytes
[18/Mar/2021:12:39:06][1.7f799d60a700][-conn:server_name:3:3424-] Debug(connchan): DriverSend conn90: would block, no timeout configured, origLength 16000 still to send 1880 already sent 14120
[18/Mar/2021:12:39:06][1.7f799d60a700][-conn:server_name:3:3424-] Debug(connchan): ns_connchan write conn90 returns 0
[18/Mar/2021:12:39:06][1.7f799d60a700][-conn:server_name:3:3424-] Notice: PROXY: upstream: send max 16000 bytes from string to conn90 (length 808989)
[18/Mar/2021:12:39:06][1.7f799d60a700][-conn:server_name:3:3424-] Debug(connchan): DriverSend conn90: would block, no timeout configured, origLength 16000 still to send 16000 already sent 0
[18/Mar/2021:12:39:06][1.7f799d60a700][-conn:server_name:3:3424-] Debug(connchan): ns_connchan write conn90 returns 0

... ; # then something appears to break with the "partial write operation" and we have 23 of the "would block, no timeout configured" lines

[18/Mar/2021:12:39:06][1.7f799d60a700][-conn:server_name:3:3424-] Debug(connchan): DriverSend conn90: would block, no timeout configured, origLength 16000 still to send 16000 already sent 0
[18/Mar/2021:12:39:06][1.7f799d60a700][-conn:server_name:3:3424-] Debug(connchan): ns_connchan write conn90 returns 0
[18/Mar/2021:12:39:06][1.7f799d60a700][-conn:server_name:3:3424-] Notice: PROXY: upstream: send max 16000 bytes from string to conn90 (length 808989)
[18/Mar/2021:12:39:06][1.7f799d60a700][-conn:server_name:3:3424-] Debug(connchan): DriverSend conn90: would block, no timeout configured, origLength 16000 still to send 16000 already sent 0
[18/Mar/2021:12:39:06][1.7f799d60a700][-conn:server_name:3:3424-] Debug(connchan): ns_connchan write conn90 returns 0
[18/Mar/2021:12:39:06][1.7f799d60a700][-conn:server_name:3:3424-] Notice: PROXY: upstream: send max 16000 bytes from string to conn90 (length 808989)
[18/Mar/2021:12:39:06][1.7f799d60a700][-conn:server_name:3:3424-] Debug(connchan): DriverSend conn90: would block, no timeout configured, origLength 16000 still to send 16000 already sent 0
[18/Mar/2021:12:39:06][1.7f799d60a700][-conn:server_name:3:3424-] Debug(connchan): ns_connchan write conn90 returns 0
[18/Mar/2021:12:39:06][1.7f799d60a700][-conn:server_name:3:3424-] Notice: PROXY: upstream: send max 16000 bytes from string to conn90 (length 808989)
[18/Mar/2021:12:39:06][1.7f799d60a700][-conn:server_name:3:3424-] Debug(connchan): ns_connchan write conn90 returns 0

... ; # followed by 10 lines that seem to succeed

[18/Mar/2021:12:39:06][1.7f799d60a700][-conn:server_name:3:3424-] Notice: PROXY: upstream: send max 16000 bytes from string to conn90 (length 808989)
[18/Mar/2021:12:39:06][1.7f799d60a700][-conn:server_name:3:3424-] Debug(connchan): ns_connchan write conn90 returns 0
[18/Mar/2021:12:39:06][1.7f799d60a700][-conn:server_name:3:3424-] Notice: PROXY: upstream: send max 16000 bytes from string to conn90 (length 808989)
[18/Mar/2021:12:39:06][1.7f799d60a700][-conn:server_name:3:3424-] Debug(connchan): DriverSend conn90: partial write operation, sent 728 instead of 16000 bytes
[18/Mar/2021:12:39:06][1.7f799d60a700][-conn:server_name:3:3424-] Debug(connchan): DriverSend conn90: would block, no timeout configured, origLength 16000 still to send 15272 already sent 728
[18/Mar/2021:12:39:06][1.7f799d60a700][-conn:server_name:3:3424-] Debug(connchan): ns_connchan write conn90 returns 0
[18/Mar/2021:12:39:06][1.7f799d60a700][-conn:server_name:3:3424-] Notice: PROXY: upstream: send max 16000 bytes from string to conn90 (length 808989)
[18/Mar/2021:12:39:06][1.7f799d60a700][-conn:server_name:3:3424-] Debug(connchan): DriverSend conn90: would block, no timeout configured, origLength 16000 still to send 16000 already sent 0
[18/Mar/2021:12:39:06][1.7f799d60a700][-conn:server_name:3:3424-] Debug(connchan): ns_connchan write conn90 returns 0

... ; # followed by 8 failing lines, which account for the 51 sends that I see with the reduced logging posted earlier in this thread.

[18/Mar/2021:12:39:06][1.7f799d60a700][-conn:server_name:3:3424-] Debug(connchan): DriverSend conn90: would block, no timeout configured, origLength 16000 still to send 16000 already sent 0
[18/Mar/2021:12:39:06][1.7f799d60a700][-conn:server_name:3:3424-] Debug(connchan): ns_connchan write conn90 returns 0
[18/Mar/2021:12:39:06][1.7f799d60a700][-conn:server_name:3:3424-] Notice: PROXY: upstream: send max 16000 bytes from string to conn90 (length 808989)
[18/Mar/2021:12:39:06][1.7f799d60a700][-conn:server_name:3:3424-] Debug(connchan): DriverSend conn90: would block, no timeout configured, origLength 8989 still to send 8989 already sent 0
[18/Mar/2021:12:39:06][1.7f799d60a700][-conn:server_name:3:3424-] Debug(connchan): ns_connchan write conn90 returns 0
[18/Mar/2021:12:39:06][1.7f799d60a700][-conn:server_name:3:3424-] Debug(connchan): ConnChanDetachObjCmd sock 7
[18/Mar/2021:12:39:06][1.7f799d60a700][-conn:server_name:3:3424-] Debug(connchan): ns_connchan detach conn91 returns 0
[18/Mar/2021:12:39:06][1.7f799d60a700][-conn:server_name:3:3424-] Notice: PROXY: backendChan conn90 frontendChan conn91 method POST version 1.0 http://server-name/package/api/program/12345/logo?user_id=555
[18/Mar/2021:12:39:06][1.7f799d60a700][-conn:server_name:3:3424-] Debug(connchan): ns_connchan callback conn91 returns 0
[18/Mar/2021:12:39:06][1.7f799d60a700][-conn:server_name:3:3424-] Debug(connchan): ns_connchan callback conn90 returns 0
[18/Mar/2021:12:39:16][1.7f7957fff700][-socks-] Debug(connchan): NsTclConnChanProc why 64
[18/Mar/2021:12:39:16][1.7f7957fff700][-socks-] Debug(connchan): ns_connchan exists conn91 returns 0
[18/Mar/2021:12:39:16][1.7f7957fff700][-socks-] Debug(connchan): ns_connchan exists conn90 returns 0
[18/Mar/2021:12:39:16][1.7f7957fff700][-socks-] Notice: PROXY: spool from conn91 (exists 1) to conn90 (exists 1): condition t
... ; #

So now I can see that there are periods of what appear to be successful transfers, and periods where there is a partial write and the following sends fail with "would block, no timeout configured". I am working on increasing the Node logging, but have not gotten there yet.

The error I can see on the NaviServer side indicates that something is interfering with the send at times, but that it clears up after a little bit, and that there is a way to configure a longer timeout for this case. As I indicated in an earlier post in this thread, I have tried to set the only timeouts I know of, without any success.

What timeout configuration is this warning referring to?

We are planning to get to NaviServer/4.99.20 sometime soon, but if possible we need to get this working with what we have.

Thank you for any assistance you can give. I really appreciate it.

-Tony

Posted by Gustaf Neumann on
As in my last replies, I recommend upgrading to NaviServer 4.99.20 (.18 is two years old, and many things have changed in this area). However, I can answer your specific question.

"What timeout configuration is this warning referring to?"

I am not sure you want to hear this in such detail, but a short answer would raise more questions. When debugging is turned on, the system exposes more details than an application developer normally has to care about, but it also provides debugging hints. When a socket says "would block, no timeout configured", this means that on a read/write operation the OS kernel reported that it refuses right now to read/write what the user-space program (here NaviServer) attempted in non-blocking mode. For this concrete read/write operation, no timeout was provided (which is the intended behavior for non-blocking socket operations in the socks thread; the same function is used in other contexts as well). If the socket were defined as blocking, it would block right now. Avoiding blocking operations is essential for the socks thread, since otherwise all other concurrent operations in the same thread would halt as well.

The situation with "partial writes" is very similar: the user-space program wants to write N bytes, but the kernel refuses and actually writes only X bytes (where X < N). The kernel can do so whenever it wants; this is allowed behavior under POSIX (see e.g. [1]). Some operating systems do this rarely (macOS), others do it often (Linux, especially in newer versions).

To cope with these conditions, the user-level program has to retry the read/write (on our sockets, receive/send) operations with the remaining data whenever the socket becomes readable or writable again. Handling these operations in Tcl is unfortunately somewhat tricky, and it is made more complex by TLS connections, where an incomplete write operation sometimes means that one has to wait for readability of the socket, since OpenSSL still does some negotiation under the hood. On the Tcl side we have several problem areas: where should the remaining data be stored, and for how long; what to do if more data arrives in the meantime; and what to do about binary/text data, especially when the byte stream is split inside a multi-byte character and the like. Earlier versions of revproxy solved this via nsv, switching different callbacks on the fly, etc., while newer versions use newer socket functions in NaviServer that provide internal storage of the left-over bytes, so the Tcl side does not have to care about this.
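
As a conceptual sketch only (not revproxy's actual code): assuming ns_connchan write reports the number of bytes actually sent, as in the patch quoted later in this thread, a partial write at the Tcl level looks roughly like this, with the leftover needing to be resent once the socket is writable again:

set chunk [string range $data $i $j]
set sent  [ns_connchan write $backendChan $chunk]
if {$sent < [string length $chunk]} {
    # partial write: the kernel accepted only part of the chunk; the
    # remaining bytes have to be delivered later, when the socket becomes
    # writable again, without blocking the thread in the meantime
    set leftover [string range $chunk $sent end]
}

Newer NaviServer versions keep such leftover bytes inside the channel itself (as described above), which is what removes this bookkeeping from the Tcl side.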

So, under the hood, the code in 4.99.20 is very different from earlier versions, exactly to reduce the complexity sketched above. A similar situation exists around WebSockets. The good old WebSocket implementation of NaviServer worked perfectly fine for years for e.g. chat and similar low-traffic, low-volume applications, but when we started to use it for shared rooms with VR glasses (Oculus) and WebVR/WebXR, where every movement sends a substantial amount of data, the old implementation ran into many problems with partial reads/writes, etc. It now uses exactly the same C-level operations.

So, sorry for being lengthy. I hope it explains my reduced enthusiasm to dig into the old code, since sooner or later you will want to upgrade anyway.

[1] http://developerweb.net/viewtopic.php?id=4154

Posted by Tony Kirkham on
Thank you very much for this answer. I really do appreciate it.

That is exactly what I needed to hear. I will put my efforts into getting a 4.99.20 version running and I will see how things go and report back what I find.

Thanks,

-Tony

Posted by Tony Kirkham on

I now have NaviServer 4.99.20 running and, unfortunately, I am seeing the same result: a 36.9 KB file uploads without a problem, but a 789 KB file does not.

Here is what I am now seeing in the logs for the failed upload. I have removed several of the repetitive lines.

[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Notice: Inside validation call back pmd::val_call
[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Debug(connchan): ns_connchan open http://servern-pmd-pm-dashboard/pm-dashboard/api/program/25875/logo?user_id=2936 => conn304
[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Debug(connchan): conn304 ConnchanDriverSend try to send [0] 63 bytes (total 1331)
[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Debug(connchan): conn304 ConnchanDriverSend NsDriverSend returned result 1331 errorState 0 --- operation now in progress
[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Debug(connchan): conn304 ConnchanDriverSend check partialResult 1331 nSent 1331 toSend 0 partial ? 0
[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Debug(connchan): conn304 ### check result 1331 == -1 || 1331 == 1331 (partial 0 && ok 1) => try again 0
[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Debug(connchan): conn304 ConnchanDriverSend sent 1331 bytes state Operation now in progress
[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Debug(connchan): ns_connchan open http://servern-pmd-pm-dashboard/pm-dashboard/api/program/25875/logo?user_id=2936 returns 0
[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Notice: PROXY: upstream: send max 16000 bytes from string to conn304 (length 808993)
[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Debug(connchan): WS: send msgLen toSend 16000
[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Debug(connchan): conn304 ConnchanDriverSend try to send [0] 16000 bytes (total 16000)
[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Debug(connchan): conn304 ConnchanDriverSend NsDriverSend returned result 16000 errorState 0 --- operation now in progress
[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Debug(connchan): conn304 ConnchanDriverSend check partialResult 16000 nSent 16000 toSend 0 partial ? 0
[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Debug(connchan): conn304 ### check result 16000 == -1 || 16000 == 16000 (partial 0 && ok 1) => try again 0
[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Debug(connchan): conn304 ns_connchan write returns 0
[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Notice: PROXY: upstream: send max 16000 bytes from string to conn304 (length 808993)
... removed lines ...
[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Debug(connchan): WS: send msgLen toSend 16000
[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Debug(connchan): conn304 ConnchanDriverSend try to send [0] 16000 bytes (total 16000)
[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Debug(connchan): conn304 ConnchanDriverSend NsDriverSend returned result 16000 errorState 0 --- operation now in progress
[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Debug(connchan): conn304 ConnchanDriverSend check partialResult 16000 nSent 16000 toSend 0 partial ? 0
[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Debug(connchan): conn304 ### check result 16000 == -1 || 16000 == 16000 (partial 0 && ok 1) => try again 0
[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Debug(connchan): conn304 ns_connchan write returns 0
[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Notice: PROXY: upstream: send max 16000 bytes from string to conn304 (length 808993)
[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Debug(connchan): WS: send msgLen toSend 16000
[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Debug(connchan): conn304 ConnchanDriverSend try to send [0] 16000 bytes (total 16000)
[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Debug(connchan): conn304 ConnchanDriverSend NsDriverSend returned result 12744 errorState 0 --- operation now in progress
[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Debug(connchan): conn304 ConnchanDriverSend check partialResult 12744 nSent 12744 toSend 3256 partial ? 1
[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Notice: conn304 ConnchanDriverSend partial write operation, sent 12744 (so far 12744) remaining 3256 bytes, full length 16000
[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Debug(connchan): conn304 ### check result 12744 == -1 || 16000 == 12744 (partial 1 && ok 1) => try again 1
[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Debug(connchan): conn304 ConnchanDriverSend try to send [0] 3256 bytes (total 3256)
[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Debug(connchan): conn304 ConnchanDriverSend NsDriverSend returned result 0 errorState 0 --- resource temporarily unavailable
[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Debug(connchan): conn304 ConnchanDriverSend would block, no timeout configured, origLength 16000 still to send 16000 already sent 12744
[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Notice: ... partial write: to send 16000 sent 12744 remaining 3256
[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Debug(connchan): conn304 ns_connchan write returns 0
[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Notice: PROXY: upstream: send max 16000 bytes from string to conn304 (length 808993)
[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Debug(connchan): WS: send msgLen toSend 16000
[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Debug(connchan): conn304 ConnchanDriverSend try to send [0] 16000 bytes (total 16000)
[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Debug(connchan): conn304 ConnchanDriverSend NsDriverSend returned result 0 errorState 0 --- resource temporarily unavailable
[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Debug(connchan): conn304 ConnchanDriverSend would block, no timeout configured, origLength 16000 still to send 16000 already sent 0
[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Notice: ... partial write: to send 16000 sent 0 remaining 16000
[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Debug(connchan): conn304 ns_connchan write returns 0
[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Notice: PROXY: upstream: send max 16000 bytes from string to conn304 (length 808993)
... removed lines ...
[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Debug(connchan): WS: send msgLen toSend 16000
[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Debug(connchan): conn304 ConnchanDriverSend try to send [0] 16000 bytes (total 16000)
[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Debug(connchan): conn304 ConnchanDriverSend NsDriverSend returned result 0 errorState 0 --- resource temporarily unavailable
[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Debug(connchan): conn304 ConnchanDriverSend would block, no timeout configured, origLength 16000 still to send 16000 already sent 0
[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Notice: ... partial write: to send 16000 sent 0 remaining 16000
[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Debug(connchan): conn304 ns_connchan write returns 0
[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Notice: PROXY: upstream: send max 16000 bytes from string to conn304 (length 808993)
[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Debug(connchan): WS: send msgLen toSend 8993
[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Debug(connchan): conn304 ConnchanDriverSend try to send [0] 8993 bytes (total 8993)
[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Debug(connchan): conn304 ConnchanDriverSend NsDriverSend returned result 0 errorState 0 --- resource temporarily unavailable
[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Debug(connchan): conn304 ConnchanDriverSend would block, no timeout configured, origLength 8993 still to send 8993 already sent 0
[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Notice: ... partial write: to send 8993 sent 0 remaining 8993
[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Debug(connchan): conn304 ns_connchan write returns 0
[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Debug(connchan): conn305 ConnChanDetachObjCmd sock 4
[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Debug(connchan): conn305 ns_connchan detach returns 0
[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Notice: PROXY: backendChan conn304 frontendChan conn305 method POST version 1.0 http://servern-pmd-pm-dashboard/pm-dashboard/api/program/25875/logo?user_id=2936
[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Debug(connchan): conn305 ns_connchan callback returns 0
[03/May/2021:17:03:32][1.7fc60f646700][-conn:servern_pmd:default:0:216-] Debug(connchan): conn304 ns_connchan callback returns 0
[03/May/2021:17:03:42][1.7fc5f4ff9700][-socks-] Debug(connchan): conn304 NsTclConnChanProc why  (64)
[03/May/2021:17:03:42][1.7fc5f4ff9700][-socks-] Notice: PROXY: timeout occurred while waiting for backend reply conn304 to conn305
[03/May/2021:17:03:42][1.7fc5f4ff9700][-socks-] Debug(connchan): WS: send msgLen toSend 32
[03/May/2021:17:03:42][1.7fc5f4ff9700][-socks-] Debug(connchan): conn305 ConnchanDriverSend try to send [0] 32 bytes (total 32)
[03/May/2021:17:03:42][1.7fc5f4ff9700][-socks-] Debug(connchan): conn305 ConnchanDriverSend NsDriverSend returned result 32 errorState 1 --- Success
[03/May/2021:17:03:42][1.7fc5f4ff9700][-socks-] Debug(connchan): conn305 ConnchanDriverSend check partialResult 32 nSent 32 toSend 0 partial ? 0
[03/May/2021:17:03:42][1.7fc5f4ff9700][-socks-] Debug(connchan): conn305 ### check result 32 == -1 || 32 == 32 (partial 0 && ok 1) => try again 0
[03/May/2021:17:03:42][1.7fc5f4ff9700][-socks-] Debug(connchan): conn305 ns_connchan write returns 0
[03/May/2021:17:03:42][1.7fc5f4ff9700][-socks-] Notice: PROXY: backendReply: ... auto closing conn304 http://servern-pmd-pm-dashboard/pm-dashboard/api/program/25875/logo?user_id=2936
[03/May/2021:17:03:42][1.7fc5f4ff9700][-socks-] Debug(connchan): conn305 ns_connchan close connChanPtr 0x7fc609287ca0
[03/May/2021:17:03:42][1.7fc5f4ff9700][-socks-] Debug(connchan): conn305 connchan: CancelCallback 0x7fc60a862700
[03/May/2021:17:03:42][1.7fc5f4ff9700][-socks-] Debug(connchan): conn305 ns_connchan close returns 0
[03/May/2021:17:03:42][1.7fc5f4ff9700][-socks-] Debug(connchan): conn304 NsTclConnChanProc Tcl eval <::revproxy::backendReply> returned <0>
[03/May/2021:17:03:42][1.7fc5f4ff9700][-socks-] Debug(connchan): NsTclConnChanProc <::revproxy::backendReply -callback pmd::rep_call conn304 conn305 http://servern-pmd-pm-dashboard/pm-dashboard/api/program/25875/logo?user_id=2936 {-timeout 10 -sendtimeout 60 -receivetimeout 60} 0 t> numeric result 0
[03/May/2021:17:03:42][1.7fc5f4ff9700][-socks-] Debug(connchan): conn304 NsTclConnChanProc free channel
[03/May/2021:17:03:42][1.7fc5f4ff9700][-socks-] Debug(connchan): conn304 connchan: CancelCallback 0x7fc5e8fd6200
[03/May/2021:17:03:42][1.7fc5f4ff9700][-socks-] Debug(connchan): NsTclConnChanProc called on a probably deleted callback 0x7fc60a862700
[03/May/2021:17:03:42][1.7fc5f4ff9700][-socks-] Debug(connchan): connchan: callbackCallbackFree cbPtr 0x7fc60a862700 why 32
[03/May/2021:17:03:42][1.7fc5f4ff9700][-socks-] Debug(connchan): connchan: callbackCallbackFree cbPtr 0x7fc5e8fd6200 why 32

Here are the corresponding lines from the Node/Express logs, which do not appear to go deep enough to be of much use.

Mon, 03 May 2021 23:03:32 GMT express:router dispatching POST /pm-dashboard/api/program/25875/logo?user_id=2936
Mon, 03 May 2021 23:03:32 GMT express:router query  : /pm-dashboard/api/program/25875/logo?user_id=2936
Mon, 03 May 2021 23:03:32 GMT express:router expressInit  : /pm-dashboard/api/program/25875/logo?user_id=2936
Mon, 03 May 2021 23:03:32 GMT express:router logger  : /pm-dashboard/api/program/25875/logo?user_id=2936
Mon, 03 May 2021 23:03:32 GMT express:router logger  : /pm-dashboard/api/program/25875/logo?user_id=2936
Mon, 03 May 2021 23:03:32 GMT express:router jsonParser  : /pm-dashboard/api/program/25875/logo?user_id=2936
Mon, 03 May 2021 23:03:32 GMT express:router urlencodedParser  : /pm-dashboard/api/program/25875/logo?user_id=2936
Mon, 03 May 2021 23:03:32 GMT express:router cookieParser  : /pm-dashboard/api/program/25875/logo?user_id=2936
Mon, 03 May 2021 23:03:32 GMT express:router trim prefix (/pm-dashboard) from url /pm-dashboard/api/program/25875/logo?user_id=2936
Mon, 03 May 2021 23:03:32 GMT express:router serveStatic /pm-dashboard : /pm-dashboard/api/program/25875/logo?user_id=2936
Mon, 03 May 2021 23:03:32 GMT express:router trim prefix (/pm-dashboard) from url /pm-dashboard/api/program/25875/logo?user_id=2936
Mon, 03 May 2021 23:03:32 GMT express:router serveStatic /pm-dashboard : /pm-dashboard/api/program/25875/logo?user_id=2936
Mon, 03 May 2021 23:03:32 GMT express:router <anonymous>  : /pm-dashboard/api/program/25875/logo?user_id=2936
Mon, 03 May 2021 23:03:32 GMT express:router trim prefix (/pm-dashboard/api) from url /pm-dashboard/api/program/25875/logo?user_id=2936
Mon, 03 May 2021 23:03:32 GMT express:router router /pm-dashboard/api : /pm-dashboard/api/program/25875/logo?user_id=2936
Mon, 03 May 2021 23:03:32 GMT express:router dispatching POST /program/25875/logo?user_id=2936
Mon, 03 May 2021 23:03:32 GMT express:router trim prefix (/program) from url /program/25875/logo?user_id=2936
Mon, 03 May 2021 23:03:32 GMT express:router router /program : /pm-dashboard/api/program/25875/logo?user_id=2936
Mon, 03 May 2021 23:03:32 GMT express:router dispatching POST /25875/logo?user_id=2936

The upload starts off just fine, and then about halfway through a partial send happens and all further sends fail. Everything happens within the first second of the request and is not affected in the least by the 10-second timeout I have tried to set. It does now report "resource temporarily unavailable" along with "would block, no timeout configured".

Another part of this puzzle, which I did not mention before, is that I am running NaviServer and Node in separate Docker containers. As I mentioned above, if I change the upload code to bypass NaviServer as the reverse proxy and go directly to the Node server, all of the files upload without a problem. So it does not seem to be a size-limit problem for Node or NaviServer individually, but only in the transfer between them.

Does this give any more insight into this problem?

I really appreciate all the help,

-Tony

Posted by Tony Kirkham on

I have found a solution that I would like some feedback on. I am now able to upload files larger than 2.5 MB through the revproxy. I have not tried much bigger than that yet, but all the files I have tried are now working.

With the 4.99.20 NaviServer I also obtained and installed version 0.14 of the revproxy, and after some more analysis of our Docker environment, I decided that perhaps the problem was a bit higher up than I had thought and looked closer at the revproxy Tcl calls themselves. The following are the modifications I have made to the proc upstream that fix my problem.

diff --git a/revproxy-procs.tcl b/revproxy-procs.tcl
index 4e12711..0ace547 100644
--- a/revproxy-procs.tcl
+++ b/revproxy-procs.tcl
@@ -135,7 +135,21 @@ namespace eval ::revproxy {
                     while {$i < $length} {
                         log notice "upstream: send max $chunk bytes from string to $backendChan " \
                             "(length $contentLength)"
-                        ns_connchan write $backendChan [string range $data $i $j]
+                        set wrote [ns_connchan write $backendChan [string range $data $i $j]]
+
+                        set remain [expr {(($j+1 < $length ? $j+1 : $length) - $i) - $wrote}]
+                        set ti $i
+                        set tries 3
+                        while {$remain && $tries} {
+                            after 100; # I hate solving race conditions with a race condition - ACK 2021-05-05
+                            set ti [expr {$ti + $wrote}]
+                            log notice "upstream: - try the rest - send from $ti to $j from string to $backendChan " \
+                                "(length $contentLength)"
+                            set wrote [ns_connchan write $backendChan [string range $data $ti $j]]
+
+                            incr remain -$wrote
+                            incr tries -1
+                        }
                         incr i $chunk
                         incr j $chunk
                     }

I started to think that the "resource temporarily unavailable" might be due to an inability of the system (Node) to handle how fast the data was being sent to it. These modifications give it a breather, which seems to work.

One problem with this solution is that I do not know how to access any status of the connection write, so I have relied on three retries with a call to after, which I feel is just a bigger Band-Aid. I also still see the "resource temporarily unavailable" message in the logs on every write, even though the writes complete successfully.

Is there a way outside of ns_connchan to reset that status? Does this look like an appropriate solution or is there a better one?

Thanks,

-Tony

Posted by Tony Kirkham on

One other question on this: I have only modified the "string content" side, because that is the path being followed. I assume something similar should be done on the "file content" path as well.

-Tony

Posted by Gustaf Neumann on

Hi Tony, great that you are seeing some progress.

When I said "please upgrade", this meant "get the server with the modules from the same release". Always use matching versions (as distributed in the modules tar file on SourceForge).

The approach you are following is not a good one, since the newly introduced loop will bring all other revproxy requests to a halt. When I try uploads via revproxy in my setup, everything works fine, no matter how large the file is.

I currently have a big backlog, so the earliest I could look at this is this weekend, where other projects are also waiting. Providing a setup similar to yours will take some time. I have lost track of what your exact setup is (there was so much time between my replies and your answers).

Assume C, R, and T are the involved programs, where C is the client, R is the reverse proxy, and T is the target server.

     C <-> R <-> T

Is the communication between C and R, and between R and T, plain text or HTTPS?

What are the operating systems of R and T? When these are running in Docker environments, the OS inside the container counts.

Are C, R and T running on different hosts?

Are you aware of any performance/bandwidth problems on T?

Posted by Tony Kirkham on

The communication between C and R is HTTPS, and between R and T it is HTTP.

Our host system is CentOS 7 running on a VM. We are running Docker on this system with each server in its own container (all on this host). They are all on the same network and reference each other via container names rather than IP addresses. As I understand it, this setup should be very fast. I don't know where there would be any bandwidth problems, on T or otherwise.

Just for reference, the first setup we were working with was:

R (NaviServer 4.99.18 on CentOS 7 container) <-> T (Node 12.22.1 on Debian - container node:12-slim)

The upgraded system is a little different. We are moving away from CentOS due to its end-of-life announcement:

R (NaviServer 4.99.20 on Ubuntu 20.04.1) <-> T (Node 12.22.1 on Debian - container node:12-slim)

Sorry, this took me a bit to get to. Thank you for your insight and help.

-Tony

Posted by Gustaf Neumann on

Hi Tony,

I have now created a setup with a reverse proxy and a backend server, where POST requests to the reverse proxy are sent to the backend. The reverse proxy receives requests over HTTP or HTTPS; the communication between the reverse proxy and the backend happens via HTTP; both servers are on the same physical machine.

I have tested this setup with POST requests of various sizes, but could not replicate your error. So, please try this simple setup in your environment, so that we can rule out configuration issues.

To ease replication of the setup, I have added a small repository on GitHub for testing (https://github.com/gustafn/ns-revproxy-test) containing everything needed (configuration files, a receiver on the backend, test files).

Start the backend and the reverse proxy server with the configuration files from the repository (for debugging, start these in different terminals):

 NS_HTTP_PORT=18091 /usr/local/ns/bin/nsd -f -u nsadmin -g nsadmin -t nsd-config-backend.tcl
 NS_HTTPS_PORT=48090 NS_HTTP_PORT=18090 /usr/local/ns/bin/nsd -f -u nsadmin -g nsadmin -t nsd-config-revproxy.tcl

Direct backend requests (port 18091):

curl -v -X POST -T 100KB http://127.0.0.1:18091/receiver.tcl
curl -v -X POST -T 1MB http://127.0.0.1:18091/receiver.tcl
curl -v -X POST -T 10MB http://127.0.0.1:18091/receiver.tcl

Requests to the backend via the reverse proxy server (HTTP port 18090, HTTPS port 48090):

curl -v -X POST -H Expect: -T 100KB http://127.0.0.1:18090/receiver.tcl
curl -v -X POST -H Expect: -T 1MB http://127.0.0.1:18090/receiver.tcl
curl -v -X POST -H Expect: -T 10MB http://127.0.0.1:18090/receiver.tcl

curl -v -k -X POST -H Expect: -T 100KB https://127.0.0.1:48090/receiver.tcl
curl -v -k -X POST -H Expect: -T 1MB https://127.0.0.1:48090/receiver.tcl
curl -v -k -X POST -H Expect: -T 10MB https://127.0.0.1:48090/receiver.tcl

The only thing that I know to be different is the Node.js backend, but it would certainly also be interesting to know whether the problem can be replicated with just NaviServers.

Posted by Gustaf Neumann on
The mentioned issue was not so easy to reproduce, but it is now resolved. Here is a quick summary for everybody interested.

The test setup from above worked for Tony as well, but with the same software versions the problem still occurred on the real site, which uses Docker images. So the suspicion fell on the Docker setup. Tony provided me with a setup for a very similar installation based on Docker, but this still worked for me without problems. In the next round I built a setup with VMs (where it is quite easy to slow down the network interface), and I finally could reproduce and resolve the issue. The problem was indeed the slow backend, which causes partial write operations. The reverse proxy was already using the new C-level interface for handling these cases, but not in all places.

There was still a potential issue for reverse proxy operations with large input (typically POST or PUT requests). NaviServer itself handles these conditions, so this was clearly an issue of the revproxy module, which was not prepared for them. The updated version in the public repository [1] fixes the issue by using the new C-level support for partial writes in these cases as well (in essence via "ns_connchan write -buffered"). As we all know: if it can go wrong, it will go wrong.
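
For readers following along, the essence of the fix is roughly the difference between the two calls below (a sketch, not the actual module code; the semantics of -buffered are assumed to be as described earlier in the thread, i.e. leftover bytes of a partial write stay inside the channel instead of having to be tracked in Tcl):

# without -buffered: a partial write leaves bytes the Tcl code must track and resend
set sent [ns_connchan write $backendChan $chunk]

# with -buffered: an unsent remainder is kept in the channel's internal buffer
ns_connchan write -buffered $backendChan $chunk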

Many thanks to Tony for the patience and for the good test cases!

-gn

[1] https://bitbucket.org/naviserver/revproxy/