Bug #3108
closedFile upload is broken after upgrade from 1.4.59 to 1.4.60 w/ server.max-request-size
Description
After upgrade from 1.4.59 to 1.4.60 file upload hangs for big files sending via JavaScript and connection is closed after timeout.
Data chunk is plain/text. Size is bit more than 64KB, but value is not constant (64932 bytes for example).
Small files are ok. In 1.4.59 all files are ok.
Header:
------WebKitFormBoundaryehNwpxfpAKliTrKq
Content-Disposition: form-data; name="fileToUpload"; filename="blob"
Content-Type: application/octet-stream
111111111111111111111111111111111111111111111111111 and so on...
error_log:2021-10-07 10:07:45: (response.c.407) -- parsed Request-URI
2021-10-07 10:07:45: (response.c.409) Request-URI : /files_upload.php?BucketID=0
2021-10-07 10:07:45: (response.c.411) URI-scheme : https
2021-10-07 10:07:45: (response.c.413) URI-authority : example.com
2021-10-07 10:07:45: (response.c.415) URI-path (clean): /files_upload.php
2021-10-07 10:07:45: (response.c.417) URI-query : BucketID=0
2021-10-07 10:07:45: (response.c.495) -- logical -> physical
2021-10-07 10:07:45: (response.c.497) Doc-Root : /var/www/qqq
2021-10-07 10:07:45: (response.c.499) Basedir : /var/www/qqq
2021-10-07 10:07:45: (response.c.501) Rel-Path : /files_upload.php
2021-10-07 10:07:45: (response.c.503) Path : /var/www/qqq/files_upload.php
2021-10-07 10:07:45: (response.c.522) -- handling subrequest
2021-10-07 10:07:45: (response.c.524) Path : /var/www/qqq/files_upload.php
2021-10-07 10:07:45: (response.c.526) URI : /files_upload.php
2021-10-07 10:07:45: (response.c.528) Pathinfo :
2021-10-07 10:07:45: (gw_backend.c.2677) handling the request using fastcgi
2021-10-07 10:07:45: (gw_backend.c.2677) handling the request using fastcgi
2021-10-07 10:08:46: (connections.c.1422) request aborted - read timeout: 20
2021-10-07 10:08:46: (connections.c.1422) request aborted - read timeout: 20
PHP script does not start for such requests.
Files
Updated by gstrauss about 3 years ago
Please review How to get support
Is the connection HTTP/1.1 or HTTP/2? TLS or clear-text? What is the lighttpd.conf (or at least relevant sections?) "PHP script does not start for such requests." has little meaning to others without appropriate context from lighttpd.conf. If you are using PHP-FPM, then PHP-FPM is responsible for running your PHP scripts.
Updated by SM about 3 years ago
HTTP/2 and TLS
I meant that hanging is not in PHP.
config:
server.upload-dirs = ( "/var/tmp" ) server.modules += ( "mod_rewrite" ) server.modules += ( "mod_fastcgi" ) server.modules += ( "mod_openssl" ) server.feature-flags += ("server.h2proto" => "enable", "server.h2c" => "enable") debug.log-request-handling = "enable" fastcgi.server = ( ".php" => (( "bin-path" => "/usr/local/bin/php-cgi -q", "socket" => "/tmp/php.socket", "max-procs" => 8, "idle-timeout" => 20, "bin-environment" => ( "PHP_FCGI_CHILDREN" => "1", "PHP_FCGI_MAX_REQUESTS" => "1000" ), "bin-copy-environment" => ( "PATH", "SHELL", "USER" ), "broken-scriptfilename" => "enable" )) )
Updated by gstrauss about 3 years ago
- Status changed from New to Need Feedback
I am not able to reproduce this with a simple echo.php
and various settings for lighttpd.conf server.stream-request-body
and server.stream-response-body
$ cat lighttpd.conf # partial content server.modules += ("mod_fastcgi") fastcgi.server = ( ".php" => ( ( "socket" => "/tmp/fcgi.sock", "bin-path" => "/usr/bin/php-cgi", "check-local" => "disable", "max-procs" => 1, ) ) ) $ cat /tmp/echo.php <?php $putfp = fopen('php://input', 'r'); $putdata = ''; while($data = fread($putfp, 1024)) $putdata .= $data; fclose($putfp); echo $putdata; ?> $ ls -l /tmp/up -rw-rw-r--. 1 xxxxxx xxxxxx 66560 Oct 7 04:10 /tmp/up $ curl -k --http1.1 -v --data-binary "@/tmp/up" https://localhost:8443/echo.php $ curl -k --http2 -v --data-binary "@/tmp/up" https://localhost:8443/echo.php
Updated by SM about 3 years ago
May be part of strace output will be useful.
FOR SMALL FILE (OK):
16:57:50.537278 write(6, "2021-10-07 16:57:50: (gw_backend.c.2677) handling the request using fastcgi\n", 76) = 76
16:57:50.537522 epoll_wait(17, [{EPOLLIN, {u32=23661552, u64=23661552}}], 2049, 1000) = 1
16:57:50.540969 read(20, "\27\3\3\2{", 5) = 5
16:57:50.541263 read(20, ...skipped...
16:57:50.541505 socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 21
16:57:50.541714 connect(21, {sa_family=AF_UNIX, sun_path="/tmp/php.socket-3"}, 20) = 0
16:57:50.541949 getsockname(20, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("192.168.180.32")}, [112->16]) = 0
16:57:50.542207 epoll_ctl(17, EPOLL_CTL_ADD, 21, {EPOLLIN|EPOLLERR|EPOLLHUP|EPOLLRDHUP, {u32=23553408, u64=23553408}}) = 0
16:57:50.542403 writev(21, [{iov_base="...skipped... PHP is executing
FOR BIG FILE (NOK):
16:47:08.961298 write(6, "2021-10-07 16:47:07: (gw_backend.c.2677) handling the request using fastcgi\n", 76) = 76
16:47:08.961981 epoll_wait(17, [], 2049, 1000) = 0
...more epoll_wait
16:47:28.369025 epoll_wait(17, [], 2049, 1000) = 0
16:47:30.375567 epoll_wait(17, [], 2049, 1000) = 0
...more epoll_wait
16:47:31.388241 epoll_wait(17, [], 2049, 1000) = 0
16:47:43.594420 madvise(0x16bb000, 4096, MADV_DONTNEED) = 0
16:47:43.595296 madvise(0x16af000, 4096, MADV_DONTNEED) = 0
16:47:43.596105 madvise(0x168c000, 4096, MADV_DONTNEED) = 0
16:47:43.596413 madvise(0x169b000, 4096, MADV_DONTNEED) = 0
16:47:43.596732 madvise(0x16b1000, 4096, MADV_DONTNEED) = 0
16:47:43.597234 madvise(0x16b3000, 4096, MADV_DONTNEED) = 0
16:47:30.375567 epoll_wait(17, [], 2049, 1000) = 0
...more epoll_wait
16:47:31.388241 epoll_wait(17, [], 2049, 1000) = 0
16:48:08.171649 write(6, "2021-10-07 16:48:08: (connections.c.1422) request aborted - read timeout: 20\n", 77) = 77
16:48:08.172115 write(7, "192.168.180.29 example.com - [07/Oct/2021:16:48:08 +0300] \"PRI * HTTP/2.0\" 100 95 \"-\" \"-\"\n", 95) = 95
16:48:08.172502 write(20, "\...", 24) = 24
16:48:08.172892 read(20, 0x16c26c3, 5) = -1 EAGAIN (Resource temporarily unavailable)
16:48:08.173142 read(20, 0x16c26c3, 5) = -1 EAGAIN (Resource temporarily unavailable)
16:48:08.173365 epoll_wait(17, [{EPOLLIN|EPOLLRDHUP, {u32=23698304, u64=23698304}}], 2049, 1000) = 1
16:48:08.175280 read(20, "", 5) = 0
16:48:08.175736 epoll_ctl(17, EPOLL_CTL_DEL, 20, NULL) = 0
16:48:08.175989 close(20) = 0
Updated by gstrauss about 3 years ago
Looks like lighttpd might be waiting to read the input from the client, as I do not see that in your strace. Of course, it could be earlier in the strace in the part that you omitted.
Have you read How to get support ? You have yet to include basic information such as your platform, and since you are asking a question, consider that you do not know enough to provide only part of your lighttpd.conf.
Has this worked previously? (probably, based on your title, but you have not said so explicitly.) Did you test with and without HTTP/2 enabled? Did you test changing server.stream-request-body
settings?
Updated by SM about 3 years ago
Nothing was changed except lighttpd version.
I wrote
"In 1.4.59 all files are ok."
"HTTP/2 and TLS"
HTTP/1 works well
Changing "server.stream-request-body" does not help
root@devserv:~# uname -a
Linux devserv 4.19.0-8-686-pae #1 SMP Debian 4.19.98-1+deb10u1 (2020-04-27) i686 GNU/Linux
root@devserv:~# lighttpd -v
lighttpd/1.4.60 (ssl) - a light and fast webserver
Updated by gstrauss about 3 years ago
If there is a missed signal somewhere, I would like to find it. If you can, please share your javascript, your browser version, and client OS so that I can try to reproduce the issue in a browser. What I hope that I have established with my example echo.php
some 8 hours ago is that this is not an issue that happens all the time for all the vague usage you have described (upload slightly more than 64k of data).
PHP script does not start for such requests.
PHP should be started if the request headers are complete and server.stream-request-body = 1
(or 2, but not 0)
Updated by gstrauss about 3 years ago
Is your browser and client aggressively sending stream requests when the HTTP/2 connection is opened?
There were numerous changes made between lighttpd 1.4.59 and lighttpd 1.4.60 to attempt to better handle overly aggressive clients.
#3093
#3100
#3078
If your javascript is making more than 8 stream requests upon the initial connection -- before it finds out from lighttpd via an HTTP/2 SETTINGS frame that the client should not open more than 8 simultaneously active streams at a time -- and those requests have request body (e.g. POST or PUT), and some of the HTTP/2 DATA frames for the active 8 streams occur after a 9th stream is requested, then those 8 active streams might fail to make progress and complete if all are waiting for more DATA and lighttpd has deferred processing of the 9th stream (HTTP/2 HEADERS frame) and any HTTP/2 frames behind it. lighttpd tries to detect this situation and send REFUSED_STREAM for the 9th and further streams when active streams are all waiting for a request body.
Review request headers that lighttpd is receiving with: debug.log-request-header = "enable"
Updated by SM about 3 years ago
Headers are the same. Only difference is content-length & boundary id
2021-10-07 21:47:04: (h2.c.1264) fd:20 id:1 rqst: :method: POST
2021-10-07 21:47:04: (h2.c.1264) fd:20 id:1 rqst: :authority: example.com
2021-10-07 21:47:04: (h2.c.1264) fd:20 id:1 rqst: :scheme: https
2021-10-07 21:47:04: (h2.c.1264) fd:20 id:1 rqst: :path: /files_upload.php?BucketID=0
2021-10-07 21:47:04: (h2.c.1264) fd:20 id:1 rqst: content-length: 65558
2021-10-07 21:47:04: (h2.c.1264) fd:20 id:1 rqst: sec-ch-ua: "Chromium";v="94", "Microsoft Edge";v="94", ";Not A Brand";v="99"
2021-10-07 21:47:04: (h2.c.1264) fd:20 id:1 rqst: sec-ch-ua-mobile: ?0
2021-10-07 21:47:04: (h2.c.1264) fd:20 id:1 rqst: user-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.71 Safari/537.36 Edg/94.0.992.38
2021-10-07 21:47:04: (h2.c.1264) fd:20 id:1 rqst: sec-ch-ua-platform: "Windows"
2021-10-07 21:47:04: (h2.c.1264) fd:20 id:1 rqst: content-type: multipart/form-data; boundary=----WebKitFormBoundaryFxJB33cBifctkwjr
2021-10-07 21:47:04: (h2.c.1264) fd:20 id:1 rqst: accept: */*
2021-10-07 21:47:04: (h2.c.1264) fd:20 id:1 rqst: origin: https://example.com
2021-10-07 21:47:04: (h2.c.1264) fd:20 id:1 rqst: sec-fetch-site: same-origin
2021-10-07 21:47:04: (h2.c.1264) fd:20 id:1 rqst: sec-fetch-mode: cors
2021-10-07 21:47:04: (h2.c.1264) fd:20 id:1 rqst: sec-fetch-dest: empty
2021-10-07 21:47:04: (h2.c.1264) fd:20 id:1 rqst: referer: https://example.com/files.php
2021-10-07 21:47:04: (h2.c.1264) fd:20 id:1 rqst: accept-encoding: gzip, deflate, br
2021-10-07 21:47:04: (h2.c.1264) fd:20 id:1 rqst: accept-language: ru,en;q=0.9,en-GB;q=0.8,en-US;q=0.7
Updated by SM about 3 years ago
It seems that it fails when content-length > 65535
File size still the same, but header size is 1-2 symbols more or less.
2021-10-07 21:59:07: (h2.c.1264) fd:20 id:5 rqst: content-length: 65536
Updated by gstrauss about 3 years ago
Does it fail on the first request where there is only one request on the connection? (id:1
)
Would you review my previous requests for additional information on how to reproduce this?
Updated by SM about 3 years ago
Yes, only id:1 in log
Sorry, cannot send you JS. It is based on FormData and XMLHttpRequest
https://developer.mozilla.org/en-US/docs/Web/API/FormData/Using_FormData_Objects
Updated by gstrauss about 3 years ago
user-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.71 Safari/537.36 Edg/94.0.992.38
Have you tried any other platforms (e.g. Windows, Android) or browsers (Firefox)? Are you on macOS or iOS? Safari?
Updated by SM about 3 years ago
Debian is server where lighttpd is running
Client OS is Windows 10
Chrome, MS Edge, Opera - upload does not work
Firefox - upload works
Chrome:
2021-10-07 22:48:12: (h2.c.1264) fd:20 id:1 rqst: content-length: 65552
2021-10-07 22:48:12: (h2.c.1264) fd:20 id:1 rqst: sec-ch-ua: "Chromium";v="94", "Google Chrome";v="94", ";Not A Brand";v="99"
2021-10-07 22:48:12: (h2.c.1264) fd:20 id:1 rqst: sec-ch-ua-mobile: ?0
2021-10-07 22:48:12: (h2.c.1264) fd:20 id:1 rqst: user-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.71 Safari/537.36
2021-10-07 22:48:12: (h2.c.1264) fd:20 id:1 rqst: sec-ch-ua-platform: "Windows"
2021-10-07 22:48:12: (h2.c.1264) fd:20 id:1 rqst: content-type: multipart/form-data; boundary=----WebKitFormBoundaryT9hHSNhQRT8n10mV
2021-10-07 22:48:12: (h2.c.1264) fd:20 id:1 rqst: accept-language: ru-RU,ru;q=0.9,en-US;q=0.8,en;q=0.7
Firefox:
2021-10-07 22:47:42: (h2.c.1264) fd:20 id:15 rqst: user-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:93.0) Gecko/20100101 Firefox/93.0
2021-10-07 22:47:42: (h2.c.1264) fd:20 id:15 rqst: accept-language: ru-RU,ru;q=0.8,en-US;q=0.5,en;q=0.3
2021-10-07 22:47:42: (h2.c.1264) fd:20 id:15 rqst: content-type: multipart/form-data; boundary=---------------------------31216304779701498193550526257
2021-10-07 22:47:42: (h2.c.1264) fd:20 id:15 rqst: content-length: 65660
2021-10-07 22:47:42: (h2.c.1264) fd:20 id:15 rqst: te: trailers
Updated by gstrauss about 3 years ago
Chrome, MS Edge, Opera - upload does not work
Those are all Chromium-based, so I'll have to see how HTTP/2 frames are being sent compared to Firefox. This will take me some time, so I probably won't add any more comments today.
Updated by gstrauss about 3 years ago
I am unable to reproduce this on a Windows 10 machine (up-to-date with patches) and Chrome (up-to-date with patches). My test machine is Windows 10 Home running Windows Defender, but not SmartScreen. I tested with both a local lighttpd 1.4.60 webserver (running under cygwin) and with a remote lighttpd 1.4.60 webserver running on a Fedora Linux VM in the public cloud. For testing, my test.html was a simple web form which makes a POST request when Submit is clicked. I tested with form data <64k, =64k, and >64k.
Is your environment a corporate environment, perhaps with other software installed which intercepts and rewrites web requests? Are your web requests going through a local proxy, or through an external proxy?
While this might be easily reproducible for you, it is not so for me. I would like to track down this issue, but am unable to do so with the information you have provided, most likely since I do not have an environment configured to be the similar enough to yours. I am willing to work with you to provide some diagnostic patches to issue trace if you are willing to test in your environment.
Updated by SM about 3 years ago
There is no need in PHP at all.
Try to run the following in browser console opened on site running 1.4.60:
fd = new FormData (); fd.append ( "data", [...Array(66000)] ); xhr = new XMLHttpRequest (); xhr.open ( "POST", '/' ); xhr.send ( fd );
Switch to Network tab - request hangs.
Updated by SM about 3 years ago
If value is too big (...Array(10000000)) then it closes connection much faster and returns 413
Updated by SM about 3 years ago
to provide some diagnostic patches to issue trace if you are willing to test in your environment
it's fine. I will try your patch
Updated by gstrauss about 3 years ago
Try to run the following in browser console opened on site running 1.4.60:
fd = new FormData (); fd.append ( "data", [...Array(66000)] ); xhr = new XMLHttpRequest (); xhr.open ( "POST", '/' ); xhr.send ( fd );
That worked properly for me from Windows 10 running Chrome with a simple HTML page running the above code in a <script> block in the <head> section.
(I modified only the "POST" target from the code snippet above).
it's fine. I will try your patch
I am working on an initial diagnostics patch which may provide guidance for adding subsequent diagnostics.
If value is too big (...Array(10000000)) then it closes connection much faster and returns 413
That would be due to server.max-request-size
in your lighttpd.conf
Please consider sharing the output of lighttpd -f /etc/lighttpd/lighttpd.conf -p
, replacing any sensitive info with "xxxxxxx"
Updated by gstrauss about 3 years ago
Perhaps the problem is caused by the combination of certain lighttpd.conf directives. If you are unable to share your lighttpd.conf, please try to disable everything other than what is minimally required, and see if the problem still occurs for you.
server.document-root = "/path/to/docroot" server.port = 443 server.modules += ("mod_cgi") cgi.assign = (".php" => "/usr/bin/php-cgi") server.modules += ("mod_openssl") $SERVER["socket"] == ":443" { ssl.engine = "enable" ssl.pemfile = "/path/to/cert.pem" }
Updated by gstrauss about 3 years ago
This patch is not for production use, since it can result in lighttpd spinning on the CPU under certain conditions. Please temporarily test with this patch, but I do not recommend leaving it in place. If it make your issue go away, then it suggests that lighttpd has received some data into buffers, has not completely processed the buffers, and is not rescheduling the connection to process the remaining data (HTTP/2 frames) in the buffers.
--- a/src/connections.c +++ b/src/connections.c @@ -1231,7 +1231,9 @@ connection_state_machine_h2 (request_st * const h2r, connection * const con) } /* process requests on HTTP/2 streams */ - int resched = 0; + int resched = (h2c->sent_goaway <= 0 + && !chunkqueue_is_empty(con->read_queue) + && h2c->rused != sizeof(h2c->r)/sizeof(*h2c->r)); if (h2c->sent_goaway <= 0 && h2c->rused) { /* coarse check for write throttling * (connection.kbytes-per-second, server.kbytes-per-second)
Time allowing, I'll probably whip up a patch to issue diagnostics trace for received and sent HTTP/2 frames. (Probably Tuesday)
Updated by SM about 3 years ago
- File lighttpd1.conf lighttpd1.conf added
Patch did't help
Config is attached
Updated by SM about 3 years ago
Minimum config did'nt help.
Issue disappears after replacing index.php by index.html
Both index.php and index.html have the same content:
<html> <body> 1 </body> </html>
PHP 8.0.7 (cli) (built: Jun 4 2021 10:11:02) ( NTS )
Copyright (c) The PHP Group
Zend Engine v4.0.7, Copyright (c) Zend Technologies
with Zend OPcache v8.0.7, Copyright (c), by Zend Technologies
8.0.1 - the same
Updated by gstrauss about 3 years ago
Patch did't help
Ok, that suggests the problem is elsewhere. I wanted to rule out the situation where lighttpd had recv
'd data from the socket, but not processed all of the received HTTP/2 frames, and then did not get rescheduled to process the remaining HTTP/2 frames.
Issue disappears after replacing index.php by index.html
Since default index.html handling does not attempt to read the request body, that would be expected.
Thank you for sharing your config. Nothing specific jumped out at me to look into further, but I will try testing with your config. (If you could share the minimal config you tested with, and that still exhibited the issue for you, then that might help me to narrow things down.)
The diagnostics patch (~ Tue) should help determine whether or not lighttpd is processing the frames with the HTTP/2 request body.
In the meantime, debug.log-response-header = "enable"
might show you if lighttpd is responding to the request with an HTTP response.
Updated by SM about 3 years ago
I have installed Debian VM from scratch. Issue is reproduced.
If you want and you have OpenVPN client I can give you OpenVPN config and you will be able to login directly to the VM.
Updated by gstrauss about 3 years ago
If you want and you have OpenVPN client I can give you OpenVPN config and you will be able to login directly to the VM.
Thanks! That will hopefully let us converge more quickly on a solution.
AFK today. Will contact you tomorrow afternoon.
Updated by gstrauss about 3 years ago
- Status changed from Need Feedback to Patch Pending
- Target version changed from 1.4.xx to 1.4.61
There appears to be an issue in commit 28f1d010 in the branch of code taken when server.max-request-size
is non-zero.
It looks like I mistakenly applied the SETTINGS for initial window size sent by the client over the initial window size sent by the server (lighttpd), and so am (incorrectly) not sending a window update frame for the stream id. The lighttpd window size for the stream goes to 0, resulting in the client pausing sending more data, and lighttpd eventually timing out the connection.
I'll put together a patch with some fresh eyes after I get some sleep.
Updated by gstrauss about 3 years ago
Candidate patch:
--- a/src/h2.c +++ b/src/h2.c @@ -2552,7 +2552,7 @@ h2_init_stream (request_st * const h2r, connection * const con) /* XXX: TODO: assign default priority, etc. * Perhaps store stream id and priority in separate table */ h2c->r[h2c->rused++] = r; - r->h2_rwin = h2c->s_initial_window_size; + r->h2_rwin = 65535; /* must keep in sync with h2_init_con() */ r->h2_swin = h2c->s_initial_window_size; r->http_version = HTTP_VERSION_2;
Since way back when I first wrote the HTTP/2 code in lighttpd, the value assigned to r->h2_rwin was not correct if the initial window size was changed by the client. However, it did not make a real difference until commit 28f1d010
Updated by gstrauss about 3 years ago
- Subject changed from File upload is broken after upgrade from 1.4.59 to 1.4.60 to File upload is broken after upgrade from 1.4.59 to 1.4.60 w/ server.max-request-size
Updated by gstrauss about 3 years ago
- Status changed from Patch Pending to Fixed
Applied in changeset f5b5537ef16463ad357030f1e281b3a478bab614.
Also available in: Atom