Bug #3102
closedChrome gives random net::ERR_HTTP2_PROTOCOL_ERROR
Description
When trying with https://git.lighttpd.net/lighttpd/lighttpd1.4/src/branch/master I get random error from Chrome.
net::ERR_HTTP2_PROTOCOL_ERROR
It is very difficult to reproduce the error. I have to work in a web application that I have created that have a lot of GET and PUT for some time to get the error.
When turning on
debug.log-request-header = "enable"
debug.log-response-header = "enable"
debug.log-request-handling = "enable"
I'm unable to get the error.
I've tried to add log_error lines in h2.c in all places that could produce the error. (I get "normal" output with "h2_send_refused_stream" when trying 12 PUT from my application as expected.) I'm unsure if this (more than 8 PUTs) is required prior to getting the protocol error. When restarting the application (refresh in browser) I get h2_send_goaway with H2_E_NO_ERROR. But when the error "net::ERR_HTTP2_PROTOCOL_ERROR" comes in the browser there is no additional lines in the error log.
I need some more advice on how to track this down.
Files
Updated by gstrauss about 3 years ago
I need some more advice on how to track this down.
Well, I need some sleep :), so here are some ideas to start.
The debug logging would probably be helpful, but you've tried that.
There are quite a few places in src/h2.c
which might send HTTP/2 PROTOCOL_ERROR (lighttpd enum H2_E_PROTOCOL_ERROR
), but we do not know if lighttpd is sending it, or if lighttpd is doing something else and Chrome is reporting net::ERR_HTTP2_PROTOCOL_ERROR
While you could build lighttpd with --with-libunwind
and annotate src/h2.c
to print a stack trace (ck_bt()
from src/ck.h
) in the h2_send_*()
funcs if H2_E_PROTOCOL_ERROR
is the error code, that might not be worth the effort unless you find a way to reproduce the error more easily.
Instead, I would suggest starting with lighttpd 1.4.59 -- perhaps add the final result of the patches for #3093 and #3100 -- and see if you can reproduce the error with that. As you suggested, perhaps the issue is exposed in combination with > 8 PUTs at the beginning of the HTTP/2 connection. That would be my hunch, too. After all, #3100 modified the patch for #3093 specifically to restore lighttpd 1.4.59 behavior to handle a large burst of requests received immediately at the beginning of the HTTP/2 connection. The behavior was modified for #3093 (and then again in #3100) for the flurry of PUT requests, and sends REFUSED_STREAM for the excess PUTs beyond what lighttpd is willing to handle in parallel for that HTTP/2 connection.
If you find that you can trigger the bug a little bit more easily (not currently the case), you might build and test lighttpd for each revision where src/h2.c
has been modified since lighttpd 1.4.59, as there are not too many changes, and most are fairly small changes.
Updated by gstrauss about 3 years ago
You might test with a burst of requests at connect time (after waiting for lighttpd to close the HTTP/2 connection after idle timeout (server.max-read-idle
))
The default for server.max-read-idle is 60 seconds, so you probably want to tune that down to something like 2 or 3 (seconds) during testing.
Send a burst of GET requests followed by PUTs followed by GETs, and test PUTs followed by GETs, varying the first amount by < 8 or > 8.
Updated by stenvaag about 3 years ago
I managed to get the error with request and response logging enabled. I'm still unsure if this is my fcgi backend or lighttpd.
I observe that id:9 is reused prior to the last one getting a response. Prior to this error I am sending the usual 12 PUTs in a burst, then some GETs, and finally a small burst of PUT's where the failing one is the first PUT. The PUT is from a ajax call the GET is from image loading in the map component in my application.
2021-09-14 19:41:55: h2.c.1264) fd:11 id:9 rqst: :method: PUT 2021-09-14 19:41:55: h2.c.1264) fd:11 id:9 rqst: :authority: gollum.agromatic.no 2021-09-14 19:41:55: h2.c.1264) fd:11 id:9 rqst: :scheme: https 2021-09-14 19:41:55: h2.c.1264) fd:11 id:9 rqst: :path: /tstwst/amweb/data/maps/22951?_dc=1631641308834&ID=FDTFMV64HFAGN5O6TRN4CMDT08&TID=1 2021-09-14 19:41:55: h2.c.1264) fd:11 id:9 rqst: content-length: 123 2021-09-14 19:41:55: h2.c.1264) fd:11 id:9 rqst: sec-ch-ua: "Google Chrome";v="93", " Not;A Brand";v="99", "Chromium";v="93" 2021-09-14 19:41:55: h2.c.1264) fd:11 id:9 rqst: sec-ch-ua-mobile: ?0 2021-09-14 19:41:55: h2.c.1264) fd:11 id:9 rqst: user-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4577.63 Safari/537.36 2021-09-14 19:41:55: h2.c.1264) fd:11 id:9 rqst: sec-ch-ua-platform: "Windows" 2021-09-14 19:41:55: h2.c.1264) fd:11 id:9 rqst: content-type: application/json 2021-09-14 19:41:55: h2.c.1264) fd:11 id:9 rqst: accept: */* 2021-09-14 19:41:55: h2.c.1264) fd:11 id:9 rqst: origin: https://gollum.agromatic.no 2021-09-14 19:41:55: h2.c.1264) fd:11 id:9 rqst: sec-fetch-site: same-origin 2021-09-14 19:41:55: h2.c.1264) fd:11 id:9 rqst: sec-fetch-mode: cors 2021-09-14 19:41:55: h2.c.1264) fd:11 id:9 rqst: sec-fetch-dest: empty 2021-09-14 19:41:55: h2.c.1264) fd:11 id:9 rqst: referer: https://gollum.agromatic.no/amweb/?ID=FDTFMV64HFAGN5O6TRN4CMDT08 2021-09-14 19:41:55: h2.c.1264) fd:11 id:9 rqst: accept-encoding: gzip, deflate, br 2021-09-14 19:41:55: h2.c.1264) fd:11 id:9 rqst: accept-language: nb,no;q=0.9,da;q=0.8,sv;q=0.7,en;q=0.6 2021-09-14 19:41:55: h2.c.1264) fd:11 id:9 rqst: cookie: G_ENABLED_IDPS=google 2021-09-14 19:41:55: h2.c.1264) fd:11 id:9 rqst: :method: GET 2021-09-14 19:41:55: h2.c.1264) fd:11 id:9 rqst: :authority: gollum.agromatic.no 2021-09-14 19:41:55: h2.c.1264) fd:11 id:9 rqst: :scheme: https 2021-09-14 19:41:55: h2.c.1264) fd:11 id:9 rqst: :path: /tstwst/amweb/wms?ID=FDTFMV64HFAGN5O6TRN4CMDT08&LAYERS=sptema&SRS=EPSG%3A32632&FORMAT=image%2Fpng&TRANSPARENT=TRUE&SERVICE=WMS&VERSION=1.1.1&REQUEST=GetMap&STYLES=&_OLSALT=0.4320661165774786&BBOX=608481,6744861,609158,6745538&WIDTH=256&HEIGHT=256 2021-09-14 19:41:55: h2.c.1264) fd:11 id:9 rqst: sec-ch-ua: "Google Chrome";v="93", " Not;A Brand";v="99", "Chromium";v="93" 2021-09-14 19:41:55: h2.c.1264) fd:11 id:9 rqst: sec-ch-ua-mobile: ?0 2021-09-14 19:41:55: h2.c.1264) fd:11 id:9 rqst: user-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4577.63 Safari/537.36 2021-09-14 19:41:55: h2.c.1264) fd:11 id:9 rqst: sec-ch-ua-platform: "Windows" 2021-09-14 19:41:55: h2.c.1264) fd:11 id:9 rqst: accept: image/jxl,image/avif,image/webp,image/apng,image/svg+xml,image/*,*/*;q=0.8 2021-09-14 19:41:55: h2.c.1264) fd:11 id:9 rqst: sec-fetch-site: same-origin 2021-09-14 19:41:55: h2.c.1264) fd:11 id:9 rqst: sec-fetch-mode: no-cors 2021-09-14 19:41:55: h2.c.1264) fd:11 id:9 rqst: sec-fetch-dest: image 2021-09-14 19:41:55: h2.c.1264) fd:11 id:9 rqst: referer: https://gollum.agromatic.no/amweb/?ID=FDTFMV64HFAGN5O6TRN4CMDT08 2021-09-14 19:41:55: h2.c.1264) fd:11 id:9 rqst: accept-encoding: gzip, deflate, br 2021-09-14 19:41:55: h2.c.1264) fd:11 id:9 rqst: accept-language: nb,no;q=0.9,da;q=0.8,sv;q=0.7,en;q=0.6 2021-09-14 19:41:55: h2.c.1264) fd:11 id:9 rqst: cookie: G_ENABLED_IDPS=google 2021-09-14 19:41:55: h2.c.1888) fd:11 id:9 resp: :status: 200 2021-09-14 19:41:55: h2.c.1877) fd:11 id:9 resp: content-type: image/png 2021-09-14 19:41:55: h2.c.1877) fd:11 id:9 resp: content-length: 116 2021-09-14 19:41:55: h2.c.1877) fd:11 id:9 resp: accept-ranges: bytes 2021-09-14 19:41:55: h2.c.1888) fd:11 id:9 resp: date: Tue, 14 Sep 2021 17:41:55 GMT 2021-09-14 19:41:55: h2.c.1888) fd:11 id:9 resp: server: lighttpd/1.4.60
Updated by gstrauss about 3 years ago
- Is Chrome erroneously repeating the same id?
- Why is lighttpd not rejecting the repeated id?
Updated by stenvaag about 3 years ago
There is probably a "goaway" here that is not logged. I removed all my logging (of goaway). I could try to reinsert, but it is difficult to get the error.
I will try to modify my fcgi backend to return an empty response on a request. If this creates a similar result this could be the error.
Updated by gstrauss about 3 years ago
lighttpd src/h2.c
currently ignores HTTP trailers. Here's a patch to detect and reject if an HTTP/2 HEADERS frame is treated as HTTP trailers, and contains an HTTP/2 pseudo-header (starting with ":").
--- a/src/h2.c +++ b/src/h2.c @@ -1265,9 +1265,12 @@ h2_parse_headers_frame (request_st * const restrict r, const unsigned char *psrc "fd:%d id:%u rqst: %.*s: %.*s", r->con->fd, r->h2id, (int)hpctx.klen, hpctx.k, (int)hpctx.vlen, hpctx.v); - r->http_status = http_request_parse_header(r, &hpctx); - if (0 != r->http_status) + const int http_status = http_request_parse_header(r, &hpctx); + if (__builtin_expect( (0 != http_status), 0)) { + if (r->http_status == 0) /*might be set if processing trailers*/ + r->http_status = http_status; break; + } } #if 0 /*(see catch-all below)*/ /* Send GOAWAY (further below) (decoder state not maintained on error) @@ -1319,7 +1322,7 @@ h2_parse_headers_frame (request_st * const restrict r, const unsigned char *psrc rq->bytes_in += (off_t)hpctx.hlen; rq->bytes_out += (off_t)hpctx.hlen; - if (0 == r->http_status && LSHPACK_OK == rc) { + if (0 == r->http_status && LSHPACK_OK == rc && !trailers) { if (hpctx.pseudo) r->http_status = http_request_validate_pseudohdrs(r, hpctx.scheme, --- a/src/request.c +++ b/src/request.c @@ -749,6 +749,9 @@ http_request_parse_header (request_st * const restrict r, http_header_parse_ctx } } else { /*(trailers)*/ + if (*k == ':') + return http_request_header_line_invalid(r, 400, + "invalid pseudo-header in trailers -> 400"); /* ignore trailers (after required HPACK decoding) if streaming * request body to backend since headers have already been sent * to backend via Common Gateway Interface (CGI) (CGI, FastCGI,
Updated by gstrauss about 3 years ago
id:9 for the PUT request is the 5th client stream on the HTTP/2 connection (1,3,5,7,9), so you might check the logs and match up the file descriptors (e.g. fd:11) to see the sequence of requests on the connection that had the issue (e.g. 5 PUTs and then GET?)
For testing purposes to try to reproduce the issue, have you tried tuning down the read idle timeout? server.max-read-idle = 1
Updated by stenvaag about 3 years ago
With the above patches for trailers I imediately got the protocol error from the first PUT of the 12, the rest succeeded.
Updated by stenvaag about 3 years ago
- File error_log.zip error_log.zip added
Here is the complete error.log from yestday.
Updated by gstrauss about 3 years ago
Looking at error_log.zip and trying to match up against what you posted above, it looks like you may have selected fd:11 id:9
from multiple sections of the log file, and those look to be on different connections.debug.log-state-handling = "enable"
will show connection open and close, as well as request state (so might get a little noisy).
I do see that the one PUT you selected did not get an HTTP response, though it may have gotten an HTTP/2 error on the stream.
With the above patches for trailers I imediately got the protocol error from the first PUT of the 12, the rest succeeded.
Are you including trailers in your HTTP/2 requests? The patches I posted above should only affect HTTP/2 connections when trailers are sent on the client request stream.
The prior patches for #3093 and #3100 should reject the 9th PUT request on the stream (id 17) if all initial requests on the connection are PUT, and all are sent prior to the client sending a SETTINGS ACK to lighttpd. After the client sends the SETTINGS ACK to lighttpd, the client "should not" send requests that would result in more than 8 active streams at a time, since lighttpd sends SETTINGS_MAX_CONCURRENT_STREAMS 8
in SETTINGS as part of the HTTP/2 Connection Preface that lighttpd sends to the client.
If you can now more easily reproduce this, would you attach a log with debugging enabled and include debug.log-state-handling = "enable"
?
Also, is the behavior with the above patches (for trailers) what you expect?
Updated by stenvaag about 3 years ago
- File error_log_timout_1.zip error_log_timout_1.zip added
Attached is a log with server.max-read-idle = 1. A grep for "driftsenheter" in the log, the last one at "12:49:19" fails with protocol error. Looks like there is a new "fd" opened just after.
Updated by stenvaag about 3 years ago
I'm unable to get the error with debug.log-state-handling = "enable"
Updated by gstrauss about 3 years ago
The connection with the GET for "driftsenheter" in the log, the last one at "12:49:19" looks like it (the connection) was opened a few seconds earlier at 12:49:13. The logs do not contain every HTTP/2 frame sent back and forth (that would be noisy!) so we don't see what is happening at that layer.
Is there anything in the lighttpd error log around that time?
I'll try to put together some patches later today to add some additional instrumentation. I hope to have them ready for you by tomorrow morning (your timezone).
Updated by stenvaag about 3 years ago
Is there anything in the lighttpd error log around that time?
You have a complete copy of the error.log.
Looks like the connection is closed and opened again here
2021-09-15 12:47:44: h2.c.1888) fd:10 id:259 resp: server: lighttpd/1.4.60
2021-09-15 12:49:13: h2.c.1264) fd:10 id:1 rqst: :method: GET
Updated by stenvaag about 3 years ago
I modified connections.c to only include "connection closed" messages. Then I was able to get the protocol error:
... 2021-09-15 13:55:28: h2.c.1888) fd:10 id:277 resp: server: lighttpd/1.4.60 2021-09-15 13:55:34: connections.c.1458) connection closed - keep-alive timeout: 10 2021-09-15 13:55:34: h2.c.1264) fd:10 id:279 rqst: :method: PUT 2021-09-15 13:55:34: h2.c.1264) fd:10 id:279 rqst: :authority: gollum.agromatic.no 2021-09-15 13:55:34: h2.c.1264) fd:10 id:279 rqst: :scheme: https 2021-09-15 13:55:34: h2.c.1264) fd:10 id:279 rqst: :path: /tstwst/amweb/data/maps/22951?_dc=1631706927914&ID=A3P03BA18GKPTR2ENTRJFH20D8&TID=1 2021-09-15 13:55:34: h2.c.1264) fd:10 id:279 rqst: content-length: 118 2021-09-15 13:55:34: h2.c.1264) fd:10 id:279 rqst: sec-ch-ua: "Google Chrome";v="93", " Not;A Brand";v="99", "Chromium";v="93" 2021-09-15 13:55:34: h2.c.1264) fd:10 id:279 rqst: sec-ch-ua-mobile: ?0 2021-09-15 13:55:34: h2.c.1264) fd:10 id:279 rqst: user-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4577.63 Safari/537.36 2021-09-15 13:55:34: h2.c.1264) fd:10 id:279 rqst: sec-ch-ua-platform: "Windows" 2021-09-15 13:55:34: h2.c.1264) fd:10 id:279 rqst: content-type: application/json 2021-09-15 13:55:34: h2.c.1264) fd:10 id:279 rqst: accept: */* 2021-09-15 13:55:34: h2.c.1264) fd:10 id:279 rqst: origin: https://gollum.agromatic.no 2021-09-15 13:55:34: h2.c.1264) fd:10 id:279 rqst: sec-fetch-site: same-origin 2021-09-15 13:55:34: h2.c.1264) fd:10 id:279 rqst: sec-fetch-mode: cors 2021-09-15 13:55:34: h2.c.1264) fd:10 id:279 rqst: sec-fetch-dest: empty 2021-09-15 13:55:34: h2.c.1264) fd:10 id:279 rqst: referer: https://gollum.agromatic.no/amweb/?ID=A3P03BA18GKPTR2ENTRJFH20D8 2021-09-15 13:55:34: h2.c.1264) fd:10 id:279 rqst: accept-encoding: gzip, deflate, br 2021-09-15 13:55:34: h2.c.1264) fd:10 id:279 rqst: accept-language: nb,no;q=0.9,da;q=0.8,sv;q=0.7,en;q=0.6 2021-09-15 13:55:34: h2.c.1264) fd:10 id:279 rqst: cookie: G_ENABLED_IDPS=google 2021-09-15 13:55:34: connections.c.109) connection closed for fd 10 2021-09-15 13:55:34: h2.c.1264) fd:10 id:1 rqst: :method: PUT ...
Updated by gstrauss about 3 years ago
At first glance, that appears to be proper behavior.2021-09-15 13:55:34: connections.c.1458) connection closed - keep-alive timeout: 10
After that, lighttpd will have sent GOAWAY to the client, marking stream id:277 as the last handled request, so when stream id:279 comes in, lighttpd will parse the HPACK and then discard the new request (see end of h2_recv_headers()
)
Updated by gstrauss about 3 years ago
Some additional patches which might help your tracing. There are all optional and for testing only. The patch to log_buffer_prepare()
adds a high-precision timestamp. lighttpd sends SETTINGS to client at the beginning of a new connection.
--- a/src/h2.c +++ b/src/h2.c @@ -313,6 +313,7 @@ h2_send_rst_stream_id (uint32_t h2id, connection * const con, const request_h2er rst_stream.u[3] = htonl(e); chunkqueue_append_mem(con->write_queue, /*(+3 to skip over align padding)*/ (const char *)rst_stream.c+3, sizeof(rst_stream)-3); +log_error(con->srv->errh, __FILE__, __LINE__, "fd:%d id:%u RST_STREAM", con->fd, h2id); } @@ -396,6 +397,7 @@ h2_send_goaway (connection * const con, const request_h2error_t e) goaway.u[4] = htonl(e); chunkqueue_append_mem(con->write_queue, /*(+3 to skip over align padding)*/ (const char *)goaway.c+3, sizeof(goaway)-3); +log_error(con->srv->errh, __FILE__, __LINE__, "fd:%d GOAWAY last-stream-id:%u", con->fd, h2c->h2_cid); } @@ -1803,6 +1805,7 @@ h2_init_con (request_st * const restrict h2r, connection * const restrict con, c /*h2r->h2_rwin += 65537;*//*(enable if WINDOWS_UPDATE is sent above)*/ chunkqueue_append_mem(con->write_queue, (const char *)h2settings, sizeof(h2settings)); +log_error(con->srv->errh, __FILE__, __LINE__, "fd:%d SETTINGS", con->fd); if (!h2_recv_client_connection_preface(con)) { /*(alternatively, func ptr could be saved in an element in (h2con *))*/ --- a/src/log.c +++ b/src/log.c @@ -103,6 +103,19 @@ static int log_buffer_prepare(const log_error_st *errh, const char *filename, un if (-1 == errh->fd) return -1; /* cache the generated timestamp */ if (__builtin_expect( (tlast != log_epoch_secs), 0)) { + if (1) { /*(do not cache if high-precision; do not update tlast)*/ + struct timespec ts = { 0, 0 }; + log_clock_gettime_realtime(&ts); + buffer_clear(b); + /* alt: buffer_append_strftime(b, "%F %T", &tm); */ + buffer_append_int(b, TIME64_CAST(ts.tv_sec)); + buffer_append_string_len(b, CONST_STR_LEN(".000000000: (")); + char n[LI_ITOSTRING_LENGTH]; + const size_t nlen = + li_utostrn(n, sizeof(n), (unsigned long)ts.tv_nsec); + memcpy(b->ptr+buffer_clen(b)-nlen-3, n, nlen); + break; + } struct tm tm; tlast = log_epoch_secs; tlen = (uint32_t)
Updated by stenvaag about 3 years ago
1631777393.420149005: (h2.c.1891) fd:10 id:71 resp: server: lighttpd/1.4.60 1631777398.812910701: (connections.c.1458) connection closed - keep-alive timeout: 10 1631777398.812985521: (h2.c.1266) fd:10 id:73 rqst: :method: PUT 1631777398.812993524: (h2.c.1266) fd:10 id:73 rqst: :authority: gollum.agromatic.no 1631777398.812998263: (h2.c.1266) fd:10 id:73 rqst: :scheme: https 1631777398.813004961: (h2.c.1266) fd:10 id:73 rqst: :path: /tstwst/skplweb/data/skiftevekster.json?_dc=1631777396335&ID=SQSAT1JH6DGCUNLID1L65A0HBG&TID=1 1631777398.813008681: (h2.c.1266) fd:10 id:73 rqst: content-length: 325 1631777398.813012768: (h2.c.1266) fd:10 id:73 rqst: sec-ch-ua: "Google Chrome";v="93", " Not;A Brand";v="99", "Chromium";v="93" 1631777398.813034670: (h2.c.1266) fd:10 id:73 rqst: sec-ch-ua-mobile: ?0 1631777398.813038789: (h2.c.1266) fd:10 id:73 rqst: user-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4577.63 Safari/537.36 1631777398.813043807: (h2.c.1266) fd:10 id:73 rqst: sec-ch-ua-platform: "Windows" 1631777398.813047609: (h2.c.1266) fd:10 id:73 rqst: content-type: application/json 1631777398.813051198: (h2.c.1266) fd:10 id:73 rqst: accept: */* 1631777398.813054996: (h2.c.1266) fd:10 id:73 rqst: origin: https://gollum.agromatic.no 1631777398.813058436: (h2.c.1266) fd:10 id:73 rqst: sec-fetch-site: same-origin 1631777398.813061696: (h2.c.1266) fd:10 id:73 rqst: sec-fetch-mode: cors 1631777398.813064918: (h2.c.1266) fd:10 id:73 rqst: sec-fetch-dest: empty 1631777398.813068162: (h2.c.1266) fd:10 id:73 rqst: referer: https://gollum.agromatic.no/amweb/?ID=SQSAT1JH6DGCUNLID1L65A0HBG 1631777398.813071560: (h2.c.1266) fd:10 id:73 rqst: accept-encoding: gzip, deflate, br 1631777398.813076091: (h2.c.1266) fd:10 id:73 rqst: accept-language: nb,no;q=0.9,da;q=0.8,sv;q=0.7,en;q=0.6 1631777398.813079753: (h2.c.1266) fd:10 id:73 rqst: cookie: G_ENABLED_IDPS=google 1631777398.813118960: (h2.c.400) fd:10 GOAWAY last-stream-id:73 1631777398.813123670: (h2.c.316) fd:10 id:73 RST_STREAM 1631777398.814291420: (connections.c.109) connection closed for fd 10 1631777398.814907361: (h2.c.1803) fd:10 SETTINGS 1631777398.826927622: (h2.c.1266) fd:10 id:1 rqst: :method: PUT
Updated by gstrauss about 3 years ago
It looks like there is a short period of time after lighttpd decides to close the idle connection, and during which new data can arrive on the connection, before lighttpd sends GOAWAY. Since lighttpd read the new stream request, it sends a RST_STREAM, which the browser might not retry. Please try this, which may result in lighttpd sending GOAWAY with the previous stream id (in the example it would be 71) before reading and then having to send RST_STREAM for the PUT in id 73. Chrome should then resend that request.
--- a/src/connections.c +++ b/src/connections.c @@ -1460,6 +1460,7 @@ static void connection_check_timeout (connection * const con, const unix_time64_ con->fd); } connection_set_state(r, CON_STATE_RESPONSE_END); + con->is_readable = 0; changed = 1; } }
Updated by stenvaag about 3 years ago
I'm now unable to get the error.
...
1631780629.700954970: (h2.c.1891) fd:10 id:2639 resp: server: lighttpd/1.4.60
1631780635.707591418: (connections.c.1458) connection closed - keep-alive timeout: 10
1631780635.707629042: (h2.c.400) fd:10 GOAWAY last-stream-id:2639
1631780635.708262156: (connections.c.109) connection closed for fd 10
1631780638.661998649: (h2.c.1803) fd:10 SETTINGS
1631780638.668634915: (h2.c.1266) fd:10 id:1 rqst: :method: PUT
...
1631780648.773547098: (h2.c.1891) fd:10 id:251 resp: server: lighttpd/1.4.60
1631780654.780188118: (connections.c.1458) connection closed - keep-alive timeout: 10
1631780654.780226409: (h2.c.400) fd:10 GOAWAY last-stream-id:251
1631780654.780949803: (connections.c.109) connection closed for fd 10
1631780656.944193513: (h2.c.1803) fd:10 SETTINGS
1631780656.949804056: (h2.c.1266) fd:10 id:1 rqst: :method: PUT
...
1631780661.073669198: (h2.c.1891) fd:10 id:59 resp: server: lighttpd/1.4.60
1631780666.757065202: (connections.c.1458) connection closed - keep-alive timeout: 10
1631780666.757160603: (h2.c.400) fd:10 GOAWAY last-stream-id:59
1631780666.758526292: (connections.c.109) connection closed for fd 10
1631780666.759218003: (h2.c.1803) fd:10 SETTINGS
1631780666.767826790: (h2.c.1266) fd:10 id:1 rqst: :method: GET
Updated by gstrauss about 3 years ago
- Status changed from New to Patch Pending
Great! Thank you very much for your time and persistence in helping track this down.
This tightens up the code a bit, which was not expecting to have a short race receiving data after a read timeout. After all, the read idle timer (measured in seconds) expired!
These patches will be on lighttpd git master shortly, and will be part of lighttpd 1.4.60, expected to be released this month.
BTW, you might want to double-check your FastCGI backend and how it handles reading errors or truncated reads, e.g. unexpected EOF. Before the above patch, lighttpd may have started the connection to your FastCGI backend, but then aborted by closing the connection. Your FastCGI probably read EOF and got 0 bytes, and if you looked in your read buffer for the FastCGI id without checking that you at least read enough bytes to fill the FastCGI header, the buffer would contain whatever garbage was previously there.
Updated by gstrauss about 3 years ago
- Status changed from Patch Pending to Fixed
Applied in changeset 94bac825dc03fe30f9d2b6067d589a31902502fb.
Updated by stenvaag about 3 years ago
Thanks for your tip about my FastCGI backend. You where correct about a random byte was checked when the connection was closed.
Also available in: Atom