https://redmine.lighttpd.net/https://redmine.lighttpd.net/favicon.ico?13667327412021-10-14T16:36:53Zlighty labsLighttpd - Bug #3111: HTTP/2 requests sometimes take very long (missing last chunk)https://redmine.lighttpd.net/issues/3111?journal_id=128052021-10-14T16:36:53Zgstrauss
<ul></ul><p>might be related to the (also recently submitted) <a class="external" href="https://redmine.lighttpd.net/boards/2/topics/10063">https://redmine.lighttpd.net/boards/2/topics/10063</a></p> Lighttpd - Bug #3111: HTTP/2 requests sometimes take very long (missing last chunk)https://redmine.lighttpd.net/issues/3111?journal_id=128062021-10-14T20:01:02Zgstrauss
<ul></ul><p>Do you have a non-TLS port on which lighttpd is listening? Does the issue occur with <code>curl --http-prior-knowledge</code> sent to the non-TLS (clear) port? I am trying to narrow down if this might possibly related to HTTP/2 + TLS buffering.</p> Lighttpd - Bug #3111: HTTP/2 requests sometimes take very long (missing last chunk)https://redmine.lighttpd.net/issues/3111?journal_id=128072021-10-14T20:20:17Zflynn
<ul></ul><p>For tests I enabled a non-TLS port and tested with <code>curl --http2-prior-knowledge</code>: no failure in 10 attempts.</p>
<p>Exact at the byte position, where the TLS connection sometimes hangs, curl in verbose mode reports<br /><pre>
Connection #0 to host xxx left intact
</pre><br />and continues.</p> Lighttpd - Bug #3111: HTTP/2 requests sometimes take very long (missing last chunk)https://redmine.lighttpd.net/issues/3111?journal_id=128082021-10-14T20:30:14Zflynn
<ul></ul><p>According to my observations the content length of the response should be a little below <code>server.chunkqueue-chunk-sz</code>, so that the whole response with headers is a slightly larger than <code>server.chunkqueue-chunk-sz</code>.</p> Lighttpd - Bug #3111: HTTP/2 requests sometimes take very long (missing last chunk)https://redmine.lighttpd.net/issues/3111?journal_id=128092021-10-14T20:46:22Zgstrauss
<ul></ul><p>Thanks. I'll have more time to dig into this later this evening.</p> Lighttpd - Bug #3111: HTTP/2 requests sometimes take very long (missing last chunk)https://redmine.lighttpd.net/issues/3111?journal_id=128102021-10-15T08:17:16Zgstrauss
<ul></ul><p>I have not been able to reproduce this with a trivial lighttpd backend listening on a unix domain socket, but am still trying.</p>
<p>How is the backend responding? HTTP/1.0 or HTTP/1.1? Is <code>Content-Length</code> provided? Is it using <code>Transfer-Encoding: chunked</code>? (You might try <code>proxy.header += ("force-http10" => "enable")</code>)<br />Does the backend send the response in full, or are there pauses between writes?</p>
<p>Is server.stream-response-body non-zero? (I am assuming that response streaming is enabled.)</p>
<p>Thinking aloud:</p>
<p>Default <code>server.chunkqueue-chunk-sz</code> is 8k, and lighttpd tries to send data to TLS modules in 16k blocks, so a HTTP/2 frames for response headers and response body should fit into a single 16k block, and should be sent in a single TLS record (if all data has been received from backend)</p>
<p>If you have alternate TLS modules installed, you might test with lighttpd mod_gnutls to see if that makes a difference.<br />What version of openssl libs are you using? 1.1.1l or 3.0.0?</p> Lighttpd - Bug #3111: HTTP/2 requests sometimes take very long (missing last chunk)https://redmine.lighttpd.net/issues/3111?journal_id=128152021-10-17T23:16:44Zgstrauss
<ul></ul><p>I found one issue, but there might be more.<br /><pre>
--- a/src/http-header-glue.c
+++ b/src/http-header-glue.c
@@ -1329,6 +1329,7 @@ handler_t http_response_read(request_st * const r, http_response_opts * const op
/* accumulate response in b until headers completed (or error)*/
if (r->resp_body_started) {
buffer_clear(b);
+ if (0 == r->resp_body_scratchpad) r->resp_body_finished = 1;
/* enable simple accumulation of small reads in some situations
* no Content-Length (will read to EOF)
* Content-Length (will read until r->resp_body_scratchpad == 0)
</pre><br />lighttpd 1.4.60 schedules together the processing of front-end and backend-end events for requests on the same client connection. This reduces duplicative checks and improves buffer reuse, but may have missed detecting if the response body was already fully received along with the headers when the backend provided <code>Content-Length</code> with the response (instead of <code>Transfer-Encoding: chunked</code> or neither).</p> Lighttpd - Bug #3111: HTTP/2 requests sometimes take very long (missing last chunk)https://redmine.lighttpd.net/issues/3111?journal_id=128162021-10-17T23:54:42Zgstrauss
<ul></ul><p>A few patches have been pushed to my dev branch.</p> Lighttpd - Bug #3111: HTTP/2 requests sometimes take very long (missing last chunk)https://redmine.lighttpd.net/issues/3111?journal_id=128172021-10-18T02:07:00Zgstrauss
<ul></ul><p>So far, I have been unable to reproduce the issue with response body at varying amounts less than, but close to, 8k.</p>
<p>Would you share some trace with <code>debug.log-state-handling = "enable"</code> for the hanging stream id?</p>
<blockquote>
<p>if I press Ctrl+C to terminate prematurely, the log message in the access log shows the length of a full download</p>
</blockquote>
<p>That suggests that the front-end lighttpd has read the data from the backend and has applied HTTP/2 framing when transferring the data from the request write queue to the connection write queue. Does the response have <code>Content-Length</code>? Does curl report that it was missing bytes? Or perhaps there is some reason that lighttpd is not sending HTTP/2 END_STREAM frame after lighttpd finishes sending the response body data?</p> Lighttpd - Bug #3111: HTTP/2 requests sometimes take very long (missing last chunk)https://redmine.lighttpd.net/issues/3111?journal_id=128182021-10-18T02:07:09Zgstrauss
<ul><li><strong>Category</strong> set to <i>core</i></li></ul> Lighttpd - Bug #3111: HTTP/2 requests sometimes take very long (missing last chunk)https://redmine.lighttpd.net/issues/3111?journal_id=128222021-10-19T15:00:49Zflynn
<ul></ul>Some answers:
<ul>
<li>openssl version is from Debian buster: 1.1.1d</li>
<li>the Debian package currently does not include the gnutls module, I added this myself, test results re reported later</li>
<li>using the nss TLS module the problem occurs too, but curl now reports immediately:<br /><pre>
curl: (92) HTTP/2 stream 0 was not closed cleanly: INTERNAL_ERROR (err 2)
</pre></li>
<li>the responses all have a <code>Content-Length</code></li>
<li>usually it takes some time to trigger the problem, after a restart usually all is fine. Also the affected urls may change after restart ...</li>
</ul>
<p>I'll test your new patches now.</p> Lighttpd - Bug #3111: HTTP/2 requests sometimes take very long (missing last chunk)https://redmine.lighttpd.net/issues/3111?journal_id=128232021-10-19T15:14:16Zflynn
<ul></ul><p>Share some trace with debug.log-state-handling = "enable" on a high volume server is not so easy, I cannot filter out the request in question.</p>
<p>Any ideas, how to limit thee debug messges to a certian request (type). As far as I know this option is always global, I cannot limit the scope with <code>$HTTP[...]...</code></p>
The nss module reports one error very often:
<ul>
<li>(mod_nss.c.2197) NSS: mod_nss_read_err: (PR_END_OF_FILE_ERROR) Encountered end of file</li>
</ul> Lighttpd - Bug #3111: HTTP/2 requests sometimes take very long (missing last chunk)https://redmine.lighttpd.net/issues/3111?journal_id=128242021-10-19T17:56:23Zgstrauss
<ul></ul><blockquote>
<p>Any ideas, how to limit thee debug messges to a certian request (type). As far as I know this option is always global, ...</p>
</blockquote>
<p>Most of the <code>debug.*</code> options are not global. Unfortunately, <code>debug.log-state-handling</code> and <code>debug.log-request-header-on-error</code> are global, since they are connection-level, or may occur before the request has been parsed (and is valid).</p>
<p>I'll put together a patch (below) to enable <code>debug.log-state-handling</code> after request headers have been read. (It does not make as much sense to do so for <code>debug.log-request-header-on-error</code>)</p>
<p>Aside: lighttpd mod_gnutls will be included in Debian in a future version of lighttpd. It was not included with lighttpd 1.4.56 (released Nov 2020) since at the time, gnutls libraries in Debian were very old and did not have support for TLS v1.3. gnutls 3.6.5 with support for TLS v1.3 was released Dec 2018, almost 2 years prior to lighttpd 1.4.56, but Debian still had an older version of gnutls at the time. (Debian updated to use newer gnutls libraries in Jan 2021)</p>
<p>Separate issue:</p>
<blockquote>
<p>(mod_nss.c.2197) NSS: mod_nss_read_err: (PR_END_OF_FILE_ERROR) Encountered end of file</p>
</blockquote>
<p>I guess I should add <code>PR_END_OF_FILE_ERROR</code> to those ignored unless <code>debug.log-ssl-noise = "enable"</code></p>
<p>Quick patch to allow <code>debug.log-state-handling</code> to be toggled after request has started:<br /><pre>
--- a/src/configfile.c
+++ b/src/configfile.c
@@ -192,6 +192,9 @@ static void config_merge_config_cpv(request_config * const pconf, const config_p
case 32:/* server.breakagelog */
if (cpv->vtype == T_CONFIG_LOCAL) pconf->serrh = cpv->v.v;
break;
+ case 33:/* debug.log-state-handling */
+ pconf->log_state_handling = (0 != cpv->v.u);
+ break;
default:/* should not happen */
return;
}
@@ -717,7 +720,7 @@ static int config_insert_srvconf(server *srv) {
T_CONFIG_SCOPE_SERVER }
,{ CONST_STR_LEN("debug.log-state-handling"),
T_CONFIG_BOOL,
- T_CONFIG_SCOPE_SERVER }
+ T_CONFIG_SCOPE_CONNECTION }
,{ CONST_STR_LEN("server.feature-flags"),
T_CONFIG_ARRAY_KVANY,
T_CONFIG_SCOPE_SERVER }
@@ -1005,6 +1008,9 @@ static int config_insert(server *srv) {
,{ CONST_STR_LEN("server.breakagelog"),
T_CONFIG_STRING,
T_CONFIG_SCOPE_CONNECTION }
+ ,{ CONST_STR_LEN("debug.log-state-handling"),
+ T_CONFIG_BOOL,
+ T_CONFIG_SCOPE_CONNECTION }
,{ NULL, 0,
T_CONFIG_UNSET,
T_CONFIG_SCOPE_UNSET }
</pre></p> Lighttpd - Bug #3111: HTTP/2 requests sometimes take very long (missing last chunk)https://redmine.lighttpd.net/issues/3111?journal_id=128252021-10-19T17:58:24Zflynn
<ul></ul><p>The initial tests with the two patches in your personal repo on this ticket look promising, but I want to test again tomorrow after some runtime.</p> Lighttpd - Bug #3111: HTTP/2 requests sometimes take very long (missing last chunk)https://redmine.lighttpd.net/issues/3111?journal_id=128262021-10-23T05:16:38Zgstrauss
<ul><li><strong>Status</strong> changed from <i>New</i> to <i>Patch Pending</i></li></ul> Lighttpd - Bug #3111: HTTP/2 requests sometimes take very long (missing last chunk)https://redmine.lighttpd.net/issues/3111?journal_id=128272021-10-26T10:43:02Zflynn
<ul></ul><p>Until now I cannot reproduce this error anymore, the patches <a class="external" href="https://git.lighttpd.net/lighttpd/lighttpd1.4/commit/a9a3f2f0b97766b7a84ce19fe7ff88383a3f3a97">https://git.lighttpd.net/lighttpd/lighttpd1.4/commit/a9a3f2f0b97766b7a84ce19fe7ff88383a3f3a97</a> and <a class="external" href="https://git.lighttpd.net/lighttpd/lighttpd1.4/commit/71798d46f9c8d1ba3c71c4867edaac52c88f7d20">https://git.lighttpd.net/lighttpd/lighttpd1.4/commit/71798d46f9c8d1ba3c71c4867edaac52c88f7d20</a> solves this issue for me.</p> Lighttpd - Bug #3111: HTTP/2 requests sometimes take very long (missing last chunk)https://redmine.lighttpd.net/issues/3111?journal_id=128282021-10-27T08:24:41Zgstrauss
<ul></ul><p>Thank you for testing and for following up. lighttpd 1.4.61 (including these fixes) is slated to be released later this week.</p> Lighttpd - Bug #3111: HTTP/2 requests sometimes take very long (missing last chunk)https://redmine.lighttpd.net/issues/3111?journal_id=128292021-10-27T08:25:07Zgstrauss
<ul><li><strong>Status</strong> changed from <i>Patch Pending</i> to <i>Fixed</i></li></ul><p>Applied in changeset <a class="changeset" title="[core] short-circuit if response body recv w/ hdrs (fixes #3111) short-circuit if response body ..." href="https://redmine.lighttpd.net/projects/lighttpd/repository/14/revisions/0757d71e1444b7bde1426de38dbdba123514ee82">0757d71e1444b7bde1426de38dbdba123514ee82</a>.</p> Lighttpd - Bug #3111: HTTP/2 requests sometimes take very long (missing last chunk)https://redmine.lighttpd.net/issues/3111?journal_id=128302021-10-28T06:59:42Zgstrauss
<ul></ul><p>There is another bug in lighttpd HTTP/2 support where HTTP/2 END_STREAM might not be sent if the entire response has been sent to the client, and then some time later the backend ends the response (or closes the pipe in the case of CGI). This bug is present in lighttpd 1.4.59 (and possible earlier), but may occur slightly more frequently in lighttpd 1.4.60 due to lighttpd 1.4.60 scheduling together the processing of front-end and backend-end events for requests on the same client connection, reducing the number of calls to the backend processing code.</p>
<p><a class="external" href="https://git.lighttpd.net/lighttpd/lighttpd1.4/commit/90f062cb07d362e346792964cf76aac65120d375">https://git.lighttpd.net/lighttpd/lighttpd1.4/commit/90f062cb07d362e346792964cf76aac65120d375</a></p> Lighttpd - Bug #3111: HTTP/2 requests sometimes take very long (missing last chunk)https://redmine.lighttpd.net/issues/3111?journal_id=128312021-10-28T08:27:09Zflynn
<ul></ul><p>We had an issue with an etherpad backend, this patch solves the issue, the etherpad backend has a reproducable response timing now.</p>