https://redmine.lighttpd.net/ lighty labs 2022-04-06T09:03:01Z
Lighttpd - Bug #3152: Random Segfaults with version 1.4.64 w/ mod_sockproxy and ALPN h2 https://redmine.lighttpd.net/issues/3152?journal_id=13189 2022-04-06T09:03:01Z gstrauss
<ul></ul><p>Thanks for the thorough info. It looks like h2c has been cleaned up, and the SIGSEGV occurs on a NULL pointer dereference. I need to look into why...<br /><pre>
(gdb) bt full
#0 0x00000055555668f4 in connection_state_machine_h2 (h2r=0x555570a520, con=0x555570a520) at connections.c:1209
h2c = 0x0
</pre></p>
https://redmine.lighttpd.net/issues/3152?journal_id=13190 2022-04-06T10:22:14Z gstrauss
<ul></ul><p>So far, nothing obvious has jumped out at me as an error. Given the stack trace you provided, the connection is still in the job queue after being retired (which can happen), but the connection should not have <code>con->request.http_version</code> set to <code>HTTP_VERSION_2</code> after being retired, and so should not reach <code>connection_state_machine_h2()</code>.</p>
<p><code>connection_handle_response_end_state()</code> is the only place which calls <code>h2_retire_con()</code>, which is the only place which sets <code>con->h2 = NULL</code>. (<code>con->h2</code> is allocated when <code>con->request.http_version</code> is set to <code>HTTP_VERSION_2</code>.)</p>
<p>Further along in <code>connection_handle_response_end_state()</code>, <code>request_reset()</code> is called (either directly or through the call to <code>connection_handle_shutdown()</code> which calls <code>connection_reset()</code>, which calls <code>request_reset()</code>), and <code>request_reset()</code> changes <code>con->request.http_version</code> to <code>HTTP_VERSION_UNSET</code>.</p>
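<p>The lifecycle described above can be sketched as a toy model. This is a minimal illustration, not lighttpd code: the <code>toy_*</code> types, functions, and enum values are hypothetical stand-ins, and only the facts stated above are modeled (the h2 state is allocated when the version is set to HTTP/2, retiring clears <code>con->h2</code>, and resetting the request clears the HTTP version). The invariant is that the version is HTTP/2 exactly when <code>h2</code> is non-NULL.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; NOT the real lighttpd definitions. */
typedef enum {
    TOY_HTTP_VERSION_UNSET = -1,
    TOY_HTTP_VERSION_1_1   = 1,
    TOY_HTTP_VERSION_2     = 2
} toy_http_version;

typedef struct { int sent_goaway; } toy_h2con;

typedef struct {
    toy_http_version http_version; /* models con->request.http_version */
    toy_h2con *h2;                 /* models con->h2 */
    toy_h2con h2_storage;          /* backing storage; avoids malloc in this sketch */
} toy_con;

/* Models the upgrade: version and h2 state are set together. */
static void toy_upgrade_to_h2(toy_con *con)
{
    con->h2_storage.sent_goaway = 0;
    con->h2 = &con->h2_storage;
    con->http_version = TOY_HTTP_VERSION_2;
}

/* Models h2_retire_con() followed by request_reset(). */
static void toy_retire_and_reset(toy_con *con)
{
    con->h2 = NULL;                             /* what h2_retire_con() does */
    con->http_version = TOY_HTTP_VERSION_UNSET; /* what request_reset() does */
}

/* Invariant: version is HTTP/2 if and only if h2 state exists. */
static int toy_invariant_holds(const toy_con *con)
{
    return (con->http_version == TOY_HTTP_VERSION_2) == (con->h2 != NULL);
}
```

<p>A connection for which <code>toy_invariant_holds()</code> returns 0 while the version is HTTP/2 corresponds to the state in the backtrace: the HTTP/2 state machine is entered with <code>h2c = 0x0</code>.</p>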
<p>I'll have to look more into this later. In the meantime, are you able to share your lighttpd config (<code>lighttpd -f /etc/lighttpd/lighttpd.conf -p</code>) so that I can see what other modules might be involved?</p>
<p>While the following might be a workaround (and might need further minor adjustments to be sufficient), I would like to track down why this is occurring, so the following is a bandaid, and not a solution.<br /><pre>
--- a/src/connections.c
+++ b/src/connections.c
@@ -1205,6 +1205,7 @@ static void
 connection_state_machine_h2 (request_st * const h2r, connection * const con)
 {
     h2con * const h2c = con->h2;
+    if (NULL == h2c) return;
     if (h2c->sent_goaway <= 0
         && (chunkqueue_is_empty(con->read_queue) || h2_parse_frames(con))
</pre></p>
https://redmine.lighttpd.net/issues/3152?journal_id=13191 2022-04-06T11:05:42Z gstrauss
<ul></ul><p>Here is a guess. Would you please try this patch if you can? Perhaps ALPN negotiates h2 in a TLS module before mod_sockproxy picks up the connection.<br /><pre>
--- a/src/mod_sockproxy.c
+++ b/src/mod_sockproxy.c
@@ -163,6 +163,7 @@ static handler_t mod_sockproxy_connection_accept(connection *con, void *p_d) {
         hctx->create_env = sockproxy_create_env_connect;
         hctx->response = chunk_buffer_acquire();
         r->http_status = -1; /*(skip HTTP processing)*/
+        r->http_version = HTTP_VERSION_UNSET;
     }
     return HANDLER_GO_ON;
</pre></p>
https://redmine.lighttpd.net/issues/3152?journal_id=13192 2022-04-06T12:48:41Z ultimator
<ul><li><strong>File</strong> <a href="/attachments/2182">lighttpd.conf</a> <a class="icon-only icon-download" title="Download" href="/attachments/download/2182/lighttpd.conf">lighttpd.conf</a> added</li></ul><p>Thank you for looking into this.<br /><code>lighttpd -f /etc/lighttpd/lighttpd.conf -p</code> is attached.<br />I'll try your patch and report back.</p>
https://redmine.lighttpd.net/issues/3152?journal_id=13193 2022-04-07T16:22:19Z ultimator
<ul></ul><p>It's been over a day now with the patch applied and no crash so far. <br />Looks promising :)<br />Thank you</p>
<p>I'll continue testing and report back in a few days if something changes.</p>
https://redmine.lighttpd.net/issues/3152?journal_id=13194 2022-04-07T17:26:42Z gstrauss
<ul><li><strong>Status</strong> changed from <i>New</i> to <i>Patch Pending</i></li><li><strong>Target version</strong> changed from <i>1.4.xx</i> to <i>1.4.65</i></li></ul><p>Thanks for the update. To confirm: Are you using only the one-line patch to mod_sockproxy? (and not the bandaid to connections.c?)</p>
<p>(If you still have core files from the crash, you could run <code>print h2r->handler_module->name</code>. If <code>h2r->handler_module</code> is not NULL, I would expect to see "sockproxy".)</p>
https://redmine.lighttpd.net/issues/3152?journal_id=13195 2022-04-07T17:27:25Z gstrauss
<ul><li><strong>Subject</strong> changed from <i>Random Segfaults with version 1.4.64</i> to <i>Random Segfaults with version 1.4.64 w/ mod_sockproxy and ALPN h2</i></li><li><strong>Category</strong> set to <i>mod_sockproxy</i></li></ul>
https://redmine.lighttpd.net/issues/3152?journal_id=13196 2022-04-07T18:14:41Z ultimator
<ul></ul><p>Yes, I only used the patch to <code>mod_sockproxy</code>.<br />Unfortunately I have no core dumps. However if the patch turns out to be working, I could remove it, wait for a crash and then check this if that's desired.</p>
https://redmine.lighttpd.net/issues/3152?journal_id=13197 2022-04-07T18:33:51Z gstrauss
<ul></ul><blockquote>
<p>Unfortunately I have no core dumps. However if the patch turns out to be working, I could remove it, wait for a crash and then check this if that's desired.</p>
</blockquote>
<p>Thanks, but not necessary. The simple mod_sockproxy patch fixes something that I overlooked when adding HTTP/2 support to lighttpd.</p>
https://redmine.lighttpd.net/issues/3152?journal_id=13199 2022-04-08T10:40:06Z gstrauss
<ul><li><strong>Status</strong> changed from <i>Patch Pending</i> to <i>Fixed</i></li></ul><p>Applied in changeset <a class="changeset" title="[mod_sockproxy] reset http vers, avoid rare crash (fixes #3152) (thx ultimator) x-ref: "Rando..." href="https://redmine.lighttpd.net/projects/lighttpd/repository/14/revisions/e5dc98faf3b873704290f4b6df3f631784ac294f">e5dc98faf3b873704290f4b6df3f631784ac294f</a>.</p>
https://redmine.lighttpd.net/issues/3152?journal_id=13209 2022-04-19T08:31:09Z ultimator
<ul></ul><p>Bad news. It just happened again with the exact same stack trace as before. However, I think the patch helps somewhat, because it took a lot longer this time.<br />I also inadvertently closed the debugger, so I have to wait for the next crash to get a core dump.</p>
https://redmine.lighttpd.net/issues/3152?journal_id=13210 2022-04-19T15:33:54Z gstrauss
<ul></ul><p>Please save the core dump. If I can access the core file and the associated lighttpd executable file, I can examine the program state outside the stack trace.</p>
https://redmine.lighttpd.net/issues/3152?journal_id=13211 2022-04-20T03:16:58Z gstrauss
<ul></ul><p>If the TLS ClientHello is received or processed after lighttpd mod_sockproxy has been initialized for the connection, I think that might explain what you are seeing. I think one answer is to avoid selecting a protocol using TLS ALPN if mod_sockproxy will be handling the connection. However, doing so will break someone using mod_sockproxy to connect to an HTTP/2 backend, and desiring to use HTTP/2. For that, an alternative patch (further below) might be better. I am leaning towards applying the second patch below, and applying a similar patch to other lighttpd TLS modules.<br /><pre>
--- a/src/mod_openssl.c
+++ b/src/mod_openssl.c
@@ -1874,6 +1874,8 @@ mod_openssl_alpn_select_cb (SSL *ssl, const unsigned char **out, unsigned char *
     handler_ctx *hctx = (handler_ctx *) SSL_get_app_data(ssl);
     unsigned short proto;
     UNUSED(arg);
+    if (hctx->r->handler_module) /*(e.g. mod_sockproxy)*/
+        return SSL_TLSEXT_ERR_NOACK;
     for (unsigned int i = 0, n; i < inlen; i += n) {
         n = in[i++];
</pre></p>
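<p>To illustrate what the callback in the patch above is iterating over: an ALPN offer is a sequence of entries, each a one-byte length followed by that many bytes of protocol name (RFC 7301). The following is a minimal sketch of the "decline ALPN when a handler has already claimed the connection" idea. The <code>toy_*</code> names, the result enum, and the <code>handler_claimed</code> and <code>h2_enabled</code> parameters are hypothetical simplifications; the real callback works with OpenSSL types and supports more protocols.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result codes for this sketch; NOT OpenSSL or lighttpd values. */
enum toy_alpn_result { TOY_ALPN_NOACK, TOY_ALPN_SELECT_H2, TOY_ALPN_OTHER };

/* Return 1 if the length-prefixed ALPN list offers "h2", else 0.
 * A zero-length or truncated entry is treated as malformed (returns 0). */
static int toy_alpn_offers_h2(const unsigned char *in, size_t inlen)
{
    size_t i = 0;
    while (i < inlen) {
        size_t n = in[i++];                    /* entry length */
        if (n == 0 || n > inlen - i) return 0; /* malformed list */
        if (n == 2 && in[i] == 'h' && in[i + 1] == '2') return 1;
        i += n;                                /* skip to next entry */
    }
    return 0;
}

/* Decline to negotiate when a handler (e.g. a socket proxy) already owns
 * the connection; otherwise pick h2 if enabled and offered. */
static enum toy_alpn_result
toy_alpn_select(const unsigned char *in, size_t inlen,
                int handler_claimed, int h2_enabled)
{
    if (handler_claimed) return TOY_ALPN_NOACK;
    if (h2_enabled && toy_alpn_offers_h2(in, inlen)) return TOY_ALPN_SELECT_H2;
    return TOY_ALPN_OTHER;
}
```

<p>With an offer of <code>h2</code> followed by <code>http/1.1</code>, this sketch selects h2 only when no handler has claimed the connection and h2 is enabled.</p>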
<p>Alternative patch<br /><pre>
--- a/src/mod_openssl.c
+++ b/src/mod_openssl.c
@@ -1883,7 +1883,8 @@ mod_openssl_alpn_select_cb (SSL *ssl, const unsigned char **out, unsigned char *
             if (in[i] == 'h' && in[i+1] == '2') {
                 if (!hctx->r->conf.h2proto) continue;
                 proto = MOD_OPENSSL_ALPN_H2;
-                hctx->r->http_version = HTTP_VERSION_2;
+                if (hctx->r->handler_module == NULL)/*(e.g. not mod_sockproxy)*/
+                    hctx->r->http_version = HTTP_VERSION_2;
                 break;
             }
             continue;
</pre></p>
https://redmine.lighttpd.net/issues/3152?journal_id=13213 2022-04-23T14:18:19Z ultimator
<ul></ul><p>It crashed again. I dumped the core but I can't upload it here due to filesize restrictions. So I put it on Google Drive:<br /><a class="external" href="https://drive.google.com/file/d/1JmruFxcxVg92QWGT7oY51KWloz94x4Xg/view?usp=drivesdk">https://drive.google.com/file/d/1JmruFxcxVg92QWGT7oY51KWloz94x4Xg/view?usp=drivesdk</a></p>
<p>I'll also apply the second patch now and see if that works.</p>
https://redmine.lighttpd.net/issues/3152?journal_id=13214 2022-04-23T15:23:22Z gstrauss
<ul></ul><p>Thank you. I can see from the core that there are 4 connections, and the crash occurs on a connection being handled by mod_sockproxy before any non-TLS bytes are read. (I can't as easily look into mod_openssl.so state since you did not include that in the .tar.) Still, the core supports my hunch that the connection is negotiating h2 via ALPN, but that mod_sockproxy has already claimed the connection, so the http_version is not being reset. I think the latest patch will address your issue.</p>
<p>As an additional safeguard, my dev branch contains the following patch, which should (independently) avoid the issue.<br /><pre>
--- a/src/connections.c
+++ b/src/connections.c
@@ -1378,11 +1380,11 @@ connection_state_machine_h1 (request_st * const r, connection * const con)
 void
 connection_state_machine (connection * const con)
 {
     request_st * const r = &con->request;
-    if (r->http_version == HTTP_VERSION_2)
+    if (con->h2)
         connection_state_machine_h2(r, con);
     else /* if (r->http_version <= HTTP_VERSION_1_1) */
         connection_state_machine_h1(r, con);
 }
</pre></p>
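<p>Why dispatching on <code>con->h2</code> is the safer test can be shown with a small model. This is a sketch under stated assumptions, not lighttpd code: the <code>demo_*</code> types and route codes are hypothetical, and the integer <code>2</code> stands in for <code>HTTP_VERSION_2</code>. Routing by version trusts a flag that can be set without the h2 state existing; routing by pointer checks the very thing the HTTP/2 state machine dereferences.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; NOT the real lighttpd types. */
typedef struct { int sent_goaway; } demo_h2con;

typedef struct {
    int http_version; /* 2 stands in for HTTP_VERSION_2 in this sketch */
    demo_h2con *h2;   /* models con->h2 */
} demo_con;

enum {
    DEMO_ROUTE_H1            = 1,
    DEMO_ROUTE_H2            = 2,
    DEMO_ROUTE_H2_NULL_DEREF = -1 /* h2 path entered with h2 == NULL */
};

/* Models the original dispatch: trusts the version field. */
static int demo_route_by_version(const demo_con *con)
{
    if (con->http_version == 2)
        return con->h2 ? DEMO_ROUTE_H2 : DEMO_ROUTE_H2_NULL_DEREF;
    return DEMO_ROUTE_H1;
}

/* Models the patched dispatch: takes the h2 path only if h2 state exists. */
static int demo_route_by_h2_pointer(const demo_con *con)
{
    return con->h2 ? DEMO_ROUTE_H2 : DEMO_ROUTE_H1;
}
```

<p>For a connection with the version set to HTTP/2 but no h2 state, the first function reports the crashing path while the second falls back to the HTTP/1 state machine. For consistent connections both agree.</p>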
<p>As an aside, should others find this: a short-term <em>workaround</em> is to disable h2 support in lighttpd, unless HTTP/2 support is critical to your needs:<br /><code>server.feature-flags += ("server.h2proto" => "disable")</code></p>
https://redmine.lighttpd.net/issues/3152?journal_id=13215 2022-04-23T16:48:55Z ultimator
<ul></ul><p>gstrauss wrote in <a href="#note-15">#note-15</a>:</p>
<blockquote>
<p>I can't as easily look into mod_openssl.so state since you did not include that in the .tar.</p>
</blockquote>
<p>I added all the *.mod files to the archive. You can use the link from comment <a href="#note-14">#note-14</a></p>
https://redmine.lighttpd.net/issues/3152?journal_id=13216 2022-04-23T17:35:44Z gstrauss
<ul></ul><p><a class="user active user-mention" href="https://redmine.lighttpd.net/users/14199">@ultimator</a>: The core contains sensitive info. Please change your TLS certificate(s) and revoke the current one(s), if necessary.</p>
https://redmine.lighttpd.net/issues/3152?journal_id=13217 2022-04-23T17:47:12Z gstrauss
<ul></ul><p>The mod_openssl.c patch you are now testing should address the issue. My guess is that some scanner is connecting to port 853 and trying TLS ALPN h2. Later this weekend or on Monday, I'll try to reproduce this and confirm that it is fixed with either of the mod_openssl.c patches above.</p>
https://redmine.lighttpd.net/issues/3152?journal_id=13218 2022-04-23T18:20:11Z ultimator
<ul></ul><p>gstrauss wrote in <a href="#note-17">#note-17</a>:</p>
<blockquote>
<p><a class="user active user-mention" href="https://redmine.lighttpd.net/users/14199">@ultimator</a>: The core contains sensitive info. Please change your TLS certificate(s) and revoke the current one(s), if necessary.</p>
</blockquote>
<p>Thanks, already done.</p>