TLS connection dropped regardless keep-alive.

Added by vevsvevs over 2 years ago

Good day! Could you please check whether this condition is OK?

https://github.com/lighttpd/lighttpd1.4/commit/1542e44bb722cd33f8e32dbce4b208977f3420f3#diff-f73da2f5a22e9e3b8fc4b928282446be064ca963fc12733689eeb2b0350c5ccbR153

For me, it causes the TLS connection to be dropped when the idle time limit is reached, regardless of the actual connection state (keep-alive). This behavior has been reproducible from that commit (version 1.4.54) until now.

Sorry if I said something stupid. I'm not a coder at all (I mean it, at all), so I'm just investigating intuitively. Reverting this condition to its pre-commit state solved my problem.


Replies (11)

RE: TLS connection dropped regardless keep-alive. - Added by gstrauss over 2 years ago

Multi-posted: https://github.com/lighttpd/lighttpd1.4/commit/1542e44bb722cd33f8e32dbce4b208977f3420f3#commitcomment-60591442
Please read the response there, but I think it might be better to continue the conversation here.

The commit you referenced contains in the commit message a reference to https://redmine.lighttpd.net/boards/2/topics/8491 for which the change was made.

Sorry if I said smth stupid,

You have not said anything of the sort. At the same time, you have not said enough.
Please see How to get help

RE: TLS connection dropped regardless keep-alive. - Added by vevsvevs over 2 years ago

Thank you for your answer!

I'll try to describe my setup, but I'm not really sure I can do it in a useful way :)

Client: an obfuscated Android app (I hacked it to redirect some of its API requests), whose networking is implemented with the OkHttp3 framework. I use Charles to monitor network activity.

Server: I used Lighttpd 1.4.35 upon Ubuntu 16.04 for over 2 years without a hitch. But upgrading to 20.04 brought me version 1.4.55 instead, which causes the described behavior. I've also tested all Lighttpd versions up to the latest master (it was the second time in my life that I built something from source) with OpenSSL 1.1.1f and got the same result. During the tests, all timeouts and fine-tuning server settings were left at their defaults; none of them were set manually.

Behavior: some keep-alive requests from the app fail because the connection is closed (Charles reports "Remote server closed the connection before sending response header"). The requests themselves are very small and normally get responses from the backend instantly, so it's definitely not a timeout issue. But I noticed that the number of dropped connections is related to the max-keep-alive-idle setting: the app gets a dropped connection only if the reuse attempt comes after the idle timeout has elapsed. I.e. the app makes a keep-alive request and gets an answer -> I wait 6 sec (as the default server idle timeout is 5 sec) and force the app to make the same request again -> no response because the connection was closed. And if I set the timeout to 15, I have to wait 15+ sec between the requests to reproduce the problem.
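
The reproduction steps above can be approximated without the app. The following is a minimal sketch in Python, assuming only the standard-library http.client module; the helper name probe_idle_reuse and the host name in the usage comment are hypothetical and are not part of lighttpd, Charles, or the app. It sends one request, stays idle longer than the server's keep-alive idle limit, and then tries to reuse the same connection.

```python
import http.client
import time


def probe_idle_reuse(conn, path="/", idle_seconds=6.0, sleep=time.sleep):
    """Return True if `conn` still answers after sitting idle, False if it was closed.

    `conn` is expected to behave like http.client.HTTPSConnection.
    `sleep` is injectable so the idle wait can be skipped in tests.
    """
    # First request: establishes the connection and gets a normal response.
    conn.request("GET", path)
    conn.getresponse().read()  # drain the body so the connection can be reused

    # Stay idle longer than the server's keep-alive idle limit (5 s by default
    # according to the post above).
    sleep(idle_seconds)

    # Second request on the same connection: fails if the server closed it.
    try:
        conn.request("GET", path)
        conn.getresponse().read()
        return True
    except (http.client.HTTPException, OSError):
        return False


# Example (hypothetical host name):
#   conn = http.client.HTTPSConnection("lighttpd.example.test", timeout=10)
#   print(probe_idle_reuse(conn, idle_seconds=6))
```

A False result here only shows that the server closed the idle connection, which a server is allowed to do. One caveat, as far as I understand http.client: if the first response carries "Connection: close", the library reconnects on the next request, so the probe would report True without actually reusing the connection.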

The app doesn't have this issue when talking to its original API, to my API served by Lighttpd prior to version 1.4.54, or to any modern version with the modified condition in the connections.c file. From my absolutely dilettantish view, it looks like Lighttpd 1.4.54+ doesn't respect keep-alive requests in some circumstances and instead silently closes the connection on the timer.

Sorry, I don't know what SSL_shutdown and close notify really are, or how all this TLS negotiation stuff works... But I would be glad to gather any logs or auxiliary info and run any test that is possible with my knowledge and abilities. Please tell me if I can safely upload here Charles session (or trace/har) file with the issue representation?

RE: TLS connection dropped regardless keep-alive. - Added by gstrauss over 2 years ago

(FYI SSL_shutdown() is an OpenSSL API called by lighttpd. close notify is part of the TLS protocol (above the layer of TCP socket shutdown) when shutting down the encrypted connection.)

Please tell me if I can safely upload here Charles session (or trace/har) file with the issue representation?

You can attach files here, e.g. .gz or .xz, or .txt if small. Or you can use https://paste.lighttpd.net

Server: I used Lighttpd 1.4.35 upon Ubuntu 16.04

That is criminally old. lighttpd 1.4.35 was released Mar 2014, more than 7 years ago. Latest lighttpd release is lighttpd 1.4.61, so even lighttpd 1.4.55 is old, released almost 2 years ago.

Does your app support HTTP/2? That was added in lighttpd 1.4.56. Also in lighttpd 1.4.56 was a revamped mod_openssl to use newer OpenSSL APIs.
Please test your app with the latest lighttpd release (lighttpd 1.4.61) to confirm that this is still an issue for you with lighttpd 1.4.61.

RE: TLS connection dropped regardless keep-alive. - Added by gstrauss over 2 years ago

An HTTP server can close an idle HTTP keep-alive connection at any time between requests. The client should be able to handle this.
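
As a rough illustration of what "handle this" can look like on the client side, here is a minimal sketch in Python using the standard-library http.client and ssl modules. The function name request_with_retry, the make_conn factory, and the choice of which methods are safe to retry are assumptions made for this example; this is not code from lighttpd, OkHttp, or the app under discussion.

```python
import http.client
import ssl

# Methods that are safe to resend if the first attempt never got a response.
IDEMPOTENT = {"GET", "HEAD", "OPTIONS", "PUT", "DELETE"}


def request_with_retry(make_conn, conn, method, path):
    """Send a request on `conn`; if the idle connection was closed, retry once.

    `make_conn` is a zero-argument factory returning a fresh connection
    (for example a lambda wrapping http.client.HTTPSConnection).
    Returns (connection_to_keep_using, status, body).
    """
    try:
        conn.request(method, path)
        resp = conn.getresponse()
        return conn, resp.status, resp.read()
    except (ConnectionError, ssl.SSLEOFError):
        # The server closed the keep-alive connection between requests.
        # http.client.RemoteDisconnected is a ConnectionError subclass.
        if method not in IDEMPOTENT:
            raise  # do not silently resend requests with side effects
        conn.close()
        fresh = make_conn()
        fresh.request(method, path)
        resp = fresh.getresponse()
        return fresh, resp.status, resp.read()
```

The retry is limited to a single attempt on a new connection, so a server that is genuinely down still surfaces as an error instead of looping.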

For testing purposes only:
The following patch against lighttpd 1.4.61 will skip the clean TLS protocol close_notify and will shut down the TCP socket if lighttpd has not read any data from the client in the past second. If this patch seems to work for you, then it is likely that your app is not properly handling a clean TLS close notify from lighttpd indicating that the client should proceed to shut down the TLS connection. (If you try to apply this patch to lighttpd 1.4.55, then replace log_monotonic_secs with log_epoch_secs in two places in the patch.)

--- a/src/connections.c
+++ b/src/connections.c
@@ -165,7 +165,8 @@ static void connection_handle_shutdown(connection *con) {

        /* close the connection */
        if (con->fd >= 0
-           && (con->is_ssl_sock || 0 == shutdown(con->fd, SHUT_WR))) {
+           && ((con->is_ssl_sock && con->read_idle_ts == log_monotonic_secs)
+               || 0 == shutdown(con->fd, SHUT_WR))) {
                con->close_timeout_ts = log_monotonic_secs;

                request_st * const r = &con->request;

RE: TLS connection dropped regardless keep-alive. - Added by vevsvevs over 2 years ago

Good day, and sorry for the late answer.

That is criminally old. lighttpd 1.4.35 was released Mar 2014, more than 7 years ago. Latest lighttpd release is lighttpd 1.4.61, so even lighttpd 1.4.55 is old, released almost 2 years ago.

As I mentioned - I'm not a coder nor a Linux admin at all, so I just grab and use what comes out of the box... And it worked pretty well for me, till I decided to upgrade :)

Does your app support HTTP/2? That was added in lighttpd 1.4.56. Also in lighttpd 1.4.56 was a revamped mod_openssl to use newer OpenSSL APIs.

According to Charles, some requests are HTTP/2 and some are HTTP/1.1. Both kinds hit the same "connection closed" problem from time to time.

Please test your app with the latest lighttpd release (lighttpd 1.4.61) to confirm that this is still an issue for you with lighttpd 1.4.61.

As I wrote earlier, I tried all tagged versions from 1.4.54 up to the latest master (1.4.62), and all of them show the same issue for me.

An HTTP server can close an idle HTTP keep-alive connection at any time between requests. The client should be able to handle this.

It sounds logical to me too, but as I wrote, the app never experienced such a problem with version 1.4.35. I can't believe that smth is broken in this version in such a lucky way that allows the app to work more properly with the closed connections :)

If this patch seems to work for you, then it is likely that your app is not properly handling a clean TLS close notify from lighttpd indicating that the client should proceed to shut down the TLS connection.

With this patch, the app's behavior became the same as after reverting the condition, which I had tried myself before. But I've made many more test runs and unfortunately found that neither solution solves the problem 100% (as I thought earlier): with both patches the "connection closed" issue reproduces in a very specific time window, exactly between the 5th and 6th second after the previous request. Meanwhile, version 1.4.35 is rock solid under any circumstances.

RE: TLS connection dropped regardless keep-alive. - Added by gstrauss over 2 years ago

As I mentioned - I'm not a coder nor a Linux admin at all,

...

I can't believe that smth is broken in this version in such a lucky way that allows the app to work more properly with the closed connections :)

Do you give medical advice, too?

My assessment stands:

If this patch seems to work for you, then it is likely that your app is not properly handling a clean TLS close notify from lighttpd indicating that the client should proceed to shut down the TLS connection.

The behavior in the patch that you identified as the cause of your issue was written to solve a real issue: https://redmine.lighttpd.net/boards/2/topics/8491

You are suggesting that a forced and unclean shutdown of the TLS connection is what you want, so you can go ahead and patch lighttpd to do that, but doing so is not correct for everyone, and is definitely wrong for some, if not almost all.

RE: TLS connection dropped regardless keep-alive. - Added by gstrauss over 2 years ago

Here's someone else's rant on how many clients and servers do not implement the TLS close notify alert correctly. (lighttpd does implement TLS close notify alert, though with a timeout.)
http://garrett.damore.org/2017/11/tls-close-notify-what-were-they-thinking.html

RE: TLS connection dropped regardless keep-alive. - Added by vevsvevs over 2 years ago

Do you give medical advice, too?

I'm not quite sure what you mean by this, but if I somehow insulted your point of view, I'm really sorry.

The behavior in the patch that you identified as the cause of your issue was written to solve a real issue

It looks like my English skills are not good enough to express myself clearly :) In my last answer, I tried to tell you that both "my patch" and your test solution didn't really solve the problem, unfortunately. So now I'm back at the starting point, and I can still only guess why the app has no problems when talking to its original API (Nginx) or to my API when served by version 1.4.35. Oh, and I want to remind you that it's not my app, it's MiHome (Xiaomi smart home application) which API interaction based on the okhttp3 framework.

You are suggesting that a forced and unclean shutdown of the TLS connection is what you want

Once again: at this point, I can't even imagine what's going on and absolutely don't know how to cope with this issue. I can only describe the symptoms and humbly ask for your help. Please help.

RE: TLS connection dropped regardless keep-alive. - Added by vevsvevs over 2 years ago

I feel too intrusive. If you believe that all this conversation became senseless because of my ignorance - please let me know :)

RE: TLS connection dropped regardless keep-alive. - Added by gstrauss over 2 years ago

Oh, and I want to remind you that it's not my app, it's MiHome (Xiaomi smart home application) which API interaction based on the okhttp3 framework.

Remind me? This is the first time you have mentioned this information.

If you believe that all this conversation became senseless because of my ignorance - please let me know :)

The conversation is unlikely to make any further progress, as an educated answer has already been provided:

If this patch seems to work for you, then it is likely that your app is not properly handling a clean TLS close notify from lighttpd indicating that the client should proceed to shut down the TLS connection.

From my brief look, okhttp3 documentation suggests that okhttp3 should handle disconnection and request retry, but that does not seem to be happening for you. Perhaps you should ask on MiHome or okhttp3 forums.

RE: TLS connection dropped regardless keep-alive. - Added by vevsvevs over 2 years ago

Got it, thank you for your help and amazing efforts put into this project!
