Bug #3075
closedTLS 1.3 with SessionTicket fail for the first 8 hours of 1970
Description
I've recently switched for an embedded project from lighttpd 1.4.55 to 1.4.58 and started having issues in some cases when connecting using https with all clients (Firefox, Chrome, curl, ...) that failed with an error about TLS session ticket (see attached curl logs).
After some investigation it appears that the issue occured only on devices that had their date set between 1970-01-01 00:00:00 and 1970-01-01 07:00:00.
(embedded devices are more prone to be in 1970 as they don't always have a RTC to keep date and time)
In addition I've seen that the default openssl parameters changed from "Options" => "ServerPreference,-SessionTicket" in 1.4.55 to "Options" => "ServerPreference" in 1.4.56 (as described in the wiki [1]) which is why I didn't see the issue before upgrading.
Using "-SessionTicket" in the configuration file was a workaround but I think a proper fix would be better.
I've looked a little bit in mod_openssl.c:
First stek_rotate_ts is initialized to 0.
Then in mod_openssl_session_ticket_key_check, the code checks if (cur_ts-28800) is greater than stek_rotate_ts
However as we are early 1970 it's not the case so the ticket is not generated until 28800 seconds passed which is 8 hours.
I've made a quick patch (see attached) which is maybe not the best but mostly fixed the problem for me.
[1] https://redmine.lighttpd.net/projects/1/wiki/docs_ssl#Perfect-Forward-Secrecy-PFS
Files
Updated by gstrauss over 3 years ago
Hahaha. Your post made me laugh ... and then <sigh>
Sure, I'll look into some more defensive coding here and elsewhere in lighttpd, in case lighttpd finds itself mired in 1970.
Edit: see lighttpd git master branch for commit dbe3e236 and commit c035eb77 which prefers monotonic clock for various time references. (When I did that last week, times for session tickets were left using realtime since session tickets are communicated externally and lighttpd allows them to be configured from external sources.)
Updated by gstrauss over 3 years ago
- Status changed from New to Patch Pending
- Target version changed from 1.4.x to 1.4.60
I reviewed the lighttpd code base and this initialization of the STEK in the TLS modules is the only place that jumped out at me, the same place you identified.
I've made a quick patch (see attached) which is maybe not the best but mostly fixed the problem for me.
Would you explain further what you mean by "mostly fixed the problem for me"?
Your patch looks like one viable option. Another is to initialize stek_rotate_ts to a negative value, e.g. -28800 or -86400 depending on the TLS module. That assumes that time_t is a signed value, which is generally true, but not guaranteed. I'll probably base a solution on your patch.
diff --git a/src/mod_gnutls.c b/src/mod_gnutls.c index 480ce739..68189575 100644 --- a/src/mod_gnutls.c +++ b/src/mod_gnutls.c @@ -407,7 +407,8 @@ mod_gnutls_session_ticket_key_check (server *srv, const plugin_data *p, const ti if (stek->expire_ts < cur_ts) mod_gnutls_session_ticket_key_free(); } - else if (cur_ts - 86400 >= stek_rotate_ts) { /*(24 hours)*/ + else if (cur_ts - 86400 >= stek_rotate_ts /*(24 hours)*/ + || 0 == stek_rotate_ts) { mod_gnutls_session_ticket_key_rotate(srv); stek_rotate_ts = cur_ts; } diff --git a/src/mod_mbedtls.c b/src/mod_mbedtls.c index 6352c733..e29f040f 100644 --- a/src/mod_mbedtls.c +++ b/src/mod_mbedtls.c @@ -361,7 +361,9 @@ mod_mbedtls_session_ticket_key_check (plugin_data *p, const time_t cur_ts) mbedtls_cipher_get_key_bitlen(&key->ctx), MBEDTLS_ENCRYPT); if (0 != rc) { /* expire key immediately if error occurs */ - key->generation_time = cur_ts - ctx->ticket_lifetime - 1; + key->generation_time = cur_ts > ctx->ticket_lifetime + ? cur_ts - ctx->ticket_lifetime - 1 + : 0; ctx->active = 1 - ctx->active; } mbedtls_platform_zeroize(stek, sizeof(tlsext_ticket_key_t)); diff --git a/src/mod_openssl.c b/src/mod_openssl.c index f8d5b5dc..93876dd1 100644 --- a/src/mod_openssl.c +++ b/src/mod_openssl.c @@ -446,7 +446,7 @@ mod_openssl_session_ticket_key_check (const plugin_data *p, const time_t cur_ts) rotate = mod_openssl_session_ticket_key_file(p->ssl_stek_file); tlsext_ticket_wipe_expired(cur_ts); } - else if (cur_ts - 28800 >= stek_rotate_ts) /*(8 hours)*/ + else if (cur_ts - 28800 >= stek_rotate_ts || 0 == stek_rotate_ts)/*(8 hrs)*/ rotate = mod_openssl_session_ticket_key_generate(cur_ts, cur_ts+86400); if (rotate) { diff --git a/src/mod_wolfssl.c b/src/mod_wolfssl.c index 2bc1906c..13e0978f 100644 --- a/src/mod_wolfssl.c +++ b/src/mod_wolfssl.c @@ -432,7 +432,7 @@ mod_openssl_session_ticket_key_check (const plugin_data *p, const time_t cur_ts) rotate = mod_openssl_session_ticket_key_file(p->ssl_stek_file); tlsext_ticket_wipe_expired(cur_ts); } - else if (cur_ts - 28800 >= stek_rotate_ts) /*(8 hours)*/ + else if (cur_ts - 28800 >= stek_rotate_ts || 0 == stek_rotate_ts)/*(8 hrs)*/ rotate = mod_openssl_session_ticket_key_generate(cur_ts, cur_ts+86400); if (rotate) {
In addition I've seen that the default openssl parameters changed from "Options" => "ServerPreference,-SessionTicket" in 1.4.55 to "Options" => "ServerPreference" in 1.4.56 (as described in the wiki [1]) which is why I didn't see the issue before upgrading.
Just a clarification: the defaults for SessionTicket
did not change in lighttpd 1.4.56. Session tickets are enabled by default in the openssl library, and the openssl library does not automatically rotate the STEK. lighttpd 1.4.56 and later add behavior to provide and to rotate the STEK by default. That's the behavior change. I think that you ran into missing initialization of the lighttpd's STEK structure due to incorrect time on the embedded device. The wiki documents a secure practice (disable session tickets with -SessionTicket
) for lighttpd 1.4.55 and earlier, if lighttpd is not periodically restarted, e.g. once a day, to force the openssl library to regenerate a new STEK.
I have a question for you: embedded devices do not tend to have a whole lot of entropy, and so it is considered good practice to save a random seed to persistent storage, and then to restore that random seed upon restart to reduce hangs waiting for initial entropy. Why is something similar not done with time? Why not ship the devices with a starting time of manufacturing time (and a random seed changed per device) for bootstrapping? Then the time might be weeks or months in the past, but would never be decades in the past (1970). Things might work better with TLS certificates, which have a better chance of being within a time window of validity. Things might also work better with cookies, HTTP Digest auth, etc.
Updated by DamienT over 3 years ago
gstrauss wrote in #note-2:
Your patch looks like one viable option. Another is to initialize stek_rotate_ts to a negative value, e.g. -28800 or -86400 depending on the TLS module. That assumes that time_t is a signed value, which is generally true, but not guaranteed. I'll probably base a solution on your patch.
Yes, the fact that time_t can be either signed or unsigned makes it more complicated to find a simple solution.
I also though about casting cur_ts to something like uint64_t before substracting so that substracting something would make a big value.
I said that I wasn't sure about the patch because I didn't dig deep enough in the code to know if calling mod_openssl_session_ticket_key_generate() with the same values multiple time was an issue or not.
Because it could happen that cur_ts stays at zero all the time (broken time() implementation), which could work with the current code (function never called) but could cause issues if "0 == stek_rotate_ts" is added to the check as it would be called continuously.
If it is safe to call the function with the same parameters multiple times and everything is handled properly there, then I think it's a good patch.
I have a question for you: embedded devices do not tend to have a whole lot of entropy, and so it is considered good practice to save a random seed to persistent storage, and then to restore that random seed upon restart to reduce hangs waiting for initial entropy. Why is something similar not done with time? Why not ship the devices with a starting time of manufacturing time (and a random seed changed per device) for bootstrapping? Then the time might be weeks or months in the past, but would never be decades in the past (1970). Things might work better with TLS certificates, which have a better chance of being within a time window of validity. Things might also work better with cookies, HTTP Digest auth, etc.
Usually devices either have a RTC or use something like NTP to stay in time so the local date and time will still be good
But the RTC sometimes use a capacitor instead of a battery, which can be depleted in a few weeks if unplugged (instead of the few years of a typical RTC battery), and NTP needs the proper server which can be unavailable.
It would indeed be possible to store the time on shutdown (or at regular intervals as devices often don't have shutdown buttons but are often just unplugged), but depending on the shutdown duration the time could be off by a few minutes which could go unnoticed.
I think most software are often working just as fine in 1970 than with any other date, so there is usually no reason to implement something different.
Updated by gstrauss over 3 years ago
Because it could happen that cur_ts stays at zero all the time (broken time() implementation), which could work with the current code (function never called) but could cause issues if "0 stek_rotate_ts" is added to the check as it would be called continuously.
Were time()
to always return 0, that would not be a bug in lighttpd, but it might result in issues in lighttpd, such as application-level client idle timeout never triggering. Maintenance triggers -- including STEK rotation -- might never occur. So the (0 stek_rotate_ts)
addition is fine.
Musing ... I am considering having lighttpd use int64_t (except when passing pointers) instead of time_t so that even 32-bit instances of lighttpd running on systems with 32-bit (signed) time_t are less susceptible to Y2038 issues. Then again, it sounds like some portion of those devices are stuck in 1970. :)
Updated by gstrauss over 3 years ago
I think most software are often working just as fine in 1970 than with any other date, so there is usually no reason to implement something different.
<rant>
The scourge of the "internet of things" is ignoring security or treating security as unimportant. Among other things, TLS certificate validity is based on a shared notion of time, as are auth cookies and HTTP Digest auth nonces.
</rant>
Updated by DamienT over 3 years ago
Yes time() always returning 0 would be a major issue in the base system, I guess I might sometimes be doing defense programming a little bit too much !
I agree with you about IoT sometimes being very insecure, and it's hard for customers to know if devices are secure.
But even when you want to be implement it securely it can be complicated, time is not always easy to get and I don't think that there is a proper way to offer a https server with a valid certificate.
Which is probably why most IoT devices makes the internet connection mandatory (to get time from an online server and to receive commands from an online server) and you need to connect to an online server to configure the device that is on your LAN, which is not great either.
Updated by gstrauss over 3 years ago
- Status changed from Patch Pending to Fixed
Applied in changeset d50d4dc0e557d582a0da4f3116ef1c4ab7a0bd78.
Updated by gstrauss about 3 years ago
- Related to Bug #3080: Lighttpd 1.4.58 SSL connections stop working if system time of lighttpd server is changed to future one (+12h or even days) added
Updated by gstrauss about 3 years ago
- Related to deleted (Bug #3080: Lighttpd 1.4.58 SSL connections stop working if system time of lighttpd server is changed to future one (+12h or even days))
Updated by gstrauss about 3 years ago
- Has duplicate Bug #3080: Lighttpd 1.4.58 SSL connections stop working if system time of lighttpd server is changed to future one (+12h or even days) added
Also available in: Atom