Bug #285
closed
active SSL connection loss (SSL3_WRITE_PENDING:bad write retry) (CVE-2008-1531)
Description
I'm seeing a gazillion log entries like these:
2005-09-22 17:41:44: (network_openssl.c.102) SSL: 1 -1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry
2005-09-22 17:41:44: (connections.c.494) connection closed: write failed on fd 11
The page fails to complete writing. When I view the page in non-ssl mode, the page runs to completion.
-- sean
Updated by jan about 19 years ago
- Status changed from New to Assigned
What? We fixed this in 1.4.1 and fixed the fix in 1.4.2. Even OpenBSD was happy afterwards.
Please verify that you are really using 1.4.2 or higher.
Updated by Anonymous about 19 years ago
Confirmed, status-config reports 1.4.3.
-- sean
Updated by Anonymous about 19 years ago
Any luck with this? What other info could I get that would be helpful?
-- sean
Updated by jan about 18 years ago
- Status changed from Assigned to Fixed
- Resolution set to fixed
fixed in 1.4.12
Updated by Anonymous almost 17 years ago
- Status changed from Fixed to Need Feedback
- Resolution deleted (fixed)
Debian server
- uname -a
Linux 2.6.18-3-amd64 #1 SMP Mon Dec 4 17:04:37 CET 2006 x86_64 GNU/Linux
- openssl version
OpenSSL 0.9.8c 05 Sep 2006
- lighttpd -v
lighttpd-1.4.13 (ssl) - a light and fast webserver
Build-Date: Sep 21 2007 15:20:00
Got this kind of error
2007-11-09 15:05:35: (connections.c.279) SSL: 1 error:140940E5:SSL routines:SSL3_READ_BYTES:ssl handshake failure
2007-11-09 15:17:04: (network_openssl.c.133) SSL: 5 -1 104 Connection reset by peer
2007-11-09 15:17:04: (connections.c.588) connection closed: write failed on fd 17
2007-11-09 15:17:16: (network_openssl.c.133) SSL: 5 -1 104 Connection reset by peer
2007-11-09 15:17:16: (connections.c.588) connection closed: write failed on fd 36
2007-11-09 15:17:16: (network_openssl.c.154) SSL: 1 -1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry
2007-11-09 15:17:16: (connections.c.588) connection closed: write failed on fd 15
2007-11-09 15:17:17: (network_openssl.c.154) SSL: 1 -1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry
2007-11-09 15:17:17: (connections.c.588) connection closed: write failed on fd 16
Updated by Anonymous almost 17 years ago
also getting this error
2007-12-07 10:32:22: (connections.c.279) SSL: 1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry
2007-12-07 10:35:40: (connections.c.279) SSL: 1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry
2007-12-07 10:37:43: (connections.c.279) SSL: 1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry
2007-12-07 10:53:00: (connections.c.279) SSL: 1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry
2007-12-07 11:22:36: (connections.c.279) SSL: 1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry
2007-12-07 11:45:12: (connections.c.279) SSL: 1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry
2007-12-07 11:59:52: (connections.c.279) SSL: 1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry
2007-12-07 12:00:32: (connections.c.279) SSL: 1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry
I got rid of 99.9% of the SSL handshake errors with the IE/SSL/keepalive = 60s fix, but these remained.
FreeBSD 6.1-RELEASE-p10 FreeBSD amd64
root@long# openssl version
OpenSSL 0.9.7e-p1 25 Oct 2004
root@long# lighttpd -v
lighttpd-1.4.18 (ssl) - a light and fast webserver
Build-Date: Nov 23 2007 13:51:35
-- oliver
Updated by Anonymous almost 17 years ago
2007-12-17 22:09:12: (network_openssl.c.154) SSL: 1 -1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry
2007-12-17 22:09:12: (connections.c.603) connection closed: write failed on fd 136
Updated by Anonymous almost 17 years ago
Hello, we still have this problem in the latest version available in Debian Lenny. It is a serious blocker with a token plug-in ("mod_secdownload"), because the links are no longer valid after the connection drops.
2007-12-17 18:46:10: (log.c.75) server started
2007-12-17 18:47:06: (network_openssl.c.256) SSL: 1 -1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry
2007-12-17 18:47:06: (connections.c.588) connection closed: write failed on fd 8
2007-12-17 18:47:06: (network_openssl.c.256) SSL: 1 -1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry
2007-12-17 18:47:06: (connections.c.588) connection closed: write failed on fd 10
2007-12-17 18:52:34: (network_openssl.c.256) SSL: 1 -1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry
2007-12-17 18:52:34: (connections.c.588) connection closed: write failed on fd 9
2007-12-17 18:52:34: (network_openssl.c.256) SSL: 1 -1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry
2007-12-17 18:52:34: (connections.c.588) connection closed: write failed on fd 8
2007-12-17 19:00:59: (network_openssl.c.256) SSL: 1 -1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry
2007-12-17 19:00:59: (connections.c.588) connection closed: write failed on fd 8
2007-12-17 19:01:44: (network_openssl.c.256) SSL: 1 -1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry
2007-12-17 19:01:44: (connections.c.588) connection closed: write failed on fd 9
2007-12-17 19:01:44: (network_openssl.c.256) SSL: 1 -1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry
2007-12-17 19:01:44: (connections.c.588) connection closed: write failed on fd 8
2007-12-17 19:01:44: (network_openssl.c.256) SSL: 1 -1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry
2007-12-17 19:01:44: (connections.c.588) connection closed: write failed on fd 11
This breaks the connection, and the client then gets a 408 error because the link has expired.
Thanks a lot for your help :)
-- Bruno
Updated by ziemkowski almost 17 years ago
Same for us, except it's also dying on a development server with only the latest Firefox 2.* browsers hitting it. Appears to only happen for us on PHP pages; static files do not appear to be failing.
2007-12-18 19:34:18: (network_openssl.c.154) SSL: 1 -1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry
2007-12-18 19:34:18: (connections.c.603) connection closed: write failed on fd 9
lighttpd-1.4.18 (ssl) - a light and fast webserver
Build-Date: Oct 20 2007 08:34:49
OpenSSL 0.9.7m 23 Feb 2007
Linux 2.6.18-028stab031 #2 SMP Mon Aug 13 13:45:16 MDT 2007 i686 i686 i386 GNU/Linux
PHP 5.2.4 (cgi-fcgi) (built: Oct 21 2007 05:44:24)
Copyright (c) 1997-2007 The PHP Group
Zend Engine v2.2.0, Copyright (c) 1998-2007 Zend Technologies
This appears to be a common problem... linking related bugs as blocked, although some claim to be fixed only to return again later. Increasing to Blocker as this blocks production viability.
Updated by Anonymous almost 17 years ago
I also get SSL errors with lighttpd 1.4.13-etch8 on Debian Etch using the standard configuration:
2007-12-21 18:52:50: (connections.c.279) SSL: 1 error:14094418:SSL routines:SSL3_READ_BYTES:tlsv1 alert unknown ca
2007-12-21 18:52:50: (connections.c.279) SSL: 1 error:140940E5:SSL routines:SSL3_READ_BYTES:ssl handshake failure
2007-12-21 18:52:50: (connections.c.279) SSL: 1 error:14094418:SSL routines:SSL3_READ_BYTES:tlsv1 alert unknown ca
2007-12-21 18:52:50: (connections.c.279) SSL: 1 error:140940E5:SSL routines:SSL3_READ_BYTES:ssl handshake failure
2007-12-21 18:52:51: (connections.c.279) SSL: 1 error:14094418:SSL routines:SSL3_READ_BYTES:tlsv1 alert unknown ca
2007-12-21 18:52:51: (connections.c.279) SSL: 1 error:140940E5:SSL routines:SSL3_READ_BYTES:ssl handshake failure
Updated by Anonymous almost 17 years ago
Can confirm this bug on Debian Etch:
lighttpd-1.4.18 - a light and fast webserver
Build-Date: Dec 28 2007 15:01:37
2007-12-28 15:09:56: (connections.c.279) SSL: 1 error:140940E5:SSL routines:SSL3_READ_BYTES:ssl handshake failure
Any solution for this in sight?
Updated by Anonymous over 16 years ago
I am also seeing this.
- uname -srm
FreeBSD 7.0-RC1 amd64
- /usr/local/sbin/lighttpd -v
lighttpd-1.4.18 (ssl) - a light and fast webserver
Build-Date: Nov 23 2007 14:39:40
-- toomas.aas
Updated by stbuehler over 16 years ago
This bug is about "SSL3_WRITE_PENDING:bad write retry", not "handshake failure".
I think the problem is the "evil hack"/workaround for opera in network_openssl.c:
It modifies c->mem, which could have already been used for SSL_write with an SSL_ERROR_WANT_WRITE error, so we could get a new c->mem->ptr which results in the "bad write retry" error.
2 possible solutions:
- Remove the hack
- Delay every chunk till the next is available or connection is closed
I prefer the first - Opera users just have to update their browsers. (Bug in <= 9.01 / 8.54)
It would be nice, if someone could test the patch/give information on how to reproduce the bug.
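To illustrate the retry rule behind "bad write retry", here is a minimal sketch in plain C. It is a toy model, not OpenSSL or lighttpd code: every name in it (toy_ssl, toy_write, the TOY_* constants) is hypothetical, and it only assumes what is described above, namely that a write which could not complete must be retried with the same buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model, not OpenSSL code: remembers the arguments of a write that
 * could not complete, as a TLS layer must when a record is half sent. */
struct toy_ssl {
    const char *pending_buf; /* buffer passed to the unfinished write */
    size_t      pending_len; /* length passed to the unfinished write */
    int         has_pending; /* 1 while a write is waiting to be retried */
};

enum toy_result { TOY_OK, TOY_WANT_WRITE, TOY_BAD_WRITE_RETRY };

/* socket_ready == 0 simulates a full socket buffer (the WANT_WRITE case). */
static enum toy_result toy_write(struct toy_ssl *s, const char *buf,
                                 size_t len, int socket_ready)
{
    assert(s != NULL);
    if (s->has_pending) {
        /* A retry has to repeat the earlier call: same buffer address,
         * and a length that is not shorter than before. */
        if (buf != s->pending_buf || len < s->pending_len)
            return TOY_BAD_WRITE_RETRY;
    }
    if (!socket_ready) {
        s->pending_buf = buf;
        s->pending_len = len;
        s->has_pending = 1;
        return TOY_WANT_WRITE;
    }
    s->has_pending = 0;
    return TOY_OK;
}

/* Retry from a different buffer, as happens if the chunk memory is
 * modified or reallocated between the two calls. */
enum toy_result toy_retry_with_moved_buffer(void)
{
    struct toy_ssl s = { NULL, 0, 0 };
    const char first[]  = "chunk data";
    const char second[] = "chunk data"; /* same bytes, different address */

    (void)toy_write(&s, first, sizeof first, 0); /* blocks */
    return toy_write(&s, second, sizeof second, 1);
}

/* Retry with exactly the same arguments once the socket is writable. */
enum toy_result toy_retry_with_same_buffer(void)
{
    struct toy_ssl s = { NULL, 0, 0 };
    const char first[] = "chunk data";

    (void)toy_write(&s, first, sizeof first, 0); /* blocks */
    return toy_write(&s, first, sizeof first, 1);
}
```

In this model the moved-buffer retry is rejected and the same-buffer retry succeeds, which is why touching c->mem after a WANT_WRITE is risky.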
Updated by stbuehler over 16 years ago
- Status changed from Need Feedback to Fixed
- Resolution set to fixed
Fixed in r2084
Updated by Anonymous over 16 years ago
A fresh lighty 1.4.19 installation has what looks to be the same problem described above, if I'm reading correctly. Error log excerpt:
2008-03-12 09:40:27: (connections.c.279) SSL: 1 error:140780E5:SSL routines:SSL23_READ:ssl handshake failure
2008-03-12 09:41:42: (network_openssl.c.130) SSL: 1 -1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry
2008-03-12 09:41:42: (connections.c.614) connection closed: write failed on fd 10
2008-03-12 09:43:15: (connections.c.279) SSL: 1 error:140780E5:SSL routines:SSL23_READ:ssl handshake failure
2008-03-12 09:54:44: (network_openssl.c.130) SSL: 1 -1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry
2008-03-12 09:54:44: (connections.c.614) connection closed: write failed on fd 8
2008-03-12 09:54:44: (connections.c.279) SSL: 1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry
2008-03-12 10:00:03: (connections.c.279) SSL: 1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry
2008-03-12 10:01:40: (connections.c.279) SSL: 1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry
2008-03-12 10:02:24: (connections.c.279) SSL: 1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry
2008-03-12 10:02:28: (connections.c.279) SSL: 1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry
2008-03-12 10:05:09: (connections.c.279) SSL: 1 error:140780E5:SSL routines:SSL23_READ:ssl handshake failure
2008-03-12 10:11:42: (connections.c.279) SSL: 1 error:140780E5:SSL routines:SSL23_READ:ssl handshake failure
This was installed from source on Red Hat Enterprise 3 (yeah, I know). I've switched it back to run Apache for now, but I do have a non-production server I can do further testing on if it's helpful.
-- mstemp5
Updated by Anonymous over 16 years ago
- Status changed from Fixed to Need Feedback
- Resolution deleted (fixed)
I ran into the same problem as bug 258 with the SSL write errors.
I was finally able to track it down to the following situation. Start two
parallel downloads using SSL in two different connections. (You can also
download through x-sendfile, or php output. I used wget to make it easier
to reproduce, with large files to have long-lasting connections.)
Now terminate one of them and the other will be closed a bit later. It
is very annoying, as this way large downloads will probably terminate
before finishing.
The log would contain something like this:
2008-03-12 09:41:42: (network_openssl.c.130) SSL: 1 -1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry
2008-03-12 09:41:42: (connections.c.614) connection closed: write failed on fd 10
The log refers to the second connection which is closed by lighty.
I started debugging the situation and it looked like the SSL error is
generated in the ssl3_write_pending function, which happens when the
repeated SSL_write does not have the same arguments as the previous one,
or another SSL_write is called in between.
I checked these, but everything seemed to be fine.
Also tried a fix from openssl, but without any success:
http://rt.openssl.org/Ticket/Display.html?id=598
However after careful gdb magic the back-trace showed me that the error
function was called from SSL_shutdown and not from SSL_write. The
SSL_shutdown was also called from the connection_state_machine function
on the CON_STATE_ERROR state. Hmm, strange, according to the logs the
error occurred somewhere else...
The SSL_write failed in network_write_chunkqueue_openssl. I realized that
in reality the SSL_write was OK: it only returned SSL_ERROR_WANT_WRITE,
but the SSL error queue contained another error from an earlier SSL_*
call, in our case from SSL_shutdown.
In connection_state_machine:
1663     case CON_STATE_ERROR: /* transient */
1664
1665         /* even if the connection was drop we still have to write it to the access log */
1666         if (con->http_status) {
1667             plugins_call_handle_request_done(srv, con);
1668         }
1669 #ifdef USE_OPENSSL
1670         if (srv_sock->is_ssl) {
1671             int ret;
1672             switch ((ret = SSL_shutdown(con->ssl))) {
1673             case 1:
1674                 /* ok */
1675                 break;
1676             case 0:
1677                 SSL_shutdown(con->ssl);
1678                 break;
1679             default:
1680                 log_error_write(srv, __FILE__, __LINE__, "sds", "SSL:",
1681                         SSL_get_error(con->ssl, ret),
1682                         ERR_error_string(ERR_get_error(), NULL));
1683                 return -1;
1684             }
1685         }
1686 #endif
On line 1677 SSL_shutdown is called again, because the connection is in
non-blocking mode, where the first SSL_shutdown can require another
call. The problem is that the return value of this second SSL_shutdown is
not checked, and in case of error the error queue is not cleared.
When the SSL_write in network_write_chunkqueue_openssl returned a simple
WANT_WRITE error, it got the error code from the previous SSL_shutdown
call.
To fix the problem we simply need to check the return value of
SSL_shutdown on line 1677 and call ERR_get_error() to remove the error code
from the queue.
Another possible place is connections.c:1557, but there is no
SSL_shutdown there, just a FIXME to add it at some point, when fdevent shows
that the connection is writable. (This part runs more frequently,
so it would have caused more trouble...)
Here is a patch for 1.4.19 r2135, but would be obvious to port to 1.5
series:
Index: connections.c
===================================================================
--- connections.c (revision 2135)
+++ connections.c (working copy)
@@ -1674,7 +1674,15 @@
                 /* ok */
                 break;
             case 0:
-                SSL_shutdown(con->ssl);
+                /*
+                 * We need to get the error after SSL_shutdown, otherwise it remains
+                 * on the error queue and causes later false alerts. Usually around
+                 * SSL_write methods in network_openssl.c, which results in shutdown
+                 * of connections.
+                 */
+                if (SSL_shutdown(con->ssl) <= 0) {
+                    ERR_get_error();
+                }
                 break;
             default:
                 log_error_write(srv, __FILE__, __LINE__, "sds", "SSL:",
At least now I have learned a lot about lighty and openssl internals. :)
cheers,
Marton
PS: According to google it looks like CUPS has also similar problems...
-- marton.illes
Updated by stbuehler over 16 years ago
- Status changed from Need Feedback to Fixed
- Resolution set to fixed
Good catch!
Fixed in r2136.
I just added some ERR_clear_error() calls before SSL_write and SSL_read to make really sure no old errors are hanging in the queue.
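As a rough illustration of why a leftover error can take down an unrelated connection, and why clearing the queue before each I/O call helps, here is a self-contained toy model in plain C. It does not call OpenSSL: the queue, the functions and the error code 42 are all made up for the sketch. The one assumption carried over from OpenSSL is that the error queue is shared by all connections handled on the same thread, and that the status reported after a write is derived from that queue.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model, not OpenSSL code. One error queue is shared by every
 * connection served on the thread. */
#define TOY_QUEUE_MAX 8

static unsigned long toy_queue[TOY_QUEUE_MAX];
static size_t toy_queue_len = 0;

static void toy_err_put(unsigned long code)
{
    assert(toy_queue_len < TOY_QUEUE_MAX);
    toy_queue[toy_queue_len++] = code;
}

static void toy_err_clear(void)
{
    toy_queue_len = 0;
}

enum toy_status { TOY_WANT_WRITE, TOY_ERROR_SSL };

/* Stand-in for a failed second shutdown call on connection A:
 * it leaves an entry on the queue that nobody reads. */
static void toy_shutdown_fails(void)
{
    toy_err_put(42UL); /* 42 is an arbitrary, made-up error code */
}

/* Stand-in for a write on connection B that merely needs a retry.
 * The write itself pushes nothing, but the reported status depends
 * on whether the shared queue is empty. */
static enum toy_status toy_write_would_block(int clear_first)
{
    if (clear_first)
        toy_err_clear(); /* the "clear before I/O" approach */
    return toy_queue_len > 0 ? TOY_ERROR_SSL : TOY_WANT_WRITE;
}

/* Returns 1 if connection B survives, 0 if it would be closed. */
int toy_connection_b_survives(int clear_first)
{
    toy_err_clear();      /* start each scenario with an empty queue */
    toy_shutdown_fails(); /* connection A is torn down */
    return toy_write_would_block(clear_first) == TOY_WANT_WRITE;
}
```

Calling toy_connection_b_survives(0) models the buggy behaviour (B is closed because of A's leftover error), while toy_connection_b_survives(1) models the behaviour after the queue is cleared first.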
Updated by Anonymous over 16 years ago
Excellent news!
It's early since 1.4.19, but if this tests out well, might I suggest that the importance of the fix justifies another release?
Thanks for your good work.
-- mstemp5
Updated by hoffie over 16 years ago
- Status changed from Fixed to Need Feedback
- Resolution deleted (fixed)
This is actually a DoS problem, I requested a CVE for it.
The fix does not properly work for me. Lighty no longer drops SSL connections, but it tries to properly close the broken connection, leading to lots of SSL error messages and very high CPU consumption.
I'll try to post an updated patch in a minute.
Updated by hoffie over 16 years ago
I attached a patch, which works without problems for me now (no drop of foreign SSL connections, no endless loop, no countless SSL errors).
I'm not completely sure whether it is correct with regard to logging -- according to the man page of SSL_shutdown, a bidirectional SSL shutdown (that's what this is all about) is optional. With my patch applied, no logging takes place if the second part of the shutdown fails (the "ok, I'll shut down" from the client). IMO that's fine, but as I said, I'm not sure, and it is not me who has to decide.
Function-wise, the patch should be correct, but further testing is certainly appreciated.
Updated by hoffie over 16 years ago
Attached a patch for 1.5 as well, the logging "problem" is not present there as bidirectional SSL shutdown hasn't been implemented yet, it seems (see the FIXME comment in src/connections.c).
Updated by hoffie over 16 years ago
CVE-2008-1531 got assigned to this issue. I'll try the patch later.
Updated by hoffie over 16 years ago
Patch looks fine and appears to work properly. Attaching the same patch against 1.4.19 (distributions might want it).
Updated by stbuehler over 16 years ago
Ok, i hope the ssl error handling is ok in svn now for 1.4.x; i'll leave the bug open for the 1.5.x fix.
Updated by stbuehler over 16 years ago
- Status changed from Need Feedback to Fixed
- Resolution set to fixed
Ok, summary for now:
CVE-2008-1531 (http://nvd.nist.gov/nvd.cfm?cvename=CVE-2008-1531)
- lighttpd-1.4.x: Fixed in r2136, r2139, r2141, r2142 (the first two are the real fixes, the other two change the NEWS file to contain the CVE)
- lighttpd-1.5.x: Fixed in r2140
The problem was: if a user killed his ssl connection, lighttpd would kill another ssl connection as it didn't clear the ssl error queue.
Updated by simoncpu almost 13 years ago
Hi,
We're still experiencing this bug:
% uname -r
8.2-RELEASE-p4
% openssl version
OpenSSL 0.9.8q 2 Dec 2010
% lighttpd -v
lighttpd/1.4.29 (ssl) - a light and fast webserver
Build-Date: Oct 20 2011 17:24:28
Thanks!
Updated by stbuehler almost 13 years ago
- Description updated (diff)
- Missing in 1.5.x set to No
simoncpu: this bug report lists tons of error descriptions, some valid, some invalid, and is quite old and closed.
You really have to be more specific, and i even recommend opening a new ticket to avoid more confusion.
Also please try the latest 1.4.x svn (or the 1.4.30rc2 snapshot in http://download.lighttpd.net/lighttpd/snapshots-1.4.x/).