Bug #285

active SSL connection loss (SSL3_WRITE_PENDING:bad write retry) (CVE-2008-1531)

Added by Anonymous about 9 years ago. Updated almost 3 years ago.

Status: Fixed
Priority: Urgent
Assignee: -
Start date:
Due date:
% Done: 0%

Category: core
Target version: 1.5.0
Missing in 1.5.x: No

Description

I'm seeing a gazillion log entries like these:

2005-09-22 17:41:44: (network_openssl.c.102) SSL: 1 -1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry
2005-09-22 17:41:44: (connections.c.494) connection closed: write failed on fd 11

The page fails to complete writing. When I view the page in non-ssl mode, the page runs to completion.

-- sean

Fix-285-Remove-workaround-for-buggy-Opera-version.patch (1.5 KB) stbuehler, 2008-02-16 14:29

lightty-ssl_shutdown-fix.patch - marton.illes (803 Bytes) Anonymous, 2008-03-22 11:23

06_all_lighttpd-1.4.19-closing_foreign_ssl_connections-dos.diff - alternative, hopefully better patch (against 1.4.19, not svn!) (1.9 KB) hoffie, 2008-03-26 23:13

lighttpd-1.5-ssl-dos.patch - similar patch against svn trunk (1.5) (1.19 KB) hoffie, 2008-03-26 23:30

fix-ssl-again.patch - against lighty-1.4 svn; hopefully fixed the error handling for ssl-shutdown in a clean way (1.81 KB) stbuehler, 2008-03-27 23:17

fix-ssl-again-1.4.19.patch - same patch as fix-ssl-again.patch (by stbuehler) against 1.4.19 (2.85 KB) hoffie, 2008-03-28 16:07

committed-patch-1.4.19.patch - backport to 1.4.19 of the patch which actually got committed (2.88 KB) hoffie, 2008-03-28 17:00

History

#1 Updated by jan about 9 years ago

  • Status changed from New to Assigned

What? We fixed this in 1.4.1 and fixed the fix in 1.4.2. Even OpenBSD was happy afterwards.

Please verify that you are really using 1.4.2 or higher.

#2 Updated by Anonymous about 9 years ago

Confirmed, status-config reports 1.4.3.

-- sean

#3 Updated by Anonymous almost 9 years ago

Any luck with this? What other info could I get that would be helpful?

-- sean

#4 Updated by jan about 8 years ago

  • Status changed from Assigned to Fixed
  • Resolution set to fixed

fixed in 1.4.12

#5 Updated by Anonymous almost 7 years ago

  • Status changed from Fixed to Need Feedback
  • Resolution deleted (fixed)

Debian server

# uname -a
Linux 2.6.18-3-amd64 #1 SMP Mon Dec 4 17:04:37 CET 2006 x86_64 GNU/Linux
# openssl version
OpenSSL 0.9.8c 05 Sep 2006
# lighttpd -v
lighttpd-1.4.13 (ssl) - a light and fast webserver
Build-Date: Sep 21 2007 15:20:00

Got this kind of error

2007-11-09 15:05:35: (connections.c.279) SSL: 1 error:140940E5:SSL routines:SSL3_READ_BYTES:ssl handshake failure
2007-11-09 15:17:04: (network_openssl.c.133) SSL: 5 -1 104 Connection reset by peer
2007-11-09 15:17:04: (connections.c.588) connection closed: write failed on fd 17
2007-11-09 15:17:16: (network_openssl.c.133) SSL: 5 -1 104 Connection reset by peer
2007-11-09 15:17:16: (connections.c.588) connection closed: write failed on fd 36
2007-11-09 15:17:16: (network_openssl.c.154) SSL: 1 -1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry
2007-11-09 15:17:16: (connections.c.588) connection closed: write failed on fd 15
2007-11-09 15:17:17: (network_openssl.c.154) SSL: 1 -1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry
2007-11-09 15:17:17: (connections.c.588) connection closed: write failed on fd 16

#6 Updated by Anonymous almost 7 years ago

also getting this error


2007-12-07 10:32:22: (connections.c.279) SSL: 1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry 
2007-12-07 10:35:40: (connections.c.279) SSL: 1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry 
2007-12-07 10:37:43: (connections.c.279) SSL: 1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry 
2007-12-07 10:53:00: (connections.c.279) SSL: 1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry 
2007-12-07 11:22:36: (connections.c.279) SSL: 1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry 
2007-12-07 11:45:12: (connections.c.279) SSL: 1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry 
2007-12-07 11:59:52: (connections.c.279) SSL: 1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry 
2007-12-07 12:00:32: (connections.c.279) SSL: 1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry 

I got rid of 99.9% of the SSL handshake errors with the IE/SSL/keepalive = 60s fix. But these remained.


FreeBSD 6.1-RELEASE-p10 FreeBSD amd64

root@long# openssl version
OpenSSL 0.9.7e-p1 25 Oct 2004

root@long# lighttpd -v
lighttpd-1.4.18 (ssl) - a light and fast webserver
Build-Date: Nov 23 2007 13:51:35

-- oliver

#7 Updated by Anonymous almost 7 years ago

2007-12-17 22:09:12: (network_openssl.c.154) SSL: 1 -1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry
2007-12-17 22:09:12: (connections.c.603) connection closed: write failed on fd 136

#8 Updated by Anonymous almost 7 years ago

Hello, we still have this problem in the latest version available in Debian Lenny. It is a serious blocker when using a token plug-in ("mod_secdownload"), because the links are no longer valid after the connection breaks.

2007-12-17 18:46:10: (log.c.75) server started
2007-12-17 18:47:06: (network_openssl.c.256) SSL: 1 -1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry
2007-12-17 18:47:06: (connections.c.588) connection closed: write failed on fd 8
2007-12-17 18:47:06: (network_openssl.c.256) SSL: 1 -1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry
2007-12-17 18:47:06: (connections.c.588) connection closed: write failed on fd 10
2007-12-17 18:52:34: (network_openssl.c.256) SSL: 1 -1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry
2007-12-17 18:52:34: (connections.c.588) connection closed: write failed on fd 9
2007-12-17 18:52:34: (network_openssl.c.256) SSL: 1 -1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry
2007-12-17 18:52:34: (connections.c.588) connection closed: write failed on fd 8
2007-12-17 19:00:59: (network_openssl.c.256) SSL: 1 -1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry
2007-12-17 19:00:59: (connections.c.588) connection closed: write failed on fd 8
2007-12-17 19:01:44: (network_openssl.c.256) SSL: 1 -1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry
2007-12-17 19:01:44: (connections.c.588) connection closed: write failed on fd 9
2007-12-17 19:01:44: (network_openssl.c.256) SSL: 1 -1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry
2007-12-17 19:01:44: (connections.c.588) connection closed: write failed on fd 8
2007-12-17 19:01:44: (network_openssl.c.256) SSL: 1 -1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry
2007-12-17 19:01:44: (connections.c.588) connection closed: write failed on fd 11

This breaks the connection, and the client will get a 408 error because the link has expired.

Thanks a lot for your help :)

-- Bruno

#9 Updated by ziemkowski almost 7 years ago

Same for us, except it's also dying on a development server with only the latest Firefox 2.* browsers hitting it. Appears to only happen for us on PHP pages; static files do not appear to be failing.


2007-12-18 19:34:18: (network_openssl.c.154) SSL: 1 -1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry 
2007-12-18 19:34:18: (connections.c.603) connection closed: write failed on fd 9 

lighttpd-1.4.18 (ssl) - a light and fast webserver
Build-Date: Oct 20 2007 08:34:49

OpenSSL 0.9.7m 23 Feb 2007

Linux 2.6.18-028stab031 #2 SMP Mon Aug 13 13:45:16 MDT 2007 i686 i686 i386 GNU/Linux

PHP 5.2.4 (cgi-fcgi) (built: Oct 21 2007 05:44:24)
Copyright (c) 1997-2007 The PHP Group
Zend Engine v2.2.0, Copyright (c) 1998-2007 Zend Technologies

This appears to be a common problem... linking related bugs as blocked, although some claim to be fixed only to return again later. Increasing to Blocker as this blocks production viability.

#10 Updated by Anonymous almost 7 years ago

I also get SSL errors with lighttpd 1.4.13-etch8 on Debian Etch using the standard configuration:
2007-12-21 18:52:50: (connections.c.279) SSL: 1 error:14094418:SSL routines:SSL3_READ_BYTES:tlsv1 alert unknown ca
2007-12-21 18:52:50: (connections.c.279) SSL: 1 error:140940E5:SSL routines:SSL3_READ_BYTES:ssl handshake failure
2007-12-21 18:52:50: (connections.c.279) SSL: 1 error:14094418:SSL routines:SSL3_READ_BYTES:tlsv1 alert unknown ca
2007-12-21 18:52:50: (connections.c.279) SSL: 1 error:140940E5:SSL routines:SSL3_READ_BYTES:ssl handshake failure
2007-12-21 18:52:51: (connections.c.279) SSL: 1 error:14094418:SSL routines:SSL3_READ_BYTES:tlsv1 alert unknown ca
2007-12-21 18:52:51: (connections.c.279) SSL: 1 error:140940E5:SSL routines:SSL3_READ_BYTES:ssl handshake failure

#11 Updated by Anonymous almost 7 years ago

Can confirm this bug on Debian Etch:

lighttpd-1.4.18 - a light and fast webserver
Build-Date: Dec 28 2007 15:01:37

2007-12-28 15:09:56: (connections.c.279) SSL: 1 error:140940E5:SSL routines:SSL3_READ_BYTES:ssl handshake failure

Any solution for this in sight?

#12 Updated by Anonymous over 6 years ago

the same on gentoo lighttpd-1.4.18

#13 Updated by Anonymous over 6 years ago

I am also seeing this.

# uname -srm
FreeBSD 7.0-RC1 amd64
# /usr/local/sbin/lighttpd -v
lighttpd-1.4.18 (ssl) - a light and fast webserver
Build-Date: Nov 23 2007 14:39:40

-- toomas.aas

#14 Updated by stbuehler over 6 years ago

This bug is about "SSL3_WRITE_PENDING:bad write retry", not "handshake failure".

I think the problem is the "evil hack"/workaround for opera in network_openssl.c:

It modifies c->mem, which could have already been used for SSL_write with an SSL_ERROR_WANT_WRITE error, so we could get a new c->mem->ptr which results in the "bad write retry" error.

2 possible solutions:
- Remove the hack
- Delay every chunk till the next is available or connection is closed
I prefer the first - Opera users just have to update their browsers. (Bug in <= 9.01 / 8.54)

It would be nice, if someone could test the patch/give information on how to reproduce the bug.
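
To illustrate the retry rule described above, here is a minimal toy model in C. It does not link against OpenSSL; `toy_ssl`, `toy_write` and the `TOY_*` codes are hypothetical names invented for this sketch, and the check is a simplification of what OpenSSL does (it assumes the retried write must use exactly the same buffer pointer and length). It shows why a changed c->mem->ptr after an SSL_ERROR_WANT_WRITE would lead to "bad write retry".

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for OpenSSL result codes; the names are hypothetical. */
enum toy_result { TOY_OK, TOY_WANT_WRITE, TOY_BAD_WRITE_RETRY };

/* Minimal model of a connection that remembers an unfinished write. */
struct toy_ssl {
    const char *pending_buf; /* buffer of the write that must be retried */
    size_t pending_len;      /* length of that write */
    int socket_ready;        /* 0: socket would block, 1: writable */
};

/* Simplified retry rule: while a write is pending, the caller has to
 * repeat it with the same buffer pointer and the same length. */
static enum toy_result toy_write(struct toy_ssl *s, const char *buf, size_t len)
{
    if (s->pending_buf != NULL &&
        (buf != s->pending_buf || len != s->pending_len)) {
        return TOY_BAD_WRITE_RETRY;
    }
    if (!s->socket_ready) {
        s->pending_buf = buf;
        s->pending_len = len;
        return TOY_WANT_WRITE;
    }
    s->pending_buf = NULL;
    s->pending_len = 0;
    return TOY_OK;
}

/* Walk through the failure and the correct retry. */
void toy_write_retry_demo(void)
{
    struct toy_ssl s = { NULL, 0, 0 };
    char first[] = "chunk";
    char moved[] = "chunk"; /* same bytes, different address */

    assert(toy_write(&s, first, 5) == TOY_WANT_WRITE);
    /* Retrying from a new address is rejected, as with a reallocated buffer. */
    assert(toy_write(&s, moved, 5) == TOY_BAD_WRITE_RETRY);
    /* Retrying with the original pointer succeeds once the socket is ready. */
    s.socket_ready = 1;
    assert(toy_write(&s, first, 5) == TOY_OK);
}
```

In this model the second call fails even though the bytes are identical, because only the address is compared. That is the situation a buffer modified or reallocated between two write attempts would produce.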

#15 Updated by stbuehler over 6 years ago

  • Status changed from Need Feedback to Fixed
  • Resolution set to fixed

Fixed in r2084

#16 Updated by Anonymous over 6 years ago

A fresh lighty 1.4.19 installation has what looks to be the same problem described above, if I'm reading correctly. Error log excerpt:


2008-03-12 09:40:27: (connections.c.279) SSL: 1 error:140780E5:SSL routines:SSL23_READ:ssl handshake failure 
2008-03-12 09:41:42: (network_openssl.c.130) SSL: 1 -1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry 
2008-03-12 09:41:42: (connections.c.614) connection closed: write failed on fd 10 
2008-03-12 09:43:15: (connections.c.279) SSL: 1 error:140780E5:SSL routines:SSL23_READ:ssl handshake failure 
2008-03-12 09:54:44: (network_openssl.c.130) SSL: 1 -1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry 
2008-03-12 09:54:44: (connections.c.614) connection closed: write failed on fd 8 
2008-03-12 09:54:44: (connections.c.279) SSL: 1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry 
2008-03-12 10:00:03: (connections.c.279) SSL: 1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry 
2008-03-12 10:01:40: (connections.c.279) SSL: 1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry 
2008-03-12 10:02:24: (connections.c.279) SSL: 1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry 
2008-03-12 10:02:28: (connections.c.279) SSL: 1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry 
2008-03-12 10:05:09: (connections.c.279) SSL: 1 error:140780E5:SSL routines:SSL23_READ:ssl handshake failure 
2008-03-12 10:11:42: (connections.c.279) SSL: 1 error:140780E5:SSL routines:SSL23_READ:ssl handshake failure 

This installed from source on red hat enterprise 3 (yeah, I know). I've switched it back to run apache for now but do have a non-production server I can do further testing on if it's helpful.

-- mstemp5

#17 Updated by Anonymous over 6 years ago

  • Status changed from Fixed to Need Feedback
  • Resolution deleted (fixed)

I ran into the same problem as bug 258 with the SSL write errors.

I was finally able to track it down to the following situation. Start two
parallel downloads using SSL in two different connections. (You can also
download through x-sendfile or PHP output. I used wget to make it easier to
reproduce, with large files to get long-lasting connections.)

Now terminate one of them, and the other will be closed a bit later. This
is very annoying, as large downloads are likely to terminate
before finishing.

The log would contain something like this:
2008-03-12 09:41:42: (network_openssl.c.130) SSL: 1 -1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry
2008-03-12 09:41:42: (connections.c.614) connection closed: write failed on fd 10

The log refers to the second connection which is closed by lighty.

I started debugging the situation, and it looked like the SSL error is
generated in the ssl3_write_pending function, which happens when the
repeated SSL_write does not have the same arguments as the previous one,
or another SSL_write is called in between.
I checked these, but everything seemed to be fine.

I also tried a fix from openssl, but without any success:
http://rt.openssl.org/Ticket/Display.html?id=598

However, after careful gdb magic the backtrace showed me that the error
function was called from SSL_shutdown and not from SSL_write. The
SSL_shutdown was called from the connection_state_machine function
in the CON_STATE_ERROR state. Hmm, strange, according to the logs the
error occurred somewhere else...

The SSL_write failed in network_write_chunkqueue_openssl. I realized that
in reality the SSL_write was OK: it only returned SSL_ERROR_WANT_WRITE,
but the SSL error queue contained another error from an earlier SSL_*
call, in our case from SSL_shutdown.

In connection_state_machine:

1663     case CON_STATE_ERROR: /* transient */
1664
1665         /* even if the connection was drop we still have to write it to the access log */
1666         if (con->http_status) {
1667             plugins_call_handle_request_done(srv, con);
1668         }
1669 #ifdef USE_OPENSSL
1670         if (srv_sock->is_ssl) {
1671             int ret;
1672             switch ((ret = SSL_shutdown(con->ssl))) {
1673             case 1:
1674                 /* ok */
1675                 break;
1676             case 0:
1677                 SSL_shutdown(con->ssl);
1678                 break;
1679             default:
1680                 log_error_write(srv, __FILE__, __LINE__, "sds", "SSL:",
1681                         SSL_get_error(con->ssl, ret),
1682                         ERR_error_string(ERR_get_error(), NULL));
1683                 return -1;
1684             }
1685         }
1686 #endif

On line 1677 SSL_shutdown is called again, because the connection is in
non-blocking mode, where the first SSL_shutdown can require another
call. The problem is that the return value of this second SSL_shutdown is
not checked, and in case of an error the error queue is not cleared.

When the SSL_write in network_write_chunkqueue_openssl returned a simple
WANT_WRITE error it got the error code from the previous SSL_shutdown
call.

To fix the problem we simply need to check the return value of
SSL_shutdown on line 1677 and call ERR_get_error() to remove the error code
from the queue.
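
The mechanism can be sketched with a small toy model in C. It does not use OpenSSL; every `toy_*` name and the `TOYQ_*` codes are hypothetical, and the model assumes two things: the error queue is shared by all connections handled in the same thread, and a failed call is classified as a real SSL error whenever that queue is non-empty. The error code is the one from the log above.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_QUEUE_MAX 8

/* Toy model of one error queue shared by all connections in a thread. */
static unsigned long toy_queue[TOY_QUEUE_MAX];
static size_t toy_queue_len = 0;

static void toy_err_put(unsigned long code)
{
    if (toy_queue_len < TOY_QUEUE_MAX) toy_queue[toy_queue_len++] = code;
}

/* Pop the oldest queued error, or return 0 if the queue is empty. */
static unsigned long toy_err_get(void)
{
    unsigned long code;
    size_t i;
    if (toy_queue_len == 0) return 0;
    code = toy_queue[0];
    for (i = 1; i < toy_queue_len; i++) toy_queue[i - 1] = toy_queue[i];
    toy_queue_len--;
    return code;
}

static void toy_err_clear(void) { toy_queue_len = 0; }

/* Connection A: a failing shutdown queues an error and returns -1. */
static int toy_shutdown_fails(void) { toy_err_put(0x1409F07FUL); return -1; }

/* Connection B: a write that merely would block queues nothing. */
static int toy_write_would_block(void) { return -1; }

enum toy_class { TOYQ_WANT_WRITE, TOYQ_ERROR_SSL };

/* Classify a failed call: any queued error makes it look fatal. */
static enum toy_class toy_classify(void)
{
    return toy_queue_len > 0 ? TOYQ_ERROR_SSL : TOYQ_WANT_WRITE;
}

void toy_error_queue_demo(void)
{
    /* The second shutdown on connection A fails and its result is ignored. */
    (void)toy_shutdown_fails();
    /* A harmless would-block write on connection B now looks fatal. */
    assert(toy_write_would_block() < 0);
    assert(toy_classify() == TOYQ_ERROR_SSL);

    /* Popping the error after the failed shutdown removes the stale entry. */
    assert(toy_err_get() == 0x1409F07FUL);
    assert(toy_write_would_block() < 0);
    assert(toy_classify() == TOYQ_WANT_WRITE);

    /* Clearing the queue before each I/O call has the same effect. */
    (void)toy_shutdown_fails();
    toy_err_clear();
    assert(toy_write_would_block() < 0);
    assert(toy_classify() == TOYQ_WANT_WRITE);
}
```

The point of the sketch is only the ordering: the error is queued by one connection and consumed while handling another, so the connection that gets closed is not the one that failed.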

Another possible place is connections.c:1557, but there is no
SSL_shutdown there, just a FIXME to add it sometime when fdevent shows
that the connection is writable. (This part runs more frequently,
so it would have caused more trouble...)

Here is a patch for 1.4.19 r2135; porting it to the 1.5 series should be
straightforward:

Index: connections.c
===================================================================
--- connections.c (revision 2135)
+++ connections.c (working copy)
@@ -1674,7 +1674,15 @@
 /* ok */
 break;
 case 0:
- SSL_shutdown(con->ssl);
+ /*
+ * We need to get the error after SSH_shutdown, otherwise it remains
+ * on the error queue and causes latter false-alerts. Usually around
+ * SSL_write methods in network_openssl.c which results to shutdown
+ * of connections.
+ */
+ if (SSL_shutdown(con->ssl) <= 0) {
+ ERR_get_error();
+ }
 break;
 default:
 log_error_write(srv, __FILE__, __LINE__, "sds", "SSL:",

At least now I learned a lot about lightty and openssl internals. :)

cheers,

Marton

PS: According to google it looks like CUPS has also similar problems...

-- marton.illes

#18 Updated by stbuehler over 6 years ago

  • Status changed from Need Feedback to Fixed
  • Resolution set to fixed

Good catch!

Fixed in r2136.

I just added some ERR_clear_error() calls before SSL_write and SSL_read to make really sure no old errors are left hanging in the queue.

#19 Updated by Anonymous over 6 years ago

Excellent news!

It's early since 1.4.19, but if this tests out well, might I suggest that the importance of the fix justifies another release?

Thanks for your good work.

-- mstemp5

#20 Updated by hoffie over 6 years ago

  • Status changed from Fixed to Need Feedback
  • Resolution deleted (fixed)

This is actually a DoS problem, I requested a CVE for it.

The fix does not properly work for me. Lighty no longer drops SSL connections, but it tries to properly close the broken connection, leading to lots of SSL error messages and very high CPU consumption.
I'll try to post an updated patch in a minute.

#21 Updated by hoffie over 6 years ago

I attached a patch, which works without problems for me now (no drop of foreign SSL connections, no endless loop, no countless SSL errors).

I'm not completely sure whether it is correct with regard to logging -- according to the man page of SSL_shutdown, a bidirectional SSL shutdown (that's what this is all about) is optional. With my patch applied, no logging takes place if the second part of the shutdown fails (the "ok, i'll shutdown" from the client). IMO that's fine, but as I said, I'm not sure and it is not me who has to decide.

Function-wise, the patch should be correct, but further testing is certainly appreciated.

#22 Updated by hoffie over 6 years ago

Attached a patch for 1.5 as well, the logging "problem" is not present there as bidirectional SSL shutdown hasn't been implemented yet, it seems (see the FIXME comment in src/connections.c).

#23 Updated by hoffie over 6 years ago

CVE-2008-1531 got assigned to this issue. I'll try the patch later.

#24 Updated by hoffie over 6 years ago

Patch looks fine and appears to work properly. Attaching the same patch against 1.4.19 (distributions might want it).

#25 Updated by stbuehler over 6 years ago

Ok, i hope the ssl error handling is ok in svn now for 1.4.x; i'll leave the bug open for the 1.5.x fix.

#26 Updated by stbuehler over 6 years ago

  • Status changed from Need Feedback to Fixed
  • Resolution set to fixed

Ok, summary for now:
CVE-2008-1531 (http://nvd.nist.gov/nvd.cfm?cvename=CVE-2008-1531)
- lighttpd-1.4.x: Fixed in r2136, r2139, r2141, r2142 (the first two are the real fixes, the other two change the NEWS file to contain the CVE)
- lighttpd-1.5.x: Fixed in r2140

The problem was: if a user killed his ssl connection, lighttpd would kill another ssl connection as it didn't clear the ssl error queue.

#27 Updated by darix over 6 years ago

r2144 fixes a small typo in the patch.

#28 Updated by simoncpu almost 3 years ago

Hi,

We're still experiencing this bug:

% uname -r
8.2-RELEASE-p4
% openssl version
OpenSSL 0.9.8q 2 Dec 2010
% lighttpd -v
lighttpd/1.4.29 (ssl) - a light and fast webserver
Build-Date: Oct 20 2011 17:24:28

Thanks!

#29 Updated by stbuehler almost 3 years ago

  • Description updated (diff)
  • Missing in 1.5.x set to No

simoncpu: this bug report lists tons of error descriptions, some valid, some invalid, and is quite old and closed.

You really have to be more specific, and i even recommend opening a new ticket to avoid more confusion.
Also please try the latest 1.4.x svn (or the 1.4.30rc2 snapshot in http://download.lighttpd.net/lighttpd/snapshots-1.4.x/).
