Bug #657 (closed)
lighty vs. apt-get - problem with pipelining
Description
Symptoms: running an in-house Debian mirror on lighty, I've noticed that sometimes the connection is "reset by peer" while downloading packages. This happens fairly rarely (maybe 1% of the time, meaning 1 in 100 packages will produce the problem). Retrying helps. Also, adding
Acquire::http::Pipeline-Depth "0";
to the apt config - aka, disabling http/1.1 pipelining - seems to be a valid workaround.
However, the APT docs scream about the above as being only needed on non-standard-compliant platforms. Quoting from 'man apt.conf':
"One setting is provided to control the pipeline depth in cases where the remote server is not RFC conforming or buggy (such as Squid 2.0.2). Acquire::http::Pipeline-Depth can be a value from 0 to 5 indicating how many outstanding requests APT should send. A value of zero MUST be specified if the remote host does not properly linger on TCP connections - otherwise data corruption will occur. Hosts which require this are in violation of RFC 2068."
tcpdumps are available if anyone's interested.
cheers,
raas
-- raas
Updated by moo over 18 years ago
There can be keep-alive limits on the server side, and the client never knows when it reaches that limit. The client should retry if a pipelined request fails; even the first request in the pipeline should be tried twice, according to the HTTP RFC.
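moo's retry advice can be sketched as a small policy function (the fetch_fn transport and the helper names are hypothetical, not APT's actual code):

```c
#include <stdbool.h>

/* Hypothetical transport callback; returns true on success. */
typedef bool (*fetch_fn)(const char *url);

/* Retry policy moo describes: a request that failed (e.g. because the
 * server dropped a pipelined connection at its keep-alive limit) is
 * retried before being reported as an error. */
static bool fetch_with_retry(const char *url, fetch_fn do_fetch, int max_tries)
{
    for (int i = 0; i < max_tries; i++) {
        if (do_fetch(url))
            return true;
        /* a real client would also reopen the connection here */
    }
    return false;
}

/* Stub that fails on the first attempt, as a dropped pipeline would. */
static int flaky_calls = 0;
static bool flaky_fetch(const char *url)
{
    (void)url;
    return ++flaky_calls >= 2;
}
```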
Updated by Anonymous about 18 years ago
We were also seeing this issue when using APT with lighttpd 1.4.13 over a fast LAN. I did some test runs with strace and tcpdump, and found the following strace:
open("/srv/ubuntu/pool/main/p/parted/parted_1.7.1-2.1ubuntu3_i386.deb", O_RDONLY|O_LARGEFILE) = 10
fcntl64(10, F_SETFD, FD_CLOEXEC) = 0
sendfile64(8, 10, [0], 55200) = -1 EAGAIN (Resource temporarily unavailable)
setsockopt(8, SOL_TCP, TCP_CORK, [0], 4) = 0
write(3, "66.230.200.243 apt.wikimedia.org - [06/Dec/2006:00:36:34 +0000] \"GET /ubuntu/pool/main/p/parted/parted_1.7.1-2.1ubuntu3_i386.deb HTTP/1.1\" 200 0 \"-\" \"Ubuntu APT-HTTP/1.3\"\n", 171) = 171
close(10) = 0
shutdown(8, 1 /* send */) = 0
lighty receives an EAGAIN on sendfile() (buffer full?) and then shuts down the sending side of the socket and logs a 200 OK?!?
The following code in network_linux_sendfile.c seems responsible for this:
    if (-1 == (r = sendfile(fd, c->file.fd, &offset, toSend))) {
        switch (errno) {
        case EAGAIN:
        case EINTR:
            r = 0;
            break;
        case EPIPE:
        case ECONNRESET:
            return -2;
        default:
            log_error_write(srv, __FILE__, __LINE__, "ssd",
                    "sendfile failed:", strerror(errno), fd);
            return -1;
        }
    }

    if (r == 0) {
        /* We got an event to write but we wrote nothing
         *
         * - the file shrinked -> error
         * - the remote side closed inbetween -> remote-close
         */
        if (HANDLER_ERROR == stat_cache_get_entry(srv, con, c->file.name, &sce)) {
            /* file is gone ? */
            return -1;
        }
        if (offset > sce->st.st_size) {
            /* file shrinked, close the connection */
            return -1;
        }
        return -2;
    }
If EAGAIN, r is set to 0. In the following code block, it's assumed that either the source file has shrunk (not the case) or the remote end must have closed the connection (not the case either). The latter seems strange to me - shouldn't ECONNRESET be returned for that?
EAGAIN likely means that some buffer is full, in which case this code returns -2 which makes lighty close down the connection.
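The distinction mark is pointing at can be sketched as a small errno classifier (classify_sendfile_errno is a hypothetical helper for illustration; lighttpd's actual fix in r2072 differs in detail):

```c
#include <errno.h>

/* Hypothetical helper illustrating the point: EAGAIN/EINTR are
 * transient and must not be treated as a remote close.
 * Return codes mirror the snippet above:
 *   0 = wrote nothing, retry on the next write event
 *  -2 = remote side closed the connection
 *  -1 = hard error */
static int classify_sendfile_errno(int err)
{
    switch (err) {
    case EAGAIN:   /* socket send buffer full: not an error at all */
    case EINTR:    /* interrupted by a signal: just retry */
        return 0;
    case EPIPE:
    case ECONNRESET:
        return -2; /* the peer really did go away */
    default:
        return -1;
    }
}
```

The key point is that a return of 0 should lead to re-arming the write event, not to the file-shrunk/remote-close heuristic shown above.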
I disabled the return -2, which seems to fix this issue. However, there seems to be another bug occurring with APT...
-- mark
Updated by Anonymous about 18 years ago
...and the other issue was simply a too-low server.max-keep-alive-requests, which is not set to 128 by default as the manual used to say, but to 16.
lighttpd correctly closed the connection after 16 requests with Connection: close, but apparently APT doesn't handle that correctly.
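For reference, a minimal lighttpd.conf fragment raising the limit (server.max-keep-alive-requests is a real lighttpd directive; the value 128 is just what the manual used to claim as the default):

```
# raise the per-connection request limit; the built-in default here was 16
server.max-keep-alive-requests = 128
```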
-- mark
Updated by Anonymous over 17 years ago
Hi,
we have the same problems with debian.netcologne.de and deb.grml.org where I tried lighttpd. A fix would be really appreciated.
Thanks Alex
-- formorer
Updated by jan almost 17 years ago
- Status changed from New to Fixed
- Resolution set to fixed
A patch for the EAGAIN issue was applied in r2072.
Updated by apenwarr over 15 years ago
- File linger-on-close.patch added
- Status changed from Fixed to Patch Pending
I believe I've found the cause of this bug and made a patch for it. For more information, see:
Updated by apenwarr over 15 years ago
apenwarr wrote:
I believe I've found the cause of this bug and made a patch for it. For more information, see:
Attached is a perl test program that demonstrates the bug. To use it, you need a file reachable at http://localhost/hello. The file should be a few hundred kbytes (I used a copy of /usr/bin/joe) for best results. Also, set your server.max-keep-alive-requests to 16 or so.
Note that, because this problem is exercising a race condition, it won't be 100% reliable. You might need to try different file sizes, different max-keep-alive-requests values, and running it on a separate machine from the web server.
Expected behaviour: you'll get a number of responses == max-keep-alive-requests, and all will have the same length.
Actual behaviour with bug: you'll get the right number of responses, usually, but the last one will be too short.
Actual behaviour after applying my above patch: as expected.
Updated by Olaf-van-der-Spek over 15 years ago
"otherwise data corruption will occur."
Why is that? Doesn't the RST to the client result in an unclean socket closure? I agree that there's an issue with the server, but I don't see why this would result in data corruption.
Updated by stbuehler over 15 years ago
- Target version changed from 1.4.19 to 1.4.24
It results in data corruption as a RST kills the buffers, so the client doesn't see the complete response.
Under Linux, closing a socket ends in a RST if either the in-queue isn't empty or more data arrives after the close. So there is no easy way to tell the kernel "just do FIN handling and throw away all incoming data", which means we have to do it ourselves (e.g. shutdown(SHUT_WR), read() until EOF and then close()); but I really don't want to block a full connection* object for that (shutdown(SHUT_RD) is a no-op on Linux).
I think we should use an extra "closing-fd" queue with a global timeout and a global limit (close old fd if a new arrives), so we don't hit out-of-fds.
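The sequence stbuehler describes can be sketched as follows (lingering_close is a hypothetical name; a real server would bound the drain loop with the timeout and global fd limit he proposes):

```c
#include <sys/socket.h>
#include <unistd.h>

/* Sketch of a lingering close: stop sending, drain whatever the client
 * still has in flight, then close, so the kernel can finish the FIN
 * handshake instead of answering late data with a RST.
 * Returns the number of bytes drained, or -1 on error. */
static long lingering_close(int fd)
{
    char buf[4096];
    long drained = 0;
    ssize_t n;

    if (shutdown(fd, SHUT_WR) < 0) {  /* send our FIN */
        close(fd);
        return -1;
    }
    /* read() until EOF; a real implementation bounds this loop with a
     * timeout and closes old fds when too many are lingering */
    while ((n = read(fd, buf, sizeof(buf))) > 0)
        drained += n;

    close(fd);
    return n == 0 ? drained : -1;
}
```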
Updated by Olaf-van-der-Spek over 15 years ago
stbuehler wrote:
It results in data corruption as a RST kills the buffers, so the client doesn't see the complete response.
Doesn't the client see an error (instead of a clean EOF)?
Updated by stbuehler over 15 years ago
Hm yes, right. I couldn't reproduce the problem with the perl script, so perhaps apenwarr could show us the (client) strace and/or tcpdump.
Updated by Olaf-van-der-Spek over 15 years ago
stbuehler wrote:
but i really don't want to block a full connection* object for that (shutdown(SHUT_RD) is a no-op on linux).
Are connection objects that expensive?
Updated by stbuehler over 15 years ago
Yes; they use memory of course, and if you hit the connection limit it will block other users. I just don't want to make DOS too easy...
Updated by Olaf-van-der-Spek over 15 years ago
This only happens when the keep alive limit has been reached. It's easy for an attacker not to do that.
Updated by apenwarr over 15 years ago
I can submit tcpdump/strace if you guys really want, but it's not very pretty and I'd have to un-patch my lighttpd in order to reproduce it, so I'll delay unless you really really need it.
To clarify some of the above discussion:
- shutdown(SHUT_RD) is definitely not a no-op. Like close(), it'll make sure the remote end gets RST if they try to send you any more data. Sometimes that's useful, but not in our case. What it doesn't do is flush the input queue: http://alumnit.ca/~apenwarr/log/?m=200908#14
- lingering on close is no different than waiting a few more seconds to see if a client will send more requests. Clients could cause a DOS just as easily that way, I'd think.
- apache does things the way I describe.
- you could indeed make a new kind of object that only drains its input rather than doing any kind of complex processing on it, but this would still use a file descriptor, which I imagine is the most expensive part of an (otherwise idle) connection object anyway.
- when the server closes too early, the client sees ECONNABORTED (usually) and the kernel may drop some of its incoming data. So if you want to imagine what a tcpdump looks like from the client end, it's a received RST; if you want to know what a strace looks like, the client is getting ECONNABORTED during read(). (The RST is triggered by a write(), but of course doesn't happen immediately during a write, so it gets returned on whatever the next socket call is after receiving the RST.)
Updated by stbuehler over 15 years ago
Nice post, looks like a good summary of the problem to me (you could have posted that link sooner - I had to figure out the SO_LINGER problem myself :) ).
shutdown(SHUT_RD) is a no-op. Have a look at the kernel code:
- syscall shutdown: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=net/socket.c;h=6d47165590473daa4990bf69b0435d5c49b41302;hb=HEAD#l1840
- security_socket_shutdown returns 0 by default (selinux/... may overwrite that)
- sock->ops->shutdown is tcp_shutdown for TCP over ipv4/ipv6
- tcp_shutdown: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=net/ipv4/tcp.c;h=91145244ea630777e7e4a2e7f112f3c31bc7f90e;hb=HEAD#l1776
      if (!(how & SEND_SHUTDOWN)) return;
You are right with your comment on DOS; I guess we could use server.max-read-idle as the timeout. But an fd costs us almost nothing, and a connection* object is rather big, so I still think the extra object would be the better solution (we already have this in the sandbox), but I guess no one is gonna implement it for 1.x ...
I guess you agree that it doesn't lead to data corruption if the client handles RST/ECONNABORTED the way it should. There are still situations in which this doesn't help: if lighty always closes after the first request (due to keep-alive settings, a missing Content-Length, or whatever), a client which always tries pipelining may never get the complete result for the first request.
Now I just need a developer to review the patch :p
Updated by Olaf-van-der-Spek over 15 years ago
stbuehler wrote:
and a connection* object is rather big,
How big is rather big?
I guess you agree that it doesn't lead to data corruption if the client handles RST/ECONNABORTED the way it should. There are still situations in which this doesn't help: if lighty always closes after the first request (due to keep-alive settings, a missing Content-Length, or whatever), a client which always tries pipelining may never get the complete result for the first request.
It might fail in a lot of other situations too.
Updated by apenwarr over 15 years ago
Basically, an HTTP client can't assume the server is HTTP/1.1 (and thus supports multiple requests per connection). It theoretically has to make a request, wait for the answer, check that it's HTTP/1.1, and only then assume pipelining will work. A client that unconditionally assumes pipelining will fail for lots of reasons.
With my patch added to lighttpd, I believe all properly-written http clients will be fine.
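The rule apenwarr states can be sketched as a check on the first response's status line (may_pipeline is a hypothetical helper, not code from any real client; a production client would also have to honor a Connection: close header):

```c
#include <stdbool.h>
#include <string.h>

/* A client may only assume pipelining after the first response has
 * proven the server speaks HTTP/1.1; an HTTP/1.0 status line means
 * one request per connection. */
static bool may_pipeline(const char *status_line)
{
    return strncmp(status_line, "HTTP/1.1 ", 9) == 0;
}
```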
Updated by stbuehler about 15 years ago
- Status changed from Patch Pending to Fixed
- % Done changed from 0 to 100
Applied in changeset r2638.
Updated by stbuehler about 15 years ago
Just a small notice: it seems like many clients don't close their end of the connection, so I will drop the timeout to 5 seconds (instead of 30). See #2086