Bug #657

lighty vs. apt-get - problem with pipelining

Added by Anonymous over 8 years ago. Updated almost 5 years ago.

Status:FixedStart date:
Priority:NormalDue date:
Assignee:-% Done:

100%

Category:core
Target version:1.4.24
Missing in 1.5.x:No

Description

Symptoms: running an in-house debian mirror on lighty I've noticed that sometimes the connection is "reset by peer" while downloading packages. This happens fairly rarely (maybe 1% of the time, meaning 1 in 100 packages will produce the problem). Retrying helps. Also, adding

Acquire::http::Pipeline-Depth "0";

to the apt config - aka, disabling http/1.1 pipelining - seems to be a valid workaround.

However, the APT docs scream about the above as being only needed on non-standard-compliant platforms. Quoting from 'man apt.conf':

"One setting is provided to control the pipeline depth in cases where the
remote server is not RFC conforming or buggy (such as Squid 2.0.2)
Acquire::http::Pipeline-Depth can be a value from 0 to 5 indicating how
many outstanding requests APT should send. A value of zero MUST be specified if
the remote host does not properly linger on TCP con? nections - otherwise data
corruption will occur. Hosts which require this are in viola? tion of RFC 2068."

tcpdumps are available if anyone's interested.

cheers,

raas

-- raas

linger-on-close.patch Magnifier - Path for linger-on-close bug (4.5 KB) apenwarr, 2009-08-14 05:45

client.pl Magnifier (2.06 KB) apenwarr, 2009-08-14 07:22


Related issues

Related to Bug #2086: mod_status has too many 'close' states Fixed 2009-10-19

Associated revisions

Revision 2072
Added by jan over 6 years ago

fixed handling of EAGAIN in linux-sendfile (fixes #657)

Revision 2638
Added by stbuehler almost 5 years ago

Fix linger-on-close behaviour to avoid rare failure conditions (was r2636, fixes #657, thx apenwarr)

Sry for the broken commit message

History

#1 Updated by moo over 8 years ago

there can be keep-alive limits in the server side, and the client never know when it reach the limit. the client should retry if the pipelined requests is failed. even the 1st request in the pipeline should be try 2 times, according to the http rfc

#2 Updated by Anonymous over 7 years ago

We were also seeing this issue when using APT with lighttpd 1.4.13 over a fast LAN. I did some test runs with trace and tcpdumps, and found the following strace:


open("/srv/ubuntu/pool/main/p/parted/parted_1.7.1-2.1ubuntu3_i386.deb", O_RDONLY|O_LARGEFILE
) = 10
fcntl64(10, F_SETFD, FD_CLOEXEC)        = 0
sendfile64(8, 10, [0], 55200)           = -1 EAGAIN (Resource temporarily unavailable)
setsockopt(8, SOL_TCP, TCP_CORK, [0], 4) = 0
write(3, "66.230.200.243 apt.wikimedia.org - [06/Dec/2006:00:36:34 +0000] \"GET /ubuntu/pool
/main/p/parted/parted_1.7.1-2.1ubuntu3_i386.deb HTTP/1.1\" 200 0 \"-\" \"Ubuntu APT-HTTP/1.3
\"\n", 171) = 171
close(10)                               = 0
shutdown(8, 1 /* send */)               = 0

lighty receives a EAGAIN on sendfile() (buffer full?) and then shuts down the sending to the socket and logs a 200 OK?!?

The following code in network_linux_sendfile.c seems responsible for this:


            if (-1 == (r = sendfile(fd, c->file.fd, &offset, toSend))) {
                    switch (errno) {
                    case EAGAIN:
                    case EINTR:
                            r = 0;
                            break;
                    case EPIPE:
                    case ECONNRESET:
                            return -2;
                    default:
                            log_error_write(srv, __FILE__, __LINE__, "ssd",
                                            "sendfile failed:", strerror(errno), fd);
                            return -1;
                    }
            }
            if (r == 0) {
                    /* We got an event to write but we wrote nothing
                     *
                     * - the file shrinked -> error
                     * - the remote side closed inbetween -> remote-close */
                    if (HANDLER_ERROR == stat_cache_get_entry(srv, con, c->file.name, &sce)) {
                        /* file is gone ? */
                        return -1;
                    }
                    if (offset > sce->st.st_size) {
                        /* file shrinked, close the connection */
                        return -1;
                    }
                    return -2;
            }

If EAGAIN, r is set to 0. In the following code block, it's assumed that either the source file has shrunk (not the case) or the remote end must have closed the connection (not the case either). The latter seems strange to me - shouldn't ECONNRESET be returned for that?

EAGAIN likely means that some buffer is full, in which case this code returns -2 which makes lighty close down the connection.

I disabled the return -2, which seems to fix this issue. However, there seems to be another bug occurring with APT...

-- mark

#3 Updated by Anonymous over 7 years ago

...and the other issue was simply a too low server.max-keep-alive-requests, which is not set to 128 by default as the manual used to say, but to 16.

lighttpd correctie closed the connection after 16 requests with Connection: close, but apparently APT doesn't handle that correctly.

-- mark

#4 Updated by Anonymous almost 7 years ago

Hi,

we have the same problems with debian.netcologne.de and deb.grml.org where I tried lighttpd. A fix would be really appreciated.

Thanks Alex

-- formorer

#5 Updated by jan over 6 years ago

  • Status changed from New to Fixed
  • Resolution set to fixed

a patch for the EAGAIN was applied in r2072

#6 Updated by apenwarr about 5 years ago

I believe I've found the cause of this bug and made a patch for it. For more information, see:

http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=541428

#7 Updated by apenwarr about 5 years ago

apenwarr wrote:

I believe I've found the cause of this bug and made a patch for it. For more information, see:

http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=541428

Attached is a perl test program that demonstrates the bug. To use it, you need a file called http://localhost/hello. The file should few a few hundred kbytes (I used a copy of /usr/bin/joe) for best results. Also, set your server.max-keep-alive-requests to 16 or so.

Note that, because this problem is exercising a race condition, it won't be 100% reliable. You might need to try different file sizes, different max-keep-alive-requests values, and running it on a separate machine from the web server.

Expected behaviour: you'll get a number of responses == max-keep-alive-requests, and all will have the same length.

Actual behaviour with bug: you'll get the right number of responses, usually, but the last one will be too short.

Actual behaviour after applying my above patch: as expected.

#8 Updated by Olaf-van-der-Spek about 5 years ago

otherwise data corruption will occur.

Why is that? Doesn't the RST to the client result in an unclean socket closure? I agree that there's an issue with the server, but I don't see why this would result in data corruption.

#9 Updated by stbuehler about 5 years ago

  • Target version changed from 1.4.19 to 1.4.24

It results in data corruption as a RST kills the buffers, so the client doesn't see the complete response.

Under linux closing a socket ends in RST if either the in-queue isn't empty or if more data arrives after the close. So there is no easy way to tell the kernel "just do FIN handling and throw away all incoming data", which means we have to do it (e.g. shutdown(SHUT_WR), read() until EOF and then close() it); but i really don't want to block a full connection* object for that (shutdown(SHUT_RD) is a no-op on linux).

I think we should use an extra "closing-fd" queue with a global timeout and a global limit (close old fd if a new arrives), so we don't hit out-of-fds.

#10 Updated by Olaf-van-der-Spek about 5 years ago

stbuehler wrote:

It results in data corruption as a RST kills the buffers, so the client doesn't see the complete response.

Doesn't the client see an error (instead of a clean EOF)?

#11 Updated by stbuehler about 5 years ago

Hm yes, right. I couldn't reproduce the problem with the perl script, so perhaps apenwarr could show us the (client) strace or/and tcpdump.

#12 Updated by Olaf-van-der-Spek about 5 years ago

stbuehler wrote:

but i really don't want to block a full connection* object for that (shutdown(SHUT_RD) is a no-op on linux).

Are connection objects that expensive?

#13 Updated by stbuehler about 5 years ago

Yes; they use memory of course, and if you hit the connection limit it will block other users. I just don't want make DOS too easy...

#14 Updated by Olaf-van-der-Spek almost 5 years ago

This only happens when the keep alive limit has been reached. It's easy for an attacker not to do that.

#15 Updated by apenwarr almost 5 years ago

I can submit tcpdump/strace if you guys really want, but it's not very pretty and I'd have to un-patch my lighttpd in order to reproduce it, so I'll delay unless you really really need it.

To clarify some of the above discussion:

- shutdown(SHUT_RD) is definitely not a no-op. Like close(), it'll make sure the remote end gets RST if they try to send you any more data. Sometimes that's useful, but not in our case. What it doesn't do is flush the input queue: http://alumnit.ca/~apenwarr/log/?m=200908#14

- lingering on close is no different than waiting a few more seconds to see if a client will send more requests. Clients could cause a DOS just as easily that way, I'd think.

- apache does things the way I describe.

- you could indeed make a new kind of object that only drains its input rather than doing any kind of complex processing on it, but this would still use a file descriptor, which I imagine is the most expensive part of an (otherwise idle) connection object anyway.

- when the server closes too early, the client sees ECONNABORTED (usually) and the kernel may drop some of its incoming data. So if you want to imagine what a tcpdump looks like from the client end, it's a received RST; if you want to know what a strace looks like, the client is getting ECONNABORTED during read(). (The RST is triggered by a write(), but of course doesn't happen immediately during a write, so it gets returned on whatever the next socket call is after receiving the RST.)

#16 Updated by stbuehler almost 5 years ago

Nice post, looks like a good summary of the problem to me (you could have posted that link sooner. Had to figure the SO_LINGER problem myself :) ).

shutdown(SHUT_RD) is a no-op. Have a look at the kernel code:

You are right with your comment on DOS; i guess we could use server.max-read-idle as timeout. But a fd cost us nothing, and a connection* object is rather big, so i still think the extra object would be the better solution (we already have this in the sandbox), but i guess no one is gonna implement it for 1.x ...

I guess you agree that it doesn't lead to data corruption if the client handles RST/ECONNABORTED the way it should. There are still situations in which this doesn't help (if lighty closes always after first request, due to keep-alive or missing content-length or whatever, a client which always tries pipelining may never get the complete result for the first request).

Now I just need a developer to review the patch :p

#17 Updated by Olaf-van-der-Spek almost 5 years ago

stbuehler wrote:

and a connection* object is rather big,

How big is rather big?

I guess you agree that it doesn't lead to data corruption if the client handles RST/ECONNABORTED the way it should. There are still situations in which this doesn't help (if lighty closes always after first request, due to keep-alive or missing content-length or whatever, a client which always tries pipelining may never get the complete result for the first request).

It might fail in a lot of other situations too.

#18 Updated by apenwarr almost 5 years ago

Basically, an HTTP client can't assume the server is HTTP/1.1 (and thus multiple requests per connection). It theoretically has to make a request, wait for the answer, check that it's HTTP/1.1, and then if so, it should be able to assume pipelining. A client that always assumes pipelining will work will fail for lots of reasons.

With my patch added to lighttpd, I believe all properly-written http clients will be fine.

#19 Updated by stbuehler almost 5 years ago

  • Status changed from Patch Pending to Fixed
  • % Done changed from 0 to 100

Applied in changeset r2638.

#20 Updated by stbuehler almost 5 years ago

  • Missing in 1.5.x set to No

#21 Updated by stbuehler almost 5 years ago

Just a small notice: it seems like many clients don't close their end of the connection, so i will drop the timeout to 5 seconds (instead of 30). See #2086

Also available in: Atom