Bug #2086
closedmod_status has too many 'close' states
Description
Hello!
I've updated to 1.4.24 RC yesterday (from the stbuehler repository for debian) and the problem appeared in mod_status: http://bit.ly/UOIj5
As you can see there're many connection in 'close' state. I'm not sure it affects something else, but looks really strange :)
Updated by stbuehler over 15 years ago
Hm, looks like the timeout didn't work; they should be closed after 30 seconds.
Updated by akrus over 15 years ago
According to netstat there're many 'FIN_WAIT2's.
Updated by stbuehler over 15 years ago
Which fd-event handler are you using?
And it would be nice if you could have a look at strace, and perhaps paste a little bit from it - it would be interesting to see if the FIN_WAIT2 fds gets signaled "active" (it should signal EOF as POLL_IN or POLL_HUP).
Updated by akrus over 15 years ago
I think fd-event handler is default one (I didn't specify anything in config). According to strace it's epoll.
Could you give some more instructions on strace? strace -p <pid> returns many strings, but I have no idea what to look for, the is no 'EOF' :)
Updated by akrus over 15 years ago
Maybe:
epoll_ctl(6, EPOLL_CTL_MOD, 153, {EPOLLOUT|EPOLLERR|EPOLLHUP, {u32=153, u64=153}}) = 0 epoll_ctl(6, EPOLL_CTL_MOD, 153, {EPOLLOUT|EPOLLERR|EPOLLHUP, {u32=153, u64=153}}) = 0 epoll_ctl(6, EPOLL_CTL_MOD, 153, {EPOLLOUT|EPOLLERR|EPOLLHUP, {u32=153, u64=153}}) = 0 epoll_ctl(6, EPOLL_CTL_MOD, 153, {EPOLLOUT|EPOLLERR|EPOLLHUP, {u32=153, u64=153}}) = 0 epoll_ctl(6, EPOLL_CTL_MOD, 153, {EPOLLOUT|EPOLLERR|EPOLLHUP, {u32=153, u64=153}}) = 0 epoll_ctl(6, EPOLL_CTL_MOD, 153, {EPOLLOUT|EPOLLERR|EPOLLHUP, {u32=153, u64=153}}) = 0
Updated by stbuehler over 15 years ago
You could search for fds matching FIN_WAIT2 connections with lsof
; this one looks like an active connection to me (it asks for EPOLLOUT, so it wants to write something). In "close" state it should ask for EPOLLIN|EPOLLERR|EPOLLHUP
. EPOLL_CTL_MOD
means it is setting that flags, epoll_wait() calls are probably more interesting.
Updated by akrus over 15 years ago
epoll_ctl(6, EPOLL_CTL_MOD, 161, {EPOLLIN|EPOLLERR|EPOLLHUP, {u32=161, u64=161}}) = 0 epoll_wait(6, {{EPOLLIN, {u32=161, u64=161}}}, 1025, 1000) = 1
epoll_ctl(6, EPOLL_CTL_MOD, 153, {EPOLLIN|EPOLLERR|EPOLLHUP, {u32=153, u64=153}}) = 0 epoll_ctl(6, EPOLL_CTL_MOD, 153, {EPOLLIN|EPOLLERR|EPOLLHUP, {u32=153, u64=153}}) = 0 epoll_wait(6, {{EPOLLOUT, {u32=135, u64=135}}}, 1025, 1000) = 1
Something like this~
lsof and lsof -i (with | grep FIN_WAIT2) returns nothing...
Updated by stbuehler over 15 years ago
Ok, i think i can reproduce it. Just make a request with a keep-alive client (and enable keep-alive on the server), and wait for the timeout:
13:06:23.327767 accept(3, {sa_family=AF_INET, sin_port=htons(55370), sin_addr=inet_addr("127.0.0.1")}, [8135381057554874384]) = 6 13:06:23.327994 brk(0x179c000) = 0x179c000 13:06:23.328300 brk(0x17bd000) = 0x17bd000 13:06:23.328594 fcntl(6, F_SETFD, FD_CLOEXEC) = 0 13:06:23.328642 fcntl(6, F_SETFL, O_RDWR|O_NONBLOCK) = 0 13:06:23.328694 ioctl(6, FIONREAD, [429]) = 0 13:06:23.328754 read(6, "GET /test.txt HTTP/1.1\r\nHost: 127.0.0.1:8081\r\nConnection: Keep-Alive\r\n"..., 447) = 429 13:06:23.328975 stat(".../test.txt", {st_mode=S_IFREG|0644, st_size=328, ...}) = 0 13:06:23.329074 open(".../test.txt", O_RDONLY) = 7 13:06:23.329137 close(7) = 0 13:06:23.329287 setsockopt(6, SOL_TCP, TCP_CORK, [1], 4) = 0 13:06:23.329348 writev(6, [{"HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\nAccept-Ranges: bytes\r\nETag: \"2301341072\"\r\nLast-Modified: Tue, 05 Feb 2008 22:45:31 GMT\r\nContent-Length: 328\r\nDate: Mon, 19 Oct 2009 11:06:22 GMT\r\nServer: lighttpd/1.4.24\r\n\r\n"..., 216}], 1) = 216 13:06:23.329448 open(".../test.txt", O_RDONLY) = 7 13:06:23.329511 fcntl(7, F_SETFD, FD_CLOEXEC) = 0 13:06:23.329559 sendfile(6, 7, [0], 328) = 328 13:06:23.329614 close(7) = 0 13:06:23.329661 setsockopt(6, SOL_TCP, TCP_CORK, [0], 4) = 0 13:06:23.329743 ioctl(6, FIONREAD, [0]) = 0 13:06:23.329802 read(6, 0x17bacd0, 4159) = -1 EAGAIN (Resource temporarily unavailable) 13:06:23.329882 epoll_ctl(5, EPOLL_CTL_ADD, 6, {EPOLLIN|EPOLLERR|EPOLLHUP, {u32=6, u64=6}}) = 0 13:06:23.329941 accept(3, 0x7ffffa63a800, [8135381057554874480]) = -1 EAGAIN (Resource temporarily unavailable) 13:06:23.330009 epoll_ctl(5, EPOLL_CTL_MOD, 6, {EPOLLIN|EPOLLERR|EPOLLHUP, {u32=6, u64=6}}) = 0 13:06:23.330071 epoll_wait(5, {}, 1025, 1000) = 0 13:06:24.330132 epoll_wait(5, {}, 1025, 1000) = 0 13:06:25.330134 epoll_wait(5, {}, 1025, 1000) = 0 13:06:26.330097 epoll_wait(5, {}, 1025, 1000) = 0 13:06:27.330094 epoll_wait(5, {}, 1025, 1000) = 0 13:06:28.330171 shutdown(6, 1 /* send */) = 0 13:06:28.330286 ioctl(6, FIONREAD, [0]) = 0 13:06:28.330357 epoll_ctl(5, EPOLL_CTL_MOD, 6, {EPOLLIN|EPOLLERR|EPOLLHUP, {u32=6, u64=6}}) = 0 13:06:28.330423 epoll_wait(5, {}, 1025, 1000) = 0 [nothing happens now...] 13:07:13.329978 epoll_wait(5, {}, 1025, 1000) = 0 13:07:14.330096 epoll_wait(5, ^C <unfinished ...>
(This is with a keepalive-idle of 5 seconds)
You can see that lighty closes the "output" of the socket after the keep-alive timeout (shutdown
), but it never sees an EOF from the client (that would be read(...) = 0
); and it looks like epoll_wait doesn't detect EOF as input activity.
Updated by stbuehler over 15 years ago
Just a small detail: FIN_WAIT2 means the client didn't close its end of the connection, so lighttpd waits for a FIN (a read()
would return EAGAIN
). That is no reason not to close it after the timeout (i will fix that), but it indicates that many clients don't handle this in a clean way. I will therefor drop the HTTP_LINGER_TIMEOUT to 5 seconds (was 30).
Updated by stbuehler over 15 years ago
- Status changed from New to Fixed
- % Done changed from 0 to 100
Applied in changeset r2669.
Also available in: Atom