Bug #1810
closed
aio connection lifetime issues (race condition)
Description
The AIO threads don't seem to pay attention to the lifetime of the connection object.
For example, using network_gthread_sendfile.c, if I add sleep(10) before the sendfile(), reduce server.max-write-idle to 1, and then start and time out a connection, I get:
2008-10-30 21:25:18: (server.c.744) NOTE: a request for /test timed out after writing 235 bytes. We waited 1 seconds. If this a problem increase server.max-write-idle
network_gthread_sendfile.c.135: (error) sendfile() failed: Bad file descriptor (9)
The main thread times out the connection and closes it; the sendfile thread then wakes up and tries to use an FD that is gone (or, more realistically on a live server, one that now belongs to someone else's connection).
There may be issues with any other use of con in the thread, too, since that pointer may have been reused for some other connection.
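One possible way to tie the FD's lifetime to the work in flight is reference counting. The following is only a minimal sketch under my own assumptions: conn_ref and its functions are hypothetical names, not lighttpd's actual connection struct or API. The main thread holds one reference and each queued AIO job holds another. On timeout the main thread only marks the connection cancelled and drops its reference; the FD is closed, and the memory freed, only when the last holder lets go, so a sleeping worker can never wake up to a recycled FD or a reused pointer.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical stand-in for a connection object; not lighttpd's struct. */
typedef struct {
    pthread_mutex_t lock;
    int refcount;   /* one per holder: main loop plus each in-flight job */
    int cancelled;  /* set by the main thread when the connection times out */
    int fd;         /* closed only when the last reference is dropped */
} conn_ref;

/* Create a connection wrapper holding the caller's (main thread's) reference. */
conn_ref *conn_ref_new(int fd) {
    conn_ref *c = malloc(sizeof(*c));
    if (c == NULL) return NULL;
    pthread_mutex_init(&c->lock, NULL);
    c->refcount = 1;
    c->cancelled = 0;
    c->fd = fd;
    return c;
}

/* Take an extra reference, e.g. just before queueing a job for a worker. */
void conn_ref_get(conn_ref *c) {
    pthread_mutex_lock(&c->lock);
    assert(c->refcount > 0);
    c->refcount++;
    pthread_mutex_unlock(&c->lock);
}

/* Drop a reference. Returns 1 if this was the last one, in which case the
 * FD has been closed and the object freed; returns 0 otherwise. */
int conn_ref_put(conn_ref *c) {
    int remaining;
    pthread_mutex_lock(&c->lock);
    assert(c->refcount > 0);
    remaining = --c->refcount;
    pthread_mutex_unlock(&c->lock);
    if (remaining > 0) return 0;
    close(c->fd);
    pthread_mutex_destroy(&c->lock);
    free(c);
    return 1;
}

/* Main thread, on timeout: flag the connection instead of closing the FD. */
void conn_ref_cancel(conn_ref *c) {
    pthread_mutex_lock(&c->lock);
    c->cancelled = 1;
    pthread_mutex_unlock(&c->lock);
}

/* Worker, before and after blocking I/O: should the job be abandoned? */
int conn_ref_is_cancelled(conn_ref *c) {
    int cancelled;
    pthread_mutex_lock(&c->lock);
    cancelled = c->cancelled;
    pthread_mutex_unlock(&c->lock);
    return cancelled;
}
```

With this shape, a worker would call conn_ref_is_cancelled() before its sendfile() and conn_ref_put() when it finishes. If the check races with the timeout, the worst case is a write to a connection that is about to go away, rather than a write to an unrelated client.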
A less contrived way that this might be an issue is sending data that's being read off a slow or dirty CDROM, which may stall for a long time, or off an NFS mount to a server that's down, which may reconnect minutes later and complete the request.
I think it's critical to get this right, even if the problem cases are obscure; given the lengths lighttpd goes to in order to avoid threading, I'm sure everyone knows how much of a nightmare troubleshooting race conditions can be down the line.