Lighttpd kills Ubuntu network install / local mirror
If you netboot / net install Ubuntu Gutsy (I haven't tried any other releases) and host the packages with Lighttpd instead of Apache, the installation fails. The installer log shows this error:
Dec 2 11:58:19 in-target: After unpacking 1765MB of additional disk space will be used.
Dec 2 11:58:19 in-target: Get:1 http://192.168.1.144 gutsy/main libfuse2 2.7.0-1ubuntu5 121kB
Dec 2 11:58:28 in-target: Get:96 http://192.168.1.144 gutsy/main xmag 1:1.0.1-0ubuntu2 19.1kB
Dec 2 11:58:28 in-target: E: Method http has died unexpectedly!
If I simply shut down Lighttpd and use Apache instead, it works perfectly.
As you can see, that's 96 GETs in about 9 seconds. Lighttpd either can't handle it, or intentionally cuts us off :( I'm guessing we get cut off for some reason, which doesn't make any sense...
The only modules loaded are: mod_access, mod_alias, mod_accesslog, and mod_compress.
I found these relevant errors in the error.log:
2007-12-02 10:47:22: (network_linux_sendfile.c.171) sendfile failed: Input/output error 6
2007-12-02 10:47:22: (connections.c.603) connection closed: write failed on fd 6
Ignore the timestamps on those errors; they were copied from an earlier attempt, so they don't match the installer log above.
These errors couldn't have come from anything else, because the only reason I set up lighttpd was to host this install server.
Updated by Anonymous almost 12 years ago
I don't have a network trace. I have reproduced this from several physical machines and also from virtual machines.
After setting the network backend to "writev", the install failed once, but I made it through when I retried. I've tried both 1.4.x and 1.5.x (which had the gthread-aio backend) and it doesn't really matter; the installer simply doesn't like Lighttpd and I don't understand why. There is nothing special going on here; it's just serving a bunch of binary files.
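For anyone who wants to try the same workaround, the backend switch is a one-line change in the lighttpd config. This is only a minimal sketch, assuming a 1.4.x install; the config file path is an assumption and not taken from the report above.

```
# /etc/lighttpd/lighttpd.conf  (path assumed; adjust for your install)
# Use writev() instead of the Linux sendfile() backend that
# produced the "sendfile failed" errors in error.log.
server.network-backend = "writev"
```

Restart lighttpd after the change. As described above, this only got the install through on a retry, so treat it as something to test rather than a confirmed fix.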
Updated by Anonymous over 11 years ago
I can confirm this. A stock Ubuntu 7.10 AMD64 Server install, booting from the network using PXE and then installing from a local mirror running a stock lighttpd install from the Gutsy 7.10 repository (1.4.18-1ubuntu1), fails with the exact error the original poster mentioned.
I then installed Apache prefork, also a standard install. The only change I made to Apache was to increase MaxSpareServers to 30. It worked perfectly from the same doc_root the first time.
With lighttpd it consistently failed at exactly the same point in the install. FYI, this was running on a Dell 2950 with the built-in NIC on a gigabit Ethernet switch, with both the mirror and the machine being installed on the same LAN.
Lighttpd wrote nothing to the error log and there was nothing useful in the access log either. I also checked the system logs and found nothing.
This is very troubling because we run lighttpd as a front-end reverse proxy in our production environment and it processes well over 150 requests per second. So I'm wondering if requests are quietly failing under high load.
This error is 100% reproducible in our data center in a racked environment with a Dell gigabit switch and Dell 2950 servers. I ran a similar config in our office with the same OS (also 64-bit) and lighttpd, and it worked fine. The only difference was that the mirror machine was not a server-class machine, although it was also AMD64. The mirror machine was also on a 100 Mbit port with a 100 Mbit NIC while the machine being installed was on gigabit, so perhaps the load wasn't enough to trigger this problem.
If I have time and two spare 2950s, I'll try to reproduce this and debug it in more detail.