lighttpd crashes under high load

We traced this down to a performance issue in lighttpd. Sporadically, lighttpd crashes...
valgrind log is here:

Updated by jan about 15 years ago

  • Status changed from New to Assigned

please verify if the problem persists with 1.4.5


Updated by Anonymous about 15 years ago

It is still an issue, but not as severe as before. Compare for yourself: with 1.4.4 I had a dozen crashes a day; with 1.4.5 I have "only" a couple.

Updated by Anonymous about 15 years ago

I have seen something similar during a DoS condition. No core dump (though enabled). lighttpd seemed to 'stop'. The php-cgi processes kept running until I sent a killall -TERM php-cgi. I did not need to send KILL, so however lighttpd stopped, it did not do so in an entirely orderly manner.


server.max-connections = 1024
server.max-fds = 3072

I have now set the values above to see if max-connections protects against this problem. Hopefully the DoS won't recur ;)

Hope this extra information is useful.

Updated by Anonymous almost 15 years ago

I can confirm this too. I'm evaluating 1.4.7 and it unexpectedly crashes after 10 minutes or so of high load. My environment is Debian 3.1 (sarge) with the stock 2.6.8 (-686-smp) kernel package.

I set it up exclusively to have mod_proxy distribute load to several (11) backend servers. No "regular" file requests were served by the server. At an output rate of more than 150 Mbps and 1800 rps, the process quietly exits all of a sudden. When I started lighttpd with the -D flag to see if anything was printed to stderr, I didn't see anything there either when it crashed again. However, I noticed that it exited with an "aborted" exit code.
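For anyone trying to confirm the same "aborted" exit from a supervising process, here is a minimal sketch in C. It is not lighttpd code: the function name is made up for illustration, and the child merely simulates a failed assertion. It shows how a parent can tell that a child was killed by SIGABRT rather than exiting normally.

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not part of lighttpd): forks a child that fails an
 * assertion, then reports how the child ended.
 * Returns the terminating signal number, 0 for a normal exit, -1 on error. */
int signal_that_killed_asserting_child(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: a failed assert() prints to stderr and calls abort(),
         * which raises SIGABRT (unless NDEBUG is defined). */
        assert(0 && "simulated failed assertion");
        _exit(0);
    }

    /* Parent: collect the child's termination status. */
    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;

    if (WIFSIGNALED(status))
        return WTERMSIG(status); /* SIGABRT for an aborted process */
    return 0;
}
```

A shell that reports "Aborted" is making the same distinction: the process was terminated by a signal instead of returning an exit status.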

I switched off both the rrdtool- and accesslog-modules and could exclude them from suspicion.

I will try a more recent kernel revision later on, but my gut feeling is that the problem is indeed in lighttpd.

Updated by jan almost 15 years ago

Can you generate an strace for me? The wiki explains how to report a bug.


Updated by Anonymous almost 15 years ago

I'll try to make one. The problem is that under high load strace itself becomes the performance penalty, limiting the requests/sec rate and apparently the chance of the crash occurring...

Updated by Anonymous almost 15 years ago

Here are my first results:

11:37:14.450805 accept(5, {sa_family=AF_INET, sin_port=htons(2315), sin_addr=inet_addr("[xxxxxxxxxxxxx]")}, [16]) = 42
11:37:14.450900 fcntl64(42, F_SETFD, FD_CLOEXEC) = 0
11:37:14.450941 fcntl64(42, F_SETFL, O_RDWR|O_NONBLOCK) = 0
11:37:14.450980 ioctl(42, FIONREAD, [7935]) = 0
11:37:14.451026 read(42, "POST /[xxxxxxxxxxx]\r\n[xxxxxxxxxxxxx]"..., 7935) = 7935
11:37:14.452304 ioctl(42, FIONREAD, [0]) = 0
11:37:14.452361 read(42, 0x886ec38, 4159) = -1 EAGAIN (Resource temporarily unavailable)
11:37:14.452440 write(2, "lighttpd: connections.c:962: connection_handle_read_state: Assertion `c->mem->used\' failed.\n", 92) = 92
11:37:14.452580 rt_sigprocmask(SIG_UNBLOCK, [ABRT], NULL, 8) = 0
11:37:14.452664 gettid()                = 2539
11:37:14.452703 tgkill(2539, 2539, SIGABRT) = 0
11:37:14.452740 --- SIGABRT (Aborted) @ 0 (0) ---

A connection is accepted from a client and a POST request is read. FIONREAD then reports 0 additional bytes pending, yet we still try to read from ...?
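The syscall pattern in the trace (FIONREAD, read, FIONREAD reporting 0, read failing with EAGAIN) can be reproduced outside lighttpd. The sketch below is only an illustration under stated assumptions: it uses a local socket pair instead of a TCP client, the request string is invented, and the struct and function names are hypothetical. It does not reproduce the assertion itself, only the two probe-then-read steps that precede it.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* One ioctl(FIONREAD) + read() pair, as seen twice in the trace. */
struct probe_result {
    int pending;  /* bytes FIONREAD reported as readable */
    ssize_t got;  /* return value of read() */
    int err;      /* errno if read() (or the ioctl) failed, else 0 */
};

static struct probe_result probe_then_read(int fd, char *buf, size_t cap)
{
    struct probe_result r = {0, 0, 0};
    if (ioctl(fd, FIONREAD, &r.pending) < 0) {
        r.got = -1;
        r.err = errno;
        return r;
    }
    r.got = read(fd, buf, cap);
    if (r.got < 0)
        r.err = errno;
    return r;
}

/* Hypothetical replay: sends one request over a socket pair, then performs
 * two probe-then-read steps on the non-blocking receiving end.
 * Returns 0 on success, -1 if the setup failed. */
int replay_sequence(struct probe_result *first, struct probe_result *second)
{
    int sv[2];
    char buf[4159];
    const char req[] = "POST / HTTP/1.0\r\n\r\n"; /* invented request */
    ssize_t len = (ssize_t)(sizeof req - 1);

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    int flags = fcntl(sv[0], F_GETFL, 0);
    if (flags < 0 ||
        fcntl(sv[0], F_SETFL, flags | O_NONBLOCK) < 0 ||
        write(sv[1], req, (size_t)len) != len) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    *first = probe_then_read(sv[0], buf, sizeof buf);  /* data is waiting */
    *second = probe_then_read(sv[0], buf, sizeof buf); /* nothing is left */

    close(sv[0]);
    close(sv[1]);
    return 0;
}
```

In this sketch the second step is the interesting one: FIONREAD reports 0, and the read on the non-blocking socket returns -1 with EAGAIN, just like the last two calls before the assertion in the trace. Code that expects data to be present after such a step has to handle that case explicitly.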

Updated by Anonymous almost 15 years ago

  • Status changed from Fixed to Need Feedback
  • Resolution deleted (fixed)

Wonderful! That patch fixed the problem... in most cases! I can still make it crash, however (though it seems even less common now).

lighttpd: connections.c:962: connection_handle_read_state: Assertion `c->mem->used' failed.

I have not had time to make a new strace run yet. It looks like a variant of the same problem, no? Could certain chunk sequences still slip through the cleanup?

Updated by Anonymous almost 15 years ago

I reproduced the crash with strace attached again. It's exactly the same order of calls as last time (see above).

Updated by Anonymous almost 15 years ago

...but that was with 1.4.7+patch. I have not seen this after I upgraded to the 1.4.8 release. (On the other hand I also switched to slightly faster hardware.)

Let's close it and reopen if someone can reproduce with 1.4.8

Updated by Anonymous almost 15 years ago

  • Status changed from Need Feedback to Fixed
  • Resolution set to fixed

I can now confirm that this issue never appeared again after the 1.4.8 release.

