Bug #1398
closed
Segfault on x86_64 and 2*connections > max-fds
Description
I run lighttpd on Debian Etch x86_64, and it segfaults when the number of connections exceeds roughly half of max-fds, killing off all the clients it is serving. I think the expected behavior would be for lighttpd to write a message to the error log and then temporarily disable the accept handler.
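A minimal sketch of that expected behavior, for illustration only. The names `fd_table`, `fd_table_get`, and `should_accept` are hypothetical and are not lighttpd's API, and the factor of two is an assumption taken from the "half of max-fds" symptom (roughly two descriptors per connection):

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical fd table; NOT lighttpd's real fdevents structure. */
typedef struct {
    void **slots;   /* one entry per file descriptor */
    size_t size;    /* number of slots, i.e. max-fds */
} fd_table;

/* Return the entry for fd, or NULL when fd is out of range or unregistered,
 * instead of dereferencing an invalid slot. */
static void *fd_table_get(const fd_table *t, int fd) {
    if (fd < 0 || (size_t)fd >= t->size) return NULL;
    return t->slots[fd];
}

/* Decide whether the accept handler should stay enabled.
 * Assumption: each connection may need about two descriptors. */
static int should_accept(size_t connections, size_t max_fds) {
    if (2 * connections >= max_fds) {
        fprintf(stderr,
                "fd limit near: %zu connections, max-fds %zu; pausing accept\n",
                connections, max_fds);
        return 0;   /* caller would stop polling the listening socket */
    }
    return 1;
}
```

With max-fds at 4096, as in the report below, this sketch would stop accepting once 2048 connections are open and resume when the count drops.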
Output from gdb and dmesg (note: max-fds is set to 4096):
Program received signal SIGSEGV, Segmentation fault.
fdevent_get_handler (ev=0x57b510, fd=4088) at fdevent.c:171
171 if (ev->fdarray[fd]->fd != fd) SEGFAULT;
(gdb) bt
#0 fdevent_get_handler (ev=0x57b510, fd=4088) at fdevent.c:171
#1 0x0000000000407d12 in main (argc=<value optimized out>, argv=<value optimized out>) at server.c:1405
Oct 2 15:51:48 xxxx kernel: lighttpd[11090]: segfault at 00000ff800000011 rip 0000000000415027 rsp 00007fffde91b1f0 error 4
-- doubleukay
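One possible clue in the numbers above: the upper 32 bits of the faulting address in the dmesg line, 0xff8, equal 4088, the same value as the `fd` argument in the gdb frame. That would be consistent with a pointer slot having been overwritten by 32-bit integers, although the trace alone does not prove it. A small helper for splitting such addresses might look like this (the function names are hypothetical):

```c
#include <stdint.h>

/* Upper 32 bits of a 64-bit fault address. */
static uint32_t high_half(uint64_t addr) {
    return (uint32_t)(addr >> 32);
}

/* Lower 32 bits of a 64-bit fault address. */
static uint32_t low_half(uint64_t addr) {
    return (uint32_t)(addr & 0xffffffffu);
}
```

For the address 0x00000ff800000011, `high_half` yields 4088 and `low_half` yields 0x11.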
Updated by Anonymous about 17 years ago
Here's another spot where it segfaults (max-fds = 1024 in this test):
(gdb) bt
#0 0xf7ec4709 in free () from /lib/tls/i686/cmov/libc.so.6
#1 0x0805dfb7 in fdevent_unregister (ev=0x80a77f0, fd=1022) at fdevent.c:125
#2 0x08052092 in connection_close (srv=0x806e008, con=0x8171768) at connections.c:124
#3 0x080523ef in connection_state_machine (srv=0x806e008, con=0x8171768) at connections.c:1716
#4 0x0804e5b2 in main (argc=3, argv=0xfface034) at server.c:1279
Updated by Amr_not_Amr about 17 years ago
I'm facing the same problem on CentOS 5, x86_64.
When the number of connections exceeds half of max-fds, it segfaults. Here are some examples from the messages log:
Nov 4 13:23:24 dellway kernel: lighttpd[20469]: segfault at 0000000400000010 rip 0000000000414d54 rsp 00007fff36333930 error 4
Nov 4 13:34:07 dellway kernel: lighttpd[20773]: segfault at 0000000400000010 rip 0000000000414d54 rsp 00007fff36333930 error 4
Nov 4 14:35:09 dellway kernel: lighttpd[25474]: segfault at 000003e000000011 rip 0000000000414d54 rsp 00007fff71ab50a0 error 4
Nov 4 14:35:11 dellway kernel: lighttpd[25475]: segfault at 000003e000000011 rip 0000000000414d54 rsp 00007fff71ab50a0 error 4
Nov 4 14:35:18 dellway kernel: lighttpd[25476]: segfault at 000003e000000011 rip 0000000000414d54 rsp 00007fff71ab50a0 error 4
Nov 4 14:36:12 dellway kernel: lighttpd[25502]: segfault at 000000000000011a rip 0000000000414d54 rsp 00007fff71ab50a0 error 4
Nov 4 14:36:32 dellway kernel: lighttpd[25473]: segfault at 00000000000001fe rip 0000000000414d54 rsp 00007fff71ab50a0 error 4
Nov 4 15:49:48 dellway kernel: lighttpd[25510]: segfault at 000003e000000011 rip 0000000000414d54 rsp 00007fff71ab50a0 error 4
Nov 4 15:50:35 dellway kernel: lighttpd[25516]: segfault at 0000000000000023 rip 0000000000414d54 rsp 00007fff71ab50a0 error 4
Nov 4 16:19:39 dellway kernel: lighttpd[25519]: segfault at 00000000000003f2 rip 0000000000414d54 rsp 00007fff71ab50a0 error 4
Updated by slyphon almost 17 years ago
This also occurs on Solaris, compiled with the Sun native compiler.
slyphon@light01 ~ $ lighttpd -V
lighttpd-1.4.18 (ssl) - a light and fast webserver
Build-Date: Dec 4 2007 02:44:17
Event Handlers:
    + select (generic)
    + poll (Unix)
    - rt-signals (Linux 2.4+)
    - epoll (Linux 2.6)
    + /dev/poll (Solaris)
    - kqueue (FreeBSD)
Network handler:
    + sendfile
Features:
    + IPv6 support
    + zlib support
    + bzip2 support
    + crypt support
    + SSL Support
    + PCRE support
    - mySQL support
    - LDAP support
    - memcached support
    + FAM support
    - LUA support
    - xml support
    - SQLite support
    - GDBM support
We configured lighttpd to run 3 rails fcgi processes, and then tortured it with ab.
slyphon@light01 ~ $ ab -c 200 -n 1000 -v1 http://localhost/
This is ApacheBench, Version 2.0.40-dev <$Revision: 1.146 $> apache-2.0
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Copyright 2006 The Apache Software Foundation, http://www.apache.org/
Benchmarking localhost (be patient)
Completed 100 requests
Test aborted after 10 failures
apr_socket_connect(): Connection refused (146)
Total of 111 requests completed
Using a small DTrace script, I was able to get a stack trace at the moment the SIGSEGV is sent:
lighttpd`fdevent_get_handler+0x15
lighttpd`main+0xf80
lighttpd`_start+0x7d
The logs show many lines similar to:
2007-12-25 06:06:48: (mod_fastcgi.c.2816) wait for fd at connection: 58
2007-12-25 06:06:48: (mod_fastcgi.c.2816) wait for fd at connection: 59
2007-12-25 06:06:48: (mod_fastcgi.c.2816) wait for fd at connection: 126
2007-12-25 06:06:48: (mod_fastcgi.c.2816) wait for fd at connection: 125
2007-12-25 06:06:48: (mod_fastcgi.c.2816) wait for fd at connection: 124
2007-12-25 06:06:48: (mod_fastcgi.c.2816) wait for fd at connection: 123
2007-12-25 06:06:48: (mod_fastcgi.c.2816) wait for fd at connection: 122
2007-12-25 06:06:48: (mod_fastcgi.c.2816) wait for fd at connection: 121
2007-12-25 06:06:48: (mod_fastcgi.c.2816) wait for fd at connection: 120
2007-12-25 06:06:48: (mod_fastcgi.c.2816) wait for fd at connection: 119
Updated by stbuehler almost 17 years ago
- Status changed from New to Fixed
- Resolution set to duplicate