Bug #1562
closed
sigsegv @ fdevent_get_handler - when congestion occurs and the file descriptor array is full.
Description
I'm experiencing segfaults when congestion occurs, at 800-850Mbps.
The crash occurs here:
lighttpd[1334]: segfault at 0000000c eip 0805ef46 esp bfc63e80 error 4
#!asm
0x0805ef44 <fdevent_get_handler+20>: je 0x805ef4f <fdevent_get_handler+31>
0x0805ef46 <fdevent_get_handler+22>: cmp 0x8(%eax),%edx
0x0805ef49 <fdevent_get_handler+25>: jne 0x805ef7c <fdevent_get_handler+76>
#!c
171 if (ev->fdarray[fd]->fd != fd) SEGFAULT();
I guess eax holds ev->fdarray[fd]: it's not NULL, but it's not a valid pointer either, so accessing ev->fdarray[fd]->fd makes lighty read from an unmapped segment or one without read permission. The fault address 0000000c fits that reading: it is 0x8(%eax), which puts eax at 0x4.
So what's wrong? Maybe fd is simply an index outside the allocated array. Let's have a look at how the array is created.
fdevent.c:
#!c
15 fdevents *fdevent_init(size_t maxfds, fdevent_handler_t type) {
19     ev->fdarray = calloc(maxfds, sizeof(*ev->fdarray));
server.c:
#!c
1076 if (NULL == (srv->ev = fdevent_init(srv->max_fds + 1, srv->event_handler))) {
and earlier in the same file:
#!c
679 if (0 != getrlimit(RLIMIT_NOFILE, &rlim)) {
680     log_error_write(srv, __FILE__, __LINE__,
681             "ss", "couldn't get 'max filedescriptors'",
682             strerror(errno));
683     return -1;
684 }
685
686 if (use_rlimit && srv->srvconf.max_fds) {
687     /* set rlimits */
688
689     rlim.rlim_cur = srv->srvconf.max_fds;
690     rlim.rlim_max = srv->srvconf.max_fds;
691
692     if (0 != setrlimit(RLIMIT_NOFILE, &rlim)) {
693         log_error_write(srv, __FILE__, __LINE__,
694                 "ss", "couldn't set 'max filedescriptors'",
695                 strerror(errno));
696         return -1;
697     }
698 }
700 /* #372: solaris need some fds extra for devpoll */
701 if (rlim.rlim_cur > 10) rlim.rlim_cur -= 10;
827 srv->max_fds = rlim.rlim_cur;
So, here's what's being done:
- the process fetches the currently configured rlimits and saves them in rlim
- if the configuration has a setting for max_fds, it overrides the limit configured for the current task
- then rlim.rlim_cur is decremented by 10 (solaris bugfix, yay) and stored as max_fds
- and the file descriptor array is allocated using max_fds + 1 as its size.
And here's what happens:
- The system can still hand out file descriptors larger than max_fds, because the real limit is 10 higher
- fd > max_fds, so ev->fdarray[fd] reads past the end of the array
- sigsegv
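To make the size mismatch concrete, here is a minimal sketch of the arithmetic, assuming the sizing logic quoted above. The helper names `demo_fdarray_slots` and `demo_unindexable_fds` are hypothetical and are not part of lighttpd.

```c
#include <stddef.h>

/* Hypothetical helper: number of slots allocated for the fd array,
 * following the quoted logic. rlim_cur is reduced by 10 (the #372
 * adjustment), stored as max_fds, and fdevent_init() receives
 * max_fds + 1. */
static size_t demo_fdarray_slots(size_t rlim_cur) {
    size_t max_fds = rlim_cur;
    if (max_fds > 10) max_fds -= 10;
    return max_fds + 1;
}

/* Hypothetical helper: how many descriptors in the range the kernel may
 * still return (0 .. rlim_cur - 1) have no slot in the array. */
static size_t demo_unindexable_fds(size_t rlim_cur) {
    size_t slots = demo_fdarray_slots(rlim_cur);
    return rlim_cur > slots ? rlim_cur - slots : 0;
}
```

With a limit of 1024, the array gets 1015 slots (indices 0 to 1014), while the process can still receive descriptors 1015 to 1023, so 9 possible descriptors index past the end of the array.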
Possible workarounds:
- comment out line 701 in server.c if you're not running solaris
or
- replace maxfds with maxfds + 10 in line 19 of fdevent.c
or
- fix this race condition ;-)
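Independently of how the array is sized, a lookup can refuse an out-of-range descriptor instead of dereferencing whatever lies past the array. The following is only a defensive sketch, not the change that went into lighttpd. The types `demo_fdnode` and `demo_fdevents` and the function `demo_lookup` are hypothetical, simplified stand-ins.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for lighttpd's fd bookkeeping. */
typedef struct {
    int fd;
} demo_fdnode;

typedef struct {
    demo_fdnode **fdarray; /* one slot per possible descriptor */
    size_t maxfds;         /* number of slots in fdarray */
} demo_fdevents;

/* Returns the node registered for fd, or NULL when fd is negative,
 * beyond the end of the array, unregistered, or stored under the
 * wrong index. */
static demo_fdnode *demo_lookup(const demo_fdevents *ev, int fd) {
    if (ev == NULL || fd < 0 || (size_t)fd >= ev->maxfds) return NULL;
    demo_fdnode *node = ev->fdarray[fd];
    if (node == NULL || node->fd != fd) return NULL;
    return node;
}
```

A check like this turns the crash into an error the caller can handle, but it does not remove the underlying mismatch between the array size and the real descriptor limit.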
Updated by stbuehler almost 17 years ago
If I understood the #372 problem correctly, Solaris doesn't want to poll for rlim.rlim_cur fds, as one is used for the /dev/poll fd, and returns an error.
But it doesn't seem to matter if the dopoll.dp_nfds value is a little bit smaller - it is just the maximum number of events to be polled in one syscall.
So I think reducing dopoll.dp_nfds by one in fdevent_solaris_devpoll_poll should fix #372, and we can then remove the previous "fix" for it, which fixes this bug (#1562).
Updated by admin almost 17 years ago
Does this apply to 1.4.13 as well?
Updated by stbuehler almost 17 years ago
Yes; #372 was "fixed" (i.e. introduced this bug) in r853, which was between 1.4.7 and 1.4.8.
Updated by stbuehler almost 17 years ago
- Status changed from New to Fixed
- Resolution set to fixed
Fixed in r2082