Bug #1562

sigsegv @ fdevent_get_handler - when congestion occurs and the file descriptor array is full.

Added by fdeletang almost 9 years ago. Updated over 8 years ago.

I'm experiencing segfaults when congestion occurs, at 800-850Mbps.

The crash occurs here:
lighttpd[1334]: segfault at 0000000c eip 0805ef46 esp bfc63e80 error 4

0x0805ef44 <fdevent_get_handler+20>:    je     0x805ef4f <fdevent_get_handler+31>
0x0805ef46 <fdevent_get_handler+22>:    cmp    0x8(%eax),%edx
0x0805ef49 <fdevent_get_handler+25>:    jne    0x805ef7c <fdevent_get_handler+76>

171         if (ev->fdarray[fd]->fd != fd) SEGFAULT();

I guess eax is ev->fdarray[fd]. It's not NULL, but it's not a valid pointer either, so trying to access ev->fdarray[fd]->fd makes lighty read from nonexistent segments or segments without read permission.

So what's wrong? Maybe fd is simply an index outside the allocated array. Let's look at how the array is created.


 15 fdevents *fdevent_init(size_t maxfds, fdevent_handler_t type) {
 19         ev->fdarray = calloc(maxfds, sizeof(*ev->fdarray));


1076         if (NULL == (srv->ev = fdevent_init(srv->max_fds + 1, srv->event_handler))) {

and earlier in the same file:

 679                 if (0 != getrlimit(RLIMIT_NOFILE, &rlim)) {
 680                         log_error_write(srv, __FILE__, __LINE__,
 681                                         "ss", "couldn't get 'max filedescriptors'",
 682                                         strerror(errno));
 683                         return -1;
 684                 }
 686                 if (use_rlimit && srv->srvconf.max_fds) {
 687                         /* set rlimits */
 689                         rlim.rlim_cur = srv->srvconf.max_fds;
 690                         rlim.rlim_max = srv->srvconf.max_fds;
 692                         if (0 != setrlimit(RLIMIT_NOFILE, &rlim)) {
 693                                 log_error_write(srv, __FILE__, __LINE__,
 694                                                 "ss", "couldn't set 'max filedescriptors'",
 695                                                 strerror(errno));
 696                                 return -1;
 697                         }
 698                 }

 700                 /* #372: solaris need some fds extra for devpoll */
 701                 if (rlim.rlim_cur > 10) rlim.rlim_cur -= 10;

 827                         srv->max_fds = rlim.rlim_cur;

So, here's what's being done:
- The process fetches the currently configured rlimit and saves it in rlim
- If the configuration has a setting for max_fds, it overrides the limit configured for the current task
- Then max_fds gets decremented by 10 (Solaris bugfix, yay)
- And the file descriptor array is allocated with max_fds + 1 as its size.

And here's what happens:
- The system can give you more than max_fds file descriptors
- fd > max_fds
- sigsegv
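
The arithmetic behind this can be sketched as follows. This is only an illustration, not lighttpd code: the helper names fdarray_slots, highest_possible_fd and out_of_range_fds are made up for this example, and the sizing follows server.c lines 701, 827 and 1076 as quoted above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of slots the fd array ends up with,
 * following the quoted server.c lines 701, 827 and 1076. */
static size_t fdarray_slots(size_t rlim_cur) {
    size_t max_fds = rlim_cur;
    if (max_fds > 10) max_fds -= 10;   /* the #372 adjustment (line 701) */
    return max_fds + 1;                /* fdevent_init(srv->max_fds + 1, ...) */
}

/* Hypothetical helper: largest descriptor number the kernel may hand out.
 * RLIMIT_NOFILE is one greater than the highest usable descriptor. */
static size_t highest_possible_fd(size_t rlim_cur) {
    assert(rlim_cur > 0);
    return rlim_cur - 1;
}

/* How many descriptor values would index past the end of the array. */
static size_t out_of_range_fds(size_t rlim_cur) {
    size_t slots = fdarray_slots(rlim_cur);
    size_t fds = highest_possible_fd(rlim_cur) + 1;
    return fds > slots ? fds - slots : 0;
}
```

With a soft limit of 1024 this gives 1015 slots (indices 0 to 1014) while descriptors up to 1023 can show up, so 9 possible fd values fall outside the array.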

Possible workarounds:
- comment out line 701 in server.c if you're not running Solaris
- replace maxfds by maxfds + 10 in line 19 of fdevent.c
- fix this race condition ;-)
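
Independently of these workarounds, a range check before the dereference would turn the crash into a detectable error. The sketch below is an assumption for illustration, not what lighttpd does: fdnode_sketch and fdevents_sketch are simplified stand-ins for the real structures, and fdarray_lookup_checked is a made-up name.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical stand-ins for lighttpd's fdnode and fdevents;
 * the real structures carry more fields. */
typedef struct { int fd; } fdnode_sketch;
typedef struct {
    fdnode_sketch **fdarray;
    size_t maxfds;            /* number of slots in fdarray */
} fdevents_sketch;

/* Return the node registered for fd, or NULL when fd is outside the array,
 * the slot is empty, or the slot belongs to a different descriptor. */
static fdnode_sketch *fdarray_lookup_checked(const fdevents_sketch *ev, int fd) {
    assert(ev != NULL && ev->fdarray != NULL);
    if (fd < 0 || (size_t)fd >= ev->maxfds) return NULL;  /* out of range */
    fdnode_sketch *node = ev->fdarray[fd];
    if (node == NULL || node->fd != fd) return NULL;
    return node;
}
```

A caller would still have to decide what to do with a NULL result (for example close the connection), so this only limits the damage; the sizing problem itself remains.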

Fix-372-and-1562.patch - Patch for 1.4 and 1.5 (1.23 KB) stbuehler, 2008-02-13 14:38

Associated revisions

Revision 796502e7 (diff)
Added by stbuehler almost 9 years ago

r2087@chromobil: stefan | 2008-02-26 17:01:12 +0100
Fix #1562 and try re-fixing #372: out of range access in fd array

- Bug is in original #372 fix [853]
- The re-fix for #372 is not tested:
the problem is that Solaris doesn't want to poll for maxfds (ulimit) events,
as at least one filedescriptor is used for the poll device.
So the solution is to just ask for one event less; the number of events
actually available is returned by the poll syscall, so it should work.

git-svn-id: svn:// 152afb58-edef-0310-8abb-c4023f1b3aa9


#1 Updated by stbuehler almost 9 years ago

If I understood the #372 problem correctly, Solaris doesn't want to poll for rlim.rlim_cur fds, as one is used for the /dev/poll fd, and returns an error.

But it doesn't seem to matter if the dopoll.dp_nfds value is a little bit smaller - it is just the max number of events to be polled in one syscall.

So I think reducing the number dopoll.dp_nfds by one in fdevent_solaris_devpoll_poll should fix #372, and we can remove the previous "fix" for it to fix this bug (#1562).
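
A rough sketch of that idea, under stated assumptions: struct dvpoll_sketch only mirrors the dp_fds, dp_nfds and dp_timeout fields of Solaris's struct dvpoll so the example compiles on other systems, and devpoll_request is a made-up helper. Real code would fill a struct dvpoll and pass it to the DP_POLL ioctl on the /dev/poll descriptor.

```c
#include <assert.h>
#include <stddef.h>
#include <poll.h>

/* Stand-in for Solaris's struct dvpoll (assumption: same three fields). */
struct dvpoll_sketch {
    struct pollfd *dp_fds;   /* buffer that receives the ready events */
    nfds_t dp_nfds;          /* max number of events to return per call */
    int dp_timeout;          /* timeout in milliseconds */
};

/* Build a poll request that asks for one event fewer than maxfds,
 * leaving room for the descriptor used by /dev/poll itself. */
static struct dvpoll_sketch devpoll_request(struct pollfd *results,
                                            size_t maxfds, int timeout_ms) {
    assert(maxfds > 1);
    struct dvpoll_sketch dopoll;
    dopoll.dp_fds = results;
    dopoll.dp_nfds = (nfds_t)(maxfds - 1);
    dopoll.dp_timeout = timeout_ms;
    return dopoll;
}
```

Since dp_nfds only caps how many events one call returns, asking for one fewer should not lose events; any that do not fit would be reported by the next call.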

#2 Updated by admin almost 9 years ago

#3 Updated by stbuehler almost 9 years ago

Yes; #372 was "fixed" (i.e. introduced this bug) in r853, which was between 1.4.7 and 1.4.8.

#4 Updated by admin almost 9 years ago

Doesn't this allow for an easy DoS attack?

#5 Updated by admin almost 9 years ago

It's remotely exploitable...

#6 Updated by stbuehler almost 9 years ago

  • Status changed from New to Fixed
  • Resolution set to fixed

Fixed in r2082

