Project

General

Profile

Bug #1320

check return value of calloc in mod_simple_vhost.c

Added by Safari almost 13 years ago. Updated over 11 years ago.

Status:
Wontfix
Priority:
Normal
Category:
core
Target version:
ASK QUESTIONS IN Forums:

Description

check return value of calloc in mod_simple_vhost.c


Files

#1

Updated by ralf almost 13 years ago

this problem affects more calls to malloc(and friends).

#2

Updated by jan almost 13 years ago

  • Status changed from New to Assigned

just put assert() around it and let it die. We can't handle it graceful anyway.

The buffer.c code already does that.

#3

Updated by admin almost 13 years ago

just put assert() around it and let it die.

Is assert supposed to be used like that? I don't think so.

#4

Updated by ralf almost 13 years ago

Replying to Olaf van der Spek:

just put assert() around it and let it die.

Is assert supposed to be used like that? I don't think so.

"The assert() macro may be removed at compile time with the cc(1) option -DNDEBUG."

#5

Updated by stbuehler over 11 years ago

  • Target version changed from 1.4.20 to 1.4.21
#6

Updated by ralf over 11 years ago

Safari wrote:

check return value of calloc in mod_simple_vhost.c

What about wrapping all the malloc functions?

i play a bit and found the linker option: --wrap (see ld(1))

some example code:

#include <stdio.h>
#include <stdlib.h>

/* prototypes */
void __real_free(void *);
void * __real_malloc(size_t);

/* the wrapper for free() */
void __wrap_free(void *ptr)
{
        printf ("%s(%p)\n", __FUNCTION__, ptr);
        __real_free(ptr);
}

/* the wrapper for malloc() */
void * __wrap_malloc(size_t s)
{
        void *p;

        p = __real_malloc(s);

        printf ("%s(%zu): %p\n", __FUNCTION__, s, p);

        return p;
}

/* free the malloc's! */
int main(void)
{
        free(malloc(4711));
        return 0;
}

example output:

$ ./test_file
__wrap_malloc(4711): 0x501010
__wrap_free(0x501010)

example makefile part:

test_file: test_file.c $(OBJECTS) Makefile
        @echo "LD $@" 
        @$(CC) $(CFLAGS) -o $@ $< $(OBJECTS) -Wl,-wrap,free -Wl,-wrap,malloc $(LDFLAGS)

I dont know if it works correctly on other systems (*BSD, cygwin, ...).

#7

Updated by Safari over 11 years ago

How were you going to fix the bug (null pointer dereference) with this wrap thingie?

Out of memory -conditions can be logged once a minute or so, no need for asserts to flood the logs.
At the upper level, the functions which call functions in mod_simple_vhost.c , could do maybe_warn_oom(func, size_of_alloc_attempt) which writes a log line at max once a minute. Or also additionally do garbage-collect (close oldest idle connections etc.).

#8

Updated by icy over 11 years ago

  • Subject changed from check return value of calloc in mod_simple_vhost.c to check return value of calloc in mod_simple_vhost.c
  • Patch available set to No

If malloc returns NULL, it means that even the swap space has run out. In such situations, there isn't much you can do.
Of course you can free unneeded data and try again.

void *try_malloc(size_t s) {
   void *p = malloc(s);
   if (p)
      return p;
   free_caches();
   return malloc(s);
}

But usually in such situations, the server is so overloaded that you can safely assume that normal operation isn't guaranteed anymore and let the app die due to the nullpointer dereferencing.
The OS OOM-killer will terminate stuff anyway.

In 2.0 for example, we use glib which by default terminates the app if a malloc fails.

#9

Updated by ralf over 11 years ago

Safari wrote:

How were you going to fix the bug (null pointer dereference) with this wrap thingie?

The wrap thing avoids you from rewriting many lines of code (replace all "malloc","free","calloc","realloc") function calls.

There maybe a problem - i dont know - with the dynamically loaded modules.

If you write your own function (eg: "lighttpd_malloc()") then other modules or developers maybe a bit to lazy to use them because (a) they dont know that the function exists (b) they forgot it (c) ...

What ever function is used, i think the result in "NULL==malloc()" is that the app must be killed.
As icy says there is no way to guarantee normal operation.

A special Linux thing is that malloc()'d memory must not be really free for usage, eg: malloc(1M) -> does not allocate the memory, only memset(ptr,1M,0) allocates it.

If you rewrite/wrap the malloc() stuff you maybe insert boundaries (a header and trailer) in the memory for error detection.
If there was an error the app should die, because its not clear what happen because someone overwrites memory blocks. However, this is maybe a too large change.

#10

Updated by Safari over 11 years ago

In your first example wrap you did nothing if NULL was returned by malloc.

1) doing null pointer dereference on purpose is not smart.
http://www.google.fi/search?q="null+pointer+dereference"+bugtraq

2) on Linux, with sysctl vm.overcommit_memory=2 , when malloc returns != NULL, it's guaranteed to be usable.

3) I don't know would using lighttpd_malloc be worse, considering the current laziness in checking system call return values.

4) portability of __wrap or LD_PRELOAD hacks? With lighttpd_malloc it just works and is funnier to debug.

5) some apps actually do not crash and bomb when they run out of memory, instead, they wait a couple of secs and try again (for example qmail in some modules) or log error and quit without segfaulting.
Maybe similar approach can be used in lighttpd: if for example no mem for new connection, log it, don't bomb out (null ptr deref in connections_get_new_connection), don't shutdown http service.

6) I don't know would it be wise to terminate app unconditionally in case malloc() fails, it only makes self-denial-of-service easier. Maybe there's just one page which is causing mem allocation trouble, when it bombs out, the whole http service is down till it's restarted (at least 1-2 s).
Or does that termination apply only to glib's internal functions? Well, maybe, likewise, there's only one page which sometimes causes invalid parameters to be passed to glib, which cause mem alloc failures.

#11

Updated by ralf over 11 years ago

Safari wrote:

In your first example wrap you did nothing if NULL was returned by malloc.

It was just a example of ld's wrap feature.

1) doing null pointer dereference on purpose is not smart.
http://www.google.fi/search?q="null+pointer+dereference"+bugtraq

Not checking return values is never smart ;)

2) on Linux, with sysctl vm.overcommit_memory=2 , when malloc returns != NULL, it's guaranteed to be usable.

ah

3) I don't know would using lighttpd_malloc be worse, considering the current laziness in checking system call return values.

You've to replace many lines of code and you (or who ever) must ensure that only "lighttp_*()" functions are used.
Maybe this is not a problem since someone reads the code before it gets implemented and check for malloc() and friends.
In the source "lighttpd_malloc()" is more clear then some ld-wrap-kung-fu.

4) portability of __wrap or LD_PRELOAD hacks? With lighttpd_malloc it just works and is funnier to debug.

Yep ... this wrap is just a hack to ,,get it done'' - its not fine (but even a cool ld feature which was new to me).

5) some apps actually do not crash and bomb when they run out of memory, instead, they wait a couple of secs and try again (for example qmail in some modules) or log error and quit without segfaulting.
Maybe similar approach can be used in lighttpd: if for example no mem for new connection, log it, don't bomb out (null ptr deref in connections_get_new_connection), don't shutdown http service.

6) I don't know would it be wise to terminate app unconditionally in case malloc() fails, it only makes self-denial-of-service easier. Maybe there's just one page which is causing mem allocation trouble, when it bombs out, the whole http service is down till it's restarted (at least 1-2 s).
Or does that termination apply only to glib's internal functions? Well, maybe, likewise, there's only one page which sometimes causes invalid parameters to be passed to glib, which cause mem alloc failures.

Now i understand you.

It makes really no sense to kick out 1000 connected users only because the admin retrieves "http://server/strange-status-script.php" (or whatever) which needs much memory.
This can only be done in the source which calls malloc() (or lighttpd_malloc()) - as your original patch does (i think),
eg. by drop/free'ing the current connection/client/request/...

#12

Updated by icy over 11 years ago

Safari wrote:

1) doing null pointer dereference on purpose is not smart.
http://www.google.fi/search?q="null+pointer+dereference"+bugtraq

In order to exploit! null pointer dereferences, one has to control either an offset which is used together with the pointer or do other nasty stuff that I will not discuss here.
From what I can tell, it is not possible to exploit this in lighty. If you know of a way, please tell us.

5) some apps actually do not crash and bomb when they run out of memory, instead, they wait a couple of secs and try again (for example qmail in some modules) or log error and quit without segfaulting.
Maybe similar approach can be used in lighttpd: if for example no mem for new connection, log it, don't bomb out (null ptr deref in connections_get_new_connection), don't shutdown http service.

Waiting a couple of seconds is a no-no for a daemon like lighty. The error logging and exiting is much better.
Memory for new connections is in most cases not alloced, it already has been before. There are other places though.

6) I don't know would it be wise to terminate app unconditionally in case malloc() fails, it only makes self-denial-of-service easier. Maybe there's just one page which is causing mem allocation trouble, when it bombs out, the whole http service is down till it's restarted (at least 1-2 s).
Or does that termination apply only to glib's internal functions? Well, maybe, likewise, there's only one page which sometimes causes invalid parameters to be passed to glib, which cause mem alloc failures.

What self-denial-of-service?
A restart does not take 1-2 seconds. Where did you get those numbers from? It usually should take maybe a something in the range of milliseconds.
The glib malloc-failed-termination consists of writing an error to the stderr and exiting the application.

But I agree that a smart handling would be a good thing to have.

#13

Updated by Safari over 11 years ago

icy wrote:

A restart does not take 1-2 seconds. Where did you get those numbers from? It usually should take maybe a something in the range of milliseconds.

I use supervise to, well, supervise lighttpd process:

  1. cat /service/lighttpd/run
    #!/bin/sh
    exec 2>&1
    /bin/date -R >> run.log
    /bin/sleep 1
    ulimit -n 4096
    unset LANG LC_CTYPE LC_ALL
    exec /usr/local/sbin/lighttpd-r2356 -m /usr/lib64/lighttpd -D -f /etc/lighttpd/lighttpd.conf >& /dev/null < /dev/null

...

Now, if there was no sleep of any kind, and there is a problem starting lighttpd, it would chew 100% CPU and fill up the partition.

#14

Updated by icy over 11 years ago

Safari wrote:

I use supervise to, well, supervise lighttpd process:
Now, if there was no sleep of any kind, and there is a problem starting lighttpd, it would chew 100% CPU and fill up the partition.

There is no need to put a sleep in the run file - supervise does prevent such behaviour itself already and in a better way (delay AFTER startup):

It pauses for a second after starting ./run, so that it does not loop too quickly if ./run exits immediately.

But this is getting off topic.

#15

Updated by icy over 11 years ago

  • Status changed from Assigned to Wontfix
  • Pending changed from Yes to No

Also available in: Atom