[Solved] High cpu usage with HTTPS configuration after start up

Added by klaus over 6 years ago

Hi,
We are using lighttpd 1.4.35 on an embedded ARM device. The device's website polls data in the background to display the current state.
Now we have activated HTTPS. If we restart lighttpd while the website continues polling, the server produces 100% CPU usage on one core. This also happens if we disable all plugins. A minimal config that reproduces the problem looks like this:

server.document-root           = "/www/pages/" 
server.errorlog                = "/var/log/lighttpd.error.log" 

# listen to IPv4
server.bind                    = "0.0.0.0" 
server.port                    = "80" 

# https
$SERVER["socket"] == "0.0.0.0:443" {
    ssl.engine = "enable" 
    ssl.pemfile = "/etc/lighttpd.d/certs/lighttpd.pem" 
}

index-file.names = ( 
    "index.html",
    "index.htm",
    "default.htm" 
)

With debug.log-ssl-noise = "enable" we get the following message multiple times: lighttpd-1.4.35/src/connections.c.305 SSL: 1 error:140780E5:SSL routines:ssl23_read:ssl handshake failure.

The problem occurs when the website runs in Firefox (version 56/57) or IE11. With Chrome (version 62) we don't have this issue.

If we open the website in the browser only after the server has started, everything works fine.

Is there a workaround? Can I provide other data?


Replies (11)

RE: High cpu usage with HTTPS configuration after start up - Added by gstrauss over 6 years ago

Is there a workaround? Can I provide other data?

Yes. Please try running the current version of lighttpd, 1.4.47. lighttpd 1.4.35 was released 3 1/2 years ago. If your operating system distribution does not provide a more recent version, then you might consider switching to a distribution with a philosophy of more actively maintaining software packages. If running Debian 9 (Stretch) with lighttpd 1.4.45, then you likely want your embedded system to set ssl.read-ahead = "disable", which becomes the default in lighttpd 1.4.46.

RE: High cpu usage with HTTPS configuration after start up - Added by klaus over 6 years ago

Thank you for your quick answer.
Today I've tested version 1.4.47. Unfortunately, the problem also occurs there.
The software is compiled with (just like version 1.4.35):

lighttpd/1.4.47-devel-lighttpd-1.4.47 (ssl) - a light and fast webserver

Event Handlers:
        + select (generic)
        + poll (Unix)
        + epoll (Linux 2.6)
        - /dev/poll (Solaris)
        - eventports (Solaris)
        - kqueue (FreeBSD)
        - libev (generic)

Network handler:
        - linux-sendfile
        - freebsd-sendfile
        - darwin-sendfile
        - solaris-sendfilev
        + writev
        + write
        - mmap support

Features:
        + IPv6 support
        - zlib support
        - bzip2 support
        + crypt support
        + SSL support
        + PCRE support
        - MySQL support
        - PgSQL support
        - DBI support
        - Kerberos support
        - LDAP support
        - memcached support
        - FAM support
        - LUA support
        - xml support
        - SQLite support
        - GDBM support

Config is now:

server.modules = (
    "mod_openssl" 
)

server.document-root           = "/www/pages/" 
server.errorlog                = "/var/log/lighttpd.error.log" 

# listen to IPv4
server.bind                    = "0.0.0.0" 
server.port                    = "80" 

# https
$SERVER["socket"] == "0.0.0.0:443" {
    ssl.engine = "enable" 
    ssl.read-ahead = "disable" 
    ssl.pemfile = "/etc/lighttpd.d/certs/lighttpd.pem" 
}

index-file.names = ( 
    "index.html",
    "index.htm",
    "default.htm" 
)

I use a custom Linux distribution built with Yocto.

RE: High cpu usage with HTTPS configuration after start up - Added by gstrauss over 6 years ago

Thanks for testing with the current version of lighttpd.

Would you help me to reproduce this?

The devices website is polling data in background to display actual "state".

How is it polling? What types of requests? What frequency?
Is the request sending data to the server?
Is the response large?
Have you set server.upload-dirs to a local disk (flash) or is it pointed to an in-memory filesystem? e.g. Are you running out of memory when this happens?
What happens with server.max-keep-alive-idle = 0 ?
What about with:
ssl.read-ahead = "disable"
server.stream-request-body = 2
server.stream-response-body = 2

If we restart lighttpd while website continues polling data the server produce 100% cpu usage for one core.

How are you restarting lighttpd? lighttpd 1.4.46 supports a graceful restart with SIGUSR1. Can you reproduce with a graceful restart?

RE: High cpu usage with HTTPS configuration after start up - Added by klaus over 6 years ago

How is it polling? What types of requests? What frequency?

There are four GET requests (AJAX/XHR) that are sent once a second (see attached files). These requests are normally answered by a custom lighttpd module. The error also occurs if the module is not loaded; in that case the server simply responds with a 404.

Is the response large?

The size is about 1 kB (JSON data) on success; otherwise it is the size of the default error page, e.g. for a 404.

What happens with server.max-keep-alive-idle = 0 ?
What about with:
ssl.read-ahead = "disable"
server.stream-request-body = 2
server.stream-response-body = 2

I have tested all these settings (version 1.4.47), but with the same result.

How are you restarting lighttpd?

I use: /etc/init.d/lighttpd restart for the old version

lighttpd 1.4.46 supports a graceful restart with SIGUSR1. Can you reproduce with a graceful restart?

With kill -10 $(pidof lighttpd) I can reproduce it.

Are you running out of memory when this happens?

Nope. It looks like an "endless" loop. If I have a high CPU load and then close the website/browser, everything is fine after a while (~30-60 seconds). I can then reopen the website and everything works fine.

Today I tested some other things too.
- Ran lighttpd on my x86 Ubuntu machine (14.04.4 LTS). The behavior does not occur there. :-/
- Compiled a new image with another OpenSSL version (1.0.2h instead of 1.0.2d). Without success.

Would you help me to reproduce this?

Yes of course.
I tried to emulate the requests with the Postman Runner. But that didn't work. I think without the website and ARM environment it will be difficult.
I am not an SSL expert. But I suspect that the browser tries to connect with old "SSL/TLS data" (see first posting: handshake failure) and then something happens...?
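
One way to probe the "old SSL/TLS data" suspicion without a browser might be to save a TLS session before the restart and offer it back afterwards. The following is only a sketch: resume_check is a hypothetical helper name, the host and session file path are placeholders, and it assumes an openssl command-line tool whose s_client supports -sess_out and -sess_in and prints a "New, ..." or "Reused, ..." line.

```shell
# resume_check HOST:PORT SESSION_FILE   (hypothetical helper, not part of lighttpd)
# If SESSION_FILE does not exist yet, the TLS session is saved to it.
# If it already exists, the saved session is offered to the server.
# The printed line starts with "New," (full handshake) or "Reused," (resumed).
resume_check() {
    target="$1"
    sess="$2"
    if [ -s "$sess" ]; then
        opt="-sess_in"
    else
        opt="-sess_out"
    fi
    openssl s_client -connect "$target" "$opt" "$sess" </dev/null 2>/dev/null \
        | grep -E '^(New|Reused),'
}

# Example (placeholders): run once, restart lighttpd, run again.
#   resume_check 192.0.2.10:443 /tmp/lighttpd.sess
#   /etc/init.d/lighttpd restart
#   resume_check 192.0.2.10:443 /tmp/lighttpd.sess
```

If the second call reports "New," the server did not accept the old session and had to do a full handshake, which is the expensive path for the private key.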

RE: High cpu usage with HTTPS configuration after start up - Added by gstrauss over 6 years ago

Do you have gdb in your environment? If so, please attach to the pid (gdb -p $(pidof lighttpd)) and get a few where; continue; Ctrl-C; where; continue; Ctrl-C; where; quit stack traces. It may be helpful to find out where it is spinning.
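
If typing the commands interactively is awkward on the device, the same sampling could be scripted with gdb's batch mode. This is a rough sketch, assuming gdb is installed on the target; sample_stacks is a hypothetical helper name and the output path in the example is a placeholder.

```shell
# sample_stacks PID [COUNT]   (hypothetical helper)
# Attaches gdb to PID, prints one backtrace, detaches, waits a second,
# and repeats COUNT times (default 3). The process keeps running in between.
sample_stacks() {
    pid="$1"
    count="${2:-3}"
    i=0
    while [ "$i" -lt "$count" ]; do
        echo "=== sample $((i + 1)) ==="
        gdb -p "$pid" -batch -ex "bt"
        sleep 1
        i=$((i + 1))
    done
}

# Example:
#   sample_stacks "$(pidof lighttpd)" 5 > /tmp/lighttpd-stacks.txt 2>&1
```

If most samples show the same few frames at the top, that is likely where the process is spending its time.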

While I am glad that you couldn't reproduce it under x86 Ubuntu, this will make things more difficult to track down. I, too, tried to reproduce on x86_64 Fedora, but was unable to do so with the simple test cases that I tried (they all worked as expected).

RE: High cpu usage with HTTPS configuration after start up - Added by klaus over 6 years ago

See attachment for gdb output.
It looks like it's not a lighttpd problem. The debugger always stops in libcrypto code. And this code seems to be ARM specific.

Do you have any more ideas as to how I could proceed?

RE: High cpu usage with HTTPS configuration after start up - Added by gstrauss over 6 years ago

Has any ssl worked on this system? Have you tried testing openssl-client? Maybe curl with ssl support as a client to an external SSL site?

If you're building your own packages, you might try building libressl instead of openssl, and building and linking lighttpd against libressl. Or try the latest openssl 1.1.0f. Another thing to try would be to compile both lighttpd and openssl without any compiler optimization enabled.

A corrupt stack is really bad, and I would very much like to help figure out if there is something in lighttpd causing this, though I hope not.

RE: High cpu usage with HTTPS configuration after start up - Added by klaus over 6 years ago

Has any ssl worked on this system?

Yes. We have a working OpenSSH daemon and we are using OpenSSL for signature checking.

Many other packages/parts depend on OpenSSL. If I update to a new "main" version of OpenSSL, I have to maintain/update a lot of other Yocto recipes.

I built lighttpd and OpenSSL without any compiler optimization (-O0), with the same result.

A corrupt stack is really bad...

I have built a new image (OpenSSL with configure option no-asm) and used the correct debug symbols. So I get a nice stack trace and no "corrupt stack" hint. See attachments.

RE: High cpu usage with HTTPS configuration after start up - Added by klaus over 6 years ago

Today I tested another certificate (2048-bit private key instead of 4096-bit), and with it I don't have the problem described above.

Maybe our cortexa9 (dual core @800 MHz) is too slow in some situations for such a large key?

For now I'll use the smaller key...
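
For anyone checking their own setup, the key size of a certificate can be read with the openssl command-line tool. A minimal sketch, assuming the x509 text output contains a "Public-Key: (N bit)" line as in OpenSSL 1.0.x and later; cert_key_bits is a hypothetical helper name.

```shell
# cert_key_bits PEMFILE   (hypothetical helper)
# Prints the public key size in bits of the first certificate in PEMFILE.
cert_key_bits() {
    openssl x509 -in "$1" -noout -text \
        | sed -n 's/.*Public-Key: (\([0-9][0-9]*\) bit).*/\1/p'
}

# Example:
#   cert_key_bits /etc/lighttpd.d/certs/lighttpd.pem
#
# Creating a self-signed 2048-bit certificate for testing (the CN is a
# placeholder, and self-signed is an assumption about the setup):
#   openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
#       -subj "/CN=device.example" -keyout server.key -out server.crt
#   cat server.key server.crt > lighttpd.pem
```

The combined file is written that way because ssl.pemfile in the config above points at a single PEM file.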

RE: High cpu usage with HTTPS configuration after start up - Added by stbuehler over 6 years ago

For benchmarking, you could use openssl rsautl on your ARM device to create signatures for a small file.
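
A rough sketch of such a benchmark, assuming the PEM file contains the private key and that date +%s is available on the device; bench_rsa_sign is a hypothetical helper name. The timing has one-second resolution, so use enough iterations to get a meaningful number.

```shell
# bench_rsa_sign KEYFILE [N]   (hypothetical helper)
# Signs a small temporary file N times (default 10) with the private key
# in KEYFILE and prints the elapsed wall-clock time in seconds.
bench_rsa_sign() {
    key="$1"
    n="${2:-10}"
    tmp="$(mktemp)"
    printf 'hello\n' > "$tmp"
    start="$(date +%s)"
    i=0
    while [ "$i" -lt "$n" ]; do
        openssl rsautl -sign -inkey "$key" -in "$tmp" -out /dev/null \
            || { rm -f "$tmp"; return 1; }
        i=$((i + 1))
    done
    end="$(date +%s)"
    rm -f "$tmp"
    echo "$n signatures with $key took $((end - start)) s"
}

# Example: compare the 4096-bit and 2048-bit keys (second path is a placeholder).
#   bench_rsa_sign /etc/lighttpd.d/certs/lighttpd.pem 20
#   bench_rsa_sign /tmp/test-2048.pem 20
```

Each iteration also pays for starting the openssl process, so this overstates the per-signature cost somewhat. The built-in openssl speed rsa2048 rsa4096 avoids that overhead and reports signs per second directly.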

My guess is that the ssl23_read:ssl handshake failure errors are due to clients closing the connection after a timeout.

RE: High cpu usage with HTTPS configuration after start up - Added by klaus over 6 years ago

Thank you for your feedback and thank you for your quick support. By using the smaller key, the issue is resolved for me.

If you need any more information, just let me know.
