Bug #2125

Multiple https certs doesn't work

Added by dbb over 4 years ago. Updated over 4 years ago.

Status: Fixed
Start date: 2009-12-27
Priority: Normal
Due date:
Assignee: -
% Done: 100%
Category: -
Target version: 1.4.27
Missing in 1.5.x: No

Description

See details in this thread: http://redmine.lighttpd.net/boards/2/topics/2495

Not original reporter, just creating the issue so that it can be fixed.

error.log - Error log while running tester (224 KB) dbb, 2010-01-10 20:13


Related issues

Duplicated by Feature #2198: SNI not following regular expressions Duplicate 2010-05-14
Duplicated by Bug #2291: SNI doesn't actually work, returns a random certificate Duplicate 2011-01-22

Associated revisions

Revision 2724
Added by stbuehler over 4 years ago

Reset uri.authority before TLS servername handling, reset all "keep-alive" data in connection_del (fixes #2125)

History

#1 Updated by nitrox over 4 years ago

Just a few hours ago we debugged this on our IRC channel; this time it was an old OpenSSL/libssl problem, and updating fixed it. Make sure you have at least OpenSSL 0.9.8j!

You can test it with OpenSSL's own SSL server implementation:
openssl s_server -cert cert1.pem -cert2 cert2.pem -servername host2.domain.tld

Check with the following that two different certs are presented:
openssl s_client -connect domain.tld:4433 -tls1
vs.
openssl s_client -connect domain.tld:4433 -tls1 -servername host2.domain.tld

If that works, check that your lighty version is current: at least 1.4.24; 1.4.25 fixed a small problem and is recommended.

Basically, you just add an ssl.pemfile statement to your host conditional, assuming a working SSL setup.

#2 Updated by nitrox over 4 years ago

  • Status changed from New to Need Feedback

#3 Updated by dbb over 4 years ago

I'm running Arch Linux with OpenSSL 0.9.8l and Lighttpd 1.4.25:

dbb@guru ~ $ pacman -Qi openssl lighttpd | egrep "Name|Version" 
Name           : openssl
Version        : 0.9.8l-1
Name           : lighttpd
Version        : 1.4.25-1

In case the problem may be distribution specific, the patches applied to the OpenSSL package for Arch can be found here: http://repos.archlinux.org/wsvn/packages/openssl/repos/core-i686/. There are no patches applied to the Lighttpd package.

I manually tested with OpenSSL's server/client as above, which worked. However, I'm still seeing the issue as described in the thread. In particular, if you have a primary and a secondary host, Lighttpd returns the proper certificate for the secondary host up until a request is made to the primary host; from then on, only the primary certificate is returned.

For reference here is the relevant section of my configuration:

$SERVER["socket"] == ":443" {
    ssl.engine = "enable" 
    ssl.pemfile = "/etc/lighttpd/certs/liqd.org.pem" 
    ssl.ca-file = "/etc/lighttpd/certs/liqd.org.ca" 

    $HTTP["host"] == "liqd.org" {
        ssl.pemfile = "/etc/lighttpd/certs/liqd.org.pem" 
        ssl.ca-file = "/etc/lighttpd/certs/liqd.org.ca" 
    }

    $HTTP["host"] == "code.liqd.org" {
        ssl.pemfile = "/etc/lighttpd/certs/code.liqd.org.pem" 
        ssl.ca-file = "/etc/lighttpd/certs/code.liqd.org.ca" 
    }

    $HTTP["host"] =~ "^www\.(.*)$" {
        url.redirect = ( "^/(.*)" => "https://%1/$1" )
    }
}

#4 Updated by dbb over 4 years ago

Any word on this?

#5 Updated by nitrox over 4 years ago

Well, for some it works without problems, for others it doesn't. But we don't have enough information (debug output, straces, logs) to reproduce it. On the other hand, we had some cases on IRC where updating OpenSSL solved all problems.

#6 Updated by dbb over 4 years ago

So I whipped up a script to test this:

http://pastebin.com/f528c07d

Basically it uses gnutls-cli to randomly make connections to a list of hosts and check that the certificate returned matches the host. My observations using it on my host:

  1. After I did some rearranging of my certificates yesterday, it appears this error is harder to reproduce. The main thing I did was remove my separate CA certificates and concatenate the entire chain into a single file for each certificate. Therefore I am no longer using ssl.ca-file.
  2. Running the tester for several thousand connections will work fine but once a mismatch occurs (I had to up the number of connections to 10,000 to consistently get a break... eventually), pretty much any subsequent connections will result in a mismatch as well.
  3. The break always occurs on the host that is not the default, i.e. when a break occurs the certificate that is returned is the one defined outside of any conditionals.
  4. When running 4 of the testers in parallel, as soon as one broke, 2 others immediately broke as well. However, one kept going successfully for significantly longer before breaking.

And like I said earlier, I'm running the latest stable versions of both lighttpd and openssl.

#7 Updated by stbuehler over 4 years ago

I'm not sure it is a good idea to skip ssl.ca-file; I think OpenSSL will ignore the additional certs in the pem file. (Perhaps you can use the same file for pem-file and ca-file.)

I think we need some debug info to find the bug, so if you know a little C you can try this:
  • print a debug line for every call to network_ssl_servername_callback (network.c)
  • enable the "failed to get TLS server name" log line (#if 1 instead of #if 0)
  • log the TLS server name (copy line 94+95 in network.c) after a successful call to SSL_get_servername

It would be interesting to see what debug output you get once the certs no longer match.

--

script from http://pastebin.com/f528c07d

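#!/bin/bash

numtries=10000
hosts=(host1.localhost.local host2.localhost.local host3.localhost.local)

# Connect to the given host and check that the returned certificate matches it.
checkcert()
{
    local output check

    output=$(gnutls-cli "$1" &)
    # gnutls-cli reports whether the hostname matches the certificate.
    check=$(echo "$output" | grep "hostname")

    if [[ "$check" != *NOT* ]]; then
        # match: print the result line in green
        echo -en "\e[32m"
        echo "$check"
        echo -en "\e[0m"
    else
        # no match: print the full output in red and stop
        echo -en "\e[31m"
        echo "$output"
        echo -en "\e[0m"
        exit 1
    fi
}

# Print a randomly chosen entry from the hosts array.
randhost()
{
    local i=$(( RANDOM % ${#hosts[@]} ))

    echo "${hosts[$i]}"
}

for i in $(seq 1 "$numtries")
do
    checkcert "$(randhost)"
done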

(please always inline/attach files so we have all information in the tracker)

#8 Updated by dbb over 4 years ago

I made the following changes in network.c:

--- lighttpd-orig/lighttpd-1.4.25/src/network.c    2009-10-16 18:03:44.000000000 -0400
+++ lighttpd/src/lighttpd-1.4.25/src/network.c    2010-01-10 14:46:59.290717919 -0500
@@ -65,14 +65,16 @@

 #if defined USE_OPENSSL && ! defined OPENSSL_NO_TLSEXT
 static int network_ssl_servername_callback(SSL *ssl, int *al, server *srv) {
-    const char *servername;
+    log_error_write(srv, __FILE__, __LINE__, "ss", "SSL:", "called network_ssl_servername_callback");
+    
+    const char *servername;
     connection *con = (connection *) SSL_get_app_data(ssl);
     UNUSED(al);

     buffer_copy_string(con->uri.scheme, "https");

     if (NULL == (servername = SSL_get_servername(ssl, TLSEXT_NAMETYPE_host_name))) {
-#if 0
+#if 1
         /* this "error" just means the client didn't support it */
         log_error_write(srv, __FILE__, __LINE__, "ss", "SSL:",
                 "failed to get TLS server name");
@@ -103,7 +105,10 @@
         return SSL_TLSEXT_ERR_ALERT_FATAL;
     }

-    return SSL_TLSEXT_ERR_OK;
+    log_error_write(srv, __FILE__, __LINE__, "ssb", "SSL:",
+        "successful call to get servername", con->tlsext_server_name);
+    
+    return SSL_TLSEXT_ERR_OK;
 }
 #endif

Then I ran the tester. One thing I noticed is that to cause the break, a request needs to come in from an actual browser. I ran through 40000 connections without issue, but then refreshing the pages in Firefox a few times triggered a break. Anyway, the error log from this is attached, although I don't think it provides any additional insight. One additional observation, though, is that this time the break went both ways (connecting to host1 returned the certificate for host2, and connecting to host2 returned the certificate for host1):

Resolving 'host2.localhost.local'...
Connecting to '127.0.0.1:443'...
- Certificate type: X.509
 - Got a certificate list of 1 certificates.
 - Certificate[0] info:
  - subject `C=US,ST=NY,L=New York,O=Internet Widgits Pty Ltd,CN=host1.localhost.local', issuer `C=US,ST=NY,L=New York,O=Internet Widgits Pty Ltd,CN=host1.localhost.local', RSA key 1024 bits, signed using RSA-SHA, activated `2010-01-10 13:39:36 UTC', expires `2020-01-08 13:39:36 UTC', SHA-1 fingerprint `bab1ee719387dbab5d7f901bdab695ca613fd266'
- The hostname in the certificate does NOT match 'host2.localhost.local'
Resolving 'host1.localhost.local'...
Connecting to '127.0.0.1:443'...
- Certificate type: X.509
 - Got a certificate list of 1 certificates.
 - Certificate[0] info:
  - subject `C=US,ST=NY,L=New York,O=Internet Widgits Pty Ltd,CN=host2.localhost.local', issuer `C=US,ST=NY,L=New York,O=Internet Widgits Pty Ltd,CN=host2.localhost.local', RSA key 1024 bits, signed using RSA-SHA, activated `2010-01-10 13:40:02 UTC', expires `2020-01-08 13:40:02 UTC', SHA-1 fingerprint `8e78aeb59a42b3da70826571a40c9f1d6d48c4bc'
- The hostname in the certificate does NOT match 'host1.localhost.local'

EDIT: Also, having the entire chain in one file doesn't seem to be an issue. I've tested it and browsers see the entire certificate chain and validate the site.

#9 Updated by KawatoriShinji over 4 years ago

For some reason con->uri.authority is sometimes still set when network_ssl_servername_callback() gets called. When the callback calls config_patch_connection(srv, con, COMP_HTTP_HOST), it sees uri.authority and never looks at tlsext_server_name.

Since I haven't looked outside of the callback or the COMP_HTTP_HOST patch code I don't know how hacky this is, but I just dropped a fix directly into the callback. So far it's worked for me.

diff -Nrup src/network.c src/network.c
--- src/network.c       2010-03-01 16:44:01.000000000 +0000
+++ src/network.c       2010-03-01 21:51:16.000000000 +0000
@@ -82,6 +82,9 @@ static int network_ssl_servername_callba
        buffer_copy_string(con->tlsext_server_name, servername);
        buffer_to_lower(con->tlsext_server_name);

+       /* Sometimes this is still set, confusing COMP_HTTP_HOST */
+       buffer_reset(con->uri.authority);
+
        config_cond_cache_reset(srv, con);
        config_setup_connection(srv, con);
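
To illustrate the failure mode described in this comment in isolation, here is a minimal model in C. It is a sketch, not lighttpd code: the struct `conn` and the functions `effective_host`, `sni_callback_buggy`, `sni_callback_fixed` and `demo` are hypothetical names invented for this example, and the lookup order (a non-empty authority wins over the SNI name) is assumed from the description above rather than copied from the source.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the per-connection state; not lighttpd's struct. */
struct conn {
    char authority[64];       /* models con->uri.authority */
    char tls_server_name[64]; /* models con->tlsext_server_name */
};

/* Host used for $HTTP["host"] matching in this model (assumed order):
 * a non-empty authority wins, the SNI name is only a fallback. */
static const char *effective_host(const struct conn *c) {
    return c->authority[0] != '\0' ? c->authority : c->tls_server_name;
}

/* Models the callback before the patch: stores the SNI name only. */
static const char *sni_callback_buggy(struct conn *c, const char *servername) {
    assert(servername != NULL);
    snprintf(c->tls_server_name, sizeof c->tls_server_name, "%s", servername);
    return effective_host(c);
}

/* Models the callback after the patch: also clears the stale authority. */
static const char *sni_callback_fixed(struct conn *c, const char *servername) {
    assert(servername != NULL);
    snprintf(c->tls_server_name, sizeof c->tls_server_name, "%s", servername);
    c->authority[0] = '\0'; /* stands in for buffer_reset(con->uri.authority) */
    return effective_host(c);
}

/* Simulates a reused connection object that still carries an old Host value. */
static void demo(void) {
    struct conn c;
    memset(&c, 0, sizeof c);
    snprintf(c.authority, sizeof c.authority, "%s", "host1.localhost.local");

    printf("buggy: %s\n", sni_callback_buggy(&c, "host2.localhost.local"));
    printf("fixed: %s\n", sni_callback_fixed(&c, "host2.localhost.local"));
}
```

Calling `demo()` from a `main` function prints `buggy: host1.localhost.local` followed by `fixed: host2.localhost.local`: with the stale authority in place the host1 configuration (and thus its certificate) is selected even though the client asked for host2, which matches the mismatches reported earlier in this thread.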

#10 Updated by digital over 4 years ago

KawatoriShinji: Thank you for the patch. I've just applied it to my lighttpd 1.4.25 environment [FreeBSD 7.x], but it appears to have had no effect.

#11 Updated by digital over 4 years ago

  • Target version set to 1.4.27

#12 Updated by Necoro over 4 years ago

This small patch seems to solve the issue on my setup (which is lighttpd-1.4.25 with the additional CVE-2010-0295 patch running on Gentoo Linux). I'll report if things change :)

#13 Updated by stbuehler over 4 years ago

  • Status changed from Need Feedback to Fixed
  • % Done changed from 0 to 100

Applied in changeset r2724.

#14 Updated by digital over 4 years ago

For those that continue to have a problem and are testing under Windows XP:

SNI does not work on Windows XP using various browsers, see:
http://en.wikipedia.org/wiki/Server_Name_Indication

(Thus not a Lighttpd issue)

#15 Updated by dbb over 4 years ago

Tested 1.4.27rc1 on my own server with 10,000 iterations of my test script and no wrong certificates were returned. Looks like this one has finally been resolved.

#16 Updated by stbuehler over 4 years ago

Sounds good - thank you very much for your feedback!
