Bug #2464


patch for intermittent ldap failures

Added by codehero almost 11 years ago. Updated over 7 years ago.

Target version:


I was encountering intermittent auth failures using ldap. I tracked the issue to LDAP syscalls not recovering from EINTR properly. Setting LDAP_OPT_RESTART allows LDAP to retry the syscall in MOST cases.

Some libdldap syscalls are not EINTR guarded (presently, connect() in ldap_pvt_connect()) and so you can still expect to see intermittent failures from that. Only patching libldap will fix those issues.

Note that my testing platform is an extremely slow processor and is perfect at finding these kinds of conditions.


Related issues 2 (0 open2 closed)

Related to Bug #2345: LDAP authentication problemFixed2011-09-19Actions
Related to Bug #2846: LDAP authentication vs. AD: problems with referrals Fixed2017-12-07Actions
Actions #1

Updated by codehero almost 11 years ago

OpenLDAP incorporated patch into its master to guard the connect() call against the EINTR signal.

In my testing so far (with both patched lighttpd and OpenLDAP) I have not experienced a single connection issue. Please incorporate this patch.

Actions #2

Updated by stbuehler over 10 years ago

  • Target version set to 1.4.x

I'm not sure LDAP_OPT_RESTART has no negative side effects. Would it keep restarting in a loop until it gets a response (and therefore blocking all request handling)? Does it fail after a fixed amount of tries?

Actions #3

Updated by codehero over 10 years ago

LDAP_OPT_RESTART is only checked when syscalls fail due to system interference (ie from a signal. If you looked at the openldap patches I linked you would have seen:

+    } while(err == EINTR &&
+        LDAP_BOOL_GET( &ld->ld_options, LDAP_BOOL_RESTART ));

So my system had intermittent auth failures because external CGI processes exited sending SIGLCHLD while openldap's connect() call was taking place.

You don't have an infinite loop without infinite signals interrupting the lighttpd process.

Actions #4

Updated by stbuehler over 10 years ago

The patches might touch only EINTR stuff with LDAP_OPT_RESTART, but given the docs for the option it was not designed for this (although the doc also says "FIXME").

But did you check the complete openldap code, and did you check all versions? Also, does LDAP_OPT_RESTART have a version requirement?

Also infinite loops on EINTR are bad; for example there could be SIGALRM... - what happens if the connect() would timeout, but SIGALRM restarts it every second?

Actions #5

Updated by codehero over 10 years ago

I did not check all the code openldap code for LDAP_OPT_RESTART and given how much of a pile the openldap code is, I won't. I don't know about any version requirements, as I was only interested in solving this issue for my specific application.

I did not consider SIGALRM, but unless the signalling period is less than the time it takes for connect() to complete I do not see the issue. I was using a custom local LDAP server so this was not an issue for me. I would NOT recommend using a remote LDAP server...

FWIW, I have not seen a single lockup with my code yet. I have made my own (very poor) LDAP server and bugs in my server caused lighty to hang up, but this was without the LDAP_OPT_RESTART change. The takeaway here is that using openldap in the first place is playing with fire.

Actions #6

Updated by gstrauss almost 8 years ago

Instead of LDAP_OPT_RESTART, the following patch modifies the flags to sigaction for SIGCHLD to restart those system calls that are restartable, instead of causing those system calls to return EINTR. See 'man 7 signal' for list of restartable system calls.
This patch only works on systems which support sigaction() instead of only supporting signal(), but most modern systems support sigaction().

diff --git a/src/server.c b/src/server.c
index 26ecbcc..351503d 100644
--- a/src/server.c
+++ b/src/server.c
@@ -973,6 +973,7 @@ int main (int argc, char **argv) {
        sigaction(SIGTERM, &act, NULL);
        sigaction(SIGHUP,  &act, NULL);
        sigaction(SIGALRM, &act, NULL);
+       act.sa_flags |= SA_RESTART | SA_NOCLDSTOP;
        sigaction(SIGCHLD, &act, NULL);

 #elif defined(HAVE_SIGNAL)

Actions #7

Updated by stbuehler over 7 years ago

  • Target version changed from 1.4.x to 1.4.40

I'll go with the patch from gstrauss (thanks), as I don't want to search for how to check for the presence of LDAP_OPT_RESTART (which probably would fix the problem too).

Actions #8

Updated by stbuehler over 7 years ago

  • Status changed from Patch Pending to Fixed
  • % Done changed from 0 to 100

Applied in changeset r3108.

Actions #9

Updated by gstrauss over 7 years ago

  • Related to Bug #2345: LDAP authentication problem added
Actions #10

Updated by gstrauss almost 6 years ago

  • Related to Bug #2846: LDAP authentication vs. AD: problems with referrals added

Also available in: Atom