Project

General

Profile

Actions

Bug #2464

closed

patch for intermittent ldap failures

Added by codehero over 11 years ago. Updated about 8 years ago.

Status:
Fixed
Priority:
Normal
Category:
-
Target version:
ASK QUESTIONS IN Forums:

Description

I was encountering intermittent auth failures using ldap. I tracked the issue to LDAP syscalls not recovering from EINTR properly. Setting LDAP_OPT_RESTART allows LDAP to retry the syscall in MOST cases.

Some libdldap syscalls are not EINTR guarded (presently, connect() in ldap_pvt_connect()) and so you can still expect to see intermittent failures from that. Only patching libldap will fix those issues.

Note that my testing platform is an extremely slow processor and is perfect at finding these kinds of conditions.


Files


Related issues 3 (0 open3 closed)

Related to Bug #2345: LDAP authentication problemFixed2011-09-19Actions
Related to Bug #2846: LDAP authentication vs. AD: problems with referrals Fixed2017-12-07Actions
Related to Feature #2294: add ldap referrals supportFixedActions
Actions #1

Updated by codehero about 11 years ago

OpenLDAP incorporated patch into its master to guard the connect() call against the EINTR signal.

http://www.openldap.org/its/index.cgi/Software%20Bugs?id=7476

In my testing so far (with both patched lighttpd and OpenLDAP) I have not experienced a single connection issue. Please incorporate this patch.

Actions #2

Updated by stbuehler almost 11 years ago

  • Target version set to 1.4.x

I'm not sure LDAP_OPT_RESTART has no negative side effects. Would it keep restarting in a loop until it gets a response (and therefore blocking all request handling)? Does it fail after a fixed amount of tries?

Actions #3

Updated by codehero almost 11 years ago

LDAP_OPT_RESTART is only checked when syscalls fail due to system interference (ie from a signal. If you looked at the openldap patches I linked you would have seen:

+    } while(err == EINTR &&
+        LDAP_BOOL_GET( &ld->ld_options, LDAP_BOOL_RESTART ));

So my system had intermittent auth failures because external CGI processes exited sending SIGLCHLD while openldap's connect() call was taking place.

You don't have an infinite loop without infinite signals interrupting the lighttpd process.

Actions #4

Updated by stbuehler almost 11 years ago

The patches might touch only EINTR stuff with LDAP_OPT_RESTART, but given the docs for the option it was not designed for this (although the doc also says "FIXME").

But did you check the complete openldap code, and did you check all versions? Also, does LDAP_OPT_RESTART have a version requirement?

Also infinite loops on EINTR are bad; for example there could be SIGALRM... - what happens if the connect() would timeout, but SIGALRM restarts it every second?

Actions #5

Updated by codehero almost 11 years ago

I did not check all the code openldap code for LDAP_OPT_RESTART and given how much of a pile the openldap code is, I won't. I don't know about any version requirements, as I was only interested in solving this issue for my specific application.

I did not consider SIGALRM, but unless the signalling period is less than the time it takes for connect() to complete I do not see the issue. I was using a custom local LDAP server so this was not an issue for me. I would NOT recommend using a remote LDAP server...

FWIW, I have not seen a single lockup with my code yet. I have made my own (very poor) LDAP server and bugs in my server caused lighty to hang up, but this was without the LDAP_OPT_RESTART change. The takeaway here is that using openldap in the first place is playing with fire.

Actions #6

Updated by gstrauss about 8 years ago

Instead of LDAP_OPT_RESTART, the following patch modifies the flags to sigaction for SIGCHLD to restart those system calls that are restartable, instead of causing those system calls to return EINTR. See 'man 7 signal' for list of restartable system calls.
This patch only works on systems which support sigaction() instead of only supporting signal(), but most modern systems support sigaction().

diff --git a/src/server.c b/src/server.c
index 26ecbcc..351503d 100644
--- a/src/server.c
+++ b/src/server.c
@@ -973,6 +973,7 @@ int main (int argc, char **argv) {
        sigaction(SIGTERM, &act, NULL);
        sigaction(SIGHUP,  &act, NULL);
        sigaction(SIGALRM, &act, NULL);
+       act.sa_flags |= SA_RESTART | SA_NOCLDSTOP;
        sigaction(SIGCHLD, &act, NULL);

 #elif defined(HAVE_SIGNAL)

Actions #7

Updated by stbuehler about 8 years ago

  • Target version changed from 1.4.x to 1.4.40

I'll go with the patch from gstrauss (thanks), as I don't want to search for how to check for the presence of LDAP_OPT_RESTART (which probably would fix the problem too).

Actions #8

Updated by stbuehler about 8 years ago

  • Status changed from Patch Pending to Fixed
  • % Done changed from 0 to 100

Applied in changeset r3108.

Actions #9

Updated by gstrauss almost 8 years ago

  • Related to Bug #2345: LDAP authentication problem added
Actions #10

Updated by gstrauss about 6 years ago

  • Related to Bug #2846: LDAP authentication vs. AD: problems with referrals added
Actions #11

Updated by gstrauss 3 months ago

Actions

Also available in: Atom