Bug #934: lighttpd 1.4.13 crashes under PHP load - Lighttpd - lighty labs

Bug #934

closed

lighttpd 1.4.13 crashes under PHP load

Added by Anonymous over 18 years ago. Updated over 16 years ago.

Status: Invalid
Priority: Normal
Category: core
Target version:

Description

I am running lighttpd 1.4.13, using FastCGI PHP.

Periodically I see two different failure behaviors.

The first is a server crash, for which I have attached several valgrind dumps. (Note also that there is a problem with lighttpd's use of setgroups when the group has many users in it.)

The second may be related: the PHP FastCGI connection becomes "clogged", i.e. connections do not close and the FastCGI interface rapidly runs out of connections to the PHP server.

This same lighttpd server handles a high static file serving load without any trouble - I encounter the crashing and FastCGI failure behaviors when running PHP scripts on it.

PHP is 5.2.0 FastCGI.

Also attached is the lighttpd configuration file.

-- jb

Files

lighttpd.30871 (9.13 KB), Anonymous, 2006-12-14 00:43
lighttpd.31020 (8.84 KB), Anonymous, 2006-12-14 00:43
lighttpd.31027 (8.84 KB), Anonymous, 2006-12-14 00:43
lighttpd.conf (12 KB), Anonymous, 2006-12-14 00:43

Updated by Anonymous over 18 years ago

The setgroups part (quoted below) turned out to be a problem with nss_ldap, which I corrected by installing the latest nss_ldap from source.

I still have the PHP issues.

--
(Note also that there is a problem with lighttpd's use of setgroups when the group has many users in it).

Updated by Anonymous over 18 years ago

The PHP problem seems to be related to some PHP scripts' use of the PHP mail() function.

mail() forks and execs a /bin/sh, which is used to run /usr/bin/sendmail. Something about this appears to confuse or hang PHP or the FastCGI interface.

When the problem behavior occurs, I see "sh" processes in the ps list. If I manually kill these sh processes the FastCGI load starts to come down.
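
The manual check described above (looking for "sh" processes in the ps list) could be scripted roughly as follows. This is only a sketch: the function name find_stuck_shells, the parent process names "php" and "php-cgi", and the choice of ps columns are assumptions for illustration, not anything taken from this setup. A sleeping sh is not necessarily stuck, so the result is a list of candidates to inspect, not a kill list.

```python
def find_stuck_shells(ps_output, parent_names=("php", "php-cgi"), shell_names=("sh",)):
    """Return PIDs of sleeping shell processes whose parent is a PHP process.

    `ps_output` is expected to be the text printed by
    `ps -eo pid,ppid,stat,comm` (header row included).
    The process names are assumptions; adjust them to the local setup.
    """
    rows = []
    for line in ps_output.strip().splitlines()[1:]:  # skip the header row
        parts = line.split(None, 3)
        if len(parts) == 4:
            pid, ppid, stat, comm = parts
            rows.append((int(pid), int(ppid), stat, comm.strip()))

    # PIDs of everything that looks like a PHP FastCGI worker.
    php_pids = {pid for pid, _, _, comm in rows if comm in parent_names}

    # Shell children of those workers that are sleeping (S) or in
    # uninterruptible sleep (D).
    return [pid for pid, ppid, stat, comm in rows
            if comm in shell_names and ppid in php_pids and stat[0] in "SD"]
```

Feeding it the output of `ps -eo pid,ppid,stat,comm` gives the PIDs to look at; whether to kill them is still a manual decision.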

Updated by Anonymous about 18 years ago

OK, it turns out this was an extremely obscure issue between different versions of the NFS server and NFS client. PHP opened a session file on our NFS server, which is quite a bit older than the NFS client. Apparently there is some incompatibility between these versions, which was causing the strange flock behavior.

I moved the PHP sessions to a same-kernel-version NFS server and the problem is cured.

You can close this ticket.

Updated by darix about 18 years ago

ok ... i still don't see how lighty can crash on barfed php scripts o.O

Updated by Anonymous about 18 years ago

The behavior was odd. It was positively triggered when PHP opened a session file and did fcntl(LOCK_EX) on it. The Linux kernel would hang when cloning that file descriptor as part of the fork/exec to launch a child (e.g. when sending mail). So there would be a /bin/bash process in sleep state, with no memory allocated to it yet. An strace on that process would wake it up and resume normal execution (it got a SIGSTOP and SIGCONT pair, which got it past whatever system call was deadlocked).
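
For anyone trying to check whether a given session directory shows the same behavior, the lock-then-fork sequence above can be approximated with a small probe. This is a sketch under assumptions, not what PHP does internally: lock_then_spawn is a made-up name, the scratch file stands in for a session file, and Python's fcntl.lockf is used as the exclusive fcntl-style lock. The timeout is best-effort only, since a child stuck inside the kernel may not respond to it.

```python
import fcntl
import os
import subprocess
import tempfile


def lock_then_spawn(directory, timeout=10.0):
    """Take an exclusive fcntl lock on a scratch file in `directory`, then
    launch a shell child that inherits the locked descriptor.

    Returns the child's exit status, or None if it did not finish in time.
    """
    fd, path = tempfile.mkstemp(prefix="sess_probe_", dir=directory)
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX)  # exclusive POSIX (fcntl) lock
        try:
            # pass_fds keeps the locked descriptor open in the child,
            # as it would be across a plain fork/exec.
            done = subprocess.run(["/bin/sh", "-c", "exit 0"],
                                  pass_fds=(fd,), timeout=timeout)
        except subprocess.TimeoutExpired:
            return None
        return done.returncode
    finally:
        os.close(fd)     # closing the descriptor also releases the lock
        os.unlink(path)  # remove the scratch file
```

On a local filesystem this should simply return 0; pointing `directory` at the NFS-mounted session path is the interesting case.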

Now, when this occurred, PHP would also stop properly handling new incoming requests. The whole FastCGI engine got gummed up, and new FCGI requests would start backing up rapidly (fastcgi.active-requests and fastcgi.load would grow rapidly). If I didn't clean out the stuck /bin/bash child fast enough, eventually lighty/php would get into a state where I would be forced to kill lighty+php and restart it. If it got to this state, it would not start serving FCGI/PHP again even if I did kill the /bin/bash child. I don't know if the FCGI state was confused on the lighty side or the PHP side. It's also possible that something else in the PHP/FCGI code was deadlocking on the same NFS problem, but in a different way. I really don't know.
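
The backlog described above could be watched for automatically by sampling the two counters. The sketch below assumes they are available as plain text in "name: value" lines (the format I would expect from a statistics page such as the one mod_status can serve); parse_counters, is_backing_up and the growth threshold are invented for illustration.

```python
def parse_counters(text):
    """Parse 'name: value' lines into a dict of integer counters."""
    counters = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        # Keep only well-formed lines with an integer value.
        if sep and value.strip().lstrip("-").isdigit():
            counters[name.strip()] = int(value)
    return counters


def is_backing_up(samples, key="fastcgi.active-requests", min_growth=1):
    """True if `key` rose by at least `min_growth` between every pair of
    consecutive samples (each sample is a dict from parse_counters)."""
    values = [sample.get(key, 0) for sample in samples]
    return len(values) >= 2 and all(
        later - earlier >= min_growth
        for earlier, later in zip(values, values[1:])
    )
```

Taking a sample every few seconds and checking the last handful with is_backing_up would flag the steady growth early enough to go looking for the stuck child.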

I had assumed earlier that the problem was lighty's FCGI processing becoming confused as a result of something PHP or its child was doing. Now I know the problems were really in PHP and the Linux kernel.

Updated by darix about 18 years ago

Status changed from New to Fixed
Resolution set to invalid

1. the group issue should be solved in 1.4.14 (to be released)
2. your valgrind crashed and not lighty.
3. this is not the place to discuss php nfs locking issues.

that said.... closing. ;)

Updated by stbuehler over 16 years ago

Status changed from Fixed to Invalid
