Bug #2121

closed

fastcgi failover not working

Added by mm over 14 years ago. Updated about 14 years ago.

Status: Missing Feedback
Priority: Normal
Category: -
Target version: -

Description

The failover of FastCGI is not working in 1.4.25.

Here is an extract of my configuration file:

fastcgi.server = ( ".php" =>
  (
    ( "host" => "192.168.0.10",
      "port" => 9000
    ),
    ( "host" => "192.168.0.11",
      "port" => 9000
    )
  )
)

FastCGI is started via spawn-fcgi. The behaviour is as follows:

1.) Both servers have php-fastcgi running: everything works well and requests are processed correctly.
2.) One server has FastCGI not running: the first request is processed by the second server, but all subsequent requests to the first backend return 503 errors while the failed backend is disabled. After it is re-enabled, the first request is processed, and then all subsequent requests fail again.
3.) Both servers have FastCGI disabled: 500 or 503 is returned, which is the correct behaviour.

I have reproduced this problem on OpenSolaris and on FreeBSD.

Log file is attached.

I have done some debugging and found that the cause of the "connection was dropped after accept() (perhaps the fastcgi process died)" message is the errno ENOTCONN.


Files

lighttpd.log (2.6 KB) - Log output - mm, 2009-12-16 12:03

Related issues 2 (0 open, 2 closed)

Related to Bug #2316: FreeBSD fastcgi broken: connection was dropped after accept() (Missing Feedback, 2011-05-12)
Related to Bug #2329: connection was dropped after accept() (Missing Feedback, 2011-07-26)
Actions #1

Updated by stbuehler over 14 years ago

  1. OK, the balance problems first: I just tried it locally, killing and restarting two PHP backends, and lighty always found the working one (after disable-time had elapsed, which is 1 second by default). So I think that part works as it should, and I guess your problems have something to do with the second part.
  2. Yes, I know the errno is ENOTCONN. And I think I'm going to blame the operating system. I don't think it is our fault (connect was successful, so we are connected, and ENOTCONN doesn't make sense).
    If you can reproduce it without high load (I have only received reports of this problem from sites with many PHP requests), it would be nice if you could provide a ktrace of the syscalls.
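
For reference, the retry interval mentioned in point 1 can be set per backend. The following is a minimal sketch, assuming the "disable-time" host option of mod_fastcgi in lighttpd 1.4 and reusing the hosts from the original report. The value of 5 seconds is only an illustration, not a recommendation from this thread.

```
# Sketch only: "disable-time" is the number of seconds a failed backend
# stays disabled before lighttpd tries it again (stated above to be 1 by default).
# The value 5 is an assumed example.
fastcgi.server = ( ".php" =>
  (
    ( "host" => "192.168.0.10",
      "port" => 9000,
      "disable-time" => 5
    ),
    ( "host" => "192.168.0.11",
      "port" => 9000,
      "disable-time" => 5
    )
  )
)
```

A longer interval means fewer retries against a dead backend, at the cost of taking longer to notice that it has come back.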
Actions #2

Updated by stbuehler over 14 years ago

  • Status changed from New to Missing Feedback
  • Target version deleted (1.4.26)
Actions #3

Updated by mm about 14 years ago

It seems to be related to the TCP_NODELAY option as well.
OpenSolaris sets this in a patch to lighttpd, and on FreeBSD the related delayed-ACK behaviour is controlled by kernel tunables:
net.inet.tcp.delayed_ack
net.inet.tcp.delacktime

Turning it off improves the situation, but there are still some requests (far fewer than before) that are not passed through. In contrast, running lighttpd on the target FastCGI servers and sending requests via mod_proxy works without any requests being dropped.

Another problem I noticed: lighttpd checks for the existence of local files even if a request should be forwarded to a remote FastCGI server.

Actions #4

Updated by stbuehler about 14 years ago

There is an option to disable the local-file check.
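
The comment does not name the option. In lighttpd 1.4's mod_fastcgi it is most likely "check-local", so the following is a sketch under that assumption, applied to the hosts from the original report:

```
# Sketch only: assumes the option meant above is "check-local".
# With "check-local" => "disable", lighttpd forwards the request to the
# backend without first looking for the file under the local document root.
fastcgi.server = ( ".php" =>
  (
    ( "host" => "192.168.0.10",
      "port" => 9000,
      "check-local" => "disable"
    ),
    ( "host" => "192.168.0.11",
      "port" => 9000,
      "check-local" => "disable"
    )
  )
)
```

With the check disabled, the remote FastCGI server is responsible for returning an error when a requested script does not exist.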

Actions #5

Updated by gstrauss 4 months ago

  • Related to Bug #2316: FreeBSD fastcgi broken: connection was dropped after accept() added
Actions #6

Updated by gstrauss 4 months ago

  • Related to Bug #2329: connection was dropped after accept() added