Bug #3007

closed

Infinite loop at startup when backend binary fails

Added by Anonymous about 4 years ago. Updated over 3 years ago.

Status:
Invalid
Priority:
Low
Category:
core
Target version:
-
ASK QUESTIONS IN Forums:
No

Description

When testing a degraded case of a failing FastCGI application, I intermittently observe an infinite loop when starting the server.
When starting the service manually, the log is generally:

Feb 14 14:00:34.410 daemon.err lighttpd: (../../lighttpd-1.4.52/src/server.c.1457) server started (lighttpd/1.4.52) 
Feb 14 14:00:34.410 daemon.err lighttpd: (../../lighttpd-1.4.52/src/gw_backend.c.1465) --- gw spawning local \n\tproc: /bin/false \n\tport: 0 \n\tsocket /var/run/lighttpd/foo.socket \n\tmin-procs: 1 \n\tmax-procs: 1 
Feb 14 14:00:34.410 daemon.err lighttpd: (../../lighttpd-1.4.52/src/gw_backend.c.1489) --- gw spawning \n\tport: 0 \n\tsocket /var/run/lighttpd/foo.socket \n\tcurrent: 0 / 1 
Feb 14 14:00:34.410 daemon.err lighttpd: (../../lighttpd-1.4.52/src/gw_backend.c.458) new proc, socket: 0 /var/run/lighttpd/foo.socket-0 
Feb 14 14:00:34.410 daemon.err lighttpd: (../../lighttpd-1.4.52/src/gw_backend.c.328) child exited: 1 unix:/var/run/lighttpd/foo.socket-0 
Feb 14 14:00:34.410 daemon.err lighttpd: (../../lighttpd-1.4.52/src/gw_backend.c.606) gw-backend failed to start: /bin/false 
Feb 14 14:00:34.410 daemon.err lighttpd: (../../lighttpd-1.4.52/src/gw_backend.c.608) If you're trying to run your app as a FastCGI backend, make sure you're using the FastCGI-enabled version.  If this is PHP on Gentoo, add 'fastcgi' to the USE flags.  If this is PHP, try removing the bytecode caches for now and try again. 
Feb 14 14:00:34.410 daemon.err lighttpd: (../../lighttpd-1.4.52/src/gw_backend.c.1503) [ERROR]: spawning gw failed. 
Feb 14 14:00:34.410 daemon.err lighttpd: (../../lighttpd-1.4.52/src/server.c.1465) Configuration of plugins failed. Going down. 

But sometimes, and every time at startup, the log is instead:

Feb 14 13:58:59.300 daemon.err lighttpd: (../../lighttpd-1.4.52/src/server.c.1457) server started (lighttpd/1.4.52) 
Feb 14 13:58:59.300 daemon.err lighttpd: (../../lighttpd-1.4.52/src/gw_backend.c.1465) --- gw spawning local \n\tproc: /bin/false \n\tport: 0 \n\tsocket /var/run/lighttpd/foo.socket \n\tmin-procs: 1 \n\tmax-procs: 1 
Feb 14 13:58:59.300 daemon.err lighttpd: (../../lighttpd-1.4.52/src/gw_backend.c.1489) --- gw spawning \n\tport: 0 \n\tsocket /var/run/lighttpd/foo.socket \n\tcurrent: 0 / 1 
Feb 14 13:58:59.300 daemon.err lighttpd: (../../lighttpd-1.4.52/src/gw_backend.c.458) new proc, socket: 0 /var/run/lighttpd/foo.socket-0 
Feb 14 13:58:59.310 daemon.err lighttpd: (../../lighttpd-1.4.52/src/gw_backend.c.328) child exited: 1 unix:/var/run/lighttpd/foo.socket-0 
Feb 14 13:58:59.310 daemon.err lighttpd: (../../lighttpd-1.4.52/src/gw_backend.c.458) new proc, socket: 0 /var/run/lighttpd/foo.socket-0 
Feb 14 13:58:59.310 daemon.err lighttpd: (../../lighttpd-1.4.52/src/gw_backend.c.475) unlink /var/run/lighttpd/foo.socket-0 after connect failed: Connection refused 
Feb 14 13:58:59.320 daemon.err lighttpd: (../../lighttpd-1.4.52/src/gw_backend.c.328) child exited: 1 unix:/var/run/lighttpd/foo.socket-0 
Feb 14 13:59:01.320 daemon.err lighttpd: (../../lighttpd-1.4.52/src/gw_backend.c.1033) --- gw spawning \n\tsocket unix:/var/run/lighttpd/foo.socket-0 \n\tcurrent: 1 / 1 
Feb 14 13:59:01.320 daemon.err lighttpd: (../../lighttpd-1.4.52/src/gw_backend.c.458) new proc, socket: 0 /var/run/lighttpd/foo.socket-0 
Feb 14 13:59:01.320 daemon.err lighttpd: (../../lighttpd-1.4.52/src/gw_backend.c.475) unlink /var/run/lighttpd/foo.socket-0 after connect failed: Connection refused 
Feb 14 13:59:01.320 daemon.err lighttpd: (../../lighttpd-1.4.52/src/gw_backend.c.328) child exited: 1 unix:/var/run/lighttpd/foo.socket-0 
Feb 14 13:59:01.320 daemon.err lighttpd: (../../lighttpd-1.4.52/src/gw_backend.c.606) gw-backend failed to start: /bin/false 
Feb 14 13:59:01.320 daemon.err lighttpd: (../../lighttpd-1.4.52/src/gw_backend.c.608) If you're trying to run your app as a FastCGI backend, make sure you're using the FastCGI-enabled version.  If this is PHP on Gentoo, add 'fastcgi' to the USE flags.  If this is PHP, try removing the bytecode caches for now and try again. 
Feb 14 13:59:01.320 daemon.err lighttpd: (../../lighttpd-1.4.52/src/gw_backend.c.1040) ERROR: spawning gw failed. 
Feb 14 13:59:02.320 daemon.err lighttpd: (../../lighttpd-1.4.52/src/gw_backend.c.1033) --- gw spawning \n\tsocket unix:/var/run/lighttpd/foo.socket-0 \n\tcurrent: 1 / 1 
Feb 14 13:59:02.320 daemon.err lighttpd: (../../lighttpd-1.4.52/src/gw_backend.c.458) new proc, socket: 0 /var/run/lighttpd/foo.socket-0 
Feb 14 13:59:02.320 daemon.err lighttpd: (../../lighttpd-1.4.52/src/gw_backend.c.475) unlink /var/run/lighttpd/foo.socket-0 after connect failed: Connection refused 
Feb 14 13:59:02.320 daemon.err lighttpd: (../../lighttpd-1.4.52/src/gw_backend.c.328) child exited: 1 unix:/var/run/lighttpd/foo.socket-0 
Feb 14 13:59:02.320 daemon.err lighttpd: (../../lighttpd-1.4.52/src/gw_backend.c.606) gw-backend failed to start: /bin/false 
Feb 14 13:59:02.320 daemon.err lighttpd: (../../lighttpd-1.4.52/src/gw_backend.c.608) If you're trying to run your app as a FastCGI backend, make sure you're using the FastCGI-enabled version.  If this is PHP on Gentoo, add 'fastcgi' to the USE flags.  If this is PHP, try removing the bytecode caches for now and try again. 
Feb 14 13:59:02.320 daemon.err lighttpd: (../../lighttpd-1.4.52/src/gw_backend.c.1040) ERROR: spawning gw failed. 
Feb 14 13:59:03.320 daemon.err lighttpd: (../../lighttpd-1.4.52/src/gw_backend.c.1033) --- gw spawning \n\tsocket unix:/var/run/lighttpd/foo.socket-0 \n\tcurrent: 1 / 1 
... repeating every second

The version of lighttpd is 1.4.52, compiled with the Yocto Project on an i686 platform (Intel Atom, no hyperthreading).
The bin-path parameter is intentionally set to /bin/false to simulate a corrupted or buggy application.
When bin-path points to an existing application, the behaviour is correct.
The complete configuration file is attached.


Files

lighttpd.conf (665 Bytes), added by Anonymous, 2020-02-14 14:02
Actions #1

Updated by gstrauss about 4 years ago

  • Status changed from New to Invalid
  • Priority changed from Normal to Low
  • Target version deleted (1.4.x)

Why do you think this is a problem with lighttpd? You have made a mistake with your backend and lighttpd keeps retrying each second so that things will work once you fix your mistake.

Why don't you try having lighttpd execute the script "/bin/my-custom-script"? Have the contents be /bin/false. You'll see the same behavior. Then change the contents of that script to something that works, and you'll see lighttpd stop retrying.

Actions #2

Updated by Anonymous about 4 years ago

Thank you for your answer.
I understand your point: correct behaviour under normal conditions is the main concern.
I'm trying to simulate a failing application, to anticipate a potential issue. That's not a correct usage, but it could happen. I'm working on an embedded system and I need to keep control of disk usage. I must also ensure that failures do not propagate across the system.
The point that make me think there is a problem with lightppd is that the behaviour depends on undefined conditions. I'd like to understand the cause in order to find a correct containment.

Actions #3

Updated by gstrauss about 4 years ago

lighttpd's behavior is well-defined and intentional.

On the other hand, what you are saying is nonsensical.

Actions #4

Updated by Anonymous about 4 years ago

It seems my communication is bad enough to justify unpleasant conclusion.
I won't bother you anymore

Actions #5

Updated by gstrauss about 4 years ago

It seems my communication is bad enough to justify unpleasant conclusion.

I think the communication is poor and you have not communicated well. You also do not seem to understand the difference between a question and a bug report.

The "Forum" tab at the top of the page links to the discussion forum. You did not post there with a question. Instead, you posted a bug here, and have been told that the lighttpd behavior is intended, and therefore not a bug.

Actions #6

Updated by gstrauss over 3 years ago

  • ASK QUESTIONS IN Forums set to No

The point that make me think there is a problem with lightppd (sic) is that the behaviour depends on undefined conditions.

When lighttpd is not configured to start up backends (and backends are started up independently), lighttpd starts up and at runtime will retry backends to which lighttpd fails to connect.
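
As a rough illustration of "retry a backend that cannot be reached", here is a minimal Python sketch. It is not lighttpd's implementation (lighttpd is written in C and its retry logic is more involved); the function names `try_connect` and `connect_with_retry`, the `attempts` limit, and the one-second default delay are assumptions chosen for the example.

```python
import socket
import time


def try_connect(path, timeout=1.0):
    """Try once to connect to a Unix-domain socket; return the socket or None."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect(path)
    except OSError:
        # Covers "Connection refused", a missing socket file, timeouts, etc.
        sock.close()
        return None
    return sock


def connect_with_retry(path, attempts, delay_seconds=1.0, sleep=time.sleep):
    """Retry try_connect() up to `attempts` times, pausing between failures.

    Returns (socket_or_None, number_of_attempts_made).
    `sleep` is injectable so the pause can be stubbed out (hypothetical design).
    """
    for attempt in range(1, attempts + 1):
        sock = try_connect(path)
        if sock is not None:
            return sock, attempt
        if attempt < attempts:
            sleep(delay_seconds)
    return None, attempts
```

With a backend that never comes up, every attempt fails in the same way, which matches the repeating pattern in the second log above. Once the backend is fixed and listening, the next attempt succeeds and the retries stop.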

When lighttpd is configured to start up backends, lighttpd attempts to start the backends up during lighttpd startup, and does a quick waitpid() after doing so, to see if the backend process exited quickly. If the backend process exited quickly and before lighttpd calls waitpid(), lighttpd reports this and exits with an error.

The steps taken by lighttpd are well-defined. However, since lighttpd continues starting up and cannot know how long to wait for the backend to start up, the behavior takes one of the two states you see, depending on how quickly the backend failed and on whether lighttpd (independently) called waitpid() before or after the backend exited. /bin/false tends to exit quickly. Something like /usr/bin/php startup is not something I would describe as 'quick'.

There are many different ways that a backend can fail. If a backend fails very quickly at startup, lighttpd treats this as a fatal error, since the error is unlikely to be transient.
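
The timing dependence described above can be reproduced with a minimal sketch. This is Python rather than lighttpd's C, and it is not lighttpd's code: `spawn_and_check` and its `settle_seconds` parameter are hypothetical names for this example. It forks and execs a program, then polls exactly once with a non-blocking `waitpid()`. Whether that single poll sees the exit depends on whether the child has already finished.

```python
import os
import sys
import time


def spawn_and_check(argv, settle_seconds=0.0):
    """Fork/exec argv, then poll once with a non-blocking waitpid().

    Returns ("exited", code) or ("signaled", signo) if the single poll
    already reaped the child, or ("running", pid) if it had not exited yet.
    """
    pid = os.fork()
    if pid == 0:
        # Child: replace the process image; exit 127 if exec fails.
        try:
            os.execv(argv[0], argv)
        except OSError:
            pass
        os._exit(127)
    if settle_seconds > 0:
        # Optional pause before polling (an assumption for demonstration).
        time.sleep(settle_seconds)
    # WNOHANG: return (0, 0) immediately if the child is still alive.
    reaped, status = os.waitpid(pid, os.WNOHANG)
    if reaped == 0:
        return ("running", pid)
    if os.WIFEXITED(status):
        return ("exited", os.WEXITSTATUS(status))
    return ("signaled", os.WTERMSIG(status))


if __name__ == "__main__":
    # Default to /bin/false, as in the report; pass another command to compare.
    argv = sys.argv[1:] or ["/bin/false"]
    for trial in range(5):
        outcome, value = spawn_and_check(argv)
        print(trial, outcome, value)
        if outcome == "running":
            os.waitpid(value, 0)  # reap the child so no zombie is left
```

Run repeatedly with a fast-failing program, the immediate poll may report either outcome from one trial to the next, depending on scheduling. That corresponds to the two log patterns in the report: a failure caught by the quick check is treated as fatal at startup, while one that is missed is handled later by the runtime retry path.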
