Bug #2775

lighttpd 1.4.42, 1.4.43 hangs during continuous polling

Added by smikesmith over 2 years ago. Updated over 2 years ago.

Status: Duplicate
Priority: Normal
Assignee: -
Category: mod_cgi
Target version:
Start date: 2016-12-12
Due date:
% Done: 0%
Estimated time:
Missing in 1.5.x:

Description

This sounds vaguely like issue #2771, but I get no error messages, and lighttpd continues to run, just refuses further connections or any activity. (This has not been observed in 1.4.41 or prior).

While polling continuously for new data through the CGI channel, lighttpd eventually (after 10 to 30 minutes) simply stops exchanging data or accepting new connections. This occurs with a single browser connection that uses PHP and CGI to load a dynamic page and then continuously refreshes data through CGI.

The configuration can be made available, but it essentially proxies 'api/v1/...' to a local nodejs server on localhost:8080. This worked previously with 1.4.34, but our application now requires the improved proxy features of 1.4.40+ (minimally buffered pass-through, i.e., server.stream-request-body=2), hence the interest in upgrading.

Wireshark likewise shows that HTTP activity simply stops unexpectedly following an access.

Modules in use are mod_cgi, mod_fastcgi, mod_proxy, mod_access, mod_auth

lighttpd_log_last_8_minutes (1.12 MB): log file up to "lock" (smikesmith, 2016-12-12 17:32)
lighttpd_strace_last_8_minutes (4.4 MB): strace up to and just after "lock" (smikesmith, 2016-12-12 17:32)

Related issues

Is duplicate of Bug #2771: With mod_cgi I am getting sockets disabled, out-of-fds error (Fixed, 2016-11-24)

History

#1

Updated by gstrauss over 2 years ago

Hmm. Thanks for the report. I will look more closely at strace later.

Are the PHP requests going through CGI or FastCGI, or are you proxying to another backend? Would you share your lighttpd.conf?

There are some fixes post 1.4.43, so you might want to test with tip of lighttpd git master to see if the problem has already been fixed.

#2

Updated by gstrauss over 2 years ago

  • Status changed from New to Need Feedback

Looking at the strace, I see that after 16:38:18, lighttpd event loop is still waking up every second, checking the time, and then going back into the kernel for another second of epoll_wait() while lighttpd waits for new connections. That behavior looks proper and expected to me. lighttpd does not appear to be hung.
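
One way to repeat this kind of check on a long trace is to tally which syscalls appear after a given time. The sketch below is only an illustration: it assumes the trace was captured with timestamps at the start of each line (as `strace -t` or `-tt` produces, optionally preceded by a PID), and the function name `syscalls_after` is hypothetical. The attached trace may be formatted differently.

```python
import re
from collections import Counter

# Assumed line shape: optional PID, then HH:MM:SS (optionally with a
# fractional part), then the syscall name followed by "(".
LINE = re.compile(r"^(?:\d+\s+)?(\d{2}:\d{2}:\d{2})(?:\.\d+)?\s+(\w+)\(")


def syscalls_after(lines, since):
    """Count syscalls per name on lines timestamped at or after `since`.

    `since` is an "HH:MM:SS" string; plain string comparison is enough
    as long as the trace does not cross midnight.
    """
    counts = Counter()
    for line in lines:
        match = LINE.match(line)
        if match and match.group(1) >= since:
            counts[match.group(2)] += 1
    return counts
```

If the tally after the suspected hang shows epoll_wait() repeating but no accept() or read() calls, the process is idle and waiting for events rather than stuck.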

Are you sure that lighttpd has "hung"? Did you try to 'telnet localhost 80' on the same box to see if lighttpd is still responding? Is your networking still up? Did your firewall get reconfigured to block connections from the PHP (or other networks external to the machine)? Did SELinux or some other security mechanism start blocking or dropping external connections? (I know this is a stretch but lighttpd does not appear to be hung.) Look at your system to see what else is running that might potentially interfere.
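
The 'telnet localhost 80' check can also be scripted so it can be run repeatedly while the problem is being reproduced. This is a minimal sketch, not part of lighttpd: the function name `probe_http` and the default timeout are assumptions, and it sends a bare HEAD request for `/`, which may need adjusting for a given site.

```python
import socket


def probe_http(host="localhost", port=80, timeout=5.0):
    """Connect, send a HEAD request, and return a short status string.

    Returns the HTTP status line on success, or "refused", "timeout",
    "closed without response", or "error: ..." otherwise.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            request = f"HEAD / HTTP/1.0\r\nHost: {host}\r\n\r\n"
            sock.sendall(request.encode("ascii"))
            first = sock.recv(256)
    except ConnectionRefusedError:
        return "refused"
    except socket.timeout:
        return "timeout"
    except OSError as exc:
        return f"error: {exc}"
    if not first:
        return "closed without response"
    # Only the first line of the response (the status line) is reported.
    return first.split(b"\r\n", 1)[0].decode("latin-1")
```

Calling `probe_http("localhost", 80)` on the server itself and again from a remote client helps separate a server that has stopped answering from a network or firewall problem: a status line locally but a timeout remotely points away from lighttpd.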

.

Likely unrelated: looking at https://redmine.lighttpd.net/boards/2/topics/6956, you might want to check the placement of the parentheses around "php-num-procs" in your fastcgi.conf. I see the comment there, but the config still looks incorrect to me. The fastcgi.server value is a key => value list (keyed by extension or path) in which each value is itself a key => value list of host-label => (key => value list of host config):

  fastcgi.server = ( "ext1" => ( "label1" => ( "host" => "127.0.0.1", ... ),
                                 "label2" => ( "host" => "127.0.0.1", ... ),
                               ),
                     "ext2" => ( "label3" => ( "host" => "127.0.0.1", ... ),
                                 "label4" => ( "host" => "127.0.0.1", ... ),
                               ),
                   )

The "label" is optional, so you could have something that looks like:

  fastcgi.server = ( "ext1" => ( ( "host" => "127.0.0.1", ... ),
                                 ( "host" => "127.0.0.2", ... ),
                               ),
                     "ext2" => ( ( "host" => "127.0.0.3", ... ),
                                 ( "host" => "127.0.0.4", ... ),
                               ),
                   )

#3

Updated by gstrauss over 2 years ago

BTW, for your build of lighttpd 1.4.43, you probably want to include the post-1.4.43 patches 99925202 (which you mentioned) as well as 5bf5e1ad.
Please test with those (or with tip of lighttpd git master).

#4

Updated by smikesmith over 2 years ago

Thanks greatly for your feedback. We use fastcgi for PHP, I think. That's from someone else, so I'll have him look at your comments about config.

Yes, lighttpd was still running as shown by the strace, but by "locked", I meant that it simply stopped serving existing connections or taking new ones. In short, it stopped being a server, and the only recourse appeared to be to restart it.

As for 2775, so far the patch 99925202 seems to have cured the problem, and we're moving forward on testing. We need to stream with minimal memory use because we are pushing backups much larger than the actual RAM size, so that will be a great feature. I'll post again if we find other problems, but meanwhile, it looks like we're set!

#5

Updated by gstrauss over 2 years ago

  • Is duplicate of Bug #2771: With mod_cgi I am getting sockets disabled, out-of-fds error added

#6

Updated by gstrauss over 2 years ago

  • Category set to mod_cgi
  • Status changed from Need Feedback to Duplicate

Thanks for the update. If you are sending POST requests to CGI using lighttpd 1.4.43, then 99925202 is also needed, as you have found.

#7

Updated by gstrauss over 2 years ago

  • Target version changed from 1.4.x to 1.4.44
