Bug #1854
closed
Lighty 1.4.20 leaks memory and stability problem with backend
Description
I'm running lighttpd-1.4.20 and PHP 5.2.4 (cgi-fcgi) with PHP scripts that connect to other webservers. When those webservers become unresponsive, Lighty times out after 360 seconds and receives only partial content. Lighty's memory usage grows by about 20MB a day.
Because I run those applications on a VPS with limited RAM, I cannot attach Valgrind to the process. The traffic pattern is unpredictable, but I strongly suspect that the partial content combined with the time-out is the cause.
So I wrote a simple test program (lighty-leaky.php, attached) which sleeps for 400 seconds before completing the transaction. This simulates the situation described above. Traffic is simulated with:
ab -n 100 -c 80 -k http://f8/lighty-leaky.php?loop=10800
ab -n 100 -c 50 http://f8/lighty-leaky.php?loop=10800
The 'loop' parameter controls the size of the partial content. Lighty grew by close to 7MB after a few tries.
% top -bn 1 | grep lighttpd
11493 apache 20 0 5784 988 624 S 0 0.2 0:00.00 lighttpd
after
11493 apache 20 0 12508 7932 832 S 0 1.5 0:02.28 lighttpd
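As a rough check of the "close to 7MB" figure, the growth in resident memory can be computed from the RES column of the two top samples. This is only a sketch: `res_kib` is a hypothetical helper, and it assumes the default top column order (PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND), where RES is the sixth field and is reported in KiB.

```shell
#!/bin/sh
# Hypothetical helper (not part of lighttpd or top): print the RES column
# (sixth field, KiB) from a single line of `top -b` output.
res_kib() {
    printf '%s\n' "$1" | awk '{ print $6 }'
}

# The two samples from the report: before and after the ab runs.
before='11493 apache 20 0 5784 988 624 S 0 0.2 0:00.00 lighttpd'
after='11493 apache 20 0 12508 7932 832 S 0 1.5 0:02.28 lighttpd'

# Difference in resident memory between the two samples.
growth=$(( $(res_kib "$after") - $(res_kib "$before") ))
echo "RES grew by ${growth} KiB"
```

With the samples above this reports a growth of 6944 KiB, roughly 6.8MB, which matches the figure quoted in the description.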
And the following errors appeared in the log:
2008-12-19 00:41:44: (server.c.1247) NOTE: a request for /lighty-leaky.php?loop=10800 timed out after writing 2758099 bytes. We waited 20 seconds. If this a problem increase server.max-write-idle
2008-12-19 00:42:33: (mod_fastcgi.c.2926) backend is overloaded; we'll disable it for 2 seconds and send the request to another backend instead: reconnects: 0 load: 130
2008-12-19 00:42:33: (mod_fastcgi.c.3568) all handlers for /lighty-leaky.php on .php are down.
2008-12-19 00:42:36: (mod_fastcgi.c.2681) fcgi-server re-enabled: 0 /var/run/lighttpd/php-fastcgi.socket
The first log entry recurred multiple times, presumably once per request. I set server.max-write-idle = 20 in the configuration to shorten the wait.
After that, Lighty never managed to reconnect to the backend. Requests returned a 500 error.
I'm using "mod_rewrite", "mod_access", "mod_auth", "mod_status", "mod_setenv", "mod_fastcgi", "mod_simple_vhost", "mod_compress", "mod_expire", "mod_rrdtool", "mod_accesslog". My FastCGI configuration:
fastcgi.server = ( ".php" =>
  ( "localhost" =>
    (
      "socket" => "/var/run/lighttpd/php-fastcgi.socket",
      "bin-path" => "/usr/bin/php-cgi",
      "bin-environment" => (
        "PHP_FCGI_CHILDREN" => "32",
        "PHP_FCGI_MAX_REQUESTS" => "4000"
      ),
      "bin-copy-environment" => ( "PATH", "SHELL", "USER" ),
      "min-procs" => 1,
      "max-procs" => 1,
      "max-load-per-proc" => 8,
      "idle-timeout" => 50,
      "broken-scriptfilename" => "enable"
    )
  )
)
Let me know if I can be of further assistance.
Updated by jiwei almost 16 years ago
Lighttpd did manage to reconnect to the backend php-cgi after a few hours (could be shorter), but not in 2 seconds as claimed in the log.
Updated by stbuehler about 15 years ago
- Priority changed from High to Normal
- Target version set to 1.4.x
Memory leak? Prove it with valgrind (there are other reasons for growing memory usage... including buffer reuse and fragmentation).
Updated by stbuehler about 15 years ago
- Status changed from New to Duplicate
- Target version deleted (1.4.x)
- Missing in 1.5.x set to No
I think the "all backends down" bug is fixed; see r2657 and #1825.