Bug #758

memory fragmentation leads to high memory usage after peaks

Added by Anonymous about 8 years ago. Updated over 4 years ago.

Status: Invalid
Start date:
Priority: Normal
Due date:
Assignee: -
% Done: 0%
Category: core
Target version: -
Missing in 1.5.x: No

Description

Summary: lighttpd of course had some memory leaks (and perhaps still has some today), but this bug is not about those problems.
The main problem here is that the memory gets fragmented, which is why malloc()/free() do not return the memory to the system; the memory is not lost to lighttpd, and lighttpd/malloc() will reuse it.

-- stbuehler


lighttpd-1.4.11 (Jul 12 2006 20:07:41) - a light and fast webserver
has a memory leak: about 60 megabytes accumulate each day.


server.modules = (
 "mod_rewrite",
 "mod_redirect",
 "mod_alias",
 "mod_access",
 "mod_cml",
 "mod_auth",
 "mod_evasive",
 "mod_status",
 "mod_setenv",
 "mod_fastcgi",
 "mod_proxy",
 "mod_simple_vhost",
 "mod_evhost",
 "mod_userdir",
 "mod_cgi",
 "mod_scgi",
 "mod_compress",
 "mod_ssi",
 "mod_flv_streaming",
 "mod_webdav",
 "mod_usertrack",
 "mod_expire",
 "mod_secdownload",
 "mod_rrdtool",
 "mod_accesslog" )
rrdtool.binary="/usr/bin/rrdtool" 
rrdtool.db-name="/opt/lighttpd/lighttpd.rrd" 
server.document-root        = "/opt/lighttpd/root/" 
allow-x-send-file="enable" 
webdav.activate="disable" 
webdav.is-readonly="enable" 
index-file.names            = ( "index.php", "index.html",
                                "index.htm", "default.htm" )
mimetype.assign             = (
  ".pdf"          =>      "application/pdf",
  ".sig"          =>      "application/pgp-signature",
  ".spl"          =>      "application/futuresplash",
  ".class"        =>      "application/octet-stream",
  ".ps"           =>      "application/postscript",
  ".torrent"      =>      "application/x-bittorrent",
  ".dvi"          =>      "application/x-dvi",
  ".gz"           =>      "application/x-gzip",
  ".pac"          =>      "application/x-ns-proxy-autoconfig",
  ".swf"          =>      "application/x-shockwave-flash",
  ".tar.gz"       =>      "application/x-tgz",
  ".tgz"          =>      "application/x-tgz",
  ".tar"          =>      "application/x-tar",
  ".zip"          =>      "application/zip",
  ".mp3"          =>      "audio/mpeg",
  ".m3u"          =>      "audio/x-mpegurl",
  ".wma"          =>      "audio/x-ms-wma",
  ".wax"          =>      "audio/x-ms-wax",
  ".ogg"          =>      "application/ogg",
  ".wav"          =>      "audio/x-wav",
  ".gif"          =>      "image/gif",
  ".jpg"          =>      "image/jpeg",
  ".jpeg"         =>      "image/jpeg",
  ".png"          =>      "image/png",
  ".xbm"          =>      "image/x-xbitmap",
  ".xpm"          =>      "image/x-xpixmap",
  ".xwd"          =>      "image/x-xwindowdump",
  ".css"          =>      "text/css",
  ".html"         =>      "text/html",
  ".htm"          =>      "text/html",
  ".js"           =>      "text/javascript",
  ".asc"          =>      "text/plain",
  ".c"            =>      "text/plain",
  ".cpp"          =>      "text/plain",
  ".log"          =>      "text/plain",
  ".conf"         =>      "text/plain",
  ".text"         =>      "text/plain",
  ".txt"          =>      "text/plain",
  ".dtd"          =>      "text/xml",
  ".xml"          =>      "text/xml",
  ".mpeg"         =>      "video/mpeg",
  ".mpg"          =>      "video/mpeg",
  ".mov"          =>      "video/quicktime",
  ".qt"           =>      "video/quicktime",
  ".avi"          =>      "video/x-msvideo",
  ".asf"          =>      "video/x-ms-asf",
  ".asx"          =>      "video/x-ms-asf",
  ".wmv"          =>      "video/x-ms-wmv",
  ".bz2"          =>      "application/x-bzip",
  ".tbz"          =>      "application/x-bzip-compressed-tar",
  ".tar.bz2"      =>      "application/x-bzip-compressed-tar" 
 )
server.tag                 = "httpd" 
accesslog.filename          = "/opt/lighttpd/access.log" 
server.errorlog             = "/opt/lighttpd/error.log" 
url.access-deny             = ( "~", ".inc" )
$HTTP["url"] =~ "\.pdf$" {
  server.range-requests = "disable" 
}
$HTTP["remoteip"] != "1.2.3.4" {
  url.access-deny = ( "" )
}
static-file.exclude-extensions = ( ".php", ".pl", ".fcgi" )
server.port                = 1234
dir-listing.activate       = "enable" 
fastcgi.server = ( ".php" => (( 
                     "bin-path" => "/opt/lighttpd/php5-cgi",
                     "socket" => "/opt/lighttpd/php5-fastcgi.socket",
                     # vvv Dat: multiply max-procs and PHP_FCGI_CHILDREN
                     "max-procs" => 1,
                     "bin-environment" => ( 
                       "PHP_FCGI_CHILDREN" => "4",
                       "PHP_FCGI_MAX_REQUESTS" => "10000" 
                     ),
                     "bin-copy-environment" => (
                       "PATH", "SHELL", "USER" 
                     ),
                     "broken-scriptfilename" => "enable",
                     "idle-timeout" => 20,
                 )))

-- pts

valgrind.log (2.02 KB) davidb54, 2009-01-13 06:56

lighttpd_bug758.patch (1.17 KB) davidb54, 2009-01-13 18:20


Related issues

Related to Bug #881: memory usage when ssl.engine used and large data uploaded... Wontfix

History

#1 Updated by Anonymous almost 8 years ago

I can confirm that this is the case, using the FreeBSD port. In my test, it grew faster than that (60M in a couple hours); it seemed to grow linearly with traffic.

This bug in particular was a deal breaker for me. I was trying lighttpd out as an alternative to pound. Performance is good, and memory usage is good when it starts, but it's simply too broken for me to actually use. Maybe I'll check back in a few months.

-- bob

#2 Updated by Anonymous almost 8 years ago

Lighty memory usage seems to grow linearly with traffic on my Linux boxes as well. After the typical high traffic evening the process has grown from 11 MB to 450 MB. Kill+restart fixes the problem :-/

#3 Updated by Anonymous almost 8 years ago

Replying to anonymous:

Lighty memory usage seems to grow linearly with traffic on my Linux boxes as well. After the typical high traffic evening the process has grown from 11 MB to 450 MB. Kill+restart fixes the problem :-/

This is most likely not a true leak - this is probably a symptom of the internal buffer reuse. If you change BUFFER_MAX_REUSE_SIZE to zero in settings.h, all memory will be free'd.

-- msolo
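
To make the buffer-reuse explanation concrete, here is a minimal sketch of a reuse pool with a size cap. It is an illustration under assumptions, not lighttpd's actual buffer code: `buf`, `buf_pool`, `pool_get`, `pool_release` and `pool_clear` are hypothetical names, and `max_reuse_size` merely plays the role that BUFFER_MAX_REUSE_SIZE has in settings.h. Buffers at or below the cap are parked for reuse (so the process keeps their memory), larger ones are freed right away, and a cap of zero frees everything on release.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a reusable buffer; not lighttpd's real struct. */
typedef struct buf {
    char *ptr;
    size_t size;
    struct buf *next;
} buf;

typedef struct {
    buf *unused;           /* singly linked list of parked buffers */
    size_t unused_count;   /* number of parked buffers */
    size_t max_reuse_size; /* assumed analogue of BUFFER_MAX_REUSE_SIZE */
} buf_pool;

/* Take a parked buffer if one exists, otherwise allocate a new one. */
buf *pool_get(buf_pool *p, size_t size) {
    assert(p != NULL);
    buf *b = p->unused;
    if (b != NULL) {
        p->unused = b->next;
        p->unused_count--;
        b->next = NULL;
    } else {
        b = calloc(1, sizeof(*b));
        if (b == NULL) return NULL;
    }
    if (b->size < size) {
        char *grown = realloc(b->ptr, size);
        if (grown == NULL) {
            free(b->ptr);
            free(b);
            return NULL;
        }
        b->ptr = grown;
        b->size = size;
    }
    return b;
}

/* Returns 1 if the buffer was parked for reuse, 0 if it was freed. */
int pool_release(buf_pool *p, buf *b) {
    assert(p != NULL && b != NULL);
    if (p->max_reuse_size == 0 || b->size > p->max_reuse_size) {
        free(b->ptr);
        free(b);
        return 0;
    }
    b->next = p->unused;
    p->unused = b;
    p->unused_count++;
    return 1;
}

/* Free every parked buffer. */
void pool_clear(buf_pool *p) {
    assert(p != NULL);
    while (p->unused != NULL) {
        buf *b = p->unused;
        p->unused = b->next;
        free(b->ptr);
        free(b);
    }
    p->unused_count = 0;
}
```

Even with a cap of zero, free() only hands the blocks back to the allocator; whether the allocator returns them to the operating system is a separate question.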

#4 Updated by Anonymous over 7 years ago

Can anybody confirm the fix by setting BUFFER_MAX_REUSE_SIZE = 0?
Is this ticket abandoned?

-- alex

#5 Updated by Anonymous almost 7 years ago

Setting BUFFER_MAX_REUSE_SIZE=0 does not fix the problem.
I'm using lighttpd 1.4.17 with mod_status + mod_proxy + mod_access + mod_secure_download on a high-load download server, and it leaks about 500 MB of memory every day.

-- Dmitryus

#6 Updated by Anonymous almost 7 years ago

I use lighttpd (1.4.18) + mod_proxy with squid in front of another web server. The memory leak in the lighttpd process consumes my 1 GB of RAM and 1 GB of swap space within a couple of hours.

-- frostyplanet

#7 Updated by Anonymous over 6 years ago

People experiencing the memory growth: Do you use https?

-- mark

#8 Updated by Anonymous over 6 years ago

Replying to :

People experiencing the memory growth: Do you use https?

Same problem. Not using https.

#9 Updated by Anonymous over 6 years ago

Replying to anonymous:

Replying to :

People experiencing the memory growth: Do you use https?

Same problem. Not using https.

Here too - no https.

#10 Updated by Anonymous over 6 years ago

"Here too - no https." was my comment (I forgot to change the name).

Setting

"
BUFFER_MAX_REUSE_SIZE= 0
server.max-request-size = 300000
server.max-keep-alive-requests = 10
server.max-keep-alive-idle = 5
"
in the server config helps me a lot ;)

-- Linux

#11 Updated by Anonymous over 6 years ago

Experiencing the same here. I have to restart it every night or it will get over 100mb.

Currently using: mod_rewrite, mod_redirect, mod_access, mod_fastcgi, mod_simple_vhost, mod_cgi, mod_accesslog, mod_geoip.

server.max-keep-alive-requests = 128,
server.max-keep-alive-idle = 30,
server.max-read-idle = 60,
server.max-write-idle = 240,

-- logikcoder

#12 Updated by Anonymous over 6 years ago

I constantly get the same problem. I've changed servers and nothing changed.

lighttpd 22494 0.0 17.4 97508 84352 ? S Dec04 0:05 /usr/sbin/lighttpd -f /etc/lighttpd/lighttpd.conf (512RAM)

Help please.

-- Linux

#13 Updated by Anonymous over 6 years ago

Sorry, my previous description may not be accury. I run lighttpd 1.4.18, fastcgi and php5, with Website providing download service using php's readfile() method. Memory usage of process lighttpd growed rapidly beyond 1G in half an hour and later used out swap. If put another server lighttpd + mod_proxy + squid to cache access of my website, both merchine suffered memory growth. Latter I found removing readfile() method in the download service php page can solve the problem :P

-- frostyplanet

#14 Updated by Anonymous over 6 years ago

Sorry again for my poor English and carelessness. In my previous post "merchine" should be "machines" and "accury" should be "accurate". My servers run gentoo linux, kernel 2.6.22, php 5.2.14-p20070914-r2, and lighttpd 1.4.18 with only mod_fastcgi and mod_accesslog. I have tried Apache to make sure the php library and the webpage are OK.

-- frostyplanet

#15 Updated by darix over 6 years ago

if lighttpd has access to the files you want to send to the client, use X-LIGHTTPD-send-file. see the mod_fastcgi documentation.

#16 Updated by Anonymous over 6 years ago

My config is:

Lighttpd 1.4.18 (installed from .deb package)
PHP 5 (Static, in use (exec/read/system/fopen etc/..)
Mysql (Only installed, not used)

Should i try compile lighttpd?

-- Linux

#17 Updated by darix over 6 years ago

no. it wouldnt change anything.

#18 Updated by Anonymous over 6 years ago

Oops.. Right now the lighttpd process is at 60% memory usage and the load is going up. Swap is filling up, damn.

-- Linux

#19 Updated by Anonymous over 6 years ago

Hi,

I have a similar problem.
I have a lighttpd server that is used only as a proxy.
All requests to port 80 are served by a background apache server.

Internet -> lighttpd -proxy-> apache

When several downloads (some CD ISO images etc.) are running, lighttpd
eats all the memory and swap space and stops working.

I use this instance of lighttpd only for proxy. Should I use squid instead? :/

-- István

#20 Updated by Anonymous over 6 years ago

Are you going to repair that memory leak in 1.5.0? Is this leak also in 1.5beta? Lighty is great, but this leak sucks :(

-- marcello100

#21 Updated by Anonymous over 6 years ago

Replying to darix:

if lighttpd has access to the files you want to send to the client, use X-LIGHTTPD-send-file. see the mod_fastcgi documentation.

I tried. It doesn't help. With this option enabled lighttpd leaks severely, while its memory usage is normal without it.

-- frostyplanet

#22 Updated by admin over 6 years ago

I'm running into the same issue on one Debian Etch server.

Uptime: 2 days 6 hours 47 min 44 s
Requests: 2 Mreq
Traffic: 50.69 Gbyte

lighttpd 1.4.13

Server-Features
  RegEx Conditionals: enabled
Network Engine
  fd-Event-Handler: poll
Config-File-Settings
  Loaded Modules: indexfile, access, alias, accesslog, rewrite, redirect, status, fastcgi, simple_vhost, dirlisting, staticfile

#23 Updated by admin over 6 years ago

Forgot to add the actual memory usage: from top: VIRT: 935m, RES = 828m.

#24 Updated by admin over 6 years ago

At the moment we're at 1612m virt, 1.1g res.

#25 Updated by stbuehler over 6 years ago

Please use valgrind to see whether and where memory leaks.

See http://trac.lighttpd.net/trac/wiki/HowToReportABug

#26 Updated by Anonymous over 6 years ago

I have the same problem, but in my case the php-cgi processes are the cause of the memory leak.

-- masryalex

#27 Updated by Anonymous over 6 years ago

Haven't seen increasing memory usage on FreeBSD 6.2-RELEASE, via the lighttpd port.

Running a 1.4 million page view / month site, serving static media and thumbnailing via cgi (Django application)

---------------------
PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
94485 www 1 4 0 16404K 3820K kqread 2 62:12 0.00% lighttpd
---------------------

That's 62 hours of CPU time on a box with 136 days uptime.

Also haven't seen this issue on the same platform serving 900'000 page views / month via php-fastcgi backend, lighttpd only.

#28 Updated by Anonymous over 6 years ago

--- deleted spam ---

#29 Updated by stbuehler about 6 years ago

It has now been more than two months since I asked for a valgrind log, and no one has even responded to that => dropping Priority/Severity to normal.

And I think most "leaks" reported here are just high memory usages due to big files sent from a backend (fastcgi/proxy); it is known that lighty caches the complete response in memory from the backend as fast as possible, and that memory is not freed but reused later.

#30 Updated by Anonymous about 6 years ago

Dunno, might want to take a look at this, and contact the original poster:
http://trac.lighttpd.net/trac/ticket/1642

#31 Updated by ralf about 6 years ago

Hello,

I remember that lighttpd has some internal caches that avoid frequent malloc() calls.

This may be the problem (or one of them) under high load.

Example (the numbers are just examples)


malloc(10)           (some cached internal/general memory for whatever)
malloc(20)  * 10.000 (high load, memory for whatever)
malloc(10)           (some cached internal/general memory for whatever)

....

When the high load goes back down to normal, the 20 * 10k is freed, but NOT the malloc(10) at the end of the peak, since it is cached.

As you know, malloc() uses sbrk(), which in the end simply manages one large contiguous array of memory.
If the memory at the end of this array is still in use (e.g. it is cached somewhere for the next thousand years), none of the memory freed inside the array will ever really be given back to the system.

To explain it with another diagram:

malloc'd:


 0                      100MB
[UffffffffffffffffffffffU]

U = used
f = free'd

really inuse: 100MB

Only for information, maybe it helps (maybe not..)

A solution for this caching problem is to use mmap().

Bye,
Ralf
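
The effect described above can be observed on allocators that grow the heap with sbrk(). The following is a minimal sketch under assumptions, not lighttpd code: `brk_trace` and `trace_pinned_heap` are hypothetical names, and what the trace shows depends on the C library (glibc, for example, only gives memory back from the top of the sbrk heap; other allocators may not use sbrk at all). It allocates many blocks followed by one last "pin" block, frees the bulk, then frees the pin, and records the program break at each step.

```c
#define _DEFAULT_SOURCE /* exposes sbrk() in <unistd.h> on glibc */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Program break (end of the sbrk heap) sampled at four points in time. */
typedef struct {
    uintptr_t start;           /* before the bulk allocation */
    uintptr_t peak;            /* after bulk + pin are allocated */
    uintptr_t after_bulk_free; /* bulk freed, pin still allocated */
    uintptr_t after_pin_free;  /* pin freed as well */
} brk_trace;

static uintptr_t current_break(void) {
    return (uintptr_t)sbrk(0);
}

/* Returns 0 on success, -1 if an allocation failed. */
int trace_pinned_heap(size_t count, size_t block_size, brk_trace *out) {
    assert(count > 0 && block_size > 0 && out != NULL);

    void **blocks = calloc(count, sizeof(*blocks));
    if (blocks == NULL) return -1;

    out->start = current_break();

    /* The "peak": many allocations, like malloc(20) * 10,000 above. */
    size_t n = 0;
    for (; n < count; n++) {
        blocks[n] = malloc(block_size);
        if (blocks[n] == NULL) break;
    }
    /* The long-lived allocation that ends up last on the heap. */
    void *pin = (n == count) ? malloc(block_size) : NULL;
    if (pin == NULL) {
        while (n > 0) free(blocks[--n]);
        free(blocks);
        return -1;
    }
    out->peak = current_break();

    /* Load goes back to normal: everything but the pin is freed. */
    for (size_t i = 0; i < count; i++) free(blocks[i]);
    out->after_bulk_free = current_break();

    free(pin);
    out->after_pin_free = current_break();

    free(blocks);
    return 0;
}
```

Calling `trace_pinned_heap(25000, 4096, &t)` approximates the 100MB of the diagram; comparing `t.peak`, `t.after_bulk_free` and `t.after_pin_free` shows whether freeing the bulk alone moved the break. Keep `block_size` well below the allocator's mmap threshold, since larger blocks bypass the sbrk heap entirely.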

#32 Updated by stbuehler almost 6 years ago

Yes, you may be right Ralf.

But i won't start implementing a new memory manager to use mmap, and i don't think we should start using glib2 in 1.4.x. So if someone has a good idea how to solve this, please share it with us.

#33 Updated by Anonymous almost 6 years ago

I can confirm this bug. I run a medium-traffic server which has two daemons of lighttpd running, one normal and one SSL. It seems that the SSL daemon suffers more greatly from this bug than the other one, despite having more traffic running on the HTTP daemon. After 1 day and 9 hours the SSL daemon takes 53 MB RAM:

#top -bn 1 | grep lighttpd
2538 www-data 20 0 66456 53m 1348 S 0.0 34.4 3:54.97 lighttpd
7558 www-data 20 0 7528 1180 524 S 0.0 0.7 0:00.00 lighttpd

I run the precompiled version of lighttpd that shipped with Debian Lenny (1.4.19). Mods installed are mod_ssl, mod_auth, mod_access and mod_fastcgi. I use fastcgi for PHP5.

Currently I have solved this by having a cron job restarting the server every once in a while.

#34 Updated by uberamd almost 6 years ago

I can confirm this while running the latest Debian stable build. Server uptime is: 23 days. Lighttpd is using

$ top -bn 1 | grep lighttpd
24864 www-data 15 0 938m 757m 664 S 0.0 76.0 1:21.39 lighttpd

Currently the server is handling 2 ~350MB file downloads. When the server was serving 4 simultaneous downloads it became unable to display any webpages until at least 1 of the downloads was finished. The server is on a 10Mbps line, and it wasn't even running at 50% capacity so there was room for the requests.

This has been happening for a long time; I just never said anything until now because I noticed that when one person did, they were told it was an isolated incident. Well, it's not.

#35 Updated by davidb54 over 5 years ago

I can also confirm this behavior. It does not appear to be a memory leak. I have lighttpd 1.5.0 (from SVN) running on:

Linux zfc02 2.6.23-gentoo #1 SMP Tue Feb 26 09:55:05 PST 2008 x86_64 Intel(R) Xeon(R) CPU E5420 @ 2.50GHz GenuineIntel GNU/Linux

with 16 GB of memory. I am using mod_proxy_core/mod_proxy_backend_http, serving no local files. The server has been up for 3.5 days and its memory usage has climbed to nearly 2 GB. During this time, lighttpd has proxied around 57 million requests, with at least a terabyte of response data.

I ran lighttpd for a while under valgrind and did not find any significant memory leaks. I have included the valgrind log.

It seems that Ralf's explanation is the most plausible.

-dave

#36 Updated by davidb54 over 5 years ago

Do you have any estimate of if/when this issue will be addressed?

-dave

#37 Updated by davidb54 over 5 years ago

I monitored lighttpd's memory usage over the course of half an hour and watched it grow from 25 MB to 1.1 GB in two short spurts where RSS grew by hundreds of megabytes. The rest of the time, memory usage grew very gradually or not at all.

I looked around through the code and started to suspect the following pointers, which can be reallocated frequently:

srv->conns->ptr
srv->joblist->ptr
srv->fdwaitqueue->ptr

When these pointers are reallocated, they stick around until the next realloc() call, which is more and more likely to occur when the load is high. One possible scenario that might trigger a realloc of one or more of these pointers is where you have large files getting streamed to high-bandwidth clients, which might starve other connections in the fdevent loop, leading to an increase in the number of connections and in the size of the joblist.

I tried preallocating more elements in these arrays, using the attached patch. I'll report later on the result.

-dave
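
The preallocation idea can be sketched as follows. This is only an illustration under assumptions and not the attached patch: `ptr_array`, `ptr_array_reserve`, `ptr_array_push`, `ptr_array_free` and the `GROW_STEP` of 16 are hypothetical stand-ins for arrays such as srv->conns. The array grows by a fixed step whenever it is full, and reserving capacity up front removes the repeated realloc() calls during a load peak.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define GROW_STEP 16 /* hypothetical growth increment */

typedef struct {
    void **ptr;
    size_t used;     /* slots in use */
    size_t size;     /* slots allocated */
    size_t reallocs; /* how often ptr had to grow */
} ptr_array;

/* Ensure room for at least `want` slots. Returns 0 on success, -1 on error. */
int ptr_array_reserve(ptr_array *a, size_t want) {
    assert(a != NULL);
    if (want <= a->size) return 0;
    if (want > SIZE_MAX / sizeof(*a->ptr)) return -1;
    void **grown = realloc(a->ptr, want * sizeof(*a->ptr));
    if (grown == NULL) return -1;
    a->ptr = grown;
    a->size = want;
    a->reallocs++;
    return 0;
}

/* Append one item, growing by GROW_STEP slots when the array is full. */
int ptr_array_push(ptr_array *a, void *item) {
    assert(a != NULL);
    if (a->used == a->size &&
        ptr_array_reserve(a, a->size + GROW_STEP) != 0) {
        return -1;
    }
    a->ptr[a->used++] = item;
    return 0;
}

void ptr_array_free(ptr_array *a) {
    assert(a != NULL);
    free(a->ptr);
    a->ptr = NULL;
    a->used = a->size = a->reallocs = 0;
}
```

With this step size, pushing 1000 items into an empty array triggers 63 reallocations, while calling `ptr_array_reserve(&a, 1024)` first reduces that to one.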

#38 Updated by daveb over 5 years ago

This didn't work. I still saw lighttpd's memory usage climb to around 1 GB. It took longer to get there, but I can't say whether that's related to the changes I made. I did notice one other realloc() in fdevent.c (in fdevent_revents_add()) - this might be worth a look.

I think it should be possible to separate allocation/reallocation of memory that persists for a long time (i.e. the connection, fdwaitqueue, joblist arrays and so on) from allocation/freeing of memory that persists for a much shorter amount of time (namely, chunkqueues and the chunks and buffers that they contain). The global chunkpool is a little bit problematic - perhaps it should just be flushed every now and then.

-dave
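
One way to keep short-lived memory apart from long-lived memory is to serve the short-lived allocations from a separately mapped region, which also ties in with the mmap() suggestion earlier in this thread. The following is a minimal sketch under assumptions, not a proposed lighttpd patch: `arena`, `arena_init`, `arena_alloc`, `arena_reset` and `arena_destroy` are hypothetical names, and a real chunk pool would need per-chunk freeing or several arenas. Because the region comes from mmap(), munmap() returns all of it to the system regardless of what the malloc() heap looks like.

```c
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Bump allocator over one anonymous mapping. */
typedef struct {
    unsigned char *base;
    size_t capacity;
    size_t used;
} arena;

/* Map `capacity` bytes. Returns 0 on success, -1 on failure. */
int arena_init(arena *a, size_t capacity) {
    assert(a != NULL);
    void *p = mmap(NULL, capacity, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return -1;
    a->base = p;
    a->capacity = capacity;
    a->used = 0;
    return 0;
}

/* Hand out `n` bytes, rounded up to 16; NULL when the arena is full. */
void *arena_alloc(arena *a, size_t n) {
    assert(a != NULL);
    size_t aligned = (n + 15u) & ~(size_t)15u;
    if (aligned < n || aligned > a->capacity - a->used) return NULL;
    void *p = a->base + a->used;
    a->used += aligned;
    return p;
}

/* "Flush" the pool: everything handed out so far becomes reusable. */
void arena_reset(arena *a) {
    assert(a != NULL);
    a->used = 0;
}

/* Unmap the region, returning the memory to the operating system. */
int arena_destroy(arena *a) {
    assert(a != NULL);
    int rc = munmap(a->base, a->capacity);
    a->base = NULL;
    a->capacity = 0;
    a->used = 0;
    return rc;
}
```

The trade-off is that individual allocations cannot be freed on their own; the whole arena is reset or unmapped at once, which fits buffers whose lifetimes end together.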

#39 Updated by stbuehler over 4 years ago

  • Subject changed from memory leak to memory fragmentation leads to high memory usage after peaks
  • Status changed from New to Invalid
  • Assignee deleted (jan)
  • Missing in 1.5.x set to No

We got reports that this bug report is misleading, so I will close it.

Summary: lighttpd of course had some memory leaks (and perhaps still has some today), but this bug is not about those problems.
The main problem here is that the memory gets fragmented, which is why malloc()/free() do not return the memory to the system; the memory is not lost to lighttpd, and lighttpd/malloc() will reuse it.

If you have new input regarding this bug (like good solutions for it, or just want to discuss it), please open a new bug (we can link it from here).
Please do not reopen/reply to this one, thank you very much.
