Bug #1911
closed
segfault with lighttpd 1.4.20 + scgi
Description
Hello, I'm using lighttpd 1.4.20 on CentOS 4.4 (x86_64). I have a Python+scgi app running on localhost, port 4000 and I have lighttpd connecting to that. This works fine in my own testing but when I send real traffic to the server lighttpd crashes after about 200 requests. I get this message in /var/log/messages:
kernel: lighttpd[19549]: segfault at 00000000005b0000 rip 00000036455725b0 rsp 0000007fbffff498 error 4
Below is the configuration I'm using:
server.modules = ("mod_compress", "mod_status", "mod_rewrite", "mod_access", "mod_cgi", "mod_accesslog", "mod_setenv", "mod_scgi")
server.event-handler = "linux-sysepoll"
server.document-root = "/var/www/mysite/static"
server.port = 8000

$SERVER["socket"] == "10.10.10.34:81" {
  server.document-root = "/var/www/mysite"
  scgi.server = ( "/" =>
    ( "127.0.0.1" =>
      ( "host" => "127.0.0.1",
        "port" => 4000,
        "check-local" => "disable")
    )
  )
  server.tag = "lighttpd"
  accesslog.format = "%{X-Cluster-Client-Ip}i %l %u %t %{Host}i \"%r\" %s %b \"%{Referer}i\" \"%{User-Agent}i\""
  accesslog.filename = "|/usr/sbin/cronolog /var/log/mysite/access_log.%Y%m%d"
}

mimetype.assign = (
  ".html" => "text/html",
  ".txt" => "text/plain",
  ".jpg" => "image/jpeg",
  ".png" => "image/png",
  ".gif" => "image/gif",
  ".js" => "application/x-javascript",
  ".css" => "text/css",
  ".xsl" => "text/plain",
  ".ico" => "image/x-icon",
  ".src" => "text/plain",
  ".htc" => "text/x-component",
  ".mp3" => "audio/mpeg"
)

compress.cache-dir = "/var/www/cache/"
compress.filetype = ("text/plain", "text/html", "text/javascript")
Updated by kevinsl almost 16 years ago
I've tried to reproduce the problem with Apache Benchmark (ab) but have not been able to. Also, I don't think this is a problem with my Python/scgi app, because it stays running and produces no errors.
Updated by stbuehler almost 16 years ago
- Target version changed from 1.4.20 to 1.4.22
Updated by stbuehler almost 16 years ago
- Status changed from New to Need Feedback
I don't think we can help you without a backtrace (or a way to reproduce it).
Updated by kevinsl almost 16 years ago
- File lighttpd.17720.txt added
Ok, here is a traceback produced by valgrind.
Updated by stbuehler almost 16 years ago
- It would be nice if you could try the attached patch (I have no scgi application to test it with... and "proper" applications shouldn't trigger that bug anyway).
- I guess you mixed the line endings "\n" and "\r\n" in the response header - you really should always use "\r\n", or at least always the same.
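For anyone hitting the same thing from a Python scgi app, here is a minimal sketch of how a response could be cleaned up before it is written back to lighttpd, so that every header line ends in "\r\n". The function name `normalize_header_endings` is made up for illustration and is not part of lighttpd or of any scgi library; it assumes the whole response (headers, blank line, body) is available as one bytes object.

```python
import re

# A blank line ends the header block; accept any mix of "\n" and "\r\n".
_HEADER_END = re.compile(rb"\r?\n\r?\n")
_LINE_END = re.compile(rb"\r?\n")


def normalize_header_endings(raw: bytes) -> bytes:
    """Return raw with every header line terminated by CRLF.

    Hypothetical helper: only the header block is rewritten,
    the body is passed through byte for byte.
    """
    match = _HEADER_END.search(raw)
    if match is None:
        raise ValueError("no blank line separating headers from body")
    head = raw[:match.start()]
    body = raw[match.end():]
    lines = _LINE_END.split(head)
    return b"\r\n".join(lines) + b"\r\n\r\n" + body


if __name__ == "__main__":
    mixed = b"Status: 200 OK\nContent-Type: text/plain\r\n\nhello\nworld"
    print(normalize_header_endings(mixed))
```

Building the header block from a list of (name, value) pairs and joining with "\r\n" in one place would avoid the problem at the source; the helper above is only a safety net for code that already emits mixed endings.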
Updated by kevinsl almost 16 years ago
- File lighttpd.28533.txt added
I wasn't able to try your patch since I'm using a binary distribution. But I checked my code and found a few places where headers had \n and \r\n mixed together. I corrected that and now the application runs better.
But there is a new problem. Now I get several of these messages in lighttpd's error_log:
2009-02-26 10:13:48: (mod_scgi.c.2467) emergency exit: scgi: connection-fd: 12 fcgi-fd: 9
2009-02-26 10:14:30: (mod_scgi.c.2467) emergency exit: scgi: connection-fd: 12 fcgi-fd: 10
2009-02-26 10:14:42: (mod_scgi.c.2467) emergency exit: scgi: connection-fd: 16 fcgi-fd: 17
2009-02-26 10:16:11: (mod_scgi.c.1790) Connection reset by peer 11 9
2009-02-26 10:16:11: (mod_scgi.c.2575) response already sent out, termination connection connection-fd: 11 fcgi-fd: 9
2009-02-26 10:16:49: (mod_scgi.c.1790) Connection reset by peer 14 12
2009-02-26 10:16:49: (mod_scgi.c.2575) response already sent out, termination connection connection-fd: 14 fcgi-fd: 12
2009-02-26 10:16:50: (mod_scgi.c.2467) emergency exit: scgi: connection-fd: 11 fcgi-fd: 9
2009-02-26 10:17:14: (mod_scgi.c.1790) Connection reset by peer 12 9
2009-02-26 10:17:14: (mod_scgi.c.2575) response already sent out, termination connection connection-fd: 12 fcgi-fd: 9
2009-02-26 10:20:26: (mod_scgi.c.1790) Connection reset by peer 9 10
2009-02-26 10:20:26: (mod_scgi.c.2575) response already sent out, termination connection connection-fd: 9 fcgi-fd: 10
2009-02-26 10:21:15: (mod_scgi.c.2467) emergency exit: scgi: connection-fd: 11 fcgi-fd: 13
2009-02-26 10:22:35: (mod_scgi.c.1790) Connection reset by peer 8 9
2009-02-26 10:22:35: (mod_scgi.c.2575) response already sent out, termination connection connection-fd: 8 fcgi-fd: 9
2009-02-26 10:22:53: (mod_scgi.c.1790) Connection reset by peer 11 8
2009-02-26 10:22:53: (mod_scgi.c.2575) response already sent out, termination connection connection-fd: 11 fcgi-fd: 8
While these errors are being logged, my application runs for about one hour and then lighttpd stops accepting new connections, although the daemon is still running. My scgi app seems fine.
I'm attaching a traceback I captured while these errors were logged.
Any ideas what the problem is?
Updated by stbuehler almost 16 years ago
I don't know how that ECONNRESET is triggered for read(); perhaps an strace could help us there. The "emergency exit" is probably triggered when the client aborts the request.
But that should go into a new bug, this one was for the segfault :)
Updated by stbuehler almost 16 years ago
- Status changed from Need Feedback to Fixed
- % Done changed from 0 to 100
Applied in changeset r2404.