Bug #399

FastCGI performance on high load

Added by Anonymous over 11 years ago. Updated 12 months ago.

Status: Fixed
Priority: High
Assignee: -
Category: mod_fastcgi
Target version:
Start date:
Due date:
% Done: 100%
Missing in 1.5.x:

Description

lighttpd seems to enqueue all new incoming FastCGI requests so that every single request is processed once. This works very well while the FastCGI processes handle incoming requests in a moderate time. But if the machine is under high load and the FastCGI processes need a long time to process requests, the load increases dramatically since (I guess) lighttpd hands over requests that were already aborted by the requesting browser. Since processes take a long time to complete a request, users may (and they do) follow another link on the server, aborting the previous request, but FastCGI will still process the aborted request.

I know a web server should never be under such high load, but when it happens, the system load seems to explode.

I tested this by clicking a link that deletes some entries from a website using PHP with FastCGI on a very heavily loaded server...However, when I close the new browser window immediately, the entries are still deleted. In fact, no free FastCGI process was available at the time of the request, and there was still none when the browser window was closed.

I think lighttpd would perform much better under such high load if there were an internal queue in lighttpd for FastCGI requests and an additional check that the request's connection is still active before the request is handed to the next free FastCGI process.

  • Sorry for my bad English ;)

-- daniel /at/ schlach.com


Related issues

Related to Bug #647: handle-req timeout Duplicate
Related to Bug #2058: Closed connection by peer is not reported to fastcgi/scgi service Fixed 2009-08-27

Associated revisions

Revision 4b0c822e (diff)
Added by gstrauss about 1 year ago

always poll for client POLLHUP/POLLERR events (fixes #399)

to detect client disconnect. Do so even when waiting on backend,
and not polling for POLLRD or POLLWR on client connection.

This reduces unnecessary load on backends when backends are slow
to respond and client has given up waiting.

x-ref:
"https://redmine.lighttpd.net/issues/399"
FastCGI performance on high load

History

#1 Updated by Anonymous over 11 years ago

Additionally, I think that attacking a server by just opening (and immediately closing) many connections to FastCGI scripts would no longer be possible.

Please correct me if I'm wrong.

-- daniel /at/ schlach.com

#2 Updated by Anonymous over 11 years ago

I have been looking at this for a few days now, and I assume that anyone who just hits F5 or Refresh a few times is a real risk for server performance when using FastCGI on servers running lighttpd under high load.

I'll adjust the ticket's status a little bit...

-- daniel /at/ schlach.com

#3 Updated by Anonymous almost 9 years ago

lighttpd should inform the fcgi server about dropped connections, so the server can decide what to do about it (complete the request or abort). I'm pretty sure there's a mechanism for doing this in the fcgi spec. I'm not sure if recent versions (1.4.19) properly handle this. Using Rails over fcgi, closed connections are not visible to the Rails code, but I'm not sure if this is lighttpd's fault (not signalling the dropped connection) or Rails's fault (not making use of the signal); it may be both.

If you have an expensive, long-running request, and the user drops the connection, it's pretty important to be able to have your code decide whether or not the request should complete or abort.

#4 Updated by gstrauss about 1 year ago

  • Description updated (diff)

https://github.com/lighttpd/lighttpd1.4/pull/50 adds the ability to tune listen backlog queue size if you have measured that your FastCGI can keep up with a given number of requests in a given amount of time and want to limit the load on the backend. See discussions in #2116 and #1825.

https://github.com/lighttpd/lighttpd1.4/pull/53 improves the control flow logic in dynamic handlers, and so they will abort the connection to the backend after configured timeouts. (Note there is more work that needs to be done to add additional timeout configuration options.)

While there is an application-level FastCGI message to send to abort a FastCGI request, lighttpd does not send it. Instead, since lighttpd sends only one request on a FastCGI socket connection (instead of using the feature of multiplexing multiple requests on a FastCGI socket connection), lighttpd closes the FastCGI socket connection when lighttpd detects that a client connection has been aborted.
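
For reference, the application-level abort message mentioned above is small. The sketch below builds one, following the record header layout in the FastCGI specification (8 bytes: version, type, request id, content length, padding length, reserved). It is only an illustration of the message that lighttpd does not send; the function name fcgi_build_abort() is hypothetical and nothing here is taken from lighttpd's source.

```c
#include <stdint.h>
#include <string.h>

/* Values from the FastCGI specification. */
enum { FCGI_VERSION_1 = 1, FCGI_ABORT_REQUEST = 2, FCGI_HEADER_LEN = 8 };

/* Hypothetical helper: fills out[] with an FCGI_ABORT_REQUEST record for
 * request_id. The record has no body, so the header is the whole message. */
static void fcgi_build_abort(uint8_t out[FCGI_HEADER_LEN], uint16_t request_id) {
    memset(out, 0, FCGI_HEADER_LEN);         /* content length, padding, reserved = 0 */
    out[0] = FCGI_VERSION_1;                 /* version */
    out[1] = FCGI_ABORT_REQUEST;             /* type */
    out[2] = (uint8_t)(request_id >> 8);     /* requestIdB1 (high byte) */
    out[3] = (uint8_t)(request_id & 0xff);   /* requestIdB0 (low byte) */
}
```

Because lighttpd uses one FastCGI connection per request, closing that connection conveys the same information without this record.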

For a backend application to get this information, the backend FastCGI application code must be written to asynchronously listen for FastCGI messages -- including handling the closing of the FastCGI socket connection -- and then must communicate this information to the running FastCGI request handler. Please be aware that many FastCGI applications do not do this, as they are written in a simple serialized loop that accepts a request, processes it, sends the response, and then waits for the next request.
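
As a rough illustration of what such a backend might do, the sketch below checks the FastCGI socket between units of work and stops early once the web server has closed it. It is a simplified, single-threaded stand-in for truly asynchronous handling, and it assumes that the request has already been read in full, so any further end-of-file on the socket means the web server gave up. The names peer_closed() and run_request() are hypothetical, and MSG_DONTWAIT is assumed to be available (as it is on Linux and the BSDs).

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: returns 1 if the peer has closed or reset the
 * connection, 0 if it still looks open. Peeks without consuming data
 * and without blocking. */
static int peer_closed(int fd) {
    char c;
    ssize_t n = recv(fd, &c, 1, MSG_PEEK | MSG_DONTWAIT);
    if (n == 0) return 1;    /* orderly shutdown by the peer */
    if (n > 0) return 0;     /* unread data pending, still connected */
    if (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR) return 0;
    return 1;                /* e.g. ECONNRESET: treat as gone */
}

/* Hypothetical request handler: runs up to total_steps units of work and
 * stops early if the peer has gone away. Returns the steps completed. */
static int run_request(int fd, int total_steps, void (*step)(void)) {
    int done = 0;
    while (done < total_steps) {
        if (peer_closed(fd)) break;   /* nobody is waiting for the result */
        if (step) step();
        done++;
    }
    return done;
}
```

Whether stopping early is safe depends on the application; a request with side effects, such as the delete link in the original report, may need to finish or roll back rather than stop halfway.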

#5 Updated by gstrauss about 1 year ago

  • Related to Bug #647: handle-req timeout added

#6 Updated by gstrauss about 1 year ago

  • Status changed from New to Patch Pending
  • Target version set to 1.4.40

The following is a minimal patch so that lighttpd will detect if client disconnects. It removes interest in FDEVENT_READ and FDEVENT_WRITE, but will still be able to receive FDEVENT_HUP or FDEVENT_ERR. This will avoid sending the request to backend if the request has not yet been sent to the backend, and will result in closing the connection to the backend if the response has not yet been received, or is in the process of being received.

The check for (-1 == fd) in fdevent_event_set() is because the connection might have been added to the joblist and connection_state_machine() might be called after the connection has been closed and con->state has been reset to CON_STATE_CONNECT.

diff --git a/src/connections.c b/src/connections.c
index f33fcd6..b316f2a 100644
--- a/src/connections.c
+++ b/src/connections.c
@@ -143,6 +143,7 @@ int connection_close(server *srv, connection *con) {
                                "(warning) close:", con->fd, strerror(errno));
        }
 #endif
+       con->fd = -1;

        srv->cur_fds--;
 #if 0
@@ -1650,11 +1651,11 @@ int connection_state_machine(server *srv, connection *con) {
                    (con->traffic_limit_reached == 0)) {
                        fdevent_event_set(srv->ev, &(con->fde_ndx), con->fd, FDEVENT_OUT);
                } else {
-                       fdevent_event_del(srv->ev, &(con->fde_ndx), con->fd);
+                       fdevent_event_set(srv->ev, &(con->fde_ndx), con->fd, 0);
                }
                break;
        default:
-               fdevent_event_del(srv->ev, &(con->fde_ndx), con->fd);
+               fdevent_event_set(srv->ev, &(con->fde_ndx), con->fd, 0);
                break;
        }

diff --git a/src/fdevent.c b/src/fdevent.c
index e9038ab..0a9522e 100644
--- a/src/fdevent.c
+++ b/src/fdevent.c
@@ -165,6 +165,7 @@ int fdevent_event_del(fdevents *ev, int *fde_ndx, int fd) {

 int fdevent_event_set(fdevents *ev, int *fde_ndx, int fd, int events) {
        int fde = fde_ndx ? *fde_ndx : -1;
+       if (-1 == fd) return 0;

        if (ev->event_set) fde = ev->event_set(ev, fde, fd, events);
        ev->fdarray[fd]->events = events;

A slightly more efficient patch has been added to https://github.com/lighttpd/lighttpd1.4/pull/53

#7 Updated by gstrauss about 1 year ago

  • Related to Bug #2058: Closed connection by peer is not reported to fastcgi/scgi service added

#8 Updated by gstrauss 12 months ago

  • Status changed from Patch Pending to Fixed
  • % Done changed from 0 to 100

#9 Updated by gstrauss 12 months ago

  • Assignee deleted (jan)
