Bug #2882
Closed
fastcgi.balance not working
Description
When setting fastcgi.balance = "round-robin" (or anything other than least-connection) together with fastcgi.debug = 1,
I get logs saying that the least-connection balancer is being used:
2018-04-09 14:52:30: (gw_backend.c.1468) --- gw spawning local \n\tproc: /usr/lib/cgi-bin/httptime.fpl \n\tport: 0 \n\tsocket /var/run/lighttpd/httptime-30373.sock \n\tmin-procs: 2 \n\tmax-procs: 2
2018-04-09 14:52:30: (gw_backend.c.1492) --- gw spawning \n\tport: 0 \n\tsocket /var/run/lighttpd/httptime-30373.sock \n\tcurrent: 0 / 2
2018-04-09 14:52:30: (gw_backend.c.461) new proc, socket: 0 /var/run/lighttpd/httptime-30373.sock-0
2018-04-09 14:52:30: (gw_backend.c.1492) --- gw spawning \n\tport: 0 \n\tsocket /var/run/lighttpd/httptime-30373.sock \n\tcurrent: 1 / 2
2018-04-09 14:52:30: (gw_backend.c.461) new proc, socket: 0 /var/run/lighttpd/httptime-30373.sock-1
2018-04-09 14:52:30: (gw_backend.c.841) proxy - used least connection
2018-04-09 14:52:30: (gw_backend.c.933) gw - found a host 127.0.0.1 5000
2018-04-09 14:52:30: (gw_backend.c.841) proxy - used least connection
2018-04-09 14:52:30: (gw_backend.c.933) gw - found a host 127.0.0.1 5000
2018-04-09 14:52:30: (gw_backend.c.841) proxy - used least connection
2018-04-09 14:52:30: (gw_backend.c.933) gw - found a host 192.168.1.2 5000
2018-04-09 14:52:30: (gw_backend.c.841) proxy - used least connection
2018-04-09 14:52:30: (gw_backend.c.933) gw - found a host 192.168.1.2 5000
2018-04-09 14:52:30: (gw_backend.c.841) proxy - used least connection
2018-04-09 14:52:30: (gw_backend.c.933) gw - found a host 192.168.1.3 5000
2018-04-09 14:52:30: (gw_backend.c.841) proxy - used least connection
2018-04-09 14:52:30: (gw_backend.c.933) gw - found a host 192.168.1.3 5000
2018-04-09 14:52:30: (gw_backend.c.234) got proc: pid: 0 socket: tcp:127.0.0.1:5000 load: 1
2018-04-09 14:52:30: (gw_backend.c.234) got proc: pid: 0 socket: tcp:192.168.1.2:5000 load: 1
2018-04-09 14:52:30: (gw_backend.c.234) got proc: pid: 0 socket: tcp:192.168.1.3:5000 load: 1
2018-04-09 14:52:30: (gw_backend.c.308) released proc: pid: 0 socket: tcp:192.168.1.3:5000 load: 0
2018-04-09 14:52:30: (gw_backend.c.841) proxy - used least connection
When setting fastcgi.balance to some other (invalid) string, I receive the correct error:
2018-04-09 14:33:15: (server.c.1423) server started (lighttpd/1.4.49 (PLD Linux))
2018-04-09 14:33:15: (gw_backend.c.1634) xxxxx.balance has to be one of: least-connection, round-robin, hash, sticky, but not: static
2018-04-09 14:33:15: (server.c.1431) Configuration of plugins failed. Going down.
Files
Updated by taffff over 6 years ago
- File konfff.conf added
Added config that was used to test this
Updated by gstrauss over 6 years ago
- Status changed from New to Patch Pending
- Target version changed from 1.4.x to 1.4.50
Thanks for your report. Looks like this was never merged into configs for mod_fastcgi. I have to make a similar patch to mod_scgi.
--- a/src/mod_fastcgi.c
+++ b/src/mod_fastcgi.c
@@ -463,6 +463,7 @@ static int fcgi_patch_connection(server *srv, connection *con, plugin_data *p) {
 	PATCH(exts_auth);
 	PATCH(exts_resp);
 	PATCH(debug);
+	PATCH(balance);
 	PATCH(ext_mapping);
 
 	/* skip the first, the global context */
@@ -483,6 +484,8 @@ static int fcgi_patch_connection(server *srv, connection *con, plugin_data *p) {
 				PATCH(exts_resp);
 			} else if (buffer_is_equal_string(du->key, CONST_STR_LEN("fastcgi.debug"))) {
 				PATCH(debug);
+			} else if (buffer_is_equal_string(du->key, CONST_STR_LEN("fastcgi.balance"))) {
+				PATCH(balance);
 			} else if (buffer_is_equal_string(du->key, CONST_STR_LEN("fastcgi.map-extensions"))) {
 				PATCH(ext_mapping);
 			}
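To illustrate why a missing PATCH(balance) would produce the symptom in the report, here is a minimal, simplified sketch. It is not lighttpd's code: struct plugin_config, enum balance_mode, the local PATCH macro, and effective_balance are hypothetical stand-ins, and it assumes the zero value of the balance setting corresponds to least-connection, which is consistent with the log output above. The parsed value is accepted and validated, but it never reaches the per-request config unless it is copied explicitly.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for a plugin's config structures.
 * Assumption: the zero value means least-connection. */
enum balance_mode { BALANCE_LEAST_CONNECTION = 0, BALANCE_ROUND_ROBIN = 1 };

struct plugin_config {
    int debug;
    enum balance_mode balance;
};

/* Copy one named field from the parsed config into the per-request config,
 * mimicking a list of PATCH(field) statements. */
#define PATCH(field) (dst->field = src->field)

/* Before the fix: "balance" is not in the list, so dst->balance keeps
 * whatever value it already had (here, the zero default). */
static void patch_without_balance(struct plugin_config *dst,
                                  const struct plugin_config *src) {
    PATCH(debug);
}

/* After the fix: "balance" is copied like every other setting. */
static void patch_with_balance(struct plugin_config *dst,
                               const struct plugin_config *src) {
    PATCH(debug);
    PATCH(balance);
}

/* Return the balance mode a request would see for a given parsed config. */
static enum balance_mode effective_balance(const struct plugin_config *parsed,
                                           int fixed) {
    struct plugin_config conf;

    assert(parsed != NULL);
    memset(&conf, 0, sizeof(conf)); /* per-request config starts zeroed */
    if (fixed)
        patch_with_balance(&conf, parsed);
    else
        patch_without_balance(&conf, parsed);
    return conf.balance;
}
```

With a parsed config of { 1, BALANCE_ROUND_ROBIN }, effective_balance(&parsed, 0) yields BALANCE_LEAST_CONNECTION and effective_balance(&parsed, 1) yields BALANCE_ROUND_ROBIN, which mirrors the before and after behavior described in this ticket.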
Updated by glen over 6 years ago
hotfixed in pld linux:
https://github.com/pld-linux/lighttpd/commit/47f4746562342f0fda1ea029eaeffebf5ede381f as lighttpd-1.4.49-3
Updated by taffff over 6 years ago
It seems that there is a serious problem with round-robin working properly in lighttpd. I tried this fix, and the problem appears when you use multiple fastcgi pools under $HTTP["url"]. For example:
fastcgi.server = (
    ".php" => (
        "localhost" => (
            "host" => "127.0.0.1",
            "port" => 5000,
            "check-local" => "enable",
            # 3
        ),
        "box2" => (
            "host" => "192.168.1.2",
            "port" => 5000,
            "check-local" => "enable",
            # 3
        ),
        "box3" => (
            "host" => "192.168.1.3",
            "port" => 5000,
            "check-local" => "enable",
            # 3
        ),
        "box4" => (
            "host" => "192.168.1.4",
            "port" => 5000,
            "check-local" => "enable",
            # 3
        ),
        "box5" => (
            "host" => "192.168.1.5",
            "port" => 5000,
            "check-local" => "enable",
            # 3
        ),
    ),
)
fastcgi.balance = "round-robin"
$HTTP["url"] =~ "^/randomurl/" {
    fastcgi.server = (
        ".php" => (
            "localhost" => (
                "host" => "127.0.0.1",
                "port" => 5001,
                "check-local" => "enable",
                # 3
            ),
            "box2_randomurl" => (
                "host" => "192.168.1.2",
                "port" => 5001,
                "check-local" => "enable",
                # 3
            ),
            "box3_randomurl" => (
                "host" => "192.168.1.3",
                "port" => 5001,
                "check-local" => "enable",
                # 3
            ),
            "box4_randomurl" => (
                "host" => "192.168.1.4",
                "port" => 5001,
                "check-local" => "enable",
                # 3
            ),
            "box5_randomurl" => (
                "host" => "192.168.1.5",
                "port" => 5001,
                "check-local" => "enable",
                # 3
            ),
        ),
    )
}
In this case, when stopping php-fpm on box3, the default fastcgi.server works normally: it disables box3 in the fastcgi pool. The problem starts with the pool inside $HTTP["url"]. In that pool it disables box3, which I can also see in the logs, but it disables 2 other, seemingly random nodes as well. On the status page, I can see that only box3 has its died count increasing, not the other two. The two extra boxes are always the same for a given failed host (disabling box2 also disables box4 and localhost; disabling box5 also disables box3 and box4).
The connections to the working boxes are restored when using least-connection as the balancing method, so there is also something wrong with the balancing code itself: it disables the dead instance, skips the next one (keeps it in the pool), disables the one after that, skips the next one (keeps it in the pool), and disables the one after that. I'm guessing that adding more servers to the pool will continue the pattern of disabling every other host in the list.
Updated by taffff over 6 years ago
Please note that this happens only to the hosts inside the $HTTP["url"] fastcgi.server. The main fastcgi.server behaves normally.
Updated by gstrauss over 6 years ago
@taffff: I am having trouble reproducing this in the HEAD of my development branch. Maybe what you are seeing was already fixed in 4a674224 ?
The configuration parsing code for fastcgi.server does not distinguish whether fastcgi.server is inside a condition (e.g. $HTTP["url"] { ... }) or not. It's the same code. Therefore, the behavior that you describe is very curious, and the problem might not be what it seems. Also, the code for disabling and enabling backends is identical for all balancing options. The balancing options are used solely for selecting a host to which to connect a request, and do not disable any backends.
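The separation described here (balancing only selects a host, while disabling and enabling happen elsewhere) can be sketched as follows. This is an illustrative model only, not lighttpd's implementation: struct backend, its disabled flag, and pick_round_robin are hypothetical names. The selector takes the host list as const, so it can skip disabled hosts but cannot change their state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical backend record; not lighttpd's host structure. */
struct backend {
    const char *name;
    int disabled; /* set by failure handling, never by the selector */
};

/* Round-robin selection: starting after the last used index, return the
 * index of the next enabled backend, or -1 if every backend is disabled.
 * The host array is read-only here; only the cursor *last is updated. */
static int pick_round_robin(const struct backend *hosts, size_t n,
                            size_t *last) {
    assert(hosts != NULL && last != NULL);
    for (size_t i = 1; i <= n; ++i) {
        size_t idx = (*last + i) % n;
        if (!hosts[idx].disabled) {
            *last = idx;
            return (int)idx;
        }
    }
    return -1;
}
```

With five hosts where only the third one is disabled and the cursor starting at the last index, successive calls return 0, 1, 3, 4, 0: the disabled host is skipped and every other host stays in rotation. Under this model, extra hosts dropping out of rotation would point at something other than the selection step, which is the point made above.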
Updated by taffff over 6 years ago
I'm using v1.4.49 with the patch you provided for round-robin, and the problem still occurs for the fastcgi servers that are defined in a condition. I will try to reproduce it and provide the full config of the setup that causes this weird behaviour.
Updated by glen over 6 years ago
looks like this patch is missing from master:
https://redmine.lighttpd.net/issues/2882#note-3
created PR: https://github.com/lighttpd/lighttpd1.4/pull/91
Updated by gstrauss over 6 years ago
See https://redmine.lighttpd.net/projects/lighttpd/wiki/DevelGit and personal/gstrauss/master branch. Note that that branch may be rewritten as I please.
Hint: @glen if I put a ticket into Patch Pending state, it means that I have a patch pending and don't need a pull request.
Updated by gstrauss over 6 years ago
- master
personal/gstrauss/master
Updated by gstrauss over 6 years ago
- Status changed from Patch Pending to Fixed
- % Done changed from 0 to 100
Applied in changeset 3efaff973fbed758655ac07420b14473a3fb29a2.