The goal is to provide a generic interface to balance between multiple backends; each balancer acts like a backend itself.
A balancer is an action which selects a backend from a list (and executes it via action_enter); if the backend fails due to specific reasons (overload/timeout), the balancer gets called again. Failures of this kind are:
- backend process died
- backend process cannot be spawned
- backend overloaded
- connect() timeout
- connect() reset (no backend listening, i.e. process down)
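The "select a backend, get called again on failure" behaviour can be sketched as follows. This is only an illustration under assumptions: the names `Failure`, `BackendError`, `balance` and `select` are hypothetical, a backend is modelled as a plain callable (standing in for the action executed via action_enter), and the default selection policy (first remaining backend) is arbitrary.

```python
from enum import Enum, auto


class Failure(Enum):
    """The failure reasons listed above (member names are illustrative)."""
    PROCESS_DIED = auto()
    SPAWN_FAILED = auto()
    OVERLOADED = auto()
    CONNECT_TIMEOUT = auto()
    CONNECT_RESET = auto()


class BackendError(Exception):
    """Raised by a backend for one of the failures the balancer handles."""

    def __init__(self, reason):
        super().__init__(reason.name)
        self.reason = reason


def balance(backends, request, select=lambda candidates: candidates[0]):
    """Pick a backend; after a listed failure, select again from the rest."""
    remaining = list(backends)
    while remaining:
        backend = select(remaining)
        try:
            return backend(request)  # stands in for action_enter
        except BackendError:
            # The balancer "gets called again" without the failed backend.
            remaining.remove(backend)
    return None  # every backend failed; the caller decides what to do


# Two toy backends for trying the sketch out.
def dead_backend(request):
    raise BackendError(Failure.CONNECT_RESET)


def ok_backend(request):
    return "handled " + request
```

Calling `balance([dead_backend, ok_backend], "req")` falls through to the second backend. Any exception other than `BackendError` propagates unchanged, so problems outside the list are not retried.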
There are more problems, like connection dropping after data was sent; but it should not be the job of the balancer to "fix" such problems.

The above problems can be classified in two categories:
- backend overloaded
- backend down
Spawn problems, like restarting after the process died, should be handled in another place.
As balancers can be stacked like other actions, at most one balancer (in a path in the "actions-tree") should have a backlog-queue.

Some ideas for handling the backlog-queue:
- a backend has the states: "alive", "overloaded", "down", "down-retry"
- queue has the states: "alive", "overloaded", "down"
  - if not all backends are "down" or "down-retry", stay/goto "alive"
  - if "alive" and all backends are "down" or "down-retry", goto "overloaded" and start a small timeout
  - if "overloaded" and overload-timeout is reached, goto "down"
  - while "down" and all backends "down", return "503 Service Unavailable" for all requests
- "overloaded" backends are tried again (switched to "alive") after a small timeout (e.g. 3 seconds) or when another request from that backend gets completed.
- "down" backends are tried again (switched to "down-retry") after a small timeout (e.g. 1 second)
- the queue has a limit; once it is reached, return "503 Service Unavailable"
- requests have a timeout for finding a backend; return "504 Gateway Timeout" once it expires
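The ideas above can be sketched as a small state machine. Everything beyond the state names and the 3 second / 1 second examples is an assumption: the class names, the values of `OVERLOAD_TIMEOUT`, `QUEUE_LIMIT` and `REQUEST_TIMEOUT`, and the choice to pass the current time in explicitly. Re-enabling an "overloaded" backend when one of its requests completes is left out.

```python
from collections import deque

BACKEND_RETRY = {"overloaded": 3.0, "down": 1.0}  # example values from the text
OVERLOAD_TIMEOUT = 5.0   # assumed; the text only says "a small timeout"
QUEUE_LIMIT = 2          # assumed
REQUEST_TIMEOUT = 10.0   # assumed


class Backend:
    def __init__(self):
        self.state, self.since = "alive", 0.0

    def mark(self, state, now):
        self.state, self.since = state, now

    def tick(self, now):
        """Switch to the retry state once the backend's timeout has passed."""
        wait = BACKEND_RETRY.get(self.state)
        if wait is not None and now - self.since >= wait:
            self.mark("alive" if self.state == "overloaded" else "down-retry", now)


class BacklogQueue:
    def __init__(self, backends):
        self.backends = backends
        self.state, self.since = "alive", 0.0
        self.waiting = deque()  # (request, enqueue time) pairs

    def update(self, now):
        for backend in self.backends:
            backend.tick(now)
        if not all(b.state in ("down", "down-retry") for b in self.backends):
            self.state = "alive"
        elif self.state == "alive":
            self.state, self.since = "overloaded", now  # start the timeout
        elif self.state == "overloaded" and now - self.since >= OVERLOAD_TIMEOUT:
            self.state = "down"

    def submit(self, request, now):
        """Return 503 if the request is rejected, None if it was queued."""
        self.update(now)
        if self.state == "down" and all(b.state == "down" for b in self.backends):
            return 503  # Service Unavailable: nothing to try
        if len(self.waiting) >= QUEUE_LIMIT:
            return 503  # Service Unavailable: queue limit reached
        self.waiting.append((request, now))
        return None

    def expire(self, now):
        """Remove requests that waited too long; each would get a 504."""
        expired = []
        while self.waiting and now - self.waiting[0][1] >= REQUEST_TIMEOUT:
            expired.append(self.waiting.popleft()[0])
        return expired
```

A caller would run `update` (or `submit`) whenever a request arrives or a timer fires, and answer every request returned by `expire` with "504 Gateway Timeout".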