Project

General

Profile

[WAD] HTTP/2 GOAWAY with excessive PUT requests exceeding stream concurrency limit

Added by stenvaag about 2 years ago

RFC 7540 states that

5.4.2. Stream Error Handling
...
A RST_STREAM is the last frame that an endpoint can send on a stream.
The peer that sends the RST_STREAM frame MUST be prepared to receive
any frames that were sent or enqueued for sending by the remote peer.
These frames can be ignored, except where they modify connection
state (such as the state maintained for header compression
(Section 4.3) or flow control).
...

Adding logging code as described in case #3102 and #3149 I suspect that lighttpd is not ignoring data frames on the stream after RST_STREAM is sent.

From the h2.c source code

static int
h2_recv_data (connection * const con, const uint8_t * const s, const uint32_t len)
{
...
        else {
            if (!h2c->sent_goaway && 0 != alen)
                h2_send_goaway_e(con, H2_E_NO_ERROR);
            return 0;
        }...
}

If I remove (comment out) this goaway in the source code the logging in #3149 shows that lighttpd is prefectly handling a burst of puts and gets from Chrome within one single connection. It even shows that when the initial excess streams are "cut off" with RST_STREAM Chrome respects the SETTINGS_MAX_CONCURRENT_STREAMS of 8 and no more RST_STREAM is sent.

I'm not sure if just removing the sending of goaway is a correct solution.

Is this a bug?

Could this be fixed in lighttpd?


Replies (5)

RE: Possible HTTP/2 bug - Added by gstrauss about 2 years ago

Is this a bug?

Could this be fixed in lighttpd?

Your should stop sharing your presumptions. They are annoying and they continue to be wrong. No, it is not a bug in lighttpd.

5.4.2. Stream Error Handling
...
A RST_STREAM is the last frame that an endpoint can send on a stream.
The peer that sends the RST_STREAM frame MUST be prepared to receive
any frames that were sent or enqueued for sending by the remote peer.
These frames can be ignored, except where they modify connection
state (such as the state maintained for header compression
(Section 4.3) or flow control).
...

None of that states that lighttpd must ignore the frames. It says that the sender of RST_STREAM "can" ignore those frames.

lighttpd treats your uncontrolled [1] spray of initial PUT requests as an attack.
lighttpd's behavior is intentional and is conformant with RFC7540, c.f.

Stream Concurrency
https://www.rfc-editor.org/rfc/rfc7540.html#section-5.1.2

Request Reliability Mechanisms in HTTP/2
https://www.rfc-editor.org/rfc/rfc7540.html#section-8.1.4

Denial-of-Service Considerations
https://www.rfc-editor.org/rfc/rfc7540.html#section-10.5

An HTTP/2 server can send GOAWAY at any time and for any reason. Since lighttpd continues to handle already-accepted request streams, lighttpd chooses to send GOAWAY with NO_ERROR when too many streams with request body are sent. lighttpd could choose to send ENHANCE_YOUR_CALM, though I do not know whether or not clients would treat that as a fatal error on the HTTP/2 connection.

Connection Error Handling
https://datatracker.ietf.org/doc/html/rfc7540#section-5.4.1
After sending the GOAWAY frame for an error condition, the endpoint MUST close the TCP connection.

(Quoting from #3149)

Refering to #3148 I was a little suprised by your conclusion, that I needed to rewrite my code to make this work.

(and from above the original post above)

I'm not sure if just removing the sending of goaway is a correct solution.

You did not read -- or did not understand -- the comment a few lines above the call to h2_send_goaway_e that you commented out. You did not use git annotate to try to understand why that code was added. commit 81d18a8e
https://git.lighttpd.net/lighttpd/lighttpd1.4/commit/81d18a8e359685c169cfd30e6a1574b98aedbaeb


[1] If you actually controlled your concurrent requests, it would be trivial for you to marginally reduce the rate at which your PUT requests are sprayed. You do not currently have that control (or have chosen not to use it). This is a shortcoming in the client code you are using.


Now it remains possible that lighttpd code could be tweaked, similar to what was done in commit 81d18a8e, to workaround the shortcomings of apps such as yours which poorly manage resources (or do not attempt to manage resources at all), or similar apps such as in #3078 which assume error handling, error recovery, and retry is for somebody else to code. However, any such tweaks are not bug fixes, as current lighttpd behavior is conformant with RFC7540.

Before posting a response, please re-read the sections from the RFC I linked above, specifically "Denial-of-Service Considerations".

RE: [WAD] HTTP/2 GOAWAY with excessive PUT requests exceeding stream concurrency limit - Added by stenvaag about 2 years ago

I was now able to find the bug and "the shortcomings of apps such as yours which poorly manage resources (or do not attempt to manage resources at all)" is working as expected.

"You did not read -- or did not understand -- the comment a few lines above the call to h2_send_goaway_e that you commented out", did not help me as the description and the idea described here is correct and not the problem.

After finding the bug, the problem was visible in the timing of the logs from #3149.

The test

if (h2c->half_closed_ts + 2 >= log_monotonic_secs) {

in h2_recv_data is never true and goaway is always sent.

h2_send_refused_stream calls h2_send_rst_stream_id, which never sets h2c->half_closed_ts.

Adding

h2c->half_closed_ts = log_monotonic_secs;

to h2_send_refused_stream solves my original problem.

RE: [WAD] HTTP/2 GOAWAY with excessive PUT requests exceeding stream concurrency limit - Added by gstrauss about 2 years ago

You seem confident that this is a bug. Let me be clear that it is not, you are not helping your case with your approach, and you continue to fail to take those hints. You also have not provided any compelling argument for why this is a bug. "I would like different behavior" is not a compelling argument that the existing behavior is a bug. More direct hint: I strongly recommend changing your approach.

I wrote:

Now it remains possible that lighttpd code could be tweaked, similar to what was done in commit 81d18a8e, to workaround the shortcomings of apps such as yours which poorly manage resources (or do not attempt to manage resources at all), or similar apps such as in #3078 which assume error handling, error recovery, and retry is for somebody else to code. However, any such tweaks are not bug fixes, as current lighttpd behavior is conformant with RFC7540.

True or false: your client app manages resources (i.e. for concurrent PUT requests)
True or false: your client app expects someone else to manage resources (i.e. for concurrent PUT requests)

Your app qualifies as lazy and/or dumb. As I have stated multiple times, lighttpd is treating your app's behavior as an attack. Even though I emphasized that you should re-read the "Denial-of-Service Considerations" section of the RFC, and regardless of whether you agree or disagree, you failed to acknowledge what I have written.

You wrote:

After finding the bug, the problem was visible in the timing of the logs from #3149.
The test
if (h2c->half_closed_ts + 2 >= log_monotonic_secs) {
in h2_recv_data is never true and goaway is always sent.

I think you mean that in your case that test is never true. That code is documented in the comment above it and in the commit message for that change. It works as designed. "half_closed_ts" is short for "half-closed timestamp". None of your streams on the new connection have successfully reached half-closed state in your PUT attack on the server. You appear to acknowledge this with:

"You did not read -- or did not understand -- the comment a few lines above the call to h2_send_goaway_e that you commented out", did not help me as the description and the idea described here is correct and not the problem.


Once again, I state:

Now it remains possible that lighttpd code could be tweaked, similar to what was done in commit 81d18a8e, to workaround the shortcomings of apps such as yours which poorly manage resources (or do not attempt to manage resources at all), or similar apps such as in #3078 which assume error handling, error recovery, and retry is for somebody else to code. However, any such tweaks are not bug fixes, as current lighttpd behavior is conformant with RFC7540.

RE: [WAD] HTTP/2 GOAWAY with excessive PUT requests exceeding stream concurrency limit - Added by stenvaag about 2 years ago

I´m really sorry that my postings have given the impressions that I´m not reading / acknowledging your replies, before I claim a bug is found. I have read all the references in your post to rfc7540, including the comment above the call to h2_send_goaway_e. I´m not posting a lot of messages without reading your answers. I have spent a lot of time on this.

I will give this one more try. The last. Then I will not bother you with this case again. Hopefully I have found a relevant reference in the rfc7540 this time to support my case.

Reading rfc7540 section "5.1. Stream States", in "The lifecycle of a stream is shown in Figure 2.", show that a stream receiving the header is in open state. Sending the RST_STREAM (on excess streams) closes the stream. This is according to what you wrote "None of your streams on the new connection have successfully reached half-closed state in your PUT attack". So the half_closed_ts is not used in this case, and my "bug fix" is wrong and no bug. Sorry about claiming this as a bug.

Later in section "5.1. Stream States" describes the different states of a stream.

In the closed section I read:

If this state is reached as a result of sending a RST_STREAM
frame, the peer that receives the RST_STREAM might have already
sent -- or enqueued for sending -- frames on the stream that
cannot be withdrawn. An endpoint MUST ignore frames that it
receives on closed streams after it has sent a RST_STREAM frame.
An endpoint MAY choose to limit the period over which it ignores
frames and treat frames that arrive after this time as being in
error.
Flow-controlled frames (i.e., DATA) received after sending
RST_STREAM are counted toward the connection flow-control window.
Even though these frames might be ignored, because they are sent
before the sender receives the RST_STREAM, the sender will
consider the frames to count against the flow-control window.

This is strange since the section "5.4.2. Stream Error Handling" says it "may" ignore frames, when this section it says "MUST". It even says that an endpoint MAY choose to limit the period before sending an error.

RE: [WAD] HTTP/2 GOAWAY with excessive PUT requests exceeding stream concurrency limit - Added by gstrauss about 2 years ago

You continue to be mired in the details and are missing the big picture. A server may defend itself from attacks. A server may send GOAWAY whenever it pleases and for whatever reason. Period. End of story. No debate. It's in the RFC. These facts stand by themselves. They are independent of anything else you might find.

There are multiple ways that a server may handle various scenarios. You are requesting that lighttpd modify lighttpd's behavior, but you have been waaaaaaayyyy too thick to pick up on the message that your "I want different behavior, so it must be a bug in your software" is a poor approach. Note how many "bugs" you have filed in the lighttpd issue tracker (instead of "feature" requests) and note how you have titled all of your posts with your conclusion, even when told to ask questions in the forums. (I retitled the post here.) Newsflash: a question is not the same thing as a conclusion/judgement. Adding a question mark to the end of your conclusion is still a conclusion, even if in the technical form of a question. Presenting observations and a hypothesis is a better way to phrase a question. Asking "Can this behavior be modified?" is more polite than "Is this a bug?"

I posted (multiple times now) pointing to commit 81d18a8e which was not a "bugfix" but rather a workaround to poor client behavior. You failed to recognize that your client is similar, and that you, too, are requesting that lighttpd (somewhat) lower lighttpd's defenses to accept your poor client behavior.

I understand that you would like lighttpd to modify lighttpd's behavior to act as a data sink for your client application uncontrolled spray. To keep lighttpd resource use light, lighttpd does not carefully track every closed or refused stream. I think you are requesting that lighttpd discard DATA frames for untracked streams near the time of connection init, when the server has not yet received SETTINGS ackn from the client after lighttpd sends SETTINGS with lighttpd's stream concurrency limit. Now then, a malicious client might never send SETTINGS ackn, so if lighttpd discards DATA frames for untracked streams, lighttpd should do so for a limited time, similar in behavior to the workaround which uses h2c->half_closed_ts. To do this, lighttpd could check if waiting for SETTINGS ackn (and for how long) in h2_recv_data():
if (h2c->half_closed_ts + 2 >= log_monotonic_secs || h2c->sent_settings + 2 >= log_monotonic_secs)
Since those conditions are so similar, then as an implementation detail, lighttpd could overload h2c->half_closed_ts, which I did last night, and you suggested today. From my development branch last night -- you can check the timestamp on the commit -- though I have further updated the comments today:
https://git.lighttpd.net/lighttpd/lighttpd1.4/commit/6bb2689fc318120c6b902d23179208230ed9f9f4

None of the above changes the fact that your client is sending an uncontrolled spray of requests. Again, I refer you to the "Denial-of-Service Considerations" section of RFC7540 and tell you that your client is exhibiting poor behavior that can legitimately be interpreted by a server as an attack.

    (1-5/5)