Bug #1964

Date previous than 1970 make lighttpd take 100% cpu

Added by gaspa over 5 years ago. Updated about 5 years ago.

Status:FixedStart date:2009-04-17
Priority:NormalDue date:
Assignee:-% Done:

100%

Category:core
Target version:1.4.23
Missing in 1.5.x:

Description

Lighttpd doesn't really close sockets when con->cur_ts (in src/connection.c) has a negative value.
This happens because the connection_close() is guarded by:

1617 if (srv->cur_ts - con->close_timeout_ts > 1) {
1618 connection_close(srv, con);

In fact if cur_ts is negative, and close_timeout_ts is zero (set few rows abowe), it will never be more than 1.

I solved this issue change all the "close_timeout_ts = 0" occurences with "close_timeout_ts = LONG_MIN", but perhaps it's the right thing only with libc.

I know also that's uncommon having a negative time. But, even this case is rare, I found unacceptable having a total use of the CPU, given that it might even bring to a denial of service in some situations.

Thanks.

log.txt Magnifier - Strace Log (1.46 MB) gaspa, 2009-04-17 10:31

connection.patch Magnifier - Simple patch that fix (for me) (894 Bytes) gaspa, 2009-04-17 10:31

Associated revisions

Revision 2532
Added by stbuehler about 5 years ago

Fix 100% cpu usage if time() < 0 (thx to gaspa and cate, fixes #1964)

Revision 2533
Added by stbuehler about 5 years ago

Fix 100% cpu usage if time() < 0 (thx to gaspa and cate, fixes #1964)

History

#1 Updated by gaspa over 5 years ago

About log.txt:
in strace you can see the shutdown() of a socket, without the related close(). In fact, after a few time, the poll()s begin to answer continuosly "POLLIN|POLLHUP".

#2 Updated by icy about 5 years ago

  • Target version changed from 1.4.22 to 1.4.23

#3 Updated by stbuehler about 5 years ago

If you have negative time, why are you certain that you don't have LONG_MIN as time?

I don't think we care enough about negative time to risk changing the special value for timestamps from "0" to LONG_MIN.

#4 Updated by gaspa about 5 years ago

I'm not certain, but you can be <= LONG_MIN at maximum for "one second", and a single second of 100% CPU occupation is acceptable.
Instead, you can be < "0" for years, that's sometimes not acceptable.

About the risk: what kind of risks do you think it can happen? I can't think any.
From what I remember close_timeout is use only in this check and not in other calcolus.

Thanks for your care.
Regards.

#5 Updated by stbuehler about 5 years ago

Iirc we use 0 as "invalid timestamp" in many places. And as we should always use the same value for "invalid timestamp" there are more changes to make... and i don't like that.

And i see no good reason to support negative timestamps...

#6 Updated by cate about 5 years ago

I raised the question of negative time in POSIX, and unfortunately they think that a POSIX application should support old time. BTW they discovered an error on time() specification, and in future errno will be set on errors: they consider that -1 could be also a valid time.

Anyway: "invalid timestamp" must be an invalid value, thus not used on regular calculation. BTW I fail to see the logic of the use: close_timeout_ts.

I would use "con->close_timeout_ts = srv->cur_ts" instead of = 0 to be consistent of the meaning the of the variable, or I would use it as flag.

BTW LONG_MIN is also wrong, time_t could be an other type (signed or unsigned, and also a float.

#7 Updated by stbuehler about 5 years ago

srv->cur_ts changes every second... not a good value for "invalid timestamp".

#8 Updated by gaspa about 5 years ago

stbuehler wrote:

srv->cur_ts changes every second... not a good value for "invalid timestamp".

I really can't catch the point.
The mean of the code that checks close_timeout_ts isn't to check if it's a valid or invalid timestamp; the logic is checking if it passed enough time in order to close the connection or not.
So, the state that "0" is used in other points of the code seems useless to me: different cases use different patterns....

I'm not even saying that you MUST use LONG_MIN (that depends from archs, library implemtation, etc...); I'm just saying that the logic behind close_timeout could be better.

#9 Updated by cate about 5 years ago

stbuehler wrote:

srv->cur_ts changes every second... not a good value for "invalid timestamp".

You are right, and I also forgot to add "-2".

I don't consider good coding standard because you use "invalid timestamp" in an arithmetic calculation, but this is a subjective taste. Also about POSIX: in such case I consider also a security problem having a server with negative timestamps.

#10 Updated by stbuehler about 5 years ago

I guess you are right, srv->cur_ts - 2 looks like a good value to mark it as "close now".

#11 Updated by stbuehler about 5 years ago

  • Status changed from New to Fixed
  • % Done changed from 0 to 100

Applied in changeset r2532.

Also available in: Atom