Bug #2020

closed

HTTP header buffer corruption: "invalid character in key"

Added by intgr almost 15 years ago. Updated over 11 years ago.

Status: Missing Feedback
Priority: Normal
Category: core
Target version: -

Description

I've seen this error happen very rarely, at most 45 times over the last 8 months (some of those instances were probably legitimate header corruption). Very rare considering the server is handling over 6 million requests a day now.
The error log reports something like:

    2009-06-30 16:56:43: (request.c.589) invalid character in key GET /<uri> HTTP/1.0

I've attached an example of a log message. I replaced the sensitive parts of the request with 'X', but I took extra care not to insert/remove any bytes.

This message is followed by a garbled set of HTTP headers. In the attached example, the garbled header looks like:

    Accept-LanguGET/script.php?space=524&[....]HTTP/1.1

The place where "GET" is injected is always the 1024th byte of the request (counting from the first "GET"). Also note that there is no longer a space after "GET" or before "HTTP/1.1" on this line, and that HTTP/1.0 has changed to HTTP/1.1.

And ends with:

     47 -> 400

where 47 is the ASCII code of the '/' character, i.e. the character right after "GET".
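To check the "always the 1024th byte" observation across many log entries, something like the following could be used. It is only a sketch: `injected_offsets` is a hypothetical helper, it handles GET only (the only method seen in this report), and it assumes the raw request bytes have already been extracted from the log.

```python
import re

# A second request line glued into the header block, with or without
# the space after the method (the report shows "GET/script.php").
INJECTED = re.compile(rb"GET ?/")

def injected_offsets(raw: bytes) -> list:
    """Return 1-based positions of every later "GET/" or "GET /",
    counted from the 'G' of the first "GET" in the request."""
    start = raw.find(b"GET")
    if start < 0:
        return []
    # Start searching just past the first "GET" so it is not reported itself.
    return [m.start() - start + 1 for m in INJECTED.finditer(raw, start + 3)]
```

If the observation holds, every corrupted request should yield `[1024]`.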

This happens on Linux kernel 2.6.25.11-0.1-xen. We are not using any reverse proxies (The MT-Proxy-ID and X-Forwarded-For are from the client's gateway proxy)

Line 598 of request.c is:

    log_error_write(srv, __FILE__, __LINE__, "sbsds",
        "invalid character in key", con->request.request, cur, *cur, "-> 400");

So, it prints the headers twice; note that the first copy is corrupted whereas the latter isn't (though it's missing "GET" from the beginning -- don't know if it's supposed to behave like that)
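One possible reading of the output, not confirmed anywhere in this thread: if the format letters in "sbsds" mean s = C string, b = lighttpd buffer and d = decimal integer, with fields separated by a space, then the first copy is the whole request buffer and the second copy is whatever follows the `cur` pointer. If `cur` points at the offending '/', the second copy would naturally start after "GET". The sketch below only emulates that assumed behaviour in Python; `format_log` and the sample request are made up.

```python
def format_log(fmt: str, *args) -> str:
    """Emulate the assumed field handling: one argument per format letter."""
    if len(fmt) != len(args):
        raise ValueError("format letters and arguments must match")
    fields = []
    for letter, arg in zip(fmt, args):
        if letter in "sb":        # assumption: C string or buffer, printed as text
            fields.append(arg)
        elif letter == "d":       # assumption: integer printed in decimal
            fields.append(str(int(arg)))
        else:
            raise ValueError(f"unsupported format letter: {letter!r}")
    return " ".join(fields)

# Made-up request with a second request line glued into a header key.
request = "GET /a HTTP/1.0\r\nAccept-LanguGET/script.php HTTP/1.1\r\nHost: x\r\n"
# Hypothetical: cur points at the first character that is invalid in a key.
cur = request[request.index("/", request.index("Accept-Langu")):]

line = format_log("sbsds", "invalid character in key",
                  request, cur, ord(cur[0]), "-> 400")
```

Under these assumptions the line ends in "47 -> 400" and the second copy begins with "/script.php", which matches the shape of the attached log.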

Two posters on lighttpd forums have also experienced something similar: http://forum.lighttpd.net/topic/146

If it helps, I can post more log messages of this bug.


Files

lighttpd-hdr-corruption.txt (3.41 KB) intgr, 2009-07-03 15:55
lighttpd.conf (9.84 KB) intgr, 2009-08-13 15:18
modules.conf (3.16 KB) intgr, 2009-08-17 10:16
lighttpd.tar.gz (9.76 KB) All configuration files, intgr, 2009-08-17 10:16
Actions #1

Updated by intgr almost 15 years ago

  • Target version deleted (1.4.19)

(Oops, remove 'target version' -- it means something else)

The bug occurs in lighttpd 1.4.19

Also, I've seen this happen with MSIE6, MSIE7, Opera8, Firefox3 and Safari -- so I doubt this is a client-side bug.

Actions #2

Updated by pi3orama over 14 years ago

Could you provide your config file? Which plugins do you use?

Actions #3

Updated by intgr over 14 years ago

Here's the config. I think this configuration file originated from openSUSE 11.0

I can't currently attach the vhost config file because I don't have access to it, but I don't think it changes anything significant.

Actions #4

Updated by pi3orama over 14 years ago

modules.conf is missing. I tried to reproduce your problem with only fcgi php. Using the URLs you provided, I ran http_load for nearly 3 hours at 100 reqs/sec, and everything works fine.

Actions #5

Updated by intgr over 14 years ago

Wait a second... We handle anywhere between 6 and 10 million requests a day. This bug occurs just a few times every month. And you expect to reproduce it in 3 hours?

Anyway, here's modules.conf and all configuration files as a tar.gz archive.

Actions #6

Updated by pi3orama over 14 years ago

Reproducing it is hard work. But I believe that if I reproduce it once, I can reproduce it again and again, and finally solve the problem.

In fact I'm a poor student, the author of http://gitorious.org/currf2. I have bet my future on that project. I have to use it to solve a real bug to show its power to my professor. This is why I'm interested in your problem. If you use x86 + Linux, you are welcome to try it in your environment.

Actions #7

Updated by dmitry.hohlov over 13 years ago

I noticed the same messages in lighttpd's log. We handle about 12 million requests per day. For us this message appears slightly more often, at least once per day. But I think this is not a lighttpd bug but a client-side bug (browser, proxy or whatever else) or a network bug (TCP's error protection isn't the best in the world). Really, why not? A client's browser can have a bug and, in very rare situations, duplicate request headers. pi3orama, you might never reproduce this message simply because you don't use that browser :) I think we can collect the User-Agent headers from these messages and see how many browsers (and browser versions) send us these garbled headers.

On my side I use the command below. It is not very accurate, because it assumes that the User-Agent header appears within the first 6 lines of the request and that it does not appear more than once in those lines.

    grep -h -A 5 'invalid character in key' /var/log/lighttpd/error.log* | grep 'User-Agent'

I got these results:

    User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; ru; rv:1.9.1.15) Gecko/20101026 MRA 5.7 (build 03773) Firefox/3.5.15 sputnik 2.3.0.102
    User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; ru; rv:1.9.1.15) Gecko/20101026 MRA 5.7 (build 03686) Firefox/3.5.15 sputnik 2.3.0.86
    User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:2.0b8pre) Gecko/20101130 Firefox/4.0b8pre
    User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:2.0b8pre) Gecko/20101130 Firefox/4.0b8pre
    User-Agent: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; ru) Opera 10.63
    User-Agent: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; ru) Opera 10.63
    User-Agent: Opera/9.80 (Windows NT 5.1; U; ru) Presto/2.6.30 Version/10.63

Our production environment has only been on lighttpd for three days, so there are not many results. What we see is: Firefox 3.5.15 with different builds of version 2.3 of something called sputnik (which may be the cause), 2 times; Firefox/4.0 beta 8 pre (a beta, no more words needed), 2 times; and Opera 10.63, including Opera/9.80 with Version/10.63 at the end (I can't say anything bad about this browser, I used it before Google Chrome appeared, but it can play jokes sometimes :), 3 times. As for the issue starter's log, we see Chrome/2.0.172.33 (this was really long ago... I'm posting this from Chrome 8.0.552.215, and I think 2.0.172.33 was capable of making such a mess of the headers).
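The grep pipeline above depends on the User-Agent landing within a fixed number of lines and appearing only once. A slightly more robust tally could look like the sketch below. It assumes that every log entry starts with a "YYYY-MM-DD HH:MM:SS: (" prefix, as in the example at the top of this report, and that the dumped request lines follow until the next such prefix. `count_user_agents` is a hypothetical helper, not part of lighttpd.

```python
import re
from collections import Counter

# Assumed shape of the first line of every error-log entry.
ENTRY_START = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}: \(")
MARKER = "invalid character in key"

def count_user_agents(lines):
    """Count the first User-Agent seen in each 'invalid character in key' entry."""
    counts = Counter()
    in_entry = False   # currently inside an entry that contains the marker
    seen_ua = False    # a User-Agent was already counted for this entry
    for line in lines:
        if ENTRY_START.match(line):
            in_entry = MARKER in line
            seen_ua = False
            continue
        if in_entry and not seen_ua and line.lower().startswith("user-agent:"):
            counts[line.split(":", 1)[1].strip()] += 1
            seen_ua = True
    return counts
```

It accepts any iterable of lines, for example an open file object for error.log. Because the headers are printed twice per entry, only the first User-Agent in each entry is counted.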

Actions #8

Updated by intgr over 13 years ago

All 4 INDEPENDENT browser vendors have managed to trigger this symptom in lighttpd and you're saying that it's the browser's fault? Sorry, I don't buy it.

However, if it occurs daily, then isolating the client is only a matter of capturing it on a packet dump. If confirmed, post it and we can close this bug. If not, it could shed some light on what triggers this bug.

Actions #9

Updated by darix over 13 years ago

  • Missing in 1.5.x set to No

Just because it has a User-Agent string from a web browser doesn't mean it actually was a web browser. It is much more likely that it was some script-kiddie script.

Actions #10

Updated by niosop almost 12 years ago

Using 1.4.28 we're also seeing this occasionally. Random characters in the headers will be replaced with something else. For instance:

    Cache-Contr/l: no-cache

It seems it's usually '/', '(' or ')', but we may be getting a lot more corruption that either doesn't happen in the header, or the corrupted character is no longer an invalid character so it just becomes an unknown header. It doesn't happen often, we handle about 15 million requests per day and I see only about 8 of the errors per day. But I'm worried that these are only the ones that are logged (random character is both in header and invalid) and that we're getting more frequent corruption elsewhere.

It's very possible that it's a frequent network issue and that I'm only seeing the packets with multiple errors that cause a TCP checksum collision, but it's not browser-related, as I'm seeing it from a variety of browsers. I'm not seeing any TCP errors on the interface, so it's not between me and the upstream router.
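To illustrate what a checksum collision means here: TCP uses the 16-bit ones' complement Internet checksum, which catches any single flipped bit but misses two flips that cancel out in the sum. The sketch below sums only a payload. The real TCP checksum also covers the TCP header and a pseudo-header, but the cancellation works the same way because the sum is linear. The choice of which bytes to flip is made up for the demonstration.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum over big-endian 16-bit words."""
    if len(data) % 2:
        data += b"\x00"                      # pad to a whole number of words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                       # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def flip(data: bytes, index: int, mask: int) -> bytes:
    """Return a copy of data with the mask bits of one byte inverted."""
    out = bytearray(data)
    out[index] ^= mask
    return bytes(out)

payload = b"Cache-Control: no-cache\r\n"
one_error = flip(payload, 11, 0x40)      # 'o' becomes '/', checksum changes
two_errors = flip(one_error, 5, 0x40)    # '-' becomes 'm', sum is restored
```

Here `one_error` fails the checksum, while `two_errors` passes it even though two bytes are wrong, which is the kind of packet that would reach the server unnoticed.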

Actions #11

Updated by stbuehler over 11 years ago

  • Status changed from New to Missing Feedback

It seems unlikely that we have a memory corruption bug in lighty that only modifies a single character in a header; and while '/' looks like a character lighttpd might try to write, '(' and ')' certainly are not.

There are many alternative explanations:
  • script kiddie/fuzzer trying to break it
  • network corrupted packets
  • bit flips in memory modules on server (unlikely with ECC, also should cause processes to crash sometimes) or client side
  • broken client
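A quick way to test the bit-flip explanation against a logged sample is to check whether the wrong character is one bit away from the expected one. The sketch below assumes the intended header in the example above was "Cache-Control". For '(' and ')' the original characters are not known from this thread, so the neighbours it lists are only candidates.

```python
def bit_distance(a: str, b: str) -> int:
    """Number of bits that differ between two single characters."""
    return bin(ord(a) ^ ord(b)).count("1")

def single_bit_neighbours(ch: str) -> list:
    """Printable ASCII characters that differ from ch in exactly one bit."""
    result = []
    for bit in range(8):
        code = ord(ch) ^ (1 << bit)
        if 0x20 <= code < 0x7F:
            result.append(chr(code))
    return result
```

For instance, `bit_distance("o", "/")` is 1 (0x6F versus 0x2F), so the "Cache-Contr/l" sample is at least consistent with a single flipped bit, wherever the flip happened.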

I don't think we will be able to find out where the bug comes from based on this evidence, so I'll close it with "Missing Feedback" - if you have real information to help us find the source please reopen, for example we'd need:

  • Other hints that lighttpd has a memory corruption bug
  • strace which shows that lighttpd reads a correct header but prints an error for a broken one
  • network pcap (tcpdump) showing a valid request with a "400 Bad Request" response and a matching (by time) log line
  • A way to reproduce it
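For matching a capture against the log by time, the timestamps of the affected entries can be pulled out first and turned into small search windows. This is a sketch under the assumption that log lines look like the example in the description. `capture_windows` and the 2-second slack are arbitrary choices, and the timestamps are whatever local time the server logs in.

```python
import re
from datetime import datetime, timedelta

# Assumed log line shape, taken from the example in the description.
LOG_LINE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}): "
    r"\(request\.c\.\d+\) invalid character in key"
)

def capture_windows(lines, slack_seconds=2):
    """Yield (start, end) datetimes around each matching log line."""
    slack = timedelta(seconds=slack_seconds)
    for line in lines:
        match = LOG_LINE.match(line)
        if match:
            when = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            yield when - slack, when + slack
```

Each window can then be used to narrow the capture down to the few packets worth inspecting by hand.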