Feature #2008

[PATCH] X-Sendfile-Extended header support

Added by shellsage about 5 years ago. Updated almost 5 years ago.

Status: Fixed
Start date: 2009-06-15
Priority: Normal
Due date:
Assignee: -
% Done: 100%
Category: mod_fastcgi
Target version: 1.4.24
Missing in 1.5.x: Yes

Description

I've implemented the X-Sendfile-Extended header for mod_fastcgi.

Header field definition:

X-Sendfile-Extended: x y /path/to/your/file

Where x is the zero-based offset of the first byte to serve, y is the offset of the last byte to serve plus one (an exclusive end), and /path/to/your/file is the path of the file to serve. Both x and y may be -1. When x is -1, an error is noted in the log. When y is -1, y is automatically set to the size of the file. For each X-Sendfile-Extended response header provided, Content-Length is increased accordingly.

Example:

X-Sendfile-Extended: 5 5000 /tmp/data.txt    # Serve byte 5 -> 4999 (inclusive) of /tmp/data.txt
X-Sendfile-Extended: 57 -1 /tmp/data2.txt    # Serve from byte 57 -> end of /tmp/data2.txt
X-Sendfile-Extended: 0 998 /tmp/data3.txt    # Serve byte 0 -> 997 (inclusive) of /tmp/data3.txt
X-Sendfile-Extended: -1 998 /tmp/data3.txt    # Serve byte 0 -> 997 (inclusive) of /tmp/data3.txt, but also show an error in the log because -1 shouldn't be specified as first parameter

This example will serve the given portions of the files in order, assuming that the FastCGI application providing the headers can both preserve response header order and deliver multiple response headers of the same name.
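The semantics described above can be sketched in a few lines of Python. This is only an illustration of the header definition, not the code in the attached patch; the function names and the idea of passing the file size in as an argument are assumptions made to keep the example self-contained.

```python
import logging


def parse_sendfile_extended(value, file_size):
    """Parse an 'x y /path' value into (first, end_exclusive, path).

    Hypothetical helper: file_size is supplied by the caller here,
    whereas a server would obtain it from the file itself.
    """
    x_str, y_str, path = value.split(" ", 2)
    first, end = int(x_str), int(y_str)
    if first == -1:
        # The description says this case is logged as an error;
        # the example above then serves from byte 0.
        logging.error("X-Sendfile-Extended: -1 given as first byte")
        first = 0
    if end == -1:
        # -1 as the second value means "up to the end of the file".
        end = file_size
    return first, end, path


def total_content_length(values, sizes):
    """Sum the byte counts of all header values, in order."""
    total = 0
    for value in values:
        path = value.split(" ", 2)[2]
        first, end, _ = parse_sendfile_extended(value, sizes[path])
        total += end - first
    return total
```

Because y is an exclusive end, each header contributes exactly y - x bytes to Content-Length.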

Important:
When used in conjunction with X-Sendfile (and optionally with X-Sendfile-Range), the files specified by the X-Sendfile-Extended header(s) are served first. Also, Content-Length may not be preserved when X-Sendfile-Extended and X-Sendfile are combined. With this patch, using X-Sendfile-Extended and X-Sendfile at the same time is not recommended: tests showed that the file specified by X-Sendfile was not served fully. stbuehler, maybe you know how to work around this?

x-sendfile-extended.patch - r2539 patch to add X-Sendfile-Extended support (1.98 KB) shellsage, 2009-06-15 02:54

blah.py (1.23 KB) peto, 2009-07-21 21:46


Related issues

Related to Feature #2005: [PATCH] X-Sendfile-Range header support Wontfix 2009-06-11

Associated revisions

Revision 2542
Added by stbuehler about 5 years ago

Remove X-Sendfile-Range feature; it will be replaced with something more powerful (#2005, #2008)

Revision 2651
Added by stbuehler almost 5 years ago

mod_fastcgi: Add "X-Sendfile2" - supporting multiple ranged files (fixes #2008)

History

#1 Updated by shellsage about 5 years ago

  • Status changed from New to Patch Pending

#2 Updated by darix about 5 years ago

ideally we would implement range request support for x-sendfile at the same time.

and how about:
x-sendfile: path [range]

so range becomes an optional argument for x-sendfile

#3 Updated by icy about 5 years ago

darix: it's impossible to add a range parameter to X-Sendfile without breaking backwards compatibility (quoting issues).

#4 Updated by ralf about 5 years ago

shellsage wrote:

I've implemented the X-Sendfile-Extended header for mod_fastcgi.

Header field definition:
[...]
Where x is the zero-based first byte to serve of the file, y is the last byte to serve of the file + 1, and /path/to/your/file is obviously the path of the file to serve. x and y may have a value of -1, although when x is -1 an error will be noted in the log. When y is -1, y is automatically set to the size of the file.

Hi,

why this special syntax?

I mean there is already a "Range" syntax in http:
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.35

#5 Updated by simmel about 5 years ago

ralf wrote:

why this special syntax?

I mean there is already a "Range" syntax in http:
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.35

Yeah, I agree with ralf here. Shouldn't X-Sendfile "just" be patched to take Range into account (if set at all)?

#6 Updated by stbuehler about 5 years ago

No, we cannot use X-Sendfile without breaking old scripts.

#7 Updated by peto about 5 years ago

The difference, I guess, is that this is used to send slices of multiple files. I do think it's cleaner to extend X-Sendfile with X-Sendfile-Range rather than redefining it completely. Define X-Sendfile-Range as affecting the following X-Sendfile only, and permit multiple X-Sendfile headers.

In any case, this should probably mimic the byte-range-resp-spec syntax (defined for Content-Range). For example,

X-Sendfile-Extended: 500-599 /tmp/data.txt # [500,599] inclusive; 100 bytes
X-Sendfile-Extended: 57- /tmp/data2.txt
X-Sendfile-Extended: 0-998 /tmp/data3.txt

This isn't the same as byte-range-resp-spec, which doesn't allow omitting last-byte-pos, but it's closer.

I'd also recommend mimicking 14.16 Content-Range's error handling behavior: if the range is invalid (end < start), ignore the header entirely; don't try to guess what was intended. Don't be lenient when parsing a new header; you don't have existing code to accommodate, so being lenient will just encourage new code to be sloppy.
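A strict parser for the "start-end" and "start-" forms suggested above might look like the following Python sketch. The function name and the choice to signal an invalid range by returning None are assumptions for illustration; nothing here is taken from the lighttpd source.

```python
import re

# Only "start-end" and "start-" are accepted; anything else is invalid.
_RANGE_RE = re.compile(r"([0-9]+)-([0-9]*)")


def parse_range(spec, file_size):
    """Return an inclusive (first, last) pair, or None if spec is invalid."""
    match = _RANGE_RE.fullmatch(spec)
    if match is None:
        return None
    first = int(match.group(1))
    # An omitted end position means "through the last byte of the file".
    last = int(match.group(2)) if match.group(2) else file_size - 1
    if last < first:
        # Invalid range: reject it rather than guessing what was meant.
        return None
    return first, last
```

Since both ends are inclusive, a range (first, last) covers last - first + 1 bytes, so "500-599" is 100 bytes.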

#8 Updated by stbuehler about 5 years ago

  • I think the new header should be named "X-Sendfile-Ranged" (not "-range", as it includes range + filename)
  • the rfc says that you should be able to combine multiple header values with "," - so i guess we should encode the filename somehow to allow this
    urlencode would be nice, but it doesn't encode , or " by default iirc
  • i just had a look again at the http range syntax: i think it is too complicated and not really needed as we will support multiple headers
  • i think we only have to support the following types of ranges - these should be pretty obvious to read; i don't think we should support "-end" or "-lastbytes"
    • complete file (i guess "*" should be ok)
    • a normal range: "start-end"
    • a start offset: "start-"

#9 Updated by shellsage about 5 years ago

peto wrote:

In any case, this should probably mimic the byte-range-resp-spec syntax (defined for Content-Range). For example,

X-Sendfile-Extended: 500-599 /tmp/data.txt # [500,599] inclusive; 100 bytes
X-Sendfile-Extended: 57- /tmp/data2.txt
X-Sendfile-Extended: 0-998 /tmp/data3.txt

I agree, this seems like a good way to do the range.

@stbuehler, I think we should maybe just name the header X-Sendfile-Range, or X-Send-File-Range, because the header will be used to send a "range" from a "file."

To send the complete file, we could just mandate the use of "0-"

Is it not enough to just escape commas in filenames provided to the header? e.g. X-Sendfile-Range: 100-200 /tmp/myfile\,withcomma.txt, 44-555 /tmp/otherfile\,test.txt

#10 Updated by stbuehler about 5 years ago

Ah yes, "*" is not needed. escaping commas is not enough, as filenames may even contain "\r\n".
But we could use urlencode if we use "X-Sendfile-Ranged: filename+with%20spaces 0-,another-file 500-"

#11 Updated by peto about 5 years ago

You've arrived back at exactly the range syntax I suggested...

I'd recommend sticking with URL escapes with the added note that commas must be escaped. (Some URL encoders do escape commas; Python's urllib does.) It fits cleanly with generic header parsing, uses standard urldecode methods with no changes, and the only extra work that might be needed for clients is to say urlencode(s).replace(",", "%2C") instead of just urlencode(s).
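As a rough illustration of that encoding scheme, the Python sketch below uses urllib.parse.quote, which percent-encodes commas, spaces and CR/LF with its default settings. The helper names and the "filename range" item layout joined by commas are assumptions based on the format discussed in this thread, not a finalized specification.

```python
from urllib.parse import quote, unquote


def build_header_value(entries):
    """Join (path, range) pairs into one comma-separated header value."""
    # quote() escapes ',' ' ' '\r' '\n' by default, so an escaped name can
    # never contain the item separator or break the header line.
    return ",".join(f"{quote(path)} {rng}" for path, rng in entries)


def split_header_value(value):
    """Inverse of build_header_value: return a list of (path, range) pairs."""
    result = []
    for item in value.split(","):
        name, rng = item.strip().split(" ", 1)
        result.append((unquote(name), rng))
    return result
```

With an encoder that leaves commas alone, the extra .replace(",", "%2C") step mentioned above would be needed before joining.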

#12 Updated by icy about 5 years ago

Is ignoring the header entirely in case of a bad range really a good idea?
What if multiple headers are specified and one has a bad range. Should we ignore just that one or all X-Sendfile-Range headers?
Probably all because headers could be merged...
But then there won't be any content in the response and still a 200 OK status.
What if we would throw a generic 500 error? Or a 416 "Requested range not satisfiable" or maybe even 502 "Bad Gateway"?

Also, I'd like everyone to give some thought to how to handle normal response content in conjunction with this header. Throw away, prepend or append?

#13 Updated by tx about 5 years ago

I think it would be very nice to include support for X-LIGHTTPD-send-tempfile:

X-LIGHTTPD-send-tempfile-extended: /dev/shm/pre-data.txt
X-Sendfile-Extended: 5 5000 /tmp/data.txt
X-LIGHTTPD-send-tempfile-extended: /dev/shm/post-data.txt

#14 Updated by peto about 5 years ago

normal response content in conjunction with this header

Do whatever X-Sendfile does, so the headers can be used interchangeably without causing obscure differences.

Let's call this X-Sendfile2. Don't call it X-Sendfile-Range; assume that this header will do more things than specifying ranges. Don't call it X-Sendfile-Extended, so if someone really needs to start from scratch with X-Sendfile again, they can call it X-Sendfile3.

Then, let's define the header as having space-separated parameters, like so:

X-Sendfile2: url-escaped-filename range [param1 [param2...]]

and saying that any unknown parameters will cause an error. This lets us extend the header cleanly later on; for example, by saying "if any parameter is the string delete-after-sending, the file will be deleted when the request terminates".
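The proposed format can be sketched as a small Python parser. This is a hypothetical illustration of the proposal, not the implementation that was later committed; the exception class, the function name and the known_params argument are invented for the example, and no parameters are treated as defined by default.

```python
from urllib.parse import unquote


class Sendfile2Error(ValueError):
    """Raised when a header value does not match the proposed format."""


def parse_sendfile2(value, known_params=frozenset()):
    """Parse 'url-escaped-filename range [param1 [param2...]]'."""
    fields = value.split()
    if len(fields) < 2:
        raise Sendfile2Error("expected 'filename range [params...]'")
    filename, range_spec, params = unquote(fields[0]), fields[1], fields[2:]
    # Unknown parameters are an error, so new parameters can be added
    # later without old servers silently ignoring them.
    unknown = [p for p in params if p not in known_params]
    if unknown:
        raise Sendfile2Error("unknown parameter(s): " + ", ".join(unknown))
    return filename, range_spec, params
```

Splitting on whitespace is safe here only because the filename is URL-escaped, so it cannot contain a literal space.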

And let's not actually define more features for this header here. I can think of some trickiness related to the notion of "deleting a file after it's finished sending" that should probably be discussed, if that's what tx meant--let's do that another time, after the basic range stuff is done.

#15 Updated by stbuehler about 5 years ago

I don't think an "uber" powerful specification for the header is a good idea; if we need to combine more than ranges + tmp-file, we should do it with JSON and a new header (I don't see this coming in 1.x).

1.5 already supports send-tempfile, so it should be no problem to support x-send-tempfile-range; I propose using two different header names. The downside of two header names is that the HTTP RFC doesn't care about the order of different headers, so you have to rely on implementation details to be able to mix normal sendfile-range with send-tempfile-range.

#16 Updated by peto about 5 years ago

The method I suggested is simple. It avoids the user-side mess of having to deal with multiple headers for slight variations of the same thing.

Also, this is much cleaner for implementing in other webservers, which is a very good thing. Other webservers are likely to preprocess headers, parsing apart comma-separated header data, combining repeated header names, etc., which will very easily introduce the header ordering problem.

I can easily think up cases where I'd want to mix and match tempfiles and not; for example, to send generated ZIPs by generating headers and footers on the fly, with large static files interleaved. (Actually, I've already written mod_zipfile for this, which I may post at some point...)

For illustration, I've attached a trivial Python script that parses all cases.

#17 Updated by stbuehler almost 5 years ago

  • Missing in 1.5.x set to No

#18 Updated by stbuehler almost 5 years ago

  • Assignee deleted (stbuehler)

#19 Updated by stbuehler almost 5 years ago

  • Status changed from Patch Pending to Fixed
  • % Done changed from 80 to 100

Applied in changeset r2651.

#20 Updated by stbuehler almost 5 years ago

  • Missing in 1.5.x changed from No to Yes
