gstrauss, 2022-02-21 06:53


lighttpd request manipulation using Lua

Module: mod_magnet

Overview

mod_magnet enables programmatic manipulation of lighttpd request handling via Lua scripts.

mod_magnet lua examples demonstrate the power and flexibility of a few lines of Lua with lighttpd. mod_magnet allows you to do more complex URL rewrites and caching than you would otherwise be able to do, including .htaccess-like functionality in lighttpd.

While Lua is a very powerful programming language, learning basic string manipulation and plugging a Lua script into lighttpd should be relatively straightforward for anyone with basic experience in another scripting language (Python, PHP, Ruby, etc.). Please review the Lua examples here and give it a try! Questions? Post in the lighttpd Forums.

Please note: mod_magnet is aimed at request manipulation such as URL rewriting and is not meant to be a general replacement for your entire scripting environment. mod_magnet executes Lua scripts in the lighttpd core, which makes Lua scripting very fast in lighttpd, but any blocking operation (e.g. I/O to an external database) will pause the lighttpd server for all other requests. Run time-consuming or blocking Lua scripts outside of lighttpd, using mod_fastcgi, mod_proxy, or another dynamic backend. For such scripts, or for access in Lua to the POST request body, one option is a third-party FastCGI daemon such as luafcgid.

Requirements

lighttpd 1.4.12 or higher built --with-lua
lua >= 5.1

Options

mod_magnet can attract a request at several stages of request handling.

  • when URL is processed (but after rewrite); this is the same stage used by mod_proxy and other handlers that do not need a physical file (e.g. fastcgi in certain modes).
  • when the doc-root is known and the physical-path is already set up
  • when response starts, right before response headers are finalized

The stage to intercept depends on the purpose of each script. Usually you want to use the 2nd stage, where the physical-path which relates to your request is known. At this level you can run checks against lighty.env["physical.path"].

magnet.attract-raw-url-to = ( ... )
magnet.attract-physical-path-to = ( "/absolute/path/to/script.lua"  )
magnet.attract-response-start-to = ( ... )  # (since 1.4.56)

You can define multiple scripts when separated by a comma. The scripts are executed in the specified order. If one of them returns a bad status-code, the scripts following will not be executed.

For performance reasons, mod_magnet caches the compiled script. For each request, the script is checked. If the script has been modified, the script is reloaded and recompiled (no need to restart lighttpd).

Return Codes

  • If script returns 0, or returns nothing (or nil), request handling continues.
  • If script returns lighty.RESTART_REQUEST (currently equal to 99), the request is restarted, reprocessing the request-uri. This is usually used in combination with changing the ["request.uri"] attribute in a rewrite.
  • If script returns 1xx, a 1xx intermediate response is sent, and request handling continues.
  • If script returns >= 200, the value is used as final HTTP status code and the response is finalized. No other modules are executed.

Example redirecting "http" to "https": (lighty.* accessors are described further below)

  if (lighty.env["uri.scheme"] == "http") then
    lighty.header["Location"] = "https://" .. lighty.env["uri.authority"] .. lighty.env["request.uri"]
    return 308
  end
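
The return codes listed above can be combined into a small routing sketch. This is an illustration only, not code from lighttpd: the function name route and the target "/index.php" are hypothetical, and the function takes the lighty table as a parameter so it can also be exercised outside lighttpd. It restarts the request after rewriting the URI for a missing file, and returns 0 otherwise so that request handling continues.

```lua
-- Rewrite requests for non-existent files to a front controller.
-- "/index.php" is a hypothetical target; adjust for your site.
local function route(l)
  local path = l.env["physical.path"]
  -- guard: do not rewrite again if the request already targets the controller
  if l.env["request.uri"] == "/index.php" then
    return 0
  end
  if path and not l.stat(path) then
    l.env["request.uri"] = "/index.php"
    return l.RESTART_REQUEST        -- reprocess the rewritten request-uri
  end
  return 0                          -- file exists: continue normal handling
end

-- In a script attached via magnet.attract-physical-path-to, finish with:
--   return route(lighty)
```

Passing the table in explicitly is a design choice made for testability; a script could equally reference the global lighty directly.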

mod_magnet API before lighttpd 1.4.60

lighty.* tables

Most of the interaction between mod_magnet and lighttpd is done through tables. Tables in lua are similar to hashes (Perl, Ruby), dictionaries (Java, Python), associative arrays (PHP), ...

  • lighty.request[] - certain request headers like Host, Cookie, or User-Agent are available
  • lighty.req_env[] - request environment variables
  • lighty.env[]
    • physical.path
    • physical.rel-path
    • physical.doc-root
    • uri.path (the URI without the query-string)
    • uri.path-raw
    • uri.scheme (http or https)
    • uri.authority (the server-name)
    • uri.query (the URI after the ? )
    • request.method (e.g. GET)
    • request.uri (uri after rewrite)
    • request.orig-uri (before rewrite)
    • request.path-info
    • request.remote-ip
    • request.protocol (e.g. "HTTP/1.0", "HTTP/1.1", "HTTP/2.0")
    • response.http-status # (since 1.4.56) (read-only value)
    • response.body-length # (since 1.4.56) (read-only value; not nil only if response body is complete)
    • response.body # (since 1.4.56) (read-only value; not nil only if response body is complete)
  • lighty.header[] - certain response headers like Location are available
  • lighty.content[]
  • lighty.status[]

You can loop with "pairs()" through the special tables "lighty.request", "lighty.env", "lighty.req_env" and "lighty.status"; "lighty.header" and "lighty.content" are normal Lua tables, so you can use them with "pairs()" too.
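
As a minimal sketch of such a loop, the snippet below collects every request header into sorted "Name: value" lines and prints them. The helper name dump_headers and the fallback table are assumptions added so the snippet also runs outside lighttpd; inside lighttpd the real lighty.request table would be used.

```lua
-- Fallback used only when run outside lighttpd (hypothetical sample data).
local lighty = lighty or {
  request = { Host = "example.org", ["User-Agent"] = "curl/8" },
}

-- Collect "Name: value" lines for every header visible through pairs().
local function dump_headers(headers)
  local lines = {}
  for name, value in pairs(headers) do
    lines[#lines + 1] = name .. ": " .. value
  end
  table.sort(lines)   -- pairs() order is unspecified; sort for stable output
  return lines
end

for _, line in ipairs(dump_headers(lighty.request)) do
  print(line)         -- inside lighttpd, print() writes to the error log
end
```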

lighty.env[]

lighttpd exports some of its internal variables to mod_magnet scripts through the lighty.env[] table.

If "http://example.org/search.php?q=lighty" is requested, this results in a request like:

  GET /search.php?q=lighty HTTP/1.1
  Host: example.org

When you are using magnet.attract-raw-url-to you can access the following variables:

  • parts of the request-line
    • lighty.env["request.uri"] = "/search.php?q=lighty"
  • HTTP request-headers
    • lighty.request["Host"] = "example.org"
  • parts of the URI
    • lighty.env["uri.path"] = "/search.php"
    • lighty.env["uri.path-raw"] = "/search.php"
    • lighty.env["uri.scheme"] = "http"
    • lighty.env["uri.authority"] = "example.org"
    • lighty.env["uri.query"] = "q=lighty"

Later in the request-handling, the URL is split, cleaned up and turned into a physical path name:

  • filenames, pathnames
    • lighty.env["physical.path"] = "/my-docroot/search.php"
    • lighty.env["physical.rel-path"] = "/search.php"
    • lighty.env["physical.doc-root"] = "/my-docroot"

All of them are readable, but not all of them are writable, and writing to some of them has no effect.

  -- 1. simple rewriting is done via the request.uri
  lighty.env["request.uri"] = ... 
  return lighty.RESTART_REQUEST

  -- 2. changing the physical-path
  lighty.env["physical.path"] = ...

  -- 3. changing the query-string
  lighty.env["uri.query"] = ...

lighty.header[]

If you want to set a response header for your request, you can add a field to the lighty.header[] table:

lighty.header["Content-Type"] = "text/html"

lighty.content[]

You can generate your own content and send it out to the clients.

  lighty.content = { "<pre>", { filename = "/etc/passwd" }, "</pre>" }
  lighty.header["Content-Type"] = "text/html" 

  return 200

The lighty.content[] table is executed when the script is finished. The elements of the array are processed left to right and the elements can either be a string or a table. Strings are included AS IS into the output of the request.

  • Strings
    • are included as is
  • Tables
    • filename = "<absolute-path>" is required
    • offset = <number> [default: 0]
    • length = <number> [default: size of the file]

This results in sending the range [offset, length-1] of the file.
('length' param is misnamed and actually indicates the range offset + 1; kept as such for historical compatibility with existing scripts)
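
A content array mixing strings and file tables might be built as in the sketch below. The helper name wrap_file and the example path are hypothetical. Only offset is set here; length is left at its default (size of the file) to sidestep the naming quirk noted above.

```lua
-- Build a lighty.content-style array: literal strings plus one file reference.
local function wrap_file(path, offset)
  return {
    "<pre>",
    { filename = path, offset = offset or 0 },  -- length omitted: use default
    "</pre>",
  }
end

-- In a magnet script (path is hypothetical):
--   lighty.content = wrap_file("/var/www/notes.txt", 100)
--   lighty.header["Content-Type"] = "text/html"
--   return 200
```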

lighty.status[]

mod_status supports a global statistics page, and mod_magnet allows adding and updating values in the status page:

Config

status.statistics-url = "/server-counters" 
magnet.attract-raw-url-to = (server.document-root + "/counter.lua")

counter.lua

lighty.status["core.connections"] = lighty.status["core.connections"] + 1

Result

core.connections: 7
fastcgi.backend.php-foo.0.connected: 0
fastcgi.backend.php-foo.0.died: 0
fastcgi.backend.php-foo.0.disabled: 0
fastcgi.backend.php-foo.0.load: 0
fastcgi.backend.php-foo.0.overloaded: 0
fastcgi.backend.php-foo.1.connected: 0
fastcgi.backend.php-foo.1.died: 0
fastcgi.backend.php-foo.1.disabled: 0
fastcgi.backend.php-foo.1.load: 0
fastcgi.backend.php-foo.1.overloaded: 0
fastcgi.backend.php-foo.load: 0

mod_magnet API since lighttpd 1.4.60

(The earlier mod_magnet API (above) is still supported, but the newer mod_magnet API (below) should be preferred.)

lighty.r request object

(since lighttpd 1.4.60)

lighty.r                description
lighty.r.req_header[]   HTTP request headers
lighty.r.req_attr[]     HTTP request attributes / components
lighty.r.req_env[]      HTTP request environment variables
lighty.r.resp_header[]  HTTP response headers
lighty.r.resp_body.*    HTTP response body attributes and accessors

lighty.r request object modification

  • lighty.r.req_header[] allows get/set of request headers
    If modifications would affect config processing, script should return
    lighty.RESTART_REQUEST to have lighttpd restart the modified request.
    lighty.r.req_header[] differs from the older API lighty.env[] table,
    which (previously) did not permit modification of request headers.
    Note: header policy is not applied to values set in lighty.r.req_header[];
    Do not set unvalidated, untrusted, or non-normalized values.
    Note: if iterating using pairs(), do not set request header to blank value
    during iteration, or else iteration may end up skipping request headers.
  • lighty.r.req_attr[] allows get/set of request attributes and is detailed further below.
    lighty.r.req_attr[] is the same as the (less clearly named) older API lighty.env[]
  • lighty.r.req_env[] allows get/set of request environment variables
    lighty.r.req_env[] is the same as the older API lighty.req_env[]
    Note: modifications made to standard CGI environment variables
    will be overwritten by backends recreating the CGI environment.
    However, new variables will persist into the env passed to backend scripts.
  • lighty.r.resp_header[] allows get/set of response headers
    (Certain connection-level headers such as Connection and
    Transfer-Encoding are restricted from modification)
    lighty.r.resp_header[] differs from the older API lighty.header[] table,
    which is collected and deferred, being applied after the script exits.
    Note: header policy is not applied to values set in lighty.r.resp_header[];
    Do not set unvalidated, untrusted, or non-normalized values.
    To repeat header names, such as Set-Cookie or Link, join with "\r\nNAME:"
    lighty.r.resp_header["Link"] = "http://a.com/a.css\r\nLink: http://b.com/b.js"
  • lighty.r.resp_body.* adds/sets response body content
    lighty.r.resp_body.* differs from the older API lighty.content[] table,
    which is collected and deferred, being applied after the script exits.
    lighty.r.resp_body        description
    lighty.r.resp_body.len    HTTP response body length
    lighty.r.resp_body.add()  HTTP response body add (string or table)
    lighty.r.resp_body.set()  HTTP response body set (string or table)

-- examples
local r = lighty.r
local resp_header = r.resp_header
resp_header["Content-Type"] = "text/html" 
resp_header["Cache-Control"] = "max-age=0" 
r.resp_body:set({'bar\n'})  -- equivalent to below 'set'
-- alternatives
r.resp_body.set({'bar\n'})  -- equivalent to above 'set'
lighty.r.resp_header["Content-Type"] = "text/html" 
-- older syntax (less clearly named)
lighty.header["Content-Type"] = "text/html" 
lighty.content = {'bar\n'}
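
The "\r\nNAME:" joining rule for repeated response headers can be wrapped in a small helper. This is a sketch under assumptions: join_repeated is a hypothetical name, and the cookie values are made up. Because header policy is not applied to lighty.r.resp_header[], the values passed in must already be validated.

```lua
-- Join several values for one header name into the single string form
-- described above: "v1\r\nName: v2\r\nName: v3".
local function join_repeated(name, values)
  return table.concat(values, "\r\n" .. name .. ": ")
end

-- In a magnet script (cookie values are hypothetical):
--   lighty.r.resp_header["Set-Cookie"] =
--     join_repeated("Set-Cookie", { "a=1; Path=/", "b=2; Path=/" })
```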

lighty.r.req_attr[] readable attributes

lighty.r.req_attr[]       description
["uri.scheme"]            ("http", "https")
["uri.authority"]         URI authority or Host request header
["uri.path"]              url-path without the query-string; url-path is url-decoded
["uri.path-raw"]          url-path without the query-string; url-path is not url-decoded
                          (url-path component from ["request.uri"], i.e. without query-string)
["uri.query"]             query-string; URL part following '?'; query-string is not url-decoded
["request.method"]        request method (e.g. GET)
["request.protocol"]      request protocol ("HTTP/1.0", "HTTP/1.1", "HTTP/2.0")
["request.uri"]           URI after mod_rewrite rules, if any, else same as request.orig-uri
["request.orig-uri"]      URI before mod_rewrite; original request-uri sent by client
["request.path-info"]     path-info following url-path
["request.server-addr"]   server addr
["request.server-port"]   server port
["request.remote-addr"]   remote addr
["request.remote-port"]   remote port
["physical.doc-root"]     filesystem document root (original)
["physical.basedir"]      filesystem document root (same as physical.doc-root unless adjusted by mod_alias, mod_userdir, ...)
["physical.path"]         filesystem path to request (beginning with physical.basedir)
["physical.rel-path"]     filesystem path to request (piece appended to physical.basedir)
["response.http-status"]  HTTP response status (0 if not yet set)
["response.body-length"]  HTTP response body length (same as lighty.r.resp_body.len)
                          (nil unless response body is complete)
["response.body"]         HTTP response body
                          (nil unless response body is complete)
                          (copies response in memory; should not be used on very large responses)

A full URI can be reconstructed using components:

local req_attr = lighty.r.req_attr
local url = req_attr["uri.scheme"]
         .. "://" 
         .. req_attr["uri.authority"]
         .. req_attr["uri.path-raw"]
         .. (req_attr["uri.query"] and ("?" .. req_attr["uri.query"]) or "")

Some lighty.r.req_attr[] attributes provide similar values to those in the standard CGI/1.1 environment:

lighty.r.req_attr[]   CGI/1.1 env var
uri.scheme            REQUEST_SCHEME
uri.authority         SERVER_NAME
uri.path              SCRIPT_NAME
uri.query             QUERY_STRING
request.method        REQUEST_METHOD
request.protocol      SERVER_PROTOCOL
request.path-info     PATH_INFO
request.remote-addr   REMOTE_ADDR
request.remote-port   REMOTE_PORT
request.server-addr   SERVER_ADDR
request.server-port   SERVER_PORT
physical.doc-root     DOCUMENT_ROOT
physical.basedir      DOCUMENT_ROOT
physical.path         SCRIPT_FILENAME
physical.rel-path     SCRIPT_NAME

lighty.r.req_attr[] writable attributes

Modifications to specific attributes or components are fairly direct interfaces
into lighttpd internals and do not affect other related attributes, including
the full request from which the attributes or component may have been derived.

What does this mean?
It means: carefully test your scripts and verify desired behavior.

If full request reprocessing is needed after any modification,
e.g. if modifications would affect config processing, script should
return lighty.RESTART_REQUEST

lighty.r.req_attr[]       description
["uri.scheme"]            modification has similar effect as changing scheme in mod_extforward
                          If reprocessing request is needed, then return lighty.RESTART_REQUEST
["uri.authority"]         modification should generally be repeated to lighty.r.req_header["Host"]
                          If reprocessing request is needed, then return lighty.RESTART_REQUEST
["uri.path"]              modification discouraged;
                          derived from ["request.uri"], url-decoded and path-simplified;
                          prefer to modify ["request.uri"] and return lighty.RESTART_REQUEST
["uri.query"]             modification affects subsequent use of query-string by other modules
                          If reprocessing request is needed, prefer to modify ["request.uri"] and return lighty.RESTART_REQUEST
["request.uri"]           modification has similar effect as using mod_rewrite and should be followed by return lighty.RESTART_REQUEST
                          so that lighttpd reprocesses the request and reparses the URI into components.
                          "request.uri" can be reconstructed using components: ["request.uri"] = ["uri.path-raw"] "?" ["uri.query"]
["request.orig-uri"]      modification discouraged
["request.path-info"]     modification affects subsequent use of path-info by other modules
                          If reprocessing request is needed, prefer to modify ["request.uri"] and return lighty.RESTART_REQUEST
["request.remote-addr"]   modification changes remote_addr for all subsequent requests on connection
                          modification has similar effect as using mod_extforward
                          (though mod_extforward additionally updates forwarding headers)
                          If reprocessing request is needed, then return lighty.RESTART_REQUEST
["request.remote-port"]   modification changes remote_addr port for all subsequent requests on connection
["physical.doc-root"]     modification affects subsequent use of the doc_root by other modules
                          (e.g. mod_ssi, mod_webdav (limited use as fallback))
["physical.basedir"]      modification affects subsequent use of the basedir by other modules
                          (e.g. as DOCUMENT_ROOT passed to backend scripts, unless modified elsewhere)
["physical.path"]         modification affects subsequent use of the path by other modules
                          (e.g. mod_staticfile and many other filesystem based modules)
                          modification has similar effect as using mod_alias
                          (when script called from magnet.attract-physical-path-to hook)
["physical.rel-path"]     modification affects subsequent use of the relative path by other modules
                          (e.g. mod_ssi, mod_userdir, mod_webdav)

["physical.*"] attributes are valid for scripts called from magnet.attract-physical-path-to hook. The attributes are not defined in earlier magnet hooks, and have almost no effect in later magnet hooks.
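
Because writes to one attribute do not propagate to related ones, a script that changes ["uri.authority"] would typically also update the Host request header and then restart the request. The sketch below illustrates this for one hypothetical policy (dropping a leading "www."); the function name strip_www is an assumption, and the lighty table is passed in as a parameter so the logic can be tested outside lighttpd.

```lua
-- Drop a leading "www." from the authority, keeping Host in sync.
local function strip_www(l)
  local authority = l.r.req_attr["uri.authority"]
  local bare = authority and authority:match("^www%.(.+)$")
  if not bare then
    return 0                          -- nothing to change: continue handling
  end
  l.r.req_attr["uri.authority"] = bare
  l.r.req_header["Host"] = bare       -- repeat the change to the Host header
  return l.RESTART_REQUEST            -- let lighttpd re-evaluate its config
end

-- In a magnet script, finish with:
--   return strip_www(lighty)
```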

lighty.c.* library functions (experimental)

(since lighttpd 1.4.60)

Note: the lighty.c.* namespace is EXPERIMENTAL / UNSTABLE
In the future, these may be removed, altered, or moved to a different namespace.


-- digests and passwords

lighty.c.time()             -- seconds since 1 Jan 1970 00:00:00 (cached; faster than os.time())
lighty.c.rand()             -- generate pseudo-random number
lighty.c.md()               -- calculate message digest (md5,sha1,sha256,sha512)
lighty.c.hmac()             -- calculate HMAC           (md5,sha1,sha256,sha512)
lighty.c.digest_eq()        -- timing-safe comparison of two hex digests
lighty.c.secret_eq()        -- timing-safe comparison of two strings

-- decode/encode

lighty.c.b64urldec()        -- base64url decode (validate and decode)
lighty.c.b64urlenc()        -- base64url encode, no padding
lighty.c.b64dec()           -- base64 decode (validate and decode)
lighty.c.b64enc()           -- base64 encode, no padding
lighty.c.hexdec()           -- hex decode (validate and decode)
lighty.c.hexenc()           -- hex encode (uppercase; for lowercase, use s = s:lower() in Lua)
lighty.c.xmlenc()           -- xml-encode/html-encode: <>&'\"`
lighty.c.urldec()           -- url-decode
lighty.c.urlenc()           -- url-encode
lighty.c.urldec_query()     -- url-decode query-string into table (since 1.4.65)
lighty.c.urlenc_query()     -- url-encode query-string from table (since 1.4.65)
lighty.c.urlenc_normalize() -- url-encode normalization
lighty.c.fspath_simplify()  -- simplify fspath (remove "/." "/.." "/../" "//")
lighty.c.quoteddec()        -- decode MIME quoted-string (since 1.4.65)
lighty.c.quotedenc()        -- encode input as MIME quoted-string (since 1.4.65)

-- misc

lighty.c.cookie_tokens()    -- parse HTTP Cookie header into table (since 1.4.65)
lighty.c.header_tokens()    -- parse HTTP header into sequence table (since 1.4.65)
lighty.c.readdir()          -- dir walk
lighty.c.stat()             -- stat() path

message digest (md) and hash-based message authentication code (HMAC)
(MD5, SHA1, SHA256, SHA512)
lighty.c.md("algo", "data")
lighty.c.hmac("algo", "secret", "data")
  • "algo" can be one of: "md5", "sha1", "sha256", "sha512"
    (as long as lighttpd is compiled w/ crypto lib supporting those algorithms)
  • returns uppercase hex string of digest
lighty.c.digest_eq("digest1", "digest2")
  • performs a timing-safe, case-insensitive comparison of two hex digests
    (timing-safe comparison is slightly more secure than digest1 == digest2)
  • "digest1" and "digest2" are hex strings (of binary digests)
  • returns boolean true or false
lighty.c.secret_eq("data1", "data2")
  • performs a timing-safe comparison of two strings
    (and attempts to hide differences in string lengths)
    (timing-safe comparison is slightly more secure than data1 == data2)
  • "data1" and "data2" are strings
  • returns boolean true or false
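
These functions could be combined to check a signed, expiring link, roughly as sketched below. The signing scheme (HMAC-SHA256 over the path, a newline, and the expiry time) and the function name link_is_valid are hypothetical choices for illustration. The parameter c stands for lighty.c and is passed in so the logic can be exercised with a stand-in outside lighttpd.

```lua
-- Return true if signature matches HMAC(secret, path .. "\n" .. expires)
-- and the link has not yet expired. expires is seconds since the epoch.
local function link_is_valid(c, secret, path, expires, signature)
  if type(expires) ~= "number" or type(signature) ~= "string" then
    return false
  end
  if expires < c.time() then
    return false                                  -- link has expired
  end
  local expected = c.hmac("sha256", secret, path .. "\n" .. tostring(expires))
  if not expected then
    return false                                  -- digest not available
  end
  return c.digest_eq(expected, signature)         -- timing-safe comparison
end

-- In a magnet script (secret and parameter sources are up to the script):
--   local ok = link_is_valid(lighty.c, secret, path, expires, signature)
```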
decode/encode
lighty.c.b64urldec("base64url-string")
lighty.c.b64urlenc("string")
lighty.c.b64dec("base64-string")
lighty.c.b64enc("string")
lighty.c.urldec_query("query-string") (since 1.4.65)
lighty.c.urlenc_query(table) (since 1.4.65)
  • url-decode query-string into table or url-encode table into query-string
  • table value for a decoded key is blank ("") if no '=' follows "key"
  • table value for a decoded key is blank ("") if blank value follows "key="
  • encoding results in "key=" if value is blank ("")
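
A decoded query table can then be read like any other Lua table. The sketch below extracts a hypothetical "page" parameter with a default of 1; the function name page_number is an assumption, and c stands for lighty.c (passed in so a stand-in can be used outside lighttpd).

```lua
-- Return the numeric "page" query parameter, or 1 if absent or invalid.
local function page_number(c, query)
  if not query then
    return 1
  end
  local params = c.urldec_query(query)
  -- a key without a value decodes to "", for which tonumber() returns nil
  local n = tonumber(params["page"] or "")
  if n and n >= 1 then
    return math.floor(n)
  end
  return 1
end

-- In a magnet script:
--   local page = page_number(lighty.c, lighty.r.req_attr["uri.query"])
```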
lighty.c.quoteddec("quoted-string") (since 1.4.65)
lighty.c.quotedenc("string") (since 1.4.65)
  • quoteddec: decode input quoted-string (remove surrounding double-quotes, unescape quoted-pairs)
  • quotedenc: encode input as quoted-string (backslash-escape " and \ to quoted-pairs, surround with double-quotes)
misc
lighty.c.cookie_tokens("cookie-string") (since 1.4.65)
  • parse HTTP Cookie header into table (Note: quoted-strings preserved as-is; not decoded)
  • local cookies = lighty.c.cookie_tokens(lighty.r.req_header['Cookie'])
    local identity = lighty.c.urldec(cookies["ident"])
lighty.c.header_tokens("header-string") (since 1.4.65)
  • parse HTTP header into sequence table (Note: quoted-strings preserved as-is; not decoded)
  • sequence table contains words/tokens and separators (, ; =) as separate elements
    (optional whitespace (OWS) and bad whitespace (BWS) are removed)
  • local tokens = lighty.c.header_tokens(lighty.r.req_header['Accept-Encoding'])
  • for i = 1, #tokens do tok = lighty.c.quoteddec(tokens[i]) ......... end
    In the case of Accept-Encoding, each encoding might optionally include ;q=0.x quality,
    which would be multiple elements in the tokens table (; q = 0.x)
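
Given a token sequence of that shape, a script might check whether a particular encoding is offered, as in this sketch. The function name offers is hypothetical, and the sample layout in the comment is an assumption based on the description above. For brevity the sketch ignores quality values, so an encoding listed with q=0 would still count as offered.

```lua
-- tokens: sequence of words and separators, e.g.
--   { "gzip", ";", "q", "=", "0.5", ",", "br" }
-- Returns true if `wanted` (lowercase) appears as the first token of an item.
local function offers(tokens, wanted)
  local at_item_start = true
  for i = 1, #tokens do
    local tok = tokens[i]
    if tok == "," then
      at_item_start = true            -- next token begins a new list item
    else
      if at_item_start and tok:lower() == wanted then
        return true
      end
      at_item_start = false           -- parameters such as q=0.5 are skipped
    end
  end
  return false
end

-- In a magnet script:
--   local ae = lighty.r.req_header["Accept-Encoding"]
--   local has_br = ae ~= nil and offers(lighty.c.header_tokens(ae), "br")
```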
lighty.c.readdir("/path/to/dir")
  • dir walk
  • skips "." or ".." for convenience
  • for name in lighty.c.readdir("/tmp") do lighty.r.resp_body:add({name, "\n"}) end
lighty.c.stat("/path")
  • checks the existence of a file/dir/socket and returns the stat() information for it
  • uses lighttpd internal stat-cache
  • path: (string) absolute path
  • returns: table with the following fields, or nil on error
    • is_file
    • is_dir
    • is_char
    • is_block
    • is_socket
    • is_link
    • is_fifo
    • st_mode
    • st_mtime
    • st_ctime
    • st_atime
    • st_uid
    • st_gid
    • st_size
    • st_ino
    • etag
    • content-type
    • http-response-send-file (similar to mod_staticfile) (since lighttpd 1.4.64)
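
One common use of the returned table is picking the first candidate path that exists as a regular file. The sketch below is illustrative: first_file is a hypothetical helper name, the candidate paths are made up, and the stat function is passed in as a parameter (lighty.c.stat inside lighttpd) so the logic can be tested with a stand-in.

```lua
-- Return the first path in `candidates` that stat() reports as a regular
-- file, or nil if none qualifies.
local function first_file(stat, candidates)
  for _, path in ipairs(candidates) do
    local st = stat(path)             -- nil on error (e.g. path missing)
    if st and st.is_file then
      return path
    end
  end
  return nil
end

-- In a script attached via magnet.attract-physical-path-to (paths hypothetical):
--   local p = lighty.r.req_attr["physical.path"]
--   local found = first_file(lighty.c.stat, { p, p .. ".html" })
--   if found then lighty.r.req_attr["physical.path"] = found end
```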

Library Functions

mod_magnet exports a few additional functions to the script:

  • pairs() - extends the default pairs() function
  • print() - writes to the lighttpd error log
  • lighty.stat() - stat() file (see lighty.c.stat())

print()

print() replaces the lua-default version and redirects the trace to the lighttpd error log. print() is useful for debugging.

print("Host: " .. lighty.request["Host"])
print("Request-URI: " .. lighty.env["request.uri"])

lighty.stat()

lighty.stat() returns stat() information for given path, e.g. checks existence of file/dir/socket (see lighty.c.stat())

Examples

Porting mod_cml scripts

mod_cml has been replaced by mod_magnet.
  • mod_cml function dir_files should be replaced with lighty.c.readdir() (since 1.4.60)
  • CACHE_HIT in mod_cml:
    output_include = { "file1", "file2" }
    return CACHE_HIT
    becomes in mod_magnet:
    lighty.content = { { filename = "/path/to/file1" }, { filename = "/path/to/file2"} }
    return 200
  • CACHE_MISS in mod_cml:
    trigger_handler = "/index.php"
    return CACHE_MISS
    becomes in mod_magnet:
    lighty.env["request.uri"] = "/index.php"
    return lighty.RESTART_REQUEST

Questions? Post in the lighttpd Forums

Updated by gstrauss almost 3 years ago · 92 revisions