Revision 75 (gstrauss, 2021-08-19 14:25) → Revision 76/119 (gstrauss, 2021-08-20 04:02)

 
 {{>toc}} 

 h1. lighttpd request manipulation using Lua 

 *Module: mod_magnet* 

 see [[AbsoLUAtion|AbsoLUAtion for additional information, code snippets, examples ...]] 

 h2. Requirements 

 lighttpd 1.4.12 or higher built @--with-lua@ 
 lua >= 5.1 

 h2. Overview 

 mod_magnet enables programmatic manipulation of lighttpd request handling via Lua (programming language) scripts.    mod_magnet allows you to do more complex URL rewrites and caching than you would otherwise be able to do. 

 While the Lua language is very powerful, mod_magnet is not meant to be a general replacement for your regular scripting environment. This is because    mod_magnet is executed in the core of lighttpd and EVERY long-running operation is blocking for ALL connections in the server. You have been warned. For time-consuming or blocking scripts, use mod_fastcgi and friends. 

 For performance reasons, mod_magnet caches the compiled script. For each script-run the script itself is checked for freshness and recompiled if necessary. 


 h2. Options 

 mod_magnet can attract a request in several stages in the request-handling.  

 * when URL is processed (but after rewrite); this is the same stage that mod_proxy and other handlers (fastcgi in certain modes) use which do not need a physical file. 
 * when the doc-root is known and the physical-path is already set up 
 * when response starts, right before response headers are finalized 

 The stage to intercept depends on the purpose of each script. Usually you want the 2nd stage, where the physical path that corresponds to the request is known. At this stage you can run checks against lighty.env["physical.path"]. 

 <pre> 
 magnet.attract-raw-url-to = ( ... ) 
 magnet.attract-physical-path-to = ( "/absolute/path/to/script.lua"    ) 
 magnet.attract-response-start-to = ( ... )    # (since 1.4.56) 
 </pre> 

 You can specify multiple scripts, separated by commas. The scripts are executed in the order listed. If one of them returns a status code that stops normal request handling (see Return Codes below), the scripts after it are not executed. 


 h2. Return Codes 

 * If script @return 0@, or returns nothing (or nil), request handling continues. 
 * If script @return lighty.RESTART_REQUEST@ (currently equal to 99), the request is restarted, reprocessing the request-uri.    This is usually used in combination with changing the @["request.uri"]@ attribute in a rewrite. 
 * If script returns 1xx, a 1xx intermediate response is sent, and request handling continues. 
 * If script returns >= 200, the value is used as final HTTP status code and the response is finalized.    No other modules are executed. 

 Example redirecting "http" to "https": (@lighty.*@ accessors are described further below) 
 <pre> 
   if (lighty.env["uri.scheme"] == "http") then 
     lighty.header["Location"] = "https://" .. lighty.env["uri.authority"] .. lighty.env["request.uri"] 
     return 302 
   end 
 </pre> 
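
The @lighty.RESTART_REQUEST@ return code can be illustrated in a similar way. This is a minimal sketch rather than a definitive implementation: the @/old/@ to @/new/@ mapping and the function name @rewrite_legacy@ are hypothetical, and the stand-in @lighty@ table exists only so that the snippet also runs outside lighttpd.

```lua
-- Stand-in for the table that mod_magnet provides (assumption: only used
-- when this snippet is run outside lighttpd, e.g. for a quick test).
local lighty = lighty or {
  RESTART_REQUEST = 99,
  env = { ["request.uri"] = "/old/page.html?x=1" },
}

-- Hypothetical rule: map /old/... to /new/... and ask lighttpd to
-- reprocess the rewritten request-uri.
local function rewrite_legacy(lighty)
  local uri = lighty.env["request.uri"]
  local rest = uri:match("^/old(/.*)$")
  if rest then
    lighty.env["request.uri"] = "/new" .. rest
    return lighty.RESTART_REQUEST
  end
  return 0 -- nothing to do; request handling continues
end

local status = rewrite_legacy(lighty)
-- Under mod_magnet the script would end with: return status
```

After the restart the script runs again, but the rewritten URI no longer matches @/old/@, so the second pass returns 0 and the request is not restarted a second time.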


 h2. mod_magnet API before lighttpd 1.4.60 


 h3. @lighty.*@ tables 

 Most of the interaction between mod_magnet and lighttpd is done through tables. Tables in lua are similar to hashes (Perl, Ruby), dictionaries (Java, Python), associative arrays (PHP), ... 

 * lighty.request[] - certain request headers like Host, Cookie, or User-Agent are available 
 * lighty.req_env[] - request environment variables 
 * lighty.env[] 
 ** physical.path 
 ** physical.rel-path 
 ** physical.doc-root 
 ** uri.path (the URI without the query-string) 
 ** uri.path-raw  
 ** uri.scheme (http or https) 
 ** uri.authority (the server-name) 
 ** uri.query (the URI after the ? ) 
 ** request.method (e.g. GET) 
 ** request.uri (uri after rewrite) 
 ** request.orig-uri (before rewrite) 
 ** request.path-info 
 ** request.remote-ip 
 ** request.protocol (e.g. "HTTP/1.0", "HTTP/1.1", "HTTP/2.0") 
 ** response.http-status     # (since 1.4.56) (read-only value) 
 ** response.body-length     # (since 1.4.56) (read-only value; not nil only if response body is complete) 
 ** response.body            # (since 1.4.56) (read-only value; not nil only if response body is complete) 
 * lighty.header[] - certain response headers like Location are available 
 * lighty.content[] 
 * lighty.status[] 

 You can loop with "pairs()" through the special tables "lighty.request", "lighty.env", "lighty.req_env" and "lighty.status"; "lighty.header" and "lighty.content" are normal lua tables, so you can use them with "pairs()" too. 
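
As an illustration of looping with @pairs()@, the sketch below collects the available request headers into sorted lines and prints them. The helper name @header_lines@ and the stand-in header values are assumptions made for the example; inside lighttpd the real @lighty.request@ table is used instead.

```lua
-- Stand-in for the table that mod_magnet provides (assumption: only used
-- when this snippet is run outside lighttpd).
local lighty = lighty or {
  request = { ["Host"] = "example.org", ["User-Agent"] = "curl/8.0" },
}

-- Hypothetical helper: turn a header table into sorted "Name: value" lines.
local function header_lines(headers)
  local lines = {}
  for name, value in pairs(headers) do
    lines[#lines + 1] = name .. ": " .. value
  end
  table.sort(lines) -- pairs() order is unspecified, so sort for stable output
  return lines
end

for _, line in ipairs(header_lines(lighty.request)) do
  print(line) -- under mod_magnet, print() writes to the lighttpd error log
end
```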


 h5. lighty.env[] 

 lighttpd exports some of its internal request variables to mod_magnet scripts as read/write values. 

 If "http://example.org/search.php?q=lighty" is requested, this results in a request like: 
 <pre> 
   GET /search.php?q=lighty HTTP/1.1 
   Host: example.org 
 </pre> 

 When you are using @magnet.attract-raw-url-to@ you can access the following variables: 

 * parts of the request-line 
 ** lighty.env["request.uri"] = "/search.php?q=lighty" 

 * HTTP request-headers 
 ** lighty.request["Host"] = "example.org" 

 * parts of the URI 
 ** lighty.env["uri.path"] = "/search.php" 
 ** lighty.env["uri.path-raw"] = "/search.php" 
 ** lighty.env["uri.scheme"] = "http" 
 ** lighty.env["uri.authority"] = "example.org" 
 ** lighty.env["uri.query"] = "q=lighty" 

 Later in the request-handling, the URL is split, cleaned up and turned into a physical path name: 

 * filenames, pathnames 
 ** lighty.env["physical.path"] = "/my-docroot/search.php" 
 ** lighty.env["physical.rel-path"] = "/search.php" 
 ** lighty.env["physical.doc-root"] = "/my-docroot" 

 All of them are readable, but not all of them are writable (or have no effect if you write to them).  

 <pre> 
   -- 1. simple rewriting is done via the request.uri 
   lighty.env["request.uri"] = ...  
   return lighty.RESTART_REQUEST 

   -- 2. changing the physical-path 
   lighty.env["physical.path"] = ... 

   -- 3. changing the query-string 
   lighty.env["uri.query"] = ... 
 </pre> 


 h5. lighty.header[] 

 If you want to set a response header for your request, you can add a field to the lighty.header[] table: 

   lighty.header["Content-Type"] = "text/html" 


 h5. lighty.content[] 

 You can generate your own content and send it out to the clients. 

 <pre> 
   lighty.content = { "<pre>", { filename = "/etc/passwd" }, "</pre>" } 
   lighty.header["Content-Type"] = "text/html" 

   return 200 
 </pre> 

 The lighty.content[] table is executed when the script is finished. The elements of the array are processed left to right and the elements can either be a string or a table. Strings are included AS IS into the output of the request. 

 * Strings 
 ** are included as is 

 * Tables 
 ** filename = "<absolute-path>" is required 
 ** offset = <number> [default: 0] 
 ** length = <number> [default: size of the file] 

 This results in sending the byte range [offset, length-1] of the file. 
 (The 'length' parameter is misnamed: it is the end offset of the range plus 1, not a byte count. It is kept as such for historical compatibility with existing scripts.) 
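
Because @length@ is an end position rather than a byte count, a small helper can make the intent explicit. This is a sketch based on a literal reading of the range rule above; the helper name @file_range@ and the log file path are hypothetical.

```lua
-- Hypothetical helper: build a lighty.content[] file entry that sends
-- `count` bytes starting at byte offset `first`.
-- Per the range rule above, 'length' is the end offset + 1, so it is
-- computed as first + count, not as count.
local function file_range(path, first, count)
  return { filename = path, offset = first, length = first + count }
end

-- Example: bytes 100..149 (50 bytes) of a hypothetical log file.
local content = { "<pre>", file_range("/var/log/app.log", 100, 50), "</pre>" }
-- Under mod_magnet the script would continue with:
--   lighty.content = content
--   lighty.header["Content-Type"] = "text/html"
--   return 200
```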


 h5. lighty.status[] 

 mod_status supports a global statistics page, and mod_magnet lets scripts add and update values on that page: 

 Config 

   status.statistics-url = "/server-counters" 
   magnet.attract-raw-url-to = (server.docroot + "/counter.lua") 

 counter.lua 

   lighty.status["core.connections"] = lighty.status["core.connections"] + 1 

 Result 

   core.connections: 7 
   fastcgi.backend.php-foo.0.connected: 0 
   fastcgi.backend.php-foo.0.died: 0 
   fastcgi.backend.php-foo.0.disabled: 0 
   fastcgi.backend.php-foo.0.load: 0 
   fastcgi.backend.php-foo.0.overloaded: 0 
   fastcgi.backend.php-foo.1.connected: 0 
   fastcgi.backend.php-foo.1.died: 0 
   fastcgi.backend.php-foo.1.disabled: 0 
   fastcgi.backend.php-foo.1.load: 0 
   fastcgi.backend.php-foo.1.overloaded: 0 
   fastcgi.backend.php-foo.load: 0 



 h2. mod_magnet API since lighttpd 1.4.60 

 (The earlier mod_magnet API (above) is still supported, but the newer mod_magnet API (below) should be preferred.) 

 h3. @lighty.r@ request object                                                                       

 (since lighttpd 1.4.60)                                                                             

 |_. @lighty.r@                |_. description |                                                       
 | @lighty.r.req_header[]@     | HTTP request headers |                                                
 | @lighty.r.req_attr[]@       | HTTP request attributes / components |                                
 | @lighty.r.req_env[]@        | HTTP request environment variables |                                  
 | @lighty.r.resp_header[]@    | HTTP response headers | 
 | @lighty.r.resp_body.*@      | HTTP response body attributes and accessors |                         
        
                                                                                                   
 h5. @lighty.r@ request object modification                                                          
        
 * @lighty.r.req_header[]@ allows get/set of request headers                                         
   If modifications would affect config processing, script should return                             
   @lighty.RESTART_REQUEST@ to have lighttpd restart the modified request.                           
   @lighty.r.req_header[]@ differs from the older API @lighty.env[]@ table,                          
   which (previously) did not permit modification of request headers.                                
   Note: header policy is not applied to values set in @lighty.r.req_header[]@;                      
         Do not set unvalidated, untrusted, or non-normalized values.                                
   Note: if iterating using @pairs()@, do not set request header to blank value                      
         during iteration, or else iteration may end up skipping request headers.                     
                                                                                                   
 * @lighty.r.req_attr[]@ allows get/set of request attributes and is detailed further below.                 
   @lighty.r.req_attr[]@ is the same as the (less clearly named) older API @lighty.env[]@            
        
 * @lighty.r.req_env[]@ allows get/set of request environment variables                              
   @lighty.r.req_env[]@ is the same as the older API @lighty.req_env[]@                              
   Note: modifications made to standard CGI environment variables                                    
   will be overwritten by backends recreating the CGI environment.                                   
   However, new variables will persist into the env passed to backend scripts.                       
                                                                                                   
 * @lighty.r.resp_header[]@ allows get/set of response headers                                       
   (Certain connection-level headers such as Connection and                                          
   Transfer-Encoding are restricted from modification)                                               
   @lighty.r.resp_header[]@ differs from the older API @lighty.header[]@ table,
   which is collected and deferred, being applied after the script exits.
   Note: header policy is not applied to values set in @lighty.r.resp_header[]@;
         Do not set unvalidated, untrusted, or non-normalized values.
   To repeat header names, such as Set-Cookie or Link, join the values with "\r\nNAME: "
   @lighty.r.resp_header["Link"] = "http://a.com/a.css\r\nLink: http://b.com/b.js"@
                
 * @lighty.r.resp_body.*@ adds/sets response body content                                            
   @lighty.r.resp_body.*@ differs from the older API @lighty.content[]@ table, 
   which is collected and deferred, being applied after the script exits. 
 |_. @lighty.r.resp_body@       |_. description | 
 | @lighty.r.resp_body.len@     | HTTP response body length | 
 | @lighty.r.resp_body.add()@ | HTTP response body add (string or table) | 
 | @lighty.r.resp_body.set()@ | HTTP response body set (string or table) | 

 <pre> 
 -- examples 
 local r = lighty.r 
 local resp_header = r.resp_header 
 resp_header["Content-Type"] = "text/html" 
 resp_header["Cache-Control"] = "max-age=0" 
 r.resp_body:set({'bar\n'})    -- equivalent to below 'set' 
 -- alternatives 
 r.resp_body.set({'bar\n'})    -- equivalent to above 'set' 
 lighty.r.resp_header["Content-Type"] = "text/html" 
 -- older syntax (less clearly named) 
 lighty.header["Content-Type"] = "text/html" 
 lighty.content = {'bar\n'} 
 </pre> 
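
The "\r\nNAME:" joining rule for repeated response headers (described for @lighty.r.resp_header[]@ above) can be wrapped in a small helper. This is a sketch only: the function name @repeated_header@ and the cookie values are hypothetical. Since header policy is not applied to these values, only trusted, validated values should be passed in.

```lua
-- Hypothetical helper: join several values for one header name so that
-- each value after the first is emitted on its own "NAME: value" line.
local function repeated_header(name, values)
  return table.concat(values, "\r\n" .. name .. ": ")
end

local cookies = repeated_header("Set-Cookie", { "a=1; Path=/", "b=2; Path=/" })
-- Under mod_magnet (lighttpd 1.4.60 or later) this would be assigned with:
--   lighty.r.resp_header["Set-Cookie"] = cookies
```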


 h5. @lighty.r.req_attr[]@ readable attributes 

 |_. @lighty.r.req_attr[]@       |_. description | 
 | @["uri.scheme"]@              | ("http", "https") | 
 | @["uri.authority"]@           | URI authority or Host request header | 
 | @["uri.path"]@                | url-path without the query-string; url-path is url-decoded | 
 | @["uri.path-raw"]@            | url-path without the query-string; url-path is not url-decoded 
                                 (url-path component from @["request.uri"]@, i.e. without query-string) | 
 | @["uri.query"]@               | query-string; URL part following '?'; query-string is not url-decoded | 
 | | | 
 | @["request.method"]@          | request method (e.g. GET) | 
 | @["request.protocol"]@        | request protocol ("HTTP/1.0", "HTTP/1.1", "HTTP/2.0") | 
 | @["request.uri"]@             | URI after mod_rewrite rules, if any, else same as request.orig-uri | 
 | @["request.orig-uri"]@        | URI before mod_rewrite; original request-uri sent by client | 
 | @["request.path-info"]@       | path-info following url-path | 
 | @["request.server-addr"]@     | server addr | 
 | @["request.server-port"]@     | server port | 
 | @["request.remote-addr"]@     | remote addr | 
 | @["request.remote-port"]@     | remote port | 
 | | | 
 | @["physical.doc-root"]@       | filesystem document root (original) | 
 | @["physical.basedir"]@        | filesystem document root (same as physical.doc-root unless adjusted by mod_alias, mod_userdir, ...) | 
 | @["physical.path"]@           | filesystem path to request (beginning with physical.basedir) | 
 | @["physical.rel-path"]@       | filesystem path to request (piece appended to physical.basedir) | 
 | | | 
 | @["response.http-status"]@    | HTTP response status (0 if not yet set) | 
 | @["response.body-length"]@    | HTTP response body length (same as @lighty.r.resp_body.len@) 
                                 (nil unless response body is complete) | 
 | @["response.body"]@           | HTTP response body 
                                 (nil unless response body is complete) 
                                 (copies response in memory; should not be used on very large responses) | 

 A full URI can be reconstructed using components: 
 <pre> 
 local req_attr = lighty.r.req_attr 
 local url = req_attr["uri.scheme"] 
          .. "://" 
          .. req_attr["uri.authority"] 
          .. req_attr["uri.path-raw"] 
          .. (req_attr["uri.query"] and ("?" .. req_attr["uri.query"]) or "") 
 </pre> 

 Some @lighty.r.req_attr[]@ attributes provide similar values to those in the standard CGI/1.1 environment: 
 |_. @lighty.r.req_attr[]@ |_. CGI/1.1 env var | 
 | uri.scheme                | REQUEST_SCHEME      | 
 | uri.authority             | SERVER_NAME         | 
 | uri.path                  | SCRIPT_NAME         | 
 | uri.query                 | QUERY_STRING        | 
 | request.method            | REQUEST_METHOD      | 
 | request.protocol          | SERVER_PROTOCOL     | 
 | request.path-info         | PATH_INFO           | 
 | request.remote-addr       | REMOTE_ADDR         | 
 | request.remote-port       | REMOTE_PORT         | 
 | request.server-addr       | SERVER_ADDR         | 
 | request.server-port       | SERVER_PORT         | 
 | physical.doc-root         | DOCUMENT_ROOT       | 
 | physical.basedir          | DOCUMENT_ROOT       | 
 | physical.path             | SCRIPT_FILENAME     | 
 | physical.rel-path         | SCRIPT_NAME         | 


 h5. @lighty.r.req_attr[]@ writable attributes 

 Modifications to specific attributes or components are fairly direct interfaces 
 into lighttpd internals and do not affect other related attributes, including 
 the full request from which the attributes or component may have been derived. 

 What does this mean? 
 It means: *carefully test your scripts and verify desired behavior.* 

 If full request reprocessing is needed after any modification, 
 e.g. if modifications would affect config processing, script should 
 return @lighty.RESTART_REQUEST@ 

 |_. lighty.r.req_attr[]         |_. description | 
 | @["uri.scheme"]@              | modification has similar effect as changing scheme in mod_extforward 
                                 If reprocessing request is needed, then @return lighty.RESTART_REQUEST@ | 
 | @["uri.authority"]@           | modification should generally be repeated to @lighty.r.req_header["Host"]@ 
                                 If reprocessing request is needed, then @return lighty.RESTART_REQUEST@ | 
 | @["uri.path"]@                | modification discouraged; 
                                 derived from @["request.uri"]@, url-decoded and path-simplified; 
                                 prefer to modify @["request.uri"]@ and @return lighty.RESTART_REQUEST@ | 
 | @["uri.query"]@               | modification affects subsequent use of query-string by other modules 
                                 If reprocessing request is needed, prefer to modify @["request.uri"]@ and @return lighty.RESTART_REQUEST@ | 
 | @["request.uri"]@             | modification has similar effect as using mod_rewrite and should be followed by @return lighty.RESTART_REQUEST@  
                                 so that lighttpd reprocesses the request and reparses the URI into components. 
                                 @"request.uri"@ can be reconstructed using components: @["request.uri"]@ = @["uri.path-raw"]@ "?" @["uri.query"]@ | 
 | @["request.orig-uri"]@        | modification discouraged | 
 | @["request.path-info"]@       | modification affects subsequent use of path-info by other modules 
                                 If reprocessing request is needed, prefer to modify @["request.uri"]@ and @return lighty.RESTART_REQUEST@ | 
 | @["request.remote-addr"]@     | modification changes remote_addr for all subsequent requests on connection 
                                 modification has similar effect as using mod_extforward 
                                 (though mod_extforward additionally updates forwarding headers) 
                                 If reprocessing request is needed, then @return lighty.RESTART_REQUEST@ | 
 | @["request.remote-port"]@     | modification changes remote_addr port for all subsequent requests on connection | 
 | @["physical.doc-root"]@       | modification affects subsequent use of the doc_root by other modules 
                                 (e.g. mod_ssi mod_webdav (limited use as fallback)) | 
 | @["physical.basedir"]@        | modification affects subsequent use of the basedir by other modules 
                                 (e.g. as DOCUMENT_ROOT passed to backend scripts, unless modified elsewhere) | 
 | @["physical.path"]@           | modification affects subsequent use of the path by other modules 
                                 (e.g. mod_staticfile and many other filesystem based modules) 
                                 modification has similar effect as using mod_alias 
                                 (when script called from @magnet.attract-physical-path-to@ hook) | 
 | @["physical.rel-path"]@       | modification affects subsequent use of the relative path by other modules 
                                 (e.g. mod_ssi mod_userdir mod_webdav) | 

 @["physical.*"]@ attributes are valid for scripts called from @magnet.attract-physical-path-to@ hook.    The attributes are not defined in earlier magnet hooks, and have almost no effect in later magnet hooks. 
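
As an illustration of writing to @lighty.r.req_attr[]@, the sketch below changes @["uri.authority"]@, repeats the change to the Host request header, and then asks for the request to be reprocessed, following the notes in the table above. The "www." stripping rule and the function name @strip_www@ are hypothetical, and the stand-in @lighty@ table exists only so the snippet also runs outside lighttpd. Test such a script carefully before relying on it.

```lua
-- Stand-in for the object that mod_magnet provides (assumption: only used
-- when this snippet is run outside lighttpd).
local lighty = lighty or {
  RESTART_REQUEST = 99,
  r = {
    req_attr   = { ["uri.authority"] = "www.example.org" },
    req_header = { ["Host"] = "www.example.org" },
  },
}

-- Hypothetical rule: treat "www.<name>" as "<name>".
local function strip_www(r)
  local authority = r.req_attr["uri.authority"]
  local bare = authority and authority:match("^www%.(.+)$")
  if not bare then
    return 0 -- nothing to change; request handling continues
  end
  r.req_attr["uri.authority"] = bare
  r.req_header["Host"] = bare -- keep the Host header in sync
  return lighty.RESTART_REQUEST -- let config conditions see the new host
end

local status = strip_www(lighty.r)
-- Under mod_magnet the script would end with: return status
```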



 h3. @lighty.c.*@ library functions (experimental) 

 (since lighttpd 1.4.60) 

 *Note: the @lighty.c.*@ namespace is EXPERIMENTAL / UNSTABLE* 
 In the future, these may be removed, altered, or moved to a different namespace. 
 <pre> 

 -- digests and passwords 

 lighty.c.rand()               -- generate pseudo-random number 
 lighty.c.md()                 -- calculate message digest (md5,sha1,sha256,sha512) 
 lighty.c.hmac()               -- calculate HMAC             (md5,sha1,sha256,sha512) 
 lighty.c.digest_eq()          -- timing-safe comparison of two hex digests 
 lighty.c.secret_eq()          -- timing-safe comparison of two strings 

 -- decode/encode 

 lighty.c.b64dec()             -- base64url decode (validate and decode) 
 lighty.c.b64enc()             -- base64url encode, no padding 
 lighty.c.hexdec()             -- hex decode (validate and decode) 
 lighty.c.hexenc()             -- hex encode (uppercase); for lowercase, use lua s = s:lower() 
 lighty.c.xmlenc()             -- xml-encode/html-encode: <>&'\"` 
 lighty.c.urldec()             -- url-decode 
 lighty.c.urlenc()             -- url-encode 
 lighty.c.urldec_query()       -- url-decode query-string into table 
 lighty.c.urlenc_query()       -- url-encode query-string from table 
 lighty.c.urlenc_normalize() -- url-encode normalization 
 lighty.c.fspath_simplify()    -- simplify fspath (remove "/." "/.." "/../" "//") 

 -- misc 

 lighty.c.cookie_tokens()      -- parse HTTP Cookie header into table 
 lighty.c.readdir()            -- dir walk 
 lighty.c.stat()               -- stat() path 
 </pre> 


 h5. message digest (md) and hash-based message authentication code (HMAC) 
 (MD5, SHA1, SHA256, SHA512) 

 @lighty.c.md("algo", "data")@ 
 @lighty.c.hmac("algo", "secret", "data")@ 
 * "algo" can be one of: "md5", "sha1", "sha256", "sha512" 
   (as long as lighttpd is compiled w/ crypto lib supporting those algorithms) 
 * returns uppercase hex string of digest 

 @lighty.c.digest_eq("digest1", "digest2")@ 
 * performs a timing-safe, case-insensitive comparison of two hex digests 
   (timing-safe comparison is slightly more secure than @digest1 == digest2@) 
 * "digest1" and "digest2" are hex strings (of binary digests) 
 * returns boolean true or false 

 @lighty.c.secret_eq("data1", "data2")@ 
 * performs a timing-safe comparison of two strings 
   (and attempts to hide differences in string lengths) 
   (timing-safe comparison is slightly more secure than @data1 == data2@) 
 * "data1" and "data2" are strings 
 * returns boolean true or false 
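
One way these functions might be combined is to check a signature that a client presents for a path. This is a sketch under stated assumptions: the function name @signature_ok@, the secret, and the path are hypothetical, and the stand-in @hmac@ used outside lighttpd is NOT a real HMAC; it only lets the control flow be exercised. Inside lighttpd the real @lighty.c.hmac()@ and @lighty.c.digest_eq()@ are used.

```lua
-- Use the real functions under mod_magnet; otherwise fall back to
-- stand-ins (assumption: the stand-in hmac is a toy checksum, NOT secure).
local c = (lighty and lighty.c) or {
  hmac = function(algo, secret, data)
    local s, sum = secret .. data, 0
    for i = 1, #s do
      sum = (sum * 31 + s:byte(i)) % 4294967296
    end
    return string.format("%08X", sum) -- uppercase hex, like the real API
  end,
  digest_eq = function(a, b)
    return a:upper() == b:upper() -- case-insensitive, but not timing-safe
  end,
}

-- Hypothetical check: does `sig` match the HMAC of `path` under `secret`?
local function signature_ok(secret, path, sig)
  if type(sig) ~= "string" or sig == "" then
    return false
  end
  local expected = c.hmac("sha256", secret, path)
  return c.digest_eq(expected, sig)
end
```

A script could call @signature_ok()@ with a value taken from the query string and @return 403@ when it yields false.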

 h5. decode/encode 

 @lighty.c.b64dec("base64url-string")@ 
 @lighty.c.b64enc("string")@ 
 * base64url decode/encode (without padding) 
 * RFC4648 base64url (URL- and filename-safe standard), not base64 (standard) 
   https://en.wikipedia.org/wiki/Base64 

 @lighty.c.urldec_query("query-string")@ 
 * url-decode query-string into table 
 * table value for a given key is @nil@ if no '=' follows "key" 
 * table value for a given key is blank ("") if blank value follows "key=" 

 h5. misc 

 @lighty.c.cookie_tokens("cookie-string")@ 
 * parse HTTP Cookie header into table 
 * @local cookies = lighty.c.cookie_tokens(lighty.r.req_header['Cookie'])@ 
   @local identity = lighty.c.urldec(cookies["ident"])@ 

 @lighty.c.readdir("/path/to/dir")@ 
 * dir walk 
 * skips "." or ".." for convenience 
 * @local name@ 
   @for name in lighty.c.readdir("/tmp") do r.resp_body:add({name, "\n"}) end@ 

 @lighty.c.stat("/path")@ 
 * checks the existence of a file/dir/socket and returns the @stat()@ information for it 
 * uses lighttpd internal stat-cache 
 * path: (string) absolute path 
 * returns: table with the following fields, or nil on error 
 ** is_file 
 ** is_dir 
 ** is_char 
 ** is_block 
 ** is_socket 
 ** is_link 
 ** is_fifo 
 ** st_mode 
 ** st_mtime 
 ** st_ctime 
 ** st_atime 
 ** st_uid 
 ** st_gid 
 ** st_size 
 ** st_ino 
 ** etag 
 ** content-type 
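
A typical use of @lighty.c.stat()@ is to serve a fallback file when the requested path does not exist. The sketch below assumes it is called from the @magnet.attract-physical-path-to@ hook; the fallback path, the function name @fallback_if_missing@, and the stand-in @lighty@ table (used only outside lighttpd) are hypothetical.

```lua
local FALLBACK = "/my-docroot/index.html" -- hypothetical fallback file

-- Stand-in for the object that mod_magnet provides (assumption: only used
-- when this snippet is run outside lighttpd; it pretends that only the
-- fallback file exists).
local lighty = lighty or {
  c = {
    stat = function(path)
      if path == FALLBACK then return { is_file = true } end
      return nil
    end,
  },
  r = { req_attr = { ["physical.path"] = "/my-docroot/app/route" } },
}

-- Point physical.path at the fallback when the requested path is missing.
-- Returns true if the path was replaced, false otherwise.
local function fallback_if_missing(r, stat)
  local st = stat(r.req_attr["physical.path"])
  if st and (st.is_file or st.is_dir) then
    return false -- the requested path exists; leave it alone
  end
  local fb = stat(FALLBACK)
  if fb and fb.is_file then
    r.req_attr["physical.path"] = FALLBACK
    return true
  end
  return false
end

local replaced = fallback_if_missing(lighty.r, lighty.c.stat)
-- Under mod_magnet the script can simply end here (returning nothing
-- lets request handling continue with the possibly updated path).
```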



 h2. Library Functions 

 mod_magnet exports a few additional functions to the script: 

 * @pairs()@ - extends the default @pairs()@ function 
 * @print()@ - writes to the lighttpd error log 
 * @lighty.stat()@ - @stat()@ file (see @lighty.c.stat()@ above) 


 h5. @print()@ 

 @print()@ replaces the lua-default version and redirects the trace to the lighttpd error log.    @print()@ is useful for debugging. 
 <pre> 
 print("Host: " .. lighty.request["Host"]) 
 print("Request-URI: " .. lighty.env["request.uri"]) 
 </pre> 

 h5. @lighty.stat()@ 

 @lighty.stat()@ returns @stat()@ information for given path, e.g. checks existence of file/dir/socket (see @lighty.c.stat()@) 



 h2. Examples 

 see [[AbsoLUAtion|Abso *lua* tion for additional information, code snippets, examples ...]] 


 h2. Porting mod_cml scripts 

 mod_cml has been replaced by mod_magnet. 
 * mod_cml functions @memcache_get_string()@ @memcache_get_long()@ @memcache_exists()@ (and lighttpd.conf @cml.memcache-hosts@) should be replaced with a lua-only solution: 
   https://github.com/silentbicycle/lua-memcached 

 * mod_cml function @dir_files@ should be replaced with @lighty.c.readdir()@ (since 1.4.60) 

 * CACHE_HIT in mod_cml 
 @output_include = { "file1", "file2" }@ 
 @return CACHE_HIT@ 
 becomes 
 @lighty.content = { { filename = "/path/to/file1" }, { filename = "/path/to/file2"} }@ 
 @return 200@ 

 * CACHE_MISS like (CML): 
 @trigger_handler = "/index.php"@ 
 @return CACHE_MISS@ 
 becomes (magnet) 
 @lighty.env["request.uri"] = "/index.php"@ 
 @return lighty.RESTART_REQUEST@ 
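
The two translations above can be combined into one script that serves cached files when all of them exist and otherwise hands the request to the dynamic handler. This is a sketch, not a definitive port: the function name @cached_content@ and the file paths are hypothetical, the stand-in @lighty@ table is only used outside lighttpd, and the check against @FALLBACK_URI@ is an added guard so that the restarted request is not rewritten again.

```lua
-- Stand-in for the table that mod_magnet provides (assumption: only used
-- when this snippet is run outside lighttpd; it pretends nothing is cached).
local lighty = lighty or {
  RESTART_REQUEST = 99,
  env = { ["request.uri"] = "/cached/page" },
  stat = function(path) return nil end,
}

local CACHE_FILES = { "/path/to/file1", "/path/to/file2" } -- hypothetical
local FALLBACK_URI = "/index.php"

-- Build a lighty.content[] list if every cache file exists, else nil.
local function cached_content(files, stat)
  local content = {}
  for i, path in ipairs(files) do
    local st = stat(path)
    if not (st and st.is_file) then
      return nil -- cache miss
    end
    content[i] = { filename = path }
  end
  return content
end

local status = 0
local content = cached_content(CACHE_FILES, lighty.stat)
if content then
  lighty.content = content                -- CACHE_HIT equivalent
  status = 200
elseif lighty.env["request.uri"] ~= FALLBACK_URI then
  lighty.env["request.uri"] = FALLBACK_URI -- CACHE_MISS equivalent
  status = lighty.RESTART_REQUEST
end
-- Under mod_magnet the script would end with: return status
```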

 Questions?    Post in the "lighttpd Forums":https://redmine.lighttpd.net/projects/lighttpd/boards