Mod magnet » History » Revision 43

Revision 42 (mamo, 2007-05-10 17:14) → Revision 43/117 (Anonymous, 2007-06-02 18:31)

{{{ 
 #!rst 
 ============== 
 a power-magnet 
 ============== 

 ------------------ 
 Module: mod_magnet 
 ------------------ 

 .. contents:: Table of Contents 

 Requirements 
 ============ 

 :Version: lighttpd 1.4.12 or higher 
 :Packages: lua >= 5.1 

 Overview 
 ======== 

 mod_magnet is a module to control the request handling in lighty.  

 .. note:: 

   Keep in mind that the magnet is executed in the core of lighty. EVERY long-running operation blocks
   ALL connections in the server. You have been warned. For time-consuming or blocking scripts use mod_fastcgi and friends.

 For performance reasons mod_magnet caches the compiled script. For each script-run the script itself is checked for  
 freshness and recompiled if necessary. 

 External Resources 
 ================== 

 * darix is maintaining the cleanurl.lua at http://pixel.global-banlist.de/ 
 * http://www.sitepoint.com/blogs/2007/04/10/faster-page-loads-bundle-your-css-and-javascript/ 

 Installation 
 ============ 

 mod_magnet needs a lighty which is compiled with the lua-support option (--with-lua). Lua 5.1 or higher is required by
 the module. Use "--with-lua=lua5.1" to install on Debian and friends. ::

   server.modules = ( ..., "mod_magnet", ... ) 

 Options 
 ======= 

 mod_magnet can attract a request in several stages in the request-handling.  

 * either at the same level as mod_rewrite, before any parsing of the URL is done 
 * or at a later stage, when the doc-root is known and the physical-path is already setup 

 It depends on the purpose of the script which stage you want to intercept. Usually you want to use 
 the 2nd stage where the physical-path which relates to your request is known. At this level you 
 can run checks against lighty.env["physical.path"]. 

 :: 

   magnet.attract-raw-url-to = ( ... ) 
   magnet.attract-physical-path-to = ( [absolute path to lua script]    ) 

 You can specify multiple scripts, separated by commas. The scripts are executed in the specified
 order. If one of them returns a bad status-code, the following scripts will not be executed.

 Tables 
 ====== 

 Most of the interaction between mod_magnet and lighty is done through tables. Tables in lua are hashes (Perl, Ruby), dictionaries (Java, Python), associative arrays (PHP), ...

 * lighty.request[] 
 * lighty.env[] 

  * physical.path 
  * physical.rel-path 
  * physical.doc-root 
  * uri.path (the URI without the query-string) 
  * uri.path-raw  
  * uri.scheme (http or https) 
  * uri.authority (the server-name) 
  * uri.query (the URI after the ? ) 
  * request.method (e.g. GET) 
  * request.uri (uri after rewrite) 
  * request.orig-uri (before rewrite) 
  * request.protocol (e.g. HTTP/1.0) 

 * lighty.header[] 
 * lighty.status[] 
 * lighty.content[] 
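
 As a small sketch of how such tables behave in plain Lua (the table ``env`` below is a hypothetical stand-in, not the real lighty.env[]), keys are strings and a missing key yields nil: ::

   -- hypothetical stand-in for a lighty table; runs in any Lua 5.1 interpreter
   local env = { ["uri.path"] = "/search.php", ["uri.query"] = "q=lighty" }

   -- reading an existing key returns its value
   local path = env["uri.path"]

   -- reading a missing key returns nil instead of raising an error
   local missing = env["no.such.key"]

   -- writing a key creates or overwrites the entry
   env["uri.query"] = "q=magnet"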



 lighty.env[] 
 ------------ 

 Lighttpd has its internal variables which are exported as read/write to the magnet.  

 If "http://example.org/search.php?q=lighty" is requested this results in a request like :: 

   GET /search.php?q=lighty HTTP/1.1 
   Host: example.org 

 When you are using ``attract-raw-url-to`` you can access the following variables: 

 * parts of the request-line 

  * lighty.env["request.uri"] = "/search.php?q=lighty" 

 * HTTP request-headers 

  * lighty.request["Host"] = "example.org" 

 Later in the request-handling, the URL is split, cleaned up and turned into a physical path name:

 * parts of the URI 

  * lighty.env["uri.path"] = "/search.php" 
  * lighty.env["uri.path-raw"] = "/search.php" 
  * lighty.env["uri.scheme"] = "http" 
  * lighty.env["uri.authority"] = "example.org" 
  * lighty.env["uri.query"] = "q=lighty" 

 * filenames, pathnames 

  * lighty.env["physical.path"] = "/my-docroot/search.php" 
  * lighty.env["physical.rel-path"] = "/search.php" 
  * lighty.env["physical.doc-root"] = "/my-docroot" 

 All of them are readable, but not all of them are writable (or writing to them has no effect).

 As a start, you might want to use those variables for writing: :: 

   -- 1. simple rewriting is done via the request.uri 
   lighty.env["request.uri"] = ...  
   return lighty.RESTART_REQUEST 

   -- 2. changing the physical-path 
   lighty.env["physical.path"] = ... 

   -- 3. changing the query-string 
   lighty.env["uri.query"] = ... 
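
 The first case could be sketched as follows. The function name ``rewrite_uri`` and the ``/old/`` to ``/new/`` mapping are made up for illustration, and the ``if lighty`` guard only exists so that the snippet also loads outside of lighty: ::

   -- map /old/<rest> to /new/<rest>; returns nil when the URI does not match
   local function rewrite_uri(uri)
     local rest = string.match(uri, "^/old/(.*)$")
     if rest then
       return "/new/" .. rest
     end
     return nil
   end

   if lighty then
     local target = rewrite_uri(lighty.env["request.uri"])
     if target then
       lighty.env["request.uri"] = target
       return lighty.RESTART_REQUEST
     end
   end

 Because the rewritten URI no longer starts with ``/old/``, the restarted request passes through this script unchanged.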

 lighty.header[] 
 --------------- 

 If you want to set a response header for your request, you can add a field to the lighty.header[] table: :: 

   lighty.header["Content-Type"] = "text/html" 

 lighty.status[] 
 --------------- 

 mod_status supports a global statistics page, and mod_magnet allows you to add and update values in the status page:

 Config :: 

   status.statistics-url = "/server-counters" 
   magnet.attract-raw-url-to = (server.document-root + "/counter.lua")

 counter.lua :: 

   lighty.status["core.connections"] = lighty.status["core.connections"] + 1 

 Result:: 

   core.connections: 7 
   fastcgi.backend.php-foo.0.connected: 0 
   fastcgi.backend.php-foo.0.died: 0 
   fastcgi.backend.php-foo.0.disabled: 0 
   fastcgi.backend.php-foo.0.load: 0 
   fastcgi.backend.php-foo.0.overloaded: 0 
   fastcgi.backend.php-foo.1.connected: 0 
   fastcgi.backend.php-foo.1.died: 0 
   fastcgi.backend.php-foo.1.disabled: 0 
   fastcgi.backend.php-foo.1.load: 0 
   fastcgi.backend.php-foo.1.overloaded: 0 
   fastcgi.backend.php-foo.load: 0 

 Exported Functions 
 ================== 

 mod_magnet exports a few functions to the script:

 * print (writes to the error-log) 
 * lighty.stat() 


 print() 
 ------- 

 print() overrides the lua-default version and sends the content to the error-log.

 lighty.stat() 
 ------------- 

 lighty.stat() checks the existence of a file/dir/socket and returns the stat() information for it.
 It uses lighty's internal stat-cache. ::

   /** 
   * array lighty.stat(path) 
   *  
   * @param path (string) absolute path to stat() 
   * @returns array or nil on error 
   */ 

 If the call was successful you'll be able to query the following fields from the array: 

 * is_file 
 * is_dir 
 * is_char 
 * is_block 
 * is_socket 
 * is_link 
 * is_fifo 
 * st_mtime 
 * st_ctime 
 * st_atime 
 * st_uid 
 * st_gid
 * st_size 
 * st_ino 
 * etag 
 * content-type 
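
 A minimal sketch of how these fields might be used. The helper name ``describe`` is an assumption for illustration; it accepts whatever lighty.stat() returned, including nil: ::

   -- turn the result of lighty.stat() into a short text for the error-log
   local function describe(attr)
     if attr == nil then
       return "missing"
     end
     if attr.is_file then
       return "file, " .. attr.st_size .. " bytes"
     end
     if attr.is_dir then
       return "directory"
     end
     return "other"
   end

   if lighty then
     local path = lighty.env["physical.path"]
     print(path .. ": " .. describe(lighty.stat(path)))
   end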


 Sending Content 
 =============== 

 You can generate your own content and send it out to the clients. :: 

   lighty.content = { "<pre>", { filename = "/etc/passwd" }, "</pre>" } 
   lighty.header["Content-Type"] = "text/html" 

   return 200 

 The lighty.content[] table is processed when the script has finished. Its elements are handled left to right, and each element is either a string or a table.

 * Strings 

   * are included as is 

 * Tables 

   * filename = "<absolute-path>" is required 
   * offset = <number> [default: 0] 
   * length = <number> [default: size of the file - offset] 

 Internally lighty will use the sendfile() call to send out the static files at full speed. 
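
 As a sketch of the table form, the hypothetical helper ``file_slice`` below builds an element that sends only part of a file. The path and the byte counts are made-up values: ::

   -- build a lighty.content[] element for a byte range of a file
   local function file_slice(path, offset, length)
     return { filename = path, offset = offset, length = length }
   end

   -- a string, followed by 512 bytes of the file starting at byte 1024
   local content = { "partial dump:\n", file_slice("/tmp/example.log", 1024, 512) }

   if lighty then
     lighty.content = content
     lighty.header["Content-Type"] = "text/plain"
     return 200
   end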

 Status Codes 
 ============ 

 You might have seen it already in other examples: if you handle the request completely in the magnet, you
 can return your own status-codes. Typical uses are redirects and input validation. ::

   if (lighty.env["uri.scheme"] == "http") then 
     lighty.header["Location"] = "https://" .. lighty.env["uri.authority"] .. lighty.env["request.uri"] 
     return 302 
   end 

 Every number greater than or equal to 100 is taken as the final status code and finishes the request. No other modules are
 executed after this return.

 A special return-code is lighty.RESTART_REQUEST (currently equal to 99) which is usually used in combination with  
 changing the request.uri in a rewrite. It restarts the splitting of the request-uri again. 

 If you return nothing (or nil) the request-handling just continues. 
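
 Input validation can follow the same pattern. In the sketch below the helper name ``status_for_method`` and the set of allowed methods are assumptions chosen for illustration: ::

   local allowed = { GET = true, HEAD = true, POST = true }

   -- returns nil to let the request continue, or 405 to finish it
   local function status_for_method(method)
     if allowed[method] then
       return nil
     end
     return 405
   end

   if lighty then
     return status_for_method(lighty.env["request.method"])
   end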

 Debugging 
 ========= 

 To ease debugging we overloaded the print()-function in lua and redirect the output of print() to the error-log. ::

   print("Host: " .. lighty.request["Host"]) 
   print("Request-URI: " .. lighty.env["request.uri"]) 



 Examples 
 ======== 

 Sending text-files as HTML 
 -------------------------- 

 This is a bit simplistic, but it illustrates the idea: Take a text-file and wrap it in a <pre> tag.

 Config-file :: 

   magnet.attract-physical-path-to = (server.document-root + "/readme.lua")

 readme.lua :: 

   lighty.content = { "<pre>", { filename = "/README" }, "</pre>" } 
   lighty.header["Content-Type"] = "text/html" 
  
   return 200 

 Maintenance pages
 -----------------

 Your site might be down for maintenance from time to time. Instead of shutting down the server and confusing all
 users, you can just send a maintenance page.

 Config-file ::

   magnet.attract-physical-path-to = (server.document-root + "/maintenance.lua")

 maintenance.lua ::

   local page = lighty.env["physical.doc-root"] .. "/maintenance.html"

   -- serve the maintenance page only while the file exists
   if lighty.stat(page) then
     lighty.content = { { filename = page } }

     lighty.header["Content-Type"] = "text/html"

     return 200
   end

 mod_flv_streaming 
 ----------------- 

 Config-file :: 

   magnet.attract-physical-path-to = (server.document-root + "/flv-streaming.lua")

 flv-streaming.lua:: 

   if (lighty.env["uri.query"]) then
     -- split the query-string into key/value pairs
     local get = {}
     for k, v in string.gmatch(lighty.env["uri.query"], "(%w+)=(%w+)") do
       get[k] = v
     end

     -- byte offset to start streaming from; defaults to 0
     local start = tonumber(get["start"]) or 0

     -- send the FLV header only when seeking into the file
     local header = ""
     if (start > 0) then
       header = "FLV\1\1\0\0\0\9\0\0\0\9"
     end

     lighty.content = { header,
        { filename = lighty.env["physical.path"], offset = start } }
     lighty.header["Content-Type"] = "video/x-flv"

     return 200
   end

 You can also use a backend like php to use your own authorization or stuff like mod_secdl. Just activate x-rewrite in the backend configuration and use a header like ::

   header("X-Rewrite-URI: flvstreaming?start=" . $start . "&path=" . $path); 

 The request is restarted, and in the lua script you can catch the non-existing URI with the following code (put the streaming code from the example above inside the if-block): ::

   if (string.find(lighty.env["uri.path"], "/flvstreaming")) then
     <flv streaming lua code>
   end

 In the future, there will be a new magnet for response headers, maybe you can give your own headers like:: 

   header("X-StreamMyFlv: $path"); 

 to lua and use the header data as parameter for the streaming. 

  
 selecting a random file from a directory 
 ---------------------------------------- 

 Say, you want to send a random file (ad-content) from a directory.  

 To simplify the code and to improve the performance we assume:

 * all images have the same format (e.g. image/png) 
 * all images use increasing numbers starting from 1 
 * a special index-file names the highest number 

 Config :: 

   server.modules += ( "mod_magnet" ) 
   magnet.attract-physical-path-to = ("random.lua") 

 random.lua :: 

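   local dir = lighty.env["physical.path"]

   -- the index-file holds the highest image number
   local f = assert(io.open(dir .. "/index", "r"))
   local raw = f:read("*all")
   f:close()

   local maxndx = assert(tonumber(raw), "index file must contain a number")
   local ndx = math.random(maxndx)

   lighty.content = { { filename = dir .. "/" .. ndx } }
   lighty.header["Content-Type"] = "image/png"

   return 200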

 denying illegal character sequences in the URL 
 ---------------------------------------------- 

 Instead of implementing mod_security, you might just want to apply filters on the content 
 and deny special sequences that look like SQL injection.  

 A common injection is using UNION to extend a query with another SELECT query. 

 :: 

   if (string.find(lighty.env["request.uri"], "UNION%s")) then 
     return 400 
   end 
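
 Note that a raw request URI normally carries whitespace in encoded form (``%20`` or ``+``), so a stricter check may want to decode the URI first. The following is only a sketch under that assumption; ``url_decode`` and ``looks_like_union`` are hypothetical helper names, and such a filter is no substitute for proper escaping in the backend: ::

   -- decode "+" and %XX sequences of a URI
   local function url_decode(s)
     s = string.gsub(s, "%+", " ")
     s = string.gsub(s, "%%(%x%x)", function(hex)
       return string.char(tonumber(hex, 16))
     end)
     return s
   end

   -- case-insensitive check for "UNION" followed by whitespace
   local function looks_like_union(uri)
     return string.find(string.upper(url_decode(uri)), "UNION%s") ~= nil
   end

   if lighty and looks_like_union(lighty.env["request.uri"]) then
     return 400
   end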

 Traffic Quotas 
 -------------- 

 If you allow your virtual hosts only a certain amount of traffic each month and want to
 disable them once the quota is reached, perhaps this helps: ::

   host_blacklist = { ["www.example.org"] = 0 } 

   if (host_blacklist[lighty.request["Host"]]) then 
     return 404 
   end 

 Just add the hosts you want to blacklist to the host_blacklist table in the way shown. The value does not matter as long as it is neither nil nor false; in Lua, 0 counts as true.

 Complex rewrites 
 ---------------- 

 If you want to implement caching on your document-root and only want to regenerate  
 content if the requested file doesn't exist, you can attract the physical.path: :: 

   magnet.attract-physical-path-to = ( server.document-root + "/rewrite.lua" ) 

 rewrite.lua :: 

   attr = lighty.stat(lighty.env["physical.path"]) 

   if (not attr) then 
     -- we couldn't stat() the file for some reason 
     -- let the backend generate it 

     lighty.env["uri.path"] = "/dispatch.fcgi" 
     lighty.env["physical.rel-path"] = lighty.env["uri.path"] 
     lighty.env["physical.path"] = lighty.env["physical.doc-root"] .. lighty.env["physical.rel-path"] 
   end 

 Extension rewrites 
 ------------------ 

 If you want to hide your file extensions (like .php) you can attract the physical.path: :: 

   magnet.attract-physical-path-to = ( server.document-root + "/rewrite.lua" ) 

 rewrite.lua :: 

   attr = lighty.stat(lighty.env["physical.path"] .. ".php") 

   if (attr) then 
     lighty.env["uri.path"] = lighty.env["uri.path"] .. ".php" 
     lighty.env["physical.rel-path"] = lighty.env["uri.path"] 
     lighty.env["physical.path"] = lighty.env["physical.doc-root"] .. lighty.env["physical.rel-path"] 
   end 

 User tracking 
 ------------- 

 ... or how to store data globally in the script-context: 

 Each script has its own script-context. When the script is started it only contains the lua-functions 
 and the special lighty.* name-space. If you want to save data between script runs, you can use the global-script 
 context: 

 :: 

   if (nil == _G["usertrack"]) then
     _G["usertrack"] = {}
   end

   -- requests without a Cookie header are counted under a common key
   local cookie = lighty.request["Cookie"] or "(no cookie)"

   if (nil == _G["usertrack"][cookie]) then
     _G["usertrack"][cookie] = 1
   else
     _G["usertrack"][cookie] = _G["usertrack"][cookie] + 1
   end

   print(_G["usertrack"][cookie])

 The global-context is per script. If you update the script without restarting the server, the context will still be maintained. 


 Porting mod_cml scripts 
 ----------------------- 

 mod_cml got replaced by mod_magnet. 

 A CACHE_HIT in mod_cml:: 
 
   output_include = { "file1", "file2" }  

   return CACHE_HIT 

 becomes:: 

   lighty.content = { { filename = "/path/to/file1" }, { filename = "/path/to/file2" } }

   return 200 

 while a CACHE_MISS like (CML) :: 

   trigger_handler = "/index.php" 

   return CACHE_MISS 

 becomes (magnet) :: 

   lighty.env["request.uri"] = "/index.php" 

   return lighty.RESTART_REQUEST 

 }}}