Anonymous, 2008-10-06 18:04
adding detail to the Overview




==============
a power-magnet
==============

------------------
Module: mod_magnet
------------------

.. contents:: Table of Contents

Requirements
============

:Version: lighttpd 1.4.12 or higher
:Packages: lua >= 5.1

Overview
========

mod_magnet is a module to control the request handling in lighty. It allows you to do more complex URL rewrites and caching than you would otherwise be able to do.

.. note::

  While the Lua language that mod_magnet uses is very powerful, mod_magnet is not meant to be a general replacement for your regular scripting environment. mod_magnet is executed in the core of lighty, so EVERY long-running operation blocks
  ALL connections in the server. You have been warned. For time-consuming or blocking scripts use mod_fastcgi and friends.

For performance reasons mod_magnet caches the compiled script. On each run the script file is checked for
freshness and recompiled if necessary.

External Resources
==================

* darix is maintaining the cleanurl.lua at http://pixel.global-banlist.de/
* Jippi is maintaining the bundle.lua at http://www.cakephp.nu/faster-page-loads-bundle-your-css-and-javascript-lighttpd-mod_magnet-lua
* http://www.sitepoint.com/blogs/2007/04/10/faster-page-loads-bundle-your-css-and-javascript/
* Google CodeSearch is great for looking up Lua examples: http://www.google.com/codesearch (start your queries with "lang:lua")

Installation
============

mod_magnet needs a lighty which is compiled with the lua-support option (--with-lua). The module requires Lua 5.1 or
higher. Use "--with-lua=lua5.1" to install on Debian and friends. ::

  server.modules = ( ..., "mod_magnet", ... )

Options
=======

mod_magnet can attract a request at one of two stages of the request-handling:

* either at the same level as mod_rewrite, before any parsing of the URL is done
* or at a later stage, when the doc-root is known and the physical-path is already setup

Which stage you want to intercept depends on the purpose of the script. Usually you want to use
the 2nd stage, where the physical-path which relates to your request is known. At this level you
can run checks against lighty.env["physical.path"].

::

  magnet.attract-raw-url-to = ( ... )
  magnet.attract-physical-path-to = ( [absolute path to lua script]  )

You can define multiple scripts, separated by commas. The scripts are executed in the specified
order. If one of them returns a bad status-code, the following scripts will not be executed.

Tables
======

Most of the interaction between mod_magnet and lighty is done through tables. Tables in Lua are what other languages call hashes (Perl, Ruby), dictionaries (Java, Python), associative arrays (PHP), ...

* lighty.request[] - certain request headers like Host, Cookie, or User-Agent are available
* lighty.env[]

 * physical.path
 * physical.rel-path
 * physical.doc-root
 * uri.path (the URI without the query-string)
 * uri.path-raw 
 * uri.scheme (http or https)
 * uri.authority (the server-name)
 * uri.query (the URI after the ? )
 * request.method (e.g. GET)
 * request.uri (uri after rewrite)
 * request.orig-uri (before rewrite)
 * request.protocol (e.g. HTTP/1.0)

* lighty.header[] - certain response headers like Location are available
* lighty.status[]
* lighty.content[]

lighty.env[]
------------

Lighttpd has its internal variables which are exported as read/write to the magnet. 

If "http://example.org/search.php?q=lighty" is requested this results in a request like ::

  GET /search.php?q=lighty HTTP/1.1
  Host: example.org

When you are using ``attract-raw-url-to`` you can access the following variables:

* parts of the request-line

 * lighty.env["request.uri"] = "/search.php?q=lighty" 

* HTTP request-headers

 * lighty.request["Host"] = "example.org" 

Later in the request-handling, the URL is split, cleaned up and turned into a physical path name:

* parts of the URI

 * lighty.env["uri.path"] = "/search.php" 
 * lighty.env["uri.path-raw"] = "/search.php" 
 * lighty.env["uri.scheme"] = "http" 
 * lighty.env["uri.authority"] = "example.org" 
 * lighty.env["uri.query"] = "q=lighty" 

* filenames, pathnames

 * lighty.env["physical.path"] = "/my-docroot/search.php" 
 * lighty.env["physical.rel-path"] = "/search.php" 
 * lighty.env["physical.doc-root"] = "/my-docroot" 

All of them are readable, but not all of them are writable (or writing to them has no effect).

As a start, you might want to use those variables for writing: ::

  -- 1. simple rewriting is done via the request.uri
  lighty.env["request.uri"] = ... 
  return lighty.RESTART_REQUEST

  -- 2. changing the physical-path
  lighty.env["physical.path"] = ...

  -- 3. changing the query-string
  lighty.env["uri.query"] = ...
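
The first pattern above can be sketched as a complete script. This is only a minimal sketch, assuming a hypothetical layout where everything under ``/old/`` has moved to ``/new/``; the helper ``rewrite_prefix`` and both prefixes are illustrative names and not part of mod_magnet. The ``lighty`` table only exists inside lighttpd, so the last part is guarded to let the helper be tried in a plain Lua interpreter as well: ::

  -- hypothetical helper: swap a leading path prefix,
  -- or return nil if the prefix does not match
  local function rewrite_prefix(uri, from, to)
    if string.sub(uri, 1, #from) == from then
      return to .. string.sub(uri, #from + 1)
    end
    return nil
  end

  -- only runs inside mod_magnet, where the lighty table is defined
  if lighty then
    local target = rewrite_prefix(lighty.env["request.uri"], "/old/", "/new/")
    if target then
      lighty.env["request.uri"] = target
      return lighty.RESTART_REQUEST
    end
  end

Because the rewritten URI no longer starts with ``/old/``, the restarted request passes through the script untouched instead of looping.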

lighty.header[]
---------------

If you want to set a response header for your request, you can add a field to the lighty.header[] table: ::

  lighty.header["Content-Type"] = "text/html" 

lighty.status[]
---------------

mod_status supports a global statistics page, and mod_magnet allows you to add and update values on the status page:

Config ::

  status.statistics-url = "/server-counters" 
  magnet.attract-raw-url-to = (server.docroot + "/counter.lua")

counter.lua ::

  lighty.status["core.connections"] = lighty.status["core.connections"] + 1

Result::

  core.connections: 7
  fastcgi.backend.php-foo.0.connected: 0
  fastcgi.backend.php-foo.0.died: 0
  fastcgi.backend.php-foo.0.disabled: 0
  fastcgi.backend.php-foo.0.load: 0
  fastcgi.backend.php-foo.0.overloaded: 0
  fastcgi.backend.php-foo.1.connected: 0
  fastcgi.backend.php-foo.1.died: 0
  fastcgi.backend.php-foo.1.disabled: 0
  fastcgi.backend.php-foo.1.load: 0
  fastcgi.backend.php-foo.1.overloaded: 0
  fastcgi.backend.php-foo.load: 0

Exported Functions
==================

mod_magnet exports a few functions to the script:

* print (writes to the error-log)
* lighty.stat()

print()
-------

print() overrides the Lua default version and sends the content to the error-log.

lighty.stat()
-------------

lighty.stat() checks the existence of a file/dir/socket and returns the stat() information for it.
It uses lighty's internal stat-cache. ::

  /**
  * array lighty.stat(path)
  * 
  * @param path (string) absolute path to stat()
  * @returns array or nil on error
  */

If the call was successful you'll be able to query the following fields from the array:

* is_file
* is_dir
* is_char
* is_block
* is_socket
* is_link
* is_fifo
* st_mtime
* st_ctime
* st_atime
* st_uid
* st_gid
* st_size
* st_ino
* etag
* content-type
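
As an illustration of how these fields might be used, the sketch below refuses to serve regular files above a size limit. The helper name ``too_large`` and the 10 MB limit are assumptions made for this example; only ``is_file`` and ``st_size`` come from the list above. ::

  -- hypothetical limit for this example: 10 MB
  local MAX_SIZE = 10 * 1024 * 1024

  -- st is the table returned by lighty.stat(), or nil on error
  local function too_large(st)
    return st ~= nil and st.is_file and st.st_size > MAX_SIZE
  end

  -- only runs inside mod_magnet, where the lighty table is defined
  if lighty then
    if too_large(lighty.stat(lighty.env["physical.path"])) then
      return 403
    end
  end

Fields with a hyphen in their name, such as ``content-type``, have to be read with the bracket syntax (``st["content-type"]``).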

Sending Content
===============

You can generate your own content and send it out to the clients. ::

  lighty.content = { "<pre>", { filename = "/etc/passwd" }, "</pre>" }
  lighty.header["Content-Type"] = "text/html" 

  return 200

The lighty.content[] table is executed when the script is finished. The elements of the array are processed left to right and the elements can either be a string or a table. Strings are included AS IS into the output of the request.

* Strings

  * are included as is

* Tables

  * filename = "<absolute-path>" is required
  * offset = <number> [default: 0]
  * length = <number> [default: size of the file]

This results in sending the range [offset, length-1] of the file.

Internally lighty will use the sendfile() call to send out the static files at full speed.
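
Taking the range rule above at face value, ``length`` acts as the end position of the slice rather than as a byte count. A minimal sketch, assuming a hypothetical helper ``file_slice`` that turns a first and last byte position into such a table (if your lighttpd version treats ``length`` differently, adjust the arithmetic): ::

  -- hypothetical helper: build a lighty.content entry covering the
  -- bytes first..last (both inclusive, counted from 0)
  local function file_slice(path, first, last)
    return { filename = path, offset = first, length = last + 1 }
  end

  -- only runs inside mod_magnet, where the lighty table is defined
  if lighty then
    -- send the first 512 bytes of the requested file
    lighty.content = { file_slice(lighty.env["physical.path"], 0, 511) }
    return 200
  end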

Status Codes
============

You might have seen it already in other examples: in case you are handling the request completely in the magnet you
can return your own status-codes. Examples are redirects, input validation, ... ::

  if (lighty.env["uri.scheme"] == "http") then
    lighty.header["Location"] = "https://" .. lighty.env["uri.authority"] .. lighty.env["request.uri"]
    return 302
  end

Every number greater than or equal to 100 is taken as the final status code and finishes the request. No other modules are
executed after this return.

A special return-code is lighty.RESTART_REQUEST (currently equal to 99) which is usually used in combination with 
changing the request.uri in a rewrite. It restarts the splitting of the request-uri again.

If you return nothing (or nil) the request-handling just continues.
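
The three outcomes (final status, restart, continue) can be combined in one script. The following is a minimal sketch of input validation, assuming you only want to accept a fixed set of request methods; the ``ALLOWED`` table and the helper ``status_for_method`` are hypothetical names chosen for this example: ::

  -- hypothetical whitelist of request methods
  local ALLOWED = { GET = true, HEAD = true, POST = true }

  -- returns a final status code, or nil to let lighty continue
  local function status_for_method(method)
    if ALLOWED[method] then
      return nil
    end
    return 405
  end

  -- only runs inside mod_magnet, where the lighty table is defined
  if lighty then
    local status = status_for_method(lighty.env["request.method"])
    if status then
      lighty.header["Allow"] = "GET, HEAD, POST"
      return status
    end
  end

For an allowed method the script falls off the end without returning anything, so the request-handling simply continues.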

Debugging
=========

To ease debugging, the print() function in Lua is overridden so that its output is redirected to the error-log. ::

  print("Host: " .. lighty.request["Host"])
  print("Request-URI: " .. lighty.env["request.uri"])

Examples
========

Sending text-files as HTML
--------------------------

This is a bit simplistic, but it illustrates the idea: take a text-file and wrap it in a <pre> tag.

Config-file ::

  magnet.attract-physical-path-to = (server.docroot + "/readme.lua")

readme.lua ::

  lighty.content = { "<pre>", { filename = "/README" }, "</pre>" }
  lighty.header["Content-Type"] = "text/html" 

  return 200

Maintenance pages
------------------

Your site might be in maintenance from time to time. Instead of shutting down the server and confusing all
users, you can just send a maintenance page.

Config-file ::

  magnet.attract-physical-path-to = (server.docroot + "/maintenance.lua")

maintenance.lua ::

  if not (nil == lighty.stat(lighty.env["physical.doc-root"] .. "/maintenance.html")) then
    lighty.content = { { filename = lighty.env["physical.doc-root"] .. "/maintenance.html" } }

    lighty.header["Content-Type"] = "text/html" 

    return 200
  end

mod_flv_streaming
-----------------

Config-file ::

  magnet.attract-physical-path-to = (server.docroot + "/flv-streaming.lua")

flv-streaming.lua::

  if (lighty.env["uri.query"]) then
    -- split the query-string
    get = {}
    for k, v in string.gmatch(lighty.env["uri.query"], "(%w+)=(%w+)") do
      get[k] = v
    end

    header="" 
    if (get and get["start"]) then
      start = tonumber(get["start"])
    else
      start=0
    end

    -- send the FLV header only when seeking + a seek into the file
    if (start ~= nil and start > 0) then
      header="FLV\1\1\0\0\0\9\0\0\0\9" 
    end

    lighty.content = { header , 
       { filename = lighty.env["physical.path"], offset = start } }
    lighty.header["Content-Type"] = "video/x-flv" 

    return 200

  end
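
The query-string split in the script above only matches alphanumeric keys and values. A slightly more general sketch, assuming a hypothetical helper ``parse_query`` that splits on ``&`` and ``=`` and does not URL-decode the values: ::

  -- hypothetical helper: split "a=1&b=2" into { a = "1", b = "2" }
  -- (values are returned as-is, without URL-decoding)
  local function parse_query(query)
    local params = {}
    for k, v in string.gmatch(query or "", "([^&=]+)=([^&]*)") do
      params[k] = v
    end
    return params
  end

  local get = parse_query("start=1024&file=my-video.flv")
  print(get["start"], get["file"])

Inside the magnet you would pass ``lighty.env["uri.query"]`` instead of the literal string; a missing query-string yields an empty table.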

You can also use a backend like PHP to implement your own authorization or features like mod_secdl. Just activate x-rewrite in the backend configuration and send a header like ::

  header("X-Rewrite-URI: flvstreaming?start=" . $start . "&path=" . $path);

The request is restarted, and in the Lua script you can catch the non-existent URI with the following code (wrap it around the example above)::

  if (string.find(lighty.env["uri.path"], "/flvstreaming")) then
    <flv streaming lua code>
  end

In the future, there will be a new magnet for response headers; you may then be able to pass your own headers like::

  header("X-StreamMyFlv: $path");

to lua and use the header data as parameter for the streaming.

selecting a random file from a directory
----------------------------------------

Say, you want to send a random file (ad-content) from a directory. 

To simplify the code and to improve the performance we define:

* all images have the same format (e.g. image/png)
* all images use increasing numbers starting from 1
* a special index-file names the highest number

Config ::

  server.modules += ( "mod_magnet" )
  magnet.attract-physical-path-to = ("random.lua")

random.lua ::

  dir = lighty.env["physical.path"]

  f = assert(io.open(dir .. "/index", "r"))
  maxndx = f:read("*all")
  f:close()

  ndx = math.random(maxndx)

  lighty.content = { { filename = dir .. "/" .. ndx }}
  lighty.header["Content-Type"] = "image/png" 

  return 200

denying illegal character sequences in the URL
----------------------------------------------

Instead of implementing mod_security, you might just want to apply filters on the content
and deny special sequences that look like SQL injection. 

A common injection is using UNION to extend a query with another SELECT query.

::

  if (string.find(lighty.env["request.uri"], "UNION%s")) then
    return 400
  end
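
The check above is case-sensitive and looks at the raw URI, so ``union%20select`` would slip through. A minimal sketch of a somewhat stricter filter, assuming hypothetical helpers ``url_decode`` and ``looks_like_injection``; it illustrates the idea and is not a replacement for proper input handling in the application: ::

  -- hypothetical helper: decode '+' and %XX escapes
  local function url_decode(s)
    s = string.gsub(s, "%+", " ")
    s = string.gsub(s, "%%(%x%x)", function(hex)
      return string.char(tonumber(hex, 16))
    end)
    return s
  end

  -- hypothetical helper: true if the decoded, lower-cased URI
  -- contains "union" followed by whitespace and "select"
  local function looks_like_injection(uri)
    local text = string.lower(url_decode(uri))
    return string.find(text, "union%s+select") ~= nil
  end

  -- only runs inside mod_magnet, where the lighty table is defined
  if lighty then
    if looks_like_injection(lighty.env["request.uri"]) then
      return 400
    end
  end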

Traffic Quotas
--------------

If you only allow your virtual hosts a certain amount of traffic each month and want to
disable them once the quota is reached, perhaps this helps: ::

  host_blacklist = { ["www.example.org"] = 0 }

  if (host_blacklist[lighty.request["Host"]]) then
    return 404
  end

Just add the hosts you want to blacklist into the blacklist table in the shown way.

Complex rewrites
----------------

If you want to implement caching on your document-root and only want to regenerate 
content if the requested file doesn't exist, you can attract the physical.path: ::

  magnet.attract-physical-path-to = ( server.document-root + "/rewrite.lua" )

rewrite.lua ::

  attr = lighty.stat(lighty.env["physical.path"])

  if (not attr) then
    -- we couldn't stat() the file for some reason
    -- let the backend generate it

    lighty.env["uri.path"] = "/dispatch.fcgi" 
    lighty.env["physical.rel-path"] = lighty.env["uri.path"]
    lighty.env["physical.path"] = lighty.env["physical.doc-root"] .. lighty.env["physical.rel-path"]
  end

Extension rewrites
------------------

If you want to hide your file extensions (like .php) you can attract the physical.path: ::

  magnet.attract-physical-path-to = ( server.document-root + "/rewrite.lua" )

rewrite.lua ::

  attr = lighty.stat(lighty.env["physical.path"] .. ".php")

  if (attr) then
    lighty.env["uri.path"] = lighty.env["uri.path"] .. ".php" 
    lighty.env["physical.rel-path"] = lighty.env["uri.path"]
    lighty.env["physical.path"] = lighty.env["physical.doc-root"] .. lighty.env["physical.rel-path"]
  end

User tracking
-------------

... or how to store data globally in the script-context:

Each script has its own script-context. When the script is started it only contains the lua-functions
and the special lighty.* name-space. If you want to save data between script runs, you can use the global-script
context:

::

  if (nil == _G["usertrack"]) then
    _G["usertrack"] = {}
  end
  if (nil == _G["usertrack"][lighty.request["Cookie"]]) then
    -- first request with this cookie: start counting
    _G["usertrack"][lighty.request["Cookie"]] = 1
  else
    _G["usertrack"][lighty.request["Cookie"]] = _G["usertrack"][lighty.request["Cookie"]] + 1
  end

  print(_G["usertrack"][lighty.request["Cookie"]])

The global-context is per script. If you update the script without restarting the server, the context will still be maintained.
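
The same idea can be written with a small helper. This is a minimal sketch, assuming you want to count requests per Host header instead of per cookie; the helper ``count_hit`` and the global name ``hits`` are hypothetical: ::

  -- store: table that survives between script runs
  -- key:   the value to count
  local function count_hit(store, key)
    store[key] = (store[key] or 0) + 1
    return store[key]
  end

  -- only runs inside mod_magnet, where the lighty table is defined
  if lighty then
    _G["hits"] = _G["hits"] or {}
    -- the Host header may be missing, so fall back to a fixed key
    local host = lighty.request["Host"] or "(none)"
    print(host .. ": " .. count_hit(_G["hits"], host))
  end

The fallback key matters because a Lua table cannot be written with a nil index.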

Porting mod_cml scripts
-----------------------

mod_cml has been replaced by mod_magnet.

A CACHE_HIT in mod_cml::

  output_include = { "file1", "file2" } 

  return CACHE_HIT

becomes::

  lighty.content = { { filename = "/path/to/file1" }, { filename = "/path/to/file2"} }

  return 200

while a CACHE_MISS like (CML) ::

  trigger_handler = "/index.php" 

  return CACHE_MISS

becomes (magnet) ::

  lighty.env["request.uri"] = "/index.php" 

  return lighty.RESTART_REQUEST


Updated by Anonymous about 16 years ago · 55 revisions