Revision 5 (jan, 2005-07-18 08:58) → Revision 6/14 (jan, 2005-07-18 09:05)
= CML aka Cache Meta Language =

== What Is It ==

CML tries to move the decision about a cache-hit or cache-miss for a dynamic website out of the dynamic application, removing the need to start the application or the dynamic language at all. PHP in particular is known to have a large overhead before a script even starts executing.

== How To Install ==

The language used by CML is Lua, which you can find at http://www.lua.org/

To get some background on how to write Lua, check out:

 * http://lua-users.org/wiki/LuaAddons
 * http://luaforge.net/

== Benefits ==

The main benefit of CML is its performance. A very simple benchmark showed:

 * about 1000 req/s for the static 'output.html' which is generated
 * about 600 req/s if index.cml is called (cache-hit)
 * about 50 req/s if index.php is called (cache-miss)

Using CML improves the performance for the tested page by a factor of 12 (600 req/s instead of 50 req/s), getting reasonably close to the possible maximum of the static file transfer.

== Usage Patterns ==

http://www.lighttpd.net/ is using CML to reduce the load (even if the load is minimal).

The layout of the front page depends on a few files:

 * content-1
 * content-6
 * the template /main.tmpl

If one of the files gets changed, the cached version of the page has to be updated too.

{{{
output_contenttype = "text/html"

trigger_handler = "index.php"

-- this file is updated by the trigger
output_include = { "output.html" }

docroot = request["DOCUMENT_ROOT"]
cwd = request["CWD"]

-- the dependencies
files = { cwd .. "content-1", cwd .. "content-6", docroot .. "main.tmpl" }

cached_mtime = file_mtime(cwd .. "output.html")

-- if one of the source files is newer than the generated file,
-- call the trigger (return 1); otherwise serve the cached output (return 0)
for i,v in ipairs(files) do
  if file_mtime(v) > cached_mtime then
    return 1
  end
end

return 0
}}}

== Delaying recheck ==

If you can't add fine-grained caching to your application, a simple way to reduce the load is to cache the output for a longer time, even if the backend has new changes to display. This is especially useful if you are building something like a news aggregator, where you can assume that the news is not changing with each request. Instead of revalidating on each request, you just delay the check:

{{{
-- same setup as above

-- check again in 5 minutes
delay_recheck = 300

if cached_mtime + delay_recheck > os.time() then
  return 0
end

-- we are behind the delayed recheck, check the cache as usual
for i,v in ipairs(files) do
  if file_mtime(v) > ( cached_mtime - delay_recheck) then
    return 1
  end
end

return 0
}}}

To tell the proxies in between not to check again for 5 minutes after they have received this content, use the setenv module to add Cache-Control or Expires headers.

== CML and Databases ==

CML doesn't provide direct access to databases like MySQL or PostgreSQL. And to pre-empt later requests for it: it never will. There is a better and faster way to interface CML with databases: memcached.

All you have to do is keep the information needed to decide whether a page has to be regenerated in a memcached store.

Let's say that whenever you store an entry in the database you associate a Version-ID with it. The Version-ID is incremented as soon as you make a change to the resource. This Version-ID is stored in the database and in memcached at the same time. CML can now fetch the Version-ID, check if content has been generated for it, and generate it if necessary.
{{{
output_contenttype = "text/html"

key = md5(request["PATH_INFO"])
version = memcache_get_long(key)
cwd = request["CWD"]

trigger_handler = "generate.php"

-- a version in memcached means a versioned file can be served (return 0);
-- otherwise generate.php is called (return 1)
if version >= 0 then
  output_include = { cwd .. key .. "-" .. version .. ".html" }
  return 0
else
  return 1
end
}}}

generate.php will have to:

 * get PATH_INFO
 * fetch the information about it from the database
 * generate the content for the page and write it to disk
 * deliver it to the client

To interface the database with memcached you can use a UDF:

 * for [http://www.mysql.com/ MySQL] you can get the MySQL UDF at [http://jan.kneschke.de/projects/mysql/udf/ jan's mysql page]
 * for [http://www.postgresql.org/ PostgreSQL] Sean Chittenden has written [http://people.freebsd.org/~seanc/pgmemcache/ pgmemcache]

With MySQL and the UDF you just do:

{{{
BEGIN;
-- increment the version and keep the new value in @v
UPDATE content SET version = (@v := version + 1) WHERE id = <id>;
-- publish the new version to memcached
SELECT memcache_set("127.0.0.1:11211", <id>, @v);
COMMIT;
}}}

To check which version is currently used by the cache:

{{{
SELECT memcache_get("127.0.0.1:11211", <id>);
}}}
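
The Version-ID scheme can also be tried outside lighttpd with a small plain-Lua sketch. It is only an illustration under a few assumptions: an in-memory table stands in for memcached, `get_long`, `bump_version` and `decide` are hypothetical helper names, the key is used as-is instead of an md5 hash, and a missing key is reported as -1 (inferred from the `version >= 0` check in the CML script, not from documented behaviour).

```lua
-- In-memory stand-in for memcached (assumption: in a real setup the
-- database UDF fills the store).
local memcache = {}

-- Hypothetical stand-in for memcache_get_long: returns the stored number,
-- or -1 when the key is missing (assumed from the "version >= 0" check).
local function get_long(key)
  local v = memcache[key]
  if v == nil then
    return -1
  end
  return v
end

-- Hypothetical write path: mimics the effect of the UPDATE plus
-- memcache_set pair, i.e. increment the version and publish it.
local function bump_version(key)
  memcache[key] = (memcache[key] or 0) + 1
  return memcache[key]
end

-- Same decision as the CML script: returns 0 and the file to include on a
-- cache-hit, or 1 and nil when the trigger handler would have to run.
local function decide(cwd, key)
  local version = get_long(key)
  if version >= 0 then
    return 0, cwd .. key .. "-" .. version .. ".html"
  end
  return 1, nil
end

-- Usage: no version stored yet, so the first call is a cache-miss.
print(decide("/www/", "abc"))
-- After a write the versioned file name is returned.
bump_version("abc")
print(decide("/www/", "abc"))
```

Because the version is part of the file name, every bump points the script at a new file, so content written for an older version is not served under the new one.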