CML aka Cache Meta Language

What Is It

CML tries to move the decision about a cache-hit and cache-miss for a dynamic website
out of the dynamic application, removing the need to start the application or dynamic
language at all.

PHP in particular is known to have a large startup overhead before a script begins executing.

How To Install

The language used by CML is Lua, which you can find at

To get some background on how to write Lua code, check out:


The main benefit of CML is its performance.

A very simple benchmark showed:

  • about 1000 req/s for the static 'output.html' which is generated output from the PHP script
  • about 600 req/s if index.cml is called (cache-hit)
  • about 50 req/s if index.php is called (cache-miss)

Using CML improves the performance for the tested page by a factor of 12 (from about
50 to about 600 req/s), which comes reasonably close to the maximum set by the static
file transfer (about 1000 req/s).

Usage Patterns

In this example CML is used to reduce the load caused by a front page (even if that load is minimal).

The layout of the front page depends on a few files:

  • content-1
  • content-6
  • the template /main.tmpl

If any of these files is modified, the cached version of the page must be regenerated as well.

output_contenttype = "text/html" 

trigger_handler = "index.php" 

-- this file is updated by the trigger
output_include = { "output.html" }

docroot = request["DOCUMENT_ROOT"]
cwd = request["CWD"]

-- the dependencies
files = { cwd .. "content-1", cwd .. "content-6", docroot .. "main.tmpl" }

cached_mtime = file_mtime(cwd .. "output.html")

-- if one of the source files is newer than the generated files
-- call the trigger
for i,v in ipairs(files) do
  if file_mtime(v) > cached_mtime then return 1 end
end

return 0
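
The comparison at the heart of the script above can be tried outside the web server. The following is a minimal sketch in plain Lua, not part of CML: needs_rebuild is a hypothetical helper, and the modification times are passed in as plain numbers instead of being read with file_mtime.

```lua
-- Hypothetical helper, not part of CML: decides whether the cached
-- output is older than any of its dependencies.
--   cached_mtime: modification time of the cached file (seconds)
--   dep_mtimes:   array of modification times of the source files
function needs_rebuild(cached_mtime, dep_mtimes)
  for _, mtime in ipairs(dep_mtimes) do
    if mtime > cached_mtime then
      return true   -- a source file is newer: cache-miss
    end
  end
  return false      -- nothing changed: cache-hit
end

-- example: the cached output was written at t=100
print(needs_rebuild(100, { 90, 95, 99 }))   --> false (all sources older)
print(needs_rebuild(100, { 90, 120, 99 }))  --> true  (one source newer)
```

A source file with exactly the same modification time as the cached file counts as a hit here, matching the strict `>` comparison in the script.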

Delaying recheck

If you are building a news aggregator, it is useful to delay the rebuild of the cached content for a period of time, since you can assume that the news does not change with every request. Instead of revalidating on each request, you postpone the validation check.

-- same as above

-- check again in 5 minutes (300 seconds)
delay_recheck = 300

if cached_mtime + delay_recheck > os.time() then return 0 end

-- the delay has passed, check the cache as usual

for i,v in ipairs(files) do
  if file_mtime(v) > cached_mtime then return 1 end
end

return 0

To tell the proxies in between not to check again during the 5 minutes after they received this content, use the setenv module and add Cache-Control or Expires headers.
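
The delayed recheck can be sketched as a single function in plain Lua. This is an illustration under assumptions, not CML itself: cache_decision is a hypothetical name, and the current time and all modification times are passed in as numbers so the logic can be run without a web server.

```lua
-- Hypothetical helper, not part of CML. Returns 0 (serve the cached copy)
-- or 1 (call the trigger), like a CML script would.
--   now:           current time, e.g. os.time()
--   cached_mtime:  modification time of the cached output
--   dep_mtimes:    array of modification times of the source files
--   delay_recheck: seconds during which the cache is trusted unchecked
function cache_decision(now, cached_mtime, dep_mtimes, delay_recheck)
  -- still inside the delay window: skip the dependency check
  if cached_mtime + delay_recheck > now then
    return 0
  end
  -- the window has passed: validate as usual
  for _, mtime in ipairs(dep_mtimes) do
    if mtime > cached_mtime then
      return 1
    end
  end
  return 0
end

-- cache written at t=1000, delay of 300 seconds
print(cache_decision(1200, 1000, { 1100 }, 300))  --> 0 (inside the window)
print(cache_decision(1400, 1000, { 1100 }, 300))  --> 1 (window passed, source newer)
print(cache_decision(1400, 1000, { 900 }, 300))   --> 0 (window passed, nothing changed)
```

In this sketch the window is measured from the modification time of the cached file, so once it has passed and nothing has changed, every later request runs the full dependency check again until the file is regenerated.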

CML and Databases

CML does not provide direct access to databases like MySQL or PostgreSQL, and probably never will.

There is a better and faster way to interface CML with databases: memcached.

All you have to do is keep the information needed to decide whether a page has to be regenerated in memcached. Let's say that whenever you store an entry in the database, you associate a Version-ID with it. The Version-ID is incremented whenever you change the resource.

This Version-ID is stored in the database and in memcached at the same time. CML can then fetch the Version-ID, check whether content has already been generated for it, and trigger its generation if necessary.

output_contenttype = "text/html" 

content_key = md5(request["PATH_INFO"])
version = memcache_get_long(content_key)
cwd = request["CWD"]

trigger_handler = "generate.php" 

if version >= 0 then
  output_include = { cwd .. content_key .. "-" .. version .. ".html" }
  return 0
else
  return 1
end
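
The file naming scheme used by the script above can be looked at in isolation. The following plain Lua sketch is not part of CML: cache_file and lookup are hypothetical helpers, the key and version are passed in directly instead of coming from md5 and memcache_get_long, and a negative version is assumed to mean that the key was not found, as the test `version >= 0` suggests.

```lua
-- Hypothetical helpers, not part of CML.

-- Builds the file name under which a given version of a page is cached.
function cache_file(cwd, content_key, version)
  return cwd .. content_key .. "-" .. string.format("%d", version) .. ".html"
end

-- Returns 0 and the file to include on a cache-hit, or 1 on a cache-miss.
-- Assumption: a negative version means the key was not found.
function lookup(cwd, content_key, version)
  if version >= 0 then
    return 0, cache_file(cwd, content_key, version)
  end
  return 1
end

-- example values; the directory and key are made up for illustration
print(lookup("/var/www/", "abc123", 7))   -- 0 and /var/www/abc123-7.html
print(lookup("/var/www/", "abc123", -1))  -- 1
```

Because the version is part of the file name, an incremented Version-ID maps to a different file, so an outdated file is never picked up for the new version.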

generate.php will have to:

  • get PATH_INFO
  • fetch information from database about it
  • generate content for the page and write it to disk
  • deliver it to the client

To interface the database with memcached you can use a UDF:

With MySQL and the UDF you just run:

UPDATE content SET version = (@v := version + 1) WHERE id = <id>;
SELECT memcache_set("", <id>, @v);

To check which version is currently used by the cache:

SELECT memcache_get("", <id>);