mod_publisher: Macros and Templates

mod_publisher can apply special processing to any element encountered in markup. Processing rules are insert, replace and hide, and can be applied to the opening and/or closing tags of any element. Elements for which such processing is defined are known as markup macros. Markup macros may be defined for any element that is well-formed as XML, regardless of whether it is defined for the markup language of a document being processed.

In insert or replace mode, contents may be inserted as a fixed string, a variable, a file, a path or a URL. Element attributes may be interpolated into a variable, filename or URL, but not into a path.

Examples

A few cases where you might use markup macros include:

Insert headers and footers in HTML pages

(X)HTML defines a <body> element that is a container for all contents to be rendered in a browser's display area. So defining body as an insert macro enables you to insert a header or footer in every page served.

This works for any HTML page even if the body opening and/or closing tags are omitted by a page author, because they are implied. In contrast to the banners typically inserted by free hosting providers, mod_publisher keeps the pages valid, and is not open to "banner hiding" tricks.

	MLMacroPath	body insert start path /path/to/site-header-file
	MLMacroPath	body insert end path /path/to/site-footer-file

Insert an advertising banner

A custom element can be used as an alternative to SSI as a placeholder for dynamic contents.

	MLMacro banner replace url http://adserver.example.com/my-id

Include a page in another page

Elements such as HTML frames and objects permit one document to be included in another by a supporting client. mod_publisher can fold them into a single page by defining the elements concerned as macros. In this case, we need to use the relevant attributes to identify the source of the included content:

	MLMacro iframe replace url @src;

Prepare a page for inclusion in another page

The last rule will result in badly broken markup if a complete HTML page is inserted. We need to process our inserted content too:

  1. Remove the opening and closing html tags.
  2. Remove the entire head section, including the contents as well as the tags.
  3. Replace the opening and closing body tags with div tags.

	MLMacro html replace start ""
	MLMacro html replace end ""
	MLMacro head hide
	MLMacro body replace start "<div class=\"included\">"
	MLMacro body replace end </div>
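
The effect of these five rules on an included page can be modelled outside the server. The following is only a rough Python sketch of the transformation, not mod_publisher's implementation: the class name `IncludeFilter` and the function `prepare_for_inclusion` are hypothetical, and the sketch silently drops comments and the doctype.

```python
from html.parser import HTMLParser


class IncludeFilter(HTMLParser):
    """Illustrative model only: drop html tags, hide head, turn body into a div."""

    def __init__(self):
        # Keep entity and character references as written in the source.
        super().__init__(convert_charrefs=False)
        self.out = []
        self.hidden = 0  # non-zero while inside a hidden <head>

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.hidden += 1                      # MLMacro head hide
        elif self.hidden or tag == "html":
            return                                # html replace start ""
        elif tag == "body":
            self.out.append('<div class="included">')
        else:
            self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> pass through unless hidden.
        if not self.hidden:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "head":
            self.hidden = max(0, self.hidden - 1)
        elif self.hidden or tag == "html":
            return                                # html replace end ""
        elif tag == "body":
            self.out.append("</div>")
        else:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.hidden:
            self.out.append(data)

    def handle_entityref(self, name):
        if not self.hidden:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if not self.hidden:
            self.out.append(f"&#{name};")


def prepare_for_inclusion(page):
    """Return the page body wrapped in a div, ready to embed in another page."""
    parser = IncludeFilter()
    parser.feed(page)
    parser.close()
    return "".join(parser.out)


page = "<html><head><title>T</title></head><body><p>Hi &amp; bye</p></body></html>"
print(prepare_for_inclusion(page))
```

For the sample page, only the paragraph survives, wrapped in a div with class "included".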

Change transcluded contents to links

We can replace included contents with links to them, thus offering users the choice of whether to load them. For example, <img src="foo" alt="bar" width="W" height="H" title="baz"/> becomes <a href="foo" title="baz">bar (image, WxH)</a>.

	MLMacro img replace var "<a href='@src;' title='@title;'>@alt; (image, @width;x@height;)</a>"
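
To make the @name; placeholders concrete, here is a minimal Python sketch of attribute interpolation. It is an illustration of the idea, not the module's code: the helper `interpolate` is hypothetical, and two behaviours are assumptions of the sketch, namely that a missing attribute expands to an empty string and that values are inserted without escaping.

```python
import re

# Matches placeholders of the form @name; (attribute name between "@" and ";").
_PLACEHOLDER = re.compile(r"@([A-Za-z_][\w.-]*);")


def interpolate(template, attrs):
    """Expand each @name; in template with attrs[name].

    Assumption of this sketch: a missing attribute becomes an empty string.
    """
    return _PLACEHOLDER.sub(lambda m: attrs.get(m.group(1), ""), template)


img = {"src": "foo", "alt": "bar", "width": "W", "height": "H", "title": "baz"}
rule = "<a href='@src;' title='@title;'>@alt; (image, @width;x@height;)</a>"
print(interpolate(rule, img))  # <a href='foo' title='baz'>bar (image, WxH)</a>
```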

Strip out scripts

In an environment such as a blog, which might be vulnerable to someone inserting malicious scripts, we can protect against them by removing script elements. This works well when combined with a DTD (or rewriting rules) that excludes scripting events.

	MLMacro script hide

This is a much more efficient solution than mod_security (though of course it doesn't replace mod_security's other capabilities)!

Security of Included Content

A potential security risk arises if content can be included in a webpage from anywhere in the filesystem: if absolute paths can be included, a user might be able to expose sensitive information (such as /etc/passwd or another user's data) by including it in a page.

To protect servers from this, the server administrator is able to insert data from anywhere in the filesystem (subject to other security measures that may be in operation), but ordinary users are restricted:

  1. Interpolated files can only be served from within or below the current directory. This keeps them within contents that are, by definition, public. This is the same security that applies to <!--#include file="..."--> in SSI.
  2. Paths can be absolute, but are not interpolated (so can't be set from within a document), and can't be specified from a .htaccess file; only from within httpd.conf.
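
The first restriction amounts to a containment check on the resolved filename. A minimal Python sketch of such a check follows; the function `is_within` and the directory `/var/www/site` are hypothetical, and the module's actual check may differ in detail (for example in how it treats symbolic links).

```python
import os


def is_within(base_dir, requested):
    """Return True if requested resolves to base_dir or somewhere below it.

    requested is interpreted relative to base_dir; absolute names and
    "../" sequences that escape base_dir are rejected.
    """
    base = os.path.realpath(base_dir)
    # os.path.join discards base when requested is absolute, so an absolute
    # name is checked against base just like any other target.
    target = os.path.realpath(os.path.join(base, requested))
    return os.path.commonpath([base, target]) == base


print(is_within("/var/www/site", "sub/page.html"))   # allowed
print(is_within("/var/www/site", "../secret.txt"))   # escapes upwards
print(is_within("/var/www/site", "/etc/passwd"))     # absolute path
```

Resolving both names before comparing them is the design point: a plain string-prefix test would accept names such as "sub/../../secret.txt".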

Directives

Performance

Insert and Replace rules will run fastest when the contents included are defined inline, or (a close second) read from a file or path. Including virtual contents requires a subrequest, which is slower, while including by url requires an entire HTTP request and is therefore much slower than the other options.

Precedence

Markup macros, where defined, take precedence over all other processing implemented natively by mod_publisher. However, when processing XML with namespaces enabled, any element for which a namespace handler is active will not be processed as a macro unless the handler declines it.
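
The ordering described above can be summarised as a small decision function. This is only a Python model of the stated precedence, under assumed data shapes: `dispatch`, the element dictionaries and the handler callables (which return True to accept an element and False to decline it) are all hypothetical.

```python
def dispatch(element, macros, ns_handlers, namespaces_enabled=True):
    """Return which stage handles element: 'namespace', 'macro' or 'native'."""
    handler = ns_handlers.get(element.get("ns")) if namespaces_enabled else None
    if handler is not None and handler(element):
        return "namespace"  # an active handler accepted it; the macro is skipped
    if element["name"] in macros:
        return "macro"      # macros outrank all other native processing
    return "native"


macros = {"banner"}
ns_handlers = {"urn:example": lambda el: el["name"] == "claimed"}

# The handler declines "banner", so the macro still applies.
print(dispatch({"name": "banner", "ns": "urn:example"}, macros, ns_handlers))
```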