mod_publisher can apply special processing to any element encountered in markup. Processing rules are insert, replace and hide, and can be applied to the opening and/or closing tags of any element. Elements for which such processing is defined are known as markup macros. Markup macros may be defined for any element that is well-formed as XML, regardless of whether it is defined for the markup language of a document being processed.
In insert or replace mode, contents may be inserted as a fixed string, a variable, a file, a path or a URL. Attributes of the element may be interpolated in a variable, filename or URL, but not in a path.
A few cases where you might use markup macros include:
(X)HTML defines a <body> element that is a container for all contents to be rendered in a browser's display area. So defining body as an insert macro enables you to insert a header or footer in every page served.
This works for any HTML page even if the body opening and/or closing tags are omitted by a page author, because they are implied. In contrast to the banners typically inserted by free hosting providers, mod_publisher keeps the pages valid, and is not open to "banner hiding" tricks.
MLMacroPath body insert start path /path/to/site-header-file
MLMacroPath body insert end path /path/to/site-footer-file
A custom element can be used as an alternative to SSI as a placeholder for dynamic contents.
MLMacro banner replace url http://adserver.example.com/my-id
Elements such as HTML frames and objects permit one document to be included in another by a supporting client. mod_publisher can fold them into a single page by defining the elements concerned as macros. In this case, we need to use the relevant attributes to identify the source of the included content:
MLMacro iframe replace url @src;
The last rule will result in badly broken markup if a complete HTML page is inserted, because the included page's own html, head and body elements end up nested inside the host page. We need to process our inserted content too:
MLMacro html replace start ""
MLMacro html replace end ""
MLMacro head hide
MLMacro body replace start "<div class=\"included\">"
MLMacro body replace end </div>
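To make the effect of these five rules concrete, here is a minimal Python sketch that applies the same transformations to a complete HTML page: the html tags are dropped, head is hidden together with its contents, and body becomes a div. It only illustrates the intended result and is not how mod_publisher is implemented; the class name `IncludedPageFilter` and the function `fold_page` are hypothetical.

```python
from html.parser import HTMLParser


class IncludedPageFilter(HTMLParser):
    """Illustrative filter mirroring the html/head/body rules above."""

    def __init__(self):
        super().__init__(convert_charrefs=False)
        self.out = []
        self.hidden_depth = 0  # > 0 while inside a hidden <head>

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.hidden_depth += 1          # "head hide"
        elif self.hidden_depth:
            return                          # inside hidden content
        elif tag == "html":
            pass                            # "html replace start ''"
        elif tag == "body":
            self.out.append('<div class="included">')
        else:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "head":
            self.hidden_depth = max(0, self.hidden_depth - 1)
        elif self.hidden_depth:
            return
        elif tag == "html":
            pass                            # "html replace end ''"
        elif tag == "body":
            self.out.append("</div>")
        else:
            self.out.append(f"</{tag}>")

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are copied through unchanged.
        if not self.hidden_depth:
            self.out.append(self.get_starttag_text())

    def handle_data(self, data):
        if not self.hidden_depth:
            self.out.append(data)

    def handle_entityref(self, name):
        if not self.hidden_depth:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if not self.hidden_depth:
            self.out.append(f"&#{name};")


def fold_page(page):
    """Return `page` rewritten so it can sit inside another document."""
    parser = IncludedPageFilter()
    parser.feed(page)
    parser.close()
    return "".join(parser.out)


page = "<html><head><title>T</title></head><body><p>Hi</p></body></html>"
folded = fold_page(page)
print(folded)
```

The folded fragment can then stand in for the iframe without introducing a second html, head or body element into the host page.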
We can replace included contents with links to them, thus offering users the choice of whether to load them. For example, <img src="foo" alt="bar" width="W" height="H" title="baz"/> becomes <a href="foo" title="baz">bar (image, WxH)</a>.
MLMacro img replace var "<a href='@src;' title='@title;'>@alt; (image, @width;x@height;)</a>"
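The attribute interpolation used in this rule can be pictured with a short Python sketch. It expands each `@name;` placeholder with the value of the matching attribute. The function name `interpolate`, the treatment of missing attributes (expanded to an empty string) and the HTML-escaping of values are assumptions made for illustration; mod_publisher's actual behaviour may differ.

```python
import re
from html import escape

# Placeholders have the form @name; as in the rule above.
PLACEHOLDER = re.compile(r"@([A-Za-z_][\w:.-]*);")


def interpolate(template, attrs):
    """Replace each @name; in `template` with the attribute value.

    Assumptions (not taken from mod_publisher): a missing attribute
    expands to an empty string, and values are HTML-escaped.
    """
    def expand(match):
        return escape(attrs.get(match.group(1), ""), quote=True)

    return PLACEHOLDER.sub(expand, template)


template = "<a href='@src;' title='@title;'>@alt; (image, @width;x@height;)</a>"
img_attrs = {"src": "foo", "alt": "bar", "width": "W", "height": "H", "title": "baz"}
result = interpolate(template, img_attrs)
print(result)
```

Because the template in the rule uses single quotes around attribute values, the generated link does too.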
In an environment such as a blog, which might be vulnerable to someone inserting malicious scripts, we can protect against them by removing script elements. This works well when combined with a DTD (or rewriting rules) that excludes scripting events.
MLMacro script hide
This is a much more efficient solution than mod_security (though of course it doesn't replace mod_security's other capabilities)!
A potential security risk arises if content can be included in a webpage from anywhere in the filesystem: if absolute paths can be included, a user might be able to expose sensitive information (such as /etc/passwd or another user's data) by including it in a page.
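The risk described here is ordinary path traversal. As a generic illustration, and not a description of mod_publisher's own mechanism, the Python sketch below rejects absolute paths and any path containing a `..` component, which is the restriction SSI documents for `include file`. The function name `is_safe_relative_include` is hypothetical.

```python
import posixpath


def is_safe_relative_include(requested):
    """Return True if `requested` is relative and has no '..' component.

    This mirrors the SSI `include file` restriction; it is an
    illustration only, not mod_publisher's actual check.
    """
    if posixpath.isabs(requested):
        return False
    parts = requested.replace("\\", "/").split("/")
    return ".." not in parts


for candidate in ("includes/footer.html", "/etc/passwd", "../other-user/data"):
    print(candidate, is_safe_relative_include(candidate))
```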
To protect servers from this, the server administrator is able to insert data from anywhere in the filesystem (subject to other security measures that may be in operation), but ordinary users are restricted:
<!--#include file="..."-->
in SSI.
Insert and replace rules run fastest when the included contents are defined inline or, a close second, read from a file or path. Virtual contents require a subrequest, which is slower, while including by URL requires an entire HTTP request and is therefore much slower than the other options.
Markup macros, where defined, take precedence over all other processing implemented natively by mod_publisher. However, when processing XML with namespaces enabled, any element for which a namespace handler is active will not be processed as a macro unless the handler declines it.
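The precedence order just described can be summarised in a small Python sketch. All names here (`choose_processor`, the `macros` set, the `ns_handlers` mapping) are hypothetical, and falling back to native processing when a handler declines and no macro is defined is an assumption rather than documented behaviour.

```python
def choose_processor(element, macros, ns_handlers, namespaces_enabled):
    """Return 'namespace', 'macro' or 'native' for `element`.

    `ns_handlers` maps an element name to a callable that returns True
    when the handler accepts the element and False when it declines.
    """
    handler = ns_handlers.get(element) if namespaces_enabled else None
    if handler is not None and handler(element):
        return "namespace"   # an active handler that accepts wins
    if element in macros:
        return "macro"       # macros beat all native processing
    return "native"          # assumed fallback


macros = {"banner", "img"}
ns_handlers = {"banner": lambda name: True, "img": lambda name: False}

print(choose_processor("banner", macros, ns_handlers, True))
print(choose_processor("img", macros, ns_handlers, True))
print(choose_processor("banner", macros, ns_handlers, False))
```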