mod_diagnostics

mod_diagnostics is a debugging and diagnostic tool for application developers, particularly developers of filter modules. It can be inserted anywhere in the Apache filter chain, where it logs the traffic (buckets and brigades) passing through. It is a purely passive watcher and never modifies the traffic it observes.
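
To make the idea of a passive watcher concrete, here is a minimal sketch in plain C. It is a toy model only: the bucket, watcher and watch_brigade names are hypothetical stand-ins invented for this illustration, not the APR bucket API and not the code in mod_diagnostics.c. The watcher takes the brigade as const, logs each bucket, and keeps a few counters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Toy stand-ins for this sketch only; these are NOT the APR bucket types. */
typedef enum { BUCKET_DATA, BUCKET_FLUSH, BUCKET_EOS } bucket_kind;

typedef struct {
    bucket_kind kind;
    size_t      length;   /* payload bytes; 0 for metadata buckets */
} bucket;

typedef struct {
    const char *label;    /* where in the chain this watcher sits */
    size_t      calls;    /* brigades seen so far */
    size_t      bytes;    /* data bytes seen so far */
    size_t      eos_seen; /* EOS buckets seen so far */
} watcher;

static const char *kind_name(bucket_kind k)
{
    switch (k) {
    case BUCKET_DATA:  return "DATA";
    case BUCKET_FLUSH: return "FLUSH";
    case BUCKET_EOS:   return "EOS";
    }
    return "UNKNOWN";
}

/* Log one brigade and update the counters. The brigade is const:
 * a watcher only observes and never alters what passes through. */
static void watch_brigade(watcher *w, const bucket *brigade, size_t n, FILE *log)
{
    size_t i;

    assert(w != NULL && log != NULL);
    w->calls++;
    fprintf(log, "%s: brigade %zu\n", w->label, w->calls);
    for (i = 0; i < n; i++) {
        fprintf(log, "%s:   %s bucket, %zu bytes\n",
                w->label, kind_name(brigade[i].kind), brigade[i].length);
        if (brigade[i].kind == BUCKET_DATA)
            w->bytes += brigade[i].length;
        if (brigade[i].kind == BUCKET_EOS)
            w->eos_seen++;
    }
}
```

Placing one such watcher before a filter and another after it gives two logs that can be compared call by call.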

Examples

Probably the best way to explain mod_diagnostics is by example.

Strange delays in some browsers

In an update to mod_xml, a new bug was introduced. It was not immediately obvious, but in some browsers the request would hang and then time out. The effect was only observed when using the XSLT output filter with Xalan-C, and only with HTTP/1.1 browsers, not with HTTP/1.0. Furthermore, hitting "cancel" before the timeout in an HTTP/1.1 browser would cause the page to display!

With mod_diagnostics inserted before and after the offending filter, the bug was immediately obvious: the module was simply failing to pass an EOS bucket down the chain. A trivial fix!
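
The before-and-after comparison that exposes this kind of bug can be sketched as follows. Again this is a toy model in plain C with hypothetical names (bucket_kind, buggy_filter, eos_lost); it does not use the Apache API, and buggy_filter is an invented stand-in for a filter with the same class of fault, not the mod_xml code.

```c
#include <assert.h>
#include <stddef.h>

/* Toy bucket kinds for this sketch only; NOT the APR bucket types. */
typedef enum { BUCKET_DATA, BUCKET_EOS } bucket_kind;

/* Count the EOS markers in a brigade, as a watcher's log would show them. */
static size_t count_eos(const bucket_kind *brigade, size_t n)
{
    size_t i, eos = 0;

    assert(brigade != NULL || n == 0);
    for (i = 0; i < n; i++)
        if (brigade[i] == BUCKET_EOS)
            eos++;
    return eos;
}

/* Hypothetical buggy filter: forwards data buckets but forgets EOS.
 * 'out' must have room for n buckets. Returns the number written. */
static size_t buggy_filter(const bucket_kind *in, size_t n, bucket_kind *out)
{
    size_t i, written = 0;

    for (i = 0; i < n; i++)
        if (in[i] != BUCKET_EOS)
            out[written++] = in[i];
    return written;
}

/* Nonzero when the upstream watcher saw more EOS buckets than the
 * downstream one, i.e. the filter in between swallowed an EOS. */
static int eos_lost(const bucket_kind *before, size_t n_before,
                    const bucket_kind *after, size_t n_after)
{
    return count_eos(before, n_before) > count_eos(after, n_after);
}
```

In this model the watcher upstream of the filter records one EOS and the watcher downstream records none, which is the discrepancy described above.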

Obscure bug in a third-party library

A user of mod_proxy_html reported serious performance problems when parsing an 8MB HTML file. He had profiled the problem and found that the entire processing time was spent in the final call to htmlParseChunk in libxml2.

I investigated this by inserting mod_diagnostics before and after mod_proxy_html, and running it with the largest HTML document I had available (the MySQL manual, about 2.6MB). I was able to confirm that nothing was passed down the chain until the final call, so not only was it slow, but it had also broken Apache pipelining.

To refine the diagnosis, I added a flush in each call to the filter in mod_proxy_html. Now mod_diagnostics showed a small amount of data (under 1KB) coming through in the first call to the filter, but nothing else until the end. Further investigation showed that the data stopped coming when the first HTML comment was encountered in the source.

At this point I ran it under gdb, looking for the comment handling. I found that the parser was failing to find the end of the comment. The problem was resolved only in the last call to htmlParseChunk, which didn't go through the buggy code. I disabled the buggy code, and found it was now working correctly, with approximately the same amount of input and output data in each call to the mod_proxy_html filter, so pipelining was now fixed. My correspondent reported that the total processing time for his 8MB file was reduced from 30 minutes to 9 seconds (on five-year-old hardware).

The bug was reported to the libxml team, who have now fixed it.
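
The per-call comparison used in this diagnosis can be sketched as a small check over the byte counts logged before and after a filter. This is an illustration in plain C under stated assumptions: stalled_calls is a hypothetical helper invented here, and the arrays stand for per-call totals read off the two diagnostic logs.

```c
#include <assert.h>
#include <stddef.h>

/* Given the bytes entering and leaving a filter on each call, count the
 * calls (excluding the last) where input arrived but nothing was passed
 * on. A streaming filter scores 0; a filter that holds everything until
 * its final call scores calls - 1. */
static size_t stalled_calls(const size_t *bytes_in, const size_t *bytes_out,
                            size_t calls)
{
    size_t i, stalled = 0;

    assert((bytes_in != NULL && bytes_out != NULL) || calls == 0);
    for (i = 0; i + 1 < calls; i++)
        if (bytes_in[i] > 0 && bytes_out[i] == 0)
            stalled++;
    return stalled;
}
```

With made-up figures, inputs of 8192, 8192, 8192 and 100 bytes against outputs of 0, 0, 0 and 24676 bytes give three stalled calls, the pattern seen before the fix. Roughly matching input and output on every call gives zero.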

Availability

mod_diagnostics.c is available under the same terms as the Apache server itself.