<?xml version="1.0"?>
<!DOCTYPE modulesynopsis SYSTEM "../style/modulesynopsis.dtd">
<?xml-stylesheet type="text/xsl" href="../style/manual.en.xsl"?>
<!-- $LastChangedRevision$ -->
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<modulesynopsis metafile="mod_filter.xml.meta">
<name>mod_filter</name>
<description>Context-sensitive smart filter configuration module</description>
<status>Base</status>
<sourcefile>mod_filter.c</sourcefile>
<identifier>filter_module</identifier>
<compatibility>Version 2.1 and later</compatibility>
<summary>
<p>This module enables smart, context-sensitive configuration of
output content filters. For example, Apache can be configured to
process different content-types through different filters, even
when the content-type is not known in advance (e.g. in a proxy).</p>
<p><module>mod_filter</module> works by introducing indirection into
the filter chain. Instead of inserting filters in the chain, we insert
a filter harness which in turn dispatches conditionally
to a filter provider. Any content filter may be used as a provider
to <module>mod_filter</module>; no change to existing filter modules is
required (although it may be possible to simplify them).</p>
</summary>
<section id="smart"><title>Smart Filtering</title>
<p>In the traditional filtering model, filters are inserted unconditionally
using <directive module="mod_mime">AddOutputFilter</directive> and family.
Each filter then needs to determine whether to run, and there is little
flexibility available for server admins to allow the chain to be
configured dynamically.</p>
<p><module>mod_filter</module> by contrast gives server administrators a
great deal of flexibility in configuring the filter chain. In fact,
filters can be inserted based on any Request Header, Response Header
or Environment Variable. This generalises the limited flexibility offered
by <directive module="core">AddOutputFilterByType</directive>, and fixes
it to work correctly with dynamic content, regardless of the
content generator. The ability to dispatch based on Environment
Variables offers the full flexibility of configuration with
<module>mod_rewrite</module> to anyone who needs it.</p>
</section>
<section id="terms"><title>Filter Declarations, Providers and Chains</title>
<p class="figure">
<img src="../images/mod_filter_old.gif" width="160" height="310"
alt="[This image displays the traditional filter model]"/><br />
<dfn>Figure 1:</dfn> The traditional filter model</p>
<p>In the traditional model, output filters are a simple chain
from the content generator (handler) to the client. This works well
provided the filter chain can be correctly configured, but presents
problems when the filters need to be configured dynamically based on
the outcome of the handler.</p>
<p class="figure">
<img src="../images/mod_filter_new.gif" width="423" height="331"
alt="[This image shows the mod_filter model]"/><br />
<dfn>Figure 2:</dfn> The <module>mod_filter</module> model</p>
<p><module>mod_filter</module> works by introducing indirection into
the filter chain. Instead of inserting filters in the chain, we insert
a filter harness which in turn dispatches conditionally
to a filter provider. Any content filter may be used as a provider
to <module>mod_filter</module>; no change to existing filter modules
is required (although it may be possible to simplify them). There can be
multiple providers for one filter, but no more than one provider will
run for any single request.</p>
<p>A filter chain comprises any number of instances of the filter
harness, each of which may have any number of providers. A special
case is that of a single provider with unconditional dispatch: this
is equivalent to inserting the provider filter directly into the chain.</p>
</section>
<section id="config"><title>Configuring the Chain</title>
<p>There are three stages to configuring a filter chain with
<module>mod_filter</module>. For details of the directives, see below.</p>
<dl>
<dt>Declare Filters</dt>
<dd>The <directive module="mod_filter">FilterDeclare</directive> directive
declares a filter, assigning it a name and filter type. This is required
only if the filter is not of the default type <code>AP_FTYPE_RESOURCE</code>.</dd>
<dt>Register Providers</dt>
<dd>The <directive module="mod_filter">FilterProvider</directive>
directive registers a provider with a filter. The filter may have
been declared with <directive module="mod_filter"
>FilterDeclare</directive>; if not, FilterProvider will implicitly
declare it with the default type AP_FTYPE_RESOURCE. The provider
must have been
registered with <code>ap_register_output_filter</code> by some module.
The final argument to <directive module="mod_filter"
>FilterProvider</directive> is an expression: the provider will be
selected to run for a request if and only if the expression evaluates
to true. The expression may evaluate HTTP request or response
headers, environment variables, or the Handler used by this request.
Unlike earlier versions, <module>mod_filter</module> now supports complex
expressions involving multiple criteria with AND / OR logic
(<code>&amp;&amp;</code> / <code>||</code>) and brackets.</dd>
<dt>Configure the Chain</dt>
<dd>The above directives build components of a smart filter chain,
but do not configure it to run. The <directive module="mod_filter"
>FilterChain</directive> directive builds a filter chain from smart
filters declared, offering the flexibility to insert filters at the
beginning or end of the chain, remove a filter, or clear the chain.</dd>
</dl>
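<p>The three stages can be sketched together as follows. This is an
illustrative configuration only: the filter name <code>compress</code> is
hypothetical, and <code>DEFLATE</code> is assumed to be available as a
provider (it is registered by <module>mod_deflate</module> when that
module is loaded).</p>
<example>
# 1. Declare the smart filter and give it a non-default type<br/>
FilterDeclare compress CONTENT_SET<br/>
# 2. Register a provider, selected by a compound expression<br/>
FilterProvider compress DEFLATE "$resp{Content-Type} = /^text\// &amp;&amp; $req{Accept-Encoding} = /gzip/"<br/>
# 3. Add the smart filter to the chain<br/>
FilterChain compress
</example>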
</section>
<section id="upgrade"><title>Upgrading from HTTPD 2.2 Configuration</title>
<p>The <directive module="mod_filter">FilterProvider</directive>
directive has changed from HTTPD 2.2: the <var>match</var> and
<var>dispatch</var> arguments are replaced with a single but
more versatile <var>expression</var>. In general, you can convert
a match/dispatch pair to the two sides of an expression, using
something like:</p>
<example>"dispatch = match"</example>
<p>The Request headers, Response headers and Environment variables
are now interpreted from syntax <var>$req{foo}</var>,
<var>$resp{foo}</var> and <var>$env{foo}</var> respectively.
The variables <var>$handler</var> and <var>$Content-Type</var>
are also supported.</p>
<p>Note that the match no longer supports integer comparisons
or substring matches. The latter can be replaced by regular
expression matches.</p>
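<p>As a sketch of such a conversion, assume a 2.2 configuration that
dispatched on the response <code>Content-Type</code> header with a
substring match. The substring match becomes a regular expression in the
new syntax:</p>
<example>
# HTTPD 2.2 form: dispatch, then match<br/>
FilterProvider SSI INCLUDES resp=Content-Type $text/html<br/>
# Equivalent form using a single expression<br/>
FilterProvider SSI INCLUDES "$resp{Content-Type} = /text\/html/"
</example>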
</section>
<section id="examples"><title>Examples</title>
<dl>
<dt>Server side Includes (SSI)</dt>
<dd>A simple case of using <module>mod_filter</module> in place of
<directive module="core">AddOutputFilterByType</directive>
<example>
FilterDeclare SSI<br/>
FilterProvider SSI INCLUDES "$resp{Content-Type} = /^text\/html/"<br/>
FilterChain SSI
</example>
</dd>
<dt>Server side Includes (SSI)</dt>
<dd>The same as the above but dispatching on handler (classic
SSI behaviour; .shtml files get processed).
<example>
FilterProvider SSI INCLUDES "$handler = server-parsed"<br/>
FilterChain SSI
</example>
</dd>
<dt>Emulating mod_gzip with mod_deflate</dt>
<dd>Insert INFLATE filter only if "gzip" is NOT in the
Accept-Encoding header. This filter runs with ftype CONTENT_SET.
<example>
FilterDeclare gzip CONTENT_SET<br/>
FilterProvider gzip inflate "$req{Accept-Encoding} != /gzip/"<br/>
FilterChain gzip
</example>
</dd>
<dt>Image Downsampling</dt>
<dd>Suppose we want to downsample all web images, and have filters
for GIF, JPEG and PNG.
<example>
FilterProvider unpack jpeg_unpack "$resp{Content-Type} = image/jpeg"<br/>
FilterProvider unpack gif_unpack "$resp{Content-Type} = image/gif"<br/>
FilterProvider unpack png_unpack "$resp{Content-Type} = image/png"<br/>
<br />
FilterProvider downsample downsample_filter "$resp{Content-Type} = /image\/(jpeg|gif|png)/"<br/>
FilterProtocol downsample "change=yes"<br/>
<br />
FilterProvider repack jpeg_pack "$resp{Content-Type} = image/jpeg"<br/>
FilterProvider repack gif_pack "$resp{Content-Type} = image/gif"<br/>
FilterProvider repack png_pack "$resp{Content-Type} = image/png"<br/>
&lt;Location /image-filter&gt;<br/>
<indent>
FilterChain unpack downsample repack<br/>
</indent>
&lt;/Location&gt;
</example>
</dd>
</dl>
</section>
<section id="protocol"><title>Protocol Handling</title>
<p>Historically, each filter is responsible for ensuring that whatever
changes it makes are correctly represented in the HTTP response headers,
and that it does not run when it would make an illegal change. This
imposes a burden on filter authors to re-implement some common
functionality in every filter:</p>
<ul>
<li>Many filters will change the content, invalidating existing content
tags, checksums, hashes, and lengths.</li>
<li>Filters that require an entire, unbroken response in input need to
ensure they don't get byteranges from a backend.</li>
<li>Filters that transform output need to ensure they don't
violate a <code>Cache-Control: no-transform</code> header from the
backend.</li>
<li>Filters may make responses uncacheable.</li>
</ul>
<p><module>mod_filter</module> aims to offer generic handling of these
details of filter implementation, reducing the complexity required of
content filter modules. This is work-in-progress; the
<directive module="mod_filter">FilterProtocol</directive> implements
some of this functionality for back-compatibility with Apache 2.0
modules. For httpd 2.1 and later, the
<code>ap_register_output_filter_protocol</code> and
<code>ap_filter_protocol</code> API enables filter modules to
declare their own behaviour.</p>
<p>At the same time, <module>mod_filter</module> should not interfere
with a filter that wants to handle all aspects of the protocol. By
default (i.e. in the absence of any <directive module="mod_filter"
>FilterProtocol</directive> directives), <module>mod_filter</module>
will leave the headers untouched.</p>
<p>At the time of writing, this feature is largely untested,
as modules in common use are designed to work with 2.0.
Modules using it should test it carefully.</p>
</section>
<directivesynopsis>
<name>FilterDeclare</name>
<description>Declare a smart filter</description>
<syntax>FilterDeclare <var>filter-name</var> <var>[type]</var></syntax>
<contextlist><context>server config</context><context>virtual host</context>
<context>directory</context><context>.htaccess</context></contextlist>
<override>Options</override>
<usage>
<p>This directive declares an output filter, assigning it a name and,
optionally, a filter type. The first argument is a <var>filter-name</var>
for use in <directive module="mod_filter">FilterProvider</directive>,
<directive module="mod_filter">FilterChain</directive> and
<directive module="mod_filter">FilterProtocol</directive> directives.</p>
<p>The final (optional) argument
is the type of filter, and takes values of <code>ap_filter_type</code>
- namely <code>RESOURCE</code> (the default), <code>CONTENT_SET</code>,
<code>PROTOCOL</code>, <code>TRANSCODE</code>, <code>CONNECTION</code>
or <code>NETWORK</code>.</p>
</usage>
</directivesynopsis>
<directivesynopsis>
<name>FilterProvider</name>
<description>Register a content filter</description>
<syntax>FilterProvider <var>filter-name</var> <var>provider-name</var>
<var>expression</var></syntax>
<contextlist><context>server config</context><context>virtual host</context>
<context>directory</context><context>.htaccess</context></contextlist>
<override>Options</override>
<usage>
<p>This directive registers a <em>provider</em> for the smart filter.
The provider will be called if and only if the <var>expression</var>
declared evaluates to true when the harness is first called.</p>
<p>
<var>provider-name</var> must have been registered by loading
a module that registers the name with
<code>ap_register_output_filter</code>.
</p>
<p><var>expression</var> can be any of the following:</p>
<dl>
<dt><code><var>string</var></code></dt>
<dd>true if <var>string</var> is not empty</dd>
<dt><code><var>string1</var> = <var>string2</var><br />
<var>string1</var> == <var>string2</var><br />
<var>string1</var> != <var>string2</var></code></dt>
<dd><p>Compare <var>string1</var> with <var>string2</var>. If
<var>string2</var> has the form <code>/<var>string2</var>/</code>
then it is treated as a regular expression. Regular expressions are
implemented by the <a href="http://www.pcre.org">PCRE</a> engine and
have the same syntax as those in <a href="http://www.perl.com">Perl
5</a>. Note that <code>==</code> is just an alias for <code>=</code>
and behaves exactly the same way.</p>
</dd>
<dt><code><var>string1</var> &lt; <var>string2</var><br />
<var>string1</var> &lt;= <var>string2</var><br />
<var>string1</var> &gt; <var>string2</var><br />
<var>string1</var> &gt;= <var>string2</var></code></dt>
<dd>Compare <var>string1</var> with <var>string2</var>. Note that
strings are compared <em>literally</em> (using
<code>strcmp(3)</code>). Therefore the string "100" is less than
"20".</dd>
<dt><code>( <var>expression</var> )</code></dt>
<dd>true if <var>expression</var> is true</dd>
<dt><code>! <var>expression</var></code></dt>
<dd>true if <var>expression</var> is false</dd>
<dt><code><var>expression1</var> &amp;&amp;
<var>expression2</var></code></dt>
<dd>true if both <var>expression1</var> and
<var>expression2</var> are true</dd>
<dt><code><var>expression1</var> ||
<var>expression2</var></code></dt>
<dd>true if either <var>expression1</var> or
<var>expression2</var> is true</dd>
</dl>
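<p>A few of these forms are sketched below. The provider
<code>INCLUDES</code> follows the examples earlier in this document; the
filter names and the environment variable <code>no_ssi</code> are
hypothetical and chosen only for illustration.</p>
<example>
# Regular expression match on a response header<br/>
FilterProvider html-ssi INCLUDES "$resp{Content-Type} = /^text\/html/"<br/>
# Grouping and OR: either of two content types<br/>
FilterProvider markup-ssi INCLUDES "($resp{Content-Type} = /^text\/html/) || ($resp{Content-Type} = /^application\/xhtml\+xml/)"<br/>
# Negated bare string test: true when $env{no_ssi} expands to an empty string<br/>
FilterProvider opt-out-ssi INCLUDES "! $env{no_ssi}"
</example>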
</usage>
</directivesynopsis>
<directivesynopsis>
<name>FilterChain</name>
<description>Configure the filter chain</description>
<syntax>FilterChain [+=-@!]<var>filter-name</var> <var>...</var></syntax>
<contextlist><context>server config</context><context>virtual host</context>
<context>directory</context><context>.htaccess</context></contextlist>
<override>Options</override>
<usage>
<p>This configures an actual filter chain, from declared filters.
<directive>FilterChain</directive> takes any number of arguments,
each optionally preceded with a single-character control that
determines what to do:</p>
<dl>
<dt><code>+<var>filter-name</var></code></dt>
<dd>Add <var>filter-name</var> to the end of the filter chain</dd>
<dt><code>@<var>filter-name</var></code></dt>
<dd>Insert <var>filter-name</var> at the start of the filter chain</dd>
<dt><code>-<var>filter-name</var></code></dt>
<dd>Remove <var>filter-name</var> from the filter chain</dd>
<dt><code>=<var>filter-name</var></code></dt>
<dd>Empty the filter chain and insert <var>filter-name</var></dd>
<dt><code>!</code></dt>
<dd>Empty the filter chain</dd>
<dt><code><var>filter-name</var></code></dt>
<dd>Equivalent to <code>+<var>filter-name</var></code></dd>
</dl>
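<p>For instance, assuming smart filters named <code>SSI</code> and
<code>gzip</code> have been set up as in the examples earlier in this
document, the controls might be combined as sketched here. The paths are
hypothetical.</p>
<example>
# Append both filters to the chain<br/>
FilterChain SSI gzip<br/>
&lt;Location /downloads&gt;<br/>
<indent>
# Remove gzip from the chain for this location<br/>
FilterChain -gzip<br/>
</indent>
&lt;/Location&gt;<br/>
&lt;Location /raw&gt;<br/>
<indent>
# Empty the chain entirely<br/>
FilterChain !<br/>
</indent>
&lt;/Location&gt;
</example>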
</usage>
</directivesynopsis>
<directivesynopsis>
<name>FilterProtocol</name>
<description>Deal with correct HTTP protocol handling</description>
<syntax>FilterProtocol <var>filter-name</var> [<var>provider-name</var>]
<var>proto-flags</var></syntax>
<contextlist><context>server config</context><context>virtual host</context>
<context>directory</context><context>.htaccess</context></contextlist>
<override>Options</override>
<usage>
<p>This directs <module>mod_filter</module> to deal with ensuring the
filter doesn't run when it shouldn't, and that the HTTP response
headers are correctly set taking into account the effects of the
filter.</p>
<p>There are two forms of this directive. With three arguments, it
applies specifically to a <var>filter-name</var> and a
<var>provider-name</var> for that filter.
With two arguments it applies to a <var>filter-name</var> whenever the
filter runs <em>any</em> provider.</p>
<p><var>proto-flags</var> is one or more of</p>
<dl>
<dt><code>change=yes</code></dt>
<dd>The filter changes the content, including possibly the content
length</dd>
<dt><code>change=1:1</code></dt>
<dd>The filter changes the content, but will not change the content
length</dd>
<dt><code>byteranges=no</code></dt>
<dd>The filter cannot work on byteranges and requires complete input</dd>
<dt><code>proxy=no</code></dt>
<dd>The filter should not run in a proxy context</dd>
<dt><code>proxy=transform</code></dt>
<dd>The filter transforms the response in a manner incompatible with
the HTTP <code>Cache-Control: no-transform</code> header.</dd>
<dt><code>cache=no</code></dt>
<dd>The filter renders the output uncacheable (e.g. by introducing randomised
content changes)</dd>
</dl>
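<p>The two forms might look like this, reusing the filter and provider
names from the image downsampling example earlier in this document. The
choice of flags is an assumption about how those illustrative filters
behave, not a recommendation.</p>
<example>
# Two arguments: applies to whichever provider runs for "unpack"<br/>
FilterProtocol unpack "byteranges=no"<br/>
# Three arguments: applies only when "repack" runs the jpeg_pack provider<br/>
FilterProtocol repack jpeg_pack "change=yes"
</example>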
</usage>
</directivesynopsis>
<directivesynopsis>
<name>FilterTrace</name>
<description>Get debug/diagnostic information from
<module>mod_filter</module></description>
<syntax>FilterTrace <var>filter-name</var> <var>level</var></syntax>
<contextlist><context>server config</context><context>virtual host</context>
<context>directory</context></contextlist>
<usage>
<p>This directive generates debug information from
<module>mod_filter</module>.
It is designed to help test and debug providers (filter modules), although
it may also help with <module>mod_filter</module> itself.</p>
<p>The debug output depends on the <var>level</var> set:</p>
<dl>
<dt><code>0</code> (default)</dt>
<dd>No debug information is generated.</dd>
<dt><code>1</code></dt>
<dd><module>mod_filter</module> will record buckets and brigades
passing through the filter to the error log, before the provider has
processed them. This is similar to the information generated by
<a href="http://apache.webthing.com/mod_diagnostics/">mod_diagnostics</a>.
</dd>
<dt><code>2</code> (not yet implemented)</dt>
<dd>Will dump the full data passing through to a tempfile before the
provider. <strong>For single-user debug only</strong>; this will not
support concurrent hits.</dd>
</dl>
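<p>For example, assuming a smart filter named <code>SSI</code> has been
configured as in the examples earlier in this document, level 1 tracing
for that filter could be enabled with:</p>
<example>
FilterTrace SSI 1
</example>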
</usage>
</directivesynopsis>
</modulesynopsis>