<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<HTML>
 <HEAD>
  <TITLE>Apache module mod_charset_lite</TITLE>
 </HEAD>
<!-- Background white, links blue (unvisited), navy (visited), red (active) -->
 <BODY
  BGCOLOR="#FFFFFF"
  TEXT="#000000"
  LINK="#0000FF"
  VLINK="#000080"
  ALINK="#FF0000"
 >
<!--#include virtual="header.html" -->
  <H1 ALIGN="CENTER">Module mod_charset_lite</H1>

<p>This module provides the ability to specify character set
  translation or recoding.</p>

<P><A
HREF="module-dict.html#Status"
REL="Help"
><STRONG>Status:</STRONG></A> Experimental
<BR>
<A
HREF="module-dict.html#SourceFile"
REL="Help"
><STRONG>Source File:</STRONG></A> mod_charset_lite.c
<BR>
<A
HREF="module-dict.html#ModuleIdentifier"
REL="Help"
><STRONG>Module Identifier:</STRONG></A> charset_lite_module
</P>

  <H2>Summary</H2>
  <P>
  This is an <STRONG>experimental</STRONG> module and should be used with
  care.  Experiment with your <CODE>mod_charset_lite</CODE> configuration to
  ensure that it performs the desired function.
  </P>
  <P>
  <CODE>mod_charset_lite</CODE> allows the administrator to specify the
  source character set of objects as well as the character set they should
  be translated into before they are sent to the client.
  <CODE>mod_charset_lite</CODE> does not translate the data itself but
  instead tells Apache what translation to perform.  
  <CODE>mod_charset_lite</CODE> is applicable to EBCDIC and ASCII 
  host environments.  In an EBCDIC environment, Apache normally translates 
  text content from the code page of the Apache process locale to 
  ISO-8859-1.  <CODE>mod_charset_lite</CODE> can be used to specify that
  a different translation is to be performed.  In an ASCII environment,
  Apache normally performs no translation, so <CODE>mod_charset_lite</CODE>
  is needed in order for any translation to take place.
  </P>

  <p>This module will only work if <code>APACHE_XLATE</code> is defined
  at compile time.</p>

  <P>
  This module provides a small subset of configuration mechanisms
  implemented by Russian Apache and its associated <CODE>mod_charset</CODE>.
  </P>

  <H2>Directives</H2>
  <UL>
   <LI><A HREF="#charsetsourceenc">CharsetSourceEnc</A>
   <LI><A HREF="#charsetdefault">CharsetDefault</A>
   <LI><A HREF="#charsetoptions">CharsetOptions</A>
   </LI>
  </UL>

 <H2>Common Problems</H2>

  <H3>Invalid character set names</H3>

  <P>
  The character set name parameters of <CODE>CharsetSourceEnc</CODE> and
  <CODE>CharsetDefault</CODE> must be acceptable to the translation mechanism
  used by APR on the system where <CODE>mod_charset_lite</CODE> is deployed.
  These character set names are not standardized and are usually not the same
  as the corresponding values used in HTTP headers.  Currently, APR can only
  use iconv(3), so you can easily test your character set names using the
  iconv(1) program, as follows:
  </P>

  <PRE>
  iconv -f charsetsourceenc-value -t charsetdefault-value
  </PRE>

  <H3>Mismatch between character set of content and translation rules</H3>

  <P>
  If the translation rules don't make sense for the content, translation
  can fail in various ways, including:
  </P>

  <UL>
  <LI>
  The translation mechanism may return a bad return code, and the connection
  will be aborted.
  <LI>
  The translation mechanism may silently place special characters (e.g., question
  marks) in the output buffer when it cannot translate the input buffer.
  </UL>

  <HR>

  <H2><A NAME="charsetsourceenc">CharsetSourceEnc</A></H2>
  <P>
  <A
   HREF="directive-dict.html#Syntax"
   REL="Help"
  ><STRONG>Syntax:</STRONG></A> CharsetSourceEnc <EM>charset</EM>
  <BR>
  <A
   HREF="directive-dict.html#Default"
   REL="Help"
  ><STRONG>Default:</STRONG></A> <EM>None</EM>
  <BR>
  <A
   HREF="directive-dict.html#Context"
   REL="Help"
  ><STRONG>Context:</STRONG></A> directory, virtual host
  <BR>
  <A
   HREF="directive-dict.html#Override"
   REL="Help"
  ><STRONG>Override:</STRONG></A> <EM>FileInfo</EM>
  <BR>
  <A
   HREF="directive-dict.html#Status"
   REL="Help"
  ><STRONG>Status:</STRONG></A> Experimental
  <BR>
  <A
   HREF="directive-dict.html#Module"
   REL="Help"
  ><STRONG>Module:</STRONG></A> mod_charset_lite
  <BR>

  <P>
  The <CODE>CharsetSourceEnc</CODE> directive specifies the source charset
  of files in the associated container.
  </P>

  <P>
  The value of the <EM>charset</EM> argument must be accepted as a valid
  character set name by the character set support in APR.  Generally, this
  means that it must be supported by iconv.
  </P>

  Example:

  <PRE>
    &lt;Directory "/export/home/trawick/apacheinst/htdocs/convert"&gt;
    CharsetSourceEnc  UTF-16BE
    CharsetDefault    ISO8859-1
    &lt;/Directory&gt;
  </PRE>

  <P>
  The character set names in this example work with the iconv
  translation support in Solaris 8.
  </P>

<hr>

  <H2><A NAME="charsetdefault">CharsetDefault</A></H2>
  <P>
  <A
   HREF="directive-dict.html#Syntax"
   REL="Help"
  ><STRONG>Syntax:</STRONG></A> CharsetDefault <EM>charset</EM>
  <BR>
  <A
   HREF="directive-dict.html#Default"
   REL="Help"
  ><STRONG>Default:</STRONG></A> <EM>None</EM>
  <BR>
  <A
   HREF="directive-dict.html#Context"
   REL="Help"
  ><STRONG>Context:</STRONG></A> directory, virtual host
  <BR>
  <A
   HREF="directive-dict.html#Override"
   REL="Help"
  ><STRONG>Override:</STRONG></A> <EM>FileInfo</EM>
  <BR>
  <A
   HREF="directive-dict.html#Status"
   REL="Help"
  ><STRONG>Status:</STRONG></A> Experimental
  <BR>
  <A
   HREF="directive-dict.html#Module"
   REL="Help"
  ><STRONG>Module:</STRONG></A> mod_charset_lite
  <BR>

  <P>
  The <CODE>CharsetDefault</CODE> directive specifies the charset that
  content in the associated container should be translated to.
  </P>

  <P>
  The value of the <EM>charset</EM> argument must be accepted as a valid
  character set name by the character set support in APR.  Generally, this
  means that it must be supported by iconv.
  </P>

  Example:

  <PRE>
    &lt;Directory "/export/home/trawick/apacheinst/htdocs/convert"&gt;
    CharsetSourceEnc  UTF-16BE
    CharsetDefault    ISO8859-1
    &lt;/Directory&gt;
  </PRE>

  <P>

<hr>

  <H2><A NAME="charsetoptions">CharsetOptions</A></H2>
  <P>
  <A
   HREF="directive-dict.html#Syntax"
   REL="Help"
  ><STRONG>Syntax:</STRONG></A> CharsetOptions <EM>option</em>
  [<em>option</em>] ...
  <BR>
  <A
   HREF="directive-dict.html#Default"
   REL="Help"
  ><STRONG>Default:</STRONG></A> <EM>DebugLevel=0</EM> <EM>NoImplicitAdd</EM>
  <BR>
  <A
   HREF="directive-dict.html#Context"
   REL="Help"
  ><STRONG>Context:</STRONG></A> directory, virtual host
  <BR>
  <A
   HREF="directive-dict.html#Override"
   REL="Help"
  ><STRONG>Override:</STRONG></A> <EM>FileInfo</EM>
  <BR>
  <A
   HREF="directive-dict.html#Status"
   REL="Help"
  ><STRONG>Status:</STRONG></A> Experimental
  <BR>
  <A
   HREF="directive-dict.html#Module"
   REL="Help"
  ><STRONG>Module:</STRONG></A> mod_charset_lite
  <BR>

  <P>
  The <CODE>CharsetOptions</CODE> directive configures certain behaviors 
  of <CODE>mod_charset_lite</CODE>.  <EM>Option</EM> can be one of the following:
  <DL>
  <DT>DebugLevel=<EM>n</EM>
  <DD>
  The <SAMP>DebugLevel</SAMP> keyword allows you to specify the level of
  debug messages generated by <CODE>mod_charset_lite</CODE>.  By default, no
  messages are generated.  This is equivalent to <SAMP>DebugLevel=0</SAMP>.
  With higher numbers, more debug messages are generated, and server 
  performance will be degraded.  The actual meanings of the numeric values
  are described with the definitions of the DBGLVL_ constants near the 
  beginning of <CODE>mod_charset_lite.c</CODE>.
  <DT>ImplicitAdd | NoImplicitAdd
  <DD>
  The <SAMP>ImplicitAdd</SAMP> keyword specifies that 
  <CODE>mod_charset_lite</CODE> should implicitly insert its filter when
  the configuration specifies that the character set of content should be 
  translated.  If the filter chain is explicitly configured using the 
  AddOutputFilter directive, <SAMP>NoImplicitAdd</SAMP> should be specified so 
  that <CODE>mod_charset_lite</CODE> doesn't add its filter.
  </DL>
  </P>
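
  <P>
  As an illustration only, the following sketch adds a
  <CODE>CharsetOptions</CODE> line to the container used in the
  <CODE>CharsetSourceEnc</CODE> and <CODE>CharsetDefault</CODE> examples.
  The directory path and character set names are taken from those examples.
  The value <SAMP>DebugLevel=1</SAMP> is an arbitrary choice made here; check
  the DBGLVL_ constants in <CODE>mod_charset_lite.c</CODE> for what each
  level actually logs.  <SAMP>ImplicitAdd</SAMP> is shown on the assumption
  that the filter chain is not configured explicitly for this directory.
  </P>

  <PRE>
    &lt;Directory "/export/home/trawick/apacheinst/htdocs/convert"&gt;
    CharsetSourceEnc  UTF-16BE
    CharsetDefault    ISO8859-1
    # Illustrative values only: enable some debug messages and let
    # mod_charset_lite insert its translation filter by itself.
    CharsetOptions    DebugLevel=1 ImplicitAdd
    &lt;/Directory&gt;
  </PRE>

  <P>
  When debugging is finished, setting <SAMP>DebugLevel=0</SAMP> again avoids
  the performance cost of the extra messages.
  </P>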

 
<!--#include virtual="footer.html" -->
 </BODY>
</HTML>