Diffstat (limited to 'docs')
-rw-r--r-- docs/manual/developer/API.html 988
-rw-r--r-- docs/manual/misc/API.html 988
-rw-r--r-- docs/manual/misc/FAQ.html 162
-rw-r--r-- docs/manual/misc/client_block_api.html 70
-rw-r--r-- docs/manual/misc/compat_notes.html 108
-rw-r--r-- docs/manual/misc/security_tips.html 92
-rw-r--r-- docs/manual/platform/perf-bsd44.html 215
-rw-r--r-- docs/manual/platform/perf-dec.html 267
-rw-r--r-- docs/manual/platform/perf.html 134
9 files changed, 3024 insertions, 0 deletions
diff --git a/docs/manual/developer/API.html b/docs/manual/developer/API.html
new file mode 100644
index 0000000000..f860996e47
--- /dev/null
+++ b/docs/manual/developer/API.html
@@ -0,0 +1,988 @@
+<!--%hypertext -->
+<html><head>
+<title>Apache API notes</title>
+</head>
+<body>
+<!--/%hypertext -->
+<h1>Apache API notes</h1>
+
+These are some notes on the Apache API and the data structures you
+have to deal with, etc. They are not yet nearly complete, but
+hopefully, they will help you get your bearings. Keep in mind that
+the API is still subject to change as we gain experience with it.
+(See the TODO file for what <em>might</em> be coming). However,
+it will be easy to adapt modules to any changes that are made.
+(We have more modules to adapt than you do).
+<p>
+
+A few notes on general pedagogical style here. In the interest of
+conciseness, all structure declarations here are incomplete --- the
+real ones have more slots that I'm not telling you about. For the
+most part, these are reserved to one component of the server core or
+another, and should be altered by modules with caution. However, in
+some cases, they really are things I just haven't gotten around to
+yet. Welcome to the bleeding edge.<p>
+
+Finally, here's an outline, to give you some bare idea of what's
+coming up, and in what order:
+
+<ul>
+<li> <a href="#basics">Basic concepts.</a>
+<menu>
+ <li> <a href="#HMR">Handlers, Modules, and Requests</a>
+ <li> <a href="#moduletour">A brief tour of a module</a>
+</menu>
+<li> <a href="#handlers">How handlers work</a>
+<menu>
+ <li> <a href="#req_tour">A brief tour of the <code>request_rec</code></a>
+ <li> <a href="#req_orig">Where request_rec structures come from</a>
+ <li> <a href="#req_return">Handling requests, declining, and returning error codes</a>
+ <li> <a href="#resp_handlers">Special considerations for response handlers</a>
+ <li> <a href="#auth_handlers">Special considerations for authentication handlers</a>
+ <li> <a href="#log_handlers">Special considerations for logging handlers</a>
+</menu>
+<li> <a href="#pools">Resource allocation and resource pools</a>
+<li> <a href="#config">Configuration, commands and the like</a>
+<menu>
+ <li> <a href="#per-dir">Per-directory configuration structures</a>
+ <li> <a href="#commands">Command handling</a>
+ <li> <a href="#servconf">Side notes --- per-server configuration, virtual servers, etc.</a>
+</menu>
+</ul>
+
+<h2><a name="basics">Basic concepts.</a></h2>
+
+We begin with an overview of the basic concepts behind the
+API, and how they are manifested in the code.
+
+<h3><a name="HMR">Handlers, Modules, and Requests</a></h3>
+
+Apache breaks down request handling into a series of steps, more or
+less the same way the Netscape server API does (although this API has
+a few more stages than NetSite does, as hooks for stuff I thought
+might be useful in the future). These are:
+
+<ul>
+ <li> URI -&gt; Filename translation
+ <li> Auth ID checking [is the user who they say they are?]
+ <li> Auth access checking [is the user authorized <em>here</em>?]
+ <li> Access checking other than auth
+ <li> Determining MIME type of the object requested
+ <li> `Fixups' --- there aren't any of these yet, but the phase is
+ intended as a hook for possible extensions like
+ <code>SetEnv</code>, which don't really fit well elsewhere.
+ <li> Actually sending a response back to the client.
+ <li> Logging the request
+</ul>
+
+These phases are handled by looking at each of a succession of
+<em>modules</em>, checking whether each of them has a handler for the
+phase, and attempting to invoke it if so. The handler can typically do
+one of three things:
+
+<ul>
+ <li> <em>Handle</em> the request, and indicate that it has done so
+ by returning the magic constant <code>OK</code>.
+ <li> <em>Decline</em> to handle the request, by returning the magic
+ integer constant <code>DECLINED</code>. In this case, the
+ server behaves in all respects as if the handler simply hadn't
+ been there.
+ <li> Signal an error, by returning one of the HTTP error codes.
+ This terminates normal handling of the request, although an
+ ErrorDocument may be invoked to try to mop up, and it will be
+ logged in any case.
+</ul>
+
+Most phases are terminated by the first module that handles them;
+however, for logging, `fixups', and non-access authentication
+checking, all handlers always run (barring an error). Also, the
+response phase is unique in that modules may declare multiple handlers
+for it, via a dispatch table keyed on the MIME type of the requested
+object. Modules may declare a response-phase handler which can handle
+<em>any</em> request, by giving it the key <code>*/*</code> (i.e., a
+wildcard MIME type specification). However, wildcard handlers are
+only invoked if the server has already tried and failed to find a more
+specific response handler for the MIME type of the requested object
+(either none existed, or they all declined).<p>
+
+The handlers themselves are functions of one argument (a
+<code>request_rec</code> structure, vide infra), which return an
+integer, as above.<p>
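+
+As a rough illustration of the dispatch rule for phases that end at the
+first handler which does not decline, here is a minimal sketch. It is
+not the server's actual code: the cut-down <code>request_rec</code>,
+the constant values, <code>run_phase</code>, and the two sample
+handlers are all stand-ins invented for this example.
+
```c
#include <assert.h>
#include <stddef.h>

/* Stand-in constants; the real ones come from the server headers. */
#define OK        0
#define DECLINED (-1)

/* Cut-down stand-in for the real request_rec, which has many more slots. */
typedef struct request_rec {
    const char *uri;
    const char *filename;
} request_rec;

typedef int (*phase_handler)(request_rec *);

/* Run one phase: try each module's handler in order.  A NULL slot means
 * the module has no handler for this phase.  The first result other than
 * DECLINED (OK, or an HTTP error code) ends the phase. */
int run_phase(phase_handler *handlers, size_t n, request_rec *r)
{
    size_t i;

    assert(r != NULL);
    for (i = 0; i < n; ++i) {
        int result;

        if (handlers[i] == NULL)
            continue;
        result = handlers[i](r);
        if (result != DECLINED)
            return result;
    }
    return DECLINED;   /* nobody wanted it */
}

/* Hypothetical handler that never handles anything. */
int decline_everything(request_rec *r)
{
    (void)r;
    return DECLINED;
}

/* Hypothetical translation handler: maps every URI to one fixed,
 * made-up file name. */
int translate_fixed(request_rec *r)
{
    r->filename = "/www/index.html";
    return OK;
}
```
+
+Calling <code>run_phase</code> on a list holding
+<code>decline_everything</code>, an empty slot, and
+<code>translate_fixed</code> yields <code>OK</code>, with the filename
+filled in by the last of the three. Phases in which all handlers always
+run (logging, for instance) would need a loop that keeps going after
+<code>OK</code>.<p>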
+
+<h3><a name="moduletour">A brief tour of a module</a></h3>
+
+At this point, we need to explain the structure of a module. Our
+candidate will be one of the messier ones, the CGI module --- this
+handles both CGI scripts and the <code>ScriptAlias</code> config file
+command. It's actually a great deal more complicated than most
+modules, but if we're going to have only one example, it might as well
+be the one with its fingers in every place.<p>
+
+Let's begin with handlers. In order to handle the CGI scripts, the
+module declares a response handler for them. Because of
+<code>ScriptAlias</code>, it also has handlers for the name
+translation phase (to recognise <code>ScriptAlias</code>ed URIs) and the
+type-checking phase (any <code>ScriptAlias</code>ed request is typed
+as a CGI script).<p>
+
+The module needs to maintain some per (virtual)
+server information, namely, the <code>ScriptAlias</code>es in effect;
+the module structure therefore contains pointers to a function which
+builds these structures, and to another which combines two of them (in
+case the main server and a virtual server both have
+<code>ScriptAlias</code>es declared).<p>
+
+Finally, this module contains code to handle the
+<code>ScriptAlias</code> command itself. This particular module only
+declares one command, but there could be more, so modules have
+<em>command tables</em> which declare their commands, and describe
+where they are permitted, and how they are to be invoked. <p>
+
+A final note on the declared types of the arguments of some of these
+commands: a <code>pool</code> is a pointer to a <em>resource pool</em>
+structure; these are used by the server to keep track of the memory
+which has been allocated, files opened, etc., either to service a
+particular request, or to handle the process of configuring itself.
+That way, when the request is over (or, for the configuration pool,
+when the server is restarting), the memory can be freed, and the files
+closed, <i>en masse</i>, without anyone having to write explicit code to
+track them all down and dispose of them. Also, a
+<code>cmd_parms</code> structure contains various information about
+the config file being read, and other status information, which is
+sometimes of use to the function which processes a config-file command
+(such as <code>ScriptAlias</code>).
+
+With no further ado, the module itself:
+
+<pre>
+/* Declarations of handlers. */
+
+int translate_scriptalias (request_rec *);
+int type_scriptalias (request_rec *);
+int cgi_handler (request_rec *);
+
+/* Subsidiary dispatch table for response-phase handlers, by MIME type */
+
+handler_rec cgi_handlers[] = {
+{ "application/x-httpd-cgi", cgi_handler },
+{ NULL }
+};
+
+/* Declarations of routines to manipulate the module's configuration
+ * info. Note that these are returned, and passed in, as void *'s;
+ * the server core keeps track of them, but it doesn't, and can't,
+ * know their internal structure.
+ */
+
+void *make_cgi_server_config (pool *);
+void *merge_cgi_server_config (pool *, void *, void *);
+
+/* Declarations of routines to handle config-file commands */
+
+extern char *script_alias(cmd_parms *, void *per_dir_config, char *fake,
+ char *real);
+
+command_rec cgi_cmds[] = {
+{ "ScriptAlias", script_alias, NULL, RSRC_CONF, TAKE2,
+ "a fakename and a realname"},
+{ NULL }
+};
+
+module cgi_module = {
+ STANDARD_MODULE_STUFF,
+ NULL, /* initializer */
+ NULL, /* dir config creator */
+ NULL, /* dir merger --- default is to override */
+ make_cgi_server_config, /* server config */
+ merge_cgi_server_config, /* merge server config */
+ cgi_cmds, /* command table */
+ cgi_handlers, /* handlers */
+ translate_scriptalias, /* filename translation */
+ NULL, /* check_user_id */
+ NULL, /* check auth */
+ NULL, /* check access */
+ type_scriptalias, /* type_checker */
+ NULL, /* fixups */
+ NULL /* logger */
+};
+</pre>
+
+<h2><a name="handlers">How handlers work</a></h2>
+
+The sole argument to handlers is a <code>request_rec</code> structure.
+This structure describes a particular request which has been made to
+the server, on behalf of a client. In most cases, each connection to
+the client generates only one <code>request_rec</code> structure.<p>
+
+<h3><a name="req_tour">A brief tour of the <code>request_rec</code></a></h3>
+
+The <code>request_rec</code> contains pointers to a resource pool
+which will be cleared when the server is finished handling the
+request; to structures containing per-server and per-connection
+information, and most importantly, information on the request itself.<p>
+
+The most important such information is a small set of character
+strings describing attributes of the object being requested, including
+its URI, filename, content-type and content-encoding (these being filled
+in by the translation and type-check handlers which handle the
+request, respectively). <p>
+
+Other commonly used data items are tables giving the MIME headers on
+the client's original request, MIME headers to be sent back with the
+response (which modules can add to at will), and environment variables
+for any subprocesses which are spawned off in the course of servicing
+the request. These tables are manipulated using the
+<code>table_get</code> and <code>table_set</code> routines. <p>
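+
+To give a feel for how a handler might use these tables, here is a
+small sketch built on a toy <code>table</code>. Only the names
+<code>table_get</code> and <code>table_set</code> come from the text
+above; the signatures, the fixed-size array representation, the
+exact-match key comparison, the cut-down <code>request_rec</code>, and
+the <code>X-Seen-Agent</code> header are assumptions made for
+illustration, and the real routines may well differ.
+
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_TABLE_MAX 16

/* Toy stand-in for the server's table type: a fixed array of key/value
 * pointers.  It stores the pointers it is given and copies nothing. */
typedef struct {
    const char *key[TOY_TABLE_MAX];
    const char *val[TOY_TABLE_MAX];
    int n;
} table;

/* Return the value stored under key, or NULL if there is none. */
const char *table_get(const table *t, const char *key)
{
    int i;

    assert(t != NULL && key != NULL);
    for (i = 0; i < t->n; ++i)
        if (strcmp(t->key[i], key) == 0)
            return t->val[i];
    return NULL;
}

/* Store val under key, replacing any existing entry.  The toy version
 * silently drops the entry if the table is full. */
void table_set(table *t, const char *key, const char *val)
{
    int i;

    assert(t != NULL && key != NULL);
    for (i = 0; i < t->n; ++i) {
        if (strcmp(t->key[i], key) == 0) {
            t->val[i] = val;
            return;
        }
    }
    if (t->n < TOY_TABLE_MAX) {
        t->key[t->n] = key;
        t->val[t->n] = val;
        t->n++;
    }
}

/* Cut-down stand-in for request_rec, holding only the two tables used. */
typedef struct {
    table *headers_in;
    table *headers_out;
} request_rec;

/* Hypothetical handler fragment: copy the client's User-Agent header
 * into a made-up response header. */
void note_user_agent(request_rec *r)
{
    const char *agent = table_get(r->headers_in, "User-Agent");

    if (agent != NULL)
        table_set(r->headers_out, "X-Seen-Agent", agent);
}
```
+
+The point of the sketch is only the access pattern: look a header up
+in <code>headers_in</code>, and add to <code>headers_out</code> at
+will.<p>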
+
+Finally, there are pointers to two data structures which, in turn,
+point to per-module configuration structures. Specifically, these
+hold pointers to the data structures which the module has built to
+describe the way it has been configured to operate in a given
+directory (via <code>.htaccess</code> files or
+<code>&lt;Directory&gt;</code> sections), and for private data it has
+built in the course of servicing the request (so modules' handlers for
+one phase can pass `notes' to their handlers for other phases). There
+is another such configuration vector in the <code>server_rec</code>
+data structure pointed to by the <code>request_rec</code>, which
+contains per (virtual) server configuration data.<p>
+
+Here is an abridged declaration, giving the fields most commonly used:<p>
+
+<pre>
+struct request_rec {
+
+ pool *pool;
+ conn_rec *connection;
+ server_rec *server;
+
+ /* What object is being requested */
+
+ char *uri;
+ char *filename;
+ char *path_info;
+ char *args; /* QUERY_ARGS, if any */
+ struct stat finfo; /* Set by server core;
+ * st_mode set to zero if no such file */
+
+ char *content_type;
+ char *content_encoding;
+
+ /* MIME header environments, in and out. Also, an array containing
+ * environment variables to be passed to subprocesses, so people can
+ * write modules to add to that environment.
+ *
+ * The difference between headers_out and err_headers_out is that
+ * the latter are printed even on error, and persist across internal
+ * redirects (so the headers printed for ErrorDocument handlers will
+ * have them).
+ */
+
+ table *headers_in;
+ table *headers_out;
+ table *err_headers_out;
+ table *subprocess_env;
+
+ /* Info about the request itself... */
+
+ int header_only; /* HEAD request, as opposed to GET */
+ char *protocol; /* Protocol, as given to us, or HTTP/0.9 */
+ char *method; /* GET, HEAD, POST, etc. */
+ int method_number; /* M_GET, M_POST, etc. */
+
+ /* Info for logging */
+
+ char *the_request;
+ int bytes_sent;
+
+ /* A flag which modules can set, to indicate that the data being
+ * returned is volatile, and clients should be told not to cache it.
+ */
+
+ int no_cache;
+
+ /* Various other config info which may change with .htaccess files
+ * These are config vectors, with one void* pointer for each module
+ * (the thing pointed to being the module's business).
+ */
+
+ void *per_dir_config; /* Options set in config files, etc. */
+ void *request_config; /* Notes on *this* request */
+
+};
+
+</pre>
+
+<h3><a name="req_orig">Where request_rec structures come from</a></h3>
+
+Most <code>request_rec</code> structures are built by reading an HTTP
+request from a client, and filling in the fields. However, there are
+a few exceptions:
+
+<ul>
+ <li> If the request is to an imagemap, a type map (i.e., a
+ <code>*.var</code> file), or a CGI script which returned a
+ local `Location:', then the resource which the user requested
+ is going to be ultimately located by some URI other than what
+ the client originally supplied. In this case, the server does
+ an <em>internal redirect</em>, constructing a new
+ <code>request_rec</code> for the new URI, and processing it
+ almost exactly as if the client had requested the new URI
+ directly. <p>
+
+ <li> If some handler signaled an error, and an
+ <code>ErrorDocument</code> is in scope, the same internal
+ redirect machinery comes into play.<p>
+
+ <li> Finally, a handler occasionally needs to investigate `what
+ would happen if' some other request were run. For instance,
+ the directory indexing module needs to know what MIME type
+ would be assigned to a request for each directory entry, in
+ order to figure out what icon to use.<p>
+
+ Such handlers can construct a <em>sub-request</em>, using the
+ functions <code>sub_req_lookup_file</code> and
+ <code>sub_req_lookup_uri</code>; this constructs a new
+ <code>request_rec</code> structure and processes it as you
+ would expect, up to but not including the point of actually
+ sending a response. (These functions skip over the access
+ checks if the sub-request is for a file in the same directory
+ as the original request).<p>
+
+ (Server-side includes work by building sub-requests and then
+ actually invoking the response handler for them, via the
+ function <code>run_sub_request</code>).
+</ul>
+
+<h3><a name="req_return">Handling requests, declining, and returning error codes</a></h3>
+
+As discussed above, each handler, when invoked to handle a particular
+<code>request_rec</code>, has to return an <code>int</code> to
+indicate what happened. That can either be
+
+<ul>
+ <li> OK --- the request was handled successfully. This may or may
+ not terminate the phase.
+ <li> DECLINED --- no erroneous condition exists, but the module
+ declines to handle the phase; the server tries to find another.
+ <li> an HTTP error code, which aborts handling of the request.
+</ul>
+
+Note that if the error code returned is <code>REDIRECT</code>, then
+the module should put a <code>Location</code> in the request's
+<code>headers_out</code>, to indicate where the client should be
+redirected <em>to</em>. <p>
+
+<h3><a name="resp_handlers">Special considerations for response handlers</a></h3>
+
+Handlers for most phases do their work by simply setting a few fields
+in the <code>request_rec</code> structure (or, in the case of access
+checkers, simply by returning the correct error code). However,
+response handlers have to actually send a response back to the client. <p>
+
+They should begin by sending an HTTP response header, using the
+function <code>send_http_header</code>. (You don't have to do
+anything special to skip sending the header for HTTP/0.9 requests; the
+function figures out on its own that it shouldn't do anything). If
+the request is marked <code>header_only</code>, that's all they should
+do; they should return after that, without attempting any further
+output. <p>
+
+Otherwise, they should produce a response body which responds to the
+client as appropriate. The primitives for this are <code>rputc</code>
+and <code>rprintf</code>, for internally generated output, and
+<code>send_fd</code>, to copy the contents of some <code>FILE *</code>
+straight to the client. <p>
+
+At this point, you should more or less understand the following piece
+of code, the handler for <code>GET</code> requests
+that have no more specific handler; it also shows how conditional
+<code>GET</code>s can be handled, if it's desirable to do so in a
+particular response handler --- <code>set_last_modified</code> checks
+against the <code>If-modified-since</code> value supplied by the
+client, if any, and returns an appropriate code (which will, if
+nonzero, be USE_LOCAL_COPY). No similar considerations apply for
+<code>set_content_length</code>, but it returns an error code for
+symmetry.<p>
+
+<pre>
+int default_handler (request_rec *r)
+{
+ int errstatus;
+ FILE *f;
+
+ if (r-&gt;method_number != M_GET) return DECLINED;
+ if (r-&gt;finfo.st_mode == 0) return NOT_FOUND;
+
+ if ((errstatus = set_content_length (r, r-&gt;finfo.st_size))
+ || (errstatus = set_last_modified (r, r-&gt;finfo.st_mtime)))
+ return errstatus;
+
+ f = pfopen (r-&gt;pool, r-&gt;filename, "r");
+
+ if (f == NULL) {
+ log_reason("file permissions deny server access",
+ r-&gt;filename, r);
+ return FORBIDDEN;
+ }
+
+ register_timeout ("send", r);
+ send_http_header (r);
+
+ if (!r-&gt;header_only) send_fd (f, r);
+ pfclose (r-&gt;pool, f);
+ return OK;
+}
+</pre>
+
+Finally, if all of this is too much of a challenge, there are a few
+ways out of it. First off, as shown above, a response handler which
+has not yet produced any output can simply return an error code, in
+which case the server will automatically produce an error response.
+Secondly, it can punt to some other handler by invoking
+<code>internal_redirect</code>, which is how the internal redirection
+machinery discussed above is invoked. A response handler which has
+internally redirected should always return <code>OK</code>. <p>
+
+(Invoking <code>internal_redirect</code> from handlers which are
+<em>not</em> response handlers will lead to serious confusion).
+
+<h3><a name="auth_handlers">Special considerations for authentication handlers</a></h3>
+
+Stuff that should be discussed here in detail:
+
+<ul>
+ <li> Authentication-phase handlers not invoked unless auth is
+ configured for the directory.
+ <li> Common auth configuration stored in the core per-dir
+ configuration; it has accessors <code>auth_type</code>,
+ <code>auth_name</code>, and <code>requires</code>.
+ <li> Common routines, to handle the protocol end of things, at least
+ for HTTP basic authentication (<code>get_basic_auth_pw</code>,
+ which sets the <code>connection-&gt;user</code> structure field
+ automatically, and <code>note_basic_auth_failure</code>, which
+ arranges for the proper <code>WWW-Authenticate:</code> header
+ to be sent back).
+</ul>
+
+<h3><a name="log_handlers">Special considerations for logging handlers</a></h3>
+
+When a request has internally redirected, there is the question of
+what to log. Apache handles this by bundling the entire chain of
+redirects into a list of <code>request_rec</code> structures which are
+threaded through the <code>r-&gt;prev</code> and <code>r-&gt;next</code>
+pointers. The <code>request_rec</code> which is passed to the logging
+handlers in such cases is the one which was originally built for the
+initial request from the client; note that the bytes_sent field will
+only be correct in the last request in the chain (the one for which a
+response was actually sent).
+
+<h2><a name="pools">Resource allocation and resource pools</a></h2>
+
+One of the problems of writing and designing a server-pool server is
+that of preventing leakage, that is, allocating resources (memory,
+open files, etc.), without subsequently releasing them. The resource
+pool machinery is designed to make it easy to prevent this from
+happening, by allowing resources to be allocated in such a way that
+they are <em>automatically</em> released when the server is done with
+them. <p>
+
+The way this works is as follows: the memory allocated, files
+opened, etc., to deal with a particular request are tied to a
+<em>resource pool</em> which is allocated for the request. The pool
+is a data structure which itself tracks the resources in question. <p>
+
+When the request has been processed, the pool is <em>cleared</em>. At
+that point, all the memory associated with it is released for reuse,
+all files associated with it are closed, and any other clean-up
+functions which are associated with the pool are run. When this is
+over, we can be confident that all the resources tied to the pool have
+been released, and that none of them have leaked. <p>
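+
+The idea can be sketched with a toy pool that is nothing more than a
+list of cleanup callbacks. This is an illustration only, not the
+server's implementation: every <code>toy_</code> name below is
+invented, and <code>malloc</code> and <code>free</code> stand in for
+the real allocator.
+
```c
#include <assert.h>
#include <stdlib.h>

/* One recorded resource: how to release it, and what to release. */
typedef struct cleanup {
    void (*fn)(void *);
    void *data;
    struct cleanup *next;
} cleanup;

/* Toy resource pool: just a linked list of pending cleanups. */
typedef struct toy_pool {
    cleanup *cleanups;
} toy_pool;

/* Tie a resource to the pool by recording how to release it.
 * Returns 0 on success, -1 if the record could not be allocated. */
int toy_register_cleanup(toy_pool *p, void *data, void (*fn)(void *))
{
    cleanup *c;

    assert(p != NULL && fn != NULL);
    c = malloc(sizeof *c);
    if (c == NULL)
        return -1;
    c->fn = fn;
    c->data = data;
    c->next = p->cleanups;
    p->cleanups = c;
    return 0;
}

/* Allocate memory whose lifetime is tied to the pool. */
void *toy_palloc(toy_pool *p, size_t size)
{
    void *mem = malloc(size);

    if (mem != NULL && toy_register_cleanup(p, mem, free) != 0) {
        free(mem);
        mem = NULL;
    }
    return mem;
}

/* Clear the pool: run every cleanup, most recently registered first.
 * The pool itself stays usable afterwards. */
void toy_clear_pool(toy_pool *p)
{
    assert(p != NULL);
    while (p->cleanups != NULL) {
        cleanup *c = p->cleanups;

        p->cleanups = c->next;
        c->fn(c->data);
        free(c);
    }
}

/* Demonstration cleanup: counts how often it has been run. */
void toy_count_cleanup(void *data)
{
    ++*(int *)data;
}
```
+
+A handler would allocate with <code>toy_palloc</code> and register
+anything else it opens; a single <code>toy_clear_pool</code> at the end
+of the request then releases everything, whether or not the handler
+remembered it.<p>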
+
+Server restarts, and allocation of memory and resources for per-server
+configuration, are handled in a similar way. There is a
+<em>configuration pool</em>, which keeps track of resources which were
+allocated while reading the server configuration files, and handling
+the commands therein (for instance, the memory that was allocated for
+per-server module configuration, log files and other files that were
+opened, and so forth). When the server restarts, and has to reread
+the configuration files, the configuration pool is cleared, and so the
+memory and file descriptors which were taken up by reading them the
+last time are made available for reuse. <p>
+
+It should be noted that use of the pool machinery isn't generally
+obligatory, except for situations like logging handlers, where you
+really need to register cleanups to make sure that the log file gets
+closed when the server restarts (this is most easily done by using the
+function <code><a href="#pool-files">pfopen</a></code>, which also
+arranges for the underlying file descriptor to be closed before any
+child processes, such as for CGI scripts, are <code>exec</code>ed), or
+in case you are using the timeout machinery (which isn't yet even
+documented here). However, there are two benefits to using it:
+resources allocated to a pool never leak (even if you allocate a
+scratch string, and just forget about it); also, for memory
+allocation, <code>palloc</code> is generally faster than
+<code>malloc</code>.<p>
+
+We begin here by describing how memory is allocated to pools, and then
+discuss how other resources are tracked by the resource pool
+machinery.
+
+<h3>Allocation of memory in pools</h3>
+
+Memory is allocated to pools by calling the function
+<code>palloc</code>, which takes two arguments, one being a pointer to
+a resource pool structure, and the other being the amount of memory to
+allocate (in <code>char</code>s). Within handlers for handling
+requests, the most common way of getting a resource pool structure is
+by looking at the <code>pool</code> slot of the relevant
+<code>request_rec</code>; hence the repeated appearance of the
+following idiom in module code:
+
+<pre>
+int my_handler(request_rec *r)
+{
+ struct my_structure *foo;
+ ...
+
+ foo = (struct my_structure *)palloc (r-&gt;pool, sizeof(struct my_structure));
+}
+</pre>
+
+Note that <em>there is no <code>pfree</code></em> ---
+<code>palloc</code>ed memory is freed only when the associated
+resource pool is cleared. This means that <code>palloc</code> does not
+have to do as much accounting as <code>malloc()</code>; all it does in
+the typical case is to round up the size, bump a pointer, and do a
+range check.<p>
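+
+The "round up, bump a pointer, range check" sequence can be sketched
+as follows. This is a toy over a single caller-supplied buffer, not the
+real <code>palloc</code>: the <code>toy_</code> names and the 8-byte
+alignment are assumptions, the buffer is assumed to be suitably
+aligned, and where this sketch gives up and returns <code>NULL</code>
+a real allocator would presumably go and fetch more memory.
+
```c
#include <assert.h>
#include <stddef.h>

#define TOY_ALIGN 8   /* assumed alignment; must be a power of two */

/* One block of pool memory. */
typedef struct {
    char *first_avail;  /* next free byte */
    char *endp;         /* one past the last byte of the block */
} toy_block;

/* Set the block up over a caller-supplied buffer. */
void toy_block_init(toy_block *b, char *buf, size_t len)
{
    assert(b != NULL && buf != NULL);
    b->first_avail = buf;
    b->endp = buf + len;
}

/* Allocate size bytes from the block, or return NULL if they don't fit. */
void *toy_bump_alloc(toy_block *b, size_t size)
{
    size_t rounded;
    char *result;

    assert(b != NULL);

    /* Round the size up to a multiple of TOY_ALIGN. */
    rounded = (size + TOY_ALIGN - 1) & ~(size_t)(TOY_ALIGN - 1);

    /* Range check (the first test catches overflow in the rounding). */
    if (rounded < size || rounded > (size_t)(b->endp - b->first_avail))
        return NULL;

    /* Bump the pointer. */
    result = b->first_avail;
    b->first_avail += rounded;
    return result;
}
```
+
+Note that there is no per-allocation bookkeeping at all, which is why
+nothing here could support freeing an individual allocation.<p>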
+
+(It also raises the possibility that heavy use of <code>palloc</code>
+could cause a server process to grow excessively large. There are
+two ways to deal with this, which are dealt with below; briefly, you
+can use <code>malloc</code>, and try to be sure that all of the memory
+gets explicitly <code>free</code>d, or you can allocate a sub-pool of
+the main pool, allocate your memory in the sub-pool, and clear it out
+periodically. The latter technique is discussed in the section on
+sub-pools below, and is used in the directory-indexing code, in order
+to avoid excessive storage allocation when listing directories with
+thousands of files).
+
+<h3>Allocating initialized memory</h3>
+
+There are functions which allocate initialized memory, and are
+frequently useful. The function <code>pcalloc</code> has the same
+interface as <code>palloc</code>, but clears out the memory it
+allocates before it returns it. The function <code>pstrdup</code>
+takes a resource pool and a <code>char *</code> as arguments, and
+allocates memory for a copy of the string the pointer points to,
+returning a pointer to the copy. Finally <code>pstrcat</code> is a
+varargs-style function, which takes a pointer to a resource pool, and
+at least two <code>char *</code> arguments, the last of which must be
+<code>NULL</code>. It allocates enough memory to fit copies of each
+of the strings, as a unit; for instance:
+
+<pre>
+ pstrcat (r->pool, "foo", "/", "bar", NULL);
+</pre>
+
+returns a pointer to 8 bytes worth of memory, initialized to
+<code>"foo/bar"</code>.
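+
+For readers curious how such a varargs routine can be put together,
+here is a sketch of the two-pass approach (measure, then copy). It is
+not the server's <code>pstrcat</code>: the name
+<code>toy_strcat_all</code> is invented, it takes no pool argument, and
+it uses <code>malloc</code> as a stand-in for <code>palloc</code>, so
+the caller has to <code>free</code> the result.
+
```c
#include <assert.h>
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>

/* Concatenate a NULL-terminated list of strings into one freshly
 * allocated buffer.  Returns NULL if the allocation fails. */
char *toy_strcat_all(const char *first, ...)
{
    va_list ap;
    const char *s;
    size_t len = 0;
    char *result;
    char *out;

    assert(first != NULL);

    /* Pass 1: add up the lengths of all the pieces. */
    va_start(ap, first);
    for (s = first; s != NULL; s = va_arg(ap, const char *))
        len += strlen(s);
    va_end(ap);

    /* One allocation for the whole result, plus the terminating NUL. */
    result = malloc(len + 1);
    if (result == NULL)
        return NULL;

    /* Pass 2: copy the pieces in, one after another. */
    out = result;
    va_start(ap, first);
    for (s = first; s != NULL; s = va_arg(ap, const char *)) {
        size_t n = strlen(s);

        memcpy(out, s, n);
        out += n;
    }
    va_end(ap);
    *out = '\0';
    return result;
}
```
+
+With the arguments <code>"foo"</code>, <code>"/"</code>,
+<code>"bar"</code> and a terminating <code>(char *)NULL</code>, this
+allocates 8 bytes: seven characters plus the terminating NUL.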
+
+<h3>Tracking open files, etc.</h3>
+
+As indicated above, resource pools are also used to track other sorts
+of resources besides memory. The most common are open files. The
+routine which is typically used for this is <code>pfopen</code>, which
+takes a resource pool and two strings as arguments; the strings are
+the same as the typical arguments to <code>fopen</code>, e.g.,
+
+<pre>
+ ...
+ FILE *f = pfopen (r->pool, r->filename, "r");
+
+ if (f == NULL) { ... } else { ... }
+</pre>
+
+There is also a <code>popenf</code> routine, which parallels the
+lower-level <code>open</code> system call. Both of these routines
+arrange for the file to be closed when the resource pool in question
+is cleared. <p>
+
+Unlike the case for memory, there <em>are</em> functions to close
+files allocated with <code>pfopen</code>, and <code>popenf</code>,
+namely <code>pfclose</code> and <code>pclosef</code>. (This is
+because, on many systems, the number of files which a single process
+can have open is quite limited). It is important to use these
+functions to close files allocated with <code>pfopen</code> and
+<code>popenf</code>, since to do otherwise could cause fatal errors on
+systems such as Linux, which react badly if the same
+<code>FILE*</code> is closed more than once. <p>
+
+(Using the <code>close</code> functions is not mandatory, since the
+file will eventually be closed regardless, but you should consider it
+in cases where your module is opening, or could open, a lot of files).
+
+<h3>Other sorts of resources --- cleanup functions</h3>
+
+More text goes here. Describe the cleanup primitives in terms of
+which the file stuff is implemented; also, <code>spawn_process</code>.
+
+<h3>Fine control --- creating and dealing with sub-pools, with a note
+on sub-requests</h3>
+
+On rare occasions, too-free use of <code>palloc()</code> and the
+associated primitives may result in undesirably profligate resource
+allocation. You can deal with such a case by creating a
+<em>sub-pool</em>, allocating within the sub-pool rather than the main
+pool, and clearing or destroying the sub-pool, which releases the
+resources which were associated with it. (This really <em>is</em> a
+rare situation; the only case in which it comes up in the standard
+module set is in case of listing directories, and then only with
+<em>very</em> large directories. Unnecessary use of the primitives
+discussed here can hair up your code quite a bit, with very little
+gain). <p>
+
+The primitive for creating a sub-pool is <code>make_sub_pool</code>,
+which takes another pool (the parent pool) as an argument. When the
+main pool is cleared, the sub-pool will be destroyed. The sub-pool
+may also be cleared or destroyed at any time, by calling the functions
+<code>clear_pool</code> and <code>destroy_pool</code>, respectively.
+(The difference is that <code>clear_pool</code> frees resources
+associated with the pool, while <code>destroy_pool</code> also
+deallocates the pool itself. In the former case, you can allocate new
+resources within the pool, and clear it again, and so forth; in the
+latter case, it is simply gone). <p>
+
+One final note --- sub-requests have their own resource pools, which
+are sub-pools of the resource pool for the main request. The polite
+way to reclaim the resources associated with a sub-request which you
+have allocated (using the <code>sub_req_lookup_...</code> functions)
+is <code>destroy_sub_request</code>, which frees the resource pool.
+Before calling this function, be sure to copy anything that you care
+about which might be allocated in the sub-request's resource pool into
+someplace a little less volatile (for instance, the filename in its
+<code>request_rec</code> structure). <p>
+
+(Again, under most circumstances, you shouldn't feel obliged to call
+this function; only 2K of memory or so are allocated for a typical
+sub-request, and it will be freed anyway when the main request pool is
+cleared. It is only when you are allocating many, many sub-requests
+for a single main request that you should seriously consider the
+<code>destroy...</code> functions).
+
+<h2><a name="config">Configuration, commands and the like</a></h2>
+
+One of the design goals for this server was to maintain external
+compatibility with the NCSA 1.3 server --- that is, to read the same
+configuration files, to process all the directives therein correctly,
+and in general to be a drop-in replacement for NCSA. On the other
+hand, another design goal was to move as much of the server's
+functionality into modules which have as little as possible to do with
+the monolithic server core. The only way to reconcile these goals is
+to move the handling of most commands from the central server into the
+modules. <p>
+
+However, just giving the modules command tables is not enough to
+divorce them completely from the server core. The server has to
+remember the commands in order to act on them later. That involves
+maintaining data which is private to the modules, and which can be
+either per-server, or per-directory. Most things are per-directory,
+including in particular access control and authorization information,
+but also information on how to determine file types from suffixes,
+which can be modified by <code>AddType</code> and
+<code>DefaultType</code> directives, and so forth. In general, the
+governing philosophy is that anything which <em>can</em> be made
+configurable by directory should be; per-server information is
+generally used in the standard set of modules for information like
+<code>Alias</code>es and <code>Redirect</code>s which come into play
+before the request is tied to a particular place in the underlying
+file system. <p>
+
+Another requirement for emulating the NCSA server is being able to
+handle the per-directory configuration files, generally called
+<code>.htaccess</code> files, though even in the NCSA server they can
+contain directives which have nothing at all to do with access
+control. Accordingly, after URI -&gt; filename translation, but before
+performing any other phase, the server walks down the directory
+hierarchy of the underlying filesystem, following the translated
+pathname, to read any <code>.htaccess</code> files which might be
+present. The information which is read in then has to be
+<em>merged</em> with the applicable information from the server's own
+config files (either from the <code>&lt;Directory&gt;</code> sections
+in <code>access.conf</code>, or from defaults in
+<code>srm.conf</code>, which actually behaves for most purposes almost
+exactly like <code>&lt;Directory /&gt;</code>).<p>
+
+Finally, after having served a request which involved reading
+<code>.htaccess</code> files, we need to discard the storage allocated
+for handling them. That is solved the same way it is solved wherever
+else similar problems come up, by tying those structures to the
+per-transaction resource pool. <p>
+
+<h3><a name="per-dir">Per-directory configuration structures</a></h3>
+
+Let's look at how all of this plays out in <code>mod_mime.c</code>,
+which defines the file typing handler which emulates the NCSA server's
+behavior of determining file types from suffixes. What we'll be
+looking at, here, is the code which implements the
+<code>AddType</code> and <code>AddEncoding</code> commands. These
+commands can appear in <code>.htaccess</code> files, so they must be
+handled in the module's private per-directory data, which in fact
+consists of two separate <code>table</code>s for MIME types and
+encoding information, and is declared as follows:
+
+<pre>
+typedef struct {
+ table *forced_types; /* Additional AddTyped stuff */
+ table *encoding_types; /* Added with AddEncoding... */
+} mime_dir_config;
+</pre>
+
+When the server is reading a configuration file, or
+<code>&lt;Directory&gt;</code> section, which includes one of the MIME
+module's commands, it needs to create a <code>mime_dir_config</code>
+structure, so those commands have something to act on. It does this
+by invoking the function it finds in the module's `create per-dir
+config' slot, with two arguments: the name of the directory to which
+this configuration information applies (or <code>NULL</code> for
+<code>srm.conf</code>), and a pointer to a resource pool in which the
+allocation should happen. <p>
+
+(If we are reading a <code>.htaccess</code> file, that resource pool
+is the per-request resource pool for the request; otherwise it is a
+resource pool which is used for configuration data, and cleared on
+restarts. Either way, it is important for the structure being created
+to vanish when the pool is cleared, by registering a cleanup on the
+pool if necessary). <p>
+
+For the MIME module, the per-dir config creation function just
+<code>palloc</code>s the structure above, and creates a couple of
+<code>table</code>s to fill it. That looks like this:
+
+<pre>
+void *create_mime_dir_config (pool *p, char *dummy)
+{
+ mime_dir_config *new =
+ (mime_dir_config *) palloc (p, sizeof(mime_dir_config));
+
+ new-&gt;forced_types = make_table (p, 4);
+ new-&gt;encoding_types = make_table (p, 4);
+
+ return new;
+}
+</pre>
+
+Now, suppose we've just read in a <code>.htaccess</code> file. We
+already have the per-directory configuration structure for the next
+directory up in the hierarchy. If the <code>.htaccess</code> file we
+just read in didn't have any <code>AddType</code> or
+<code>AddEncoding</code> commands, its per-directory config structure
+for the MIME module is still valid, and we can just use it.
+Otherwise, we need to merge the two structures somehow. <p>
+
+To do that, the server invokes the module's per-directory config merge
+function, if one is present. That function takes three arguments:
+the two structures being merged, and a resource pool in which to
+allocate the result. For the MIME module, all that needs to be done
+is overlay the tables from the new per-directory config structure with
+those from the parent:
+
+<pre>
+void *merge_mime_dir_configs (pool *p, void *parent_dirv, void *subdirv)
+{
+ mime_dir_config *parent_dir = (mime_dir_config *)parent_dirv;
+ mime_dir_config *subdir = (mime_dir_config *)subdirv;
+ mime_dir_config *new =
+ (mime_dir_config *)palloc (p, sizeof(mime_dir_config));
+
+ new-&gt;forced_types = overlay_tables (p, subdir-&gt;forced_types,
+ parent_dir-&gt;forced_types);
+ new-&gt;encoding_types = overlay_tables (p, subdir-&gt;encoding_types,
+ parent_dir-&gt;encoding_types);
+
+ return new;
+}
+</pre>
+
+As a note --- if there is no per-directory merge function present, the
+server will just use the subdirectory's configuration info, and ignore
+the parent's. For some modules, that works just fine (e.g., for the
+includes module, whose per-directory configuration information
+consists solely of the state of the <code>XBITHACK</code>), and for
+those modules, you can just not declare one, and leave the
+corresponding structure slot in the module itself <code>NULL</code>.<p>
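+
+To make the precedence concrete, here is a minimal, self-contained sketch of
+the overlay idea. It does not use the server's <code>table</code> type; the
+<code>toy_table</code> type and the <code>toy_*</code> functions are
+hypothetical stand-ins, and the sketch assumes that lookups return the first
+match, so entries from the subdirectory shadow those from the parent. <p>
+

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the server's table type: a fixed-size
 * list of key/value pairs. This is NOT the real Apache `table'. */
#define TOY_MAX 16

typedef struct {
    const char *key[TOY_MAX];
    const char *val[TOY_MAX];
    int n;
} toy_table;

/* Append a pair; additions are silently dropped once the table is full. */
void toy_add(toy_table *t, const char *key, const char *val)
{
    if (t->n < TOY_MAX) {
        t->key[t->n] = key;
        t->val[t->n] = val;
        t->n++;
    }
}

/* Return the first value stored under key, or NULL if there is none. */
const char *toy_get(const toy_table *t, const char *key)
{
    int i;
    for (i = 0; i < t->n; i++)
        if (strcmp(t->key[i], key) == 0)
            return t->val[i];
    return NULL;
}

/* Build a merged table: the overlay's entries are copied first, so a
 * first-match lookup sees them before the base's entries. */
toy_table toy_overlay(const toy_table *overlay, const toy_table *base)
{
    toy_table out;
    int i;

    memset(&out, 0, sizeof out);
    for (i = 0; i < overlay->n; i++)
        toy_add(&out, overlay->key[i], overlay->val[i]);
    for (i = 0; i < base->n; i++)
        toy_add(&out, base->key[i], base->val[i]);
    return out;
}
```

+
+With a parent that maps <code>txt</code> to <code>text/plain</code> and a
+subdirectory that maps it to something else, a lookup in the merged table
+finds the subdirectory's value, while keys that only the parent defines
+remain visible. <p>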
+
+<h3><a name="commands">Command handling</a></h3>
+
+Now that we have these structures, we need to be able to figure out
+how to fill them. That involves processing the actual
+<code>AddType</code> and <code>AddEncoding</code> commands. To find
+commands, the server looks in the module's <code>command table</code>.
+That table contains information on how many arguments the commands
+take, and in what formats, where it is permitted, and so forth. That
+information is sufficient to allow the server to invoke most
+command-handling functions with pre-parsed arguments. Without further
+ado, let's look at the <code>AddType</code> command handler, which
+looks like this (the <code>AddEncoding</code> command looks basically
+the same, and won't be shown here):
+
+<pre>
+char *add_type(cmd_parms *cmd, mime_dir_config *m, char *ct, char *ext)
+{
+ if (*ext == '.') ++ext;
+ table_set (m-&gt;forced_types, ext, ct);
+ return NULL;
+}
+</pre>
+
+This command handler is unusually simple. As you can see, it takes
+four arguments: a pointer to a <code>cmd_parms</code> structure, the
+per-directory configuration structure for the module in question, and
+two pre-parsed arguments. The <code>cmd_parms</code> structure
+contains a bunch of arguments which are frequently of
+use to some, but not all, commands, including a resource pool (from
+which memory can be allocated, and to which cleanups should be tied),
+and the (virtual) server being configured, from which the module's
+per-server configuration data can be obtained if required.<p>
+
+Another way in which this particular command handler is unusually
+simple is that there are no error conditions which it can encounter.
+If there were, it could return an error message instead of
+<code>NULL</code>; this causes an error to be printed out on the
+server's <code>stderr</code>, followed by a quick exit, if it is in
+the main config files; for a <code>.htaccess</code> file, the syntax
+error is logged in the server error log (along with an indication of
+where it came from), and the request is bounced with a server error
+response (HTTP error status, code 500). <p>
+
+The MIME module's command table has entries for these commands, which
+look like this:
+
+<pre>
+command_rec mime_cmds[] = {
+{ "AddType", add_type, NULL, OR_FILEINFO, TAKE2,
+ "a mime type followed by a file extension" },
+{ "AddEncoding", add_encoding, NULL, OR_FILEINFO, TAKE2,
+ "an encoding (e.g., gzip), followed by a file extension" },
+{ NULL }
+};
+</pre>
+
+The entries in these tables are:
+
+<ul>
+ <li> The name of the command
+ <li> The function which handles it
+ <li> a <code>(void *)</code> pointer, which is passed in the
+ <code>cmd_parms</code> structure to the command handler ---
+ this is useful in case many similar commands are handled by the
+ same function.
+ <li> A bit mask indicating where the command may appear. There are
+ mask bits corresponding to each <code>AllowOverride</code>
+ option, and an additional mask bit, <code>RSRC_CONF</code>,
+ indicating that the command may appear in the server's own
+ config files, but <em>not</em> in any <code>.htaccess</code>
+ file.
+ <li> A flag indicating how many arguments the command handler wants
+ pre-parsed, and how they should be passed in.
+ <code>TAKE2</code> indicates two pre-parsed arguments. Other
+ options are <code>TAKE1</code>, which indicates one pre-parsed
+ argument, <code>FLAG</code>, which indicates that the argument
+ should be <code>On</code> or <code>Off</code>, and is passed in
+ as a boolean flag, and <code>RAW_ARGS</code>, which causes the
+ server to give the command the raw, unparsed arguments
+ (everything but the command name itself). There is also
+ <code>ITERATE</code>, which means that the handler looks the
+ same as <code>TAKE1</code>, but that if multiple arguments are
+ present, it should be called multiple times, and finally
+ <code>ITERATE2</code>, which indicates that the command handler
+ looks like a <code>TAKE2</code>, but if more arguments are
+ present, then it should be called multiple times, holding the
+ first argument constant.
+ <li> Finally, we have a string which describes the arguments that
+ should be present. If the arguments in the actual config file
+ are not as required, this string will be used to help give a
+ more specific error message. (You can safely leave this
+ <code>NULL</code>).
+</ul>
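+
+The difference between <code>TAKE2</code>, <code>ITERATE</code> and
+<code>ITERATE2</code> is easiest to see in code. What follows is only a
+sketch of how a dispatcher <em>could</em> honour those flags; it is not the
+server's implementation. Every name in it (<code>toy_how</code>,
+<code>toy_handler</code>, <code>toy_dispatch</code>,
+<code>toy_count_calls</code>) is hypothetical, and it assumes the directive's
+arguments have already been split into words. <p>
+

```c
#include <stddef.h>

/* Hypothetical argument styles, mirroring three of the flags named above. */
enum toy_how { TOY_TAKE2, TOY_ITERATE, TOY_ITERATE2 };

/* Hypothetical handler type: returns NULL on success or an error string.
 * `a2' is NULL when the handler is invoked in TOY_ITERATE style. */
typedef const char *(*toy_handler)(void *conf, const char *a1, const char *a2);

/* Call `fn' on the pre-split words argv[0..argc-1] according to `how'.
 * Returns NULL on success; otherwise the usage string (wrong argument
 * count) or the first error message the handler returned. */
const char *toy_dispatch(enum toy_how how, toy_handler fn, void *conf,
                         int argc, const char **argv, const char *usage)
{
    const char *err;
    int i;

    switch (how) {
    case TOY_TAKE2:                 /* exactly two arguments, one call */
        if (argc != 2) return usage;
        return fn(conf, argv[0], argv[1]);
    case TOY_ITERATE:               /* one call per argument */
        if (argc < 1) return usage;
        for (i = 0; i < argc; i++)
            if ((err = fn(conf, argv[i], NULL)) != NULL) return err;
        return NULL;
    case TOY_ITERATE2:              /* first argument held constant */
        if (argc < 2) return usage;
        for (i = 1; i < argc; i++)
            if ((err = fn(conf, argv[0], argv[i])) != NULL) return err;
        return NULL;
    }
    return usage;
}

/* Example handler: counts how many times it was invoked. */
const char *toy_count_calls(void *conf, const char *a1, const char *a2)
{
    (void)a1;
    (void)a2;
    ++*(int *)conf;
    return NULL;
}
```

+
+Given the three words <code>text/html html htm</code>, the
+<code>TOY_ITERATE2</code> style calls the handler twice (once per extension),
+<code>TOY_ITERATE</code> calls it three times, and <code>TOY_TAKE2</code>
+rejects the line without calling the handler at all. <p>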
+
+Finally, having set this all up, we have to use it. This is
+ultimately done in the module's handlers, specifically for its
+file-typing handler, which looks more or less like this; note that the
+per-directory configuration structure is extracted from the
+<code>request_rec</code>'s per-directory configuration vector by using
+the <code>get_module_config</code> function.
+
+<pre>
+int find_ct(request_rec *r)
+{
+ int i;
+ char *fn = pstrdup (r->pool, r->filename);
+ mime_dir_config *conf = (mime_dir_config *)
+ get_module_config(r->per_dir_config, &amp;mime_module);
+ char *type;
+
+ if (S_ISDIR(r->finfo.st_mode)) {
+ r->content_type = DIR_MAGIC_TYPE;
+ return OK;
+ }
+
+ if((i=rind(fn,'.')) &lt; 0) return DECLINED;
+ ++i;
+
+ if ((type = table_get (conf->encoding_types, &amp;fn[i])))
+ {
+ r->content_encoding = type;
+
+ /* go back to previous extension to try to use it as a type */
+
+ fn[i-1] = '\0';
+ if((i=rind(fn,'.')) &lt; 0) return OK;
+ ++i;
+ }
+
+ if ((type = table_get (conf->forced_types, &amp;fn[i])))
+ {
+ r->content_type = type;
+ }
+
+ return OK;
+}
+
+</pre>
+
+<h3><a name="servconf">Side notes --- per-server configuration, virtual servers, etc.</a></h3>
+
+The ideas behind per-server module configuration are basically
+the same as those for per-directory configuration; there is a creation
+function and a merge function, the latter being invoked where a
+virtual server has partially overridden the base server configuration,
+and a combined structure must be computed. (As with per-directory
+configuration, the default if no merge function is specified, and a
+module is configured in some virtual server, is that the base
+configuration is simply ignored). <p>
+
+The only substantial difference is that when a command needs to
+configure the per-server private module data, it needs to go to the
+<code>cmd_parms</code> data to get at it. Here's an example, from the
+alias module, which also indicates how a syntax error can be returned
+(note that the per-directory configuration argument to the command
+handler is declared as a dummy, since the module doesn't actually have
+per-directory config data):
+
+<pre>
+char *add_redirect(cmd_parms *cmd, void *dummy, char *f, char *url)
+{
+ server_rec *s = cmd->server;
+ alias_server_conf *conf = (alias_server_conf *)
+ get_module_config(s-&gt;module_config,&amp;alias_module);
+ alias_entry *new;
+
+ /* Validate first, so that a bad URL leaves no half-filled entry behind. */
+ if (!is_url (url)) return "Redirect to non-URL";
+
+ new = push_array (conf-&gt;redirects);
+ new-&gt;fake = f; new-&gt;real = url;
+ return NULL;
+}
+</pre>
+<!--%hypertext -->
+</body></html>
+<!--/%hypertext -->
diff --git a/docs/manual/misc/API.html b/docs/manual/misc/API.html
new file mode 100644
index 0000000000..f860996e47
--- /dev/null
+++ b/docs/manual/misc/API.html
@@ -0,0 +1,988 @@
+<!--%hypertext -->
+<html><head>
+<title>Apache API notes</title>
+</head>
+<body>
+<!--/%hypertext -->
+<h1>Apache API notes</h1>
+
+These are some notes on the Apache API and the data structures you
+have to deal with, etc. They are not yet nearly complete, but
+hopefully, they will help you get your bearings. Keep in mind that
+the API is still subject to change as we gain experience with it.
+(See the TODO file for what <em>might</em> be coming). However,
+it will be easy to adapt modules to any changes that are made.
+(We have more modules to adapt than you do).
+<p>
+
+A few notes on general pedagogical style here. In the interest of
+conciseness, all structure declarations here are incomplete --- the
+real ones have more slots that I'm not telling you about. For the
+most part, these are reserved to one component of the server core or
+another, and should be altered by modules with caution. However, in
+some cases, they really are things I just haven't gotten around to
+yet. Welcome to the bleeding edge.<p>
+
+Finally, here's an outline, to give you some bare idea of what's
+coming up, and in what order:
+
+<ul>
+<li> <a href="#basics">Basic concepts.</a>
+<menu>
+ <li> <a href="#HMR">Handlers, Modules, and Requests</a>
+ <li> <a href="#moduletour">A brief tour of a module</a>
+</menu>
+<li> <a href="#handlers">How handlers work</a>
+<menu>
+ <li> <a href="#req_tour">A brief tour of the <code>request_rec</code></a>
+ <li> <a href="#req_orig">Where request_rec structures come from</a>
+ <li> <a href="#req_return">Handling requests, declining, and returning error codes</a>
+ <li> <a href="#resp_handlers">Special considerations for response handlers</a>
+ <li> <a href="#auth_handlers">Special considerations for authentication handlers</a>
+ <li> <a href="#log_handlers">Special considerations for logging handlers</a>
+</menu>
+<li> <a href="#pools">Resource allocation and resource pools</a>
+<li> <a href="#config">Configuration, commands and the like</a>
+<menu>
+ <li> <a href="#per-dir">Per-directory configuration structures</a>
+ <li> <a href="#commands">Command handling</a>
+ <li> <a href="#servconf">Side notes --- per-server configuration, virtual servers, etc.</a>
+</menu>
+</ul>
+
+<h2><a name="basics">Basic concepts.</a></h2>
+
+We begin with an overview of the basic concepts behind the
+API, and how they are manifested in the code.
+
+<h3><a name="HMR">Handlers, Modules, and Requests</a></h3>
+
+Apache breaks down request handling into a series of steps, more or
+less the same way the Netscape server API does (although this API has
+a few more stages than NetSite does, as hooks for stuff I thought
+might be useful in the future). These are:
+
+<ul>
+ <li> URI -&gt; Filename translation
+ <li> Auth ID checking [is the user who they say they are?]
+ <li> Auth access checking [is the user authorized <em>here</em>?]
+ <li> Access checking other than auth
+ <li> Determining MIME type of the object requested
+ <li> `Fixups' --- there aren't any of these yet, but the phase is
+ intended as a hook for possible extensions like
+ <code>SetEnv</code>, which don't really fit well elsewhere.
+ <li> Actually sending a response back to the client.
+ <li> Logging the request
+</ul>
+
+These phases are handled by looking at each of a succession of
+<em>modules</em>, looking to see if each of them has a handler for the
+phase, and attempting to invoke it if so. The handler can typically do
+one of three things:
+
+<ul>
+ <li> <em>Handle</em> the request, and indicate that it has done so
+ by returning the magic constant <code>OK</code>.
+ <li> <em>Decline</em> to handle the request, by returning the magic
+ integer constant <code>DECLINED</code>. In this case, the
+ server behaves in all respects as if the handler simply hadn't
+ been there.
+ <li> Signal an error, by returning one of the HTTP error codes.
+ This terminates normal handling of the request, although an
+ ErrorDocument may be invoked to try to mop up, and it will be
+ logged in any case.
+</ul>
+
+Most phases are terminated by the first module that handles them;
+however, for logging, `fixups', and non-access authentication
+checking, all handlers always run (barring an error). Also, the
+response phase is unique in that modules may declare multiple handlers
+for it, via a dispatch table keyed on the MIME type of the requested
+object. Modules may declare a response-phase handler which can handle
+<em>any</em> request, by giving it the key <code>*/*</code> (i.e., a
+wildcard MIME type specification). However, wildcard handlers are
+only invoked if the server has already tried and failed to find a more
+specific response handler for the MIME type of the requested object
+(either none existed, or they all declined).<p>
+
+The handlers themselves are functions of one argument (a
+<code>request_rec</code> structure, vide infra), which return an
+integer, as above.<p>
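+
+As an illustration of the `first module that handles it wins' rule, here is
+a small sketch of a phase loop. It is not taken from the server core: the
+<code>TOY_OK</code> and <code>TOY_DECLINED</code> values, the
+<code>toy_phase_fn</code> type and the <code>toy_*</code> functions are
+assumptions made for the example, and the request is passed as a plain
+<code>void *</code> rather than a <code>request_rec</code>. <p>
+

```c
#include <stddef.h>

/* Hypothetical status values; the real constants come from the server
 * headers and are not reproduced here. */
#define TOY_OK        0
#define TOY_DECLINED (-1)

/* Hypothetical handler type: one argument, integer result. */
typedef int (*toy_phase_fn)(void *request);

/* Run one phase over a list of per-module handlers: skip modules that
 * have no handler for the phase, keep going while handlers decline,
 * and stop at the first result that is anything other than DECLINED
 * (either success or an error code). */
int toy_run_phase(toy_phase_fn *handlers, int n, void *request)
{
    int i;

    for (i = 0; i < n; i++) {
        int status;

        if (handlers[i] == NULL) continue;
        status = handlers[i](request);
        if (status != TOY_DECLINED) return status;
    }
    return TOY_DECLINED;            /* nobody wanted the request */
}

/* Three example handlers. The "request" is just an int counter here. */
int toy_decline(void *r) { (void)r; return TOY_DECLINED; }
int toy_accept(void *r)  { ++*(int *)r; return TOY_OK; }
int toy_fail(void *r)    { (void)r; return 500; }
```

+
+With the handler list <code>{ NULL, toy_decline, toy_accept, toy_fail }</code>
+the loop returns <code>TOY_OK</code> and never reaches
+<code>toy_fail</code>. The phases in which all handlers always run would need
+a different loop, one that continues after <code>TOY_OK</code> and stops only
+on an error. <p>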
+
+<h3><a name="moduletour">A brief tour of a module</a></h3>
+
+At this point, we need to explain the structure of a module. Our
+candidate will be one of the messier ones, the CGI module --- this
+handles both CGI scripts and the <code>ScriptAlias</code> config file
+command. It's actually a great deal more complicated than most
+modules, but if we're going to have only one example, it might as well
+be the one with its fingers in every place.<p>
+
+Let's begin with handlers. In order to handle the CGI scripts, the
+module declares a response handler for them. Because of
+<code>ScriptAlias</code>, it also has handlers for the name
+translation phase (to recognise <code>ScriptAlias</code>ed URIs) and the
+type-checking phase (any <code>ScriptAlias</code>ed request is typed
+as a CGI script).<p>
+
+The module needs to maintain some per (virtual)
+server information, namely, the <code>ScriptAlias</code>es in effect;
+the module structure therefore contains pointers to a function which
+builds these structures, and to another which combines two of them (in
+case the main server and a virtual server both have
+<code>ScriptAlias</code>es declared).<p>
+
+Finally, this module contains code to handle the
+<code>ScriptAlias</code> command itself. This particular module only
+declares one command, but there could be more, so modules have
+<em>command tables</em> which declare their commands, and describe
+where they are permitted, and how they are to be invoked. <p>
+
+A final note on the declared types of the arguments of some of these
+commands: a <code>pool</code> is a pointer to a <em>resource pool</em>
+structure; these are used by the server to keep track of the memory
+which has been allocated, files opened, etc., either to service a
+particular request, or to handle the process of configuring itself.
+That way, when the request is over (or, for the configuration pool,
+when the server is restarting), the memory can be freed, and the files
+closed, <i>en masse</i>, without anyone having to write explicit code to
+track them all down and dispose of them. Also, a
+<code>cmd_parms</code> structure contains various information about
+the config file being read, and other status information, which is
+sometimes of use to the function which processes a config-file command
+(such as <code>ScriptAlias</code>).
+
+With no further ado, the module itself:
+
+<pre>
+/* Declarations of handlers. */
+
+int translate_scriptalias (request_rec *);
+int type_scriptalias (request_rec *);
+int cgi_handler (request_rec *);
+
+/* Subsidiary dispatch table for response-phase handlers, by MIME type */
+
+handler_rec cgi_handlers[] = {
+{ "application/x-httpd-cgi", cgi_handler },
+{ NULL }
+};
+
+/* Declarations of routines to manipulate the module's configuration
+ * info. Note that these are returned, and passed in, as void *'s;
+ * the server core keeps track of them, but it doesn't, and can't,
+ * know their internal structure.
+ */
+
+void *make_cgi_server_config (pool *);
+void *merge_cgi_server_config (pool *, void *, void *);
+
+/* Declarations of routines to handle config-file commands */
+
+extern char *script_alias(cmd_parms *, void *per_dir_config, char *fake,
+ char *real);
+
+command_rec cgi_cmds[] = {
+{ "ScriptAlias", script_alias, NULL, RSRC_CONF, TAKE2,
+ "a fakename and a realname"},
+{ NULL }
+};
+
+module cgi_module = {
+ STANDARD_MODULE_STUFF,
+ NULL, /* initializer */
+ NULL, /* dir config creator */
+ NULL, /* dir merger --- default is to override */
+ make_cgi_server_config, /* server config */
+ merge_cgi_server_config, /* merge server config */
+ cgi_cmds, /* command table */
+ cgi_handlers, /* handlers */
+ translate_scriptalias, /* filename translation */
+ NULL, /* check_user_id */
+ NULL, /* check auth */
+ NULL, /* check access */
+ type_scriptalias, /* type_checker */
+ NULL, /* fixups */
+ NULL /* logger */
+};
+</pre>
+
+<h2><a name="handlers">How handlers work</a></h2>
+
+The sole argument to handlers is a <code>request_rec</code> structure.
+This structure describes a particular request which has been made to
+the server, on behalf of a client. In most cases, each connection to
+the client generates only one <code>request_rec</code> structure.<p>
+
+<h3><a name="req_tour">A brief tour of the <code>request_rec</code></a></h3>
+
+The <code>request_rec</code> contains pointers to a resource pool
+which will be cleared when the server is finished handling the
+request; to structures containing per-server and per-connection
+information, and most importantly, information on the request itself.<p>
+
+The most important such information is a small set of character
+strings describing attributes of the object being requested, including
+its URI, filename, content-type and content-encoding (these being filled
+in by the translation and type-check handlers which handle the
+request, respectively). <p>
+
+Other commonly used data items are tables giving the MIME headers on
+the client's original request, MIME headers to be sent back with the
+response (which modules can add to at will), and environment variables
+for any subprocesses which are spawned off in the course of servicing
+the request. These tables are manipulated using the
+<code>table_get</code> and <code>table_set</code> routines. <p>
+
+Finally, there are pointers to two data structures which, in turn,
+point to per-module configuration structures. Specifically, these
+hold pointers to the data structures which the module has built to
+describe the way it has been configured to operate in a given
+directory (via <code>.htaccess</code> files or
+<code>&lt;Directory&gt;</code> sections), and for private data it has
+built in the course of servicing the request (so modules' handlers for
+one phase can pass `notes' to their handlers for other phases). There
+is another such configuration vector in the <code>server_rec</code>
+data structure pointed to by the <code>request_rec</code>, which
+contains per (virtual) server configuration data.<p>
+
+Here is an abridged declaration, giving the fields most commonly used:<p>
+
+<pre>
+struct request_rec {
+
+ pool *pool;
+ conn_rec *connection;
+ server_rec *server;
+
+ /* What object is being requested */
+
+ char *uri;
+ char *filename;
+ char *path_info;
+ char *args; /* QUERY_ARGS, if any */
+ struct stat finfo; /* Set by server core;
+ * st_mode set to zero if no such file */
+
+ char *content_type;
+ char *content_encoding;
+
+ /* MIME header environments, in and out. Also, an array containing
+ * environment variables to be passed to subprocesses, so people can
+ * write modules to add to that environment.
+ *
+ * The difference between headers_out and err_headers_out is that
+ * the latter are printed even on error, and persist across internal
+ * redirects (so the headers printed for ErrorDocument handlers will
+ * have them).
+ */
+
+ table *headers_in;
+ table *headers_out;
+ table *err_headers_out;
+ table *subprocess_env;
+
+ /* Info about the request itself... */
+
+ int header_only; /* HEAD request, as opposed to GET */
+ char *protocol; /* Protocol, as given to us, or HTTP/0.9 */
+ char *method; /* GET, HEAD, POST, etc. */
+ int method_number; /* M_GET, M_POST, etc. */
+
+ /* Info for logging */
+
+ char *the_request;
+ int bytes_sent;
+
+ /* A flag which modules can set, to indicate that the data being
+ * returned is volatile, and clients should be told not to cache it.
+ */
+
+ int no_cache;
+
+ /* Various other config info which may change with .htaccess files
+ * These are config vectors, with one void* pointer for each module
+ * (the thing pointed to being the module's business).
+ */
+
+ void *per_dir_config; /* Options set in config files, etc. */
+ void *request_config; /* Notes on *this* request */
+
+};
+
+</pre>
+
+<h3><a name="req_orig">Where request_rec structures come from</a></h3>
+
+Most <code>request_rec</code> structures are built by reading an HTTP
+request from a client, and filling in the fields. However, there are
+a few exceptions:
+
+<ul>
+ <li> If the request is to an imagemap, a type map (i.e., a
+ <code>*.var</code> file), or a CGI script which returned a
+ local `Location:', then the resource which the user requested
+ is going to be ultimately located by some URI other than what
+ the client originally supplied. In this case, the server does
+ an <em>internal redirect</em>, constructing a new
+ <code>request_rec</code> for the new URI, and processing it
+ almost exactly as if the client had requested the new URI
+ directly. <p>
+
+ <li> If some handler signaled an error, and an
+ <code>ErrorDocument</code> is in scope, the same internal
+ redirect machinery comes into play.<p>
+
+ <li> Finally, a handler occasionally needs to investigate `what
+ would happen if' some other request were run. For instance,
+ the directory indexing module needs to know what MIME type
+ would be assigned to a request for each directory entry, in
+ order to figure out what icon to use.<p>
+
+ Such handlers can construct a <em>sub-request</em>, using the
+ functions <code>sub_req_lookup_file</code> and
+ <code>sub_req_lookup_uri</code>; this constructs a new
+ <code>request_rec</code> structure and processes it as you
+ would expect, up to but not including the point of actually
+ sending a response. (These functions skip over the access
+ checks if the sub-request is for a file in the same directory
+ as the original request).<p>
+
+ (Server-side includes work by building sub-requests and then
+ actually invoking the response handler for them, via the
+ function <code>run_sub_request</code>).
+</ul>
+
+<h3><a name="req_return">Handling requests, declining, and returning error codes</a></h3>
+
+As discussed above, each handler, when invoked to handle a particular
+<code>request_rec</code>, has to return an <code>int</code> to
+indicate what happened. That can either be
+
+<ul>
+ <li> OK --- the request was handled successfully. This may or may
+ not terminate the phase.
+ <li> DECLINED --- no erroneous condition exists, but the module
+ declines to handle the phase; the server tries to find another.
+ <li> an HTTP error code, which aborts handling of the request.
+</ul>
+
+Note that if the error code returned is <code>REDIRECT</code>, then
+the module should put a <code>Location</code> in the request's
+<code>headers_out</code>, to indicate where the client should be
+redirected <em>to</em>. <p>
+
+<h3><a name="resp_handlers">Special considerations for response handlers</a></h3>
+
+Handlers for most phases do their work by simply setting a few fields
+in the <code>request_rec</code> structure (or, in the case of access
+checkers, simply by returning the correct error code). However,
+response handlers have to actually send a response back to the client. <p>
+
+They should begin by sending an HTTP response header, using the
+function <code>send_http_header</code>. (You don't have to do
+anything special to skip sending the header for HTTP/0.9 requests; the
+function figures out on its own that it shouldn't do anything). If
+the request is marked <code>header_only</code>, that's all they should
+do; they should return after that, without attempting any further
+output. <p>
+
+Otherwise, they should produce a response body which responds to the
+client as appropriate. The primitives for this are <code>rputc</code>
+and <code>rprintf</code>, for internally generated output, and
+<code>send_fd</code>, to copy the contents of some <code>FILE *</code>
+straight to the client. <p>
+
+At this point, you should more or less understand the following piece
+of code, which is the handler which handles <code>GET</code> requests
+which have no more specific handler; it also shows how conditional
+<code>GET</code>s can be handled, if it's desirable to do so in a
+particular response handler --- <code>set_last_modified</code> checks
+against the <code>If-Modified-Since</code> value supplied by the
+client, if any, and returns an appropriate code (which will, if
+nonzero, be <code>USE_LOCAL_COPY</code>). No similar considerations apply for
+<code>set_content_length</code>, but it returns an error code for
+symmetry.<p>
+
+<pre>
+int default_handler (request_rec *r)
+{
+ int errstatus;
+ FILE *f;
+
+ if (r-&gt;method_number != M_GET) return DECLINED;
+ if (r-&gt;finfo.st_mode == 0) return NOT_FOUND;
+
+ if ((errstatus = set_content_length (r, r-&gt;finfo.st_size))
+ || (errstatus = set_last_modified (r, r-&gt;finfo.st_mtime)))
+ return errstatus;
+
+ f = pfopen (r-&gt;pool, r-&gt;filename, "r");
+
+ if (f == NULL) {
+ log_reason("file permissions deny server access",
+ r-&gt;filename, r);
+ return FORBIDDEN;
+ }
+
+ register_timeout ("send", r);
+ send_http_header (r);
+
+ if (!r-&gt;header_only) send_fd (f, r);
+ pfclose (r-&gt;pool, f);
+ return OK;
+}
+</pre>
+
+Finally, if all of this is too much of a challenge, there are a few
+ways out of it. First off, as shown above, a response handler which
+has not yet produced any output can simply return an error code, in
+which case the server will automatically produce an error response.
+Secondly, it can punt to some other handler by invoking
+<code>internal_redirect</code>, which is how the internal redirection
+machinery discussed above is invoked. A response handler which has
+internally redirected should always return <code>OK</code>. <p>
+
+(Invoking <code>internal_redirect</code> from handlers which are
+<em>not</em> response handlers will lead to serious confusion).
+
+<h3><a name="auth_handlers">Special considerations for authentication handlers</a></h3>
+
+Stuff that should be discussed here in detail:
+
+<ul>
+ <li> Authentication-phase handlers not invoked unless auth is
+ configured for the directory.
+ <li> Common auth configuration stored in the core per-dir
+ configuration; it has accessors <code>auth_type</code>,
+ <code>auth_name</code>, and <code>requires</code>.
+ <li> Common routines, to handle the protocol end of things, at least
+ for HTTP basic authentication (<code>get_basic_auth_pw</code>,
+ which sets the <code>connection-&gt;user</code> structure field
+ automatically, and <code>note_basic_auth_failure</code>, which
+ arranges for the proper <code>WWW-Authenticate:</code> header
+ to be sent back).
+</ul>
+
+<h3><a name="log_handlers">Special considerations for logging handlers</a></h3>
+
+When a request has internally redirected, there is the question of
+what to log. Apache handles this by bundling the entire chain of
+redirects into a list of <code>request_rec</code> structures which are
+threaded through the <code>r-&gt;prev</code> and <code>r-&gt;next</code>
+pointers. The <code>request_rec</code> which is passed to the logging
+handlers in such cases is the one which was originally built for the
+initial request from the client; note that the bytes_sent field will
+only be correct in the last request in the chain (the one for which a
+response was actually sent).
+
+<h2><a name="pools">Resource allocation and resource pools</a></h2>
+
+One of the problems of writing and designing a server-pool server is
+that of preventing leakage, that is, allocating resources (memory,
+open files, etc.), without subsequently releasing them. The resource
+pool machinery is designed to make it easy to prevent this from
+happening, by allowing resources to be allocated in such a way that
+they are <em>automatically</em> released when the server is done with
+them. <p>
+
+The way this works is as follows: the memory which is allocated, files
+opened, etc., to deal with a particular request are tied to a
+<em>resource pool</em> which is allocated for the request. The pool
+is a data structure which itself tracks the resources in question. <p>
+
+When the request has been processed, the pool is <em>cleared</em>. At
+that point, all the memory associated with it is released for reuse,
+all files associated with it are closed, and any other clean-up
+functions which are associated with the pool are run. When this is
+over, we can be confident that all the resources tied to the pool have
+been released, and that none of them have leaked. <p>
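+
+A minimal sketch of how clean-up functions can be tied to a pool and
+run when it is cleared. This is not Apache's implementation: every
+<code>toy_</code> name is hypothetical, the list has a fixed capacity,
+and running the most recently registered cleanup first is a choice made
+for this sketch.
+
```c
#include <stddef.h>

#define TOY_MAX_CLEANUPS 16   /* hypothetical fixed capacity */

typedef void (*toy_cleanup_fn)(void *data);

/* A list of (function, argument) pairs to run when the pool is cleared. */
typedef struct {
    toy_cleanup_fn fn[TOY_MAX_CLEANUPS];
    void *data[TOY_MAX_CLEANUPS];
    size_t count;
} toy_cleanup_list;

void toy_cleanup_init(toy_cleanup_list *l)
{
    l->count = 0;
}

/* Tie a cleanup to the list; returns 0 on success, -1 if the list is full. */
int toy_register_cleanup(toy_cleanup_list *l, toy_cleanup_fn fn, void *data)
{
    if (l->count >= TOY_MAX_CLEANUPS) return -1;
    l->fn[l->count] = fn;
    l->data[l->count] = data;
    l->count++;
    return 0;
}

/* "Clearing": run every cleanup, newest first, leaving the list empty. */
void toy_run_cleanups(toy_cleanup_list *l)
{
    while (l->count > 0) {
        l->count--;
        l->fn[l->count](l->data[l->count]);
    }
}

/* Example cleanup: increments the int that data points to. */
void toy_count_cleanup(void *data)
{
    ++*(int *)data;
}
```
+
+A real cleanup would close a file or release some other resource; the
+counting cleanup is only there so the effect is easy to observe.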
+
+Server restarts, and allocation of memory and resources for per-server
+configuration, are handled in a similar way. There is a
+<em>configuration pool</em>, which keeps track of resources which were
+allocated while reading the server configuration files, and handling
+the commands therein (for instance, the memory that was allocated for
+per-server module configuration, log files and other files that were
+opened, and so forth). When the server restarts, and has to reread
+the configuration files, the configuration pool is cleared, and so the
+memory and file descriptors which were taken up by reading them the
+last time are made available for reuse. <p>
+
+It should be noted that use of the pool machinery isn't generally
+obligatory, except for situations like logging handlers, where you
+really need to register cleanups to make sure that the log file gets
+closed when the server restarts (this is most easily done by using the
+function <code><a href="#pool-files">pfopen</a></code>, which also
+arranges for the underlying file descriptor to be closed before any
+child processes, such as for CGI scripts, are <code>exec</code>ed), or
+in case you are using the timeout machinery (which isn't yet even
+documented here). However, there are two benefits to using it:
+resources allocated to a pool never leak (even if you allocate a
+scratch string, and just forget about it); also, for memory
+allocation, <code>palloc</code> is generally faster than
+<code>malloc</code>.<p>
+
+We begin here by describing how memory is allocated to pools, and then
+discuss how other resources are tracked by the resource pool
+machinery.
+
+<h3>Allocation of memory in pools</h3>
+
+Memory is allocated to pools by calling the function
+<code>palloc</code>, which takes two arguments, one being a pointer to
+a resource pool structure, and the other being the amount of memory to
+allocate (in <code>char</code>s). Within handlers for handling
+requests, the most common way of getting a resource pool structure is
+by looking at the <code>pool</code> slot of the relevant
+<code>request_rec</code>; hence the repeated appearance of the
+following idiom in module code:
+
+<pre>
+int my_handler(request_rec *r)
+{
+ struct my_structure *foo;
+ ...
+
+ foo = (struct my_structure *)palloc (r->pool, sizeof(struct my_structure));
+}
+</pre>
+
+Note that <em>there is no <code>pfree</code></em> ---
+<code>palloc</code>ed memory is freed only when the associated
+resource pool is cleared. This means that <code>palloc</code> does not
+have to do as much accounting as <code>malloc()</code>; all it does in
+the typical case is to round up the size, bump a pointer, and do a
+range check.<p>
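+
+A minimal sketch of the round-up, bump, and range-check steps just
+described. It is not Apache's allocator: <code>toy_pool</code>,
+<code>toy_palloc</code> and <code>toy_clear_pool</code> are hypothetical
+names, and the pool here is assumed to be a single fixed-size buffer
+rather than something that can grow.
+
```c
#include <stddef.h>

#define TOY_POOL_SIZE 1024   /* hypothetical capacity, in bytes */
#define TOY_ALIGN     8      /* every allocation is rounded up to this */

typedef struct {
    unsigned char buf[TOY_POOL_SIZE];
    size_t used;             /* offset of the first free byte */
} toy_pool;

/* Clearing the pool releases everything at once; there is no per-block free. */
void toy_clear_pool(toy_pool *p)
{
    p->used = 0;
}

void *toy_palloc(toy_pool *p, size_t nbytes)
{
    size_t rounded;
    void *result;

    /* Reject oversized requests before rounding, so the sum cannot overflow. */
    if (nbytes > TOY_POOL_SIZE) return NULL;

    /* Round the size up to a multiple of TOY_ALIGN. */
    rounded = (nbytes + TOY_ALIGN - 1) & ~(size_t)(TOY_ALIGN - 1);

    /* Range check: is there room left in the buffer? */
    if (rounded > TOY_POOL_SIZE - p->used) return NULL;

    /* Bump the pointer. */
    result = p->buf + p->used;
    p->used += rounded;
    return result;
}
```
+
+Because nothing is recorded per allocation, the only way to get memory
+back in this sketch is <code>toy_clear_pool</code>, which mirrors the
+absence of a <code>pfree</code>.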
+
+(It also raises the possibility that heavy use of <code>palloc</code>
+could cause a server process to grow excessively large. There are
+two ways to deal with this, which are covered below; briefly, you
+can use <code>malloc</code>, and try to be sure that all of the memory
+gets explicitly <code>free</code>d, or you can allocate a sub-pool of
+the main pool, allocate your memory in the sub-pool, and clear it out
+periodically. The latter technique is discussed in the section on
+sub-pools below, and is used in the directory-indexing code, in order
+to avoid excessive storage allocation when listing directories with
+thousands of files).
+
+<h3>Allocating initialized memory</h3>
+
+There are functions which allocate initialized memory, and are
+frequently useful. The function <code>pcalloc</code> has the same
+interface as <code>palloc</code>, but clears out the memory it
+allocates before it returns it. The function <code>pstrdup</code>
+takes a resource pool and a <code>char *</code> as arguments, and
+allocates memory for a copy of the string the pointer points to,
+returning a pointer to the copy. Finally <code>pstrcat</code> is a
+varargs-style function, which takes a pointer to a resource pool, and
+at least two <code>char *</code> arguments, the last of which must be
+<code>NULL</code>. It allocates enough memory to fit copies of each
+of the strings, as a unit; for instance:
+
+<pre>
+ pstrcat (r->pool, "foo", "/", "bar", NULL);
+</pre>
+
+returns a pointer to 8 bytes worth of memory, initialized to
+<code>"foo/bar"</code>.
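+
+To make the <code>NULL</code>-terminated calling convention concrete,
+here is a rough stand-in written against plain <code>malloc</code>
+instead of a resource pool. The name <code>toy_strcat_all</code> is
+hypothetical, and unlike <code>pstrcat</code> the caller has to
+<code>free</code> the result.
+
```c
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>

/* Concatenate all arguments up to the terminating NULL into one new string. */
char *toy_strcat_all(const char *first, ...)
{
    va_list ap;
    size_t total = 1;          /* room for the trailing '\0' */
    const char *s;
    char *result, *out;

    /* First pass: add up the lengths. */
    va_start(ap, first);
    for (s = first; s != NULL; s = va_arg(ap, char *))
        total += strlen(s);
    va_end(ap);

    result = malloc(total);
    if (result == NULL) return NULL;

    /* Second pass: copy each piece into place. */
    out = result;
    va_start(ap, first);
    for (s = first; s != NULL; s = va_arg(ap, char *)) {
        size_t n = strlen(s);
        memcpy(out, s, n);
        out += n;
    }
    va_end(ap);

    *out = '\0';
    return result;
}
```
+
+Calling <code>toy_strcat_all("foo", "/", "bar", (char *)NULL)</code>
+allocates 8 bytes: seven characters plus the terminator.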
+
+<h3>Tracking open files, etc.</h3>
+
+As indicated above, resource pools are also used to track other sorts
+of resources besides memory. The most common are open files. The
+routine which is typically used for this is <code>pfopen</code>, which
+takes a resource pool and two strings as arguments; the strings are
+the same as the typical arguments to <code>fopen</code>, e.g.,
+
+<pre>
+ ...
+ FILE *f = pfopen (r->pool, r->filename, "r");
+
+ if (f == NULL) { ... } else { ... }
+</pre>
+
+There is also a <code>popenf</code> routine, which parallels the
+lower-level <code>open</code> system call. Both of these routines
+arrange for the file to be closed when the resource pool in question
+is cleared. <p>
+
+Unlike the case for memory, there <em>are</em> functions to close
+files allocated with <code>pfopen</code>, and <code>popenf</code>,
+namely <code>pfclose</code> and <code>pclosef</code>. (This is
+because, on many systems, the number of files which a single process
+can have open is quite limited). It is important to use these
+functions to close files allocated with <code>pfopen</code> and
+<code>popenf</code>, since to do otherwise could cause fatal errors on
+systems such as Linux, which react badly if the same
+<code>FILE*</code> is closed more than once. <p>
+
+(Using the <code>close</code> functions is not mandatory, since the
+file will eventually be closed regardless, but you should consider it
+in cases where your module is opening, or could open, a lot of files).
+
+<h3>Other sorts of resources --- cleanup functions</h3>
+
+More text goes here. Describe the cleanup primitives in terms of
+which the file stuff is implemented; also, <code>spawn_process</code>.
+
+<h3>Fine control --- creating and dealing with sub-pools, with a note
+on sub-requests</h3>
+
+On rare occasions, too-free use of <code>palloc()</code> and the
+associated primitives may result in undesirably profligate resource
+allocation. You can deal with such a case by creating a
+<em>sub-pool</em>, allocating within the sub-pool rather than the main
+pool, and clearing or destroying the sub-pool, which releases the
+resources which were associated with it. (This really <em>is</em> a
+rare situation; the only case in which it comes up in the standard
+module set is in case of listing directories, and then only with
+<em>very</em> large directories. Unnecessary use of the primitives
+discussed here can hair up your code quite a bit, with very little
+gain). <p>
+
+The primitive for creating a sub-pool is <code>make_sub_pool</code>,
+which takes another pool (the parent pool) as an argument. When the
+main pool is cleared, the sub-pool will be destroyed. The sub-pool
+may also be cleared or destroyed at any time, by calling the functions
+<code>clear_pool</code> and <code>destroy_pool</code>, respectively.
+(The difference is that <code>clear_pool</code> frees resources
+associated with the pool, while <code>destroy_pool</code> also
+deallocates the pool itself. In the former case, you can allocate new
+resources within the pool, and clear it again, and so forth; in the
+latter case, it is simply gone). <p>
+
+One final note --- sub-requests have their own resource pools, which
+are sub-pools of the resource pool for the main request. The polite
+way to reclaim the resources associated with a sub request which you
+have allocated (using the <code>sub_req_lookup_...</code> functions)
+is <code>destroy_sub_request</code>, which frees the resource pool.
+Before calling this function, be sure to copy anything that you care
+about which might be allocated in the sub-request's resource pool into
+someplace a little less volatile (for instance, the filename in its
+<code>request_rec</code> structure). <p>
+
+(Again, under most circumstances, you shouldn't feel obliged to call
+this function; only 2K of memory or so are allocated for a typical sub
+request, and it will be freed anyway when the main request pool is
+cleared. It is only when you are allocating many, many sub-requests
+for a single main request that you should seriously consider the
+<code>destroy...</code> functions).
+
+<h2><a name="config">Configuration, commands and the like</a></h2>
+
+One of the design goals for this server was to maintain external
+compatibility with the NCSA 1.3 server --- that is, to read the same
+configuration files, to process all the directives therein correctly,
+and in general to be a drop-in replacement for NCSA. On the other
+hand, another design goal was to move as much of the server's
+functionality into modules which have as little as possible to do with
+the monolithic server core. The only way to reconcile these goals is
+to move the handling of most commands from the central server into the
+modules. <p>
+
+However, just giving the modules command tables is not enough to
+divorce them completely from the server core. The server has to
+remember the commands in order to act on them later. That involves
+maintaining data which is private to the modules, and which can be
+either per-server, or per-directory. Most things are per-directory,
+including in particular access control and authorization information,
+but also information on how to determine file types from suffixes,
+which can be modified by <code>AddType</code> and
+<code>DefaultType</code> directives, and so forth. In general, the
+governing philosophy is that anything which <em>can</em> be made
+configurable by directory should be; per-server information is
+generally used in the standard set of modules for information like
+<code>Alias</code>es and <code>Redirect</code>s which come into play
+before the request is tied to a particular place in the underlying
+file system. <p>
+
+Another requirement for emulating the NCSA server is being able to
+handle the per-directory configuration files, generally called
+<code>.htaccess</code> files, though even in the NCSA server they can
+contain directives which have nothing at all to do with access
+control. Accordingly, after URI -&gt; filename translation, but before
+performing any other phase, the server walks down the directory
+hierarchy of the underlying filesystem, following the translated
+pathname, to read any <code>.htaccess</code> files which might be
+present. The information which is read in then has to be
+<em>merged</em> with the applicable information from the server's own
+config files (either from the <code>&lt;Directory&gt;</code> sections
+in <code>access.conf</code>, or from defaults in
+<code>srm.conf</code>, which actually behaves for most purposes almost
+exactly like <code>&lt;Directory /&gt;</code>).<p>
+
+Finally, after having served a request which involved reading
+<code>.htaccess</code> files, we need to discard the storage allocated
+for handling them. That is solved the same way it is solved wherever
+else similar problems come up, by tying those structures to the
+per-transaction resource pool. <p>
+
+<h3><a name="per-dir">Per-directory configuration structures</a></h3>
+
+Let's look at how all of this plays out in <code>mod_mime.c</code>,
+which defines the file typing handler which emulates the NCSA server's
+behavior of determining file types from suffixes. What we'll be
+looking at, here, is the code which implements the
+<code>AddType</code> and <code>AddEncoding</code> commands. These
+commands can appear in <code>.htaccess</code> files, so they must be
+handled in the module's private per-directory data, which in fact,
+consists of two separate <code>table</code>s for MIME types and
+encoding information, and is declared as follows:
+
+<pre>
+typedef struct {
+ table *forced_types; /* Additional AddTyped stuff */
+ table *encoding_types; /* Added with AddEncoding... */
+} mime_dir_config;
+</pre>
+
+When the server is reading a configuration file, or
+<code>&lt;Directory&gt;</code> section, which includes one of the MIME
+module's commands, it needs to create a <code>mime_dir_config</code>
+structure, so those commands have something to act on. It does this
+by invoking the function it finds in the module's `create per-dir
+config slot', with two arguments: the name of the directory to which
+this configuration information applies (or <code>NULL</code> for
+<code>srm.conf</code>), and a pointer to a resource pool in which the
+allocation should happen. <p>
+
+(If we are reading a <code>.htaccess</code> file, that resource pool
+is the per-request resource pool for the request; otherwise it is a
+resource pool which is used for configuration data, and cleared on
+restarts. Either way, it is important for the structure being created
+to vanish when the pool is cleared, by registering a cleanup on the
+pool if necessary). <p>
+
+For the MIME module, the per-dir config creation function just
+<code>palloc</code>s the structure above, and creates a couple of
+<code>table</code>s to fill it. That looks like this:
+
+<pre>
+void *create_mime_dir_config (pool *p, char *dummy)
+{
+ mime_dir_config *new =
+ (mime_dir_config *) palloc (p, sizeof(mime_dir_config));
+
+ new-&gt;forced_types = make_table (p, 4);
+ new-&gt;encoding_types = make_table (p, 4);
+
+ return new;
+}
+</pre>
+
+Now, suppose we've just read in a <code>.htaccess</code> file. We
+already have the per-directory configuration structure for the next
+directory up in the hierarchy. If the <code>.htaccess</code> file we
+just read in didn't have any <code>AddType</code> or
+<code>AddEncoding</code> commands, its per-directory config structure
+for the MIME module is still valid, and we can just use it.
+Otherwise, we need to merge the two structures somehow. <p>
+
+To do that, the server invokes the module's per-directory config merge
+function, if one is present. That function takes three arguments:
+the two structures being merged, and a resource pool in which to
+allocate the result. For the MIME module, all that needs to be done
+is overlay the tables from the new per-directory config structure with
+those from the parent:
+
+<pre>
+void *merge_mime_dir_configs (pool *p, void *parent_dirv, void *subdirv)
+{
+ mime_dir_config *parent_dir = (mime_dir_config *)parent_dirv;
+ mime_dir_config *subdir = (mime_dir_config *)subdirv;
+ mime_dir_config *new =
+ (mime_dir_config *)palloc (p, sizeof(mime_dir_config));
+
+ new-&gt;forced_types = overlay_tables (p, subdir-&gt;forced_types,
+ parent_dir-&gt;forced_types);
+ new-&gt;encoding_types = overlay_tables (p, subdir-&gt;encoding_types,
+ parent_dir-&gt;encoding_types);
+
+ return new;
+}
+</pre>
+
+As a note --- if there is no per-directory merge function present, the
+server will just use the subdirectory's configuration info, and ignore
+the parent's. For some modules, that works just fine (e.g., for the
+includes module, whose per-directory configuration information
+consists solely of the state of the <code>XBITHACK</code>), and for
+those modules, you can just not declare one, and leave the
+corresponding structure slot in the module itself <code>NULL</code>.<p>
+
+<h3><a name="commands">Command handling</a></h3>
+
+Now that we have these structures, we need to be able to figure out
+how to fill them. That involves processing the actual
+<code>AddType</code> and <code>AddEncoding</code> commands. To find
+commands, the server looks in the module's <code>command table</code>.
+That table contains information on how many arguments the commands
+take, and in what formats, where it is permitted, and so forth. That
+information is sufficient to allow the server to invoke most
+command-handling functions with pre-parsed arguments. Without further
+ado, let's look at the <code>AddType</code> command handler, which
+looks like this (the <code>AddEncoding</code> command looks basically
+the same, and won't be shown here):
+
+<pre>
+char *add_type(cmd_parms *cmd, mime_dir_config *m, char *ct, char *ext)
+{
+ if (*ext == '.') ++ext;
+ table_set (m-&gt;forced_types, ext, ct);
+ return NULL;
+}
+</pre>
+
+This command handler is unusually simple. As you can see, it takes
+four arguments: first, a pointer to a <code>cmd_parms</code> structure;
+second, the per-directory configuration structure for the module in
+question; and last, the two pre-parsed arguments.
+The <code>cmd_parms</code> structure contains a bunch of arguments which are frequently of
+use to some, but not all, commands, including a resource pool (from
+which memory can be allocated, and to which cleanups should be tied),
+and the (virtual) server being configured, from which the module's
+per-server configuration data can be obtained if required.<p>
+
+Another way in which this particular command handler is unusually
+simple is that there are no error conditions which it can encounter.
+If there were, it could return an error message instead of
+<code>NULL</code>; this causes an error to be printed out on the
+server's <code>stderr</code>, followed by a quick exit, if it is in
+the main config files; for a <code>.htaccess</code> file, the syntax
+error is logged in the server error log (along with an indication of
+where it came from), and the request is bounced with a server error
+response (HTTP error status, code 500). <p>
+
+The MIME module's command table has entries for these commands, which
+look like this:
+
+<pre>
+command_rec mime_cmds[] = {
+{ "AddType", add_type, NULL, OR_FILEINFO, TAKE2,
+ "a mime type followed by a file extension" },
+{ "AddEncoding", add_encoding, NULL, OR_FILEINFO, TAKE2,
+ "an encoding (e.g., gzip), followed by a file extension" },
+{ NULL }
+};
+</pre>
+
+The entries in these tables are:
+
+<ul>
+ <li> The name of the command
+ <li> The function which handles it
+ <li> a <code>(void *)</code> pointer, which is passed in the
+ <code>cmd_parms</code> structure to the command handler ---
+ this is useful in case many similar commands are handled by the
+ same function.
+ <li> A bit mask indicating where the command may appear. There are
+ mask bits corresponding to each <code>AllowOverride</code>
+ option, and an additional mask bit, <code>RSRC_CONF</code>,
+ indicating that the command may appear in the server's own
+ config files, but <em>not</em> in any <code>.htaccess</code>
+ file.
+ <li> A flag indicating how many arguments the command handler wants
+ pre-parsed, and how they should be passed in.
+ <code>TAKE2</code> indicates two pre-parsed arguments. Other
+ options are <code>TAKE1</code>, which indicates one pre-parsed
+ argument, <code>FLAG</code>, which indicates that the argument
+ should be <code>On</code> or <code>Off</code>, and is passed in
+ as a boolean flag, and <code>RAW_ARGS</code>, which causes the
+ server to give the command the raw, unparsed arguments
+ (everything but the command name itself). There is also
+ <code>ITERATE</code>, which means that the handler looks the
+ same as <code>TAKE1</code>, but that if multiple arguments are
+ present, it should be called multiple times, and finally
+ <code>ITERATE2</code>, which indicates that the command handler
+ looks like a <code>TAKE2</code>, but if more arguments are
+ present, then it should be called multiple times, holding the
+ first argument constant.
+ <li> Finally, we have a string which describes the arguments that
+ should be present. If the arguments in the actual config file
+ are not as required, this string will be used to help give a
+ more specific error message. (You can safely leave this
+ <code>NULL</code>).
+</ul>
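+
+The <code>ITERATE2</code> convention in the list above can be sketched
+as a small dispatcher. This is an illustration only, not the server's
+code: <code>toy_iterate2</code>, <code>toy_take2_fn</code> and the
+logging handler are hypothetical names, and the arguments are assumed
+to have been split into words already.
+
```c
#include <stddef.h>

/* Shape of a TAKE2-style handler in this sketch. */
typedef void (*toy_take2_fn)(void *ctx, const char *first, const char *second);

/*
 * Call handler once per extra argument, holding words[0] constant.
 * Returns the number of calls made, or -1 if fewer than two words are given.
 */
int toy_iterate2(toy_take2_fn handler, void *ctx, int nwords, const char **words)
{
    int i;

    if (nwords < 2) return -1;
    for (i = 1; i < nwords; i++)
        handler(ctx, words[0], words[i]);
    return nwords - 1;
}

/* Example handler: remembers how often it ran and what it saw last. */
typedef struct {
    int calls;
    const char *last_first;
    const char *last_second;
} toy_log;

void toy_log_handler(void *ctx, const char *first, const char *second)
{
    toy_log *log = (toy_log *)ctx;

    log->calls++;
    log->last_first = first;
    log->last_second = second;
}
```
+
+With the words <code>text/html html htm shtml</code>, the handler runs
+three times, each time with <code>text/html</code> as its first
+argument.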
+
+Finally, having set this all up, we have to use it. This is
+ultimately done in the module's handlers, specifically for its
+file-typing handler, which looks more or less like this; note that the
+per-directory configuration structure is extracted from the
+<code>request_rec</code>'s per-directory configuration vector by using
+the <code>get_module_config</code> function.
+
+<pre>
+int find_ct(request_rec *r)
+{
+ int i;
+ char *fn = pstrdup (r->pool, r->filename);
+ mime_dir_config *conf = (mime_dir_config *)
+ get_module_config(r->per_dir_config, &amp;mime_module);
+ char *type;
+
+ if (S_ISDIR(r->finfo.st_mode)) {
+ r->content_type = DIR_MAGIC_TYPE;
+ return OK;
+ }
+
+ if((i=rind(fn,'.')) &lt; 0) return DECLINED;
+ ++i;
+
+ if ((type = table_get (conf->encoding_types, &amp;fn[i])))
+ {
+ r->content_encoding = type;
+
+ /* go back to previous extension to try to use it as a type */
+
+ fn[i-1] = '\0';
+ if((i=rind(fn,'.')) &lt; 0) return OK;
+ ++i;
+ }
+
+ if ((type = table_get (conf->forced_types, &amp;fn[i])))
+ {
+ r->content_type = type;
+ }
+
+ return OK;
+}
+
+</pre>
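+
+The extension handling in <code>find_ct</code> can be tried outside the
+server with the following sketch. It assumes that <code>rind</code>
+returns the index of the last occurrence of a character, or a negative
+value if there is none, as the code above implies. All
+<code>toy_</code> names are hypothetical, and a hard-coded test stands
+in for the <code>encoding_types</code> table.
+
```c
#include <string.h>

/* Index of the last occurrence of c in s, or -1 if it does not occur. */
int toy_rind(const char *s, char c)
{
    int i;

    for (i = (int)strlen(s) - 1; i >= 0; i--)
        if (s[i] == c) return i;
    return -1;
}

/* Stand-in for the encoding table lookup: only "gz" and "Z" are known. */
int toy_is_encoding(const char *ext)
{
    return strcmp(ext, "gz") == 0 || strcmp(ext, "Z") == 0;
}

/*
 * Returns the extension to use for the content type, or NULL if there is
 * none. If the last extension names an encoding, *encoding points at it
 * and the previous extension is used for the type. fn is modified in place.
 */
const char *toy_type_extension(char *fn, const char **encoding)
{
    int i = toy_rind(fn, '.');

    *encoding = NULL;
    if (i < 0) return NULL;
    ++i;

    if (toy_is_encoding(&fn[i])) {
        *encoding = &fn[i];

        /* Chop off the encoding suffix so the previous extension is last. */
        fn[i - 1] = '\0';
        i = toy_rind(fn, '.');
        if (i < 0) return NULL;
        ++i;
    }
    return &fn[i];
}
```
+
+Given a writable copy of <code>index.html.gz</code>, the sketch reports
+<code>gz</code> as the encoding and <code>html</code> as the extension
+to look up as a type.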
+
+<h3><a name="servconf">Side notes --- per-server configuration, virtual servers, etc.</a></h3>
+
+The ideas behind per-server module configuration are basically
+the same as those for per-directory configuration; there is a creation
+function and a merge function, the latter being invoked where a
+virtual server has partially overridden the base server configuration,
+and a combined structure must be computed. (As with per-directory
+configuration, the default if no merge function is specified, and a
+module is configured in some virtual server, is that the base
+configuration is simply ignored). <p>
+
+The only substantial difference is that when a command needs to
+configure the per-server private module data, it needs to go to the
+<code>cmd_parms</code> data to get at it. Here's an example, from the
+alias module, which also indicates how a syntax error can be returned
+(note that the per-directory configuration argument to the command
+handler is declared as a dummy, since the module doesn't actually have
+per-directory config data):
+
+<pre>
+char *add_redirect(cmd_parms *cmd, void *dummy, char *f, char *url)
+{
+ server_rec *s = cmd->server;
+ alias_server_conf *conf = (alias_server_conf *)
+ get_module_config(s-&gt;module_config,&amp;alias_module);
+ alias_entry *new = push_array (conf-&gt;redirects);
+
+ if (!is_url (url)) return "Redirect to non-URL";
+
+ new-&gt;fake = f; new-&gt;real = url;
+ return NULL;
+}
+</pre>
+<!--%hypertext -->
+</body></html>
+<!--/%hypertext -->
diff --git a/docs/manual/misc/FAQ.html b/docs/manual/misc/FAQ.html
new file mode 100644
index 0000000000..b630a283f0
--- /dev/null
+++ b/docs/manual/misc/FAQ.html
@@ -0,0 +1,162 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
+<HTML>
+<HEAD>
+<TITLE>Apache server Frequently Asked Questions</TITLE>
+</HEAD>
+
+<BODY>
+<IMG SRC="../images/apache_sub.gif" ALT="">
+<H1>Apache server Frequently Asked Questions</H1>
+
+<H2>The Questions</H2>
+<OL>
+<LI><A HREF="#what">What is Apache ?</A>
+<LI><A HREF="#why">Why was Apache created ?</A>
+<LI><A HREF="#relate">How does the Apache group relate to other servers ?</A>
+<LI><A HREF="#name">Why the name "Apache" ?</A>
+<LI><A HREF="#compatible">How compatible is Apache with my existing NCSA 1.3 setup ?</A>
+<LI><A HREF="#compare">OK, so how does Apache compare to other servers ?</A>
+<LI><A HREF="#tested">How thoroughly tested is Apache?</A>
+<LI><A HREF="#proxy">Does or will Apache act as a Proxy server?</A>
+<LI><A HREF="#future">What are the future plans for Apache ?</A>
+<LI><A HREF="#support">Who do I contact for support ?</A>
+<LI><A HREF="#more">Is there any more information on Apache ?</A>
+<LI><A HREF="#where">Where can I get Apache ?</A>
+</OL>
+
+<HR>
+
+<H2>The Answers</H2>
+<OL>
+<LI><A name="what">What is Apache ?</A>
+<P>
+ Apache was originally based on code and ideas found in the most
+popular HTTP server of the time, NCSA httpd 1.3 (early 1995). It has
+since evolved into a far superior system which can rival (and probably
+surpass) almost any other UNIX based HTTP server in terms of functionality,
+efficiency and speed.
+<p>Since it began, it has been completely rewritten, and includes many new
+features. Apache is, as of June 1996, the most popular WWW server on
+the Internet, according to the <a
+href="http://www.netcraft.com/Survey/">Netcraft Survey</a>.
+
+</P>
+<HR>
+<LI><A name="relate">How does the Apache group relate to other
+server efforts, such as NCSA's?</A>
+<P>
+We, of course, owe a great debt to NCSA and their programmers for
+making the server Apache was based on. We now, however, have our own
+server, and our project is mostly our own. The Apache Project is an
+entirely independent venture.
+</P>
+<HR>
+
+<LI><A name="why">Why was Apache created ?</A>
+<P>To address the concerns of a group of WWW providers and part-time httpd
+programmers that httpd didn't behave as they wanted it
+to. Apache is an entirely volunteer effort, completely funded by its
+members, not by commercial sales.
+</P>
+
+<HR>
+
+<LI><A name="name">Why the name "Apache" ?</A>
+<P>A cute name which stuck. Apache is "<B>A PA</B>t<B>CH</B>y server". It was
+ based on some existing code and a series of "patch files".
+</P>
+<HR>
+
+
+<LI><A name="compatible">How compatible is Apache with my existing NCSA 1.3
+setup ?</A><P>
+
+Apache attempts to offer all the features and configuration options
+of NCSA httpd 1.3, as well as many of the additional features found in
+NCSA httpd 1.4 and NCSA httpd 1.5.<P>
+
+NCSA httpd appears to be moving toward adding experimental features
+which are not generally required at the moment. Some of the experiments
+will succeed while others will inevitably be dropped. The Apache philosophy is
+to add what's needed as and when it is needed.<p>
+
+Friendly interaction between Apache and NCSA developers should ensure
+that fundamental feature enhancements stay consistent between the two
+servers for the foreseeable future.<p>
+
+<HR>
+
+<LI><A name="compare">OK, so how does Apache compare to other servers ?</A>
+<P>
+For an independent assessment, see <A HREF="http://www.webcompare.com/server-main.html">http://www.webcompare.com/server-main.html</A>
+</P>
+
+<P>Apache has been shown to be substantially faster than many other
+free servers. Although certain commercial servers have claimed to
+surpass Apache's speed (it has not been demonstrated that any of these
+"benchmarks" are a good way of measuring WWW server speed at any
+rate), we feel that it is better to have a mostly-fast free server
+than an extremely-fast server that costs thousands of dollars. Apache
+is run on sites that get millions of hits per day, and they have
+experienced no performance difficulties.</p>
+
+<HR>
+<LI><A name="tested">How thoroughly tested is Apache?</A>
+
+<p>Apache is run on over 100,000 Internet servers (as of July 1996). It has
+been tested thoroughly by both developers and users. The Apache Group
+maintains rigorous standards before releasing new versions of their
+server, and our server runs without a hitch on over one third of all
+WWW servers. When bugs do show up, we release patches and new
+versions, as soon as they are available.</p>
+
+<P>See <A HREF="../info/apache_users.html">http://www.apache.org/info/apache_users.html</A> for an incomplete list of sites running Apache.</P>
+
+<hr>
+
+<LI><A name="proxy">Does or will Apache act as a Proxy server?</A>
+<p>Apache version 1.1
+and above will come with a proxy module. If compiled in, this will make
+Apache act as a caching-proxy server.
+<p>
+<HR>
+
+<LI><A name="future">What are the future plans for Apache ?</A>
+<P><UL>
+<LI>to continue as a public domain HTTP server,
+<LI>to keep up with advances in HTTP protocol and web developments in general
+<LI>to collect suggestions for fixes/improvements from its users,
+<LI>to respond to needs of large volume providers as well as occasional users.
+</UL>
+</P><HR>
+
+<LI><A name="support">Who do I contact for support ?</A>
+<P>There is no official support for Apache. None of the developers want to
+be swamped by a flood of trivial questions that can be resolved elsewhere.
+Bug reports and suggestions should be sent via <A HREF="http://www.apache.org/bug_report.html">the bug report page.</A>
+Other questions should be directed to
+<A HREF="news:comp.infosystems.www.servers.unix">comp.infosystems.www.servers.unix</A>, where some of the Apache team lurk,
+in the company of many other httpd gurus who should be able
+to help.
+<p>
+Commercial support for Apache is, however, available from a number of
+third parties.
+</p>
+<HR>
+
+<LI><A name="more">Is there any more information on Apache ?</A>
+<P>Indeed there is. See <A HREF="http://www.apache.org/">http://www.apache.org/</A>.
+</P>
+<HR>
+
+<LI><A name="where">Where can I get Apache ?</A>
+<P>
+You can find the source for Apache at <A HREF="http://www.apache.org/">http://www.apache.org/</A>.
+</P>
+<HR>
+</OL>
+
+<A HREF="../"><IMG SRC="../images/apache_home.gif" ALT="Home"></A>
+<A HREF="./"><IMG SRC="../images/apache_index.gif" ALT="Index"></A>
+</BODY>
+</HTML>
diff --git a/docs/manual/misc/client_block_api.html b/docs/manual/misc/client_block_api.html
new file mode 100644
index 0000000000..c70ee37a66
--- /dev/null
+++ b/docs/manual/misc/client_block_api.html
@@ -0,0 +1,70 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
+<HTML>
+<HEAD>
+<TITLE>Reading Client Input in Apache 1.2</TITLE>
+</HEAD>
+
+<BODY>
+<IMG SRC="../images/apache_sub.gif" ALT="">
+<H1>Reading Client Input in Apache 1.2</h1>
+
+<hr>
+
+<p>Apache 1.1 and earlier let modules handle POST and PUT requests by
+themselves. The module would, on its own, determine whether the
+request had an entity, how many bytes it was, and then call a
+function (<code>read_client_block</code>) to get the data.
+
+<p>However, HTTP/1.1 requires several things of POST and PUT request
+handlers that did not fit into this model, and all existing modules
+have to be rewritten. The API calls for handling this have been
+further abstracted, so that future HTTP protocol changes can be
+accomplished while remaining backwards-compatible.</p>
+
+<hr>
+
+<h3>The New API Functions</h3>
+
+<pre>
+ int setup_client_block (request_rec *);
+ int should_client_block (request_rec *);
+ long get_client_block (request_rec *, char *buffer, int buffer_size);
+</pre>
+
+<ol>
+<li>Call <code>setup_client_block()</code> near the beginning of the request
+ handler. This will set up all the necessary properties, and
+ will return either OK, or an error code. If the latter,
+ the module should return that error code.
+
+<li>When you are ready to possibly accept input, call
+ <code>should_client_block()</code>.
+ This will tell the module whether or not to read input. If it is 0,
+ the module should assume that the input is of a non-entity type
+ (e.g. a GET request). A nonzero response indicates that the module
+ should proceed (to step 3).
+ This step also sends a 100 Continue response
+ to HTTP/1.1 clients, so it should not be called until the module
+ is *definitely* ready to read content (otherwise, the point of the
+ 100 response is defeated). Never call this function more than once.
+
+<li>Finally, call <code>get_client_block</code> in a loop. Pass it a
+ buffer and its
+ size. It will put data into the buffer (not neccessarily the full
+ buffer, in the case of chunked inputs), and return the length of
+ the input block. When it is done reading, it will return 0, and
+ the module should proceed.
+
+</ol>
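+
+<p>The three steps above can be sketched roughly as follows. This is an
+illustration only, not code taken from Apache: the <code>request_rec</code>
+here is a hypothetical in-memory stand-in, the three API functions are
+mocked so the sketch compiles by itself, and <code>read_request_body</code>
+is an invented helper name. A real module would include the server headers
+and use the real functions instead.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define OK 0  /* stand-in for the server's OK status code */

/* Hypothetical stand-in for request_rec: just a simulated request body. */
typedef struct request_rec {
    const char *body;   /* bytes the "client" sent */
    size_t remaining;   /* bytes not yet handed to the module */
} request_rec;

/* Mocked API calls (assumptions, so that the sketch is self-contained). */
static int setup_client_block(request_rec *r) { (void)r; return OK; }

static int should_client_block(request_rec *r) { return r->remaining > 0; }

static long get_client_block(request_rec *r, char *buffer, int buffer_size)
{
    size_t n = r->remaining < (size_t)buffer_size
                   ? r->remaining : (size_t)buffer_size;
    memcpy(buffer, r->body, n);
    r->body += n;
    r->remaining -= n;
    return (long)n;     /* 0 once the whole body has been delivered */
}

/* Handler skeleton: stores the number of body bytes read in *total. */
int read_request_body(request_rec *r, long *total)
{
    char buf[8];        /* deliberately tiny so the loop runs several times */
    long n;
    int rc;

    assert(r != NULL && total != NULL);
    *total = 0;

    /* Step 1: set up, and pass any error code straight back. */
    rc = setup_client_block(r);
    if (rc != OK)
        return rc;

    /* Step 2: ask exactly once, and only when ready to read. */
    if (should_client_block(r)) {
        /* Step 3: loop until get_client_block() returns 0. */
        while ((n = get_client_block(r, buf, (int)sizeof buf)) > 0)
            *total += n;    /* a real module would consume buf[0..n) here */
    }
    return OK;
}
```

+<p>Note that each call may return fewer bytes than the buffer holds, so the
+loop relies only on the return value, never on the buffer being full.</p>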
+
+<p>As an example, please look at the code in
+<code>mod_cgi.c</code>. This is properly written to the new API
+guidelines.</p>
+
+<hr>
+
+<A HREF="../"><IMG SRC="../images/apache_home.gif" ALT="Home"></A>
+<A HREF="./"><IMG SRC="../images/apache_index.gif" ALT="Index"></A>
+
+</BODY>
+</HTML>
diff --git a/docs/manual/misc/compat_notes.html b/docs/manual/misc/compat_notes.html
new file mode 100644
index 0000000000..efa641f8b7
--- /dev/null
+++ b/docs/manual/misc/compat_notes.html
@@ -0,0 +1,108 @@
+<HTML><HEAD>
+<TITLE>Apache HTTP Server: Compatibility Notes with NCSA's Server</TITLE>
+</HEAD>
+<BODY>
+<IMG SRC="../images/apache_sub.gif" ALT="">
+<H3>Compatibility Notes with NCSA's Server</H3>
+
+<HR>
+
+While Apache 0.8.x and beyond are for the most part a drop-in
+replacement for NCSA's httpd and earlier versions of Apache, there are
+a couple of gotchas to watch out for. These are mostly due to the fact
+that the parser for config and access control files was rewritten from
+scratch, so certain liberties the earlier servers took may not be
+available here. These are all easily fixable. If you know of other
+non-fatal problems that belong here, <a
+href="mailto:apache-bugs@apache.org">let us know.</a>
+
+<P>Please also check the <A HREF="known_bugs.html">known bugs</A> page.
+
+
+
+<OL>
+
+<LI><CODE>AddType</CODE> only accepts one file extension per line, without
+any dots (<code>.</code>) in the extension, and does not take full filenames.
+If you need multiple extensions per type, use multiple lines, e.g.
+<blockquote><code>
+AddType application/foo foo<br>
+AddType application/foo bar
+</code></blockquote>
+to map <code>.foo</code> and <code>.bar</code> to <code>application/foo</code>.
+<p>
+
+
+
+ <LI><P>If you follow the NCSA guidelines for setting up access restrictions
+ based on client domain, you may well have added entries for
+ <CODE>AuthType, AuthName, AuthUserFile</CODE> or <CODE>AuthGroupFile</CODE>.
+ <B>None</B> of these are needed (or appropriate) for restricting access
+ based on client domain.
+
+ <P>When Apache sees <CODE>AuthType</CODE> it (reasonably) assumes you
+ are using some authorization type based on username and password.
+
+ <P>Please remove <CODE>AuthType</CODE>; it's unnecessary even for NCSA.
+
+ <P>
+
+ <LI><CODE>AuthUserFile</CODE> requires a full pathname. In earlier
+ versions of NCSA httpd and Apache, you could use a filename
+ relative to the .htaccess file. This could be a major security hole,
+ as it made it trivially easy to make a ".htpass" file in a
+ directory easily accessible by the world. We recommend you store
+ your passwords outside your document tree.
+
+ <P>
+
+ <LI><CODE>OldScriptAlias</CODE> is no longer supported.
+
+ <P>
+
+ <LI><CODE>exec cgi=""</CODE> produces reasonable <B>malformed header</B>
+ responses when used to invoke non-CGI scripts.<BR>
+ The NCSA code ignores the missing header. (bad idea)<BR>
+ Solution: write CGI to the CGI spec or use <CODE>exec cmd=""</CODE> instead.
+ <P>We might add <CODE>virtual</CODE> support to <CODE>exec cmd</CODE> to
+ make up for this difference.
+
+ <P>
+
+ <LI>&lt;Limit&gt; silliness - in the old Apache 0.6.5, a
+ directive of &lt;Limit GET&gt; would also restrict POST methods - Apache 0.8.8's new
+ core is correct in not presuming a limit on a GET is the same limit on a POST,
+ so if you are relying on that behavior you need to change your access configurations
+ to reflect that.
+
+ <P>
+
+ <LI>Icons for FancyIndexing broken - well, no, they're not broken, we've just upgraded the
+ icons from flat .xbm files to pretty and much smaller .gif files, courtesy of
+<a href="mailto:kevinh@eit.com">Kevin Hughes</a> at
+<a href="http://www.eit.com">EIT</a>.
+ If you are using the same srm.conf from an old distribution, make sure you add the new
+ AddIcon, AddIconByType, and DefaultIcon commands.
+
+ <P>
+
+ <LI>Under IRIX, the "Group" directive in httpd.conf needs to be a valid group name
+ (e.g. "nogroup"), not the numeric group ID. The distribution httpd.conf, and earlier
+ ones, had the default Group be "#-1", which was causing silent exits at startup.<p>
+
+<li><code>.asis</code> files: Apache 0.6.5 did not require a Status header;
+it added one automatically if the .asis file contained a Location header.
+0.8.14 requires a Status header. <p>
+
+</OL>
+
+More to come when we notice them....
+
+
+<hr>
+
+<A HREF="../"><IMG SRC="../images/apache_home.gif" ALT="Home"></A>
+<A HREF="./"><IMG SRC="../images/apache_index.gif" ALT="Index"></A>
+
+</BODY>
+</HTML>
diff --git a/docs/manual/misc/security_tips.html b/docs/manual/misc/security_tips.html
new file mode 100644
index 0000000000..a805d8cbed
--- /dev/null
+++ b/docs/manual/misc/security_tips.html
@@ -0,0 +1,92 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
+<HTML>
+<HEAD>
+<TITLE>Apache HTTP Server Documentation</TITLE>
+</HEAD>
+
+<BODY>
+<IMG SRC="../images/apache_sub.gif" ALT="">
+<H1>Security tips for server configuration</H1>
+
+<hr>
+
+<P>Some hints and tips on security issues in setting up a web server. Some of
+the suggestions will be general, others specific to Apache.
+
+<HR>
+
+<H2>Server Side Includes</H2>
+<P>Server side includes (SSI) can be configured so that users can execute
+arbitrary programs on the server. That thought alone should send a shiver
+down the spine of any sys-admin.<p>
+
+One solution is to disable that part of SSI. To do that you use the
+IncludesNOEXEC option to the <A HREF="core.html#options">Options</A>
+directive.<p>
+
+<HR>
+
+<H2>Non Script Aliased CGI</H2>
+<P>Allowing users to execute <B>CGI</B> scripts in any directory should only
+be considered if:
+<OL>
+ <LI>You trust your users not to write scripts which will deliberately or
+accidentally expose your system to an attack.
+ <LI>You consider security at your site to be so feeble in other areas, as to
+make one more potential hole irrelevant.
+ <LI>You have no users, and nobody ever visits your server.
+</OL><p>
+<HR>
+
+<H2>Script Alias'ed CGI</H2>
+<P>Limiting <B>CGI</B> to special directories gives the admin control over
+what goes into those directories. This is inevitably more secure than
+non script aliased CGI, but <strong>only if users with write access to the
+directories are trusted</strong> or the admin is willing to test each new CGI
+script/program for potential security holes.<P>
+
+Most sites choose this option over the non script aliased CGI approach.<p>
+
+<HR>
+<H2>CGI in general</H2>
+<P>Always remember that you must trust the writers of the CGI script/programs
+or your ability to spot potential security holes in CGI, whether they were
+deliberate or accidental.<p>
+
+All the CGI scripts will run as the same user, so they have potential to
+conflict (accidentally or deliberately) with other scripts e.g. User A hates
+User B, so he writes a script to trash User B's CGI database.<P>
+
+<HR>
+
+Please send any other useful security tips to
+<A HREF="mailto:apache-bugs@mail.apache.org">apache-bugs@mail.apache.org</A>
+<p>
+<HR>
+
+<H2>Stopping users overriding system wide settings...</H2>
+<P>To run a really tight ship, you'll want to stop users from setting
+up <CODE>.htaccess</CODE> files which can override security features
+you've configured. Here's one way to do it...<p>
+
+In the server configuration file, put
+<blockquote><code>
+&lt;Directory /&gt; <br>
+AllowOverride None <br>
+Options None <br>
+&lt;Limit GET PUT POST&gt; <br>
+allow from all <br>
+&lt;/Limit&gt; <br>
+&lt;/Directory&gt; <br>
+</code></blockquote>
+
+Then set up specific directories individually.<P>
+
+This stops all overrides, Includes and accesses in all directories apart
+from those named.<p><hr>
+
+<A HREF="../"><IMG SRC="../images/apache_home.gif" ALT="Home"></A>
+<A HREF="./"><IMG SRC="../images/apache_index.gif" ALT="Index"></A>
+
+</BODY>
+</HTML>
diff --git a/docs/manual/platform/perf-bsd44.html b/docs/manual/platform/perf-bsd44.html
new file mode 100644
index 0000000000..1f3a6010c8
--- /dev/null
+++ b/docs/manual/platform/perf-bsd44.html
@@ -0,0 +1,215 @@
+<html>
+<head>
+<title>Running a High-Performance Web Server for BSD</title>
+</head>
+
+<body>
+<A NAME="initial">
+<IMG SRC="../images/apache_sub.gif" ALT="">
+</A>
+<H2>Running a High-Performance Web Server for BSD</H2>
+
+As on other OSes, the listen queue is often the <b>first limit hit</b>. The
+following are comments from "Aaron Gifford &lt;agifford@InfoWest.COM&gt;"
+on how to fix this on BSDI 1.x, 2.x, and FreeBSD 2.0 (and earlier):
+
+<p>
+
+Edit the following two files:
+<blockquote><code> /usr/include/sys/socket.h <br>
+ /usr/src/sys/sys/socket.h </code></blockquote>
+In each file, look for the following:
+<pre>
+ /*
+ * Maximum queue length specifiable by listen.
+ */
+ #define SOMAXCONN 5
+</pre>
+
+Just change the "5" to whatever appears to work. I bumped the two
+machines I was having problems with up to 32 and haven't noticed the
+problem since.
+
+<p>
+
+After the edit, recompile the kernel, recompile the Apache server,
+and then reboot.
+
+<P>
+
+FreeBSD 2.1 seems to be perfectly happy, with SOMAXCONN
+set to 32 already.
+
+<p>
+
+<A NAME="detail">
+<b>Addendum for <i>very</i> heavily loaded BSD servers</b><br>
+</A>
+from Chuck Murcko &lt;chuck@telebase.com&gt;
+
+<p>
+
+If you're running a really busy BSD Apache server, the following are useful
+things to do if the system is acting sluggish:<p>
+
+<ul>
+
+<li> Run vmstat to check memory usage, page/swap rates, etc.
+
+<li> Run netstat -m to check mbuf usage
+
+<li> Run fstat to check file descriptor usage
+
+</ul>
+
+These utilities give you an idea what you'll need to tune in your kernel,
+and whether it'll help to buy more RAM.
+
+Here are some BSD kernel config parameters (actually BSDI, but pertinent to
+FreeBSD and other 4.4-lite derivatives) from a system getting heavy usage.
+The tools mentioned above were used, and the system memory was increased to
+48 MB before these tuneups. Other system parameters remained unchanged.
+
+<p>
+
+<pre>
+maxusers 256
+</pre>
+
+Maxusers drives a <i>lot</i> of other kernel parameters:
+
+<ul>
+
+<li> Maximum # of processes
+
+<li> Maximum # of processes per user
+
+<li> System wide open files limit
+
+<li> Per-process open files limit
+
+<li> Maximum # of mbuf clusters
+
+<li> Proc/pgrp hash table size
+
+</ul>
+
+The actual formulae for these derived parameters are in
+<i>/usr/src/sys/conf/param.c</i>.
+These calculated parameters can also be overridden (in part) by specifying
+your own values in the kernel configuration file:
+
+<pre>
+# Network options. NMBCLUSTERS defines the number of mbuf clusters and
+# defaults to 256. This machine is a server that handles lots of traffic,
+# so we crank that value.
+options SOMAXCONN=256 # max pending connects
+options NMBCLUSTERS=4096 # mbuf clusters at 4096
+
+#
+# Misc. options
+#
+options CHILD_MAX=512 # maximum number of child processes
+options OPEN_MAX=512 # maximum fds (breaks RPC svcs)
+</pre>
+
+SOMAXCONN is not derived from maxusers, so you'll always need to increase
+that yourself. We used a value guaranteed to be larger than Apache's
+current default listen() backlog of 128.
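+
+<p>
+
+To see why the kernel value matters, here is a small sketch. It assumes the
+traditional 4.4BSD behaviour, in which the kernel silently limits the backlog
+passed to listen() to SOMAXCONN; <code>effective_backlog</code> is a
+hypothetical helper name used only for illustration, not a system call.

```c
#include <assert.h>

/*
 * Hypothetical helper: the queue length a 4.4BSD-style kernel would
 * actually use when a server calls listen(fd, requested) and the kernel
 * was built with the given SOMAXCONN. (Assumed clamping behaviour.)
 */
int effective_backlog(int requested, int somaxconn)
{
    assert(somaxconn >= 0);
    if (requested < 0)
        requested = 0;                  /* negative backlogs count as 0 */
    return requested < somaxconn ? requested : somaxconn;
}
```

+With the stock SOMAXCONN of 5, a request for 128 would be cut down to 5.
+With SOMAXCONN=256 as in the config above, the full 128 is honoured.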
+
+<p>
+
+In many cases, NMBCLUSTERS must be set much larger than would appear
+necessary at first glance. The reason for this is that if the browser
+disconnects in mid-transfer, the socket fd associated with that particular
+connection ends up in the TIME_WAIT state for several minutes, during
+which time its mbufs are not yet freed.
+
+<p>
+
+Some more info on mbuf clusters (from sys/mbuf.h):
+<pre>
+/*
+ * Mbufs are of a single size, MSIZE (machine/machparam.h), which
+ * includes overhead. An mbuf may add a single "mbuf cluster" of size
+ * MCLBYTES (also in machine/machparam.h), which has no additional overhead
+ * and is used instead of the internal data area; this is done when
+ * at least MINCLSIZE of data must be stored.
+ */
+</pre>
+
+<p>
+
+CHILD_MAX and OPEN_MAX are set to allow up to 512 child processes (different
+than the maximum value for processes per user ID) and file descriptors.
+These values may change for your particular configuration (a higher OPEN_MAX
+value if you've got modules or CGI scripts opening lots of connections or
+files). If you've got a lot of other activity besides httpd on the same
+machine, you'll have to set NPROC higher still. In this example, the NPROC
+value derived from maxusers proved sufficient for our load.
+
+<p>
+
+<b>Caveats</b>
+
+<p>
+
+Be aware that your system may not boot with a kernel that is configured
+to use more resources than you have available system RAM. <b>ALWAYS</b>
+have a known bootable kernel available when tuning your system this way,
+and use the system tools beforehand to learn if you need to buy more
+memory before tuning.
+
+<p>
+
+RPC services will fail when the value of OPEN_MAX is larger than 256.
+This is a function of the original implementations of the RPC library,
+which used a byte value for holding file descriptors. BSDI has partially
+addressed this limit in its 2.1 release, but a real fix may well await
+the redesign of RPC itself.
+
+<p>
+
+Finally, there's the hard limit of child processes configured in Apache.
+
+<p>
+
+For versions of Apache later than 1.0.5 you'll need to change the
+definition for <b>HARD_SERVER_LIMIT</b> in <i>httpd.h</i> and recompile
+if you need to run more than the default 150 instances of httpd.
+
+<p>
+
+From conf/httpd.conf-dist:
+
+<pre>
+# Limit on total number of servers running, i.e., limit on the number
+# of clients who can simultaneously connect --- if this limit is ever
+# reached, clients will be LOCKED OUT, so it should NOT BE SET TOO LOW.
+# It is intended mainly as a brake to keep a runaway server from taking
+# Unix with it as it spirals down...
+
+MaxClients 150
+</pre>
+
+Know what you're doing if you bump this value up, and make sure you've
+done your system monitoring, RAM expansion, and kernel tuning beforehand.
+Then you're ready to service some serious hits!
+
+<p>
+
+Thanks to <i>Tony Sanders</i> and <i>Chris Torek</i> at BSDI for their
+helpful suggestions and information.
+
+<P><HR>
+
+<H3>More welcome!</H3>
+
+If you have tips to contribute, send mail to <a
+href="mailto:brian@organic.com">brian@organic.com</a>
+
+<P><HR><P>
+<A HREF="/"><IMG SRC="../images/apache_home.gif" ALT="Home"></A>
+<A HREF="."><IMG SRC="../images/apache_index.gif" ALT="Index"></A>
+</body></html>
+
diff --git a/docs/manual/platform/perf-dec.html b/docs/manual/platform/perf-dec.html
new file mode 100644
index 0000000000..cd027bfc60
--- /dev/null
+++ b/docs/manual/platform/perf-dec.html
@@ -0,0 +1,267 @@
+<HEAD>
+<TITLE>Performance Tuning Tips for Digital Unix</TITLE>
+</HEAD>
+<BODY>
+<H1>Performance Tuning Tips for Digital Unix</H1>
+
+Below is a set of newsgroup posts made by an engineer from DEC in
+response to queries about how to modify DEC's Digital Unix OS for more
+heavily loaded web sites. Copied with permission.
+
+<HR>
+
+<H2>Update</H2>
+From: Jeffrey Mogul &lt;mogul@pa.dec.com&gt;<BR>
+Date: Fri, 28 Jun 96 16:07:56 MDT<BR>
+
+<OL>
+<LI> The advice given in the README file regarding the
+ "tcbhashsize" variable is incorrect. The largest value
+ this should be set to is 1024. Setting it any higher
+ will have the perverse result of disabling the hashing
+ mechanism.
+
+<LI>Patch ID OSF350-146 has been superseded by
+<blockquote>
+ Patch ID OSF350-195 for V3.2C<BR>
+ Patch ID OSF360-350195 for V3.2D
+</blockquote>
+ Patch IDs for V3.2E and V3.2F should be available soon.
+ There is no known reason why the Patch ID OSF360-350195
+ won't work on these releases, but such use is not officially
+ supported by Digital. This patch kit will not be needed for
+ V3.2G when it is released.
+</OL>
+
+<HR>
+
+
+<PRE>
+From mogul@pa.dec.com (Jeffrey Mogul)
+Organization DEC Western Research
+Date 30 May 1996 00:50:25 GMT
+Newsgroups <A HREF="news:comp.unix.osf.osf1">comp.unix.osf.osf1</A>
+Message-ID <A HREF="news:4oirch$bc8@usenet.pa.dec.com">&lt;4oirch$bc8@usenet.pa.dec.com&gt;</A>
+Subject Re: Web Site Performance
+References 1
+
+
+
+In article &lt;skoogDs54BH.9pF@netcom.com&gt; skoog@netcom.com (Jim Skoog) writes:
+&gt;Where are the performance bottlenecks for Alpha AXP running the
+&gt;Netscape Commerce Server 1.12 with high volume internet traffic?
+&gt;We are evaluating network performance for a variety of Alpha AXP
+&gt;running DEC UNIX 3.2C, which run DEC's seal firewall and behind
+&gt;that Alpha 1000 and 2100 webservers.
+
+Our experience (running such Web servers as <A HREF="http://altavista.digital.com">altavista.digital.com</A>
+and <A HREF="http://www.digital.com">www.digital.com</A>) is that there is one important kernel tuning
+knob to adjust in order to get good performance on V3.2C. You
+need to patch the kernel global variable "somaxconn" (use dbx -k
+to do this) from its default value of 8 to something much larger.
+
+How much larger? Well, no larger than 32767 (decimal). And
+probably no less than about 2048, if you have a really high volume
+(millions of hits per day), like AltaVista does.
+
+This change allows the system to maintain more than 8 TCP
+connections in the SYN_RCVD state for the HTTP server. (You
+can use "netstat -An |grep SYN_RCVD" to see how many such
+connections exist at any given instant).
+
+If you don't make this change, you might find that as the load gets
+high, some connection attempts take a very long time. And if a lot
+of your clients disconnect from the Internet during the process of
+TCP connection establishment (this happens a lot with dialup
+users), these "embryonic" connections might tie up your somaxconn
+quota of SYN_RCVD-state connections. Until the kernel times out
+these embryonic connections, no other connections will be accepted,
+and it will appear as if the server has died.
+
+The default value for somaxconn in Digital UNIX V4.0 will be quite
+a bit larger than it has been in previous versions (we inherited
+this default from 4.3BSD).
+
+Digital UNIX V4.0 includes some other performance-related changes
+that significantly improve its maximum HTTP connection rate. However,
+we've been using V3.2C systems to front-end for altavista.digital.com
+with no obvious performance bottlenecks at the millions-of-hits-per-day
+level.
+
+We have some Webstone performance results available at
+ <A HREF="http://www.digital.com/info/alphaserver/news/webff.html">http://www.digital.com/info/alphaserver/news/webff.html</A>
+I'm not sure if these were done using V4.0 or an earlier version
+of Digital UNIX, although I suspect they were done using a test
+version of V4.0.
+
+-Jeff
+
+<HR>
+
+----------------------------------------------------------------------------
+
+From mogul@pa.dec.com (Jeffrey Mogul)
+Organization DEC Western Research
+Date 31 May 1996 21:01:01 GMT
+Newsgroups <A HREF="news:comp.unix.osf.osf1">comp.unix.osf.osf1</A>
+Message-ID <A HREF="news:4onmmd$mmd@usenet.pa.dec.com">&lt;4onmmd$mmd@usenet.pa.dec.com&gt;</A>
+Subject Digital UNIX V3.2C Internet tuning patch info
+
+----------------------------------------------------------------------------
+
+Something that probably few people are aware of is that Digital
+has a patch kit available for Digital UNIX V3.2C that may improve
+Internet performance, especially for busy web servers.
+
+This patch kit is one way to increase the value of somaxconn,
+which I discussed in a message here a day or two ago.
+
+I've included in this message the revised README file for this
+patch kit below. Note that the original README file in the patch
+kit itself may be an earlier version; I'm told that the version
+below is the right one.
+
+Sorry, this patch kit is NOT available for other versions of Digital
+UNIX. Most (but not quite all) of these changes also made it into V4.0,
+so the description of the various tuning parameters in this README
+file might be useful to people running V4.0 systems.
+
+This patch kit does not appear to be available (yet?) from
+ <A HREF="http://www.service.digital.com/html/patch_service.html">http://www.service.digital.com/html/patch_service.html</A>
+so I guess you'll have to call Digital's Customer Support to get it.
+
+-Jeff
+
+DESCRIPTION: Digital UNIX Network tuning patch
+
+ Patch ID: OSF350-146
+
+ SUPERSEDED PATCHES: OSF350-151, OSF350-158
+
+ This set of files improves the performance of the network
+ subsystem on a system being used as a web server. There are
+ additional tunable parameters included here, to be used
+ cautiously by an informed system administrator.
+
+TUNING
+
+ To tune the web server, the number of simultaneous socket
+ connection requests are limited by:
+
+ somaxconn Sets the maximum number of pending requests
+ allowed to wait on a listening socket. The
+ default value in Digital UNIX V3.2 is 8.
+ This patch kit increases the default to 1024,
+ which matches the value in Digital UNIX V4.0.
+
+ sominconn Sets the minimum number of pending connections
+ allowed on a listening socket. When a user
+ process calls listen with a backlog less
+ than sominconn, the backlog will be set to
+ sominconn. sominconn overrides somaxconn.
+ The default value is 1.
+
+ The effectiveness of tuning these parameters can be monitored by
+ the sobacklog variables available in the kernel:
+
+ sobacklog_hiwat Tracks the maximum pending requests to any
+ socket. The initial value is 0.
+
+ sobacklog_drops Tracks the number of drops exceeding the
+ socket set backlog limit. The initial
+ value is 0.
+
+ somaxconn_drops Tracks the number of drops exceeding the
+ somaxconn limit. When sominconn is larger
+ than somaxconn, tracks the number of drops
+ exceeding sominconn. The initial value is 0.
+
+ TCP timer parameters also affect performance. Tuning the following
+ requires some knowledge of the characteristics of the network.
+
+ tcp_msl Sets the tcp maximum segment lifetime.
+ This is the maximum lifetime in half
+ seconds that a packet can be in transit
+ on the network. This value, when doubled,
+ is the length of time a connection remains
+ in the TIME_WAIT state after an incoming
+ close request is processed. The unit is
+ specified in 1/2 seconds, the initial
+ value is 60.
+
+ tcp_rexmit_interval_min
+ Sets the minimum TCP retransmit interval.
+ For some WAN networks the default value may
+ be too short, causing unnecessary duplicate
+ packets to be sent. The unit is specified
+ in 1/2 seconds, the initial value is 1.
+
+ tcp_keepinit This is the amount of time a partially
+ established connection will sit on the listen
+ queue before timing out (e.g. if a client
+ sends a SYN but never answers our SYN/ACK).
+ Partially established connections tie up slots
+ on the listen queue. If the queue starts to
+ fill with connections in SYN_RCVD state,
+ tcp_keepinit can be decreased to make those
+ partial connects time out sooner. This should
+ be used with caution, since there might be
+ legitimate clients that are taking a while
+ to respond to SYN/ACK. The unit is specified
+ in 1/2 seconds, the default value is 150
+ (i.e. 75 seconds).
+
+ The hashlist size for the TCP inpcb lookup table is regulated by:
+
+ tcbhashsize The number of hash buckets used for the
+ TCP connection table used in the kernel.
+ The initial value is 32. For best results,
+ should be specified as a power of 2. For
+ busy Web servers, set this to 2048 or more.
+
+ The hashlist size for the interface alias table is regulated by:
+
+ inifaddr_hsize The number of hash buckets used for the
+ interface alias table used in the kernel.
+ The initial value is 32. For best results,
+ should be specified as a power of 2.
+
+ ipport_userreserved The maximum number of concurrent non-reserved,
+ dynamically allocated ports. Default range
+ is 1025-5000. The maximum value is 65535.
+ This limits the number of times you can
+ simultaneously telnet or ftp out to connect
+ to other systems.
+
+ tcpnodelack Don't delay acknowledging TCP data; this
+ can sometimes improve performance of locally
+ run CAD packages. Default value is 0,
+ the enabled value is 1.
+
+ Digital UNIX version:
+
+Feature                    V3.2C   V3.2C patch   V4.0
+=======                    =====   ===========   ====
+somaxconn                    X          X         X
+sominconn                    -          X         X
+sobacklog_hiwat              -          X         -
+sobacklog_drops              -          X         -
+somaxconn_drops              -          X         -
+tcpnodelack                  X          X         X
+tcp_keepidle                 X          X         X
+tcp_keepintvl                X          X         X
+tcp_keepcnt                  -          X         X
+tcp_keepinit                 -          X         X
+TCP keepalive per-socket     -          -         X
+tcp_msl                      -          X         -
+tcp_rexmit_interval_min      -          X         -
+TCP inpcb hashing            -          X         X
+tcbhashsize                  -          X         X
+interface alias hashing      -          X         X
+inifaddr_hsize               -          X         X
+ipport_userreserved          -          X         -
+sysconfig -q inet            -          -         X
+sysconfig -q socket          -          -         X
+
+</PRE>
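+
+<P>The TCP timer parameters in the README above are given in units of
+half a second, which is easy to misread. The sketch below only restates
+that arithmetic; the function names are hypothetical and are not part of
+Digital UNIX or the patch kit.

```c
/* Convert a timer value given in half-second ticks to seconds. */
double half_seconds_to_seconds(int ticks)
{
    return ticks / 2.0;
}

/*
 * Per the README, a connection stays in TIME_WAIT for twice tcp_msl.
 * tcp_msl_ticks is the tcp_msl setting, in half-second ticks.
 */
double time_wait_seconds(int tcp_msl_ticks)
{
    return 2.0 * half_seconds_to_seconds(tcp_msl_ticks);
}
```

+With the initial values quoted above, tcp_keepinit = 150 works out to
+75 seconds, tcp_msl = 60 to 30 seconds (so TIME_WAIT lasts 60 seconds),
+and tcp_rexmit_interval_min = 1 to half a second.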
diff --git a/docs/manual/platform/perf.html b/docs/manual/platform/perf.html
new file mode 100644
index 0000000000..d2a88e23b3
--- /dev/null
+++ b/docs/manual/platform/perf.html
@@ -0,0 +1,134 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
+<html>
+<head>
+<title>Hints on Running a High-Performance Web Server</title>
+</head>
+
+<body>
+<IMG SRC="../images/apache_sub.gif" ALT="">
+<h2>Hints on Running a High-Performance Web Server</H2>
+
+Running Apache on a heavily loaded web server, one often encounters
+problems related to the machine and OS configuration. "Heavy" is
+relative, of course - but if you are seeing more than a couple of hits
+per second on a sustained basis you should consult the pointers on
+this page. In general the suggestions involve how to tune your kernel
+for the heavier TCP load, hardware/software conflicts that arise, etc.
+
+<UL>
+<LI><A HREF="#AUX">A/UX (Apple's UNIX)</A>
+<LI><A HREF="#BSD">BSD-based (BSDI, FreeBSD, etc)</A>
+<LI><A HREF="#DEC">Digital UNIX</A>
+<LI><A HREF="#HP">Hewlett-Packard</A>
+<LI><A HREF="#Linux">Linux</A>
+<LI><A HREF="#SGI">SGI</A>
+<LI><A HREF="#Solaris">Solaris</A>
+<LI><A HREF="#SunOS">SunOS 4.x</A>
+</UL>
+
+<HR>
+
+<A NAME="AUX">
+<H3>A/UX (Apple's UNIX)</H3>
+</A>
+
+If you are running Apache on A/UX, a page that gives some helpful
+performance hints (concerning the <I>listen()</I> queue and using
+virtual hosts)
+<A HREF="http://www.jaguNET.com/apache.html">can be found here</A>.
+
+<P><HR>
+
+<A NAME="BSD">
+<H3>BSD-based (BSDI, FreeBSD, etc)</H3>
+</A>
+
+<A HREF="perf-bsd44.html#initial">Quick</A> and
+<A HREF="perf-bsd44.html#detail">detailed</A>
+performance tuning hints for BSD-derived systems.
+
+<P><HR>
+
+<A NAME="DEC">
+<H3>Digital UNIX</H3>
+</A>
+
+We have some <A HREF="perf-dec.html">newsgroup postings</A> on how to
+tune Digital UNIX 3.2 and 4.0.
+
+<P><HR>
+
+<A NAME="HP">
+<H3>Hewlett-Packard</H3>
+</A>
+
+Some documentation on tuning HP machines can be found at <A
+HREF="http://www.software.hp.com/internet/perf/tuning.html">http://www.software.hp.com/internet/perf/tuning.html</A>.
+
+<P><HR>
+
+<A NAME="Linux">
+<H3>Linux</H3>
+</A>
+
+The most common problem on Linux shows up on heavily-loaded systems
+where the whole server will appear to freeze for a couple of minutes
+at a time, and then come back to life. This has been traced to a
+listen() queue overload - certain Linux implementations have a low
+value set for the incoming connection queue which can cause problems.
+Please see our <a
+href="http://www.qosina.com/~awm/apache/linux-tcp.html">Using Apache on
+Linux</a> page for more info on how to fix this.
+
+<P><HR>
+
+<A NAME="SGI">
+<H3>SGI</H3>
+</A>
+
+<UL>
+<LI><A HREF="http://www.sgi.com/Products/WebFORCE/TuningGuide.html">
+WebFORCE Web Server Tuning Guidelines for IRIX 5.3,
+&lt;http://www.sgi.com/Products/WebFORCE/TuningGuide.html&gt;</A>
+</UL>
+
+<P><HR>
+
+<A NAME="Solaris">
+<H3>Solaris 2.4</H3>
+</A>
+
+The Solaris 2.4 TCP implementation has a few inherent limitations that
+only become apparent under heavy loads. These have been fixed to some
+extent in 2.5 (and completely revamped in 2.6), but for now consult
+the following URL for tips on how to expand the capabilities if you
+are finding slowdowns and lags are hurting performance.
+
+<UL>
+
+<LI><A href="http://www.sun.com/cgi-bin/show?sun-on-net/Sun.Internet.Solutions/performance/">
+World Wide Web Server Performance,
+&lt;http://www.sun.com/cgi-bin/show?sun-on-net/Sun.Internet.Solutions/performance/&gt;</a>
+</UL>
+
+<P><HR>
+
+<A NAME="SunOS">
+<H3>SunOS 4.x</H3>
+</A>
+
+More information on tuning SOMAXCONN on SunOS can be found at
+<A HREF="http://www.islandnet.com/~mark/somaxconn.html">
+http://www.islandnet.com/~mark/somaxconn.html</A>.
+
+<P><HR>
+
+<H3>More welcome!</H3>
+
+If you have tips to contribute, send mail to <a
+href="mailto:brian@organic.com">brian@organic.com</a>
+
+<P><HR><P>
+<A HREF="/"><IMG SRC="../images/apache_home.gif" ALT="Home"></A>
+<A HREF="."><IMG SRC="../images/apache_index.gif" ALT="Index"></A>
+</body></html>
+