author | Tomas Popela <tpopela@redhat.com> | 2019-02-12 13:20:06 +0100 |
---|---|---|
committer | Tomas Popela <tpopela@redhat.com> | 2019-02-12 13:23:13 +0100 |
commit | 013224b21fede6c16ef1c841e5cf4467ec1ab811 (patch) | |
tree | 876ffcae81b39a570ecdc4f96597d9a12673f9f3 /docs | |
parent | edf088bca7277f72ca926d12cc3f84b37cd6b585 (diff) | |
download | libsoup-013224b21fede6c16ef1c841e5cf4467ec1ab811.tar.gz |
Remove RFCs from tree
They are outdated and not free according to Debian's lintian.
Fixes: #133
Diffstat (limited to 'docs')
-rw-r--r-- | docs/specs/README | 13 |
-rw-r--r-- | docs/specs/rfc1945.txt | 3363 |
-rw-r--r-- | docs/specs/rfc2068.txt | 9075 |
-rw-r--r-- | docs/specs/rfc2109.txt | 1179 |
-rw-r--r-- | docs/specs/rfc2145.txt | 395 |
-rw-r--r-- | docs/specs/rfc2324.txt | 563 |
-rw-r--r-- | docs/specs/rfc2388.txt | 507 |
-rw-r--r-- | docs/specs/rfc2518.txt | 5267 |
-rw-r--r-- | docs/specs/rfc2616.txt | 9934 |
-rw-r--r-- | docs/specs/rfc2617.txt | 1909 |
-rw-r--r-- | docs/specs/rfc2817.txt | 731 |
-rw-r--r-- | docs/specs/rfc2818.txt | 395 |
-rw-r--r-- | docs/specs/rfc2965.txt | 1459 |
-rw-r--r-- | docs/specs/rfc3986.txt | 3419 |
14 files changed, 0 insertions, 38209 deletions
diff --git a/docs/specs/README b/docs/specs/README deleted file mode 100644 index 0dee62d0..00000000 --- a/docs/specs/README +++ /dev/null @@ -1,13 +0,0 @@ -rfc1945 - HTTP/1.0 -rfc2068 - HTTP/1.1 (mostly obsoleted original specification) -rfc2109 - HTTP State Management Mechanism -rfc2145 - Use and Interpretation of HTTP Version Numbers -rfc2324 - Hyper Text Coffee Pot Control Protocol (HTCPCP/1.0) -rfc2388 - Returning Values from Forms: multipart/form-data -rfc2518 - HTTP Extensions for Distributed Authoring -- WEBDAV -rfc2616 - HTTP/1.1 (revised) [plus errata] -rfc2617 - HTTP Authentication: Basic and Digest Access Authentication [plus errata] -rfc2817 - Upgrading to TLS Within HTTP/1.1 -rfc2818 - HTTP Over TLS -rfc2965 - HTTP State Management Mechanism (allegedly obsoletes 2109) -rfc3986 - Uniform Resource Identifiers (URI): Generic Syntax diff --git a/docs/specs/rfc1945.txt b/docs/specs/rfc1945.txt deleted file mode 100644 index 37f3f23c..00000000 --- a/docs/specs/rfc1945.txt +++ /dev/null @@ -1,3363 +0,0 @@ - - - - - - -Network Working Group T. Berners-Lee -Request for Comments: 1945 MIT/LCS -Category: Informational R. Fielding - UC Irvine - H. Frystyk - MIT/LCS - May 1996 - - - Hypertext Transfer Protocol -- HTTP/1.0 - -Status of This Memo - - This memo provides information for the Internet community. This memo - does not specify an Internet standard of any kind. Distribution of - this memo is unlimited. - -IESG Note: - - The IESG has concerns about this protocol, and expects this document - to be replaced relatively soon by a standards track document. - -Abstract - - The Hypertext Transfer Protocol (HTTP) is an application-level - protocol with the lightness and speed necessary for distributed, - collaborative, hypermedia information systems. It is a generic, - stateless, object-oriented protocol which can be used for many tasks, - such as name servers and distributed object management systems, - through extension of its request methods (commands). 
A feature of - HTTP is the typing of data representation, allowing systems to be - built independently of the data being transferred. - - HTTP has been in use by the World-Wide Web global information - initiative since 1990. This specification reflects common usage of - the protocol referred to as "HTTP/1.0". - -Table of Contents - - 1. Introduction .............................................. 4 - 1.1 Purpose .............................................. 4 - 1.2 Terminology .......................................... 4 - 1.3 Overall Operation .................................... 6 - 1.4 HTTP and MIME ........................................ 8 - 2. Notational Conventions and Generic Grammar ................ 8 - 2.1 Augmented BNF ........................................ 8 - 2.2 Basic Rules .......................................... 10 - 3. Protocol Parameters ....................................... 12 - - - -Berners-Lee, et al Informational [Page 1] - -RFC 1945 HTTP/1.0 May 1996 - - - 3.1 HTTP Version ......................................... 12 - 3.2 Uniform Resource Identifiers ......................... 14 - 3.2.1 General Syntax ................................ 14 - 3.2.2 http URL ...................................... 15 - 3.3 Date/Time Formats .................................... 15 - 3.4 Character Sets ....................................... 17 - 3.5 Content Codings ...................................... 18 - 3.6 Media Types .......................................... 19 - 3.6.1 Canonicalization and Text Defaults ............ 19 - 3.6.2 Multipart Types ............................... 20 - 3.7 Product Tokens ....................................... 20 - 4. HTTP Message .............................................. 21 - 4.1 Message Types ........................................ 21 - 4.2 Message Headers ...................................... 22 - 4.3 General Header Fields ................................ 23 - 5. 
Request ................................................... 23 - 5.1 Request-Line ......................................... 23 - 5.1.1 Method ........................................ 24 - 5.1.2 Request-URI ................................... 24 - 5.2 Request Header Fields ................................ 25 - 6. Response .................................................. 25 - 6.1 Status-Line .......................................... 26 - 6.1.1 Status Code and Reason Phrase ................. 26 - 6.2 Response Header Fields ............................... 28 - 7. Entity .................................................... 28 - 7.1 Entity Header Fields ................................. 29 - 7.2 Entity Body .......................................... 29 - 7.2.1 Type .......................................... 29 - 7.2.2 Length ........................................ 30 - 8. Method Definitions ........................................ 30 - 8.1 GET .................................................. 31 - 8.2 HEAD ................................................. 31 - 8.3 POST ................................................. 31 - 9. Status Code Definitions ................................... 32 - 9.1 Informational 1xx .................................... 32 - 9.2 Successful 2xx ....................................... 32 - 9.3 Redirection 3xx ...................................... 34 - 9.4 Client Error 4xx ..................................... 35 - 9.5 Server Error 5xx ..................................... 37 - 10. Header Field Definitions .................................. 37 - 10.1 Allow ............................................... 38 - 10.2 Authorization ....................................... 38 - 10.3 Content-Encoding .................................... 39 - 10.4 Content-Length ...................................... 39 - 10.5 Content-Type ........................................ 40 - 10.6 Date ................................................ 
40 - 10.7 Expires ............................................. 41 - 10.8 From ................................................ 42 - - - -Berners-Lee, et al Informational [Page 2] - -RFC 1945 HTTP/1.0 May 1996 - - - 10.9 If-Modified-Since ................................... 42 - 10.10 Last-Modified ....................................... 43 - 10.11 Location ............................................ 44 - 10.12 Pragma .............................................. 44 - 10.13 Referer ............................................. 44 - 10.14 Server .............................................. 45 - 10.15 User-Agent .......................................... 46 - 10.16 WWW-Authenticate .................................... 46 - 11. Access Authentication ..................................... 47 - 11.1 Basic Authentication Scheme ......................... 48 - 12. Security Considerations ................................... 49 - 12.1 Authentication of Clients ........................... 49 - 12.2 Safe Methods ........................................ 49 - 12.3 Abuse of Server Log Information ..................... 50 - 12.4 Transfer of Sensitive Information ................... 50 - 12.5 Attacks Based On File and Path Names ................ 51 - 13. Acknowledgments ........................................... 51 - 14. References ................................................ 52 - 15. Authors' Addresses ........................................ 54 - Appendix A. Internet Media Type message/http ................ 55 - Appendix B. Tolerant Applications ........................... 55 - Appendix C. Relationship to MIME ............................ 56 - C.1 Conversion to Canonical Form ......................... 56 - C.2 Conversion of Date Formats ........................... 57 - C.3 Introduction of Content-Encoding ..................... 57 - C.4 No Content-Transfer-Encoding ......................... 57 - C.5 HTTP Header Fields in Multipart Body-Parts ........... 57 - Appendix D. 
Additional Features ............................. 57 - D.1 Additional Request Methods ........................... 58 - D.1.1 PUT ........................................... 58 - D.1.2 DELETE ........................................ 58 - D.1.3 LINK .......................................... 58 - D.1.4 UNLINK ........................................ 58 - D.2 Additional Header Field Definitions .................. 58 - D.2.1 Accept ........................................ 58 - D.2.2 Accept-Charset ................................ 59 - D.2.3 Accept-Encoding ............................... 59 - D.2.4 Accept-Language ............................... 59 - D.2.5 Content-Language .............................. 59 - D.2.6 Link .......................................... 59 - D.2.7 MIME-Version .................................. 59 - D.2.8 Retry-After ................................... 60 - D.2.9 Title ......................................... 60 - D.2.10 URI ........................................... 60 - - - - - - - -Berners-Lee, et al Informational [Page 3] - -RFC 1945 HTTP/1.0 May 1996 - - -1. Introduction - -1.1 Purpose - - The Hypertext Transfer Protocol (HTTP) is an application-level - protocol with the lightness and speed necessary for distributed, - collaborative, hypermedia information systems. HTTP has been in use - by the World-Wide Web global information initiative since 1990. This - specification reflects common usage of the protocol referred too as - "HTTP/1.0". This specification describes the features that seem to be - consistently implemented in most HTTP/1.0 clients and servers. The - specification is split into two sections. Those features of HTTP for - which implementations are usually consistent are described in the - main body of this document. Those features which have few or - inconsistent implementations are listed in Appendix D. 
- - Practical information systems require more functionality than simple - retrieval, including search, front-end update, and annotation. HTTP - allows an open-ended set of methods to be used to indicate the - purpose of a request. It builds on the discipline of reference - provided by the Uniform Resource Identifier (URI) [2], as a location - (URL) [4] or name (URN) [16], for indicating the resource on which a - method is to be applied. Messages are passed in a format similar to - that used by Internet Mail [7] and the Multipurpose Internet Mail - Extensions (MIME) [5]. - - HTTP is also used as a generic protocol for communication between - user agents and proxies/gateways to other Internet protocols, such as - SMTP [12], NNTP [11], FTP [14], Gopher [1], and WAIS [8], allowing - basic hypermedia access to resources available from diverse - applications and simplifying the implementation of user agents. - -1.2 Terminology - - This specification uses a number of terms to refer to the roles - played by participants in, and objects of, the HTTP communication. - - connection - - A transport layer virtual circuit established between two - application programs for the purpose of communication. - - message - - The basic unit of HTTP communication, consisting of a structured - sequence of octets matching the syntax defined in Section 4 and - transmitted via the connection. - - - - -Berners-Lee, et al Informational [Page 4] - -RFC 1945 HTTP/1.0 May 1996 - - - request - - An HTTP request message (as defined in Section 5). - - response - - An HTTP response message (as defined in Section 6). - - resource - - A network data object or service which can be identified by a - URI (Section 3.2). - - entity - - A particular representation or rendition of a data resource, or - reply from a service resource, that may be enclosed within a - request or response message. An entity consists of - metainformation in the form of entity headers and content in the - form of an entity body. 
- - client - - An application program that establishes connections for the - purpose of sending requests. - - user agent - - The client which initiates a request. These are often browsers, - editors, spiders (web-traversing robots), or other end user - tools. - - server - - An application program that accepts connections in order to - service requests by sending back responses. - - origin server - - The server on which a given resource resides or is to be created. - - proxy - - An intermediary program which acts as both a server and a client - for the purpose of making requests on behalf of other clients. - Requests are serviced internally or by passing them, with - possible translation, on to other servers. A proxy must - interpret and, if necessary, rewrite a request message before - - - -Berners-Lee, et al Informational [Page 5] - -RFC 1945 HTTP/1.0 May 1996 - - - forwarding it. Proxies are often used as client-side portals - through network firewalls and as helper applications for - handling requests via protocols not implemented by the user - agent. - - gateway - - A server which acts as an intermediary for some other server. - Unlike a proxy, a gateway receives requests as if it were the - origin server for the requested resource; the requesting client - may not be aware that it is communicating with a gateway. - Gateways are often used as server-side portals through network - firewalls and as protocol translators for access to resources - stored on non-HTTP systems. - - tunnel - - A tunnel is an intermediary program which is acting as a blind - relay between two connections. Once active, a tunnel is not - considered a party to the HTTP communication, though the tunnel - may have been initiated by an HTTP request. The tunnel ceases to - exist when both ends of the relayed connections are closed. - Tunnels are used when a portal is necessary and the intermediary - cannot, or should not, interpret the relayed communication. 
- - cache - - A program's local store of response messages and the subsystem - that controls its message storage, retrieval, and deletion. A - cache stores cachable responses in order to reduce the response - time and network bandwidth consumption on future, equivalent - requests. Any client or server may include a cache, though a - cache cannot be used by a server while it is acting as a tunnel. - - Any given program may be capable of being both a client and a server; - our use of these terms refers only to the role being performed by the - program for a particular connection, rather than to the program's - capabilities in general. Likewise, any server may act as an origin - server, proxy, gateway, or tunnel, switching behavior based on the - nature of each request. - -1.3 Overall Operation - - The HTTP protocol is based on a request/response paradigm. A client - establishes a connection with a server and sends a request to the - server in the form of a request method, URI, and protocol version, - followed by a MIME-like message containing request modifiers, client - information, and possible body content. The server responds with a - - - -Berners-Lee, et al Informational [Page 6] - -RFC 1945 HTTP/1.0 May 1996 - - - status line, including the message's protocol version and a success - or error code, followed by a MIME-like message containing server - information, entity metainformation, and possible body content. - - Most HTTP communication is initiated by a user agent and consists of - a request to be applied to a resource on some origin server. In the - simplest case, this may be accomplished via a single connection (v) - between the user agent (UA) and the origin server (O). - - request chain ------------------------> - UA -------------------v------------------- O - <----------------------- response chain - - A more complicated situation occurs when one or more intermediaries - are present in the request/response chain. 
There are three common - forms of intermediary: proxy, gateway, and tunnel. A proxy is a - forwarding agent, receiving requests for a URI in its absolute form, - rewriting all or parts of the message, and forwarding the reformatted - request toward the server identified by the URI. A gateway is a - receiving agent, acting as a layer above some other server(s) and, if - necessary, translating the requests to the underlying server's - protocol. A tunnel acts as a relay point between two connections - without changing the messages; tunnels are used when the - communication needs to pass through an intermediary (such as a - firewall) even when the intermediary cannot understand the contents - of the messages. - - request chain --------------------------------------> - UA -----v----- A -----v----- B -----v----- C -----v----- O - <------------------------------------- response chain - - The figure above shows three intermediaries (A, B, and C) between the - user agent and origin server. A request or response message that - travels the whole chain must pass through four separate connections. - This distinction is important because some HTTP communication options - may apply only to the connection with the nearest, non-tunnel - neighbor, only to the end-points of the chain, or to all connections - along the chain. Although the diagram is linear, each participant may - be engaged in multiple, simultaneous communications. For example, B - may be receiving requests from many clients other than A, and/or - forwarding requests to servers other than C, at the same time that it - is handling A's request. - - Any party to the communication which is not acting as a tunnel may - employ an internal cache for handling requests. The effect of a cache - is that the request/response chain is shortened if one of the - participants along the chain has a cached response applicable to that - request. 
The following illustrates the resulting chain if B has a - - - -Berners-Lee, et al Informational [Page 7] - -RFC 1945 HTTP/1.0 May 1996 - - - cached copy of an earlier response from O (via C) for a request which - has not been cached by UA or A. - - request chain ----------> - UA -----v----- A -----v----- B - - - - - - C - - - - - - O - <--------- response chain - - Not all responses are cachable, and some requests may contain - modifiers which place special requirements on cache behavior. Some - HTTP/1.0 applications use heuristics to describe what is or is not a - "cachable" response, but these rules are not standardized. - - On the Internet, HTTP communication generally takes place over TCP/IP - connections. The default port is TCP 80 [15], but other ports can be - used. This does not preclude HTTP from being implemented on top of - any other protocol on the Internet, or on other networks. HTTP only - presumes a reliable transport; any protocol that provides such - guarantees can be used, and the mapping of the HTTP/1.0 request and - response structures onto the transport data units of the protocol in - question is outside the scope of this specification. - - Except for experimental applications, current practice requires that - the connection be established by the client prior to each request and - closed by the server after sending the response. Both clients and - servers should be aware that either party may close the connection - prematurely, due to user action, automated time-out, or program - failure, and should handle such closing in a predictable fashion. In - any case, the closing of the connection by either or both parties - always terminates the current request, regardless of its status. - -1.4 HTTP and MIME - - HTTP/1.0 uses many of the constructs defined for MIME, as defined in - RFC 1521 [5]. 
Appendix C describes the ways in which the context of - HTTP allows for different use of Internet Media Types than is - typically found in Internet mail, and gives the rationale for those - differences. - -2. Notational Conventions and Generic Grammar - -2.1 Augmented BNF - - All of the mechanisms specified in this document are described in - both prose and an augmented Backus-Naur Form (BNF) similar to that - used by RFC 822 [7]. Implementors will need to be familiar with the - notation in order to understand this specification. The augmented BNF - includes the following constructs: - - - - -Berners-Lee, et al Informational [Page 8] - -RFC 1945 HTTP/1.0 May 1996 - - - name = definition - - The name of a rule is simply the name itself (without any - enclosing "<" and ">") and is separated from its definition by - the equal character "=". Whitespace is only significant in that - indentation of continuation lines is used to indicate a rule - definition that spans more than one line. Certain basic rules - are in uppercase, such as SP, LWS, HT, CRLF, DIGIT, ALPHA, etc. - Angle brackets are used within definitions whenever their - presence will facilitate discerning the use of rule names. - - "literal" - - Quotation marks surround literal text. Unless stated otherwise, - the text is case-insensitive. - - rule1 | rule2 - - Elements separated by a bar ("I") are alternatives, - e.g., "yes | no" will accept yes or no. - - (rule1 rule2) - - Elements enclosed in parentheses are treated as a single - element. Thus, "(elem (foo | bar) elem)" allows the token - sequences "elem foo elem" and "elem bar elem". - - *rule - - The character "*" preceding an element indicates repetition. The - full form is "<n>*<m>element" indicating at least <n> and at - most <m> occurrences of element. Default values are 0 and - infinity so that "*(element)" allows any number, including zero; - "1*element" requires at least one; and "1*2element" allows one - or two. 
- - [rule] - - Square brackets enclose optional elements; "[foo bar]" is - equivalent to "*1(foo bar)". - - N rule - - Specific repetition: "<n>(element)" is equivalent to - "<n>*<n>(element)"; that is, exactly <n> occurrences of - (element). Thus 2DIGIT is a 2-digit number, and 3ALPHA is a - string of three alphabetic characters. - - - - -Berners-Lee, et al Informational [Page 9] - -RFC 1945 HTTP/1.0 May 1996 - - - #rule - - A construct "#" is defined, similar to "*", for defining lists - of elements. The full form is "<n>#<m>element" indicating at - least <n> and at most <m> elements, each separated by one or - more commas (",") and optional linear whitespace (LWS). This - makes the usual form of lists very easy; a rule such as - "( *LWS element *( *LWS "," *LWS element ))" can be shown as - "1#element". Wherever this construct is used, null elements are - allowed, but do not contribute to the count of elements present. - That is, "(element), , (element)" is permitted, but counts as - only two elements. Therefore, where at least one element is - required, at least one non-null element must be present. Default - values are 0 and infinity so that "#(element)" allows any - number, including zero; "1#element" requires at least one; and - "1#2element" allows one or two. - - ; comment - - A semi-colon, set off some distance to the right of rule text, - starts a comment that continues to the end of line. This is a - simple way of including useful notes in parallel with the - specifications. - - implied *LWS - - The grammar described by this specification is word-based. - Except where noted otherwise, linear whitespace (LWS) can be - included between any two adjacent words (token or - quoted-string), and between adjacent tokens and delimiters - (tspecials), without changing the interpretation of a field. At - least one delimiter (tspecials) must exist between any two - tokens, since they would otherwise be interpreted as a single - token. 
However, applications should attempt to follow "common - form" when generating HTTP constructs, since there exist some - implementations that fail to accept anything beyond the common - forms. - -2.2 Basic Rules - - The following rules are used throughout this specification to - describe basic parsing constructs. The US-ASCII coded character set - is defined by [17]. - - OCTET = <any 8-bit sequence of data> - CHAR = <any US-ASCII character (octets 0 - 127)> - UPALPHA = <any US-ASCII uppercase letter "A".."Z"> - LOALPHA = <any US-ASCII lowercase letter "a".."z"> - - - -Berners-Lee, et al Informational [Page 10] - -RFC 1945 HTTP/1.0 May 1996 - - - ALPHA = UPALPHA | LOALPHA - DIGIT = <any US-ASCII digit "0".."9"> - CTL = <any US-ASCII control character - (octets 0 - 31) and DEL (127)> - CR = <US-ASCII CR, carriage return (13)> - LF = <US-ASCII LF, linefeed (10)> - SP = <US-ASCII SP, space (32)> - HT = <US-ASCII HT, horizontal-tab (9)> - <"> = <US-ASCII double-quote mark (34)> - - HTTP/1.0 defines the octet sequence CR LF as the end-of-line marker - for all protocol elements except the Entity-Body (see Appendix B for - tolerant applications). The end-of-line marker within an Entity-Body - is defined by its associated media type, as described in Section 3.6. - - CRLF = CR LF - - HTTP/1.0 headers may be folded onto multiple lines if each - continuation line begins with a space or horizontal tab. All linear - whitespace, including folding, has the same semantics as SP. - - LWS = [CRLF] 1*( SP | HT ) - - However, folding of header lines is not expected by some - applications, and should not be generated by HTTP/1.0 applications. - - The TEXT rule is only used for descriptive field contents and values - that are not intended to be interpreted by the message parser. Words - of *TEXT may contain octets from character sets other than US-ASCII. 
- - TEXT = <any OCTET except CTLs, - but including LWS> - - Recipients of header field TEXT containing octets outside the US- - ASCII character set may assume that they represent ISO-8859-1 - characters. - - Hexadecimal numeric characters are used in several protocol elements. - - HEX = "A" | "B" | "C" | "D" | "E" | "F" - | "a" | "b" | "c" | "d" | "e" | "f" | DIGIT - - Many HTTP/1.0 header field values consist of words separated by LWS - or special characters. These special characters must be in a quoted - string to be used within a parameter value. - - word = token | quoted-string - - - - -Berners-Lee, et al Informational [Page 11] - -RFC 1945 HTTP/1.0 May 1996 - - - token = 1*<any CHAR except CTLs or tspecials> - - tspecials = "(" | ")" | "<" | ">" | "@" - | "," | ";" | ":" | "\" | <"> - | "/" | "[" | "]" | "?" | "=" - | "{" | "}" | SP | HT - - Comments may be included in some HTTP header fields by surrounding - the comment text with parentheses. Comments are only allowed in - fields containing "comment" as part of their field value definition. - In all other fields, parentheses are considered part of the field - value. - - comment = "(" *( ctext | comment ) ")" - ctext = <any TEXT excluding "(" and ")"> - - A string of text is parsed as a single word if it is quoted using - double-quote marks. - - quoted-string = ( <"> *(qdtext) <"> ) - - qdtext = <any CHAR except <"> and CTLs, - but including LWS> - - Single-character quoting using the backslash ("\") character is not - permitted in HTTP/1.0. - -3. Protocol Parameters - -3.1 HTTP Version - - HTTP uses a "<major>.<minor>" numbering scheme to indicate versions - of the protocol. The protocol versioning policy is intended to allow - the sender to indicate the format of a message and its capacity for - understanding further HTTP communication, rather than the features - obtained via that communication. 
No change is made to the version - number for the addition of message components which do not affect - communication behavior or which only add to extensible field values. - The <minor> number is incremented when the changes made to the - protocol add features which do not change the general message parsing - algorithm, but which may add to the message semantics and imply - additional capabilities of the sender. The <major> number is - incremented when the format of a message within the protocol is - changed. - - The version of an HTTP message is indicated by an HTTP-Version field - in the first line of the message. If the protocol version is not - specified, the recipient must assume that the message is in the - - - -Berners-Lee, et al Informational [Page 12] - -RFC 1945 HTTP/1.0 May 1996 - - - simple HTTP/0.9 format. - - HTTP-Version = "HTTP" "/" 1*DIGIT "." 1*DIGIT - - Note that the major and minor numbers should be treated as separate - integers and that each may be incremented higher than a single digit. - Thus, HTTP/2.4 is a lower version than HTTP/2.13, which in turn is - lower than HTTP/12.3. Leading zeros should be ignored by recipients - and never generated by senders. - - This document defines both the 0.9 and 1.0 versions of the HTTP - protocol. Applications sending Full-Request or Full-Response - messages, as defined by this specification, must include an HTTP- - Version of "HTTP/1.0". - - HTTP/1.0 servers must: - - o recognize the format of the Request-Line for HTTP/0.9 and - HTTP/1.0 requests; - - o understand any valid request in the format of HTTP/0.9 or - HTTP/1.0; - - o respond appropriately with a message in the same protocol - version used by the client. - - HTTP/1.0 clients must: - - o recognize the format of the Status-Line for HTTP/1.0 responses; - - o understand any valid response in the format of HTTP/0.9 or - HTTP/1.0. 
- - Proxy and gateway applications must be careful in forwarding requests - that are received in a format different than that of the - application's native HTTP version. Since the protocol version - indicates the protocol capability of the sender, a proxy/gateway must - never send a message with a version indicator which is greater than - its native version; if a higher version request is received, the - proxy/gateway must either downgrade the request version or respond - with an error. Requests with a version lower than that of the - application's native format may be upgraded before being forwarded; - the proxy/gateway's response to that request must follow the server - requirements listed above. - - - - - - - -Berners-Lee, et al Informational [Page 13] - -RFC 1945 HTTP/1.0 May 1996 - - -3.2 Uniform Resource Identifiers - - URIs have been known by many names: WWW addresses, Universal Document - Identifiers, Universal Resource Identifiers [2], and finally the - combination of Uniform Resource Locators (URL) [4] and Names (URN) - [16]. As far as HTTP is concerned, Uniform Resource Identifiers are - simply formatted strings which identify--via name, location, or any - other characteristic--a network resource. - -3.2.1 General Syntax - - URIs in HTTP can be represented in absolute form or relative to some - known base URI [9], depending upon the context of their use. The two - forms are differentiated by the fact that absolute URIs always begin - with a scheme name followed by a colon. - - URI = ( absoluteURI | relativeURI ) [ "#" fragment ] - - absoluteURI = scheme ":" *( uchar | reserved ) - - relativeURI = net_path | abs_path | rel_path - - net_path = "//" net_loc [ abs_path ] - abs_path = "/" rel_path - rel_path = [ path ] [ ";" params ] [ "?" query ] - - path = fsegment *( "/" segment ) - fsegment = 1*pchar - segment = *pchar - - params = param *( ";" param ) - param = *( pchar | "/" ) - - scheme = 1*( ALPHA | DIGIT | "+" | "-" | "." 
) - net_loc = *( pchar | ";" | "?" ) - query = *( uchar | reserved ) - fragment = *( uchar | reserved ) - - pchar = uchar | ":" | "@" | "&" | "=" | "+" - uchar = unreserved | escape - unreserved = ALPHA | DIGIT | safe | extra | national - - escape = "%" HEX HEX - reserved = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" - extra = "!" | "*" | "'" | "(" | ")" | "," - safe = "$" | "-" | "_" | "." - unsafe = CTL | SP | <"> | "#" | "%" | "<" | ">" - national = <any OCTET excluding ALPHA, DIGIT, - - - -Berners-Lee, et al Informational [Page 14] - -RFC 1945 HTTP/1.0 May 1996 - - - reserved, extra, safe, and unsafe> - - For definitive information on URL syntax and semantics, see RFC 1738 - [4] and RFC 1808 [9]. The BNF above includes national characters not - allowed in valid URLs as specified by RFC 1738, since HTTP servers - are not restricted in the set of unreserved characters allowed to - represent the rel_path part of addresses, and HTTP proxies may - receive requests for URIs not defined by RFC 1738. - -3.2.2 http URL - - The "http" scheme is used to locate network resources via the HTTP - protocol. This section defines the scheme-specific syntax and - semantics for http URLs. - - http_URL = "http:" "//" host [ ":" port ] [ abs_path ] - - host = <A legal Internet host domain name - or IP address (in dotted-decimal form), - as defined by Section 2.1 of RFC 1123> - - port = *DIGIT - - If the port is empty or not given, port 80 is assumed. The semantics - are that the identified resource is located at the server listening - for TCP connections on that port of that host, and the Request-URI - for the resource is abs_path. If the abs_path is not present in the - URL, it must be given as "/" when used as a Request-URI (Section - 5.1.2). - - Note: Although the HTTP protocol is independent of the transport - layer protocol, the http URL only identifies resources by their - TCP location, and thus non-TCP resources must be identified by - some other URI scheme. 
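As a minimal sketch of the http_URL rule above, the following Python function splits an http URL into host, port, and abs_path, applying the two defaults stated in the text: port 80 when the port is empty or absent, and "/" when no abs_path is present. The function name split_http_url is hypothetical, the host pattern is deliberately loose rather than a validation against RFC 1123, and the scheme is matched only in lowercase as a simplification.

```python
import re

# http_URL = "http:" "//" host [ ":" port ] [ abs_path ]
# The host pattern is an approximation chosen for this sketch.
_HTTP_URL_RE = re.compile(r"http://([^/:]+)(?::([0-9]*))?(/.*)?\Z")


def split_http_url(url):
    """Return (host, port, abs_path) for an http URL (hypothetical helper)."""
    match = _HTTP_URL_RE.match(url)
    if match is None:
        raise ValueError("not an http URL: %r" % (url,))
    host, port, abs_path = match.groups()
    # An empty or missing port means port 80.
    port_number = int(port) if port else 80
    # A missing abs_path must be given as "/" in a Request-URI.
    return host, port_number, abs_path or "/"
```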
- - The canonical form for "http" URLs is obtained by converting any - UPALPHA characters in host to their LOALPHA equivalent (hostnames are - case-insensitive), eliding the [ ":" port ] if the port is 80, and - replacing an empty abs_path with "/". - -3.3 Date/Time Formats - - HTTP/1.0 applications have historically allowed three different - formats for the representation of date/time stamps: - - Sun, 06 Nov 1994 08:49:37 GMT ; RFC 822, updated by RFC 1123 - Sunday, 06-Nov-94 08:49:37 GMT ; RFC 850, obsoleted by RFC 1036 - Sun Nov 6 08:49:37 1994 ; ANSI C's asctime() format - - - -Berners-Lee, et al Informational [Page 15] - -RFC 1945 HTTP/1.0 May 1996 - - - The first format is preferred as an Internet standard and represents - a fixed-length subset of that defined by RFC 1123 [6] (an update to - RFC 822 [7]). The second format is in common use, but is based on the - obsolete RFC 850 [10] date format and lacks a four-digit year. - HTTP/1.0 clients and servers that parse the date value should accept - all three formats, though they must never generate the third - (asctime) format. - - Note: Recipients of date values are encouraged to be robust in - accepting date values that may have been generated by non-HTTP - applications, as is sometimes the case when retrieving or posting - messages via proxies/gateways to SMTP or NNTP. - - All HTTP/1.0 date/time stamps must be represented in Universal Time - (UT), also known as Greenwich Mean Time (GMT), without exception. - This is indicated in the first two formats by the inclusion of "GMT" - as the three-letter abbreviation for time zone, and should be assumed - when reading the asctime format. 
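A recipient that accepts all three formats might be sketched in Python as follows. The function name parse_http_date is hypothetical. The sketch relies on datetime.strptime, whose day and month names depend on the process locale (the default C locale is assumed), and whose handling of two-digit years is Python's own convention rather than anything stated in this specification.

```python
from datetime import datetime, timezone

# The three historical formats, expressed as strptime patterns.
_HTTP_DATE_FORMATS = (
    "%a, %d %b %Y %H:%M:%S GMT",  # RFC 822, updated by RFC 1123
    "%A, %d-%b-%y %H:%M:%S GMT",  # RFC 850 (two-digit year)
    "%a %b %d %H:%M:%S %Y",       # ANSI C asctime()
)


def parse_http_date(text):
    """Return an aware UTC datetime for any of the three formats."""
    for pattern in _HTTP_DATE_FORMATS:
        try:
            parsed = datetime.strptime(text, pattern)
        except ValueError:
            continue
        # All HTTP/1.0 date stamps are in Universal Time, including
        # the asctime form, which carries no zone abbreviation.
        return parsed.replace(tzinfo=timezone.utc)
    raise ValueError("unrecognized HTTP date: %r" % (text,))
```

A sender would only ever produce the first pattern; the other two exist here for parsing.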
- - HTTP-date = rfc1123-date | rfc850-date | asctime-date - - rfc1123-date = wkday "," SP date1 SP time SP "GMT" - rfc850-date = weekday "," SP date2 SP time SP "GMT" - asctime-date = wkday SP date3 SP time SP 4DIGIT - - date1 = 2DIGIT SP month SP 4DIGIT - ; day month year (e.g., 02 Jun 1982) - date2 = 2DIGIT "-" month "-" 2DIGIT - ; day-month-year (e.g., 02-Jun-82) - date3 = month SP ( 2DIGIT | ( SP 1DIGIT )) - ; month day (e.g., Jun 2) - - time = 2DIGIT ":" 2DIGIT ":" 2DIGIT - ; 00:00:00 - 23:59:59 - - wkday = "Mon" | "Tue" | "Wed" - | "Thu" | "Fri" | "Sat" | "Sun" - - weekday = "Monday" | "Tuesday" | "Wednesday" - | "Thursday" | "Friday" | "Saturday" | "Sunday" - - month = "Jan" | "Feb" | "Mar" | "Apr" - | "May" | "Jun" | "Jul" | "Aug" - | "Sep" | "Oct" | "Nov" | "Dec" - - Note: HTTP requirements for the date/time stamp format apply - only to their usage within the protocol stream. Clients and - servers are not required to use these formats for user - - - -Berners-Lee, et al Informational [Page 16] - -RFC 1945 HTTP/1.0 May 1996 - - - presentation, request logging, etc. - -3.4 Character Sets - - HTTP uses the same definition of the term "character set" as that - described for MIME: - - The term "character set" is used in this document to refer to a - method used with one or more tables to convert a sequence of - octets into a sequence of characters. Note that unconditional - conversion in the other direction is not required, in that not all - characters may be available in a given character set and a - character set may provide more than one sequence of octets to - represent a particular character. This definition is intended to - allow various kinds of character encodings, from simple single- - table mappings such as US-ASCII to complex table switching methods - such as those that use ISO 2022's techniques. However, the - definition associated with a MIME character set name must fully - specify the mapping to be performed from octets to characters. 
In - particular, use of external profiling information to determine the - exact mapping is not permitted. - - Note: This use of the term "character set" is more commonly - referred to as a "character encoding." However, since HTTP and - MIME share the same registry, it is important that the terminology - also be shared. - - HTTP character sets are identified by case-insensitive tokens. The - complete set of tokens are defined by the IANA Character Set registry - [15]. However, because that registry does not define a single, - consistent token for each character set, we define here the preferred - names for those character sets most likely to be used with HTTP - entities. These character sets include those registered by RFC 1521 - [5] -- the US-ASCII [17] and ISO-8859 [18] character sets -- and - other names specifically recommended for use within MIME charset - parameters. - - charset = "US-ASCII" - | "ISO-8859-1" | "ISO-8859-2" | "ISO-8859-3" - | "ISO-8859-4" | "ISO-8859-5" | "ISO-8859-6" - | "ISO-8859-7" | "ISO-8859-8" | "ISO-8859-9" - | "ISO-2022-JP" | "ISO-2022-JP-2" | "ISO-2022-KR" - | "UNICODE-1-1" | "UNICODE-1-1-UTF-7" | "UNICODE-1-1-UTF-8" - | token - - Although HTTP allows an arbitrary token to be used as a charset - value, any token that has a predefined value within the IANA - Character Set registry [15] must represent the character set defined - - - -Berners-Lee, et al Informational [Page 17] - -RFC 1945 HTTP/1.0 May 1996 - - - by that registry. Applications should limit their use of character - sets to those defined by the IANA registry. - - The character set of an entity body should be labelled as the lowest - common denominator of the character codes used within that body, with - the exception that no label is preferred over the labels US-ASCII or - ISO-8859-1. - -3.5 Content Codings - - Content coding values are used to indicate an encoding transformation - that has been applied to a resource. 
Content codings are primarily - used to allow a document to be compressed or encrypted without losing - the identity of its underlying media type. Typically, the resource is - stored in this encoding and only decoded before rendering or - analogous usage. - - content-coding = "x-gzip" | "x-compress" | token - - Note: For future compatibility, HTTP/1.0 applications should - consider "gzip" and "compress" to be equivalent to "x-gzip" - and "x-compress", respectively. - - All content-coding values are case-insensitive. HTTP/1.0 uses - content-coding values in the Content-Encoding (Section 10.3) header - field. Although the value describes the content-coding, what is more - important is that it indicates what decoding mechanism will be - required to remove the encoding. Note that a single program may be - capable of decoding multiple content-coding formats. Two values are - defined by this specification: - - x-gzip - An encoding format produced by the file compression program - "gzip" (GNU zip) developed by Jean-loup Gailly. This format is - typically a Lempel-Ziv coding (LZ77) with a 32 bit CRC. - - x-compress - The encoding format produced by the file compression program - "compress". This format is an adaptive Lempel-Ziv-Welch coding - (LZW). - - Note: Use of program names for the identification of - encoding formats is not desirable and should be discouraged - for future encodings. Their use here is representative of - historical practice, not good design. - - - - - - -Berners-Lee, et al Informational [Page 18] - -RFC 1945 HTTP/1.0 May 1996 - - -3.6 Media Types - - HTTP uses Internet Media Types [13] in the Content-Type header field - (Section 10.5) in order to provide open and extensible data typing. - - media-type = type "/" subtype *( ";" parameter ) - type = token - subtype = token - - Parameters may follow the type/subtype in the form of attribute/value - pairs. 
- - parameter = attribute "=" value - attribute = token - value = token | quoted-string - - The type, subtype, and parameter attribute names are case- - insensitive. Parameter values may or may not be case-sensitive, - depending on the semantics of the parameter name. LWS must not be - generated between the type and subtype, nor between an attribute and - its value. Upon receipt of a media type with an unrecognized - parameter, a user agent should treat the media type as if the - unrecognized parameter and its value were not present. - - Some older HTTP applications do not recognize media type parameters. - HTTP/1.0 applications should only use media type parameters when they - are necessary to define the content of a message. - - Media-type values are registered with the Internet Assigned Number - Authority (IANA [15]). The media type registration process is - outlined in RFC 1590 [13]. Use of non-registered media types is - discouraged. - -3.6.1 Canonicalization and Text Defaults - - Internet media types are registered with a canonical form. In - general, an Entity-Body transferred via HTTP must be represented in - the appropriate canonical form prior to its transmission. If the body - has been encoded with a Content-Encoding, the underlying data should - be in canonical form prior to being encoded. - - Media subtypes of the "text" type use CRLF as the text line break - when in canonical form. However, HTTP allows the transport of text - media with plain CR or LF alone representing a line break when used - consistently within the Entity-Body. HTTP applications must accept - CRLF, bare CR, and bare LF as being representative of a line break in - text media received via HTTP. 
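The line break tolerance described above can be sketched in Python as a splitter that treats CRLF, bare CR, and bare LF alike. The function name split_text_lines is hypothetical, and the sketch assumes a character set in which octets 13 and 10 represent CR and LF.

```python
import re

# CRLF is listed first so that it is consumed as one break rather
# than as a bare CR followed by a bare LF.
_LINE_BREAK_RE = re.compile(rb"\r\n|\r|\n")


def split_text_lines(body):
    """Split a text Entity-Body (bytes) on CRLF, bare CR, or bare LF."""
    return _LINE_BREAK_RE.split(body)
```

This applies only to text media in the Entity-Body; header fields and other control structures still require CRLF.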
- - - - -Berners-Lee, et al Informational [Page 19] - -RFC 1945 HTTP/1.0 May 1996 - - - In addition, if the text media is represented in a character set that - does not use octets 13 and 10 for CR and LF respectively, as is the - case for some multi-byte character sets, HTTP allows the use of - whatever octet sequences are defined by that character set to - represent the equivalent of CR and LF for line breaks. This - flexibility regarding line breaks applies only to text media in the - Entity-Body; a bare CR or LF should not be substituted for CRLF - within any of the HTTP control structures (such as header fields and - multipart boundaries). - - The "charset" parameter is used with some media types to define the - character set (Section 3.4) of the data. When no explicit charset - parameter is provided by the sender, media subtypes of the "text" - type are defined to have a default charset value of "ISO-8859-1" when - received via HTTP. Data in character sets other than "ISO-8859-1" or - its subsets must be labelled with an appropriate charset value in - order to be consistently interpreted by the recipient. - - Note: Many current HTTP servers provide data using charsets other - than "ISO-8859-1" without proper labelling. This situation reduces - interoperability and is not recommended. To compensate for this, - some HTTP user agents provide a configuration option to allow the - user to change the default interpretation of the media type - character set when no charset parameter is given. - -3.6.2 Multipart Types - - MIME provides for a number of "multipart" types -- encapsulations of - several entities within a single message's Entity-Body. The multipart - types registered by IANA [15] do not have any special meaning for - HTTP/1.0, though user agents may need to understand each type in - order to correctly interpret the purpose of each body-part. 
An HTTP - user agent should follow the same or similar behavior as a MIME user - agent does upon receipt of a multipart type. HTTP servers should not - assume that all HTTP clients are prepared to handle multipart types. - - All multipart types share a common syntax and must include a boundary - parameter as part of the media type value. The message body is itself - a protocol element and must therefore use only CRLF to represent line - breaks between body-parts. Multipart body-parts may contain HTTP - header fields which are significant to the meaning of that part. - -3.7 Product Tokens - - Product tokens are used to allow communicating applications to - identify themselves via a simple product token, with an optional - slash and version designator. Most fields using product tokens also - allow subproducts which form a significant part of the application to - - - -Berners-Lee, et al Informational [Page 20] - -RFC 1945 HTTP/1.0 May 1996 - - - be listed, separated by whitespace. By convention, the products are - listed in order of their significance for identifying the - application. - - product = token ["/" product-version] - product-version = token - - Examples: - - User-Agent: CERN-LineMode/2.15 libwww/2.17b3 - - Server: Apache/0.8.4 - - Product tokens should be short and to the point -- use of them for - advertizing or other non-essential information is explicitly - forbidden. Although any token character may appear in a product- - version, this token should only be used for a version identifier - (i.e., successive versions of the same product should only differ in - the product-version portion of the product value). - -4. HTTP Message - -4.1 Message Types - - HTTP messages consist of requests from client to server and responses - from server to client. 
- - HTTP-message = Simple-Request ; HTTP/0.9 messages - | Simple-Response - | Full-Request ; HTTP/1.0 messages - | Full-Response - - Full-Request and Full-Response use the generic message format of RFC - 822 [7] for transferring entities. Both messages may include optional - header fields (also known as "headers") and an entity body. The - entity body is separated from the headers by a null line (i.e., a - line with nothing preceding the CRLF). - - Full-Request = Request-Line ; Section 5.1 - *( General-Header ; Section 4.3 - | Request-Header ; Section 5.2 - | Entity-Header ) ; Section 7.1 - CRLF - [ Entity-Body ] ; Section 7.2 - - Full-Response = Status-Line ; Section 6.1 - *( General-Header ; Section 4.3 - | Response-Header ; Section 6.2 - - - -Berners-Lee, et al Informational [Page 21] - -RFC 1945 HTTP/1.0 May 1996 - - - | Entity-Header ) ; Section 7.1 - CRLF - [ Entity-Body ] ; Section 7.2 - - Simple-Request and Simple-Response do not allow the use of any header - information and are limited to a single request method (GET). - - Simple-Request = "GET" SP Request-URI CRLF - - Simple-Response = [ Entity-Body ] - - Use of the Simple-Request format is discouraged because it prevents - the server from identifying the media type of the returned entity. - -4.2 Message Headers - - HTTP header fields, which include General-Header (Section 4.3), - Request-Header (Section 5.2), Response-Header (Section 6.2), and - Entity-Header (Section 7.1) fields, follow the same generic format as - that given in Section 3.1 of RFC 822 [7]. Each header field consists - of a name followed immediately by a colon (":"), a single space (SP) - character, and the field value. Field names are case-insensitive. - Header fields can be extended over multiple lines by preceding each - extra line with at least one SP or HT, though this is not - recommended. 
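The header conventions in the preceding paragraph might be handled as in the Python sketch below. It lowercases field names, since they are case-insensitive, and joins continuation lines that begin with SP or HT onto the previous field. The function name parse_header_lines is hypothetical, and joining folded lines with a single space is a choice made for this sketch.

```python
def parse_header_lines(lines):
    """Parse header lines (CRLF already removed) into (name, value) pairs."""
    fields = []
    for line in lines:
        if line[:1] in (" ", "\t"):
            # A continuation line extends the previous field value.
            if not fields:
                raise ValueError("continuation line with no preceding field")
            name, value = fields[-1]
            fields[-1] = (name, value + " " + line.strip())
        else:
            name, separator, value = line.partition(":")
            if not separator:
                raise ValueError("missing colon in header line: %r" % (line,))
            # Field names are case-insensitive, so normalize them.
            fields.append((name.lower(), value.strip()))
    return fields
```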
- - HTTP-header = field-name ":" [ field-value ] CRLF - - field-name = token - field-value = *( field-content | LWS ) - - field-content = <the OCTETs making up the field-value - and consisting of either *TEXT or combinations - of token, tspecials, and quoted-string> - - The order in which header fields are received is not significant. - However, it is "good practice" to send General-Header fields first, - followed by Request-Header or Response-Header fields prior to the - Entity-Header fields. - - Multiple HTTP-header fields with the same field-name may be present - in a message if and only if the entire field-value for that header - field is defined as a comma-separated list [i.e., #(values)]. It must - be possible to combine the multiple header fields into one "field- - name: field-value" pair, without changing the semantics of the - message, by appending each subsequent field-value to the first, each - separated by a comma. - - - - -Berners-Lee, et al Informational [Page 22] - -RFC 1945 HTTP/1.0 May 1996 - - -4.3 General Header Fields - - There are a few header fields which have general applicability for - both request and response messages, but which do not apply to the - entity being transferred. These headers apply only to the message - being transmitted. - - General-Header = Date ; Section 10.6 - | Pragma ; Section 10.12 - - General header field names can be extended reliably only in - combination with a change in the protocol version. However, new or - experimental header fields may be given the semantics of general - header fields if all parties in the communication recognize them to - be general header fields. Unrecognized header fields are treated as - Entity-Header fields. - -5. Request - - A request message from a client to a server includes, within the - first line of that message, the method to be applied to the resource, - the identifier of the resource, and the protocol version in use. 
For - backwards compatibility with the more limited HTTP/0.9 protocol, - there are two valid formats for an HTTP request: - - Request = Simple-Request | Full-Request - - Simple-Request = "GET" SP Request-URI CRLF - - Full-Request = Request-Line ; Section 5.1 - *( General-Header ; Section 4.3 - | Request-Header ; Section 5.2 - | Entity-Header ) ; Section 7.1 - CRLF - [ Entity-Body ] ; Section 7.2 - - If an HTTP/1.0 server receives a Simple-Request, it must respond with - an HTTP/0.9 Simple-Response. An HTTP/1.0 client capable of receiving - a Full-Response should never generate a Simple-Request. - -5.1 Request-Line - - The Request-Line begins with a method token, followed by the - Request-URI and the protocol version, and ending with CRLF. The - elements are separated by SP characters. No CR or LF are allowed - except in the final CRLF sequence. - - Request-Line = Method SP Request-URI SP HTTP-Version CRLF - - - -Berners-Lee, et al Informational [Page 23] - -RFC 1945 HTTP/1.0 May 1996 - - - Note that the difference between a Simple-Request and the Request- - Line of a Full-Request is the presence of the HTTP-Version field and - the availability of methods other than GET. - -5.1.1 Method - - The Method token indicates the method to be performed on the resource - identified by the Request-URI. The method is case-sensitive. - - Method = "GET" ; Section 8.1 - | "HEAD" ; Section 8.2 - | "POST" ; Section 8.3 - | extension-method - - extension-method = token - - The list of methods acceptable by a specific resource can change - dynamically; the client is notified through the return code of the - response if a method is not allowed on a resource. Servers should - return the status code 501 (not implemented) if the method is - unrecognized or not implemented. - - The methods commonly used by HTTP/1.0 applications are fully defined - in Section 8. 
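A server-side reading of the two request formats can be sketched in Python as follows. The function name parse_request_line is hypothetical. It returns None as the version for a Simple-Request, and it compares the method exactly, since the method is case-sensitive.

```python
def parse_request_line(line):
    """Return (method, request_uri, http_version).

    http_version is None when the line is an HTTP/0.9 Simple-Request.
    """
    if not line.endswith("\r\n"):
        raise ValueError("request line must end with CRLF")
    content = line[:-2]
    # No CR or LF is allowed except in the final CRLF sequence.
    if "\r" in content or "\n" in content:
        raise ValueError("stray CR or LF in request line")
    parts = content.split(" ")
    if "" in parts:
        raise ValueError("elements must be separated by a single SP")
    if len(parts) == 2 and parts[0] == "GET":
        # Simple-Request = "GET" SP Request-URI CRLF
        return "GET", parts[1], None
    if len(parts) == 3 and parts[2].startswith("HTTP/"):
        # Request-Line = Method SP Request-URI SP HTTP-Version CRLF
        return parts[0], parts[1], parts[2]
    raise ValueError("malformed request line: %r" % (line,))
```

A caller would then answer a Simple-Request with a Simple-Response, and could return 501 for a method it does not recognize.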
- -5.1.2 Request-URI - - The Request-URI is a Uniform Resource Identifier (Section 3.2) and - identifies the resource upon which to apply the request. - - Request-URI = absoluteURI | abs_path - - The two options for Request-URI are dependent on the nature of the - request. - - The absoluteURI form is only allowed when the request is being made - to a proxy. The proxy is requested to forward the request and return - the response. If the request is GET or HEAD and a prior response is - cached, the proxy may use the cached message if it passes any - restrictions in the Expires header field. Note that the proxy may - forward the request on to another proxy or directly to the server - specified by the absoluteURI. In order to avoid request loops, a - proxy must be able to recognize all of its server names, including - any aliases, local variations, and the numeric IP address. An example - Request-Line would be: - - GET http://www.w3.org/pub/WWW/TheProject.html HTTP/1.0 - - - - -Berners-Lee, et al Informational [Page 24] - -RFC 1945 HTTP/1.0 May 1996 - - - The most common form of Request-URI is that used to identify a - resource on an origin server or gateway. In this case, only the - absolute path of the URI is transmitted (see Section 3.2.1, - abs_path). For example, a client wishing to retrieve the resource - above directly from the origin server would create a TCP connection - to port 80 of the host "www.w3.org" and send the line: - - GET /pub/WWW/TheProject.html HTTP/1.0 - - followed by the remainder of the Full-Request. Note that the absolute - path cannot be empty; if none is present in the original URI, it must - be given as "/" (the server root). - - The Request-URI is transmitted as an encoded string, where some - characters may be escaped using the "% HEX HEX" encoding defined by - RFC 1738 [4]. The origin server must decode the Request-URI in order - to properly interpret the request. 
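The decoding step can be sketched in Python with the standard library's urllib.parse.unquote_to_bytes. The function name decode_request_path is hypothetical. The sketch separates the query before decoding so that an escaped "?" in the path is not mistaken for the query delimiter, and it returns the path as bytes because this specification does not assign a character set to decoded octets.

```python
from urllib.parse import unquote_to_bytes


def decode_request_path(request_uri):
    """Return (decoded_path_bytes, raw_query) for an abs_path Request-URI."""
    # Split on the first literal "?" before decoding any escapes.
    path, _, query = request_uri.partition("?")
    # Each "%" HEX HEX triplet becomes the single octet it encodes.
    return unquote_to_bytes(path), query
```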
- -5.2 Request Header Fields - - The request header fields allow the client to pass additional - information about the request, and about the client itself, to the - server. These fields act as request modifiers, with semantics - equivalent to the parameters on a programming language method - (procedure) invocation. - - Request-Header = Authorization ; Section 10.2 - | From ; Section 10.8 - | If-Modified-Since ; Section 10.9 - | Referer ; Section 10.13 - | User-Agent ; Section 10.15 - - Request-Header field names can be extended reliably only in - combination with a change in the protocol version. However, new or - experimental header fields may be given the semantics of request - header fields if all parties in the communication recognize them to - be request header fields. Unrecognized header fields are treated as - Entity-Header fields. - -6. Response - - After receiving and interpreting a request message, a server responds - in the form of an HTTP response message. - - Response = Simple-Response | Full-Response - - Simple-Response = [ Entity-Body ] - - - - -Berners-Lee, et al Informational [Page 25] - -RFC 1945 HTTP/1.0 May 1996 - - - Full-Response = Status-Line ; Section 6.1 - *( General-Header ; Section 4.3 - | Response-Header ; Section 6.2 - | Entity-Header ) ; Section 7.1 - CRLF - [ Entity-Body ] ; Section 7.2 - - A Simple-Response should only be sent in response to an HTTP/0.9 - Simple-Request or if the server only supports the more limited - HTTP/0.9 protocol. If a client sends an HTTP/1.0 Full-Request and - receives a response that does not begin with a Status-Line, it should - assume that the response is a Simple-Response and parse it - accordingly. Note that the Simple-Response consists only of the - entity body and is terminated by the server closing the connection. 
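A client can tell the two response formats apart by testing whether the data begins with a protocol version and status code. The Python sketch below checks for the prefix "HTTP/" 1*DIGIT "." 1*DIGIT SP 3DIGIT SP; the function name is_full_response is hypothetical.

```python
import re

# "HTTP/" 1*DIGIT "." 1*DIGIT SP 3DIGIT SP
_STATUS_PREFIX_RE = re.compile(rb"HTTP/[0-9]+\.[0-9]+ [0-9]{3} ")


def is_full_response(data):
    """True if the response bytes begin like a Status-Line."""
    # re.match anchors at the start of the data, which is where a
    # Status-Line must appear.
    return _STATUS_PREFIX_RE.match(data) is not None
```

Anything that fails this test would be treated as a Simple-Response, that is, as an entity body that ends when the server closes the connection.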
- -6.1 Status-Line - - The first line of a Full-Response message is the Status-Line, - consisting of the protocol version followed by a numeric status code - and its associated textual phrase, with each element separated by SP - characters. No CR or LF is allowed except in the final CRLF sequence. - - Status-Line = HTTP-Version SP Status-Code SP Reason-Phrase CRLF - - Since a status line always begins with the protocol version and - status code - - "HTTP/" 1*DIGIT "." 1*DIGIT SP 3DIGIT SP - - (e.g., "HTTP/1.0 200 "), the presence of that expression is - sufficient to differentiate a Full-Response from a Simple-Response. - Although the Simple-Response format may allow such an expression to - occur at the beginning of an entity body, and thus cause a - misinterpretation of the message if it was given in response to a - Full-Request, most HTTP/0.9 servers are limited to responses of type - "text/html" and therefore would never generate such a response. - -6.1.1 Status Code and Reason Phrase - - The Status-Code element is a 3-digit integer result code of the - attempt to understand and satisfy the request. The Reason-Phrase is - intended to give a short textual description of the Status-Code. The - Status-Code is intended for use by automata and the Reason-Phrase is - intended for the human user. The client is not required to examine or - display the Reason-Phrase. - - - - - - -Berners-Lee, et al Informational [Page 26] - -RFC 1945 HTTP/1.0 May 1996 - - - The first digit of the Status-Code defines the class of response. The - last two digits do not have any categorization role. There are 5 - values for the first digit: - - o 1xx: Informational - Not used, but reserved for future use - - o 2xx: Success - The action was successfully received, - understood, and accepted. 
- - o 3xx: Redirection - Further action must be taken in order to - complete the request - - o 4xx: Client Error - The request contains bad syntax or cannot - be fulfilled - - o 5xx: Server Error - The server failed to fulfill an apparently - valid request - - The individual values of the numeric status codes defined for - HTTP/1.0, and an example set of corresponding Reason-Phrase's, are - presented below. The reason phrases listed here are only recommended - -- they may be replaced by local equivalents without affecting the - protocol. These codes are fully defined in Section 9. - - Status-Code = "200" ; OK - | "201" ; Created - | "202" ; Accepted - | "204" ; No Content - | "301" ; Moved Permanently - | "302" ; Moved Temporarily - | "304" ; Not Modified - | "400" ; Bad Request - | "401" ; Unauthorized - | "403" ; Forbidden - | "404" ; Not Found - | "500" ; Internal Server Error - | "501" ; Not Implemented - | "502" ; Bad Gateway - | "503" ; Service Unavailable - | extension-code - - extension-code = 3DIGIT - - Reason-Phrase = *<TEXT, excluding CR, LF> - - HTTP status codes are extensible, but the above codes are the only - ones generally recognized in current practice. HTTP applications are - not required to understand the meaning of all registered status - - - -Berners-Lee, et al Informational [Page 27] - -RFC 1945 HTTP/1.0 May 1996 - - - codes, though such understanding is obviously desirable. However, - applications must understand the class of any status code, as - indicated by the first digit, and treat any unrecognized response as - being equivalent to the x00 status code of that class, with the - exception that an unrecognized response must not be cached. For - example, if an unrecognized status code of 431 is received by the - client, it can safely assume that there was something wrong with its - request and treat the response as if it had received a 400 status - code. 
In such cases, user agents should present to the user the - entity returned with the response, since that entity is likely to - include human-readable information which will explain the unusual - status. - -6.2 Response Header Fields - - The response header fields allow the server to pass additional - information about the response which cannot be placed in the Status- - Line. These header fields give information about the server and about - further access to the resource identified by the Request-URI. - - Response-Header = Location ; Section 10.11 - | Server ; Section 10.14 - | WWW-Authenticate ; Section 10.16 - - Response-Header field names can be extended reliably only in - combination with a change in the protocol version. However, new or - experimental header fields may be given the semantics of response - header fields if all parties in the communication recognize them to - be response header fields. Unrecognized header fields are treated as - Entity-Header fields. - -7. Entity - - Full-Request and Full-Response messages may transfer an entity within - some requests and responses. An entity consists of Entity-Header - fields and (usually) an Entity-Body. In this section, both sender and - recipient refer to either the client or the server, depending on who - sends and who receives the entity. - - - - - - - - - - - - - -Berners-Lee, et al Informational [Page 28] - -RFC 1945 HTTP/1.0 May 1996 - - -7.1 Entity Header Fields - - Entity-Header fields define optional metainformation about the - Entity-Body or, if no body is present, about the resource identified - by the request. 
- - Entity-Header = Allow ; Section 10.1 - | Content-Encoding ; Section 10.3 - | Content-Length ; Section 10.4 - | Content-Type ; Section 10.5 - | Expires ; Section 10.7 - | Last-Modified ; Section 10.10 - | extension-header - - extension-header = HTTP-header - - The extension-header mechanism allows additional Entity-Header fields - to be defined without changing the protocol, but these fields cannot - be assumed to be recognizable by the recipient. Unrecognized header - fields should be ignored by the recipient and forwarded by proxies. - -7.2 Entity Body - - The entity body (if any) sent with an HTTP request or response is in - a format and encoding defined by the Entity-Header fields. - - Entity-Body = *OCTET - - An entity body is included with a request message only when the - request method calls for one. The presence of an entity body in a - request is signaled by the inclusion of a Content-Length header field - in the request message headers. HTTP/1.0 requests containing an - entity body must include a valid Content-Length header field. - - For response messages, whether or not an entity body is included with - a message is dependent on both the request method and the response - code. All responses to the HEAD request method must not include a - body, even though the presence of entity header fields may lead one - to believe they do. All 1xx (informational), 204 (no content), and - 304 (not modified) responses must not include a body. All other - responses must include an entity body or a Content-Length header - field defined with a value of zero (0). - -7.2.1 Type - - When an Entity-Body is included with a message, the data type of that - body is determined via the header fields Content-Type and Content- - Encoding. 
These define a two-layer, ordered encoding model: - - - -Berners-Lee, et al Informational [Page 29] - -RFC 1945 HTTP/1.0 May 1996 - - - entity-body := Content-Encoding( Content-Type( data ) ) - - A Content-Type specifies the media type of the underlying data. A - Content-Encoding may be used to indicate any additional content - coding applied to the type, usually for the purpose of data - compression, that is a property of the resource requested. The - default for the content encoding is none (i.e., the identity - function). - - Any HTTP/1.0 message containing an entity body should include a - Content-Type header field defining the media type of that body. If - and only if the media type is not given by a Content-Type header, as - is the case for Simple-Response messages, the recipient may attempt - to guess the media type via inspection of its content and/or the name - extension(s) of the URL used to identify the resource. If the media - type remains unknown, the recipient should treat it as type - "application/octet-stream". - -7.2.2 Length - - When an Entity-Body is included with a message, the length of that - body may be determined in one of two ways. If a Content-Length header - field is present, its value in bytes represents the length of the - Entity-Body. Otherwise, the body length is determined by the closing - of the connection by the server. - - Closing the connection cannot be used to indicate the end of a - request body, since it leaves no possibility for the server to send - back a response. Therefore, HTTP/1.0 requests containing an entity - body must include a valid Content-Length header field. If a request - contains an entity body and Content-Length is not specified, and the - server does not recognize or cannot calculate the length from other - fields, then the server should send a 400 (bad request) response. 
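The two ways of determining the length of a response body can be sketched in Python as follows. The function name read_response_body is hypothetical. The sketch assumes a buffered binary file object (for example one obtained from socket.makefile("rb")) and a headers mapping whose field names have already been lowercased.

```python
import io


def read_response_body(stream, headers):
    """Read an Entity-Body from a buffered binary stream."""
    declared = headers.get("content-length")
    if declared is None:
        # No Content-Length: the body ends when the server closes
        # the connection, so read until end of stream.
        return stream.read()
    length = int(declared)
    if length < 0:
        raise ValueError("negative Content-Length")
    body = stream.read(length)
    if len(body) != length:
        # The connection closed before the declared length arrived.
        raise EOFError("body shorter than Content-Length")
    return body
```

The close-delimited branch applies to responses only; a request that carries a body has to declare its length.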
- - Note: Some older servers supply an invalid Content-Length when - sending a document that contains server-side includes dynamically - inserted into the data stream. It must be emphasized that this - will not be tolerated by future versions of HTTP. Unless the - client knows that it is receiving a response from a compliant - server, it should not depend on the Content-Length value being - correct. - -8. Method Definitions - - The set of common methods for HTTP/1.0 is defined below. Although - this set can be expanded, additional methods cannot be assumed to - share the same semantics for separately extended clients and servers. - - - - -Berners-Lee, et al Informational [Page 30] - -RFC 1945 HTTP/1.0 May 1996 - - -8.1 GET - - The GET method means retrieve whatever information (in the form of an - entity) is identified by the Request-URI. If the Request-URI refers - to a data-producing process, it is the produced data which shall be - returned as the entity in the response and not the source text of the - process, unless that text happens to be the output of the process. - - The semantics of the GET method changes to a "conditional GET" if the - request message includes an If-Modified-Since header field. A - conditional GET method requests that the identified resource be - transferred only if it has been modified since the date given by the - If-Modified-Since header, as described in Section 10.9. The - conditional GET method is intended to reduce network usage by - allowing cached entities to be refreshed without requiring multiple - requests or transferring unnecessary data. - -8.2 HEAD - - The HEAD method is identical to GET except that the server must not - return any Entity-Body in the response. The metainformation contained - in the HTTP headers in response to a HEAD request should be identical - to the information sent in response to a GET request. 
This method can - be used for obtaining metainformation about the resource identified - by the Request-URI without transferring the Entity-Body itself. This - method is often used for testing hypertext links for validity, - accessibility, and recent modification. - - There is no "conditional HEAD" request analogous to the conditional - GET. If an If-Modified-Since header field is included with a HEAD - request, it should be ignored. - -8.3 POST - - The POST method is used to request that the destination server accept - the entity enclosed in the request as a new subordinate of the - resource identified by the Request-URI in the Request-Line. POST is - designed to allow a uniform method to cover the following functions: - - o Annotation of existing resources; - - o Posting a message to a bulletin board, newsgroup, mailing list, - or similar group of articles; - - o Providing a block of data, such as the result of submitting a - form [3], to a data-handling process; - - o Extending a database through an append operation. - - - -Berners-Lee, et al Informational [Page 31] - -RFC 1945 HTTP/1.0 May 1996 - - - The actual function performed by the POST method is determined by the - server and is usually dependent on the Request-URI. The posted entity - is subordinate to that URI in the same way that a file is subordinate - to a directory containing it, a news article is subordinate to a - newsgroup to which it is posted, or a record is subordinate to a - database. - - A successful POST does not require that the entity be created as a - resource on the origin server or made accessible for future - reference. That is, the action performed by the POST method might not - result in a resource that can be identified by a URI. In this case, - either 200 (ok) or 204 (no content) is the appropriate response - status, depending on whether or not the response includes an entity - that describes the result. 
- - If a resource has been created on the origin server, the response - should be 201 (created) and contain an entity (preferably of type - "text/html") which describes the status of the request and refers to - the new resource. - - A valid Content-Length is required on all HTTP/1.0 POST requests. An - HTTP/1.0 server should respond with a 400 (bad request) message if it - cannot determine the length of the request message's content. - - Applications must not cache responses to a POST request because the - application has no way of knowing that the server would return an - equivalent response on some future request. - -9. Status Code Definitions - - Each Status-Code is described below, including a description of which - method(s) it can follow and any metainformation required in the - response. - -9.1 Informational 1xx - - This class of status code indicates a provisional response, - consisting only of the Status-Line and optional headers, and is - terminated by an empty line. HTTP/1.0 does not define any 1xx status - codes and they are not a valid response to a HTTP/1.0 request. - However, they may be useful for experimental applications which are - outside the scope of this specification. - -9.2 Successful 2xx - - This class of status code indicates that the client's request was - successfully received, understood, and accepted. - - - - -Berners-Lee, et al Informational [Page 32] - -RFC 1945 HTTP/1.0 May 1996 - - - 200 OK - - The request has succeeded. The information returned with the - response is dependent on the method used in the request, as follows: - - GET an entity corresponding to the requested resource is sent - in the response; - - HEAD the response must only contain the header information and - no Entity-Body; - - POST an entity describing or containing the result of the action. - - 201 Created - - The request has been fulfilled and resulted in a new resource being - created. 
The newly created resource can be referenced by the URI(s) - returned in the entity of the response. The origin server should - create the resource before using this Status-Code. If the action - cannot be carried out immediately, the server must include in the - response body a description of when the resource will be available; - otherwise, the server should respond with 202 (accepted). - - Of the methods defined by this specification, only POST can create a - resource. - - 202 Accepted - - The request has been accepted for processing, but the processing - has not been completed. The request may or may not eventually be - acted upon, as it may be disallowed when processing actually takes - place. There is no facility for re-sending a status code from an - asynchronous operation such as this. - - The 202 response is intentionally non-committal. Its purpose is to - allow a server to accept a request for some other process (perhaps - a batch-oriented process that is only run once per day) without - requiring that the user agent's connection to the server persist - until the process is completed. The entity returned with this - response should include an indication of the request's current - status and either a pointer to a status monitor or some estimate of - when the user can expect the request to be fulfilled. - - 204 No Content - - The server has fulfilled the request but there is no new - information to send back. If the client is a user agent, it should - not change its document view from that which caused the request to - - - -Berners-Lee, et al Informational [Page 33] - -RFC 1945 HTTP/1.0 May 1996 - - - be generated. This response is primarily intended to allow input - for scripts or other actions to take place without causing a change - to the user agent's active document view. The response may include - new metainformation in the form of entity headers, which should - apply to the document currently in the user agent's active view. 
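Taken together with the rule in Section 7.2, the status codes above determine whether a client should expect an Entity-Body at all. The following Python sketch is one possible reading of those rules; the function name is hypothetical and the method comparison is case-sensitive on the assumption that method tokens are case-sensitive.

```python
def response_may_have_body(request_method, status_code):
    """Return False when the text forbids an Entity-Body in the response."""
    if request_method == "HEAD":
        # Responses to HEAD never carry a body, whatever the headers say.
        return False
    if 100 <= status_code < 200:
        # All 1xx (informational) responses are header-only.
        return False
    # 204 (no content) and 304 (not modified) must not include a body.
    return status_code not in (204, 304)
```

For example, `response_may_have_body("GET", 204)` is false, so a client would not wait for data after the header block of such a response.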
- -9.3 Redirection 3xx - - This class of status code indicates that further action needs to be - taken by the user agent in order to fulfill the request. The action - required may be carried out by the user agent without interaction - with the user if and only if the method used in the subsequent - request is GET or HEAD. A user agent should never automatically - redirect a request more than 5 times, since such redirections usually - indicate an infinite loop. - - 300 Multiple Choices - - This response code is not directly used by HTTP/1.0 applications, - but serves as the default for interpreting the 3xx class of - responses. - - The requested resource is available at one or more locations. - Unless it was a HEAD request, the response should include an entity - containing a list of resource characteristics and locations from - which the user or user agent can choose the one most appropriate. - If the server has a preferred choice, it should include the URL in - a Location field; user agents may use this field value for - automatic redirection. - - 301 Moved Permanently - - The requested resource has been assigned a new permanent URL and - any future references to this resource should be done using that - URL. Clients with link editing capabilities should automatically - relink references to the Request-URI to the new reference returned - by the server, where possible. - - The new URL must be given by the Location field in the response. - Unless it was a HEAD request, the Entity-Body of the response - should contain a short note with a hyperlink to the new URL. - - If the 301 status code is received in response to a request using - the POST method, the user agent must not automatically redirect the - request unless it can be confirmed by the user, since this might - change the conditions under which the request was issued. 
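The redirection rules above (automatic redirection only for GET or HEAD, and never more than 5 times) can be expressed as a small policy check. This is a hedged Python sketch; the function name, the constant name, and the idea of passing a running redirect count are assumptions for illustration only.

```python
MAX_AUTOMATIC_REDIRECTS = 5  # limit suggested in Section 9.3


def may_redirect_automatically(method, redirects_followed):
    """Decide whether a user agent may follow a 3xx response without
    asking the user.

    `redirects_followed` is the number of redirections already taken
    for this request chain.
    """
    if redirects_followed >= MAX_AUTOMATIC_REDIRECTS:
        # Further hops most likely indicate an infinite loop.
        return False
    # Any other method (for example POST) needs user confirmation.
    return method in ("GET", "HEAD")
```

A user agent built this way would fall back to prompting the user whenever the function returns false, rather than silently rewriting a POST into a GET.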
- - - - - -Berners-Lee, et al Informational [Page 34] - -RFC 1945 HTTP/1.0 May 1996 - - - Note: When automatically redirecting a POST request after - receiving a 301 status code, some existing user agents will - erroneously change it into a GET request. - - 302 Moved Temporarily - - The requested resource resides temporarily under a different URL. - Since the redirection may be altered on occasion, the client should - continue to use the Request-URI for future requests. - - The URL must be given by the Location field in the response. Unless - it was a HEAD request, the Entity-Body of the response should - contain a short note with a hyperlink to the new URI(s). - - If the 302 status code is received in response to a request using - the POST method, the user agent must not automatically redirect the - request unless it can be confirmed by the user, since this might - change the conditions under which the request was issued. - - Note: When automatically redirecting a POST request after - receiving a 302 status code, some existing user agents will - erroneously change it into a GET request. - - 304 Not Modified - - If the client has performed a conditional GET request and access is - allowed, but the document has not been modified since the date and - time specified in the If-Modified-Since field, the server must - respond with this status code and not send an Entity-Body to the - client. Header fields contained in the response should only include - information which is relevant to cache managers or which may have - changed independently of the entity's Last-Modified date. Examples - of relevant header fields include: Date, Server, and Expires. A - cache should update its cached entity to reflect any new field - values given in the 304 response. - -9.4 Client Error 4xx - - The 4xx class of status code is intended for cases in which the - client seems to have erred. 
If the client has not completed the - request when a 4xx code is received, it should immediately cease - sending data to the server. Except when responding to a HEAD request, - the server should include an entity containing an explanation of the - error situation, and whether it is a temporary or permanent - condition. These status codes are applicable to any request method. - - - - - - -Berners-Lee, et al Informational [Page 35] - -RFC 1945 HTTP/1.0 May 1996 - - - Note: If the client is sending data, server implementations on TCP - should be careful to ensure that the client acknowledges receipt - of the packet(s) containing the response prior to closing the - input connection. If the client continues sending data to the - server after the close, the server's controller will send a reset - packet to the client, which may erase the client's unacknowledged - input buffers before they can be read and interpreted by the HTTP - application. - - 400 Bad Request - - The request could not be understood by the server due to malformed - syntax. The client should not repeat the request without - modifications. - - 401 Unauthorized - - The request requires user authentication. The response must include - a WWW-Authenticate header field (Section 10.16) containing a - challenge applicable to the requested resource. The client may - repeat the request with a suitable Authorization header field - (Section 10.2). If the request already included Authorization - credentials, then the 401 response indicates that authorization has - been refused for those credentials. If the 401 response contains - the same challenge as the prior response, and the user agent has - already attempted authentication at least once, then the user - should be presented the entity that was given in the response, - since that entity may include relevant diagnostic information. HTTP - access authentication is explained in Section 11. 
- - 403 Forbidden - - The server understood the request, but is refusing to fulfill it. - Authorization will not help and the request should not be repeated. - If the request method was not HEAD and the server wishes to make - public why the request has not been fulfilled, it should describe - the reason for the refusal in the entity body. This status code is - commonly used when the server does not wish to reveal exactly why - the request has been refused, or when no other response is - applicable. - - 404 Not Found - - The server has not found anything matching the Request-URI. No - indication is given of whether the condition is temporary or - permanent. If the server does not wish to make this information - available to the client, the status code 403 (forbidden) can be - used instead. - - - -Berners-Lee, et al Informational [Page 36] - -RFC 1945 HTTP/1.0 May 1996 - - -9.5 Server Error 5xx - - Response status codes beginning with the digit "5" indicate cases in - which the server is aware that it has erred or is incapable of - performing the request. If the client has not completed the request - when a 5xx code is received, it should immediately cease sending data - to the server. Except when responding to a HEAD request, the server - should include an entity containing an explanation of the error - situation, and whether it is a temporary or permanent condition. - These response codes are applicable to any request method and there - are no required header fields. - - 500 Internal Server Error - - The server encountered an unexpected condition which prevented it - from fulfilling the request. - - 501 Not Implemented - - The server does not support the functionality required to fulfill - the request. This is the appropriate response when the server does - not recognize the request method and is not capable of supporting - it for any resource. 
- - 502 Bad Gateway - - The server, while acting as a gateway or proxy, received an invalid - response from the upstream server it accessed in attempting to - fulfill the request. - - 503 Service Unavailable - - The server is currently unable to handle the request due to a - temporary overloading or maintenance of the server. The implication - is that this is a temporary condition which will be alleviated - after some delay. - - Note: The existence of the 503 status code does not imply - that a server must use it when becoming overloaded. Some - servers may wish to simply refuse the connection. - -10. Header Field Definitions - - This section defines the syntax and semantics of all commonly used - HTTP/1.0 header fields. For general and entity header fields, both - sender and recipient refer to either the client or the server, - depending on who sends and who receives the message. - - - - -Berners-Lee, et al Informational [Page 37] - -RFC 1945 HTTP/1.0 May 1996 - - -10.1 Allow - - The Allow entity-header field lists the set of methods supported by - the resource identified by the Request-URI. The purpose of this field - is strictly to inform the recipient of valid methods associated with - the resource. The Allow header field is not permitted in a request - using the POST method, and thus should be ignored if it is received - as part of a POST entity. - - Allow = "Allow" ":" 1#method - - Example of use: - - Allow: GET, HEAD - - This field cannot prevent a client from trying other methods. - However, the indications given by the Allow header field value should - be followed. The actual set of allowed methods is defined by the - origin server at the time of each request. - - A proxy must not modify the Allow header field even if it does not - understand all the methods specified, since the user agent may have - other means of communicating with the origin server. - - The Allow header field does not indicate what methods are implemented - by the server. 
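As an illustration of the `1#method` syntax, the Python sketch below splits an Allow field value into its methods. The function name is hypothetical. It assumes the list rule of the specification, under which elements are separated by commas and optional whitespace and empty elements do not count.

```python
def parse_allow(field_value):
    """Split an Allow field value such as "GET, HEAD" into a list of methods."""
    methods = [item.strip() for item in field_value.split(",")]
    # Empty list elements are tolerated but do not count as methods.
    methods = [item for item in methods if item]
    if not methods:
        raise ValueError("Allow requires at least one method")
    return methods
```

Since the field is purely informational, a client might use the result to warn the user before trying a method that is not listed, without treating the list as a hard restriction.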
- -10.2 Authorization - - A user agent that wishes to authenticate itself with a server-- - usually, but not necessarily, after receiving a 401 response--may do - so by including an Authorization request-header field with the - request. The Authorization field value consists of credentials - containing the authentication information of the user agent for the - realm of the resource being requested. - - Authorization = "Authorization" ":" credentials - - HTTP access authentication is described in Section 11. If a request - is authenticated and a realm specified, the same credentials should - be valid for all other requests within this realm. - - Responses to requests containing an Authorization field are not - cachable. - - - - - - - -Berners-Lee, et al Informational [Page 38] - -RFC 1945 HTTP/1.0 May 1996 - - -10.3 Content-Encoding - - The Content-Encoding entity-header field is used as a modifier to the - media-type. When present, its value indicates what additional content - coding has been applied to the resource, and thus what decoding - mechanism must be applied in order to obtain the media-type - referenced by the Content-Type header field. The Content-Encoding is - primarily used to allow a document to be compressed without losing - the identity of its underlying media type. - - Content-Encoding = "Content-Encoding" ":" content-coding - - Content codings are defined in Section 3.5. An example of its use is - - Content-Encoding: x-gzip - - The Content-Encoding is a characteristic of the resource identified - by the Request-URI. Typically, the resource is stored with this - encoding and is only decoded before rendering or analogous usage. - -10.4 Content-Length - - The Content-Length entity-header field indicates the size of the - Entity-Body, in decimal number of octets, sent to the recipient or, - in the case of the HEAD method, the size of the Entity-Body that - would have been sent had the request been a GET. 
- - Content-Length = "Content-Length" ":" 1*DIGIT - - An example is - - Content-Length: 3495 - - Applications should use this field to indicate the size of the - Entity-Body to be transferred, regardless of the media type of the - entity. A valid Content-Length field value is required on all - HTTP/1.0 request messages containing an entity body. - - Any Content-Length greater than or equal to zero is a valid value. - Section 7.2.2 describes how to determine the length of a response - entity body if a Content-Length is not given. - - Note: The meaning of this field is significantly different from - the corresponding definition in MIME, where it is an optional - field used within the "message/external-body" content-type. In - HTTP, it should be used whenever the entity's length can be - determined prior to being transferred. - - - - -Berners-Lee, et al Informational [Page 39] - -RFC 1945 HTTP/1.0 May 1996 - - -10.5 Content-Type - - The Content-Type entity-header field indicates the media type of the - Entity-Body sent to the recipient or, in the case of the HEAD method, - the media type that would have been sent had the request been a GET. - - Content-Type = "Content-Type" ":" media-type - - Media types are defined in Section 3.6. An example of the field is - - Content-Type: text/html - - Further discussion of methods for identifying the media type of an - entity is provided in Section 7.2.1. - -10.6 Date - - The Date general-header field represents the date and time at which - the message was originated, having the same semantics as orig-date in - RFC 822. The field value is an HTTP-date, as described in Section - 3.3. - - Date = "Date" ":" HTTP-date - - An example is - - Date: Tue, 15 Nov 1994 08:12:31 GMT - - If a message is received via direct connection with the user agent - (in the case of requests) or the origin server (in the case of - responses), then the date can be assumed to be the current date at - the receiving end. 
However, since the date--as it is believed by the - origin--is important for evaluating cached responses, origin servers - should always include a Date header. Clients should only send a Date - header field in messages that include an entity body, as in the case - of the POST request, and even then it is optional. A received message - which does not have a Date header field should be assigned one by the - recipient if the message will be cached by that recipient or - gatewayed via a protocol which requires a Date. - - In theory, the date should represent the moment just before the - entity is generated. In practice, the date can be generated at any - time during the message origination without affecting its semantic - value. - - Note: An earlier version of this document incorrectly specified - that this field should contain the creation date of the enclosed - Entity-Body. This has been changed to reflect actual (and proper) - - - -Berners-Lee, et al Informational [Page 40] - -RFC 1945 HTTP/1.0 May 1996 - - - usage. - -10.7 Expires - - The Expires entity-header field gives the date/time after which the - entity should be considered stale. This allows information providers - to suggest the volatility of the resource, or a date after which the - information may no longer be valid. Applications must not cache this - entity beyond the date given. The presence of an Expires field does - not imply that the original resource will change or cease to exist - at, before, or after that time. However, information providers that - know or even suspect that a resource will change by a certain date - should include an Expires header with that date. The format is an - absolute date and time as defined by HTTP-date in Section 3.3. - - Expires = "Expires" ":" HTTP-date - - An example of its use is - - Expires: Thu, 01 Dec 1994 16:00:00 GMT - - If the date given is equal to or earlier than the value of the Date - header, the recipient must not cache the enclosed entity. 
If a - resource is dynamic by nature, as is the case with many data- - producing processes, entities from that resource should be given an - appropriate Expires value which reflects that dynamism. - - The Expires field cannot be used to force a user agent to refresh its - display or reload a resource; its semantics apply only to caching - mechanisms, and such mechanisms need only check a resource's - expiration status when a new request for that resource is initiated. - - User agents often have history mechanisms, such as "Back" buttons and - history lists, which can be used to redisplay an entity retrieved - earlier in a session. By default, the Expires field does not apply to - history mechanisms. If the entity is still in storage, a history - mechanism should display it even if the entity has expired, unless - the user has specifically configured the agent to refresh expired - history documents. - - Note: Applications are encouraged to be tolerant of bad or - misinformed implementations of the Expires header. A value of zero - (0) or an invalid date format should be considered equivalent to - an "expires immediately." Although these values are not legitimate - for HTTP/1.0, a robust implementation is always desirable. - - - - - - -Berners-Lee, et al Informational [Page 41] - -RFC 1945 HTTP/1.0 May 1996 - - -10.8 From - - The From request-header field, if given, should contain an Internet - e-mail address for the human user who controls the requesting user - agent. The address should be machine-usable, as defined by mailbox in - RFC 822 [7] (as updated by RFC 1123 [6]): - - From = "From" ":" mailbox - - An example is: - - From: webmaster@w3.org - - This header field may be used for logging purposes and as a means for - identifying the source of invalid or unwanted requests. It should not - be used as an insecure form of access protection. 
The interpretation - of this field is that the request is being performed on behalf of the - person given, who accepts responsibility for the method performed. In - particular, robot agents should include this header so that the - person responsible for running the robot can be contacted if problems - occur on the receiving end. - - The Internet e-mail address in this field may be separate from the - Internet host which issued the request. For example, when a request - is passed through a proxy, the original issuer's address should be - used. - - Note: The client should not send the From header field without the - user's approval, as it may conflict with the user's privacy - interests or their site's security policy. It is strongly - recommended that the user be able to disable, enable, and modify - the value of this field at any time prior to a request. - -10.9 If-Modified-Since - - The If-Modified-Since request-header field is used with the GET - method to make it conditional: if the requested resource has not been - modified since the time specified in this field, a copy of the - resource will not be returned from the server; instead, a 304 (not - modified) response will be returned without any Entity-Body. - - If-Modified-Since = "If-Modified-Since" ":" HTTP-date - - An example of the field is: - - If-Modified-Since: Sat, 29 Oct 1994 19:43:31 GMT - - - - - -Berners-Lee, et al Informational [Page 42] - -RFC 1945 HTTP/1.0 May 1996 - - - A conditional GET method requests that the identified resource be - transferred only if it has been modified since the date given by the - If-Modified-Since header. The algorithm for determining this includes - the following cases: - - a) If the request would normally result in anything other than - a 200 (ok) status, or if the passed If-Modified-Since date - is invalid, the response is exactly the same as for a - normal GET. A date which is later than the server's current - time is invalid. 
- - b) If the resource has been modified since the - If-Modified-Since date, the response is exactly the same as - for a normal GET. - - c) If the resource has not been modified since a valid - If-Modified-Since date, the server shall return a 304 (not - modified) response. - - The purpose of this feature is to allow efficient updates of cached - information with a minimum amount of transaction overhead. - -10.10 Last-Modified - - The Last-Modified entity-header field indicates the date and time at - which the sender believes the resource was last modified. The exact - semantics of this field are defined in terms of how the recipient - should interpret it: if the recipient has a copy of this resource - which is older than the date given by the Last-Modified field, that - copy should be considered stale. - - Last-Modified = "Last-Modified" ":" HTTP-date - - An example of its use is - - Last-Modified: Tue, 15 Nov 1994 12:45:26 GMT - - The exact meaning of this header field depends on the implementation - of the sender and the nature of the original resource. For files, it - may be just the file system last-modified time. For entities with - dynamically included parts, it may be the most recent of the set of - last-modify times for its component parts. For database gateways, it - may be the last-update timestamp of the record. For virtual objects, - it may be the last time the internal state changed. - - An origin server must not send a Last-Modified date which is later - than the server's time of message origination. In such cases, where - the resource's last modification would indicate some time in the - - - -Berners-Lee, et al Informational [Page 43] - -RFC 1945 HTTP/1.0 May 1996 - - - future, the server must replace that date with the message - origination date. - -10.11 Location - - The Location response-header field defines the exact location of the - resource that was identified by the Request-URI. 
For 3xx responses, - the location must indicate the server's preferred URL for automatic - redirection to the resource. Only one absolute URL is allowed. - - Location = "Location" ":" absoluteURI - - An example is - - Location: http://www.w3.org/hypertext/WWW/NewLocation.html - -10.12 Pragma - - The Pragma general-header field is used to include implementation- - specific directives that may apply to any recipient along the - request/response chain. All pragma directives specify optional - behavior from the viewpoint of the protocol; however, some systems - may require that behavior be consistent with the directives. - - Pragma = "Pragma" ":" 1#pragma-directive - - pragma-directive = "no-cache" | extension-pragma - extension-pragma = token [ "=" word ] - - When the "no-cache" directive is present in a request message, an - application should forward the request toward the origin server even - if it has a cached copy of what is being requested. This allows a - client to insist upon receiving an authoritative response to its - request. It also allows a client to refresh a cached copy which is - known to be corrupted or stale. - - Pragma directives must be passed through by a proxy or gateway - application, regardless of their significance to that application, - since the directives may be applicable to all recipients along the - request/response chain. It is not possible to specify a pragma for a - specific recipient; however, any pragma directive not relevant to a - recipient should be ignored by that recipient. - -10.13 Referer - - The Referer request-header field allows the client to specify, for - the server's benefit, the address (URI) of the resource from which - the Request-URI was obtained. This allows a server to generate lists - - - -Berners-Lee, et al Informational [Page 44] - -RFC 1945 HTTP/1.0 May 1996 - - - of back-links to resources for interest, logging, optimized caching, - etc. 
It also allows obsolete or mistyped links to be traced for - maintenance. The Referer field must not be sent if the Request-URI - was obtained from a source that does not have its own URI, such as - input from the user keyboard. - - Referer = "Referer" ":" ( absoluteURI | relativeURI ) - - Example: - - Referer: http://www.w3.org/hypertext/DataSources/Overview.html - - If a partial URI is given, it should be interpreted relative to the - Request-URI. The URI must not include a fragment. - - Note: Because the source of a link may be private information or - may reveal an otherwise private information source, it is strongly - recommended that the user be able to select whether or not the - Referer field is sent. For example, a browser client could have a - toggle switch for browsing openly/anonymously, which would - respectively enable/disable the sending of Referer and From - information. - -10.14 Server - - The Server response-header field contains information about the - software used by the origin server to handle the request. The field - can contain multiple product tokens (Section 3.7) and comments - identifying the server and any significant subproducts. By - convention, the product tokens are listed in order of their - significance for identifying the application. - - Server = "Server" ":" 1*( product | comment ) - - Example: - - Server: CERN/3.0 libwww/2.17 - - If the response is being forwarded through a proxy, the proxy - application must not add its data to the product list. - - Note: Revealing the specific software version of the server may - allow the server machine to become more vulnerable to attacks - against software that is known to contain security holes. Server - implementors are encouraged to make this field a configurable - option. - - - - - -Berners-Lee, et al Informational [Page 45] - -RFC 1945 HTTP/1.0 May 1996 - - - Note: Some existing servers fail to restrict themselves to the - product token syntax within the Server field. 
- -10.15 User-Agent - - The User-Agent request-header field contains information about the - user agent originating the request. This is for statistical purposes, - the tracing of protocol violations, and automated recognition of user - agents for the sake of tailoring responses to avoid particular user - agent limitations. Although it is not required, user agents should - include this field with requests. The field can contain multiple - product tokens (Section 3.7) and comments identifying the agent and - any subproducts which form a significant part of the user agent. By - convention, the product tokens are listed in order of their - significance for identifying the application. - - User-Agent = "User-Agent" ":" 1*( product | comment ) - - Example: - - User-Agent: CERN-LineMode/2.15 libwww/2.17b3 - - Note: Some current proxy applications append their product - information to the list in the User-Agent field. This is not - recommended, since it makes machine interpretation of these - fields ambiguous. - - Note: Some existing clients fail to restrict themselves to - the product token syntax within the User-Agent field. - -10.16 WWW-Authenticate - - The WWW-Authenticate response-header field must be included in 401 - (unauthorized) response messages. The field value consists of at - least one challenge that indicates the authentication scheme(s) and - parameters applicable to the Request-URI. - - WWW-Authenticate = "WWW-Authenticate" ":" 1#challenge - - The HTTP access authentication process is described in Section 11. - User agents must take special care in parsing the WWW-Authenticate - field value if it contains more than one challenge, or if more than - one WWW-Authenticate header field is provided, since the contents of - a challenge may itself contain a comma-separated list of - authentication parameters. - - - - - - -Berners-Lee, et al Informational [Page 46] - -RFC 1945 HTTP/1.0 May 1996 - - -11. 
Access Authentication - - HTTP provides a simple challenge-response authentication mechanism - which may be used by a server to challenge a client request and by a - client to provide authentication information. It uses an extensible, - case-insensitive token to identify the authentication scheme, - followed by a comma-separated list of attribute-value pairs which - carry the parameters necessary for achieving authentication via that - scheme. - - auth-scheme = token - - auth-param = token "=" quoted-string - - The 401 (unauthorized) response message is used by an origin server - to challenge the authorization of a user agent. This response must - include a WWW-Authenticate header field containing at least one - challenge applicable to the requested resource. - - challenge = auth-scheme 1*SP realm *( "," auth-param ) - - realm = "realm" "=" realm-value - realm-value = quoted-string - - The realm attribute (case-insensitive) is required for all - authentication schemes which issue a challenge. The realm value - (case-sensitive), in combination with the canonical root URL of the - server being accessed, defines the protection space. These realms - allow the protected resources on a server to be partitioned into a - set of protection spaces, each with its own authentication scheme - and/or authorization database. The realm value is a string, generally - assigned by the origin server, which may have additional semantics - specific to the authentication scheme. - - A user agent that wishes to authenticate itself with a server-- - usually, but not necessarily, after receiving a 401 response--may do - so by including an Authorization header field with the request. The - Authorization field value consists of credentials containing the - authentication information of the user agent for the realm of the - resource being requested. 
- - credentials = basic-credentials - | ( auth-scheme #auth-param ) - - The domain over which credentials can be automatically applied by a - user agent is determined by the protection space. If a prior request - has been authorized, the same credentials may be reused for all other - requests within that protection space for a period of time determined - - - -Berners-Lee, et al Informational [Page 47] - -RFC 1945 HTTP/1.0 May 1996 - - - by the authentication scheme, parameters, and/or user preference. - Unless otherwise defined by the authentication scheme, a single - protection space cannot extend outside the scope of its server. - - If the server does not wish to accept the credentials sent with a - request, it should return a 403 (forbidden) response. - - The HTTP protocol does not restrict applications to this simple - challenge-response mechanism for access authentication. Additional - mechanisms may be used, such as encryption at the transport level or - via message encapsulation, and with additional header fields - specifying authentication information. However, these additional - mechanisms are not defined by this specification. - - Proxies must be completely transparent regarding user agent - authentication. That is, they must forward the WWW-Authenticate and - Authorization headers untouched, and must not cache the response to a - request containing Authorization. HTTP/1.0 does not provide a means - for a client to be authenticated with a proxy. - -11.1 Basic Authentication Scheme - - The "basic" authentication scheme is based on the model that the user - agent must authenticate itself with a user-ID and a password for each - realm. The realm value should be considered an opaque string which - can only be compared for equality with other realms on that server. - The server will authorize the request only if it can validate the - user-ID and password for the protection space of the Request-URI. - There are no optional authentication parameters. 
- - Upon receipt of an unauthorized request for a URI within the - protection space, the server should respond with a challenge like the - following: - - WWW-Authenticate: Basic realm="WallyWorld" - - where "WallyWorld" is the string assigned by the server to identify - the protection space of the Request-URI. - - To receive authorization, the client sends the user-ID and password, - separated by a single colon (":") character, within a base64 [5] - encoded string in the credentials. - - basic-credentials = "Basic" SP basic-cookie - - basic-cookie = <base64 [5] encoding of userid-password, - except not limited to 76 char/line> - - - - -Berners-Lee, et al Informational [Page 48] - -RFC 1945 HTTP/1.0 May 1996 - - - userid-password = [ token ] ":" *TEXT - - If the user agent wishes to send the user-ID "Aladdin" and password - "open sesame", it would use the following header field: - - Authorization: Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ== - - The basic authentication scheme is a non-secure method of filtering - unauthorized access to resources on an HTTP server. It is based on - the assumption that the connection between the client and the server - can be regarded as a trusted carrier. As this is not generally true - on an open network, the basic authentication scheme should be used - accordingly. In spite of this, clients should implement the scheme in - order to communicate with servers that use it. - -12. Security Considerations - - This section is meant to inform application developers, information - providers, and users of the security limitations in HTTP/1.0 as - described by this document. The discussion does not include - definitive solutions to the problems revealed, though it does make - some suggestions for reducing security risks. 
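Returning to the Basic scheme of Section 11.1, the construction of basic-credentials can be sketched in Python as follows. The helper names are hypothetical, and the choice of ISO-8859-1 for turning the userid-password string into octets is an assumption of this sketch, since the text above does not state a character encoding. As the section notes, the encoding offers no confidentiality, so this is only a demonstration of the wire format.

```python
import base64


def basic_credentials(user_id, password):
    """Build the Authorization field value for the Basic scheme.

    Assumption: the userid-password string is encoded as ISO-8859-1.
    """
    userid_password = f"{user_id}:{password}"
    # b64encode emits a single line, matching "not limited to 76 char/line".
    cookie = base64.b64encode(userid_password.encode("iso-8859-1")).decode("ascii")
    return f"Basic {cookie}"


def parse_basic_credentials(field_value):
    """Split an Authorization field value back into (user_id, password)."""
    scheme, _, cookie = field_value.partition(" ")
    if scheme.lower() != "basic":  # the auth-scheme token is case-insensitive
        raise ValueError("not a Basic credential")
    decoded = base64.b64decode(cookie.strip()).decode("iso-8859-1")
    # Only the first colon separates the fields; the password may contain more.
    user_id, separator, password = decoded.partition(":")
    if not separator:
        raise ValueError("missing ':' between user-ID and password")
    return user_id, password
```

Called with the user-ID "Aladdin" and the password "open sesame", `basic_credentials` reproduces the header value shown in the example of Section 11.1.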
- -12.1 Authentication of Clients - - As mentioned in Section 11.1, the Basic authentication scheme is not - a secure method of user authentication, nor does it prevent the - Entity-Body from being transmitted in clear text across the physical - network used as the carrier. HTTP/1.0 does not prevent additional - authentication schemes and encryption mechanisms from being employed - to increase security. - -12.2 Safe Methods - - The writers of client software should be aware that the software - represents the user in their interactions over the Internet, and - should be careful to allow the user to be aware of any actions they - may take which may have an unexpected significance to themselves or - others. - - In particular, the convention has been established that the GET and - HEAD methods should never have the significance of taking an action - other than retrieval. These methods should be considered "safe." This - allows user agents to represent other methods, such as POST, in a - special way, so that the user is made aware of the fact that a - possibly unsafe action is being requested. - - - - - -Berners-Lee, et al Informational [Page 49] - -RFC 1945 HTTP/1.0 May 1996 - - - Naturally, it is not possible to ensure that the server does not - generate side-effects as a result of performing a GET request; in - fact, some dynamic resources consider that a feature. The important - distinction here is that the user did not request the side-effects, - so therefore cannot be held accountable for them. - -12.3 Abuse of Server Log Information - - A server is in the position to save personal data about a user's - requests which may identify their reading patterns or subjects of - interest. This information is clearly confidential in nature and its - handling may be constrained by law in certain countries. 
People using - the HTTP protocol to provide data are responsible for ensuring that - such material is not distributed without the permission of any - individuals that are identifiable by the published results. - -12.4 Transfer of Sensitive Information - - Like any generic data transfer protocol, HTTP cannot regulate the - content of the data that is transferred, nor is there any a priori - method of determining the sensitivity of any particular piece of - information within the context of any given request. Therefore, - applications should supply as much control over this information as - possible to the provider of that information. Three header fields are - worth special mention in this context: Server, Referer and From. - - Revealing the specific software version of the server may allow the - server machine to become more vulnerable to attacks against software - that is known to contain security holes. Implementors should make the - Server header field a configurable option. - - The Referer field allows reading patterns to be studied and reverse - links drawn. Although it can be very useful, its power can be abused - if user details are not separated from the information contained in - the Referer. Even when the personal information has been removed, the - Referer field may indicate a private document's URI whose publication - would be inappropriate. - - The information sent in the From field might conflict with the user's - privacy interests or their site's security policy, and hence it - should not be transmitted without the user being able to disable, - enable, and modify the contents of the field. The user must be able - to set the contents of this field within a user preference or - application defaults configuration. - - We suggest, though do not require, that a convenient toggle interface - be provided for the user to enable or disable the sending of From and - Referer information. 
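The toggle suggested in 12.4 above could look roughly like the following Python sketch on the client side. The function name `outgoing_headers` and the `browse_anonymously` flag are hypothetical, and a plain dictionary stands in for whatever header structure a real user agent uses.

```python
# Request header fields that 12.4 singles out as privacy-sensitive.
_PRIVATE_FIELDS = {"referer", "from"}


def outgoing_headers(headers, browse_anonymously):
    """Return the headers to send, dropping Referer and From when asked.

    Hypothetical helper: `headers` maps field names to values, and field
    names are compared case-insensitively.
    """
    if not browse_anonymously:
        return dict(headers)
    return {
        name: value
        for name, value in headers.items()
        if name.lower() not in _PRIVATE_FIELDS
    }
```

Returning a new dictionary rather than mutating the argument keeps the user's configured values intact, so the toggle can be switched back without re-entering the From address.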
- - - -Berners-Lee, et al Informational [Page 50] - -RFC 1945 HTTP/1.0 May 1996 - - -12.5 Attacks Based On File and Path Names - - Implementations of HTTP origin servers should be careful to restrict - the documents returned by HTTP requests to be only those that were - intended by the server administrators. If an HTTP server translates - HTTP URIs directly into file system calls, the server must take - special care not to serve files that were not intended to be - delivered to HTTP clients. For example, Unix, Microsoft Windows, and - other operating systems use ".." as a path component to indicate a - directory level above the current one. On such a system, an HTTP - server must disallow any such construct in the Request-URI if it - would otherwise allow access to a resource outside those intended to - be accessible via the HTTP server. Similarly, files intended for - reference only internally to the server (such as access control - files, configuration files, and script code) must be protected from - inappropriate retrieval, since they might contain sensitive - information. Experience has shown that minor bugs in such HTTP server - implementations have turned into security risks. - -13. Acknowledgments - - This specification makes heavy use of the augmented BNF and generic - constructs defined by David H. Crocker for RFC 822 [7]. Similarly, it - reuses many of the definitions provided by Nathaniel Borenstein and - Ned Freed for MIME [5]. We hope that their inclusion in this - specification will help reduce past confusion over the relationship - between HTTP/1.0 and Internet mail message formats. - - The HTTP protocol has evolved considerably over the past four years. - It has benefited from a large and active developer community--the - many people who have participated on the www-talk mailing list--and - it is that community which has been most responsible for the success - of HTTP and of the World-Wide Web in general. 
Marc Andreessen, Robert - Cailliau, Daniel W. Connolly, Bob Denny, Jean-Francois Groff, Phillip - M. Hallam-Baker, Hakon W. Lie, Ari Luotonen, Rob McCool, Lou - Montulli, Dave Raggett, Tony Sanders, and Marc VanHeyningen deserve - special recognition for their efforts in defining aspects of the - protocol for early versions of this specification. - - Paul Hoffman contributed sections regarding the informational status - of this document and Appendices C and D. - - - - - - - - - - -Berners-Lee, et al Informational [Page 51] - -RFC 1945 HTTP/1.0 May 1996 - - - This document has benefited greatly from the comments of all those - participating in the HTTP-WG. In addition to those already mentioned, - the following individuals have contributed to this specification: - - Gary Adams Harald Tveit Alvestrand - Keith Ball Brian Behlendorf - Paul Burchard Maurizio Codogno - Mike Cowlishaw Roman Czyborra - Michael A. Dolan John Franks - Jim Gettys Marc Hedlund - Koen Holtman Alex Hopmann - Bob Jernigan Shel Kaphan - Martijn Koster Dave Kristol - Daniel LaLiberte Paul Leach - Albert Lunde John C. Mallery - Larry Masinter Mitra - Jeffrey Mogul Gavin Nicol - Bill Perry Jeffrey Perry - Owen Rees Luigi Rizzo - David Robinson Marc Salomon - Rich Salz Jim Seidman - Chuck Shotton Eric W. Sink - Simon E. Spero Robert S. Thau - Francois Yergeau Mary Ellen Zurko - Jean-Philippe Martin-Flatin - -14. References - - [1] Anklesaria, F., McCahill, M., Lindner, P., Johnson, D., - Torrey, D., and B. Alberti, "The Internet Gopher Protocol: A - Distributed Document Search and Retrieval Protocol", RFC 1436, - University of Minnesota, March 1993. - - [2] Berners-Lee, T., "Universal Resource Identifiers in WWW: A - Unifying Syntax for the Expression of Names and Addresses of - Objects on the Network as used in the World-Wide Web", - RFC 1630, CERN, June 1994. - - [3] Berners-Lee, T., and D. Connolly, "Hypertext Markup Language - - 2.0", RFC 1866, MIT/W3C, November 1995. 
- - [4] Berners-Lee, T., Masinter, L., and M. McCahill, "Uniform - Resource Locators (URL)", RFC 1738, CERN, Xerox PARC, - University of Minnesota, December 1994. - - - - - - - -Berners-Lee, et al Informational [Page 52] - -RFC 1945 HTTP/1.0 May 1996 - - - [5] Borenstein, N., and N. Freed, "MIME (Multipurpose Internet Mail - Extensions) Part One: Mechanisms for Specifying and Describing - the Format of Internet Message Bodies", RFC 1521, Bellcore, - Innosoft, September 1993. - - [6] Braden, R., "Requirements for Internet hosts - Application and - Support", STD 3, RFC 1123, IETF, October 1989. - - [7] Crocker, D., "Standard for the Format of ARPA Internet Text - Messages", STD 11, RFC 822, UDEL, August 1982. - - [8] F. Davis, B. Kahle, H. Morris, J. Salem, T. Shen, R. Wang, - J. Sui, and M. Grinbaum. "WAIS Interface Protocol Prototype - Functional Specification." (v1.5), Thinking Machines - Corporation, April 1990. - - [9] Fielding, R., "Relative Uniform Resource Locators", RFC 1808, - UC Irvine, June 1995. - - [10] Horton, M., and R. Adams, "Standard for interchange of USENET - Messages", RFC 1036 (Obsoletes RFC 850), AT&T Bell - Laboratories, Center for Seismic Studies, December 1987. - - [11] Kantor, B., and P. Lapsley, "Network News Transfer Protocol: - A Proposed Standard for the Stream-Based Transmission of News", - RFC 977, UC San Diego, UC Berkeley, February 1986. - - [12] Postel, J., "Simple Mail Transfer Protocol." STD 10, RFC 821, - USC/ISI, August 1982. - - [13] Postel, J., "Media Type Registration Procedure." RFC 1590, - USC/ISI, March 1994. - - [14] Postel, J., and J. Reynolds, "File Transfer Protocol (FTP)", - STD 9, RFC 959, USC/ISI, October 1985. - - [15] Reynolds, J., and J. Postel, "Assigned Numbers", STD 2, RFC - 1700, USC/ISI, October 1994. - - [16] Sollins, K., and L. Masinter, "Functional Requirements for - Uniform Resource Names", RFC 1737, MIT/LCS, Xerox Corporation, - December 1994. - - [17] US-ASCII. 
Coded Character Set - 7-Bit American Standard Code - for Information Interchange. Standard ANSI X3.4-1986, ANSI, - 1986. - - - - - -Berners-Lee, et al Informational [Page 53] - -RFC 1945 HTTP/1.0 May 1996 - - - [18] ISO-8859. International Standard -- Information Processing -- - 8-bit Single-Byte Coded Graphic Character Sets -- - Part 1: Latin alphabet No. 1, ISO 8859-1:1987. - Part 2: Latin alphabet No. 2, ISO 8859-2, 1987. - Part 3: Latin alphabet No. 3, ISO 8859-3, 1988. - Part 4: Latin alphabet No. 4, ISO 8859-4, 1988. - Part 5: Latin/Cyrillic alphabet, ISO 8859-5, 1988. - Part 6: Latin/Arabic alphabet, ISO 8859-6, 1987. - Part 7: Latin/Greek alphabet, ISO 8859-7, 1987. - Part 8: Latin/Hebrew alphabet, ISO 8859-8, 1988. - Part 9: Latin alphabet No. 5, ISO 8859-9, 1990. - -15. Authors' Addresses - - Tim Berners-Lee - Director, W3 Consortium - MIT Laboratory for Computer Science - 545 Technology Square - Cambridge, MA 02139, U.S.A. - - Fax: +1 (617) 258 8682 - EMail: timbl@w3.org - - - Roy T. Fielding - Department of Information and Computer Science - University of California - Irvine, CA 92717-3425, U.S.A. - - Fax: +1 (714) 824-4056 - EMail: fielding@ics.uci.edu - - - Henrik Frystyk Nielsen - W3 Consortium - MIT Laboratory for Computer Science - 545 Technology Square - Cambridge, MA 02139, U.S.A. - - Fax: +1 (617) 258 8682 - EMail: frystyk@w3.org - - - - - - - - - - -Berners-Lee, et al Informational [Page 54] - -RFC 1945 HTTP/1.0 May 1996 - - -Appendices - - These appendices are provided for informational reasons only -- they - do not form a part of the HTTP/1.0 specification. - -A. Internet Media Type message/http - - In addition to defining the HTTP/1.0 protocol, this document serves - as the specification for the Internet media type "message/http". The - following is to be registered with IANA [13]. 
- - Media Type name: message - - Media subtype name: http - - Required parameters: none - - Optional parameters: version, msgtype - - version: The HTTP-Version number of the enclosed message - (e.g., "1.0"). If not present, the version can be - determined from the first line of the body. - - msgtype: The message type -- "request" or "response". If - not present, the type can be determined from the - first line of the body. - - Encoding considerations: only "7bit", "8bit", or "binary" are - permitted - - Security considerations: none - -B. Tolerant Applications - - Although this document specifies the requirements for the generation - of HTTP/1.0 messages, not all applications will be correct in their - implementation. We therefore recommend that operational applications - be tolerant of deviations whenever those deviations can be - interpreted unambiguously. - - Clients should be tolerant in parsing the Status-Line and servers - tolerant when parsing the Request-Line. In particular, they should - accept any amount of SP or HT characters between fields, even though - only a single SP is required. - - The line terminator for HTTP-header fields is the sequence CRLF. - However, we recommend that applications, when parsing such headers, - recognize a single LF as a line terminator and ignore the leading CR. - - - -Berners-Lee, et al Informational [Page 55] - -RFC 1945 HTTP/1.0 May 1996 - - -C. Relationship to MIME - - HTTP/1.0 uses many of the constructs defined for Internet Mail (RFC - 822 [7]) and the Multipurpose Internet Mail Extensions (MIME [5]) to - allow entities to be transmitted in an open variety of - representations and with extensible mechanisms. However, RFC 1521 - discusses mail, and HTTP has a few features that are different than - those described in RFC 1521. 
These differences were carefully chosen - to optimize performance over binary connections, to allow greater - freedom in the use of new media types, to make date comparisons - easier, and to acknowledge the practice of some early HTTP servers - and clients. - - At the time of this writing, it is expected that RFC 1521 will be - revised. The revisions may include some of the practices found in - HTTP/1.0 but not in RFC 1521. - - This appendix describes specific areas where HTTP differs from RFC - 1521. Proxies and gateways to strict MIME environments should be - aware of these differences and provide the appropriate conversions - where necessary. Proxies and gateways from MIME environments to HTTP - also need to be aware of the differences because some conversions may - be required. - -C.1 Conversion to Canonical Form - - RFC 1521 requires that an Internet mail entity be converted to - canonical form prior to being transferred, as described in Appendix G - of RFC 1521 [5]. Section 3.6.1 of this document describes the forms - allowed for subtypes of the "text" media type when transmitted over - HTTP. - - RFC 1521 requires that content with a Content-Type of "text" - represent line breaks as CRLF and forbids the use of CR or LF outside - of line break sequences. HTTP allows CRLF, bare CR, and bare LF to - indicate a line break within text content when a message is - transmitted over HTTP. - - Where it is possible, a proxy or gateway from HTTP to a strict RFC - 1521 environment should translate all line breaks within the text - media types described in Section 3.6.1 of this document to the RFC - 1521 canonical form of CRLF. Note, however, that this may be - complicated by the presence of a Content-Encoding and by the fact - that HTTP allows the use of some character sets which do not use - octets 13 and 10 to represent CR and LF, as is the case for some - multi-byte character sets. 
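The line-break translation described in C.1 above can be sketched in Python as follows, for a gateway forwarding a text body into a strict RFC 1521 environment. The function name `to_canonical_crlf` is hypothetical. The sketch assumes the body carries no Content-Encoding and uses a character set in which octets 13 and 10 really do represent CR and LF, which are exactly the two complications the text warns about.

```python
import re

# CRLF is listed first so that an existing CRLF is matched as one unit
# and is not rewritten into two line breaks.
_LINE_BREAK = re.compile(rb"\r\n|\r|\n")


def to_canonical_crlf(body):
    """Rewrite CRLF, bare CR, and bare LF line breaks as CRLF.

    Assumption: `body` is an unencoded text entity whose character set
    represents CR and LF as octets 13 and 10.
    """
    return _LINE_BREAK.sub(b"\r\n", body)
```

Applying the function twice gives the same result as applying it once, so a gateway can run it without first checking whether the body is already canonical.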
- - - - - -Berners-Lee, et al Informational [Page 56] - -RFC 1945 HTTP/1.0 May 1996 - - -C.2 Conversion of Date Formats - - HTTP/1.0 uses a restricted set of date formats (Section 3.3) to - simplify the process of date comparison. Proxies and gateways from - other protocols should ensure that any Date header field present in a - message conforms to one of the HTTP/1.0 formats and rewrite the date - if necessary. - -C.3 Introduction of Content-Encoding - - RFC 1521 does not include any concept equivalent to HTTP/1.0's - Content-Encoding header field. Since this acts as a modifier on the - media type, proxies and gateways from HTTP to MIME-compliant - protocols must either change the value of the Content-Type header - field or decode the Entity-Body before forwarding the message. (Some - experimental applications of Content-Type for Internet mail have used - a media-type parameter of ";conversions=<content-coding>" to perform - an equivalent function as Content-Encoding. However, this parameter - is not part of RFC 1521.) - -C.4 No Content-Transfer-Encoding - - HTTP does not use the Content-Transfer-Encoding (CTE) field of RFC - 1521. Proxies and gateways from MIME-compliant protocols to HTTP must - remove any non-identity CTE ("quoted-printable" or "base64") encoding - prior to delivering the response message to an HTTP client. - - Proxies and gateways from HTTP to MIME-compliant protocols are - responsible for ensuring that the message is in the correct format - and encoding for safe transport on that protocol, where "safe - transport" is defined by the limitations of the protocol being used. - Such a proxy or gateway should label the data with an appropriate - Content-Transfer-Encoding if doing so will improve the likelihood of - safe transport over the destination protocol. - -C.5 HTTP Header Fields in Multipart Body-Parts - - In RFC 1521, most header fields in multipart body-parts are generally - ignored unless the field name begins with "Content-". 
In HTTP/1.0, - multipart body-parts may contain any HTTP header fields which are - significant to the meaning of that part. - -D. Additional Features - - This appendix documents protocol elements used by some existing HTTP - implementations, but not consistently and correctly across most - HTTP/1.0 applications. Implementors should be aware of these - features, but cannot rely upon their presence in, or interoperability - - - -Berners-Lee, et al Informational [Page 57] - -RFC 1945 HTTP/1.0 May 1996 - - - with, other HTTP/1.0 applications. - -D.1 Additional Request Methods - -D.1.1 PUT - - The PUT method requests that the enclosed entity be stored under the - supplied Request-URI. If the Request-URI refers to an already - existing resource, the enclosed entity should be considered as a - modified version of the one residing on the origin server. If the - Request-URI does not point to an existing resource, and that URI is - capable of being defined as a new resource by the requesting user - agent, the origin server can create the resource with that URI. - - The fundamental difference between the POST and PUT requests is - reflected in the different meaning of the Request-URI. The URI in a - POST request identifies the resource that will handle the enclosed - entity as data to be processed. That resource may be a data-accepting - process, a gateway to some other protocol, or a separate entity that - accepts annotations. In contrast, the URI in a PUT request identifies - the entity enclosed with the request -- the user agent knows what URI - is intended and the server should not apply the request to some other - resource. - -D.1.2 DELETE - - The DELETE method requests that the origin server delete the resource - identified by the Request-URI. - -D.1.3 LINK - - The LINK method establishes one or more Link relationships between - the existing resource identified by the Request-URI and other - existing resources. 
- -D.1.4 UNLINK - - The UNLINK method removes one or more Link relationships from the - existing resource identified by the Request-URI. - -D.2 Additional Header Field Definitions - -D.2.1 Accept - - The Accept request-header field can be used to indicate a list of - media ranges which are acceptable as a response to the request. The - asterisk "*" character is used to group media types into ranges, with - "*/*" indicating all media types and "type/*" indicating all subtypes - - - -Berners-Lee, et al Informational [Page 58] - -RFC 1945 HTTP/1.0 May 1996 - - - of that type. The set of ranges given by the client should represent - what types are acceptable given the context of the request. - -D.2.2 Accept-Charset - - The Accept-Charset request-header field can be used to indicate a - list of preferred character sets other than the default US-ASCII and - ISO-8859-1. This field allows clients capable of understanding more - comprehensive or special-purpose character sets to signal that - capability to a server which is capable of representing documents in - those character sets. - -D.2.3 Accept-Encoding - - The Accept-Encoding request-header field is similar to Accept, but - restricts the content-coding values which are acceptable in the - response. - -D.2.4 Accept-Language - - The Accept-Language request-header field is similar to Accept, but - restricts the set of natural languages that are preferred as a - response to the request. - -D.2.5 Content-Language - - The Content-Language entity-header field describes the natural - language(s) of the intended audience for the enclosed entity. Note - that this may not be equivalent to all the languages used within the - entity. - -D.2.6 Link - - The Link entity-header field provides a means for describing a - relationship between the entity and some other resource. An entity - may include multiple Link values. 
Links at the metainformation level - typically indicate relationships like hierarchical structure and - navigation paths. - -D.2.7 MIME-Version - - HTTP messages may include a single MIME-Version general-header field - to indicate what version of the MIME protocol was used to construct - the message. Use of the MIME-Version header field, as defined by RFC - 1521 [5], should indicate that the message is MIME-conformant. - Unfortunately, some older HTTP/1.0 servers send it indiscriminately, - and thus this field should be ignored. - - - - -Berners-Lee, et al Informational [Page 59] - -RFC 1945 HTTP/1.0 May 1996 - - -D.2.8 Retry-After - - The Retry-After response-header field can be used with a 503 (service - unavailable) response to indicate how long the service is expected to - be unavailable to the requesting client. The value of this field can - be either an HTTP-date or an integer number of seconds (in decimal) - after the time of the response. - -D.2.9 Title - - The Title entity-header field indicates the title of the entity. - -D.2.10 URI - - The URI entity-header field may contain some or all of the Uniform - Resource Identifiers (Section 3.2) by which the Request-URI resource - can be identified. There is no guarantee that the resource can be - accessed using the URI(s) specified. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -Berners-Lee, et al Informational [Page 60] - diff --git a/docs/specs/rfc2068.txt b/docs/specs/rfc2068.txt deleted file mode 100644 index e16e4fdf..00000000 --- a/docs/specs/rfc2068.txt +++ /dev/null @@ -1,9075 +0,0 @@ - - - - - - -Network Working Group R. Fielding -Request for Comments: 2068 UC Irvine -Category: Standards Track J. Gettys - J. Mogul - DEC - H. Frystyk - T. 
Berners-Lee - MIT/LCS - January 1997 - - - Hypertext Transfer Protocol -- HTTP/1.1 - -Status of this Memo - - This document specifies an Internet standards track protocol for the - Internet community, and requests discussion and suggestions for - improvements. Please refer to the current edition of the "Internet - Official Protocol Standards" (STD 1) for the standardization state - and status of this protocol. Distribution of this memo is unlimited. - -Abstract - - The Hypertext Transfer Protocol (HTTP) is an application-level - protocol for distributed, collaborative, hypermedia information - systems. It is a generic, stateless, object-oriented protocol which - can be used for many tasks, such as name servers and distributed - object management systems, through extension of its request methods. - A feature of HTTP is the typing and negotiation of data - representation, allowing systems to be built independently of the - data being transferred. - - HTTP has been in use by the World-Wide Web global information - initiative since 1990. This specification defines the protocol - referred to as "HTTP/1.1". - -Table of Contents - - 1 Introduction.............................................7 - 1.1 Purpose ..............................................7 - 1.2 Requirements .........................................7 - 1.3 Terminology ..........................................8 - 1.4 Overall Operation ...................................11 - 2 Notational Conventions and Generic Grammar..............13 - 2.1 Augmented BNF .......................................13 - 2.2 Basic Rules .........................................15 - 3 Protocol Parameters.....................................17 - 3.1 HTTP Version ........................................17 - - - -Fielding, et. al. 
Standards Track [Page 1] - -RFC 2068 HTTP/1.1 January 1997 - - - 3.2 Uniform Resource Identifiers ........................18 - 3.2.1 General Syntax ...................................18 - 3.2.2 http URL .........................................19 - 3.2.3 URI Comparison ...................................20 - 3.3 Date/Time Formats ...................................21 - 3.3.1 Full Date ........................................21 - 3.3.2 Delta Seconds ....................................22 - 3.4 Character Sets ......................................22 - 3.5 Content Codings .....................................23 - 3.6 Transfer Codings ....................................24 - 3.7 Media Types .........................................25 - 3.7.1 Canonicalization and Text Defaults ...............26 - 3.7.2 Multipart Types ..................................27 - 3.8 Product Tokens ......................................28 - 3.9 Quality Values ......................................28 - 3.10 Language Tags ......................................28 - 3.11 Entity Tags ........................................29 - 3.12 Range Units ........................................30 - 4 HTTP Message............................................30 - 4.1 Message Types .......................................30 - 4.2 Message Headers .....................................31 - 4.3 Message Body ........................................32 - 4.4 Message Length ......................................32 - 4.5 General Header Fields ...............................34 - 5 Request.................................................34 - 5.1 Request-Line ........................................34 - 5.1.1 Method ...........................................35 - 5.1.2 Request-URI ......................................35 - 5.2 The Resource Identified by a Request ................37 - 5.3 Request Header Fields ...............................37 - 6 Response................................................38 - 6.1 Status-Line 
.........................................38 - 6.1.1 Status Code and Reason Phrase ....................39 - 6.2 Response Header Fields ..............................41 - 7 Entity..................................................41 - 7.1 Entity Header Fields ................................41 - 7.2 Entity Body .........................................42 - 7.2.1 Type .............................................42 - 7.2.2 Length ...........................................43 - 8 Connections.............................................43 - 8.1 Persistent Connections ..............................43 - 8.1.1 Purpose ..........................................43 - 8.1.2 Overall Operation ................................44 - 8.1.3 Proxy Servers ....................................45 - 8.1.4 Practical Considerations .........................45 - 8.2 Message Transmission Requirements ...................46 - 9 Method Definitions......................................48 - 9.1 Safe and Idempotent Methods .........................48 - - - -Fielding, et. al. 
Standards Track [Page 2] - -RFC 2068 HTTP/1.1 January 1997 - - - 9.1.1 Safe Methods .....................................48 - 9.1.2 Idempotent Methods ...............................49 - 9.2 OPTIONS .............................................49 - 9.3 GET .................................................50 - 9.4 HEAD ................................................50 - 9.5 POST ................................................51 - 9.6 PUT .................................................52 - 9.7 DELETE ..............................................53 - 9.8 TRACE ...............................................53 - 10 Status Code Definitions................................53 - 10.1 Informational 1xx ..................................54 - 10.1.1 100 Continue ....................................54 - 10.1.2 101 Switching Protocols .........................54 - 10.2 Successful 2xx .....................................54 - 10.2.1 200 OK ..........................................54 - 10.2.2 201 Created .....................................55 - 10.2.3 202 Accepted ....................................55 - 10.2.4 203 Non-Authoritative Information ...............55 - 10.2.5 204 No Content ..................................55 - 10.2.6 205 Reset Content ...............................56 - 10.2.7 206 Partial Content .............................56 - 10.3 Redirection 3xx ....................................56 - 10.3.1 300 Multiple Choices ............................57 - 10.3.2 301 Moved Permanently ...........................57 - 10.3.3 302 Moved Temporarily ...........................58 - 10.3.4 303 See Other ...................................58 - 10.3.5 304 Not Modified ................................58 - 10.3.6 305 Use Proxy ...................................59 - 10.4 Client Error 4xx ...................................59 - 10.4.1 400 Bad Request .................................60 - 10.4.2 401 Unauthorized ................................60 - 10.4.3 402 Payment Required 
............................60 - 10.4.4 403 Forbidden ...................................60 - 10.4.5 404 Not Found ...................................60 - 10.4.6 405 Method Not Allowed ..........................61 - 10.4.7 406 Not Acceptable ..............................61 - 10.4.8 407 Proxy Authentication Required ...............61 - 10.4.9 408 Request Timeout .............................62 - 10.4.10 409 Conflict ...................................62 - 10.4.11 410 Gone .......................................62 - 10.4.12 411 Length Required ............................63 - 10.4.13 412 Precondition Failed ........................63 - 10.4.14 413 Request Entity Too Large ...................63 - 10.4.15 414 Request-URI Too Long .......................63 - 10.4.16 415 Unsupported Media Type .....................63 - 10.5 Server Error 5xx ...................................64 - 10.5.1 500 Internal Server Error .......................64 - 10.5.2 501 Not Implemented .............................64 - - - -Fielding, et. al. 
Standards Track [Page 3] - -RFC 2068 HTTP/1.1 January 1997 - - - 10.5.3 502 Bad Gateway .................................64 - 10.5.4 503 Service Unavailable .........................64 - 10.5.5 504 Gateway Timeout .............................64 - 10.5.6 505 HTTP Version Not Supported ..................65 - 11 Access Authentication..................................65 - 11.1 Basic Authentication Scheme ........................66 - 11.2 Digest Authentication Scheme .......................67 - 12 Content Negotiation....................................67 - 12.1 Server-driven Negotiation ..........................68 - 12.2 Agent-driven Negotiation ...........................69 - 12.3 Transparent Negotiation ............................70 - 13 Caching in HTTP........................................70 - 13.1.1 Cache Correctness ...............................72 - 13.1.2 Warnings ........................................73 - 13.1.3 Cache-control Mechanisms ........................74 - 13.1.4 Explicit User Agent Warnings ....................74 - 13.1.5 Exceptions to the Rules and Warnings ............75 - 13.1.6 Client-controlled Behavior ......................75 - 13.2 Expiration Model ...................................75 - 13.2.1 Server-Specified Expiration .....................75 - 13.2.2 Heuristic Expiration ............................76 - 13.2.3 Age Calculations ................................77 - 13.2.4 Expiration Calculations .........................79 - 13.2.5 Disambiguating Expiration Values ................80 - 13.2.6 Disambiguating Multiple Responses ...............80 - 13.3 Validation Model ...................................81 - 13.3.1 Last-modified Dates .............................82 - 13.3.2 Entity Tag Cache Validators .....................82 - 13.3.3 Weak and Strong Validators ......................82 - 13.3.4 Rules for When to Use Entity Tags and Last- - modified Dates..........................................85 - 13.3.5 Non-validating Conditionals 
.....................86 - 13.4 Response Cachability ...............................86 - 13.5 Constructing Responses From Caches .................87 - 13.5.1 End-to-end and Hop-by-hop Headers ...............88 - 13.5.2 Non-modifiable Headers ..........................88 - 13.5.3 Combining Headers ...............................89 - 13.5.4 Combining Byte Ranges ...........................90 - 13.6 Caching Negotiated Responses .......................90 - 13.7 Shared and Non-Shared Caches .......................91 - 13.8 Errors or Incomplete Response Cache Behavior .......91 - 13.9 Side Effects of GET and HEAD .......................92 - 13.10 Invalidation After Updates or Deletions ...........92 - 13.11 Write-Through Mandatory ...........................93 - 13.12 Cache Replacement .................................93 - 13.13 History Lists .....................................93 - 14 Header Field Definitions...............................94 - 14.1 Accept .............................................95 - - - -Fielding, et. al. 
Standards Track [Page 4] - -RFC 2068 HTTP/1.1 January 1997 - - - 14.2 Accept-Charset .....................................97 - 14.3 Accept-Encoding ....................................97 - 14.4 Accept-Language ....................................98 - 14.5 Accept-Ranges ......................................99 - 14.6 Age ................................................99 - 14.7 Allow .............................................100 - 14.8 Authorization .....................................100 - 14.9 Cache-Control .....................................101 - 14.9.1 What is Cachable ...............................103 - 14.9.2 What May be Stored by Caches ...................103 - 14.9.3 Modifications of the Basic Expiration Mechanism 104 - 14.9.4 Cache Revalidation and Reload Controls .........105 - 14.9.5 No-Transform Directive .........................107 - 14.9.6 Cache Control Extensions .......................108 - 14.10 Connection .......................................109 - 14.11 Content-Base .....................................109 - 14.12 Content-Encoding .................................110 - 14.13 Content-Language .................................110 - 14.14 Content-Length ...................................111 - 14.15 Content-Location .................................112 - 14.16 Content-MD5 ......................................113 - 14.17 Content-Range ....................................114 - 14.18 Content-Type .....................................116 - 14.19 Date .............................................116 - 14.20 ETag .............................................117 - 14.21 Expires ..........................................117 - 14.22 From .............................................118 - 14.23 Host .............................................119 - 14.24 If-Modified-Since ................................119 - 14.25 If-Match .........................................121 - 14.26 If-None-Match ....................................122 - 14.27 If-Range 
.........................................123 - 14.28 If-Unmodified-Since ..............................124 - 14.29 Last-Modified ....................................124 - 14.30 Location .........................................125 - 14.31 Max-Forwards .....................................125 - 14.32 Pragma ...........................................126 - 14.33 Proxy-Authenticate ...............................127 - 14.34 Proxy-Authorization ..............................127 - 14.35 Public ...........................................127 - 14.36 Range ............................................128 - 14.36.1 Byte Ranges ...................................128 - 14.36.2 Range Retrieval Requests ......................130 - 14.37 Referer ..........................................131 - 14.38 Retry-After ......................................131 - 14.39 Server ...........................................132 - 14.40 Transfer-Encoding ................................132 - 14.41 Upgrade ..........................................132 - - - -Fielding, et. al. 
Standards Track [Page 5] - -RFC 2068 HTTP/1.1 January 1997 - - - 14.42 User-Agent .......................................134 - 14.43 Vary .............................................134 - 14.44 Via ..............................................135 - 14.45 Warning ..........................................137 - 14.46 WWW-Authenticate .................................139 - 15 Security Considerations...............................139 - 15.1 Authentication of Clients .........................139 - 15.2 Offering a Choice of Authentication Schemes .......140 - 15.3 Abuse of Server Log Information ...................141 - 15.4 Transfer of Sensitive Information .................141 - 15.5 Attacks Based On File and Path Names ..............142 - 15.6 Personal Information ..............................143 - 15.7 Privacy Issues Connected to Accept Headers ........143 - 15.8 DNS Spoofing ......................................144 - 15.9 Location Headers and Spoofing .....................144 - 16 Acknowledgments.......................................144 - 17 References............................................146 - 18 Authors' Addresses....................................149 - 19 Appendices............................................150 - 19.1 Internet Media Type message/http ..................150 - 19.2 Internet Media Type multipart/byteranges ..........150 - 19.3 Tolerant Applications .............................151 - 19.4 Differences Between HTTP Entities and - MIME Entities...........................................152 - 19.4.1 Conversion to Canonical Form ...................152 - 19.4.2 Conversion of Date Formats .....................153 - 19.4.3 Introduction of Content-Encoding ...............153 - 19.4.4 No Content-Transfer-Encoding ...................153 - 19.4.5 HTTP Header Fields in Multipart Body-Parts .....153 - 19.4.6 Introduction of Transfer-Encoding ..............154 - 19.4.7 MIME-Version ...................................154 - 19.5 Changes from HTTP/1.0 
.............................154 - 19.5.1 Changes to Simplify Multi-homed Web Servers and - Conserve IP Addresses .................................155 - 19.6 Additional Features ...............................156 - 19.6.1 Additional Request Methods .....................156 - 19.6.2 Additional Header Field Definitions ............156 - 19.7 Compatibility with Previous Versions ..............160 - 19.7.1 Compatibility with HTTP/1.0 Persistent - Connections............................................161 - - - - - - - - - - - -Fielding, et. al. Standards Track [Page 6] - -RFC 2068 HTTP/1.1 January 1997 - - -1 Introduction - -1.1 Purpose - - The Hypertext Transfer Protocol (HTTP) is an application-level - protocol for distributed, collaborative, hypermedia information - systems. HTTP has been in use by the World-Wide Web global - information initiative since 1990. The first version of HTTP, - referred to as HTTP/0.9, was a simple protocol for raw data transfer - across the Internet. HTTP/1.0, as defined by RFC 1945 [6], improved - the protocol by allowing messages to be in the format of MIME-like - messages, containing metainformation about the data transferred and - modifiers on the request/response semantics. However, HTTP/1.0 does - not sufficiently take into consideration the effects of hierarchical - proxies, caching, the need for persistent connections, and virtual - hosts. In addition, the proliferation of incompletely-implemented - applications calling themselves "HTTP/1.0" has necessitated a - protocol version change in order for two communicating applications - to determine each other's true capabilities. - - This specification defines the protocol referred to as "HTTP/1.1". - This protocol includes more stringent requirements than HTTP/1.0 in - order to ensure reliable implementation of its features. - - Practical information systems require more functionality than simple - retrieval, including search, front-end update, and annotation. 
HTTP - allows an open-ended set of methods that indicate the purpose of a - request. It builds on the discipline of reference provided by the - Uniform Resource Identifier (URI) [3][20], as a location (URL) [4] or - name (URN) , for indicating the resource to which a method is to be - applied. Messages are passed in a format similar to that used by - Internet mail as defined by the Multipurpose Internet Mail Extensions - (MIME). - - HTTP is also used as a generic protocol for communication between - user agents and proxies/gateways to other Internet systems, including - those supported by the SMTP [16], NNTP [13], FTP [18], Gopher [2], - and WAIS [10] protocols. In this way, HTTP allows basic hypermedia - access to resources available from diverse applications. - -1.2 Requirements - - This specification uses the same words as RFC 1123 [8] for defining - the significance of each particular requirement. These words are: - - MUST - This word or the adjective "required" means that the item is an - absolute requirement of the specification. - - - -Fielding, et. al. Standards Track [Page 7] - -RFC 2068 HTTP/1.1 January 1997 - - - SHOULD - This word or the adjective "recommended" means that there may - exist valid reasons in particular circumstances to ignore this - item, but the full implications should be understood and the case - carefully weighed before choosing a different course. - - MAY - This word or the adjective "optional" means that this item is - truly optional. One vendor may choose to include the item because - a particular marketplace requires it or because it enhances the - product, for example; another vendor may omit the same item. - - An implementation is not compliant if it fails to satisfy one or more - of the MUST requirements for the protocols it implements. 
An - implementation that satisfies all the MUST and all the SHOULD - requirements for its protocols is said to be "unconditionally - compliant"; one that satisfies all the MUST requirements but not all - the SHOULD requirements for its protocols is said to be - "conditionally compliant." - -1.3 Terminology - - This specification uses a number of terms to refer to the roles - played by participants in, and objects of, the HTTP communication. - - connection - A transport layer virtual circuit established between two programs - for the purpose of communication. - - message - The basic unit of HTTP communication, consisting of a structured - sequence of octets matching the syntax defined in section 4 and - transmitted via the connection. - - request - An HTTP request message, as defined in section 5. - - response - An HTTP response message, as defined in section 6. - - resource - A network data object or service that can be identified by a URI, - as defined in section 3.2. Resources may be available in multiple - representations (e.g. multiple languages, data formats, size, - resolutions) or vary in other ways. - - - - - - -Fielding, et. al. Standards Track [Page 8] - -RFC 2068 HTTP/1.1 January 1997 - - - entity - The information transferred as the payload of a request or - response. An entity consists of metainformation in the form of - entity-header fields and content in the form of an entity-body, as - described in section 7. - - representation - An entity included with a response that is subject to content - negotiation, as described in section 12. There may exist multiple - representations associated with a particular response status. - - content negotiation - The mechanism for selecting the appropriate representation when - servicing a request, as described in section 12. The - representation of entities in any response can be negotiated - (including error responses). 
- - variant - A resource may have one, or more than one, representation(s) - associated with it at any given instant. Each of these - representations is termed a `variant.' Use of the term `variant' - does not necessarily imply that the resource is subject to content - negotiation. - - client - A program that establishes connections for the purpose of sending - requests. - - user agent - The client which initiates a request. These are often browsers, - editors, spiders (web-traversing robots), or other end user tools. - - server - An application program that accepts connections in order to - service requests by sending back responses. Any given program may - be capable of being both a client and a server; our use of these - terms refers only to the role being performed by the program for a - particular connection, rather than to the program's capabilities - in general. Likewise, any server may act as an origin server, - proxy, gateway, or tunnel, switching behavior based on the nature - of each request. - - origin server - The server on which a given resource resides or is to be created. - - - - - - - -Fielding, et. al. Standards Track [Page 9] - -RFC 2068 HTTP/1.1 January 1997 - - - proxy - An intermediary program which acts as both a server and a client - for the purpose of making requests on behalf of other clients. - Requests are serviced internally or by passing them on, with - possible translation, to other servers. A proxy must implement - both the client and server requirements of this specification. - - gateway - A server which acts as an intermediary for some other server. - Unlike a proxy, a gateway receives requests as if it were the - origin server for the requested resource; the requesting client - may not be aware that it is communicating with a gateway. - - tunnel - An intermediary program which is acting as a blind relay between - two connections. 
Once active, a tunnel is not considered a party - to the HTTP communication, though the tunnel may have been - initiated by an HTTP request. The tunnel ceases to exist when both - ends of the relayed connections are closed. - - cache - A program's local store of response messages and the subsystem - that controls its message storage, retrieval, and deletion. A - cache stores cachable responses in order to reduce the response - time and network bandwidth consumption on future, equivalent - requests. Any client or server may include a cache, though a cache - cannot be used by a server that is acting as a tunnel. - - cachable - A response is cachable if a cache is allowed to store a copy of - the response message for use in answering subsequent requests. The - rules for determining the cachability of HTTP responses are - defined in section 13. Even if a resource is cachable, there may - be additional constraints on whether a cache can use the cached - copy for a particular request. - - first-hand - A response is first-hand if it comes directly and without - unnecessary delay from the origin server, perhaps via one or more - proxies. A response is also first-hand if its validity has just - been checked directly with the origin server. - - explicit expiration time - The time at which the origin server intends that an entity should - no longer be returned by a cache without further validation. - - - - - - -Fielding, et. al. Standards Track [Page 10] - -RFC 2068 HTTP/1.1 January 1997 - - - heuristic expiration time - An expiration time assigned by a cache when no explicit expiration - time is available. - - age - The age of a response is the time since it was sent by, or - successfully validated with, the origin server. - - freshness lifetime - The length of time between the generation of a response and its - expiration time. - - fresh - A response is fresh if its age has not yet exceeded its freshness - lifetime. 
- - stale - A response is stale if its age has passed its freshness lifetime. - - semantically transparent - A cache behaves in a "semantically transparent" manner, with - respect to a particular response, when its use affects neither the - requesting client nor the origin server, except to improve - performance. When a cache is semantically transparent, the client - receives exactly the same response (except for hop-by-hop headers) - that it would have received had its request been handled directly - by the origin server. - - validator - A protocol element (e.g., an entity tag or a Last-Modified time) - that is used to find out whether a cache entry is an equivalent - copy of an entity. - -1.4 Overall Operation - - The HTTP protocol is a request/response protocol. A client sends a - request to the server in the form of a request method, URI, and - protocol version, followed by a MIME-like message containing request - modifiers, client information, and possible body content over a - connection with a server. The server responds with a status line, - including the message's protocol version and a success or error code, - followed by a MIME-like message containing server information, entity - metainformation, and possible entity-body content. The relationship - between HTTP and MIME is described in appendix 19.4. - - - - - - - -Fielding, et. al. Standards Track [Page 11] - -RFC 2068 HTTP/1.1 January 1997 - - - Most HTTP communication is initiated by a user agent and consists of - a request to be applied to a resource on some origin server. In the - simplest case, this may be accomplished via a single connection (v) - between the user agent (UA) and the origin server (O). - - request chain ------------------------> - UA -------------------v------------------- O - <----------------------- response chain - - A more complicated situation occurs when one or more intermediaries - are present in the request/response chain. 
There are three common - forms of intermediary: proxy, gateway, and tunnel. A proxy is a - forwarding agent, receiving requests for a URI in its absolute form, - rewriting all or part of the message, and forwarding the reformatted - request toward the server identified by the URI. A gateway is a - receiving agent, acting as a layer above some other server(s) and, if - necessary, translating the requests to the underlying server's - protocol. A tunnel acts as a relay point between two connections - without changing the messages; tunnels are used when the - communication needs to pass through an intermediary (such as a - firewall) even when the intermediary cannot understand the contents - of the messages. - - request chain --------------------------------------> - UA -----v----- A -----v----- B -----v----- C -----v----- O - <------------------------------------- response chain - - The figure above shows three intermediaries (A, B, and C) between the - user agent and origin server. A request or response message that - travels the whole chain will pass through four separate connections. - This distinction is important because some HTTP communication options - may apply only to the connection with the nearest, non-tunnel - neighbor, only to the end-points of the chain, or to all connections - along the chain. Although the diagram is linear, each participant - may be engaged in multiple, simultaneous communications. For example, - B may be receiving requests from many clients other than A, and/or - forwarding requests to servers other than C, at the same time that it - is handling A's request. - - Any party to the communication which is not acting as a tunnel may - employ an internal cache for handling requests. The effect of a cache - is that the request/response chain is shortened if one of the - participants along the chain has a cached response applicable to that - request. 
The following illustrates the resulting chain if B has a - cached copy of an earlier response from O (via C) for a request which - has not been cached by UA or A. - - - - - -Fielding, et. al. Standards Track [Page 12] - -RFC 2068 HTTP/1.1 January 1997 - - - request chain ----------> - UA -----v----- A -----v----- B - - - - - - C - - - - - - O - <--------- response chain - - Not all responses are usefully cachable, and some requests may - contain modifiers which place special requirements on cache behavior. - HTTP requirements for cache behavior and cachable responses are - defined in section 13. - - In fact, there are a wide variety of architectures and configurations - of caches and proxies currently being experimented with or deployed - across the World Wide Web; these systems include national hierarchies - of proxy caches to save transoceanic bandwidth, systems that - broadcast or multicast cache entries, organizations that distribute - subsets of cached data via CD-ROM, and so on. HTTP systems are used - in corporate intranets over high-bandwidth links, and for access via - PDAs with low-power radio links and intermittent connectivity. The - goal of HTTP/1.1 is to support the wide diversity of configurations - already deployed while introducing protocol constructs that meet the - needs of those who build web applications that require high - reliability and, failing that, at least reliable indications of - failure. - - HTTP communication usually takes place over TCP/IP connections. The - default port is TCP 80, but other ports can be used. This does not - preclude HTTP from being implemented on top of any other protocol on - the Internet, or on other networks. HTTP only presumes a reliable - transport; any protocol that provides such guarantees can be used; - the mapping of the HTTP/1.1 request and response structures onto the - transport data units of the protocol in question is outside the scope - of this specification. 
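The message shapes described in section 1.4, and the default port mentioned above, can be illustrated with a short sketch. This is a minimal, non-authoritative example and not part of the RFC text. The function names (build_request, parse_status_line, host_header) and the host example.org are hypothetical and chosen only for illustration. It assumes the request line is "method SP URI SP version", that each line ends in CR LF, and that a status line always carries a reason phrase.

```python
CRLF = "\r\n"
DEFAULT_PORT = 80  # default TCP port named in the text above


def build_request(method, uri, headers, body=b""):
    """Serialize a request: request line, header lines, blank line, optional body."""
    lines = [f"{method} {uri} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers]
    head = CRLF.join(lines) + CRLF + CRLF
    return head.encode("ascii") + body


def parse_status_line(line):
    """Split a status line into (version, status code, reason phrase).

    Assumption: a reason phrase is present; a line without one raises ValueError.
    """
    version, code, reason = line.rstrip("\r\n").split(" ", 2)
    return version, int(code), reason


def host_header(host, port=DEFAULT_PORT):
    """Hypothetical helper: leave the port out of Host when it is the default."""
    return host if port == DEFAULT_PORT else f"{host}:{port}"


if __name__ == "__main__":
    request = build_request("GET", "/index.html", [("Host", host_header("example.org"))])
    print(request.decode("ascii"), end="")
    print(parse_status_line("HTTP/1.1 200 OK\r\n"))
```

The sketch only builds and parses byte strings. Sending them over a TCP connection, or over any other reliable transport, is left out, in keeping with the statement above that the mapping onto a transport is outside the scope of the specification.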
- - In HTTP/1.0, most implementations used a new connection for each - request/response exchange. In HTTP/1.1, a connection may be used for - one or more request/response exchanges, although connections may be - closed for a variety of reasons (see section 8.1). - -2 Notational Conventions and Generic Grammar - -2.1 Augmented BNF - - All of the mechanisms specified in this document are described in - both prose and an augmented Backus-Naur Form (BNF) similar to that - used by RFC 822 [9]. Implementers will need to be familiar with the - notation in order to understand this specification. The augmented BNF - includes the following constructs: - - - - - -Fielding, et. al. Standards Track [Page 13] - -RFC 2068 HTTP/1.1 January 1997 - - -name = definition - The name of a rule is simply the name itself (without any enclosing - "<" and ">") and is separated from its definition by the equal "=" - character. Whitespace is only significant in that indentation of - continuation lines is used to indicate a rule definition that spans - more than one line. Certain basic rules are in uppercase, such as - SP, LWS, HT, CRLF, DIGIT, ALPHA, etc. Angle brackets are used - within definitions whenever their presence will facilitate - discerning the use of rule names. - -"literal" - Quotation marks surround literal text. Unless stated otherwise, the - text is case-insensitive. - -rule1 | rule2 - Elements separated by a bar ("|") are alternatives, e.g., "yes | - no" will accept yes or no. - -(rule1 rule2) - Elements enclosed in parentheses are treated as a single element. - Thus, "(elem (foo | bar) elem)" allows the token sequences "elem - foo elem" and "elem bar elem". - -*rule - The character "*" preceding an element indicates repetition. The - full form is "<n>*<m>element" indicating at least <n> and at most - <m> occurrences of element. 
Default values are 0 and infinity so - that "*(element)" allows any number, including zero; "1*element" - requires at least one; and "1*2element" allows one or two. - -[rule] - Square brackets enclose optional elements; "[foo bar]" is - equivalent to "*1(foo bar)". - -N rule - Specific repetition: "<n>(element)" is equivalent to - "<n>*<n>(element)"; that is, exactly <n> occurrences of (element). - Thus 2DIGIT is a 2-digit number, and 3ALPHA is a string of three - alphabetic characters. - -#rule - A construct "#" is defined, similar to "*", for defining lists of - elements. The full form is "<n>#<m>element " indicating at least - <n> and at most <m> elements, each separated by one or more commas - (",") and optional linear whitespace (LWS). This makes the usual - form of lists very easy; a rule such as "( *LWS element *( *LWS "," - *LWS element )) " can be shown as "1#element". Wherever this - construct is used, null elements are allowed, but do not contribute - - - -Fielding, et. al. Standards Track [Page 14] - -RFC 2068 HTTP/1.1 January 1997 - - - to the count of elements present. That is, "(element), , (element) - " is permitted, but counts as only two elements. Therefore, where - at least one element is required, at least one non-null element - must be present. Default values are 0 and infinity so that - "#element" allows any number, including zero; "1#element" requires - at least one; and "1#2element" allows one or two. - -; comment - A semi-colon, set off some distance to the right of rule text, - starts a comment that continues to the end of line. This is a - simple way of including useful notes in parallel with the - specifications. - -implied *LWS - The grammar described by this specification is word-based. Except - where noted otherwise, linear whitespace (LWS) can be included - between any two adjacent words (token or quoted-string), and - between adjacent tokens and delimiters (tspecials), without - changing the interpretation of a field. 
At least one delimiter - (tspecials) must exist between any two tokens, since they would - otherwise be interpreted as a single token. - -2.2 Basic Rules - - The following rules are used throughout this specification to - describe basic parsing constructs. The US-ASCII coded character set - is defined by ANSI X3.4-1986 [21]. - - OCTET = <any 8-bit sequence of data> - CHAR = <any US-ASCII character (octets 0 - 127)> - UPALPHA = <any US-ASCII uppercase letter "A".."Z"> - LOALPHA = <any US-ASCII lowercase letter "a".."z"> - ALPHA = UPALPHA | LOALPHA - DIGIT = <any US-ASCII digit "0".."9"> - CTL = <any US-ASCII control character - (octets 0 - 31) and DEL (127)> - CR = <US-ASCII CR, carriage return (13)> - LF = <US-ASCII LF, linefeed (10)> - SP = <US-ASCII SP, space (32)> - HT = <US-ASCII HT, horizontal-tab (9)> - <"> = <US-ASCII double-quote mark (34)> - - - - - - - - - - -Fielding, et. al. Standards Track [Page 15] - -RFC 2068 HTTP/1.1 January 1997 - - - HTTP/1.1 defines the sequence CR LF as the end-of-line marker for all - protocol elements except the entity-body (see appendix 19.3 for - tolerant applications). The end-of-line marker within an entity-body - is defined by its associated media type, as described in section 3.7. - - CRLF = CR LF - - HTTP/1.1 headers can be folded onto multiple lines if the - continuation line begins with a space or horizontal tab. All linear - white space, including folding, has the same semantics as SP. - - LWS = [CRLF] 1*( SP | HT ) - - The TEXT rule is only used for descriptive field contents and values - that are not intended to be interpreted by the message parser. Words - of *TEXT may contain characters from character sets other than ISO - 8859-1 [22] only when encoded according to the rules of RFC 1522 - [14]. - - TEXT = <any OCTET except CTLs, - but including LWS> - - Hexadecimal numeric characters are used in several protocol elements. 
- - HEX = "A" | "B" | "C" | "D" | "E" | "F" - | "a" | "b" | "c" | "d" | "e" | "f" | DIGIT - - Many HTTP/1.1 header field values consist of words separated by LWS - or special characters. These special characters MUST be in a quoted - string to be used within a parameter value. - - token = 1*<any CHAR except CTLs or tspecials> - - tspecials = "(" | ")" | "<" | ">" | "@" - | "," | ";" | ":" | "\" | <"> - | "/" | "[" | "]" | "?" | "=" - | "{" | "}" | SP | HT - - Comments can be included in some HTTP header fields by surrounding - the comment text with parentheses. Comments are only allowed in - fields containing "comment" as part of their field value definition. - In all other fields, parentheses are considered part of the field - value. - - comment = "(" *( ctext | comment ) ")" - ctext = <any TEXT excluding "(" and ")"> - - - - - -Fielding, et. al. Standards Track [Page 16] - -RFC 2068 HTTP/1.1 January 1997 - - - A string of text is parsed as a single word if it is quoted using - double-quote marks. - - quoted-string = ( <"> *(qdtext) <"> ) - - qdtext = <any TEXT except <">> - - The backslash character ("\") may be used as a single-character quoting - mechanism only within quoted-string and comment constructs. - - quoted-pair = "\" CHAR - -3 Protocol Parameters - -3.1 HTTP Version - - HTTP uses a "<major>.<minor>" numbering scheme to indicate versions - of the protocol. The protocol versioning policy is intended to allow - the sender to indicate the format of a message and its capacity for - understanding further HTTP communication, rather than the features - obtained via that communication. No change is made to the version - number for the addition of message components which do not affect - communication behavior or which only add to extensible field values. 
- The <minor> number is incremented when the changes made to the - protocol add features which do not change the general message parsing - algorithm, but which may add to the message semantics and imply - additional capabilities of the sender. The <major> number is - incremented when the format of a message within the protocol is - changed. - - The version of an HTTP message is indicated by an HTTP-Version field - in the first line of the message. - - HTTP-Version = "HTTP" "/" 1*DIGIT "." 1*DIGIT - - Note that the major and minor numbers MUST be treated as separate - integers and that each may be incremented higher than a single digit. - Thus, HTTP/2.4 is a lower version than HTTP/2.13, which in turn is - lower than HTTP/12.3. Leading zeros MUST be ignored by recipients and - MUST NOT be sent. - - Applications sending Request or Response messages, as defined by this - specification, MUST include an HTTP-Version of "HTTP/1.1". Use of - this version number indicates that the sending application is at - least conditionally compliant with this specification. - - The HTTP version of an application is the highest HTTP version for - which the application is at least conditionally compliant. - - - -Fielding, et. al. Standards Track [Page 17] - -RFC 2068 HTTP/1.1 January 1997 - - - Proxy and gateway applications must be careful when forwarding - messages in protocol versions different from that of the application. - Since the protocol version indicates the protocol capability of the - sender, a proxy/gateway MUST never send a message with a version - indicator which is greater than its actual version; if a higher - version request is received, the proxy/gateway MUST either downgrade - the request version, respond with an error, or switch to tunnel - behavior. Requests with a version lower than that of the - proxy/gateway's version MAY be upgraded before being forwarded; the - proxy/gateway's response to that request MUST be in the same major - version as the request. 
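The version rules of section 3.1 can be sketched in a few lines. This is a minimal illustration, not part of the specification: the names parse_http_version and forwarded_version are hypothetical, and forwarded_version models only the "downgrade" option open to a proxy (responding with an error or tunneling are equally valid and are not shown).

```python
import re

# HTTP-Version = "HTTP" "/" 1*DIGIT "." 1*DIGIT
_HTTP_VERSION = re.compile(r"HTTP/([0-9]+)\.([0-9]+)")


def parse_http_version(text):
    """Return (major, minor) as two separate integers."""
    match = _HTTP_VERSION.fullmatch(text)
    if match is None:
        raise ValueError(f"not an HTTP-Version: {text!r}")
    # int() drops leading zeros, which recipients are told to ignore.
    return int(match.group(1)), int(match.group(2))


def forwarded_version(request_version, proxy_version=(1, 1)):
    """Pick the version for a forwarded request, never above the proxy's own.

    Assumption: the proxy chooses to downgrade and does not upgrade.
    """
    return min(request_version, proxy_version)
```

Comparing the tuples compares the major number first and the minor number second, which is why HTTP/2.13 sorts above HTTP/2.4 even though "2.13" is the smaller decimal fraction.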
- - Note: Converting between versions of HTTP may involve modification - of header fields required or forbidden by the versions involved. - -3.2 Uniform Resource Identifiers - - URIs have been known by many names: WWW addresses, Universal Document - Identifiers, Universal Resource Identifiers , and finally the - combination of Uniform Resource Locators (URL) and Names (URN). As - far as HTTP is concerned, Uniform Resource Identifiers are simply - formatted strings which identify--via name, location, or any other - characteristic--a resource. - -3.2.1 General Syntax - - URIs in HTTP can be represented in absolute form or relative to some - known base URI, depending upon the context of their use. The two - forms are differentiated by the fact that absolute URIs always begin - with a scheme name followed by a colon. - - URI = ( absoluteURI | relativeURI ) [ "#" fragment ] - - absoluteURI = scheme ":" *( uchar | reserved ) - - relativeURI = net_path | abs_path | rel_path - - net_path = "//" net_loc [ abs_path ] - abs_path = "/" rel_path - rel_path = [ path ] [ ";" params ] [ "?" query ] - - path = fsegment *( "/" segment ) - fsegment = 1*pchar - segment = *pchar - - params = param *( ";" param ) - param = *( pchar | "/" ) - - - - -Fielding, et. al. Standards Track [Page 18] - -RFC 2068 HTTP/1.1 January 1997 - - - scheme = 1*( ALPHA | DIGIT | "+" | "-" | "." ) - net_loc = *( pchar | ";" | "?" ) - - query = *( uchar | reserved ) - fragment = *( uchar | reserved ) - - pchar = uchar | ":" | "@" | "&" | "=" | "+" - uchar = unreserved | escape - unreserved = ALPHA | DIGIT | safe | extra | national - - escape = "%" HEX HEX - reserved = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" - extra = "!" | "*" | "'" | "(" | ")" | "," - safe = "$" | "-" | "_" | "." 
- unsafe = CTL | SP | <"> | "#" | "%" | "<" | ">" - national = <any OCTET excluding ALPHA, DIGIT, - reserved, extra, safe, and unsafe> - - For definitive information on URL syntax and semantics, see RFC 1738 - [4] and RFC 1808 [11]. The BNF above includes national characters not - allowed in valid URLs as specified by RFC 1738, since HTTP servers - are not restricted in the set of unreserved characters allowed to - represent the rel_path part of addresses, and HTTP proxies may - receive requests for URIs not defined by RFC 1738. - - The HTTP protocol does not place any a priori limit on the length of - a URI. Servers MUST be able to handle the URI of any resource they - serve, and SHOULD be able to handle URIs of unbounded length if they - provide GET-based forms that could generate such URIs. A server - SHOULD return 414 (Request-URI Too Long) status if a URI is longer - than the server can handle (see section 10.4.15). - - Note: Servers should be cautious about depending on URI lengths - above 255 bytes, because some older client or proxy implementations - may not properly support these lengths. - -3.2.2 http URL - - The "http" scheme is used to locate network resources via the HTTP - protocol. This section defines the scheme-specific syntax and - semantics for http URLs. - - - - - - - - - - -Fielding, et. al. Standards Track [Page 19] - -RFC 2068 HTTP/1.1 January 1997 - - - http_URL = "http:" "//" host [ ":" port ] [ abs_path ] - - host = <A legal Internet host domain name - or IP address (in dotted-decimal form), - as defined by Section 2.1 of RFC 1123> - - port = *DIGIT - - If the port is empty or not given, port 80 is assumed. The semantics - are that the identified resource is located at the server listening - for TCP connections on that port of that host, and the Request-URI - for the resource is abs_path. The use of IP addresses in URL's SHOULD - be avoided whenever possible (see RFC 1900 [24]). 
If the abs_path is - not present in the URL, it MUST be given as "/" when used as a - Request-URI for a resource (section 5.1.2). - -3.2.3 URI Comparison - - When comparing two URIs to decide if they match or not, a client - SHOULD use a case-sensitive octet-by-octet comparison of the entire - URIs, with these exceptions: - - o A port that is empty or not given is equivalent to the default - port for that URI; - - o Comparisons of host names MUST be case-insensitive; - - o Comparisons of scheme names MUST be case-insensitive; - - o An empty abs_path is equivalent to an abs_path of "/". - - Characters other than those in the "reserved" and "unsafe" sets (see - section 3.2) are equivalent to their ""%" HEX HEX" encodings. - - For example, the following three URIs are equivalent: - - http://abc.com:80/~smith/home.html - http://ABC.com/%7Esmith/home.html - http://ABC.com:/%7esmith/home.html - - - - - - - - - - - - -Fielding, et. al. Standards Track [Page 20] - -RFC 2068 HTTP/1.1 January 1997 - - -3.3 Date/Time Formats - -3.3.1 Full Date - - HTTP applications have historically allowed three different formats - for the representation of date/time stamps: - - Sun, 06 Nov 1994 08:49:37 GMT ; RFC 822, updated by RFC 1123 - Sunday, 06-Nov-94 08:49:37 GMT ; RFC 850, obsoleted by RFC 1036 - Sun Nov 6 08:49:37 1994 ; ANSI C's asctime() format - - The first format is preferred as an Internet standard and represents - a fixed-length subset of that defined by RFC 1123 (an update to RFC - 822). The second format is in common use, but is based on the - obsolete RFC 850 [12] date format and lacks a four-digit year. - HTTP/1.1 clients and servers that parse the date value MUST accept - all three formats (for compatibility with HTTP/1.0), though they MUST - only generate the RFC 1123 format for representing HTTP-date values - in header fields. 
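The accept-three, generate-one rule for HTTP-date values can be sketched as follows. This is an illustration under stated assumptions: parse_http_date and format_http_date are hypothetical names, the day and month names rely on the default "C" locale, and the expansion of the two-digit RFC 850 year follows Python's own pivot (69-99 map to 1969-1999), which this specification does not define.

```python
from datetime import datetime, timezone

# The three historical layouts, in the order the text lists them.
_HTTP_DATE_FORMATS = (
    "%a, %d %b %Y %H:%M:%S GMT",  # RFC 1123, the only form to generate
    "%A, %d-%b-%y %H:%M:%S GMT",  # RFC 850, two-digit year
    "%a %b %d %H:%M:%S %Y",       # asctime(), GMT must be assumed
)


def parse_http_date(value):
    """Parse any of the three accepted forms into an aware UTC datetime."""
    for fmt in _HTTP_DATE_FORMATS:
        try:
            parsed = datetime.strptime(value.strip(), fmt)
        except ValueError:
            continue  # try the next layout
        return parsed.replace(tzinfo=timezone.utc)
    raise ValueError(f"unrecognized HTTP-date: {value!r}")


def format_http_date(moment):
    """Render an aware datetime in the RFC 1123 form."""
    utc = moment.astimezone(timezone.utc)
    return utc.strftime("%a, %d %b %Y %H:%M:%S GMT")
```

Parsing is tolerant and formatting is strict, so a value received in either obsolete form is normalized to the RFC 1123 form when it is sent again.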
- - Note: Recipients of date values are encouraged to be robust in - accepting date values that may have been sent by non-HTTP - applications, as is sometimes the case when retrieving or posting - messages via proxies/gateways to SMTP or NNTP. - - All HTTP date/time stamps MUST be represented in Greenwich Mean Time - (GMT), without exception. This is indicated in the first two formats - by the inclusion of "GMT" as the three-letter abbreviation for time - zone, and MUST be assumed when reading the asctime format. - - HTTP-date = rfc1123-date | rfc850-date | asctime-date - - rfc1123-date = wkday "," SP date1 SP time SP "GMT" - rfc850-date = weekday "," SP date2 SP time SP "GMT" - asctime-date = wkday SP date3 SP time SP 4DIGIT - - date1 = 2DIGIT SP month SP 4DIGIT - ; day month year (e.g., 02 Jun 1982) - date2 = 2DIGIT "-" month "-" 2DIGIT - ; day-month-year (e.g., 02-Jun-82) - date3 = month SP ( 2DIGIT | ( SP 1DIGIT )) - ; month day (e.g., Jun 2) - - time = 2DIGIT ":" 2DIGIT ":" 2DIGIT - ; 00:00:00 - 23:59:59 - - wkday = "Mon" | "Tue" | "Wed" - | "Thu" | "Fri" | "Sat" | "Sun" - - - -Fielding, et. al. Standards Track [Page 21] - -RFC 2068 HTTP/1.1 January 1997 - - - weekday = "Monday" | "Tuesday" | "Wednesday" - | "Thursday" | "Friday" | "Saturday" | "Sunday" - - month = "Jan" | "Feb" | "Mar" | "Apr" - | "May" | "Jun" | "Jul" | "Aug" - | "Sep" | "Oct" | "Nov" | "Dec" - - Note: HTTP requirements for the date/time stamp format apply only - to their usage within the protocol stream. Clients and servers are - not required to use these formats for user presentation, request - logging, etc. - -3.3.2 Delta Seconds - - Some HTTP header fields allow a time value to be specified as an - integer number of seconds, represented in decimal, after the time - that the message was received. 
- - delta-seconds = 1*DIGIT - -3.4 Character Sets - - HTTP uses the same definition of the term "character set" as that - described for MIME: - - The term "character set" is used in this document to refer to a - method used with one or more tables to convert a sequence of octets - into a sequence of characters. Note that unconditional conversion - in the other direction is not required, in that not all characters - may be available in a given character set and a character set may - provide more than one sequence of octets to represent a particular - character. This definition is intended to allow various kinds of - character encodings, from simple single-table mappings such as US- - ASCII to complex table switching methods such as those that use ISO - 2022's techniques. However, the definition associated with a MIME - character set name MUST fully specify the mapping to be performed - from octets to characters. In particular, use of external profiling - information to determine the exact mapping is not permitted. - - Note: This use of the term "character set" is more commonly - referred to as a "character encoding." However, since HTTP and MIME - share the same registry, it is important that the terminology also - be shared. - - - - - - - - -Fielding, et. al. Standards Track [Page 22] - -RFC 2068 HTTP/1.1 January 1997 - - - HTTP character sets are identified by case-insensitive tokens. The - complete set of tokens is defined by the IANA Character Set registry - [19]. - - charset = token - - Although HTTP allows an arbitrary token to be used as a charset - value, any token that has a predefined value within the IANA - Character Set registry MUST represent the character set defined by - that registry. Applications SHOULD limit their use of character sets - to those defined by the IANA registry. - -3.5 Content Codings - - Content coding values indicate an encoding transformation that has - been or can be applied to an entity. 
Content codings are primarily - used to allow a document to be compressed or otherwise usefully - transformed without losing the identity of its underlying media type - and without loss of information. Frequently, the entity is stored in - coded form, transmitted directly, and only decoded by the recipient. - - content-coding = token - - All content-coding values are case-insensitive. HTTP/1.1 uses - content-coding values in the Accept-Encoding (section 14.3) and - Content-Encoding (section 14.12) header fields. Although the value - describes the content-coding, what is more important is that it - indicates what decoding mechanism will be required to remove the - encoding. - - The Internet Assigned Numbers Authority (IANA) acts as a registry for - content-coding value tokens. Initially, the registry contains the - following tokens: - - gzip An encoding format produced by the file compression program "gzip" - (GNU zip) as described in RFC 1952 [25]. This format is a Lempel- - Ziv coding (LZ77) with a 32 bit CRC. - - compress - The encoding format produced by the common UNIX file compression - program "compress". This format is an adaptive Lempel-Ziv-Welch - coding (LZW). - - - - - - - - - -Fielding, et. al. Standards Track [Page 23] - -RFC 2068 HTTP/1.1 January 1997 - - - Note: Use of program names for the identification of encoding - formats is not desirable and should be discouraged for future - encodings. Their use here is representative of historical practice, - not good design. For compatibility with previous implementations of - HTTP, applications should consider "x-gzip" and "x-compress" to be - equivalent to "gzip" and "compress" respectively. - - deflate The "zlib" format defined in RFC 1950[31] in combination with - the "deflate" compression mechanism described in RFC 1951[29]. 
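A recipient might map the registered tokens to decoders along these lines. This is a sketch only: decode_content is a hypothetical name, "x-gzip" is treated as equivalent to "gzip" as the note above suggests, and the "compress" coding (and its "x-compress" alias) is left out because the Python standard library has no LZW decoder.

```python
import gzip
import zlib

# Token to decoder, keyed by the lowercase form of the token.
_DECODERS = {
    "gzip": gzip.decompress,
    "x-gzip": gzip.decompress,   # historical alias for gzip
    "deflate": zlib.decompress,  # zlib wrapper (RFC 1950) around deflate data
}


def decode_content(body, content_coding):
    """Undo a single content coding named by its case-insensitive token."""
    token = content_coding.strip().lower()
    try:
        decoder = _DECODERS[token]
    except KeyError:
        raise ValueError(f"unsupported content-coding: {content_coding!r}") from None
    return decoder(body)
```

The lookup is keyed on the lowercased token because content-coding values are case-insensitive.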
- - New content-coding value tokens should be registered; to allow - interoperability between clients and servers, specifications of the - content coding algorithms needed to implement a new value should be - publicly available and adequate for independent implementation, and - conform to the purpose of content coding defined in this section. - -3.6 Transfer Codings - - Transfer coding values are used to indicate an encoding - transformation that has been, can be, or may need to be applied to an - entity-body in order to ensure "safe transport" through the network. - This differs from a content coding in that the transfer coding is a - property of the message, not of the original entity. - - transfer-coding = "chunked" | transfer-extension - - transfer-extension = token - - All transfer-coding values are case-insensitive. HTTP/1.1 uses - transfer coding values in the Transfer-Encoding header field (section - 14.40). - - Transfer codings are analogous to the Content-Transfer-Encoding - values of MIME , which were designed to enable safe transport of - binary data over a 7-bit transport service. However, safe transport - has a different focus for an 8bit-clean transfer protocol. In HTTP, - the only unsafe characteristic of message-bodies is the difficulty in - determining the exact body length (section 7.2.2), or the desire to - encrypt data over a shared transport. - - The chunked encoding modifies the body of a message in order to - transfer it as a series of chunks, each with its own size indicator, - followed by an optional footer containing entity-header fields. This - allows dynamically-produced content to be transferred along with the - information necessary for the recipient to verify that it has - received the full message. - - - - - -Fielding, et. al. 
Standards Track [Page 24] - -RFC 2068 HTTP/1.1 January 1997 - - - Chunked-Body = *chunk - "0" CRLF - footer - CRLF - - chunk = chunk-size [ chunk-ext ] CRLF - chunk-data CRLF - - hex-no-zero = <HEX excluding "0"> - - chunk-size = hex-no-zero *HEX - chunk-ext = *( ";" chunk-ext-name [ "=" chunk-ext-value ] ) - chunk-ext-name = token - chunk-ext-val = token | quoted-string - chunk-data = chunk-size(OCTET) - - footer = *entity-header - - The chunked encoding is ended by a zero-sized chunk followed by the - footer, which is terminated by an empty line. The purpose of the - footer is to provide an efficient way to supply information about an - entity that is generated dynamically; applications MUST NOT send - header fields in the footer which are not explicitly defined as being - appropriate for the footer, such as Content-MD5 or future extensions - to HTTP for digital signatures or other facilities. - - An example process for decoding a Chunked-Body is presented in - appendix 19.4.6. - - All HTTP/1.1 applications MUST be able to receive and decode the - "chunked" transfer coding, and MUST ignore transfer coding extensions - they do not understand. A server which receives an entity-body with a - transfer-coding it does not understand SHOULD return 501 - (Unimplemented), and close the connection. A server MUST NOT send - transfer-codings to an HTTP/1.0 client. - -3.7 Media Types - - HTTP uses Internet Media Types in the Content-Type (section 14.18) - and Accept (section 14.1) header fields in order to provide open and - extensible data typing and type negotiation. - - media-type = type "/" subtype *( ";" parameter ) - type = token - subtype = token - - Parameters may follow the type/subtype in the form of attribute/value - pairs. - - - -Fielding, et. al. 
Standards Track [Page 25] - -RFC 2068 HTTP/1.1 January 1997 - - - parameter = attribute "=" value - attribute = token - value = token | quoted-string - - The type, subtype, and parameter attribute names are case- - insensitive. Parameter values may or may not be case-sensitive, - depending on the semantics of the parameter name. Linear white space - (LWS) MUST NOT be used between the type and subtype, nor between an - attribute and its value. User agents that recognize the media-type - MUST process (or arrange to be processed by any external applications - used to process that type/subtype by the user agent) the parameters - for that MIME type as described by that type/subtype definition to - the and inform the user of any problems discovered. - - Note: some older HTTP applications do not recognize media type - parameters. When sending data to older HTTP applications, - implementations should only use media type parameters when they are - required by that type/subtype definition. - - Media-type values are registered with the Internet Assigned Number - Authority (IANA). The media type registration process is outlined in - RFC 2048 [17]. Use of non-registered media types is discouraged. - -3.7.1 Canonicalization and Text Defaults - - Internet media types are registered with a canonical form. In - general, an entity-body transferred via HTTP messages MUST be - represented in the appropriate canonical form prior to its - transmission; the exception is "text" types, as defined in the next - paragraph. - - When in canonical form, media subtypes of the "text" type use CRLF as - the text line break. HTTP relaxes this requirement and allows the - transport of text media with plain CR or LF alone representing a line - break when it is done consistently for an entire entity-body. HTTP - applications MUST accept CRLF, bare CR, and bare LF as being - representative of a line break in text media received via HTTP. 
In - addition, if the text is represented in a character set that does not - use octets 13 and 10 for CR and LF respectively, as is the case for - some multi-byte character sets, HTTP allows the use of whatever octet - sequences are defined by that character set to represent the - equivalent of CR and LF for line breaks. This flexibility regarding - line breaks applies only to text media in the entity-body; a bare CR - or LF MUST NOT be substituted for CRLF within any of the HTTP control - structures (such as header fields and multipart boundaries). - - If an entity-body is encoded with a Content-Encoding, the underlying - data MUST be in a form defined above prior to being encoded. - - - -Fielding, et. al. Standards Track [Page 26] - -RFC 2068 HTTP/1.1 January 1997 - - - The "charset" parameter is used with some media types to define the - character set (section 3.4) of the data. When no explicit charset - parameter is provided by the sender, media subtypes of the "text" - type are defined to have a default charset value of "ISO-8859-1" when - received via HTTP. Data in character sets other than "ISO-8859-1" or - its subsets MUST be labeled with an appropriate charset value. - - Some HTTP/1.0 software has interpreted a Content-Type header without - charset parameter incorrectly to mean "recipient should guess." - Senders wishing to defeat this behavior MAY include a charset - parameter even when the charset is ISO-8859-1 and SHOULD do so when - it is known that it will not confuse the recipient. - - Unfortunately, some older HTTP/1.0 clients did not deal properly with - an explicit charset parameter. HTTP/1.1 recipients MUST respect the - charset label provided by the sender; and those user agents that have - a provision to "guess" a charset MUST use the charset from the - content-type field if they support that charset, rather than the - recipient's preference, when initially displaying a document. 
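The charset default described above can be illustrated with a small helper. This is a simplified sketch: split_media_type and effective_charset are hypothetical names, and the parameter parsing does not handle a quoted-string value that itself contains ";" or "=".

```python
def split_media_type(content_type):
    """Return (lowercased type/subtype, dict of parameters)."""
    media_type, _, rest = content_type.partition(";")
    params = {}
    for piece in rest.split(";"):
        name, sep, value = piece.partition("=")
        if sep:
            # Attribute names are case-insensitive; values are kept as sent.
            params[name.strip().lower()] = value.strip().strip('"')
    return media_type.strip().lower(), params


def effective_charset(content_type):
    """Charset a recipient should assume for a Content-Type value."""
    media_type, params = split_media_type(content_type)
    if "charset" in params:
        return params["charset"]  # an explicit label always wins
    if media_type.startswith("text/"):
        return "ISO-8859-1"       # default for "text" subtypes over HTTP
    return None                   # no default is defined for other types
```

An explicit label is returned unchanged, so a recipient that can "guess" still starts from the value the sender supplied.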
- -3.7.2 Multipart Types - - MIME provides for a number of "multipart" types -- encapsulations of - one or more entities within a single message-body. All multipart - types share a common syntax, as defined in MIME [7], and MUST - include a boundary parameter as part of the media type value. The - message body is itself a protocol element and MUST therefore use only - CRLF to represent line breaks between body-parts. Unlike in MIME, the - epilogue of any multipart message MUST be empty; HTTP applications - MUST NOT transmit the epilogue (even if the original multipart - contains an epilogue). - - In HTTP, multipart body-parts MAY contain header fields which are - significant to the meaning of that part. A Content-Location header - field (section 14.15) SHOULD be included in the body-part of each - enclosed entity that can be identified by a URL. - - In general, an HTTP user agent SHOULD follow the same or similar - behavior as a MIME user agent would upon receipt of a multipart type. - If an application receives an unrecognized multipart subtype, the - application MUST treat it as being equivalent to "multipart/mixed". - - Note: The "multipart/form-data" type has been specifically defined - for carrying form data suitable for processing via the POST request - method, as described in RFC 1867 [15]. - - - - - - -Fielding, et. al. Standards Track [Page 27] - -RFC 2068 HTTP/1.1 January 1997 - - -3.8 Product Tokens - - Product tokens are used to allow communicating applications to - identify themselves by software name and version. Most fields using - product tokens also allow sub-products which form a significant part - of the application to be listed, separated by whitespace. By - convention, the products are listed in order of their significance - for identifying the application. 
- - product = token ["/" product-version] - product-version = token - - Examples: - - User-Agent: CERN-LineMode/2.15 libwww/2.17b3 - Server: Apache/0.8.4 - - Product tokens should be short and to the point -- use of them for - advertising or other non-essential information is explicitly - forbidden. Although any token character may appear in a product- - version, this token SHOULD only be used for a version identifier - (i.e., successive versions of the same product SHOULD only differ in - the product-version portion of the product value). - -3.9 Quality Values - - HTTP content negotiation (section 12) uses short "floating point" - numbers to indicate the relative importance ("weight") of various - negotiable parameters. A weight is normalized to a real number in the - range 0 through 1, where 0 is the minimum and 1 the maximum value. - HTTP/1.1 applications MUST NOT generate more than three digits after - the decimal point. User configuration of these values SHOULD also be - limited in this fashion. - - qvalue = ( "0" [ "." 0*3DIGIT ] ) - | ( "1" [ "." 0*3("0") ] ) - - "Quality values" is a misnomer, since these values merely represent - relative degradation in desired quality. - -3.10 Language Tags - - A language tag identifies a natural language spoken, written, or - otherwise conveyed by human beings for communication of information - to other human beings. Computer languages are explicitly excluded. - HTTP uses language tags within the Accept-Language and Content- - Language fields. - - - - -Fielding, et. al. Standards Track [Page 28] - -RFC 2068 HTTP/1.1 January 1997 - - - The syntax and registry of HTTP language tags is the same as that - defined by RFC 1766 [1]. 
In summary, a language tag is composed of 1 - or more parts: A primary language tag and a possibly empty series of - subtags: - - language-tag = primary-tag *( "-" subtag ) - - primary-tag = 1*8ALPHA - subtag = 1*8ALPHA - - Whitespace is not allowed within the tag and all tags are case- - insensitive. The name space of language tags is administered by the - IANA. Example tags include: - - en, en-US, en-cockney, i-cherokee, x-pig-latin - - where any two-letter primary-tag is an ISO 639 language abbreviation - and any two-letter initial subtag is an ISO 3166 country code. (The - last three tags above are not registered tags; all but the last are - examples of tags which could be registered in future.) - -3.11 Entity Tags - - Entity tags are used for comparing two or more entities from the same - requested resource. HTTP/1.1 uses entity tags in the ETag (section - 14.20), If-Match (section 14.25), If-None-Match (section 14.26), and - If-Range (section 14.27) header fields. The definition of how they - are used and compared as cache validators is in section 13.3.3. An - entity tag consists of an opaque quoted string, possibly prefixed by - a weakness indicator. - - entity-tag = [ weak ] opaque-tag - - weak = "W/" - opaque-tag = quoted-string - - A "strong entity tag" may be shared by two entities of a resource - only if they are equivalent by octet equality. - - A "weak entity tag," indicated by the "W/" prefix, may be shared by - two entities of a resource only if the entities are equivalent and - could be substituted for each other with no significant change in - semantics. A weak entity tag can only be used for weak comparison. - - An entity tag MUST be unique across all versions of all entities - associated with a particular resource. A given entity tag value may - be used for entities obtained by requests on different URIs without - implying anything about the equivalence of those entities. - - - -Fielding, et. al. 
Standards Track [Page 29] - -RFC 2068 HTTP/1.1 January 1997 - - -3.12 Range Units - - HTTP/1.1 allows a client to request that only part (a range of) the - response entity be included within the response. HTTP/1.1 uses range - units in the Range (section 14.36) and Content-Range (section 14.17) - header fields. An entity may be broken down into subranges according - to various structural units. - - range-unit = bytes-unit | other-range-unit - - bytes-unit = "bytes" - other-range-unit = token - -The only range unit defined by HTTP/1.1 is "bytes". HTTP/1.1 - implementations may ignore ranges specified using other units. - HTTP/1.1 has been designed to allow implementations of applications - that do not depend on knowledge of ranges. - -4 HTTP Message - -4.1 Message Types - - HTTP messages consist of requests from client to server and responses - from server to client. - - HTTP-message = Request | Response ; HTTP/1.1 messages - - Request (section 5) and Response (section 6) messages use the generic - message format of RFC 822 [9] for transferring entities (the payload - of the message). Both types of message consist of a start-line, one - or more header fields (also known as "headers"), an empty line (i.e., - a line with nothing preceding the CRLF) indicating the end of the - header fields, and an optional message-body. - - generic-message = start-line - *message-header - CRLF - [ message-body ] - - start-line = Request-Line | Status-Line - - In the interest of robustness, servers SHOULD ignore any empty - line(s) received where a Request-Line is expected. In other words, if - the server is reading the protocol stream at the beginning of a - message and receives a CRLF first, it should ignore the CRLF. - - - - - - -Fielding, et. al. Standards Track [Page 30] - -RFC 2068 HTTP/1.1 January 1997 - - - Note: certain buggy HTTP/1.0 client implementations generate an - extra CRLF's after a POST request. 
To restate what is explicitly - forbidden by the BNF, an HTTP/1.1 client must not preface or follow - a request with an extra CRLF. - -4.2 Message Headers - - HTTP header fields, which include general-header (section 4.5), - request-header (section 5.3), response-header (section 6.2), and - entity-header (section 7.1) fields, follow the same generic format as - that given in Section 3.1 of RFC 822 [9]. Each header field consists - of a name followed by a colon (":") and the field value. Field names - are case-insensitive. The field value may be preceded by any amount - of LWS, though a single SP is preferred. Header fields can be - extended over multiple lines by preceding each extra line with at - least one SP or HT. Applications SHOULD follow "common form" when - generating HTTP constructs, since there might exist some - implementations that fail to accept anything beyond the common forms. - - message-header = field-name ":" [ field-value ] CRLF - - field-name = token - field-value = *( field-content | LWS ) - - field-content = <the OCTETs making up the field-value - and consisting of either *TEXT or combinations - of token, tspecials, and quoted-string> - - The order in which header fields with differing field names are - received is not significant. However, it is "good practice" to send - general-header fields first, followed by request-header or response- - header fields, and ending with the entity-header fields. - - Multiple message-header fields with the same field-name may be - present in a message if and only if the entire field-value for that - header field is defined as a comma-separated list [i.e., #(values)]. - It MUST be possible to combine the multiple header fields into one - "field-name: field-value" pair, without changing the semantics of the - message, by appending each subsequent field-value to the first, each - separated by a comma. 
The order in which header fields with the same - field-name are received is therefore significant to the - interpretation of the combined field value, and thus a proxy MUST NOT - change the order of these field values when a message is forwarded. - - - - - - - - -Fielding, et. al. Standards Track [Page 31] - -RFC 2068 HTTP/1.1 January 1997 - - -4.3 Message Body - - The message-body (if any) of an HTTP message is used to carry the - entity-body associated with the request or response. The message-body - differs from the entity-body only when a transfer coding has been - applied, as indicated by the Transfer-Encoding header field (section - 14.40). - - message-body = entity-body - | <entity-body encoded as per Transfer-Encoding> - - Transfer-Encoding MUST be used to indicate any transfer codings - applied by an application to ensure safe and proper transfer of the - message. Transfer-Encoding is a property of the message, not of the - entity, and thus can be added or removed by any application along the - request/response chain. - - The rules for when a message-body is allowed in a message differ for - requests and responses. - - The presence of a message-body in a request is signaled by the - inclusion of a Content-Length or Transfer-Encoding header field in - the request's message-headers. A message-body MAY be included in a - request only when the request method (section 5.1.1) allows an - entity-body. - - For response messages, whether or not a message-body is included with - a message is dependent on both the request method and the response - status code (section 6.1.1). All responses to the HEAD request method - MUST NOT include a message-body, even though the presence of entity- - header fields might lead one to believe they do. All 1xx - (informational), 204 (no content), and 304 (not modified) responses - MUST NOT include a message-body. All other responses do include a - message-body, although it may be of zero length. 
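The rules in section 4.3 for when a response may carry a message-body can be sketched as a small predicate. This is an illustrative sketch only; the function name `response_may_have_body` is hypothetical and is not taken from the RFC or from libsoup.

```python
def response_may_have_body(request_method: str, status_code: int) -> bool:
    """Return True if a response is permitted to include a message-body.

    Hypothetical helper illustrating section 4.3: responses to HEAD,
    all 1xx responses, and 204 / 304 responses never carry a body.
    """
    # Method tokens are case-sensitive, so compare exactly.
    if request_method == "HEAD":
        return False
    if 100 <= status_code < 200:
        return False
    if status_code in (204, 304):
        return False
    # Every other response includes a body, possibly of zero length.
    return True
```

Note that a True result does not imply a non-empty body; it only means a body (possibly zero-length) is part of the message.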
- -4.4 Message Length - - When a message-body is included with a message, the length of that - body is determined by one of the following (in order of precedence): - - 1. Any response message which MUST NOT include a message-body - (such as the 1xx, 204, and 304 responses and any response to a HEAD - request) is always terminated by the first empty line after the - header fields, regardless of the entity-header fields present in the - message. - - 2. If a Transfer-Encoding header field (section 14.40) is present and - indicates that the "chunked" transfer coding has been applied, then - - - -Fielding, et. al. Standards Track [Page 32] - -RFC 2068 HTTP/1.1 January 1997 - - - the length is defined by the chunked encoding (section 3.6). - - 3. If a Content-Length header field (section 14.14) is present, its - value in bytes represents the length of the message-body. - - 4. If the message uses the media type "multipart/byteranges", which is - self-delimiting, then that defines the length. This media type MUST - NOT be used unless the sender knows that the recipient can parse it; - the presence in a request of a Range header with multiple byte-range - specifiers implies that the client can parse multipart/byteranges - responses. - - 5. By the server closing the connection. (Closing the connection - cannot be used to indicate the end of a request body, since that - would leave no possibility for the server to send back a response.) - - For compatibility with HTTP/1.0 applications, HTTP/1.1 requests - containing a message-body MUST include a valid Content-Length header - field unless the server is known to be HTTP/1.1 compliant. If a - request contains a message-body and a Content-Length is not given, - the server SHOULD respond with 400 (bad request) if it cannot - determine the length of the message, or with 411 (length required) if - it wishes to insist on receiving a valid Content-Length. 
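The order of precedence listed above can be sketched for response messages as follows. This is a minimal illustration under stated assumptions: `headers` is assumed to be a dict whose keys are already lowercased field names, and the function name and the returned labels are hypothetical.

```python
def response_body_framing(headers, request_method, status_code):
    """Return (strategy, length) following the precedence of section 4.4.

    Assumes `headers` maps lowercased field names to string values.
    """
    # Rule 1: responses that MUST NOT include a message-body.
    if (request_method == "HEAD" or 100 <= status_code < 200
            or status_code in (204, 304)):
        return ("none", 0)

    # Rule 2: the chunked transfer coding defines the length and
    # takes precedence over any Content-Length that is also present.
    codings = [c.strip().lower()
               for c in headers.get("transfer-encoding", "").split(",")
               if c.strip()]
    if "chunked" in codings:
        return ("chunked", None)

    # Rule 3: Content-Length gives the length in bytes.
    if "content-length" in headers:
        value = headers["content-length"].strip()
        if not (value.isascii() and value.isdigit()):
            raise ValueError("invalid Content-Length: %r" % value)
        return ("content-length", int(value))

    # Rule 4: multipart/byteranges is self-delimiting.
    ctype = headers.get("content-type", "").strip().lower()
    if ctype.startswith("multipart/byteranges"):
        return ("multipart/byteranges", None)

    # Rule 5: the server closes the connection.
    return ("close", None)
```

The sketch covers responses only, because rule 5 cannot apply to request bodies.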
- - All HTTP/1.1 applications that receive entities MUST accept the - "chunked" transfer coding (section 3.6), thus allowing this mechanism - to be used for messages when the message length cannot be determined - in advance. - - Messages MUST NOT include both a Content-Length header field and the - "chunked" transfer coding. If both are received, the Content-Length - MUST be ignored. - - When a Content-Length is given in a message where a message-body is - allowed, its field value MUST exactly match the number of OCTETs in - the message-body. HTTP/1.1 user agents MUST notify the user when an - invalid length is received and detected. - - - - - - - - - - - - - - -Fielding, et. al. Standards Track [Page 33] - -RFC 2068 HTTP/1.1 January 1997 - - -4.5 General Header Fields - - There are a few header fields which have general applicability for - both request and response messages, but which do not apply to the - entity being transferred. These header fields apply only to the - message being transmitted. - - general-header = Cache-Control ; Section 14.9 - | Connection ; Section 14.10 - | Date ; Section 14.19 - | Pragma ; Section 14.32 - | Transfer-Encoding ; Section 14.40 - | Upgrade ; Section 14.41 - | Via ; Section 14.44 - - General-header field names can be extended reliably only in - combination with a change in the protocol version. However, new or - experimental header fields may be given the semantics of general - header fields if all parties in the communication recognize them to - be general-header fields. Unrecognized header fields are treated as - entity-header fields. - -5 Request - - A request message from a client to a server includes, within the - first line of that message, the method to be applied to the resource, - the identifier of the resource, and the protocol version in use. 
- - Request = Request-Line ; Section 5.1 - *( general-header ; Section 4.5 - | request-header ; Section 5.3 - | entity-header ) ; Section 7.1 - CRLF - [ message-body ] ; Section 7.2 - -5.1 Request-Line - - The Request-Line begins with a method token, followed by the - Request-URI and the protocol version, and ending with CRLF. The - elements are separated by SP characters. No CR or LF are allowed - except in the final CRLF sequence. - - Request-Line = Method SP Request-URI SP HTTP-Version CRLF - - - - - - - - -Fielding, et. al. Standards Track [Page 34] - -RFC 2068 HTTP/1.1 January 1997 - - -5.1.1 Method - - The Method token indicates the method to be performed on the resource - identified by the Request-URI. The method is case-sensitive. - - Method = "OPTIONS" ; Section 9.2 - | "GET" ; Section 9.3 - | "HEAD" ; Section 9.4 - | "POST" ; Section 9.5 - | "PUT" ; Section 9.6 - | "DELETE" ; Section 9.7 - | "TRACE" ; Section 9.8 - | extension-method - - extension-method = token - - The list of methods allowed by a resource can be specified in an - Allow header field (section 14.7). The return code of the response - always notifies the client whether a method is currently allowed on a - resource, since the set of allowed methods can change dynamically. - Servers SHOULD return the status code 405 (Method Not Allowed) if the - method is known by the server but not allowed for the requested - resource, and 501 (Not Implemented) if the method is unrecognized or - not implemented by the server. The list of methods known by a server - can be listed in a Public response-header field (section 14.35). - - The methods GET and HEAD MUST be supported by all general-purpose - servers. All other methods are optional; however, if the above - methods are implemented, they MUST be implemented with the same - semantics as those specified in section 9. 
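As an illustration of sections 5.1 and 5.1.1, the sketch below splits a Request-Line into its three elements and chooses between 405 and 501. It assumes a hypothetical server that knows exactly the seven methods named in the grammar; `KNOWN_METHODS`, `parse_request_line` and `method_error_status` are illustrative names, not part of any real API.

```python
# Assumption: this hypothetical server knows the seven methods
# listed in the Method grammar and no extension methods.
KNOWN_METHODS = {"OPTIONS", "GET", "HEAD", "POST", "PUT", "DELETE", "TRACE"}


def parse_request_line(line: str):
    """Split 'Method SP Request-URI SP HTTP-Version CRLF' into a tuple."""
    if not line.endswith("\r\n"):
        raise ValueError("Request-Line must end with CRLF")
    core = line[:-2]
    # No CR or LF is allowed except in the final CRLF sequence.
    if "\r" in core or "\n" in core:
        raise ValueError("stray CR or LF in Request-Line")
    parts = core.split(" ")
    if len(parts) != 3 or not all(parts):
        raise ValueError("expected exactly three SP-separated elements")
    method, request_uri, version = parts
    return method, request_uri, version


def method_error_status(method: str, allowed_for_resource: set):
    """Return 501, 405, or None (no error) for the given method."""
    # Methods are case-sensitive, so "get" is not "GET".
    if method not in KNOWN_METHODS:
        return 501  # Not Implemented
    if method not in allowed_for_resource:
        return 405  # Method Not Allowed
    return None
```

For example, `parse_request_line("GET /pub/WWW/TheProject.html HTTP/1.1\r\n")` yields the method, the Request-URI and the version as three strings.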
- -5.1.2 Request-URI - - The Request-URI is a Uniform Resource Identifier (section 3.2) and - identifies the resource upon which to apply the request. - - Request-URI = "*" | absoluteURI | abs_path - - The three options for Request-URI are dependent on the nature of the - request. The asterisk "*" means that the request does not apply to a - particular resource, but to the server itself, and is only allowed - when the method used does not necessarily apply to a resource. One - example would be - - OPTIONS * HTTP/1.1 - - The absoluteURI form is required when the request is being made to a - proxy. The proxy is requested to forward the request or service it - - - -Fielding, et. al. Standards Track [Page 35] - -RFC 2068 HTTP/1.1 January 1997 - - - from a valid cache, and return the response. Note that the proxy MAY - forward the request on to another proxy or directly to the server - specified by the absoluteURI. In order to avoid request loops, a - proxy MUST be able to recognize all of its server names, including - any aliases, local variations, and the numeric IP address. An example - Request-Line would be: - - GET http://www.w3.org/pub/WWW/TheProject.html HTTP/1.1 - - To allow for transition to absoluteURIs in all requests in future - versions of HTTP, all HTTP/1.1 servers MUST accept the absoluteURI - form in requests, even though HTTP/1.1 clients will only generate - them in requests to proxies. - - The most common form of Request-URI is that used to identify a - resource on an origin server or gateway. In this case the absolute - path of the URI MUST be transmitted (see section 3.2.1, abs_path) as - the Request-URI, and the network location of the URI (net_loc) MUST - be transmitted in a Host header field. 
For example, a client wishing - to retrieve the resource above directly from the origin server would - create a TCP connection to port 80 of the host "www.w3.org" and send - the lines: - - GET /pub/WWW/TheProject.html HTTP/1.1 - Host: www.w3.org - - followed by the remainder of the Request. Note that the absolute path - cannot be empty; if none is present in the original URI, it MUST be - given as "/" (the server root). - - If a proxy receives a request without any path in the Request-URI and - the method specified is capable of supporting the asterisk form of - request, then the last proxy on the request chain MUST forward the - request with "*" as the final Request-URI. For example, the request - - OPTIONS http://www.ics.uci.edu:8001 HTTP/1.1 - - would be forwarded by the proxy as - - OPTIONS * HTTP/1.1 - Host: www.ics.uci.edu:8001 - - after connecting to port 8001 of host "www.ics.uci.edu". - - The Request-URI is transmitted in the format specified in section - 3.2.1. The origin server MUST decode the Request-URI in order to - properly interpret the request. Servers SHOULD respond to invalid - Request-URIs with an appropriate status code. - - - -Fielding, et. al. Standards Track [Page 36] - -RFC 2068 HTTP/1.1 January 1997 - - - In requests that they forward, proxies MUST NOT rewrite the - "abs_path" part of a Request-URI in any way except as noted above to - replace a null abs_path with "*", no matter what the proxy does in - its internal implementation. - - Note: The "no rewrite" rule prevents the proxy from changing the - meaning of the request when the origin server is improperly using a - non-reserved URL character for a reserved purpose. Implementers - should be aware that some pre-HTTP/1.1 proxies have been known to - rewrite the Request-URI. 
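The conversion a proxy performs when it forwards an absoluteURI request directly to the origin server (section 5.1.2) might be sketched as below. The function name `to_origin_form` is hypothetical, and the sketch assumes that OPTIONS is the only method capable of supporting the asterisk form. The path is passed through untouched, in line with the "no rewrite" rule.

```python
from urllib.parse import urlsplit


def to_origin_form(absolute_uri: str, method: str):
    """Return (request_uri, host_header_value) for the origin server."""
    parts = urlsplit(absolute_uri)
    if not parts.scheme or not parts.netloc:
        raise ValueError("not an absoluteURI: %r" % absolute_uri)

    path = parts.path
    if not path:
        # Assumption: only OPTIONS supports the "*" form.
        if method == "OPTIONS" and not parts.query:
            return "*", parts.netloc
        # The absolute path cannot be empty; use the server root.
        path = "/"

    # Keep the path exactly as received; only re-attach the query.
    request_uri = path + ("?" + parts.query if parts.query else "")
    return request_uri, parts.netloc
```

With the examples from the text, `to_origin_form("http://www.ics.uci.edu:8001", "OPTIONS")` gives `("*", "www.ics.uci.edu:8001")`.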
- -5.2 The Resource Identified by a Request - - HTTP/1.1 origin servers SHOULD be aware that the exact resource - identified by an Internet request is determined by examining both the - Request-URI and the Host header field. - - An origin server that does not allow resources to differ by the - requested host MAY ignore the Host header field value. (But see - section 19.5.1 for other requirements on Host support in HTTP/1.1.) - - An origin server that does differentiate resources based on the host - requested (sometimes referred to as virtual hosts or vanity - hostnames) MUST use the following rules for determining the requested - resource on an HTTP/1.1 request: - - 1. If Request-URI is an absoluteURI, the host is part of the - Request-URI. Any Host header field value in the request MUST be - ignored. - - 2. If the Request-URI is not an absoluteURI, and the request - includes a Host header field, the host is determined by the Host - header field value. - - 3. If the host as determined by rule 1 or 2 is not a valid host on - the server, the response MUST be a 400 (Bad Request) error - message. - - Recipients of an HTTP/1.0 request that lacks a Host header field MAY - attempt to use heuristics (e.g., examination of the URI path for - something unique to a particular host) in order to determine what - exact resource is being requested. - -5.3 Request Header Fields - - The request-header fields allow the client to pass additional - information about the request, and about the client itself, to the - server. These fields act as request modifiers, with semantics - - - -Fielding, et. al. Standards Track [Page 37] - -RFC 2068 HTTP/1.1 January 1997 - - - equivalent to the parameters on a programming language method - invocation. 
- - request-header = Accept ; Section 14.1 - | Accept-Charset ; Section 14.2 - | Accept-Encoding ; Section 14.3 - | Accept-Language ; Section 14.4 - | Authorization ; Section 14.8 - | From ; Section 14.22 - | Host ; Section 14.23 - | If-Modified-Since ; Section 14.24 - | If-Match ; Section 14.25 - | If-None-Match ; Section 14.26 - | If-Range ; Section 14.27 - | If-Unmodified-Since ; Section 14.28 - | Max-Forwards ; Section 14.31 - | Proxy-Authorization ; Section 14.34 - | Range ; Section 14.36 - | Referer ; Section 14.37 - | User-Agent ; Section 14.42 - - Request-header field names can be extended reliably only in - combination with a change in the protocol version. However, new or - experimental header fields MAY be given the semantics of request- - header fields if all parties in the communication recognize them to - be request-header fields. Unrecognized header fields are treated as - entity-header fields. - -6 Response - - After receiving and interpreting a request message, a server responds - with an HTTP response message. - - Response = Status-Line ; Section 6.1 - *( general-header ; Section 4.5 - | response-header ; Section 6.2 - | entity-header ) ; Section 7.1 - CRLF - [ message-body ] ; Section 7.2 - -6.1 Status-Line - - The first line of a Response message is the Status-Line, consisting - of the protocol version followed by a numeric status code and its - associated textual phrase, with each element separated by SP - characters. No CR or LF is allowed except in the final CRLF - sequence. - - - - -Fielding, et. al. Standards Track [Page 38] - -RFC 2068 HTTP/1.1 January 1997 - - - Status-Line = HTTP-Version SP Status-Code SP Reason-Phrase CRLF - -6.1.1 Status Code and Reason Phrase - - The Status-Code element is a 3-digit integer result code of the - attempt to understand and satisfy the request. These codes are fully - defined in section 10. The Reason-Phrase is intended to give a short - textual description of the Status-Code. 
The Status-Code is intended - for use by automata and the Reason-Phrase is intended for the human - user. The client is not required to examine or display the Reason- - Phrase. - - The first digit of the Status-Code defines the class of response. The - last two digits do not have any categorization role. There are 5 - values for the first digit: - - o 1xx: Informational - Request received, continuing process - - o 2xx: Success - The action was successfully received, understood, - and accepted - - o 3xx: Redirection - Further action must be taken in order to - complete the request - - o 4xx: Client Error - The request contains bad syntax or cannot be - fulfilled - - o 5xx: Server Error - The server failed to fulfill an apparently - valid request - - The individual values of the numeric status codes defined for - HTTP/1.1, and an example set of corresponding Reason-Phrase's, are - presented below. The reason phrases listed here are only recommended - -- they may be replaced by local equivalents without affecting the - protocol. - - Status-Code = "100" ; Continue - | "101" ; Switching Protocols - | "200" ; OK - | "201" ; Created - | "202" ; Accepted - | "203" ; Non-Authoritative Information - | "204" ; No Content - | "205" ; Reset Content - | "206" ; Partial Content - | "300" ; Multiple Choices - | "301" ; Moved Permanently - | "302" ; Moved Temporarily - - - -Fielding, et. al. 
Standards Track [Page 39] - -RFC 2068 HTTP/1.1 January 1997 - - - | "303" ; See Other - | "304" ; Not Modified - | "305" ; Use Proxy - | "400" ; Bad Request - | "401" ; Unauthorized - | "402" ; Payment Required - | "403" ; Forbidden - | "404" ; Not Found - | "405" ; Method Not Allowed - | "406" ; Not Acceptable - | "407" ; Proxy Authentication Required - | "408" ; Request Time-out - | "409" ; Conflict - | "410" ; Gone - | "411" ; Length Required - | "412" ; Precondition Failed - | "413" ; Request Entity Too Large - | "414" ; Request-URI Too Large - | "415" ; Unsupported Media Type - | "500" ; Internal Server Error - | "501" ; Not Implemented - | "502" ; Bad Gateway - | "503" ; Service Unavailable - | "504" ; Gateway Time-out - | "505" ; HTTP Version not supported - | extension-code - - extension-code = 3DIGIT - - Reason-Phrase = *<TEXT, excluding CR, LF> - - HTTP status codes are extensible. HTTP applications are not required - to understand the meaning of all registered status codes, though such - understanding is obviously desirable. However, applications MUST - understand the class of any status code, as indicated by the first - digit, and treat any unrecognized response as being equivalent to the - x00 status code of that class, with the exception that an - unrecognized response MUST NOT be cached. For example, if an - unrecognized status code of 431 is received by the client, it can - safely assume that there was something wrong with its request and - treat the response as if it had received a 400 status code. In such - cases, user agents SHOULD present to the user the entity returned - with the response, since that entity is likely to include human- - readable information which will explain the unusual status. - - - - - - - -Fielding, et. al. 
Standards Track [Page 40] - -RFC 2068 HTTP/1.1 January 1997 - - -6.2 Response Header Fields - - The response-header fields allow the server to pass additional - information about the response which cannot be placed in the Status- - Line. These header fields give information about the server and about - further access to the resource identified by the Request-URI. - - response-header = Age ; Section 14.6 - | Location ; Section 14.30 - | Proxy-Authenticate ; Section 14.33 - | Public ; Section 14.35 - | Retry-After ; Section 14.38 - | Server ; Section 14.39 - | Vary ; Section 14.43 - | Warning ; Section 14.45 - | WWW-Authenticate ; Section 14.46 - - Response-header field names can be extended reliably only in - combination with a change in the protocol version. However, new or - experimental header fields MAY be given the semantics of response- - header fields if all parties in the communication recognize them to - be response-header fields. Unrecognized header fields are treated as - entity-header fields. - -7 Entity - - Request and Response messages MAY transfer an entity if not otherwise - restricted by the request method or response status code. An entity - consists of entity-header fields and an entity-body, although some - responses will only include the entity-headers. - - In this section, both sender and recipient refer to either the client - or the server, depending on who sends and who receives the entity. - -7.1 Entity Header Fields - - Entity-header fields define optional metainformation about the - entity-body or, if no body is present, about the resource identified - by the request. - - - - - - - - - - - - -Fielding, et. al. 
Standards Track [Page 41] - -RFC 2068 HTTP/1.1 January 1997 - - - entity-header = Allow ; Section 14.7 - | Content-Base ; Section 14.11 - | Content-Encoding ; Section 14.12 - | Content-Language ; Section 14.13 - | Content-Length ; Section 14.14 - | Content-Location ; Section 14.15 - | Content-MD5 ; Section 14.16 - | Content-Range ; Section 14.17 - | Content-Type ; Section 14.18 - | ETag ; Section 14.20 - | Expires ; Section 14.21 - | Last-Modified ; Section 14.29 - | extension-header - - extension-header = message-header - - The extension-header mechanism allows additional entity-header fields - to be defined without changing the protocol, but these fields cannot - be assumed to be recognizable by the recipient. Unrecognized header - fields SHOULD be ignored by the recipient and forwarded by proxies. - -7.2 Entity Body - - The entity-body (if any) sent with an HTTP request or response is in - a format and encoding defined by the entity-header fields. - - entity-body = *OCTET - - An entity-body is only present in a message when a message-body is - present, as described in section 4.3. The entity-body is obtained - from the message-body by decoding any Transfer-Encoding that may have - been applied to ensure safe and proper transfer of the message. - -7.2.1 Type - - When an entity-body is included with a message, the data type of that - body is determined via the header fields Content-Type and Content- - Encoding. These define a two-layer, ordered encoding model: - - entity-body := Content-Encoding( Content-Type( data ) ) - - Content-Type specifies the media type of the underlying data. - Content-Encoding may be used to indicate any additional content - codings applied to the data, usually for the purpose of data - compression, that are a property of the requested resource. There is - no default encoding. - - - - - -Fielding, et. al. 
Standards Track [Page 42] - -RFC 2068 HTTP/1.1 January 1997 - - - Any HTTP/1.1 message containing an entity-body SHOULD include a - Content-Type header field defining the media type of that body. If - and only if the media type is not given by a Content-Type field, the - recipient MAY attempt to guess the media type via inspection of its - content and/or the name extension(s) of the URL used to identify the - resource. If the media type remains unknown, the recipient SHOULD - treat it as type "application/octet-stream". - -7.2.2 Length - - The length of an entity-body is the length of the message-body after - any transfer codings have been removed. Section 4.4 defines how the - length of a message-body is determined. - -8 Connections - -8.1 Persistent Connections - -8.1.1 Purpose - - Prior to persistent connections, a separate TCP connection was - established to fetch each URL, increasing the load on HTTP servers - and causing congestion on the Internet. The use of inline images and - other associated data often requires a client to make multiple - requests of the same server in a short amount of time. Analyses of - these performance problems are available [30][27]; analysis and - results from a prototype implementation are in [26]. - - Persistent HTTP connections have a number of advantages: - - o By opening and closing fewer TCP connections, CPU time is saved, - and memory used for TCP protocol control blocks is also saved. - o HTTP requests and responses can be pipelined on a connection. - Pipelining allows a client to make multiple requests without - waiting for each response, allowing a single TCP connection to be - used much more efficiently, with much lower elapsed time. - o Network congestion is reduced by reducing the number of packets - caused by TCP opens, and by allowing TCP sufficient time to - determine the congestion state of the network. 
- o HTTP can evolve more gracefully, since errors can be reported - without the penalty of closing the TCP connection. Clients using - future versions of HTTP might optimistically try a new feature, but - if communicating with an older server, retry with old semantics - after an error is reported. - - HTTP implementations SHOULD implement persistent connections. - - - - - -Fielding, et. al. Standards Track [Page 43] - -RFC 2068 HTTP/1.1 January 1997 - - -8.1.2 Overall Operation - - A significant difference between HTTP/1.1 and earlier versions of - HTTP is that persistent connections are the default behavior of any - HTTP connection. That is, unless otherwise indicated, the client may - assume that the server will maintain a persistent connection. - - Persistent connections provide a mechanism by which a client and a - server can signal the close of a TCP connection. This signaling takes - place using the Connection header field. Once a close has been - signaled, the client MUST NOT send any more requests on that - connection. - -8.1.2.1 Negotiation - - An HTTP/1.1 server MAY assume that an HTTP/1.1 client intends to - maintain a persistent connection unless a Connection header including - the connection-token "close" was sent in the request. If the server - chooses to close the connection immediately after sending the - response, it SHOULD send a Connection header including the - connection-token close. - - An HTTP/1.1 client MAY expect a connection to remain open, but would - decide to keep it open based on whether the response from a server - contains a Connection header with the connection-token close. In case - the client does not want to maintain a connection for more than that - request, it SHOULD send a Connection header including the - connection-token close. - - If either the client or the server sends the close token in the - Connection header, that request becomes the last one for the - connection.
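The negotiation above reduces to a check for the "close" connection-token. The sketch below is a simplification with a hypothetical function name: it assumes lowercased header names, compares tokens case-insensitively, and treats every pre-1.1 message as non-persistent.

```python
def connection_persists(http_version, headers) -> bool:
    """Return True if the connection may stay open after this message.

    `http_version` is a (major, minor) tuple; `headers` maps lowercased
    field names to values. Both conventions are assumptions of this sketch.
    """
    tokens = {t.strip().lower()
              for t in headers.get("connection", "").split(",")
              if t.strip()}
    # Either side sending "close" makes this the last request.
    if "close" in tokens:
        return False
    # Persistence is the default only from HTTP/1.1 onwards.
    return http_version >= (1, 1)
```

A caller would evaluate this for both the request and the response, and keep the connection open only if both results are True.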
- - Clients and servers SHOULD NOT assume that a persistent connection is - maintained for HTTP versions less than 1.1 unless it is explicitly - signaled. See section 19.7.1 for more information on backwards - compatibility with HTTP/1.0 clients. - - In order to remain persistent, all messages on the connection must - have a self-defined message length (i.e., one not defined by closure - of the connection), as described in section 4.4. - -8.1.2.2 Pipelining - - A client that supports persistent connections MAY "pipeline" its - requests (i.e., send multiple requests without waiting for each - response). A server MUST send its responses to those requests in the - same order that the requests were received. - - - -Fielding, et. al. Standards Track [Page 44] - -RFC 2068 HTTP/1.1 January 1997 - - - Clients which assume persistent connections and pipeline immediately - after connection establishment SHOULD be prepared to retry their - connection if the first pipelined attempt fails. If a client does - such a retry, it MUST NOT pipeline before it knows the connection is - persistent. Clients MUST also be prepared to resend their requests if - the server closes the connection before sending all of the - corresponding responses. - -8.1.3 Proxy Servers - - It is especially important that proxies correctly implement the - properties of the Connection header field as specified in 14.2.1. - - The proxy server MUST signal persistent connections separately with - its clients and the origin servers (or other proxy servers) that it - connects to. Each persistent connection applies to only one transport - link. - - A proxy server MUST NOT establish a persistent connection with an - HTTP/1.0 client. - -8.1.4 Practical Considerations - - Servers will usually have some time-out value beyond which they will - no longer maintain an inactive connection. 
Proxy servers might make - this a higher value since it is likely that the client will be making - more connections through the same server. The use of persistent - connections places no requirements on the length of this time-out for - either the client or the server. - - When a client or server wishes to time-out it SHOULD issue a graceful - close on the transport connection. Clients and servers SHOULD both - constantly watch for the other side of the transport close, and - respond to it as appropriate. If a client or server does not detect - the other side's close promptly it could cause unnecessary resource - drain on the network. - - A client, server, or proxy MAY close the transport connection at any - time. For example, a client MAY have started to send a new request at - the same time that the server has decided to close the "idle" - connection. From the server's point of view, the connection is being - closed while it was idle, but from the client's point of view, a - request is in progress. - - This means that clients, servers, and proxies MUST be able to recover - from asynchronous close events. Client software SHOULD reopen the - transport connection and retransmit the aborted request without user - interaction so long as the request method is idempotent (see section - - - -Fielding, et. al. Standards Track [Page 45] - -RFC 2068 HTTP/1.1 January 1997 - - - 9.1.2); other methods MUST NOT be automatically retried, although - user agents MAY offer a human operator the choice of retrying the - request. - - However, this automatic retry SHOULD NOT be repeated if the second - request fails. - - Servers SHOULD always respond to at least one request per connection, - if at all possible. Servers SHOULD NOT close a connection in the - middle of transmitting a response, unless a network or client failure - is suspected. - - Clients that use persistent connections SHOULD limit the number of - simultaneous connections that they maintain to a given server. 
A - single-user client SHOULD maintain AT MOST 2 connections with any - server or proxy. A proxy SHOULD use up to 2*N connections to another - server or proxy, where N is the number of simultaneously active - users. These guidelines are intended to improve HTTP response times - and avoid congestion of the Internet or other networks. - -8.2 Message Transmission Requirements - -General requirements: - -o HTTP/1.1 servers SHOULD maintain persistent connections and use - TCP's flow control mechanisms to resolve temporary overloads, - rather than terminating connections with the expectation that - clients will retry. The latter technique can exacerbate network - congestion. - -o An HTTP/1.1 (or later) client sending a message-body SHOULD monitor - the network connection for an error status while it is transmitting - the request. If the client sees an error status, it SHOULD - immediately cease transmitting the body. If the body is being sent - using a "chunked" encoding (section 3.6), a zero length chunk and - empty footer MAY be used to prematurely mark the end of the - message. If the body was preceded by a Content-Length header, the - client MUST close the connection. - -o An HTTP/1.1 (or later) client MUST be prepared to accept a 100 - (Continue) status followed by a regular response. - -o An HTTP/1.1 (or later) server that receives a request from a - HTTP/1.0 (or earlier) client MUST NOT transmit the 100 (continue) - response; it SHOULD either wait for the request to be completed - normally (thus avoiding an interrupted request) or close the - connection prematurely. - - - - -Fielding, et. al. Standards Track [Page 46] - -RFC 2068 HTTP/1.1 January 1997 - - - Upon receiving a method subject to these requirements from an - HTTP/1.1 (or later) client, an HTTP/1.1 (or later) server MUST either - respond with 100 (Continue) status and continue to read from the - input stream, or respond with an error status. 
If it responds with an - error status, it MAY close the transport (TCP) connection or it MAY - continue to read and discard the rest of the request. It MUST NOT - perform the requested method if it returns an error status. - - Clients SHOULD remember the version number of at least the most - recently used server; if an HTTP/1.1 client has seen an HTTP/1.1 or - later response from the server, and it sees the connection close - before receiving any status from the server, the client SHOULD retry - the request without user interaction so long as the request method is - idempotent (see section 9.1.2); other methods MUST NOT be - automatically retried, although user agents MAY offer a human - operator the choice of retrying the request. If the client does - retry the request, the client - - o MUST first send the request header fields, and then - - o MUST wait for the server to respond with either a 100 (Continue) - response, in which case the client should continue, or with an - error status. - - If an HTTP/1.1 client has not seen an HTTP/1.1 or later response from - the server, it should assume that the server implements HTTP/1.0 or - older and will not use the 100 (Continue) response. If in this case - the client sees the connection close before receiving any status from - the server, the client SHOULD retry the request. If the client does - retry the request to this HTTP/1.0 server, it should use the - following "binary exponential backoff" algorithm to be assured of - obtaining a reliable response: - - 1. Initiate a new connection to the server - - 2. Transmit the request-headers - - 3. Initialize a variable R to the estimated round-trip time to the - server (e.g., based on the time it took to establish the - connection), or to a constant value of 5 seconds if the round-trip - time is not available. - - 4. Compute T = R * (2**N), where N is the number of previous retries - of this request. - - 5.
Wait either for an error response from the server, or for T seconds - (whichever comes first) - - - - -Fielding, et. al. Standards Track [Page 47] - -RFC 2068 HTTP/1.1 January 1997 - - - 6. If no error response is received, after T seconds transmit the body - of the request. - - 7. If client sees that the connection is closed prematurely, repeat - from step 1 until the request is accepted, an error response is - received, or the user becomes impatient and terminates the retry - process. - - No matter what the server version, if an error status is received, - the client - - o MUST NOT continue and - - o MUST close the connection if it has not completed sending the - message. - - An HTTP/1.1 (or later) client that sees the connection close after - receiving a 100 (Continue) but before receiving any other status - SHOULD retry the request, and need not wait for 100 (Continue) - response (but MAY do so if this simplifies the implementation). - -9 Method Definitions - - The set of common methods for HTTP/1.1 is defined below. Although - this set can be expanded, additional methods cannot be assumed to - share the same semantics for separately extended clients and servers. - - The Host request-header field (section 14.23) MUST accompany all - HTTP/1.1 requests. - -9.1 Safe and Idempotent Methods - -9.1.1 Safe Methods - - Implementers should be aware that the software represents the user in - their interactions over the Internet, and should be careful to allow - the user to be aware of any actions they may take which may have an - unexpected significance to themselves or others. - - In particular, the convention has been established that the GET and - HEAD methods should never have the significance of taking an action - other than retrieval. These methods should be considered "safe." 
This - allows user agents to represent other methods, such as POST, PUT and - DELETE, in a special way, so that the user is made aware of the fact - that a possibly unsafe action is being requested. - - Naturally, it is not possible to ensure that the server does not - generate side-effects as a result of performing a GET request; in - - - -Fielding, et. al. Standards Track [Page 48] - -RFC 2068 HTTP/1.1 January 1997 - - - fact, some dynamic resources consider that a feature. The important - distinction here is that the user did not request the side-effects, - so therefore cannot be held accountable for them. - -9.1.2 Idempotent Methods - - Methods may also have the property of "idempotence" in that (aside - from error or expiration issues) the side-effects of N > 0 identical - requests is the same as for a single request. The methods GET, HEAD, - PUT and DELETE share this property. - -9.2 OPTIONS - - The OPTIONS method represents a request for information about the - communication options available on the request/response chain - identified by the Request-URI. This method allows the client to - determine the options and/or requirements associated with a resource, - or the capabilities of a server, without implying a resource action - or initiating a resource retrieval. - - Unless the server's response is an error, the response MUST NOT - include entity information other than what can be considered as - communication options (e.g., Allow is appropriate, but Content-Type - is not). Responses to this method are not cachable. - - If the Request-URI is an asterisk ("*"), the OPTIONS request is - intended to apply to the server as a whole. A 200 response SHOULD - include any header fields which indicate optional features - implemented by the server (e.g., Public), including any extensions - not defined by this specification, in addition to any applicable - general or response-header fields. 
As described in section 5.1.2, an - "OPTIONS *" request can be applied through a proxy by specifying the - destination server in the Request-URI without any path information. - - If the Request-URI is not an asterisk, the OPTIONS request applies - only to the options that are available when communicating with that - resource. A 200 response SHOULD include any header fields which - indicate optional features implemented by the server and applicable - to that resource (e.g., Allow), including any extensions not defined - by this specification, in addition to any applicable general or - response-header fields. If the OPTIONS request passes through a - proxy, the proxy MUST edit the response to exclude those options - which apply to a proxy's capabilities and which are known to be - unavailable through that proxy. - - - - - - - -Fielding, et. al. Standards Track [Page 49] - -RFC 2068 HTTP/1.1 January 1997 - - -9.3 GET - - The GET method means retrieve whatever information (in the form of an - entity) is identified by the Request-URI. If the Request-URI refers - to a data-producing process, it is the produced data which shall be - returned as the entity in the response and not the source text of the - process, unless that text happens to be the output of the process. - - The semantics of the GET method change to a "conditional GET" if the - request message includes an If-Modified-Since, If-Unmodified-Since, - If-Match, If-None-Match, or If-Range header field. A conditional GET - method requests that the entity be transferred only under the - circumstances described by the conditional header field(s). The - conditional GET method is intended to reduce unnecessary network - usage by allowing cached entities to be refreshed without requiring - multiple requests or transferring data already held by the client. - - The semantics of the GET method change to a "partial GET" if the - request message includes a Range header field. 
A partial GET requests - that only part of the entity be transferred, as described in section - 14.36. The partial GET method is intended to reduce unnecessary - network usage by allowing partially-retrieved entities to be - completed without transferring data already held by the client. - - The response to a GET request is cachable if and only if it meets the - requirements for HTTP caching described in section 13. - -9.4 HEAD - - The HEAD method is identical to GET except that the server MUST NOT - return a message-body in the response. The metainformation contained - in the HTTP headers in response to a HEAD request SHOULD be identical - to the information sent in response to a GET request. This method can - be used for obtaining metainformation about the entity implied by the - request without transferring the entity-body itself. This method is - often used for testing hypertext links for validity, accessibility, - and recent modification. - - The response to a HEAD request may be cachable in the sense that the - information contained in the response may be used to update a - previously cached entity from that resource. If the new field values - indicate that the cached entity differs from the current entity (as - would be indicated by a change in Content-Length, Content-MD5, ETag - or Last-Modified), then the cache MUST treat the cache entry as - stale. - - - - - - -Fielding, et. al. Standards Track [Page 50] - -RFC 2068 HTTP/1.1 January 1997 - - -9.5 POST - - The POST method is used to request that the destination server accept - the entity enclosed in the request as a new subordinate of the - resource identified by the Request-URI in the Request-Line. 
POST is - designed to allow a uniform method to cover the following functions: - - o Annotation of existing resources; - - o Posting a message to a bulletin board, newsgroup, mailing list, - or similar group of articles; - - o Providing a block of data, such as the result of submitting a - form, to a data-handling process; - - o Extending a database through an append operation. - - The actual function performed by the POST method is determined by the - server and is usually dependent on the Request-URI. The posted entity - is subordinate to that URI in the same way that a file is subordinate - to a directory containing it, a news article is subordinate to a - newsgroup to which it is posted, or a record is subordinate to a - database. - - The action performed by the POST method might not result in a - resource that can be identified by a URI. In this case, either 200 - (OK) or 204 (No Content) is the appropriate response status, - depending on whether or not the response includes an entity that - describes the result. - - If a resource has been created on the origin server, the response - SHOULD be 201 (Created) and contain an entity which describes the - status of the request and refers to the new resource, and a Location - header (see section 14.30). - - Responses to this method are not cachable, unless the response - includes appropriate Cache-Control or Expires header fields. However, - the 303 (See Other) response can be used to direct the user agent to - retrieve a cachable resource. - - POST requests must obey the message transmission requirements set out - in section 8.2. - - - - - - - - - -Fielding, et. al. Standards Track [Page 51] - -RFC 2068 HTTP/1.1 January 1997 - - -9.6 PUT - - The PUT method requests that the enclosed entity be stored under the - supplied Request-URI. If the Request-URI refers to an already - existing resource, the enclosed entity SHOULD be considered as a - modified version of the one residing on the origin server. 
If the - Request-URI does not point to an existing resource, and that URI is - capable of being defined as a new resource by the requesting user - agent, the origin server can create the resource with that URI. If a - new resource is created, the origin server MUST inform the user agent - via the 201 (Created) response. If an existing resource is modified, - either the 200 (OK) or 204 (No Content) response codes SHOULD be sent - to indicate successful completion of the request. If the resource - could not be created or modified with the Request-URI, an appropriate - error response SHOULD be given that reflects the nature of the - problem. The recipient of the entity MUST NOT ignore any Content-* - (e.g. Content-Range) headers that it does not understand or implement - and MUST return a 501 (Not Implemented) response in such cases. - - If the request passes through a cache and the Request-URI identifies - one or more currently cached entities, those entries should be - treated as stale. Responses to this method are not cachable. - - The fundamental difference between the POST and PUT requests is - reflected in the different meaning of the Request-URI. The URI in a - POST request identifies the resource that will handle the enclosed - entity. That resource may be a data-accepting process, a gateway to - some other protocol, or a separate entity that accepts annotations. - In contrast, the URI in a PUT request identifies the entity enclosed - with the request -- the user agent knows what URI is intended and the - server MUST NOT attempt to apply the request to some other resource. - If the server desires that the request be applied to a different URI, - it MUST send a 301 (Moved Permanently) response; the user agent MAY - then make its own decision regarding whether or not to redirect the - request. - - A single resource MAY be identified by many different URIs. 
For - example, an article may have a URI for identifying "the current - version" which is separate from the URI identifying each particular - version. In this case, a PUT request on a general URI may result in - several other URIs being defined by the origin server. - - HTTP/1.1 does not define how a PUT method affects the state of an - origin server. - - PUT requests must obey the message transmission requirements set out - in section 8.2. - - - - -Fielding, et. al. Standards Track [Page 52] - -RFC 2068 HTTP/1.1 January 1997 - - -9.7 DELETE - - The DELETE method requests that the origin server delete the resource - identified by the Request-URI. This method MAY be overridden by human - intervention (or other means) on the origin server. The client cannot - be guaranteed that the operation has been carried out, even if the - status code returned from the origin server indicates that the action - has been completed successfully. However, the server SHOULD NOT - indicate success unless, at the time the response is given, it - intends to delete the resource or move it to an inaccessible - location. - - A successful response SHOULD be 200 (OK) if the response includes an - entity describing the status, 202 (Accepted) if the action has not - yet been enacted, or 204 (No Content) if the response is OK but does - not include an entity. - - If the request passes through a cache and the Request-URI identifies - one or more currently cached entities, those entries should be - treated as stale. Responses to this method are not cachable. - -9.8 TRACE - - The TRACE method is used to invoke a remote, application-layer loop- - back of the request message. The final recipient of the request - SHOULD reflect the message received back to the client as the - entity-body of a 200 (OK) response. The final recipient is either the - origin server or the first proxy or gateway to receive a Max-Forwards - value of zero (0) in the request (see section 14.31).
A TRACE request - MUST NOT include an entity. - - TRACE allows the client to see what is being received at the other - end of the request chain and use that data for testing or diagnostic - information. The value of the Via header field (section 14.44) is of - particular interest, since it acts as a trace of the request chain. - Use of the Max-Forwards header field allows the client to limit the - length of the request chain, which is useful for testing a chain of - proxies forwarding messages in an infinite loop. - - If successful, the response SHOULD contain the entire request message - in the entity-body, with a Content-Type of "message/http". Responses - to this method MUST NOT be cached. - -10 Status Code Definitions - - Each Status-Code is described below, including a description of which - method(s) it can follow and any metainformation required in the - response. - - - -Fielding, et. al. Standards Track [Page 53] - -RFC 2068 HTTP/1.1 January 1997 - - -10.1 Informational 1xx - - This class of status code indicates a provisional response, - consisting only of the Status-Line and optional headers, and is - terminated by an empty line. Since HTTP/1.0 did not define any 1xx - status codes, servers MUST NOT send a 1xx response to an HTTP/1.0 - client except under experimental conditions. - -10.1.1 100 Continue - - The client may continue with its request. This interim response is - used to inform the client that the initial part of the request has - been received and has not yet been rejected by the server. The client - SHOULD continue by sending the remainder of the request or, if the - request has already been completed, ignore this response. The server - MUST send a final response after the request has been completed. - -10.1.2 101 Switching Protocols - - The server understands and is willing to comply with the client's - request, via the Upgrade message header field (section 14.41), for a - change in the application protocol being used on this connection. 
The - server will switch protocols to those defined by the response's - Upgrade header field immediately after the empty line which - terminates the 101 response. - - The protocol should only be switched when it is advantageous to do - so. For example, switching to a newer version of HTTP is - advantageous over older versions, and switching to a real-time, - synchronous protocol may be advantageous when delivering resources - that use such features. - -10.2 Successful 2xx - - This class of status code indicates that the client's request was - successfully received, understood, and accepted. - -10.2.1 200 OK - - The request has succeeded. The information returned with the response - is dependent on the method used in the request, for example: - - GET an entity corresponding to the requested resource is sent in the - response; - - HEAD the entity-header fields corresponding to the requested resource - are sent in the response without any message-body; - - - - -Fielding, et. al. Standards Track [Page 54] - -RFC 2068 HTTP/1.1 January 1997 - - - POST an entity describing or containing the result of the action; - - TRACE an entity containing the request message as received by the end - server. - -10.2.2 201 Created - - The request has been fulfilled and resulted in a new resource being - created. The newly created resource can be referenced by the URI(s) - returned in the entity of the response, with the most specific URL - for the resource given by a Location header field. The origin server - MUST create the resource before returning the 201 status code. If the - action cannot be carried out immediately, the server should respond - with 202 (Accepted) response instead. - -10.2.3 202 Accepted - - The request has been accepted for processing, but the processing has - not been completed. The request MAY or MAY NOT eventually be acted - upon, as it MAY be disallowed when processing actually takes place. 
- There is no facility for re-sending a status code from an - asynchronous operation such as this. - - The 202 response is intentionally non-committal. Its purpose is to - allow a server to accept a request for some other process (perhaps a - batch-oriented process that is only run once per day) without - requiring that the user agent's connection to the server persist - until the process is completed. The entity returned with this - response SHOULD include an indication of the request's current status - and either a pointer to a status monitor or some estimate of when the - user can expect the request to be fulfilled. - -10.2.4 203 Non-Authoritative Information - - The returned metainformation in the entity-header is not the - definitive set as available from the origin server, but is gathered - from a local or a third-party copy. The set presented MAY be a subset - or superset of the original version. For example, including local - annotation information about the resource MAY result in a superset of - the metainformation known by the origin server. Use of this response - code is not required and is only appropriate when the response would - otherwise be 200 (OK). - -10.2.5 204 No Content - - The server has fulfilled the request but there is no new information - to send back. If the client is a user agent, it SHOULD NOT change its - document view from that which caused the request to be sent. This - - - -Fielding, et. al. Standards Track [Page 55] - -RFC 2068 HTTP/1.1 January 1997 - - - response is primarily intended to allow input for actions to take - place without causing a change to the user agent's active document - view. The response MAY include new metainformation in the form of - entity-headers, which SHOULD apply to the document currently in the - user agent's active view. - - The 204 response MUST NOT include a message-body, and thus is always - terminated by the first empty line after the header fields. 
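The body-less cases stated so far (HEAD responses in section 9.4, 1xx responses in section 10.1, and 204 in section 10.2.5) can be collected into one check. The following is a minimal sketch, not part of the specification; the function name `response_may_have_body` is hypothetical, and it deliberately covers only the three rules quoted above.

```python
def response_may_have_body(method: str, status: int) -> bool:
    """Return False when the rules quoted above forbid a message-body.

    Hypothetical helper for illustration; it covers only the HEAD,
    1xx and 204 rules and ignores every other status code.
    """
    if method == "HEAD":
        # Section 9.4: the server MUST NOT return a message-body.
        return False
    if 100 <= status < 200:
        # Section 10.1: Status-Line and optional headers only.
        return False
    if status == 204:
        # Section 10.2.5: MUST NOT include a message-body.
        return False
    return True
```

A receiver could use such a check to decide whether to stop reading at the first empty line after the header fields.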
- -10.2.6 205 Reset Content - - The server has fulfilled the request and the user agent SHOULD reset - the document view which caused the request to be sent. This response - is primarily intended to allow input for actions to take place via - user input, followed by a clearing of the form in which the input is - given so that the user can easily initiate another input action. The - response MUST NOT include an entity. - -10.2.7 206 Partial Content - - The server has fulfilled the partial GET request for the resource. - The request must have included a Range header field (section 14.36) - indicating the desired range. The response MUST include either a - Content-Range header field (section 14.17) indicating the range - included with this response, or a multipart/byteranges Content-Type - including Content-Range fields for each part. If multipart/byteranges - is not used, the Content-Length header field in the response MUST - match the actual number of OCTETs transmitted in the message-body. - - A cache that does not support the Range and Content-Range headers - MUST NOT cache 206 (Partial) responses. - -10.3 Redirection 3xx - - This class of status code indicates that further action needs to be - taken by the user agent in order to fulfill the request. The action - required MAY be carried out by the user agent without interaction - with the user if and only if the method used in the second request is - GET or HEAD. A user agent SHOULD NOT automatically redirect a request - more than 5 times, since such redirections usually indicate an - infinite loop. - - - - - - - - - - -Fielding, et. al. 
Standards Track [Page 56] - -RFC 2068 HTTP/1.1 January 1997 - - -10.3.1 300 Multiple Choices - - The requested resource corresponds to any one of a set of - representations, each with its own specific location, and agent- - driven negotiation information (section 12) is being provided so that - the user (or user agent) can select a preferred representation and - redirect its request to that location. - - Unless it was a HEAD request, the response SHOULD include an entity - containing a list of resource characteristics and location(s) from - which the user or user agent can choose the one most appropriate. The - entity format is specified by the media type given in the Content- - Type header field. Depending upon the format and the capabilities of - the user agent, selection of the most appropriate choice may be - performed automatically. However, this specification does not define - any standard for such automatic selection. - - If the server has a preferred choice of representation, it SHOULD - include the specific URL for that representation in the Location - field; user agents MAY use the Location field value for automatic - redirection. This response is cachable unless indicated otherwise. - -10.3.2 301 Moved Permanently - - The requested resource has been assigned a new permanent URI and any - future references to this resource SHOULD be done using one of the - returned URIs. Clients with link editing capabilities SHOULD - automatically re-link references to the Request-URI to one or more of - the new references returned by the server, where possible. This - response is cachable unless indicated otherwise. - - If the new URI is a location, its URL SHOULD be given by the Location - field in the response. Unless the request method was HEAD, the entity - of the response SHOULD contain a short hypertext note with a - hyperlink to the new URI(s). 
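The two automatic-redirection constraints from section 10.3 (the second request must be GET or HEAD, and no more than 5 redirections in a row) can be expressed as a small guard. This is a sketch under assumptions: `may_auto_redirect`, `MAX_AUTO_REDIRECTS` and the counter argument are hypothetical names, and a real user agent would also have to handle the per-status rules in the subsections.

```python
# Limit taken from section 10.3: more than 5 redirections usually
# indicates an infinite loop.
MAX_AUTO_REDIRECTS = 5

# Methods that may be redirected without asking the user (section 10.3).
AUTO_REDIRECT_METHODS = frozenset({"GET", "HEAD"})


def may_auto_redirect(next_method: str, redirects_followed: int) -> bool:
    """Decide whether a redirect may be followed without user interaction.

    next_method        -- method that would be used in the second request
    redirects_followed -- redirects already followed for this request
    Both parameter names are illustrative assumptions.
    """
    if redirects_followed >= MAX_AUTO_REDIRECTS:
        return False
    return next_method in AUTO_REDIRECT_METHODS
```

When the guard returns False, the text leaves the decision to the user rather than forbidding the redirect outright.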
- - If the 301 status code is received in response to a request other - than GET or HEAD, the user agent MUST NOT automatically redirect the - request unless it can be confirmed by the user, since this might - change the conditions under which the request was issued. - - Note: When automatically redirecting a POST request after receiving - a 301 status code, some existing HTTP/1.0 user agents will - erroneously change it into a GET request. - - - - - - - -Fielding, et. al. Standards Track [Page 57] - -RFC 2068 HTTP/1.1 January 1997 - - -10.3.3 302 Moved Temporarily - - The requested resource resides temporarily under a different URI. - Since the redirection may be altered on occasion, the client SHOULD - continue to use the Request-URI for future requests. This response is - only cachable if indicated by a Cache-Control or Expires header - field. - - If the new URI is a location, its URL SHOULD be given by the Location - field in the response. Unless the request method was HEAD, the entity - of the response SHOULD contain a short hypertext note with a - hyperlink to the new URI(s). - - If the 302 status code is received in response to a request other - than GET or HEAD, the user agent MUST NOT automatically redirect the - request unless it can be confirmed by the user, since this might - change the conditions under which the request was issued. - - Note: When automatically redirecting a POST request after receiving - a 302 status code, some existing HTTP/1.0 user agents will - erroneously change it into a GET request. - -10.3.4 303 See Other - - The response to the request can be found under a different URI and - SHOULD be retrieved using a GET method on that resource. This method - exists primarily to allow the output of a POST-activated script to - redirect the user agent to a selected resource. The new URI is not a - substitute reference for the originally requested resource. 
The 303 - response is not cachable, but the response to the second (redirected) - request MAY be cachable. - - If the new URI is a location, its URL SHOULD be given by the Location - field in the response. Unless the request method was HEAD, the entity - of the response SHOULD contain a short hypertext note with a - hyperlink to the new URI(s). - -10.3.5 304 Not Modified - - If the client has performed a conditional GET request and access is - allowed, but the document has not been modified, the server SHOULD - respond with this status code. The response MUST NOT contain a - message-body. - - - - - - - - -Fielding, et. al. Standards Track [Page 58] - -RFC 2068 HTTP/1.1 January 1997 - - - The response MUST include the following header fields: - - o Date - - o ETag and/or Content-Location, if the header would have been sent in - a 200 response to the same request - - o Expires, Cache-Control, and/or Vary, if the field-value might - differ from that sent in any previous response for the same variant - - If the conditional GET used a strong cache validator (see section - 13.3.3), the response SHOULD NOT include other entity-headers. - Otherwise (i.e., the conditional GET used a weak validator), the - response MUST NOT include other entity-headers; this prevents - inconsistencies between cached entity-bodies and updated headers. - - If a 304 response indicates an entity not currently cached, then the - cache MUST disregard the response and repeat the request without the - conditional. - - If a cache uses a received 304 response to update a cache entry, the - cache MUST update the entry to reflect any new field values given in - the response. - - The 304 response MUST NOT include a message-body, and thus is always - terminated by the first empty line after the header fields. - -10.3.6 305 Use Proxy - - The requested resource MUST be accessed through the proxy given by - the Location field. The Location field gives the URL of the proxy. 
- The recipient is expected to repeat the request via the proxy. - -10.4 Client Error 4xx - - The 4xx class of status code is intended for cases in which the - client seems to have erred. Except when responding to a HEAD request, - the server SHOULD include an entity containing an explanation of the - error situation, and whether it is a temporary or permanent - condition. These status codes are applicable to any request method. - User agents SHOULD display any included entity to the user. - - Note: If the client is sending data, a server implementation using - TCP should be careful to ensure that the client acknowledges - receipt of the packet(s) containing the response, before the server - closes the input connection. If the client continues sending data - to the server after the close, the server's TCP stack will send a - reset packet to the client, which may erase the client's - - - -Fielding, et. al. Standards Track [Page 59] - -RFC 2068 HTTP/1.1 January 1997 - - - unacknowledged input buffers before they can be read and - interpreted by the HTTP application. - -10.4.1 400 Bad Request - - The request could not be understood by the server due to malformed - syntax. The client SHOULD NOT repeat the request without - modifications. - -10.4.2 401 Unauthorized - - The request requires user authentication. The response MUST include a - WWW-Authenticate header field (section 14.46) containing a challenge - applicable to the requested resource. The client MAY repeat the - request with a suitable Authorization header field (section 14.8). If - the request already included Authorization credentials, then the 401 - response indicates that authorization has been refused for those - credentials. 
If the 401 response contains the same challenge as the - prior response, and the user agent has already attempted - authentication at least once, then the user SHOULD be presented the - entity that was given in the response, since that entity MAY include - relevant diagnostic information. HTTP access authentication is - explained in section 11. - -10.4.3 402 Payment Required - - This code is reserved for future use. - -10.4.4 403 Forbidden - - The server understood the request, but is refusing to fulfill it. - Authorization will not help and the request SHOULD NOT be repeated. - If the request method was not HEAD and the server wishes to make - public why the request has not been fulfilled, it SHOULD describe the - reason for the refusal in the entity. This status code is commonly - used when the server does not wish to reveal exactly why the request - has been refused, or when no other response is applicable. - -10.4.5 404 Not Found - - The server has not found anything matching the Request-URI. No - indication is given of whether the condition is temporary or - permanent. - - - - - - - - -Fielding, et. al. Standards Track [Page 60] - -RFC 2068 HTTP/1.1 January 1997 - - - If the server does not wish to make this information available to the - client, the status code 403 (Forbidden) can be used instead. The 410 - (Gone) status code SHOULD be used if the server knows, through some - internally configurable mechanism, that an old resource is - permanently unavailable and has no forwarding address. - -10.4.6 405 Method Not Allowed - - The method specified in the Request-Line is not allowed for the - resource identified by the Request-URI. The response MUST include an - Allow header containing a list of valid methods for the requested - resource. - -10.4.7 406 Not Acceptable - - The resource identified by the request is only capable of generating - response entities which have content characteristics not acceptable - according to the accept headers sent in the request. 
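The 405 (Method Not Allowed) requirement in section 10.4.6 above, that the response MUST carry an Allow header listing the valid methods, can be sketched as follows. The function name `check_method` and the `(status, headers)` return shape are assumptions made for illustration only.

```python
from typing import Dict, Optional, Sequence, Tuple


def check_method(
    method: str, allowed: Sequence[str]
) -> Optional[Tuple[int, Dict[str, str]]]:
    """Return None if the method is allowed, else a 405 with an Allow header.

    Hypothetical helper: 'allowed' is the list of methods the resource
    identified by the Request-URI accepts.
    """
    if method in allowed:
        return None
    # Section 10.4.6: the response MUST include an Allow header
    # containing a list of valid methods for the requested resource.
    return 405, {"Allow": ", ".join(allowed)}
```

For example, `check_method("PUT", ["GET", "HEAD"])` yields a 405 status together with `Allow: GET, HEAD`.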
- - Unless it was a HEAD request, the response SHOULD include an entity - containing a list of available entity characteristics and location(s) - from which the user or user agent can choose the one most - appropriate. The entity format is specified by the media type given - in the Content-Type header field. Depending upon the format and the - capabilities of the user agent, selection of the most appropriate - choice may be performed automatically. However, this specification - does not define any standard for such automatic selection. - - Note: HTTP/1.1 servers are allowed to return responses which are - not acceptable according to the accept headers sent in the request. - In some cases, this may even be preferable to sending a 406 - response. User agents are encouraged to inspect the headers of an - incoming response to determine if it is acceptable. If the response - could be unacceptable, a user agent SHOULD temporarily stop receipt - of more data and query the user for a decision on further actions. - -10.4.8 407 Proxy Authentication Required - - This code is similar to 401 (Unauthorized), but indicates that the - client MUST first authenticate itself with the proxy. The proxy MUST - return a Proxy-Authenticate header field (section 14.33) containing a - challenge applicable to the proxy for the requested resource. The - client MAY repeat the request with a suitable Proxy-Authorization - header field (section 14.34). HTTP access authentication is explained - in section 11. - - - - - - -Fielding, et. al. Standards Track [Page 61] - -RFC 2068 HTTP/1.1 January 1997 - - -10.4.9 408 Request Timeout - - The client did not produce a request within the time that the server - was prepared to wait. The client MAY repeat the request without - modifications at any later time. - -10.4.10 409 Conflict - - The request could not be completed due to a conflict with the current - state of the resource. 
This code is only allowed in situations where - it is expected that the user might be able to resolve the conflict - and resubmit the request. The response body SHOULD include enough - information for the user to recognize the source of the conflict. - Ideally, the response entity would include enough information for the - user or user agent to fix the problem; however, that may not be - possible and is not required. - - Conflicts are most likely to occur in response to a PUT request. If - versioning is being used and the entity being PUT includes changes to - a resource which conflict with those made by an earlier (third-party) - request, the server MAY use the 409 response to indicate that it - can't complete the request. In this case, the response entity SHOULD - contain a list of the differences between the two versions in a - format defined by the response Content-Type. - -10.4.11 410 Gone - - The requested resource is no longer available at the server and no - forwarding address is known. This condition SHOULD be considered - permanent. Clients with link editing capabilities SHOULD delete - references to the Request-URI after user approval. If the server does - not know, or has no facility to determine, whether or not the - condition is permanent, the status code 404 (Not Found) SHOULD be - used instead. This response is cachable unless indicated otherwise. - - The 410 response is primarily intended to assist the task of web - maintenance by notifying the recipient that the resource is - intentionally unavailable and that the server owners desire that - remote links to that resource be removed. Such an event is common for - limited-time, promotional services and for resources belonging to - individuals no longer working at the server's site. It is not - necessary to mark all permanently unavailable resources as "gone" or - to keep the mark for any length of time -- that is left to the - discretion of the server owner. - - - - - - - -Fielding, et. al. 
Standards Track [Page 62] - -RFC 2068 HTTP/1.1 January 1997 - - -10.4.12 411 Length Required - - The server refuses to accept the request without a defined Content- - Length. The client MAY repeat the request if it adds a valid - Content-Length header field containing the length of the message-body - in the request message. - -10.4.13 412 Precondition Failed - - The precondition given in one or more of the request-header fields - evaluated to false when it was tested on the server. This response - code allows the client to place preconditions on the current resource - metainformation (header field data) and thus prevent the requested - method from being applied to a resource other than the one intended. - -10.4.14 413 Request Entity Too Large - - The server is refusing to process a request because the request - entity is larger than the server is willing or able to process. The - server may close the connection to prevent the client from continuing - the request. - - If the condition is temporary, the server SHOULD include a Retry- - After header field to indicate that it is temporary and after what - time the client may try again. - -10.4.15 414 Request-URI Too Long - - The server is refusing to service the request because the Request-URI - is longer than the server is willing to interpret. This rare - condition is only likely to occur when a client has improperly - converted a POST request to a GET request with long query - information, when the client has descended into a URL "black hole" of - redirection (e.g., a redirected URL prefix that points to a suffix of - itself), or when the server is under attack by a client attempting to - exploit security holes present in some servers using fixed-length - buffers for reading or manipulating the Request-URI. - -10.4.16 415 Unsupported Media Type - - The server is refusing to service the request because the entity of - the request is in a format not supported by the requested resource - for the requested method. 
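As an illustration of the request-shape errors described above (411, 413 and 414), the following Python sketch shows one way a server might screen a request before processing it. The function name precheck_request and the limits MAX_URI_LENGTH and MAX_ENTITY_BYTES are hypothetical; the specification sets no numeric limits, and a server is free to accept a request body that carries no Content-Length.

```python
from typing import Mapping, Optional

# Hypothetical limits chosen for the example; the specification defines none.
MAX_URI_LENGTH = 8192
MAX_ENTITY_BYTES = 1_048_576


def precheck_request(method: str, request_uri: str,
                     headers: Mapping[str, str]) -> Optional[int]:
    """Return a 4xx status code if the request should be refused, else None."""
    # Header field names are case-insensitive, so normalise them first.
    fields = {name.lower(): value for name, value in headers.items()}

    # 414: the Request-URI is longer than this server is willing to interpret.
    if len(request_uri) > MAX_URI_LENGTH:
        return 414

    if method in ("PUT", "POST"):
        length = fields.get("content-length")
        # 411: this (assumed) server insists on a defined Content-Length.
        if length is None or not length.strip().isdigit():
            return 411
        # 413: the entity is larger than this server is willing to process.
        if int(length) > MAX_ENTITY_BYTES:
            return 413

    return None
```

A caller would map a non-None result to the matching response and otherwise continue with normal request handling.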
- - - - - - - - -Fielding, et. al. Standards Track [Page 63] - -RFC 2068 HTTP/1.1 January 1997 - - -10.5 Server Error 5xx - - Response status codes beginning with the digit "5" indicate cases in - which the server is aware that it has erred or is incapable of - performing the request. Except when responding to a HEAD request, the - server SHOULD include an entity containing an explanation of the - error situation, and whether it is a temporary or permanent - condition. User agents SHOULD display any included entity to the - user. These response codes are applicable to any request method. - -10.5.1 500 Internal Server Error - - The server encountered an unexpected condition which prevented it - from fulfilling the request. - -10.5.2 501 Not Implemented - - The server does not support the functionality required to fulfill the - request. This is the appropriate response when the server does not - recognize the request method and is not capable of supporting it for - any resource. - -10.5.3 502 Bad Gateway - - The server, while acting as a gateway or proxy, received an invalid - response from the upstream server it accessed in attempting to - fulfill the request. - -10.5.4 503 Service Unavailable - - The server is currently unable to handle the request due to a - temporary overloading or maintenance of the server. The implication - is that this is a temporary condition which will be alleviated after - some delay. If known, the length of the delay may be indicated in a - Retry-After header. If no Retry-After is given, the client SHOULD - handle the response as it would for a 500 response. - - Note: The existence of the 503 status code does not imply that a - server must use it when becoming overloaded. Some servers may wish - to simply refuse the connection. - -10.5.5 504 Gateway Timeout - - The server, while acting as a gateway or proxy, did not receive a - timely response from the upstream server it accessed in attempting to - complete the request. 
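The 503 rule above (honour Retry-After when present, otherwise treat the response like a 500) can be sketched from the client side as follows. This is a minimal Python sketch, not part of the specification: the function name retry_delay_seconds is hypothetical, and it assumes the Retry-After value is either a whole number of seconds or an HTTP-date.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def retry_delay_seconds(status: int, retry_after: Optional[str],
                        now: datetime) -> Optional[float]:
    """Return how long to wait before retrying, or None if no delay is known.

    None means "handle this as an ordinary 500-style failure".
    """
    if status != 503 or retry_after is None:
        return None

    value = retry_after.strip()
    if value.isdigit():
        # Delta-seconds form, e.g. "Retry-After: 120".
        return float(value)

    try:
        # HTTP-date form, e.g. "Fri, 31 Dec 1999 23:59:59 GMT".
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None  # Unparseable value: fall back to 500-style handling.

    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    # A date already in the past means the client may retry immediately.
    return max(0.0, (when - now).total_seconds())
```

The caller passes a timezone-aware "now" so that the subtraction against the parsed date is well defined.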
- - - - - -Fielding, et. al. Standards Track [Page 64] - -RFC 2068 HTTP/1.1 January 1997 - - -10.5.6 505 HTTP Version Not Supported - - The server does not support, or refuses to support, the HTTP protocol - version that was used in the request message. The server is - indicating that it is unable or unwilling to complete the request - using the same major version as the client, as described in section - 3.1, other than with this error message. The response SHOULD contain - an entity describing why that version is not supported and what other - protocols are supported by that server. - -11 Access Authentication - - HTTP provides a simple challenge-response authentication mechanism - which MAY be used by a server to challenge a client request and by a - client to provide authentication information. It uses an extensible, - case-insensitive token to identify the authentication scheme, - followed by a comma-separated list of attribute-value pairs which - carry the parameters necessary for achieving authentication via that - scheme. - - auth-scheme = token - - auth-param = token "=" quoted-string - - The 401 (Unauthorized) response message is used by an origin server - to challenge the authorization of a user agent. This response MUST - include a WWW-Authenticate header field containing at least one - challenge applicable to the requested resource. - - challenge = auth-scheme 1*SP realm *( "," auth-param ) - - realm = "realm" "=" realm-value - realm-value = quoted-string - - The realm attribute (case-insensitive) is required for all - authentication schemes which issue a challenge. The realm value - (case-sensitive), in combination with the canonical root URL (see - section 5.1.2) of the server being accessed, defines the protection - space. These realms allow the protected resources on a server to be - partitioned into a set of protection spaces, each with its own - authentication scheme and/or authorization database. 
The realm value - is a string, generally assigned by the origin server, which may have - additional semantics specific to the authentication scheme. - - A user agent that wishes to authenticate itself with a server - (usually, but not necessarily, after receiving a 401 response) - MAY do so by including an Authorization header field with the - request. The Authorization field value consists of credentials - containing the authentication information of the user agent for the - realm of the resource being requested. - - credentials = basic-credentials - | auth-scheme #auth-param - - The domain over which credentials can be automatically applied by a - user agent is determined by the protection space. If a prior request - has been authorized, the same credentials MAY be reused for all other - requests within that protection space for a period of time determined - by the authentication scheme, parameters, and/or user preference. - Unless otherwise defined by the authentication scheme, a single - protection space cannot extend outside the scope of its server. - - If the server does not wish to accept the credentials sent with a - request, it SHOULD return a 401 (Unauthorized) response. The response - MUST include a WWW-Authenticate header field containing the (possibly - new) challenge applicable to the requested resource and an entity - explaining the refusal. - - The HTTP protocol does not restrict applications to this simple - challenge-response mechanism for access authentication. Additional - mechanisms MAY be used, such as encryption at the transport level or - via message encapsulation, and with additional header fields - specifying authentication information. However, these additional - mechanisms are not defined by this specification. - - Proxies MUST be completely transparent regarding user agent - authentication. 
That is, they MUST forward the WWW-Authenticate and - Authorization headers untouched, and follow the rules found in - section 14.8. - - HTTP/1.1 allows a client to pass authentication information to and - from a proxy via the Proxy-Authenticate and Proxy-Authorization - headers. - -11.1 Basic Authentication Scheme - - The "basic" authentication scheme is based on the model that the user - agent must authenticate itself with a user-ID and a password for each - realm. The realm value should be considered an opaque string which - can only be compared for equality with other realms on that server. - The server will service the request only if it can validate the - user-ID and password for the protection space of the Request-URI. - There are no optional authentication parameters. - - - - - - -Fielding, et. al. Standards Track [Page 66] - -RFC 2068 HTTP/1.1 January 1997 - - - Upon receipt of an unauthorized request for a URI within the - protection space, the server MAY respond with a challenge like the - following: - - WWW-Authenticate: Basic realm="WallyWorld" - - where "WallyWorld" is the string assigned by the server to identify - the protection space of the Request-URI. - - To receive authorization, the client sends the userid and password, - separated by a single colon (":") character, within a base64 encoded - string in the credentials. - - basic-credentials = "Basic" SP basic-cookie - - basic-cookie = <base64 [7] encoding of user-pass, - except not limited to 76 char/line> - - user-pass = userid ":" password - - userid = *<TEXT excluding ":"> - - password = *TEXT - - Userids might be case sensitive. - - If the user agent wishes to send the userid "Aladdin" and password - "open sesame", it would use the following header field: - - Authorization: Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ== - - See section 15 for security considerations associated with Basic - authentication. 
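The basic-credentials grammar above can be exercised with a short Python sketch. The function names basic_credentials and parse_basic_credentials are hypothetical, and the use of Latin-1 to turn the user-pass string into octets is an assumption of this example; the text above does not name a character encoding.

```python
import base64
from typing import Tuple


def basic_credentials(userid: str, password: str) -> str:
    """Build the Authorization field value for the Basic scheme."""
    if ":" in userid:
        # The grammar excludes ":" from userid; it is the separator.
        raise ValueError("userid must not contain ':'")
    user_pass = f"{userid}:{password}".encode("latin-1")  # assumed encoding
    # b64encode emits a single unbroken line (no 76-character wrapping).
    return "Basic " + base64.b64encode(user_pass).decode("ascii")


def parse_basic_credentials(value: str) -> Tuple[str, str]:
    """Split an Authorization field value back into (userid, password)."""
    scheme, _, cookie = value.partition(" ")
    if scheme.lower() != "basic":  # the scheme token is case-insensitive
        raise ValueError("not a Basic credential")
    user_pass = base64.b64decode(cookie.strip(), validate=True).decode("latin-1")
    userid, sep, password = user_pass.partition(":")
    if not sep:
        raise ValueError("missing ':' separator")
    # Only the first colon separates; the password may itself contain ":".
    return userid, password
```

With the userid "Aladdin" and password "open sesame" this reproduces the header value given in the example above. Base64 is an encoding, not encryption, so the password travels in effectively clear text.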
- -11.2 Digest Authentication Scheme - - A digest authentication for HTTP is specified in RFC 2069 [32]. - -12 Content Negotiation - - Most HTTP responses include an entity which contains information for - interpretation by a human user. Naturally, it is desirable to supply - the user with the "best available" entity corresponding to the - request. Unfortunately for servers and caches, not all users have - the same preferences for what is "best," and not all user agents are - equally capable of rendering all entity types. For that reason, HTTP - has provisions for several mechanisms for "content negotiation" -- - the process of selecting the best representation for a given response - - - -Fielding, et. al. Standards Track [Page 67] - -RFC 2068 HTTP/1.1 January 1997 - - - when there are multiple representations available. - - Note: This is not called "format negotiation" because the alternate - representations may be of the same media type, but use different - capabilities of that type, be in different languages, etc. - - Any response containing an entity-body MAY be subject to negotiation, - including error responses. - - There are two kinds of content negotiation which are possible in - HTTP: server-driven and agent-driven negotiation. These two kinds of - negotiation are orthogonal and thus may be used separately or in - combination. One method of combination, referred to as transparent - negotiation, occurs when a cache uses the agent-driven negotiation - information provided by the origin server in order to provide - server-driven negotiation for subsequent requests. - -12.1 Server-driven Negotiation - - If the selection of the best representation for a response is made by - an algorithm located at the server, it is called server-driven - negotiation. Selection is based on the available representations of - the response (the dimensions over which it can vary; e.g. language, - content-coding, etc.) 
and the contents of particular header fields in - the request message or on other information pertaining to the request - (such as the network address of the client). - - Server-driven negotiation is advantageous when the algorithm for - selecting from among the available representations is difficult to - describe to the user agent, or when the server desires to send its - "best guess" to the client along with the first response (hoping to - avoid the round-trip delay of a subsequent request if the "best - guess" is good enough for the user). In order to improve the server's - guess, the user agent MAY include request header fields (Accept, - Accept-Language, Accept-Encoding, etc.) which describe its - preferences for such a response. - - Server-driven negotiation has disadvantages: - -1. It is impossible for the server to accurately determine what might be - "best" for any given user, since that would require complete - knowledge of both the capabilities of the user agent and the intended - use for the response (e.g., does the user want to view it on screen - or print it on paper?). - -2. Having the user agent describe its capabilities in every request can - be both very inefficient (given that only a small percentage of - responses have multiple representations) and a potential violation of - the user's privacy. - -3. It complicates the implementation of an origin server and the - algorithms for generating responses to a request. - -4. It may limit a public cache's ability to use the same response for - multiple users' requests. - - HTTP/1.1 includes the following request-header fields for enabling - server-driven negotiation through description of user agent - capabilities and user preferences: Accept (section 14.1), - Accept-Charset (section 14.2), Accept-Encoding (section 14.3), - Accept-Language (section 14.4), and User-Agent (section 14.42). 
However, an - origin server is not limited to these dimensions and MAY vary the - response based on any aspect of the request, including information - outside the request-header fields or within extension header fields - not defined by this specification. - - HTTP/1.1 origin servers MUST include an appropriate Vary header field - (section 14.43) in any cachable response based on server-driven - negotiation. The Vary header field describes the dimensions over - which the response might vary (i.e. the dimensions over which the - origin server picks its "best guess" response from multiple - representations). - - HTTP/1.1 public caches MUST recognize the Vary header field when it - is included in a response and obey the requirements described in - section 13.6 that describes the interactions between caching and - content negotiation. - -12.2 Agent-driven Negotiation - - With agent-driven negotiation, selection of the best representation - for a response is performed by the user agent after receiving an - initial response from the origin server. Selection is based on a list - of the available representations of the response included within the - header fields (this specification reserves the field-name Alternates, - as described in appendix 19.6.2.1) or entity-body of the initial - response, with each representation identified by its own URI. - Selection from among the representations may be performed - automatically (if the user agent is capable of doing so) or manually - by the user selecting from a generated (possibly hypertext) menu. - - Agent-driven negotiation is advantageous when the response would vary - over commonly-used dimensions (such as type, language, or encoding), - when the origin server is unable to determine a user agent's - capabilities from examining the request, and generally when public - caches are used to distribute server load and reduce network usage. - - - -Fielding, et. al. 
Standards Track [Page 69] - -RFC 2068 HTTP/1.1 January 1997 - - - Agent-driven negotiation suffers from the disadvantage of needing a - second request to obtain the best alternate representation. This - second request is only efficient when caching is used. In addition, - this specification does not define any mechanism for supporting - automatic selection, though it also does not prevent any such - mechanism from being developed as an extension and used within - HTTP/1.1. - - HTTP/1.1 defines the 300 (Multiple Choices) and 406 (Not Acceptable) - status codes for enabling agent-driven negotiation when the server is - unwilling or unable to provide a varying response using server-driven - negotiation. - -12.3 Transparent Negotiation - - Transparent negotiation is a combination of both server-driven and - agent-driven negotiation. When a cache is supplied with a form of the - list of available representations of the response (as in agent-driven - negotiation) and the dimensions of variance are completely understood - by the cache, then the cache becomes capable of performing server- - driven negotiation on behalf of the origin server for subsequent - requests on that resource. - - Transparent negotiation has the advantage of distributing the - negotiation work that would otherwise be required of the origin - server and also removing the second request delay of agent-driven - negotiation when the cache is able to correctly guess the right - response. - - This specification does not define any mechanism for transparent - negotiation, though it also does not prevent any such mechanism from - being developed as an extension and used within HTTP/1.1. An HTTP/1.1 - cache performing transparent negotiation MUST include a Vary header - field in the response (defining the dimensions of its variance) if it - is cachable to ensure correct interoperation with all HTTP/1.1 - clients. 
The agent-driven negotiation information supplied by the - origin server SHOULD be included with the transparently negotiated - response. - -13 Caching in HTTP - - HTTP is typically used for distributed information systems, where - performance can be improved by the use of response caches. The - HTTP/1.1 protocol includes a number of elements intended to make - caching work as well as possible. Because these elements are - inextricable from other aspects of the protocol, and because they - interact with each other, it is useful to describe the basic caching - design of HTTP separately from the detailed descriptions of methods, - - - -Fielding, et. al. Standards Track [Page 70] - -RFC 2068 HTTP/1.1 January 1997 - - - headers, response codes, etc. - - Caching would be useless if it did not significantly improve - performance. The goal of caching in HTTP/1.1 is to eliminate the need - to send requests in many cases, and to eliminate the need to send - full responses in many other cases. The former reduces the number of - network round-trips required for many operations; we use an - "expiration" mechanism for this purpose (see section 13.2). The - latter reduces network bandwidth requirements; we use a "validation" - mechanism for this purpose (see section 13.3). - - Requirements for performance, availability, and disconnected - operation require us to be able to relax the goal of semantic - transparency. The HTTP/1.1 protocol allows origin servers, caches, - and clients to explicitly reduce transparency when necessary. 
- However, because non-transparent operation may confuse non-expert - users, and may be incompatible with certain server applications (such - as those for ordering merchandise), the protocol requires that - transparency be relaxed - - o only by an explicit protocol-level request when relaxed by client - or origin server - - o only with an explicit warning to the end user when relaxed by cache - or client - - - - - - - - - - - - - - - - - - - - - - - - - - -Fielding, et. al. Standards Track [Page 71] - -RFC 2068 HTTP/1.1 January 1997 - - - Therefore, the HTTP/1.1 protocol provides these important elements: - - 1. Protocol features that provide full semantic transparency when this - is required by all parties. - - 2. Protocol features that allow an origin server or user agent to - explicitly request and control non-transparent operation. - - 3. Protocol features that allow a cache to attach warnings to - responses that do not preserve the requested approximation of - semantic transparency. - - A basic principle is that it must be possible for the clients to - detect any potential relaxation of semantic transparency. - - Note: The server, cache, or client implementer may be faced with - design decisions not explicitly discussed in this specification. If - a decision may affect semantic transparency, the implementer ought - to err on the side of maintaining transparency unless a careful and - complete analysis shows significant benefits in breaking - transparency. - -13.1.1 Cache Correctness - - A correct cache MUST respond to a request with the most up-to-date - response held by the cache that is appropriate to the request (see - sections 13.2.5, 13.2.6, and 13.12) which meets one of the following - conditions: - - 1. It has been checked for equivalence with what the origin server - would have returned by revalidating the response with the origin - server (section 13.3); - - 2. It is "fresh enough" (see section 13.2). 
In the default case, this - means it meets the least restrictive freshness requirement of the - client, server, and cache (see section 14.9); if the origin server - so specifies, it is the freshness requirement of the origin server - alone. - - 3. It includes a warning if the freshness demand of the client or the - origin server is violated (see sections 13.1.5 and 14.45). - - 4. It is an appropriate 304 (Not Modified), 305 (Use Proxy), or - error (4xx or 5xx) response message. - - If the cache cannot communicate with the origin server, then a - correct cache SHOULD respond as above if the response can be - correctly served from the cache; if not it MUST return an error or - warning indicating that there was a communication failure. - - If a cache receives a response (either an entire response, or a 304 - (Not Modified) response) that it would normally forward to the - requesting client, and the received response is no longer fresh, the - cache SHOULD forward it to the requesting client without adding a new - Warning (but without removing any existing Warning headers). A cache - SHOULD NOT attempt to revalidate a response simply because that - response became stale in transit; this might lead to an infinite - loop. A user agent that receives a stale response without a Warning - MAY display a warning indication to the user. - -13.1.2 Warnings - - Whenever a cache returns a response that is neither first-hand nor - "fresh enough" (in the sense of condition 2 in section 13.1.1), it - must attach a warning to that effect, using a Warning - response-header. This warning allows clients to take appropriate action. - - Warnings may be used for other purposes, both cache-related and - otherwise. The use of a warning, rather than an error status code, - distinguishes these responses from true failures. 
- - Warnings are always cachable, because they never weaken the - transparency of a response. This means that warnings can be passed to - HTTP/1.0 caches without danger; such caches will simply pass the - warning along as an entity-header in the response. - - Warnings are assigned numbers between 0 and 99. This specification - defines the code number and meaning of each currently assigned - warning, allowing a client or cache to take automated action in some - (but not all) cases. - - Warnings also carry a warning text. The text may be in any - appropriate natural language (perhaps based on the client's Accept - headers), and may include an optional indication of what character set is - used. - - Multiple warnings may be attached to a response (either by the origin - server or by a cache), including multiple warnings with the same code - number. For example, a server may provide the same warning with texts - in both English and Basque. - - When multiple warnings are attached to a response, it may not be - practical or reasonable to display all of them to the user. This - version of HTTP does not specify strict priority rules for deciding - which warnings to display and in what order, but does suggest some - heuristics. - - The Warning header and the currently defined warnings are described - in section 14.45. - -13.1.3 Cache-control Mechanisms - - The basic cache mechanisms in HTTP/1.1 (server-specified expiration - times and validators) are implicit directives to caches. In some - cases, a server or client may need to provide explicit directives to - the HTTP caches. We use the Cache-Control header for this purpose. - - The Cache-Control header allows a client or server to transmit a - variety of directives in either requests or responses. These - directives typically override the default caching algorithms. 
As a - general rule, if there is any apparent conflict between header - values, the most restrictive interpretation should be applied (that - is, the one that is most likely to preserve semantic transparency). - However, in some cases, Cache-Control directives are explicitly - specified as weakening the approximation of semantic transparency - (for example, "max-stale" or "public"). - - The Cache-Control directives are described in detail in section 14.9. - -13.1.4 Explicit User Agent Warnings - - Many user agents make it possible for users to override the basic - caching mechanisms. For example, the user agent may allow the user to - specify that cached entities (even explicitly stale ones) are never - validated. Or the user agent might habitually add "Cache-Control: - max-stale=3600" to every request. The user should have to explicitly - request either non-transparent behavior, or behavior that results in - abnormally ineffective caching. - - If the user has overridden the basic caching mechanisms, the user - agent should explicitly indicate to the user whenever this results in - the display of information that might not meet the server's - transparency requirements (in particular, if the displayed entity is - known to be stale). Since the protocol normally allows the user agent - to determine if responses are stale or not, this indication need only - be displayed when this actually happens. The indication need not be a - dialog box; it could be an icon (for example, a picture of a rotting - fish) or some other visual indicator. - - If the user has overridden the caching mechanisms in a way that would - abnormally reduce the effectiveness of caches, the user agent should - continually display an indication (for example, a picture of currency - in flames) so that the user does not inadvertently consume excess - resources or suffer from excessive latency. - - - - -Fielding, et. al. 
Standards Track [Page 74] - -RFC 2068 HTTP/1.1 January 1997 - - -13.1.5 Exceptions to the Rules and Warnings - - In some cases, the operator of a cache may choose to configure it to - return stale responses even when not requested by clients. This - decision should not be made lightly, but may be necessary for reasons - of availability or performance, especially when the cache is poorly - connected to the origin server. Whenever a cache returns a stale - response, it MUST mark it as such (using a Warning header). This - allows the client software to alert the user that there may be a - potential problem. - - It also allows the user agent to take steps to obtain a first-hand or - fresh response. For this reason, a cache SHOULD NOT return a stale - response if the client explicitly requests a first-hand or fresh one, - unless it is impossible to comply for technical or policy reasons. - -13.1.6 Client-controlled Behavior - - While the origin server (and to a lesser extent, intermediate caches, - by their contribution to the age of a response) are the primary - source of expiration information, in some cases the client may need - to control a cache's decision about whether to return a cached - response without validating it. Clients do this using several - directives of the Cache-Control header. - - A client's request may specify the maximum age it is willing to - accept of an unvalidated response; specifying a value of zero forces - the cache(s) to revalidate all responses. A client may also specify - the minimum time remaining before a response expires. Both of these - options increase constraints on the behavior of caches, and so cannot - further relax the cache's approximation of semantic transparency. - - A client may also specify that it will accept stale responses, up to - some maximum amount of staleness. 
This loosens the constraints on the - caches, and so may violate the origin server's specified constraints - on semantic transparency, but may be necessary to support - disconnected operation, or high availability in the face of poor - connectivity. - -13.2 Expiration Model - -13.2.1 Server-Specified Expiration - - HTTP caching works best when caches can entirely avoid making - requests to the origin server. The primary mechanism for avoiding - requests is for an origin server to provide an explicit expiration - time in the future, indicating that a response may be used to satisfy - subsequent requests. In other words, a cache can return a fresh - - - -Fielding, et. al. Standards Track [Page 75] - -RFC 2068 HTTP/1.1 January 1997 - - - response without first contacting the server. - - Our expectation is that servers will assign future explicit - expiration times to responses in the belief that the entity is not - likely to change, in a semantically significant way, before the - expiration time is reached. This normally preserves semantic - transparency, as long as the server's expiration times are carefully - chosen. - - The expiration mechanism applies only to responses taken from a cache - and not to first-hand responses forwarded immediately to the - requesting client. - - If an origin server wishes to force a semantically transparent cache - to validate every request, it may assign an explicit expiration time - in the past. This means that the response is always stale, and so the - cache SHOULD validate it before using it for subsequent requests. See - section 14.9.4 for a more restrictive way to force revalidation. - - If an origin server wishes to force any HTTP/1.1 cache, no matter how - it is configured, to validate every request, it should use the - "must-revalidate" Cache-Control directive (see section 14.9). - - Servers specify explicit expiration times using either the Expires - header, or the max-age directive of the Cache-Control header. 
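The server-specified expiration rules above can be illustrated with a small Python sketch. It is not taken from this section: the function names are hypothetical, all times are assumed to be seconds since an arbitrary epoch, and the rule that max-age takes priority over Expires comes from the freshness-lifetime discussion in section 13.2.4 of the same specification.

```python
from typing import Optional


def freshness_lifetime(max_age: Optional[int],
                       expires_value: Optional[float],
                       date_value: Optional[float]) -> Optional[float]:
    """Return the explicit freshness lifetime in seconds, or None if absent.

    max_age       -- value of the Cache-Control max-age directive, if any
    expires_value -- Expires header as seconds since the epoch, if any
    date_value    -- Date header as seconds since the epoch, if any
    """
    if max_age is not None:
        # max-age takes priority over Expires (assumption from section 13.2.4).
        return float(max_age)
    if expires_value is not None and date_value is not None:
        # An Expires time in the past yields a negative lifetime.
        return expires_value - date_value
    return None  # No explicit expiration; a cache may fall back to heuristics.


def is_fresh(lifetime: float, current_age: float) -> bool:
    """A response is fresh while its lifetime exceeds its current age."""
    return lifetime > current_age
```

An origin server that sets Expires earlier than Date produces a negative lifetime, so is_fresh is false even at age zero; this matches the "always stale" case described above.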
- - An expiration time cannot be used to force a user agent to refresh - its display or reload a resource; its semantics apply only to caching - mechanisms, and such mechanisms need only check a resource's - expiration status when a new request for that resource is initiated. - See section 13.13 for explanation of the difference between caches - and history mechanisms. - -13.2.2 Heuristic Expiration - - Since origin servers do not always provide explicit expiration times, - HTTP caches typically assign heuristic expiration times, employing - algorithms that use other header values (such as the Last-Modified - time) to estimate a plausible expiration time. The HTTP/1.1 - specification does not provide specific algorithms, but does impose - worst-case constraints on their results. Since heuristic expiration - times may compromise semantic transparency, they should be used - cautiously, and we encourage origin servers to provide explicit - expiration times as much as possible. - - - - - - - -Fielding, et. al. Standards Track [Page 76] - -RFC 2068 HTTP/1.1 January 1997 - - -13.2.3 Age Calculations - - In order to know if a cached entry is fresh, a cache needs to know if - its age exceeds its freshness lifetime. We discuss how to calculate - the latter in section 13.2.4; this section describes how to calculate - the age of a response or cache entry. - - In this discussion, we use the term "now" to mean "the current value - of the clock at the host performing the calculation." Hosts that use - HTTP, but especially hosts running origin servers and caches, should - use NTP [28] or some similar protocol to synchronize their clocks to - a globally accurate time standard. - - Also note that HTTP/1.1 requires origin servers to send a Date header - with every response, giving the time at which the response was - generated. We use the term "date_value" to denote the value of the - Date header, in a form appropriate for arithmetic operations. 
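One possible way to obtain "date_value" in a form suitable for arithmetic is to convert the Date header into seconds since the Unix epoch. The sketch below assumes an RFC 1123 style date that names its zone as GMT; the function name `date_value` simply mirrors the term used in the text.

```python
from email.utils import parsedate_to_datetime


def date_value(date_header: str) -> int:
    """Convert an HTTP Date header into whole seconds since the Unix epoch.

    Assumes the header carries an explicit GMT zone, as HTTP dates do,
    so the parsed datetime is timezone-aware.
    """
    return int(parsedate_to_datetime(date_header).timestamp())
```

With values in this form, "now minus date_value" becomes ordinary integer subtraction.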
- - HTTP/1.1 uses the Age response-header to help convey age information - between caches. The Age header value is the sender's estimate of the - amount of time since the response was generated at the origin server. - In the case of a cached response that has been revalidated with the - origin server, the Age value is based on the time of revalidation, - not of the original response. - - In essence, the Age value is the sum of the time that the response - has been resident in each of the caches along the path from the - origin server, plus the amount of time it has been in transit along - network paths. - - We use the term "age_value" to denote the value of the Age header, in - a form appropriate for arithmetic operations. - - A response's age can be calculated in two entirely independent ways: - - 1. now minus date_value, if the local clock is reasonably well - synchronized to the origin server's clock. If the result is - negative, the result is replaced by zero. - - 2. age_value, if all of the caches along the response path - implement HTTP/1.1. - - Given that we have two independent ways to compute the age of a - response when it is received, we can combine these as - - corrected_received_age = max(now - date_value, age_value) - - and as long as we have either nearly synchronized clocks or all- - - - -Fielding, et. al. Standards Track [Page 77] - -RFC 2068 HTTP/1.1 January 1997 - - - HTTP/1.1 paths, one gets a reliable (conservative) result. - - Note that this correction is applied at each HTTP/1.1 cache along the - path, so that if there is an HTTP/1.0 cache in the path, the correct - received age is computed as long as the receiving cache's clock is - nearly in sync. We don't need end-to-end clock synchronization - (although it is good to have), and there is no explicit clock - synchronization step. 
-
-   Because of network-imposed delays, some significant interval may pass
-   between the time that a server generates a response and the time it is
-   received at the next outbound cache or client. If uncorrected, this
-   delay could result in improperly low ages.
-
-   Because the request that resulted in the returned Age value must have
-   been initiated prior to that Age value's generation, we can correct
-   for delays imposed by the network by recording the time at which the
-   request was initiated. Then, when an Age value is received, it MUST
-   be interpreted relative to the time the request was initiated, not
-   the time that the response was received. This algorithm results in
-   conservative behavior no matter how much delay is experienced. So, we
-   compute:
-
-      corrected_initial_age = corrected_received_age
-                            + (now - request_time)
-
-   where "request_time" is the time (according to the local clock) when
-   the request that elicited this response was sent.
-
-   Summary of age calculation algorithm, when a cache receives a
-   response:
-
-      /*
-       * age_value
-       *      is the value of the Age: header received by the cache
-       *      with this response.
-       * date_value
-       *      is the value of the origin server's Date: header
-       * request_time
-       *      is the (local) time when the cache made the request
-       *      that resulted in this cached response
-       * response_time
-       *      is the (local) time when the cache received the
-       *      response
-       * now
-       *      is the current (local) time
-       */
-      apparent_age = max(0, response_time - date_value);
-      corrected_received_age = max(apparent_age, age_value);
-      response_delay = response_time - request_time;
-      corrected_initial_age = corrected_received_age + response_delay;
-      resident_time = now - response_time;
-      current_age = corrected_initial_age + resident_time;
-
-   When a cache sends a response, it must add to the
-   corrected_initial_age the amount of time that the response was
-   resident locally. It must then transmit this total age, using the Age
-   header, to the next recipient cache.
-
-      Note that a client cannot reliably tell that a response is
-      first-hand, but the presence of an Age header indicates that a
-      response is definitely not first-hand. Also, if the Date in a
-      response is earlier than the client's local request time, the
-      response is probably not first-hand (in the absence of serious
-      clock skew).
-
-13.2.4 Expiration Calculations
-
-   In order to decide whether a response is fresh or stale, we need to
-   compare its freshness lifetime to its age. The age is calculated as
-   described in section 13.2.3; this section describes how to calculate
-   the freshness lifetime, and to determine if a response has expired.
-   In the discussion below, the values can be represented in any form
-   appropriate for arithmetic operations.
-
-   We use the term "expires_value" to denote the value of the Expires
-   header. We use the term "max_age_value" to denote an appropriate
-   value of the number of seconds carried by the max-age directive of
-   the Cache-Control header in a response (see section 14.9).
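The age calculation summarized in section 13.2.3 translates almost line for line into Python. The sketch below assumes every argument is already a number of seconds on the cache's local clock, except `date_value`, which comes from the origin server's Date header. The function name `current_age` is chosen here to match the pseudocode variable.

```python
def current_age(age_value, date_value, request_time, response_time, now):
    """Estimate the current age of a cached response, in seconds.

    Follows the summary algorithm of section 13.2.3 step by step.
    """
    # Age implied by the Date header, never allowed to go negative.
    apparent_age = max(0, response_time - date_value)
    # Take the more conservative of the two independent estimates.
    corrected_received_age = max(apparent_age, age_value)
    # Time the request/response exchange spent on the network.
    response_delay = response_time - request_time
    corrected_initial_age = corrected_received_age + response_delay
    # Time the response has been sitting in this cache.
    resident_time = now - response_time
    return corrected_initial_age + resident_time
```

For example, with `age_value=10`, `date_value=100`, `request_time=103`, `response_time=105` and `now=125`, the apparent age is 5, the corrected received age is 10, the response delay is 2 and the resident time is 20, giving a current age of 32 seconds.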
- - The max-age directive takes priority over Expires, so if max-age is - present in a response, the calculation is simply: - - freshness_lifetime = max_age_value - - Otherwise, if Expires is present in the response, the calculation is: - - freshness_lifetime = expires_value - date_value - - Note that neither of these calculations is vulnerable to clock skew, - since all of the information comes from the origin server. - - If neither Expires nor Cache-Control: max-age appears in the - response, and the response does not include other restrictions on - caching, the cache MAY compute a freshness lifetime using a - heuristic. If the value is greater than 24 hours, the cache must - attach Warning 13 to any response whose age is more than 24 hours if - - - -Fielding, et. al. Standards Track [Page 79] - -RFC 2068 HTTP/1.1 January 1997 - - - such warning has not already been added. - - Also, if the response does have a Last-Modified time, the heuristic - expiration value SHOULD be no more than some fraction of the interval - since that time. A typical setting of this fraction might be 10%. - - The calculation to determine if a response has expired is quite - simple: - - response_is_fresh = (freshness_lifetime > current_age) - -13.2.5 Disambiguating Expiration Values - - Because expiration values are assigned optimistically, it is possible - for two caches to contain fresh values for the same resource that are - different. - - If a client performing a retrieval receives a non-first-hand response - for a request that was already fresh in its own cache, and the Date - header in its existing cache entry is newer than the Date on the new - response, then the client MAY ignore the response. If so, it MAY - retry the request with a "Cache-Control: max-age=0" directive (see - section 14.9), to force a check with the origin server. - - If a cache has two fresh responses for the same representation with - different validators, it MUST use the one with the more recent Date - header. 
This situation may arise because the cache is pooling - responses from other caches, or because a client has asked for a - reload or a revalidation of an apparently fresh cache entry. - -13.2.6 Disambiguating Multiple Responses - - Because a client may be receiving responses via multiple paths, so - that some responses flow through one set of caches and other - responses flow through a different set of caches, a client may - receive responses in an order different from that in which the origin - server sent them. We would like the client to use the most recently - generated response, even if older responses are still apparently - fresh. - - Neither the entity tag nor the expiration value can impose an - ordering on responses, since it is possible that a later response - intentionally carries an earlier expiration time. However, the - HTTP/1.1 specification requires the transmission of Date headers on - every response, and the Date values are ordered to a granularity of - one second. - - - - - -Fielding, et. al. Standards Track [Page 80] - -RFC 2068 HTTP/1.1 January 1997 - - - When a client tries to revalidate a cache entry, and the response it - receives contains a Date header that appears to be older than the one - for the existing entry, then the client SHOULD repeat the request - unconditionally, and include - - Cache-Control: max-age=0 - - to force any intermediate caches to validate their copies directly - with the origin server, or - - Cache-Control: no-cache - - to force any intermediate caches to obtain a new copy from the origin - server. - - If the Date values are equal, then the client may use either response - (or may, if it is being extremely prudent, request a new response). - Servers MUST NOT depend on clients being able to choose - deterministically between responses generated during the same second, - if their expiration times overlap. 
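The expiration calculations of section 13.2.4 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: all values are seconds since a common epoch, the heuristic uses the 10% figure that the text offers only as a typical setting, the interval since Last-Modified is measured from `date_value`, and a response with no usable information is treated as having a lifetime of zero. The parameter names beyond those used in the text are hypothetical.

```python
def freshness_lifetime(date_value, max_age_value=None, expires_value=None,
                       last_modified_value=None, heuristic_percent=10):
    """Return the freshness lifetime of a response, in seconds."""
    # max-age takes priority over Expires.
    if max_age_value is not None:
        return max_age_value
    if expires_value is not None:
        return expires_value - date_value
    # Heuristic: a fraction of the interval since Last-Modified
    # (assumption: the interval is measured from the Date header).
    if last_modified_value is not None:
        interval = max(0, date_value - last_modified_value)
        return interval * heuristic_percent // 100
    # Assumption of this sketch: with nothing to go on, treat as stale.
    return 0


def response_is_fresh(lifetime, current_age):
    """A response is fresh while its lifetime exceeds its current age."""
    return lifetime > current_age
```

A cache applying the heuristic branch would still have to attach Warning 13 when the resulting lifetime and the response's age both exceed 24 hours; that bookkeeping is left out here.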
- -13.3 Validation Model - - When a cache has a stale entry that it would like to use as a - response to a client's request, it first has to check with the origin - server (or possibly an intermediate cache with a fresh response) to - see if its cached entry is still usable. We call this "validating" - the cache entry. Since we do not want to have to pay the overhead of - retransmitting the full response if the cached entry is good, and we - do not want to pay the overhead of an extra round trip if the cached - entry is invalid, the HTTP/1.1 protocol supports the use of - conditional methods. - - The key protocol features for supporting conditional methods are - those concerned with "cache validators." When an origin server - generates a full response, it attaches some sort of validator to it, - which is kept with the cache entry. When a client (user agent or - proxy cache) makes a conditional request for a resource for which it - has a cache entry, it includes the associated validator in the - request. - - The server then checks that validator against the current validator - for the entity, and, if they match, it responds with a special status - code (usually, 304 (Not Modified)) and no entity-body. Otherwise, it - returns a full response (including entity-body). Thus, we avoid - transmitting the full response if the validator matches, and we avoid - an extra round trip if it does not match. - - - - -Fielding, et. al. Standards Track [Page 81] - -RFC 2068 HTTP/1.1 January 1997 - - - Note: the comparison functions used to decide if validators match - are defined in section 13.3.3. - - In HTTP/1.1, a conditional request looks exactly the same as a normal - request for the same resource, except that it carries a special - header (which includes the validator) that implicitly turns the - method (usually, GET) into a conditional. - - The protocol includes both positive and negative senses of cache- - validating conditions. 
That is, it is possible to request either that - a method be performed if and only if a validator matches or if and - only if no validators match. - - Note: a response that lacks a validator may still be cached, and - served from cache until it expires, unless this is explicitly - prohibited by a Cache-Control directive. However, a cache cannot do - a conditional retrieval if it does not have a validator for the - entity, which means it will not be refreshable after it expires. - -13.3.1 Last-modified Dates - - The Last-Modified entity-header field value is often used as a cache - validator. In simple terms, a cache entry is considered to be valid - if the entity has not been modified since the Last-Modified value. - -13.3.2 Entity Tag Cache Validators - - The ETag entity-header field value, an entity tag, provides for an - "opaque" cache validator. This may allow more reliable validation in - situations where it is inconvenient to store modification dates, - where the one-second resolution of HTTP date values is not - sufficient, or where the origin server wishes to avoid certain - paradoxes that may arise from the use of modification dates. - - Entity Tags are described in section 3.11. The headers used with - entity tags are described in sections 14.20, 14.25, 14.26 and 14.43. - -13.3.3 Weak and Strong Validators - - Since both origin servers and caches will compare two validators to - decide if they represent the same or different entities, one normally - would expect that if the entity (the entity-body or any entity- - headers) changes in any way, then the associated validator would - change as well. If this is true, then we call this validator a - "strong validator." - - However, there may be cases when a server prefers to change the - validator only on semantically significant changes, and not when - - - -Fielding, et. al. Standards Track [Page 82] - -RFC 2068 HTTP/1.1 January 1997 - - - insignificant aspects of the entity change. 
A validator that does not
-   always change when the resource changes is a "weak validator."
-
-   Entity tags are normally "strong validators," but the protocol
-   provides a mechanism to tag an entity tag as "weak." One can think of
-   a strong validator as one that changes whenever the bits of an entity
-   change, while a weak validator changes whenever the meaning of an
-   entity changes. Alternatively, one can think of a strong validator as
-   part of an identifier for a specific entity, while a weak validator is
-   part of an identifier for a set of semantically equivalent entities.
-
-      Note: One example of a strong validator is an integer that is
-      incremented in stable storage every time an entity is changed.
-
-      An entity's modification time, if represented with one-second
-      resolution, could be a weak validator, since it is possible that
-      the resource may be modified twice during a single second.
-
-      Support for weak validators is optional; however, weak validators
-      allow for more efficient caching of equivalent objects; for
-      example, a hit counter on a site is probably good enough if it is
-      updated every few days or weeks, and any value during that period
-      is likely "good enough" to be equivalent.
-
-   A "use" of a validator is either when a client generates a request
-   and includes the validator in a validating header field, or when a
-   server compares two validators.
-
-   Strong validators are usable in any context. Weak validators are only
-   usable in contexts that do not depend on exact equality of an entity.
-   For example, either kind is usable for a conditional GET of a full
-   entity. However, only a strong validator is usable for a sub-range
-   retrieval, since otherwise the client may end up with an internally
-   inconsistent entity.
-
-   The only function that the HTTP/1.1 protocol defines on validators is
-   comparison.
There are two validator comparison functions, depending
-   on whether the comparison context allows the use of weak validators
-   or not:
-
-     o The strong comparison function: in order to be considered equal,
-       both validators must be identical in every way, and neither may be
-       weak.
-     o The weak comparison function: in order to be considered equal, both
-       validators must be identical in every way, but either or both of
-       them may be tagged as "weak" without affecting the result.
-
-   The weak comparison function MAY be used for simple (non-subrange)
-   GET requests. The strong comparison function MUST be used in all
-   other cases.
-
-   An entity tag is strong unless it is explicitly tagged as weak.
-   Section 3.11 gives the syntax for entity tags.
-
-   A Last-Modified time, when used as a validator in a request, is
-   implicitly weak unless it is possible to deduce that it is strong,
-   using the following rules:
-
-     o The validator is being compared by an origin server to the actual
-       current validator for the entity and,
-     o That origin server reliably knows that the associated entity did
-       not change twice during the second covered by the presented
-       validator.
-
-   or
-
-     o The validator is about to be used by a client in an If-Modified-
-       Since or If-Unmodified-Since header, because the client has a cache
-       entry for the associated entity, and
-     o That cache entry includes a Date value, which gives the time when
-       the origin server sent the original response, and
-     o The presented Last-Modified time is at least 60 seconds before the
-       Date value.
-
-   or
-
-     o The validator is being compared by an intermediate cache to the
-       validator stored in its cache entry for the entity, and
-     o That cache entry includes a Date value, which gives the time when
-       the origin server sent the original response, and
-     o The presented Last-Modified time is at least 60 seconds before the
-       Date value.
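The two comparison functions, and the 60-second rule for Last-Modified times, can be sketched in Python. The sketch assumes entity tags are handled as raw header strings in which a weak tag carries the `W/` prefix from the entity-tag syntax. The function names are illustrative only.

```python
def is_weak(entity_tag: str) -> bool:
    """True if the entity tag is explicitly tagged as weak."""
    return entity_tag.startswith("W/")


def opaque_part(entity_tag: str) -> str:
    """The quoted opaque tag with any weakness marker removed."""
    return entity_tag[2:] if is_weak(entity_tag) else entity_tag


def strong_compare(a: str, b: str) -> bool:
    """Equal only if identical in every way and neither tag is weak."""
    return not is_weak(a) and not is_weak(b) and a == b


def weak_compare(a: str, b: str) -> bool:
    """Equal if the opaque parts match; weakness does not matter."""
    return opaque_part(a) == opaque_part(b)


def last_modified_is_strong(last_modified: int, date_value: int,
                            margin_seconds: int = 60) -> bool:
    """Client and cache rule: Last-Modified counts as strong when it is
    at least margin_seconds earlier than the entry's Date value."""
    return date_value - last_modified >= margin_seconds
```

For instance, `W/"xyzzy"` and `"xyzzy"` match under the weak comparison but not under the strong one.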
- - This method relies on the fact that if two different responses were - sent by the origin server during the same second, but both had the - same Last-Modified time, then at least one of those responses would - have a Date value equal to its Last-Modified time. The arbitrary 60- - second limit guards against the possibility that the Date and Last- - Modified values are generated from different clocks, or at somewhat - different times during the preparation of the response. An - implementation may use a value larger than 60 seconds, if it is - believed that 60 seconds is too short. - - If a client wishes to perform a sub-range retrieval on a value for - which it has only a Last-Modified time and no opaque validator, it - may do this only if the Last-Modified time is strong in the sense - described here. - - - - -Fielding, et. al. Standards Track [Page 84] - -RFC 2068 HTTP/1.1 January 1997 - - - A cache or origin server receiving a cache-conditional request, other - than a full-body GET request, MUST use the strong comparison function - to evaluate the condition. - - These rules allow HTTP/1.1 caches and clients to safely perform sub- - range retrievals on values that have been obtained from HTTP/1.0 - servers. - -13.3.4 Rules for When to Use Entity Tags and Last-modified Dates - - We adopt a set of rules and recommendations for origin servers, - clients, and caches regarding when various validator types should be - used, and for what purposes. - - HTTP/1.1 origin servers: - - o SHOULD send an entity tag validator unless it is not feasible to - generate one. - o MAY send a weak entity tag instead of a strong entity tag, if - performance considerations support the use of weak entity tags, or - if it is unfeasible to send a strong entity tag. 
-     o SHOULD send a Last-Modified value if it is feasible to send one,
-       unless the risk of a breakdown in semantic transparency that could
-       result from using this date in an If-Modified-Since header would
-       lead to serious problems.
-
-   In other words, the preferred behavior for an HTTP/1.1 origin server
-   is to send both a strong entity tag and a Last-Modified value.
-
-   In order to be legal, a strong entity tag MUST change whenever the
-   associated entity value changes in any way. A weak entity tag SHOULD
-   change whenever the associated entity changes in a semantically
-   significant way.
-
-      Note: in order to provide semantically transparent caching, an
-      origin server must avoid reusing a specific strong entity tag value
-      for two different entities, or reusing a specific weak entity tag
-      value for two semantically different entities. Cache entries may
-      persist for arbitrarily long periods, regardless of expiration
-      times, so it may be inappropriate to expect that a cache will never
-      again attempt to validate an entry using a validator that it
-      obtained at some point in the past.
-
-   HTTP/1.1 clients:
-
-     o If an entity tag has been provided by the origin server, MUST
-       use that entity tag in any cache-conditional request (using
-       If-Match or If-None-Match).
-     o If only a Last-Modified value has been provided by the origin
-       server, SHOULD use that value in non-subrange cache-conditional
-       requests (using If-Modified-Since).
-     o If only a Last-Modified value has been provided by an HTTP/1.0
-       origin server, MAY use that value in subrange cache-conditional
-       requests (using If-Unmodified-Since). The user agent should
-       provide a way to disable this, in case of difficulty.
-     o If both an entity tag and a Last-Modified value have been
-       provided by the origin server, SHOULD use both validators in
-       cache-conditional requests.
This allows both HTTP/1.0 and - HTTP/1.1 caches to respond appropriately. - - An HTTP/1.1 cache, upon receiving a request, MUST use the most - restrictive validator when deciding whether the client's cache entry - matches the cache's own cache entry. This is only an issue when the - request contains both an entity tag and a last-modified-date - validator (If-Modified-Since or If-Unmodified-Since). - - A note on rationale: The general principle behind these rules is - that HTTP/1.1 servers and clients should transmit as much non- - redundant information as is available in their responses and - requests. HTTP/1.1 systems receiving this information will make the - most conservative assumptions about the validators they receive. - - HTTP/1.0 clients and caches will ignore entity tags. Generally, - last-modified values received or used by these systems will support - transparent and efficient caching, and so HTTP/1.1 origin servers - should provide Last-Modified values. In those rare cases where the - use of a Last-Modified value as a validator by an HTTP/1.0 system - could result in a serious problem, then HTTP/1.1 origin servers - should not provide one. - -13.3.5 Non-validating Conditionals - - The principle behind entity tags is that only the service author - knows the semantics of a resource well enough to select an - appropriate cache validation mechanism, and the specification of any - validator comparison function more complex than byte-equality would - open up a can of worms. Thus, comparisons of any other headers - (except Last-Modified, for compatibility with HTTP/1.0) are never - used for purposes of validating a cache entry. - -13.4 Response Cachability - - Unless specifically constrained by a Cache-Control (section 14.9) - directive, a caching system may always store a successful response - (see section 13.8) as a cache entry, may return it without validation - if it is fresh, and may return it after successful validation. If - - - -Fielding, et. al. 
Standards Track [Page 86] - -RFC 2068 HTTP/1.1 January 1997 - - - there is neither a cache validator nor an explicit expiration time - associated with a response, we do not expect it to be cached, but - certain caches may violate this expectation (for example, when little - or no network connectivity is available). A client can usually detect - that such a response was taken from a cache by comparing the Date - header to the current time. - - Note that some HTTP/1.0 caches are known to violate this - expectation without providing any Warning. - - However, in some cases it may be inappropriate for a cache to retain - an entity, or to return it in response to a subsequent request. This - may be because absolute semantic transparency is deemed necessary by - the service author, or because of security or privacy considerations. - Certain Cache-Control directives are therefore provided so that the - server can indicate that certain resource entities, or portions - thereof, may not be cached regardless of other considerations. - - Note that section 14.8 normally prevents a shared cache from saving - and returning a response to a previous request if that request - included an Authorization header. - - A response received with a status code of 200, 203, 206, 300, 301 or - 410 may be stored by a cache and used in reply to a subsequent - request, subject to the expiration mechanism, unless a Cache-Control - directive prohibits caching. However, a cache that does not support - the Range and Content-Range headers MUST NOT cache 206 (Partial - Content) responses. - - A response received with any other status code MUST NOT be returned - in a reply to a subsequent request unless there are Cache-Control - directives or another header(s) that explicitly allow it. For - example, these include the following: an Expires header (section - 14.21); a "max-age", "must-revalidate", "proxy-revalidate", "public" - or "private" Cache-Control directive (section 14.9). 
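The status-code rules of section 13.4 can be expressed as a small predicate. This Python sketch makes several simplifying assumptions, each hypothetical: header names arrive lowercased in a dict, Cache-Control is split naively on commas (quoted arguments containing commas are not handled), and `no-store` is the only prohibiting directive modeled.

```python
CACHEABLE_BY_DEFAULT = {200, 203, 206, 300, 301, 410}
EXPLICIT_DIRECTIVES = {"max-age", "must-revalidate", "proxy-revalidate",
                       "public", "private"}


def cache_control_directives(headers: dict) -> set:
    """Directive names from a Cache-Control header (naive comma split)."""
    value = headers.get("cache-control", "")
    return {part.split("=", 1)[0].strip().lower()
            for part in value.split(",") if part.strip()}


def may_reuse(status: int, headers: dict, supports_ranges: bool = True) -> bool:
    """Whether a cache may store a response and use it for later requests."""
    directives = cache_control_directives(headers)
    if "no-store" in directives:
        return False  # assumption: the only prohibition modeled here
    if status == 206 and not supports_ranges:
        return False  # 206 needs Range and Content-Range support
    if status in CACHEABLE_BY_DEFAULT:
        return True
    # Any other status needs a header that explicitly allows reuse.
    return "expires" in headers or bool(directives & EXPLICIT_DIRECTIVES)
```

The predicate answers only the storage question; freshness and validation are decided separately.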
- -13.5 Constructing Responses From Caches - - The purpose of an HTTP cache is to store information received in - response to requests, for use in responding to future requests. In - many cases, a cache simply returns the appropriate parts of a - response to the requester. However, if the cache holds a cache entry - based on a previous response, it may have to combine parts of a new - response with what is held in the cache entry. - - - - - - - -Fielding, et. al. Standards Track [Page 87] - -RFC 2068 HTTP/1.1 January 1997 - - -13.5.1 End-to-end and Hop-by-hop Headers - - For the purpose of defining the behavior of caches and non-caching - proxies, we divide HTTP headers into two categories: - - o End-to-end headers, which must be transmitted to the - ultimate recipient of a request or response. End-to-end - headers in responses must be stored as part of a cache entry - and transmitted in any response formed from a cache entry. - o Hop-by-hop headers, which are meaningful only for a single - transport-level connection, and are not stored by caches or - forwarded by proxies. - - The following HTTP/1.1 headers are hop-by-hop headers: - - o Connection - o Keep-Alive - o Public - o Proxy-Authenticate - o Transfer-Encoding - o Upgrade - - All other headers defined by HTTP/1.1 are end-to-end headers. - - Hop-by-hop headers introduced in future versions of HTTP MUST be - listed in a Connection header, as described in section 14.10. - -13.5.2 Non-modifiable Headers - - Some features of the HTTP/1.1 protocol, such as Digest - Authentication, depend on the value of certain end-to-end headers. A - cache or non-caching proxy SHOULD NOT modify an end-to-end header - unless the definition of that header requires or specifically allows - that. 
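The split between end-to-end and hop-by-hop headers in section 13.5.1 can be sketched as a filter. The Python below assumes headers are held as a list of `(name, value)` pairs, and it also drops any header named in a Connection header, which is how future hop-by-hop headers are announced. The function name is hypothetical.

```python
HOP_BY_HOP = {"connection", "keep-alive", "public", "proxy-authenticate",
              "transfer-encoding", "upgrade"}


def end_to_end_headers(headers):
    """Return only the headers a cache stores or a proxy forwards.

    headers is a list of (name, value) pairs; names are compared
    case-insensitively.
    """
    listed = set()
    for name, value in headers:
        if name.lower() == "connection":
            # Tokens listed in Connection are hop-by-hop as well.
            listed.update(token.strip().lower()
                          for token in value.split(",") if token.strip())
    drop = HOP_BY_HOP | listed
    return [(name, value) for name, value in headers
            if name.lower() not in drop]
```

A list of pairs is used instead of a dict so that repeated header fields survive the filter in their original order.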
- - A cache or non-caching proxy MUST NOT modify any of the following - fields in a request or response, nor may it add any of these fields - if not already present: - - o Content-Location - o ETag - o Expires - o Last-Modified - - - - - - - - -Fielding, et. al. Standards Track [Page 88] - -RFC 2068 HTTP/1.1 January 1997 - - - A cache or non-caching proxy MUST NOT modify or add any of the - following fields in a response that contains the no-transform Cache- - Control directive, or in any request: - - o Content-Encoding - o Content-Length - o Content-Range - o Content-Type - - A cache or non-caching proxy MAY modify or add these fields in a - response that does not include no-transform, but if it does so, it - MUST add a Warning 14 (Transformation applied) if one does not - already appear in the response. - - Warning: unnecessary modification of end-to-end headers may cause - authentication failures if stronger authentication mechanisms are - introduced in later versions of HTTP. Such authentication - mechanisms may rely on the values of header fields not listed here. - -13.5.3 Combining Headers - - When a cache makes a validating request to a server, and the server - provides a 304 (Not Modified) response, the cache must construct a - response to send to the requesting client. The cache uses the - entity-body stored in the cache entry as the entity-body of this - outgoing response. The end-to-end headers stored in the cache entry - are used for the constructed response, except that any end-to-end - headers provided in the 304 response MUST replace the corresponding - headers from the cache entry. Unless the cache decides to remove the - cache entry, it MUST also replace the end-to-end headers stored with - the cache entry with corresponding headers received in the incoming - response. - - In other words, the set of end-to-end headers received in the - incoming response overrides all corresponding end-to-end headers - stored with the cache entry. 
The cache may add Warning headers (see
-   section 14.45) to this set.
-
-   If a header field-name in the incoming response matches more than one
-   header in the cache entry, all such old headers are replaced.
-
-      Note: this rule allows an origin server to use a 304 (Not Modified)
-      response to update any header associated with a previous response
-      for the same entity, although it might not always be meaningful or
-      correct to do so. This rule does not allow an origin server to use
-      a 304 (Not Modified) response to entirely delete a header that it
-      had provided with a previous response.
-
-13.5.4 Combining Byte Ranges
-
-   A response may transfer only a subrange of the bytes of an
-   entity-body, either because the request included one or more Range
-   specifications, or because a connection was broken prematurely. After
-   several such transfers, a cache may have received several ranges of
-   the same entity-body.
-
-   If a cache has a stored non-empty set of subranges for an entity, and
-   an incoming response transfers another subrange, the cache MAY
-   combine the new subrange with the existing set if both the following
-   conditions are met:
-
-     o Both the incoming response and the cache entry must have a cache
-       validator.
-     o The two cache validators must match using the strong comparison
-       function (see section 13.3.3).
-
-   If either requirement is not met, the cache must use only the most
-   recent partial response (based on the Date values transmitted with
-   every response, and using the incoming response if these values are
-   equal or missing), and must discard the other partial information.
-
-13.6 Caching Negotiated Responses
-
-   Use of server-driven content negotiation (section 12), as indicated
-   by the presence of a Vary header field in a response, alters the
-   conditions and procedure by which a cache can use the response for
-   subsequent requests.
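The header-combining rule of section 13.5.3 can be sketched in Python as follows. The sketch assumes both inputs are lists of `(name, value)` pairs that already contain end-to-end headers only; the function name `combine_headers` is hypothetical.

```python
def combine_headers(stored, from_304):
    """Merge the headers of a 304 response into a cache entry's headers.

    Every stored header whose name appears in the 304 response is
    dropped, even if it occurs several times, and the 304 headers are
    appended in their place.
    """
    replaced = {name.lower() for name, _ in from_304}
    kept = [(name, value) for name, value in stored
            if name.lower() not in replaced]
    return kept + list(from_304)
```

Because headers absent from the 304 response are kept, the merge can update a stored header but never delete one, matching the note in section 13.5.3.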
- - A server MUST use the Vary header field (section 14.43) to inform a - cache of what header field dimensions are used to select among - multiple representations of a cachable response. A cache may use the - selected representation (the entity included with that particular - response) for replying to subsequent requests on that resource only - when the subsequent requests have the same or equivalent values for - all header fields specified in the Vary response-header. Requests - with a different value for one or more of those header fields would - be forwarded toward the origin server. - - If an entity tag was assigned to the representation, the forwarded - request SHOULD be conditional and include the entity tags in an If- - None-Match header field from all its cache entries for the Request- - URI. This conveys to the server the set of entities currently held by - the cache, so that if any one of these entities matches the requested - entity, the server can use the ETag header in its 304 (Not Modified) - response to tell the cache which entry is appropriate. If the - entity-tag of the new response matches that of an existing entry, the - - - -Fielding, et. al. Standards Track [Page 90] - -RFC 2068 HTTP/1.1 January 1997 - - - new response SHOULD be used to update the header fields of the - existing entry, and the result MUST be returned to the client. - - The Vary header field may also inform the cache that the - representation was selected using criteria not limited to the - request-headers; in this case, a cache MUST NOT use the response in a - reply to a subsequent request unless the cache relays the new request - to the origin server in a conditional request and the server responds - with 304 (Not Modified), including an entity tag or Content-Location - that indicates which entity should be used. 
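The Vary rule above can be illustrated with a minimal Python sketch. It assumes that request headers are held in plain dicts with lower-cased field names, that "equivalent values" can be approximated by collapsing whitespace (a simplification), and that a Vary value of "*" (defined in section 14.43) is how a server signals selection criteria not limited to the request-headers. The function and helper names are hypothetical.

```python
def _normalize(value):
    """Collapse whitespace so trivially different values compare equal.

    This is a simplifying assumption; real equivalence rules depend on
    the syntax of each header field.
    """
    if value is None:
        return None
    return " ".join(value.split())


def vary_allows_reuse(vary_value, stored_request, new_request):
    """Return True if a stored response may answer new_request directly.

    vary_value     -- the Vary field value of the stored response
    stored_request -- headers of the request that produced the stored response
    new_request    -- headers of the incoming request
    Both header dicts use lower-cased field names (an assumption).
    """
    names = [n.strip().lower() for n in vary_value.split(",") if n.strip()]
    if "*" in names:
        # Selection was not limited to request headers: the cache has to
        # revalidate with the origin server instead of reusing the entry.
        return False
    for name in names:
        if _normalize(stored_request.get(name)) != _normalize(new_request.get(name)):
            return False
    return True
```

A request for which this returns False would be forwarded toward the origin server, ideally as a conditional request carrying the stored entity tags in If-None-Match.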
- - If any of the existing cache entries contains only partial content - for the associated entity, its entity-tag SHOULD NOT be included in - the If-None-Match header unless the request is for a range that would - be fully satisfied by that entry. - - If a cache receives a successful response whose Content-Location - field matches that of an existing cache entry for the same Request- - URI, whose entity-tag differs from that of the existing entry, and - whose Date is more recent than that of the existing entry, the - existing entry SHOULD NOT be returned in response to future requests, - and should be deleted from the cache. - -13.7 Shared and Non-Shared Caches - - For reasons of security and privacy, it is necessary to make a - distinction between "shared" and "non-shared" caches. A non-shared - cache is one that is accessible only to a single user. Accessibility - in this case SHOULD be enforced by appropriate security mechanisms. - All other caches are considered to be "shared." Other sections of - this specification place certain constraints on the operation of - shared caches in order to prevent loss of privacy or failure of - access controls. - -13.8 Errors or Incomplete Response Cache Behavior - - A cache that receives an incomplete response (for example, with fewer - bytes of data than specified in a Content-Length header) may store - the response. However, the cache MUST treat this as a partial - response. Partial responses may be combined as described in section - 13.5.4; the result might be a full response or might still be - partial. A cache MUST NOT return a partial response to a client - without explicitly marking it as such, using the 206 (Partial - Content) status code. A cache MUST NOT return a partial response - using a status code of 200 (OK). - - If a cache receives a 5xx response while attempting to revalidate an - entry, it may either forward this response to the requesting client, - - - -Fielding, et. al. 
Standards Track [Page 91] - -RFC 2068 HTTP/1.1 January 1997 - - - or act as if the server failed to respond. In the latter case, it MAY - return a previously received response unless the cached entry - includes the "must-revalidate" Cache-Control directive (see section - 14.9). - -13.9 Side Effects of GET and HEAD - - Unless the origin server explicitly prohibits the caching of their - responses, the application of GET and HEAD methods to any resources - SHOULD NOT have side effects that would lead to erroneous behavior if - these responses are taken from a cache. They may still have side - effects, but a cache is not required to consider such side effects in - its caching decisions. Caches are always expected to observe an - origin server's explicit restrictions on caching. - - We note one exception to this rule: since some applications have - traditionally used GETs and HEADs with query URLs (those containing a - "?" in the rel_path part) to perform operations with significant side - effects, caches MUST NOT treat responses to such URLs as fresh unless - the server provides an explicit expiration time. This specifically - means that responses from HTTP/1.0 servers for such URIs should not - be taken from a cache. See section 9.1.1 for related information. - -13.10 Invalidation After Updates or Deletions - - The effect of certain methods at the origin server may cause one or - more existing cache entries to become non-transparently invalid. That - is, although they may continue to be "fresh," they do not accurately - reflect what the origin server would return for a new request. - - There is no way for the HTTP protocol to guarantee that all such - cache entries are marked invalid. For example, the request that - caused the change at the origin server may not have gone through the - proxy where a cache entry is stored. However, several rules help - reduce the likelihood of erroneous behavior. 
- - In this section, the phrase "invalidate an entity" means that the - cache should either remove all instances of that entity from its - storage, or should mark these as "invalid" and in need of a mandatory - revalidation before they can be returned in response to a subsequent - request. - - - - - - - - - - -Fielding, et. al. Standards Track [Page 92] - -RFC 2068 HTTP/1.1 January 1997 - - - Some HTTP methods may invalidate an entity. This is either the entity - referred to by the Request-URI, or by the Location or Content- - Location response-headers (if present). These methods are: - - o PUT - o DELETE - o POST - - In order to prevent denial of service attacks, an invalidation based - on the URI in a Location or Content-Location header MUST only be - performed if the host part is the same as in the Request-URI. - -13.11 Write-Through Mandatory - - All methods that may be expected to cause modifications to the origin - server's resources MUST be written through to the origin server. This - currently includes all methods except for GET and HEAD. A cache MUST - NOT reply to such a request from a client before having transmitted - the request to the inbound server, and having received a - corresponding response from the inbound server. This does not prevent - a cache from sending a 100 (Continue) response before the inbound - server has replied. - - The alternative (known as "write-back" or "copy-back" caching) is not - allowed in HTTP/1.1, due to the difficulty of providing consistent - updates and the problems arising from server, cache, or network - failure prior to write-back. - -13.12 Cache Replacement - - If a new cachable (see sections 14.9.2, 13.2.5, 13.2.6 and 13.8) - response is received from a resource while any existing responses for - the same resource are cached, the cache SHOULD use the new response - to reply to the current request. 
It may insert it into cache storage - and may, if it meets all other requirements, use it to respond to any - future requests that would previously have caused the old response to - be returned. If it inserts the new response into cache storage it - should follow the rules in section 13.5.3. - - Note: a new response that has an older Date header value than - existing cached responses is not cachable. - -13.13 History Lists - - User agents often have history mechanisms, such as "Back" buttons and - history lists, which can be used to redisplay an entity retrieved - earlier in a session. - - - - -Fielding, et. al. Standards Track [Page 93] - -RFC 2068 HTTP/1.1 January 1997 - - - History mechanisms and caches are different. In particular history - mechanisms SHOULD NOT try to show a semantically transparent view of - the current state of a resource. Rather, a history mechanism is meant - to show exactly what the user saw at the time when the resource was - retrieved. - - By default, an expiration time does not apply to history mechanisms. - If the entity is still in storage, a history mechanism should display - it even if the entity has expired, unless the user has specifically - configured the agent to refresh expired history documents. - - This should not be construed to prohibit the history mechanism from - telling the user that a view may be stale. - - Note: if history list mechanisms unnecessarily prevent users from - viewing stale resources, this will tend to force service authors to - avoid using HTTP expiration controls and cache controls when they - would otherwise like to. Service authors may consider it important - that users not be presented with error messages or warning messages - when they use navigation controls (such as BACK) to view previously - fetched resources. 
Even though sometimes such resources ought not - to be cached, or ought to expire quickly, user interface - considerations may force service authors to resort to other means - of preventing caching (e.g. "once-only" URLs) in order not to - suffer the effects of improperly functioning history mechanisms. - -14 Header Field Definitions - - This section defines the syntax and semantics of all standard - HTTP/1.1 header fields. For entity-header fields, both sender and - recipient refer to either the client or the server, depending on who - sends and who receives the entity. - - - - - - - - - - - - - - - - - - - -Fielding, et. al. Standards Track [Page 94] - -RFC 2068 HTTP/1.1 January 1997 - - -14.1 Accept - - The Accept request-header field can be used to specify certain media - types which are acceptable for the response. Accept headers can be - used to indicate that the request is specifically limited to a small - set of desired types, as in the case of a request for an in-line - image. - - Accept = "Accept" ":" - #( media-range [ accept-params ] ) - - media-range = ( "*/*" - | ( type "/" "*" ) - | ( type "/" subtype ) - ) *( ";" parameter ) - - accept-params = ";" "q" "=" qvalue *( accept-extension ) - - accept-extension = ";" token [ "=" ( token | quoted-string ) ] - - The asterisk "*" character is used to group media types into ranges, - with "*/*" indicating all media types and "type/*" indicating all - subtypes of that type. The media-range MAY include media type - parameters that are applicable to that range. - - Each media-range MAY be followed by one or more accept-params, - beginning with the "q" parameter for indicating a relative quality - factor. The first "q" parameter (if any) separates the media-range - parameter(s) from the accept-params. Quality factors allow the user - or user agent to indicate the relative degree of preference for that - media-range, using the qvalue scale from 0 to 1 (section 3.9). The - default value is q=1.
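As an illustration of the grammar above, the following Python sketch splits an Accept field value into media ranges, their media type parameters, and the quality factor. It is a simplified reading, not a full parser: it assumes no quoted-string contains a comma or semicolon, it ignores accept-extension parameters that follow "q", and the function name is hypothetical.

```python
def parse_accept(value):
    """Parse an Accept field value into (media_range, params, q) tuples.

    media_range -- lower-cased "type/subtype", "type/*" or "*/*"
    params      -- dict of media type parameters that precede "q"
    q           -- quality factor as a float, defaulting to 1.0
    """
    result = []
    for element in value.split(","):
        element = element.strip()
        if not element:
            continue
        parts = [p.strip() for p in element.split(";")]
        media_range = parts[0].lower()
        params = {}
        q = 1.0          # default quality factor when no "q" is given
        seen_q = False
        for part in parts[1:]:
            name, _, val = part.partition("=")
            name = name.strip().lower()
            val = val.strip()
            if not seen_q and name == "q":
                # The first "q" separates media type parameters
                # from the accept-params.
                q = float(val)
                seen_q = True
            elif not seen_q:
                params[name] = val
            # Anything after "q" is an accept-extension; ignored here.
        result.append((media_range, params, q))
    return result
```

For example, parse_accept("text/html;level=1, text/*;q=0.3") yields one entry for text/html with the parameter level=1 and the default quality 1.0, and one entry for text/* with quality 0.3.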
- - Note: Use of the "q" parameter name to separate media type - parameters from Accept extension parameters is due to historical - practice. Although this prevents any media type parameter named - "q" from being used with a media range, such an event is believed - to be unlikely given the lack of any "q" parameters in the IANA - media type registry and the rare usage of any media type parameters - in Accept. Future media types should be discouraged from - registering any parameter named "q". - - The example - - Accept: audio/*; q=0.2, audio/basic - - SHOULD be interpreted as "I prefer audio/basic, but send me any audio - type if it is the best available after an 80% mark-down in quality." - - - -Fielding, et. al. Standards Track [Page 95] - -RFC 2068 HTTP/1.1 January 1997 - - - If no Accept header field is present, then it is assumed that the - client accepts all media types. If an Accept header field is present, - and if the server cannot send a response which is acceptable - according to the combined Accept field value, then the server SHOULD - send a 406 (not acceptable) response. - - A more elaborate example is - - Accept: text/plain; q=0.5, text/html, - text/x-dvi; q=0.8, text/x-c - - Verbally, this would be interpreted as "text/html and text/x-c are - the preferred media types, but if they do not exist, then send the - text/x-dvi entity, and if that does not exist, send the text/plain - entity." - - Media ranges can be overridden by more specific media ranges or - specific media types. If more than one media range applies to a given - type, the most specific reference has precedence. For example, - - Accept: text/*, text/html, text/html;level=1, */* - - have the following precedence: - - 1) text/html;level=1 - 2) text/html - 3) text/* - 4) */* - - The media type quality factor associated with a given type is - determined by finding the media range with the highest precedence - which matches that type. 
For example, - - Accept: text/*;q=0.3, text/html;q=0.7, text/html;level=1, - text/html;level=2;q=0.4, */*;q=0.5 - - would cause the following values to be associated: - - text/html;level=1 = 1 - text/html = 0.7 - text/plain = 0.3 - image/jpeg = 0.5 - text/html;level=2 = 0.4 - text/html;level=3 = 0.7 - - Note: A user agent may be provided with a default set of quality - values for certain media ranges. However, unless the user agent is - a closed system which cannot interact with other rendering agents, - - - -Fielding, et. al. Standards Track [Page 96] - -RFC 2068 HTTP/1.1 January 1997 - - - this default set should be configurable by the user. - -14.2 Accept-Charset - - The Accept-Charset request-header field can be used to indicate what - character sets are acceptable for the response. This field allows - clients capable of understanding more comprehensive or special- - purpose character sets to signal that capability to a server which is - capable of representing documents in those character sets. The ISO- - 8859-1 character set can be assumed to be acceptable to all user - agents. - - Accept-Charset = "Accept-Charset" ":" - 1#( charset [ ";" "q" "=" qvalue ] ) - - Character set values are described in section 3.4. Each charset may - be given an associated quality value which represents the user's - preference for that charset. The default value is q=1. An example is - - Accept-Charset: iso-8859-5, unicode-1-1;q=0.8 - - If no Accept-Charset header is present, the default is that any - character set is acceptable. If an Accept-Charset header is present, - and if the server cannot send a response which is acceptable - according to the Accept-Charset header, then the server SHOULD send - an error response with the 406 (not acceptable) status code, though - the sending of an unacceptable response is also allowed. 
- -14.3 Accept-Encoding - - The Accept-Encoding request-header field is similar to Accept, but - restricts the content-coding values (section 14.12) which are - acceptable in the response. - - Accept-Encoding = "Accept-Encoding" ":" - #( content-coding ) - - An example of its use is - - Accept-Encoding: compress, gzip - - If no Accept-Encoding header is present in a request, the server MAY - assume that the client will accept any content coding. If an Accept- - Encoding header is present, and if the server cannot send a response - which is acceptable according to the Accept-Encoding header, then the - server SHOULD send an error response with the 406 (Not Acceptable) - status code. - - - - -Fielding, et. al. Standards Track [Page 97] - -RFC 2068 HTTP/1.1 January 1997 - - - An empty Accept-Encoding value indicates none are acceptable. - -14.4 Accept-Language - - The Accept-Language request-header field is similar to Accept, but - restricts the set of natural languages that are preferred as a - response to the request. - - Accept-Language = "Accept-Language" ":" - 1#( language-range [ ";" "q" "=" qvalue ] ) - - language-range = ( ( 1*8ALPHA *( "-" 1*8ALPHA ) ) | "*" ) - - Each language-range MAY be given an associated quality value which - represents an estimate of the user's preference for the languages - specified by that range. The quality value defaults to "q=1". For - example, - - Accept-Language: da, en-gb;q=0.8, en;q=0.7 - - would mean: "I prefer Danish, but will accept British English and - other types of English." A language-range matches a language-tag if - it exactly equals the tag, or if it exactly equals a prefix of the - tag such that the first tag character following the prefix is "-". - The special range "*", if present in the Accept-Language field, - matches every tag not matched by any other range present in the - Accept-Language field. 
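The language-range matching rule can be sketched in Python as follows. The sketch assumes the Accept-Language field has already been split into (language-range, q) pairs and that tags compare case-insensitively. It also applies two rules stated later in this section: the longest matching range determines the quality factor, and a tag matched by no range gets quality 0. The function names are hypothetical.

```python
def range_matches(language_range, tag):
    """True if the range equals the tag, or is a prefix followed by "-"."""
    r = language_range.lower()
    t = tag.lower()
    return t == r or t.startswith(r + "-")


def language_quality(ranges, tag):
    """Return the quality factor that a list of ranges assigns to a tag.

    ranges -- list of (language_range, q) pairs, e.g. [("da", 1.0)]
    tag    -- a language tag such as "en-gb"
    """
    best = None
    for language_range, q in ranges:
        if language_range == "*":
            continue  # "*" only applies to tags no other range matches
        if range_matches(language_range, tag):
            if best is None or len(language_range) > len(best[0]):
                best = (language_range, q)
    if best is not None:
        return best[1]
    for language_range, q in ranges:
        if language_range == "*":
            return q
    return 0.0
```

With the ranges from the example above, [("da", 1.0), ("en-gb", 0.8), ("en", 0.7)], the tag "en-gb" is assigned 0.8 and "en-us" is assigned 0.7, because only the shorter range "en" is a prefix of "en-us".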
- - Note: This use of a prefix matching rule does not imply that - language tags are assigned to languages in such a way that it is - always true that if a user understands a language with a certain - tag, then this user will also understand all languages with tags - for which this tag is a prefix. The prefix rule simply allows the - use of prefix tags if this is the case. - - The language quality factor assigned to a language-tag by the - Accept-Language field is the quality value of the longest language- - range in the field that matches the language-tag. If no language- - range in the field matches the tag, the language quality factor - assigned is 0. If no Accept-Language header is present in the - request, the server SHOULD assume that all languages are equally - acceptable. If an Accept-Language header is present, then all - languages which are assigned a quality factor greater than 0 are - acceptable. - - It may be contrary to the privacy expectations of the user to send an - Accept-Language header with the complete linguistic preferences of - the user in every request. For a discussion of this issue, see - - - -Fielding, et. al. Standards Track [Page 98] - -RFC 2068 HTTP/1.1 January 1997 - - - section 15.7. - - Note: As intelligibility is highly dependent on the individual - user, it is recommended that client applications make the choice of - linguistic preference available to the user. If the choice is not - made available, then the Accept-Language header field must not be - given in the request. - -14.5 Accept-Ranges - - The Accept-Ranges response-header field allows the server to indicate - its acceptance of range requests for a resource: - - Accept-Ranges = "Accept-Ranges" ":" acceptable-ranges - - acceptable-ranges = 1#range-unit | "none" - - Origin servers that accept byte-range requests MAY send - - Accept-Ranges: bytes - - but are not required to do so. 
Clients MAY generate byte-range - requests without having received this header for the resource - involved. - - Servers that do not accept any kind of range request for a resource - MAY send - - Accept-Ranges: none - - to advise the client not to attempt a range request. - -14.6 Age - - The Age response-header field conveys the sender's estimate of the - amount of time since the response (or its revalidation) was generated - at the origin server. A cached response is "fresh" if its age does - not exceed its freshness lifetime. Age values are calculated as - specified in section 13.2.3. - - Age = "Age" ":" age-value - - age-value = delta-seconds - - Age values are non-negative decimal integers, representing time in - seconds. - - - - - -Fielding, et. al. Standards Track [Page 99] - -RFC 2068 HTTP/1.1 January 1997 - - - If a cache receives a value larger than the largest positive integer - it can represent, or if any of its age calculations overflows, it - MUST transmit an Age header with a value of 2147483648 (2^31). - HTTP/1.1 caches MUST send an Age header in every response. Caches - SHOULD use an arithmetic type of at least 31 bits of range. - -14.7 Allow - - The Allow entity-header field lists the set of methods supported by - the resource identified by the Request-URI. The purpose of this field - is strictly to inform the recipient of valid methods associated with - the resource. An Allow header field MUST be present in a 405 (Method - Not Allowed) response. - - Allow = "Allow" ":" 1#method - - Example of use: - - Allow: GET, HEAD, PUT - - This field cannot prevent a client from trying other methods. - However, the indications given by the Allow header field value SHOULD - be followed. The actual set of allowed methods is defined by the - origin server at the time of each request. - - The Allow header field MAY be provided with a PUT request to - recommend the methods to be supported by the new or modified - resource. 
The server is not required to support these methods and - SHOULD include an Allow header in the response giving the actual - supported methods. - - A proxy MUST NOT modify the Allow header field even if it does not - understand all the methods specified, since the user agent MAY have - other means of communicating with the origin server. - - The Allow header field does not indicate what methods are implemented - at the server level. Servers MAY use the Public response-header field - (section 14.35) to describe what methods are implemented on the - server as a whole. - -14.8 Authorization - - A user agent that wishes to authenticate itself with a server-- - usually, but not necessarily, after receiving a 401 response--MAY do - so by including an Authorization request-header field with the - request. The Authorization field value consists of credentials - containing the authentication information of the user agent for the - realm of the resource being requested. - - - -Fielding, et. al. Standards Track [Page 100] - -RFC 2068 HTTP/1.1 January 1997 - - - Authorization = "Authorization" ":" credentials - - HTTP access authentication is described in section 11. If a request - is authenticated and a realm specified, the same credentials SHOULD - be valid for all other requests within this realm. - - When a shared cache (see section 13.7) receives a request containing - an Authorization field, it MUST NOT return the corresponding response - as a reply to any other request, unless one of the following specific - exceptions holds: - - 1. If the response includes the "proxy-revalidate" Cache-Control - directive, the cache MAY use that response in replying to a - subsequent request, but a proxy cache MUST first revalidate it with - the origin server, using the request-headers from the new request - to allow the origin server to authenticate the new request. - 2. 
If the response includes the "must-revalidate" Cache-Control - directive, the cache MAY use that response in replying to a - subsequent request, but all caches MUST first revalidate it with - the origin server, using the request-headers from the new request - to allow the origin server to authenticate the new request. - 3. If the response includes the "public" Cache-Control directive, it - may be returned in reply to any subsequent request. - -14.9 Cache-Control - - The Cache-Control general-header field is used to specify directives - that MUST be obeyed by all caching mechanisms along the - request/response chain. The directives specify behavior intended to - prevent caches from adversely interfering with the request or - response. These directives typically override the default caching - algorithms. Cache directives are unidirectional in that the presence - of a directive in a request does not imply that the same directive - should be given in the response. - - Note that HTTP/1.0 caches may not implement Cache-Control and may - only implement Pragma: no-cache (see section 14.32). - - Cache directives must be passed through by a proxy or gateway - application, regardless of their significance to that application, - since the directives may be applicable to all recipients along the - request/response chain. It is not possible to specify a cache- - directive for a specific cache. - - Cache-Control = "Cache-Control" ":" 1#cache-directive - - cache-directive = cache-request-directive - | cache-response-directive - - - -Fielding, et. al. 
Standards Track [Page 101] - -RFC 2068 HTTP/1.1 January 1997 - - - cache-request-directive = - "no-cache" [ "=" <"> 1#field-name <"> ] - | "no-store" - | "max-age" "=" delta-seconds - | "max-stale" [ "=" delta-seconds ] - | "min-fresh" "=" delta-seconds - | "only-if-cached" - | cache-extension - - cache-response-directive = - "public" - | "private" [ "=" <"> 1#field-name <"> ] - | "no-cache" [ "=" <"> 1#field-name <"> ] - | "no-store" - | "no-transform" - | "must-revalidate" - | "proxy-revalidate" - | "max-age" "=" delta-seconds - | cache-extension - - cache-extension = token [ "=" ( token | quoted-string ) ] - - When a directive appears without any 1#field-name parameter, the - directive applies to the entire request or response. When such a - directive appears with a 1#field-name parameter, it applies only to - the named field or fields, and not to the rest of the request or - response. This mechanism supports extensibility; implementations of - future versions of the HTTP protocol may apply these directives to - header fields not defined in HTTP/1.1. - - The cache-control directives can be broken down into these general - categories: - - o Restrictions on what is cachable; these may only be imposed by the - origin server. - o Restrictions on what may be stored by a cache; these may be imposed - by either the origin server or the user agent. - o Modifications of the basic expiration mechanism; these may be - imposed by either the origin server or the user agent. - o Controls over cache revalidation and reload; these may only be - imposed by a user agent. - o Control over transformation of entities. - o Extensions to the caching system. - - - - - - - - -Fielding, et. al. Standards Track [Page 102] - -RFC 2068 HTTP/1.1 January 1997 - - -14.9.1 What is Cachable - - By default, a response is cachable if the requirements of the request - method, request header fields, and the response status indicate that - it is cachable. 
Section 13.4 summarizes these defaults for - cachability. The following Cache-Control response directives allow an - origin server to override the default cachability of a response: - -public - Indicates that the response is cachable by any cache, even if it - would normally be non-cachable or cachable only within a non-shared - cache. (See also Authorization, section 14.8, for additional - details.) - -private - Indicates that all or part of the response message is intended for a - single user and MUST NOT be cached by a shared cache. This allows an - origin server to state that the specified parts of the response are - intended for only one user and are not a valid response for requests - by other users. A private (non-shared) cache may cache the response. - - Note: This usage of the word private only controls where the - response may be cached, and cannot ensure the privacy of the - message content. - -no-cache - Indicates that all or part of the response message MUST NOT be cached - anywhere. This allows an origin server to prevent caching even by - caches that have been configured to return stale responses to client - requests. - - Note: Most HTTP/1.0 caches will not recognize or obey this - directive. - -14.9.2 What May be Stored by Caches - - The purpose of the no-store directive is to prevent the inadvertent - release or retention of sensitive information (for example, on backup - tapes). The no-store directive applies to the entire message, and may - be sent either in a response or in a request. If sent in a request, a - cache MUST NOT store any part of either this request or any response - to it. If sent in a response, a cache MUST NOT store any part of - either this response or the request that elicited it. This directive - applies to both non-shared and shared caches. 
"MUST NOT store" in - this context means that the cache MUST NOT intentionally store the - information in non-volatile storage, and MUST make a best-effort - attempt to remove the information from volatile storage as promptly - as possible after forwarding it. - - - -Fielding, et. al. Standards Track [Page 103] - -RFC 2068 HTTP/1.1 January 1997 - - - Even when this directive is associated with a response, users may - explicitly store such a response outside of the caching system (e.g., - with a "Save As" dialog). History buffers may store such responses as - part of their normal operation. - - The purpose of this directive is to meet the stated requirements of - certain users and service authors who are concerned about accidental - releases of information via unanticipated accesses to cache data - structures. While the use of this directive may improve privacy in - some cases, we caution that it is NOT in any way a reliable or - sufficient mechanism for ensuring privacy. In particular, malicious - or compromised caches may not recognize or obey this directive; and - communications networks may be vulnerable to eavesdropping. - -14.9.3 Modifications of the Basic Expiration Mechanism - - The expiration time of an entity may be specified by the origin - server using the Expires header (see section 14.21). Alternatively, - it may be specified using the max-age directive in a response. - - If a response includes both an Expires header and a max-age - directive, the max-age directive overrides the Expires header, even - if the Expires header is more restrictive. This rule allows an origin - server to provide, for a given response, a longer expiration time to - an HTTP/1.1 (or later) cache than to an HTTP/1.0 cache. This may be - useful if certain HTTP/1.0 caches improperly calculate ages or - expiration times, perhaps due to desynchronized clocks. - - Note: most older caches, not compliant with this specification, do - not implement any Cache-Control directives. 
An origin server - wishing to use a Cache-Control directive that restricts, but does - not prevent, caching by an HTTP/1.1-compliant cache may exploit the - requirement that the max-age directive overrides the Expires - header, and the fact that non-HTTP/1.1-compliant caches do not - observe the max-age directive. - - Other directives allow a user agent to modify the basic expiration - mechanism. These directives may be specified on a request: - - max-age - Indicates that the client is willing to accept a response whose age - is no greater than the specified time in seconds. Unless the max-stale - directive is also included, the client is not willing to accept a - stale response. - - min-fresh - Indicates that the client is willing to accept a response whose - freshness lifetime is no less than its current age plus the - - - -Fielding, et. al. Standards Track [Page 104] - -RFC 2068 HTTP/1.1 January 1997 - - - specified time in seconds. That is, the client wants a response - that will still be fresh for at least the specified number of - seconds. - - max-stale - Indicates that the client is willing to accept a response that has - exceeded its expiration time. If max-stale is assigned a value, - then the client is willing to accept a response that has exceeded - its expiration time by no more than the specified number of - seconds. If no value is assigned to max-stale, then the client is - willing to accept a stale response of any age. - - If a cache returns a stale response, either because of a max-stale - directive on a request, or because the cache is configured to - override the expiration time of a response, the cache MUST attach a - Warning header to the stale response, using Warning 10 (Response is - stale).
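The three request directives can be combined into one check, sketched below in Python. The sketch takes max-age literally as an upper bound on the age of the response. Folding min-fresh and max-stale into a single staleness comparison is an assumption of this sketch, since the text does not spell out how the directives interact. The directive dict and the function name are hypothetical; a max-stale entry with value None stands for max-stale without a value.

```python
def usable_for_request(age, freshness_lifetime, directives):
    """Decide whether a stored response satisfies a request.

    age                -- current age of the stored response, in seconds
    freshness_lifetime -- lifetime derived from Expires or max-age, in seconds
    directives         -- parsed request directives, e.g. {"max-stale": 30}

    Returns (usable, needs_warning_10). needs_warning_10 is True when the
    response is usable but stale, so Warning 10 has to be attached.
    """
    max_age = directives.get("max-age")
    if max_age is not None and age > max_age:
        return (False, False)

    min_fresh = directives.get("min-fresh", 0)

    if "max-stale" in directives:
        limit = directives["max-stale"]
        # No value means a stale response of any age is acceptable.
        allowed_staleness = float("inf") if limit is None else limit
    else:
        allowed_staleness = 0

    # Positive when the response is not fresh for min_fresh more seconds.
    shortfall = age + min_fresh - freshness_lifetime
    if shortfall > allowed_staleness:
        return (False, False)

    return (True, age > freshness_lifetime)
```

A response with age 120 and freshness lifetime 100 is rejected for a request without directives, but accepted with a Warning 10 for a request carrying max-stale=30.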
- -14.9.4 Cache Revalidation and Reload Controls - - Sometimes a user agent may want or need to insist that a cache - revalidate its cache entry with the origin server (and not just with - the next cache along the path to the origin server), or to reload its - cache entry from the origin server. End-to-end revalidation may be - necessary if either the cache or the origin server has overestimated - the expiration time of the cached response. End-to-end reload may be - necessary if the cache entry has become corrupted for some reason. - - End-to-end revalidation may be requested either when the client does - not have its own local cached copy, in which case we call it - "unspecified end-to-end revalidation", or when the client does have a - local cached copy, in which case we call it "specific end-to-end - revalidation." - - The client can specify these three kinds of action using Cache- - Control request directives: - - End-to-end reload - The request includes a "no-cache" Cache-Control directive or, for - compatibility with HTTP/1.0 clients, "Pragma: no-cache". No field - names may be included with the no-cache directive in a request. The - server MUST NOT use a cached copy when responding to such a - request. - - Specific end-to-end revalidation - The request includes a "max-age=0" Cache-Control directive, which - forces each cache along the path to the origin server to revalidate - its own entry, if any, with the next cache or server. The initial - - - -Fielding, et. al. Standards Track [Page 105] - -RFC 2068 HTTP/1.1 January 1997 - - - request includes a cache-validating conditional with the client's - current validator. - - Unspecified end-to-end revalidation - The request includes a "max-age=0" Cache-Control directive, which - forces each cache along the path to the origin server to revalidate - its own entry, if any, with the next cache or server.
The initial - request does not include a cache-validating conditional; the first - cache along the path (if any) that holds a cache entry for this - resource includes a cache-validating conditional with its current - validator. - - When an intermediate cache is forced, by means of a max-age=0 - directive, to revalidate its own cache entry, and the client has - supplied its own validator in the request, the supplied validator may - differ from the validator currently stored with the cache entry. In - this case, the cache may use either validator in making its own - request without affecting semantic transparency. - - However, the choice of validator may affect performance. The best - approach is for the intermediate cache to use its own validator when - making its request. If the server replies with 304 (Not Modified), - then the cache should return its now validated copy to the client - with a 200 (OK) response. If the server replies with a new entity and - cache validator, however, the intermediate cache should compare the - returned validator with the one provided in the client's request, - using the strong comparison function. If the client's validator is - equal to the origin server's, then the intermediate cache simply - returns 304 (Not Modified). Otherwise, it returns the new entity with - a 200 (OK) response. - - If a request includes the no-cache directive, it should not include - min-fresh, max-stale, or max-age. - - In some cases, such as times of extremely poor network connectivity, - a client may want a cache to return only those responses that it - currently has stored, and not to reload or revalidate with the origin - server. To do this, the client may include the only-if-cached - directive in a request. If it receives this directive, a cache SHOULD - either respond using a cached entry that is consistent with the other - constraints of the request, or respond with a 504 (Gateway Timeout) - status. 
However, if a group of caches is being operated as a unified - system with good internal connectivity, such a request MAY be - forwarded within that group of caches. - - Because a cache may be configured to ignore a server's specified - expiration time, and because a client request may include a max-stale - directive (which has a similar effect), the protocol also includes a - - - -Fielding, et. al. Standards Track [Page 106] - -RFC 2068 HTTP/1.1 January 1997 - - - mechanism for the origin server to require revalidation of a cache - entry on any subsequent use. When the must-revalidate directive is - present in a response received by a cache, that cache MUST NOT use - the entry after it becomes stale to respond to a subsequent request - without first revalidating it with the origin server. (I.e., the - cache must do an end-to-end revalidation every time, if, based solely - on the origin server's Expires or max-age value, the cached response - is stale.) - - The must-revalidate directive is necessary to support reliable - operation for certain protocol features. In all circumstances an - HTTP/1.1 cache MUST obey the must-revalidate directive; in - particular, if the cache cannot reach the origin server for any - reason, it MUST generate a 504 (Gateway Timeout) response. - - Servers should send the must-revalidate directive if and only if - failure to revalidate a request on the entity could result in - incorrect operation, such as a silently unexecuted financial - transaction. Recipients MUST NOT take any automated action that - violates this directive, and MUST NOT automatically provide an - unvalidated copy of the entity if revalidation fails. - - Although this is not recommended, user agents operating under severe - connectivity constraints may violate this directive but, if so, MUST - explicitly warn the user that an unvalidated response has been - provided. The warning MUST be provided on each unvalidated access, - and SHOULD require explicit user confirmation. 
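As an illustration of how must-revalidate constrains a cache, here is a minimal sketch. It rests on several assumptions that are not part of the specification: the function name serve, the allow_stale flag (standing for either a max-stale request directive or a cache configured to override expiration), and the revalidate callable, which is assumed to return True when the origin server confirms the entry and to raise ConnectionError when the origin server cannot be reached.

```python
from typing import Callable, Tuple


def serve(stale: bool, must_revalidate: bool, allow_stale: bool,
          revalidate: Callable[[], bool]) -> Tuple[int, str]:
    """Return (status_code, source) for a request answered from a cache entry."""
    if not stale:
        # A fresh entry can be used directly.
        return (200, "cached")
    if allow_stale and not must_revalidate:
        # Stale use is tolerated only when the response did not carry must-revalidate.
        return (200, "stale")
    try:
        unchanged = revalidate()
    except ConnectionError:
        # Origin unreachable: with must-revalidate the cache has to answer 504.
        # This sketch also answers 504 without the directive, which is a
        # simplification rather than a requirement.
        return (504, "none")
    return (200, "revalidated" if unchanged else "origin")


def unreachable() -> bool:
    """Stand-in for a revalidation attempt that cannot reach the origin server."""
    raise ConnectionError("origin server unreachable")
```

For example, serve(True, True, True, unreachable) yields a 504 result even though stale use would otherwise have been allowed, because the directive removes that shortcut.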
- - The proxy-revalidate directive has the same meaning as the must- - revalidate directive, except that it does not apply to non-shared - user agent caches. It can be used on a response to an authenticated - request to permit the user's cache to store and later return the - response without needing to revalidate it (since it has already been - authenticated once by that user), while still requiring proxies that - service many users to revalidate each time (in order to make sure - that each user has been authenticated). Note that such authenticated - responses also need the public cache control directive in order to - allow them to be cached at all. - -14.9.5 No-Transform Directive - - Implementers of intermediate caches (proxies) have found it useful to - convert the media type of certain entity bodies. A proxy might, for - example, convert between image formats in order to save cache space - or to reduce the amount of traffic on a slow link. HTTP has to date - been silent on these transformations. - - - - - -Fielding, et. al. Standards Track [Page 107] - -RFC 2068 HTTP/1.1 January 1997 - - - Serious operational problems have already occurred, however, when - these transformations have been applied to entity bodies intended for - certain kinds of applications. For example, applications for medical - imaging, scientific data analysis and those using end-to-end - authentication, all depend on receiving an entity body that is bit - for bit identical to the original entity-body. - - Therefore, if a response includes the no-transform directive, an - intermediate cache or proxy MUST NOT change those headers that are - listed in section 13.5.2 as being subject to the no-transform - directive. This implies that the cache or proxy must not change any - aspect of the entity-body that is specified by these headers. 
- -14.9.6 Cache Control Extensions - - The Cache-Control header field can be extended through the use of one - or more cache-extension tokens, each with an optional assigned value. - Informational extensions (those which do not require a change in - cache behavior) may be added without changing the semantics of other - directives. Behavioral extensions are designed to work by acting as - modifiers to the existing base of cache directives. Both the new - directive and the standard directive are supplied, such that - applications which do not understand the new directive will default - to the behavior specified by the standard directive, and those that - understand the new directive will recognize it as modifying the - requirements associated with the standard directive. In this way, - extensions to the Cache-Control directives can be made without - requiring changes to the base protocol. - - This extension mechanism depends on a HTTP cache obeying all of the - cache-control directives defined for its native HTTP-version, obeying - certain extensions, and ignoring all directives that it does not - understand. - - For example, consider a hypothetical new response directive called - "community" which acts as a modifier to the "private" directive. We - define this new directive to mean that, in addition to any non-shared - cache, any cache which is shared only by members of the community - named within its value may cache the response. An origin server - wishing to allow the "UCI" community to use an otherwise private - response in their shared cache(s) may do so by including - - Cache-Control: private, community="UCI" - - A cache seeing this header field will act correctly even if the cache - does not understand the "community" cache-extension, since it will - also see and understand the "private" directive and thus default to - the safe behavior. - - - -Fielding, et. al. 
Standards Track [Page 108] - -RFC 2068 HTTP/1.1 January 1997 - - - Unrecognized cache-directives MUST be ignored; it is assumed that any - cache-directive likely to be unrecognized by an HTTP/1.1 cache will - be combined with standard directives (or the response's default - cachability) such that the cache behavior will remain minimally - correct even if the cache does not understand the extension(s). - -14.10 Connection - - The Connection general-header field allows the sender to specify - options that are desired for that particular connection and MUST NOT - be communicated by proxies over further connections. - - The Connection header has the following grammar: - - Connection-header = "Connection" ":" 1#(connection-token) - connection-token = token - - HTTP/1.1 proxies MUST parse the Connection header field before a - message is forwarded and, for each connection-token in this field, - remove any header field(s) from the message with the same name as the - connection-token. Connection options are signaled by the presence of - a connection-token in the Connection header field, not by any - corresponding additional header field(s), since the additional header - field may not be sent if there are no parameters associated with that - connection option. HTTP/1.1 defines the "close" connection option - for the sender to signal that the connection will be closed after - completion of the response. For example, - - Connection: close - - in either the request or the response header fields indicates that - the connection should not be considered `persistent' (section 8.1) - after the current request/response is complete. - - HTTP/1.1 applications that do not support persistent connections MUST - include the "close" connection option in every message. - -14.11 Content-Base - - The Content-Base entity-header field may be used to specify the base - URI for resolving relative URLs within the entity. 
This header field - is described as Base in RFC 1808, which is expected to be revised. - - Content-Base = "Content-Base" ":" absoluteURI - - If no Content-Base field is present, the base URI of an entity is - defined either by its Content-Location (if that Content-Location URI - is an absolute URI) or the URI used to initiate the request, in that - - - -Fielding, et. al. Standards Track [Page 109] - -RFC 2068 HTTP/1.1 January 1997 - - - order of precedence. Note, however, that the base URI of the contents - within the entity-body may be redefined within that entity-body. - -14.12 Content-Encoding - - The Content-Encoding entity-header field is used as a modifier to the - media-type. When present, its value indicates what additional content - codings have been applied to the entity-body, and thus what decoding - mechanisms MUST be applied in order to obtain the media-type - referenced by the Content-Type header field. Content-Encoding is - primarily used to allow a document to be compressed without losing - the identity of its underlying media type. - - Content-Encoding = "Content-Encoding" ":" 1#content-coding - - Content codings are defined in section 3.5. An example of its use is - - Content-Encoding: gzip - - The Content-Encoding is a characteristic of the entity identified by - the Request-URI. Typically, the entity-body is stored with this - encoding and is only decoded before rendering or analogous usage. - - If multiple encodings have been applied to an entity, the content - codings MUST be listed in the order in which they were applied. - - Additional information about the encoding parameters MAY be provided - by other entity-header fields not defined by this specification. - -14.13 Content-Language - - The Content-Language entity-header field describes the natural - language(s) of the intended audience for the enclosed entity. Note - that this may not be equivalent to all the languages used within the - entity-body. 
- - Content-Language = "Content-Language" ":" 1#language-tag - - Language tags are defined in section 3.10. The primary purpose of - Content-Language is to allow a user to identify and differentiate - entities according to the user's own preferred language. Thus, if the - body content is intended only for a Danish-literate audience, the - appropriate field is - - Content-Language: da - - If no Content-Language is specified, the default is that the content - is intended for all language audiences. This may mean that the sender - - - -Fielding, et. al. Standards Track [Page 110] - -RFC 2068 HTTP/1.1 January 1997 - - - does not consider it to be specific to any natural language, or that - the sender does not know for which language it is intended. - - Multiple languages MAY be listed for content that is intended for - multiple audiences. For example, a rendition of the "Treaty of - Waitangi," presented simultaneously in the original Maori and English - versions, would call for - - Content-Language: mi, en - - However, just because multiple languages are present within an entity - does not mean that it is intended for multiple linguistic audiences. - An example would be a beginner's language primer, such as "A First - Lesson in Latin," which is clearly intended to be used by an - English-literate audience. In this case, the Content-Language should - only include "en". - - Content-Language may be applied to any media type -- it is not - limited to textual documents. - -14.14 Content-Length - - The Content-Length entity-header field indicates the size of the - message-body, in decimal number of octets, sent to the recipient or, - in the case of the HEAD method, the size of the entity-body that - would have been sent had the request been a GET. 
- - Content-Length = "Content-Length" ":" 1*DIGIT - - An example is - - Content-Length: 3495 - - Applications SHOULD use this field to indicate the size of the - message-body to be transferred, regardless of the media type of the - entity. It must be possible for the recipient to reliably determine - the end of HTTP/1.1 requests containing an entity-body, e.g., because - the request has a valid Content-Length field, uses Transfer-Encoding: - chunked or a multipart body. - - Any Content-Length greater than or equal to zero is a valid value. - Section 4.4 describes how to determine the length of a message-body - if a Content-Length is not given. - - - - - - - - -Fielding, et. al. Standards Track [Page 111] - -RFC 2068 HTTP/1.1 January 1997 - - - Note: The meaning of this field is significantly different from the - corresponding definition in MIME, where it is an optional field - used within the "message/external-body" content-type. In HTTP, it - SHOULD be sent whenever the message's length can be determined - prior to being transferred. - -14.15 Content-Location - - The Content-Location entity-header field may be used to supply the - resource location for the entity enclosed in the message. In the case - where a resource has multiple entities associated with it, and those - entities actually have separate locations by which they might be - individually accessed, the server should provide a Content-Location - for the particular variant which is returned. In addition, a server - SHOULD provide a Content-Location for the resource corresponding to - the response entity. - - Content-Location = "Content-Location" ":" - ( absoluteURI | relativeURI ) - - If no Content-Base header field is present, the value of Content- - Location also defines the base URL for the entity (see section - 14.11). 
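The base-URI precedence referred to here (Content-Base first, then an absolute Content-Location, then the URI used for the request, per section 14.11) might be sketched as follows. The function names base_uri and resolve and the example.org addresses are hypothetical; urljoin and urlsplit are the standard Python urllib.parse helpers, and their reference resolution is assumed to be close enough for illustration.

```python
from typing import Optional
from urllib.parse import urljoin, urlsplit


def base_uri(request_uri: str,
             content_base: Optional[str] = None,
             content_location: Optional[str] = None) -> str:
    """Pick the base URI of an entity in the order of precedence of section 14.11."""
    if content_base:
        # Content-Base, when present, always wins.
        return content_base
    if content_location and urlsplit(content_location).scheme:
        # Content-Location counts only when it is an absolute URI.
        return content_location
    # Otherwise fall back to the URI used to initiate the request.
    return request_uri


def resolve(reference: str, request_uri: str,
            content_base: Optional[str] = None,
            content_location: Optional[str] = None) -> str:
    """Resolve a relative reference found in the entity against its base URI."""
    return urljoin(base_uri(request_uri, content_base, content_location), reference)
```

A relative Content-Location is deliberately ignored by base_uri in this sketch, since only an absolute one can define the base.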
- - The Content-Location value is not a replacement for the original - requested URI; it is only a statement of the location of the resource - corresponding to this particular entity at the time of the request. - Future requests MAY use the Content-Location URI if the desire is to - identify the source of that particular entity. - - A cache cannot assume that an entity with a Content-Location - different from the URI used to retrieve it can be used to respond to - later requests on that Content-Location URI. However, the Content- - Location can be used to differentiate between multiple entities - retrieved from a single requested resource, as described in section - 13.6. - - If the Content-Location is a relative URI, the URI is interpreted - relative to any Content-Base URI provided in the response. If no - Content-Base is provided, the relative URI is interpreted relative to - the Request-URI. - - - - - - - - - - -Fielding, et. al. Standards Track [Page 112] - -RFC 2068 HTTP/1.1 January 1997 - - -14.16 Content-MD5 - - The Content-MD5 entity-header field, as defined in RFC 1864 [23], is - an MD5 digest of the entity-body for the purpose of providing an - end-to-end message integrity check (MIC) of the entity-body. (Note: a - MIC is good for detecting accidental modification of the entity-body - in transit, but is not proof against malicious attacks.) - - Content-MD5 = "Content-MD5" ":" md5-digest - - md5-digest = <base64 of 128 bit MD5 digest as per RFC 1864> - - The Content-MD5 header field may be generated by an origin server to - function as an integrity check of the entity-body. Only origin - servers may generate the Content-MD5 header field; proxies and - gateways MUST NOT generate it, as this would defeat its value as an - end-to-end integrity check. Any recipient of the entity-body, - including gateways and proxies, MAY check that the digest value in - this header field matches that of the entity-body as received. 
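As a concrete illustration of the md5-digest grammar, the header value is the base64 encoding of the 16-octet MD5 digest of the entity-body. The sketch below uses Python's standard hashlib and base64 modules; the function names content_md5 and digest_matches are hypothetical, and the caller is assumed to pass the entity-body octets exactly as received.

```python
import base64
import hashlib


def content_md5(entity_body: bytes) -> str:
    """Return the Content-MD5 field value for the given entity-body octets."""
    digest = hashlib.md5(entity_body).digest()  # 128-bit (16-octet) digest
    return base64.b64encode(digest).decode("ascii")


def digest_matches(entity_body: bytes, header_value: str) -> bool:
    """Check a received Content-MD5 value against the entity-body as received."""
    return content_md5(entity_body) == header_value.strip()
```

Any recipient, including a proxy or gateway, could run a check like digest_matches; a mismatch indicates accidental modification in transit, and a match is not evidence against deliberate tampering.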
- - The MD5 digest is computed based on the content of the entity-body, - including any Content-Encoding that has been applied, but not - including any Transfer-Encoding that may have been applied to the - message-body. If the message is received with a Transfer-Encoding, - that encoding must be removed prior to checking the Content-MD5 value - against the received entity. - - This has the result that the digest is computed on the octets of the - entity-body exactly as, and in the order that, they would be sent if - no Transfer-Encoding were being applied. - - HTTP extends RFC 1864 to permit the digest to be computed for MIME - composite media-types (e.g., multipart/* and message/rfc822), but - this does not change how the digest is computed as defined in the - preceding paragraph. - - Note: There are several consequences of this. The entity-body for - composite types may contain many body-parts, each with its own MIME - and HTTP headers (including Content-MD5, Content-Transfer-Encoding, - and Content-Encoding headers). If a body-part has a Content- - Transfer-Encoding or Content-Encoding header, it is assumed that - the content of the body-part has had the encoding applied, and the - body-part is included in the Content-MD5 digest as is -- i.e., - after the application. The Transfer-Encoding header field is not - allowed within body-parts. - - Note: while the definition of Content-MD5 is exactly the same for - HTTP as in RFC 1864 for MIME entity-bodies, there are several ways - - - -Fielding, et. al. Standards Track [Page 113] - -RFC 2068 HTTP/1.1 January 1997 - - - in which the application of Content-MD5 to HTTP entity-bodies - differs from its application to MIME entity-bodies. One is that - HTTP, unlike MIME, does not use Content-Transfer-Encoding, and does - use Transfer-Encoding and Content-Encoding. 
Another is that HTTP - more frequently uses binary content types than MIME, so it is worth - noting that, in such cases, the byte order used to compute the - digest is the transmission byte order defined for the type. Lastly, - HTTP allows transmission of text types with any of several line - break conventions and not just the canonical form using CRLF. - Conversion of all line breaks to CRLF should not be done before - computing or checking the digest: the line break convention used in - the text actually transmitted should be left unaltered when - computing the digest. - -14.17 Content-Range - - The Content-Range entity-header is sent with a partial entity-body to - specify where in the full entity-body the partial body should be - inserted. It also indicates the total size of the full entity-body. - When a server returns a partial response to a client, it must - describe both the extent of the range covered by the response, and - the length of the entire entity-body. - - Content-Range = "Content-Range" ":" content-range-spec - - content-range-spec = byte-content-range-spec - - byte-content-range-spec = bytes-unit SP first-byte-pos "-" - last-byte-pos "/" entity-length - - entity-length = 1*DIGIT - - Unlike byte-ranges-specifier values, a byte-content-range-spec may - only specify one range, and must contain absolute byte positions for - both the first and last byte of the range. - - A byte-content-range-spec whose last-byte-pos value is less than its - first-byte-pos value, or whose entity-length value is less than or - equal to its last-byte-pos value, is invalid. The recipient of an - invalid byte-content-range-spec MUST ignore it and any content - transferred along with it. - - - - - - - - - - -Fielding, et. al. 
Standards Track [Page 114] - -RFC 2068 HTTP/1.1 January 1997 - - - Examples of byte-content-range-spec values, assuming that the entity - contains a total of 1234 bytes: - - o The first 500 bytes: - - bytes 0-499/1234 - - o The second 500 bytes: - - bytes 500-999/1234 - - o All except for the first 500 bytes: - - bytes 500-1233/1234 - - o The last 500 bytes: - - bytes 734-1233/1234 - - When an HTTP message includes the content of a single range (for - example, a response to a request for a single range, or to a request - for a set of ranges that overlap without any holes), this content is - transmitted with a Content-Range header, and a Content-Length header - showing the number of bytes actually transferred. For example, - - HTTP/1.1 206 Partial content - Date: Wed, 15 Nov 1995 06:25:24 GMT - Last-modified: Wed, 15 Nov 1995 04:58:08 GMT - Content-Range: bytes 21010-47021/47022 - Content-Length: 26012 - Content-Type: image/gif - - When an HTTP message includes the content of multiple ranges (for - example, a response to a request for multiple non-overlapping - ranges), these are transmitted as a multipart MIME message. The - multipart MIME content-type used for this purpose is defined in this - specification to be "multipart/byteranges". See appendix 19.2 for its - definition. - - A client that cannot decode a MIME multipart/byteranges message - should not ask for multiple byte-ranges in a single request. - - When a client requests multiple byte-ranges in one request, the - server SHOULD return them in the order that they appeared in the - request. - - If the server ignores a byte-range-spec because it is invalid, the - server should treat the request as if the invalid Range header field - - - -Fielding, et. al. Standards Track [Page 115] - -RFC 2068 HTTP/1.1 January 1997 - - - did not exist. (Normally, this means return a 200 response containing - the full entity). 
The reason is that the only time a client will make - such an invalid request is when the entity is smaller than the entity - retrieved by a prior request. - -14.18 Content-Type - - The Content-Type entity-header field indicates the media type of the - entity-body sent to the recipient or, in the case of the HEAD method, - the media type that would have been sent had the request been a GET. - - Content-Type = "Content-Type" ":" media-type - Media types are defined in section 3.7. An example of the field is - - Content-Type: text/html; charset=ISO-8859-4 - - Further discussion of methods for identifying the media type of an - entity is provided in section 7.2.1. - -14.19 Date - - The Date general-header field represents the date and time at which - the message was originated, having the same semantics as orig-date in - RFC 822. The field value is an HTTP-date, as described in section - 3.3.1. - - Date = "Date" ":" HTTP-date - - An example is - - Date: Tue, 15 Nov 1994 08:12:31 GMT - - If a message is received via direct connection with the user agent - (in the case of requests) or the origin server (in the case of - responses), then the date can be assumed to be the current date at - the receiving end. However, since the date--as it is believed by the - origin--is important for evaluating cached responses, origin servers - MUST include a Date header field in all responses. Clients SHOULD - only send a Date header field in messages that include an entity- - body, as in the case of the PUT and POST requests, and even then it - is optional. A received message which does not have a Date header - field SHOULD be assigned one by the recipient if the message will be - cached by that recipient or gatewayed via a protocol which requires a - Date. - - - - - - - -Fielding, et. al. Standards Track [Page 116] - -RFC 2068 HTTP/1.1 January 1997 - - - In theory, the date SHOULD represent the moment just before the - entity is generated. 
In practice, the date can be generated at any - time during the message origination without affecting its semantic - value. - - The format of the Date is an absolute date and time as defined by - HTTP-date in section 3.3; it MUST be sent in RFC1123 [8]-date format. - -14.20 ETag - - The ETag entity-header field defines the entity tag for the - associated entity. The headers used with entity tags are described in - sections 14.20, 14.25, 14.26 and 14.43. The entity tag may be used - for comparison with other entities from the same resource (see - section 13.3.2). - - ETag = "ETag" ":" entity-tag - - Examples: - - ETag: "xyzzy" - ETag: W/"xyzzy" - ETag: "" - -14.21 Expires - - The Expires entity-header field gives the date/time after which the - response should be considered stale. A stale cache entry may not - normally be returned by a cache (either a proxy cache or an user - agent cache) unless it is first validated with the origin server (or - with an intermediate cache that has a fresh copy of the entity). See - section 13.2 for further discussion of the expiration model. - - The presence of an Expires field does not imply that the original - resource will change or cease to exist at, before, or after that - time. - - The format is an absolute date and time as defined by HTTP-date in - section 3.3; it MUST be in RFC1123-date format: - - Expires = "Expires" ":" HTTP-date - - - - - - - - - - -Fielding, et. al. Standards Track [Page 117] - -RFC 2068 HTTP/1.1 January 1997 - - - An example of its use is - - Expires: Thu, 01 Dec 1994 16:00:00 GMT - - Note: if a response includes a Cache-Control field with the max-age - directive, that directive overrides the Expires field. - - HTTP/1.1 clients and caches MUST treat other invalid date formats, - especially including the value "0", as in the past (i.e., "already - expired"). - - To mark a response as "already expired," an origin server should use - an Expires date that is equal to the Date header value. 
(See the - rules for expiration calculations in section 13.2.4.) - - To mark a response as "never expires," an origin server should use an - Expires date approximately one year from the time the response is - sent. HTTP/1.1 servers should not send Expires dates more than one - year in the future. - - The presence of an Expires header field with a date value of some - time in the future on an response that otherwise would by default be - non-cacheable indicates that the response is cachable, unless - indicated otherwise by a Cache-Control header field (section 14.9). - -14.22 From - - The From request-header field, if given, SHOULD contain an Internet - e-mail address for the human user who controls the requesting user - agent. The address SHOULD be machine-usable, as defined by mailbox - in RFC 822 (as updated by RFC 1123 ): - - From = "From" ":" mailbox - - An example is: - - From: webmaster@w3.org - - This header field MAY be used for logging purposes and as a means for - identifying the source of invalid or unwanted requests. It SHOULD NOT - be used as an insecure form of access protection. The interpretation - of this field is that the request is being performed on behalf of the - person given, who accepts responsibility for the method performed. In - particular, robot agents SHOULD include this header so that the - person responsible for running the robot can be contacted if problems - occur on the receiving end. - - - - - -Fielding, et. al. Standards Track [Page 118] - -RFC 2068 HTTP/1.1 January 1997 - - - The Internet e-mail address in this field MAY be separate from the - Internet host which issued the request. For example, when a request - is passed through a proxy the original issuer's address SHOULD be - used. - - Note: The client SHOULD not send the From header field without the - user's approval, as it may conflict with the user's privacy - interests or their site's security policy. 
It is strongly - recommended that the user be able to disable, enable, and modify - the value of this field at any time prior to a request. - -14.23 Host - - The Host request-header field specifies the Internet host and port - number of the resource being requested, as obtained from the original - URL given by the user or referring resource (generally an HTTP URL, - as described in section 3.2.2). The Host field value MUST represent - the network location of the origin server or gateway given by the - original URL. This allows the origin server or gateway to - differentiate between internally-ambiguous URLs, such as the root "/" - URL of a server for multiple host names on a single IP address. - - Host = "Host" ":" host [ ":" port ] ; Section 3.2.2 - - A "host" without any trailing port information implies the default - port for the service requested (e.g., "80" for an HTTP URL). For - example, a request on the origin server for - <http://www.w3.org/pub/WWW/> MUST include: - - GET /pub/WWW/ HTTP/1.1 - Host: www.w3.org - - A client MUST include a Host header field in all HTTP/1.1 request - messages on the Internet (i.e., on any message corresponding to a - request for a URL which includes an Internet host address for the - service being requested). If the Host field is not already present, - an HTTP/1.1 proxy MUST add a Host field to the request message prior - to forwarding it on the Internet. All Internet-based HTTP/1.1 servers - MUST respond with a 400 status code to any HTTP/1.1 request message - which lacks a Host header field. - - See sections 5.2 and 19.5.1 for other requirements relating to Host. - -14.24 If-Modified-Since - - The If-Modified-Since request-header field is used with the GET - method to make it conditional: if the requested variant has not been - modified since the time specified in this field, an entity will not - - - -Fielding, et. al. 
Standards Track [Page 119] - -RFC 2068 HTTP/1.1 January 1997 - - - be returned from the server; instead, a 304 (not modified) response - will be returned without any message-body. - - If-Modified-Since = "If-Modified-Since" ":" HTTP-date - - An example of the field is: - - If-Modified-Since: Sat, 29 Oct 1994 19:43:31 GMT - - A GET method with an If-Modified-Since header and no Range header - requests that the identified entity be transferred only if it has - been modified since the date given by the If-Modified-Since header. - The algorithm for determining this includes the following cases: - - a)If the request would normally result in anything other than a 200 - (OK) status, or if the passed If-Modified-Since date is invalid, the - response is exactly the same as for a normal GET. A date which is - later than the server's current time is invalid. - - b)If the variant has been modified since the If-Modified-Since date, - the response is exactly the same as for a normal GET. - - c)If the variant has not been modified since a valid If-Modified-Since - date, the server MUST return a 304 (Not Modified) response. - - The purpose of this feature is to allow efficient updates of cached - information with a minimum amount of transaction overhead. - - Note that the Range request-header field modifies the meaning of - If-Modified-Since; see section 14.36 for full details. - - Note that If-Modified-Since times are interpreted by the server, - whose clock may not be synchronized with the client. - - Note that if a client uses an arbitrary date in the If-Modified-Since - header instead of a date taken from the Last-Modified header for the - same request, the client should be aware of the fact that this date - is interpreted in the server's understanding of time. The client - should consider unsynchronized clocks and rounding problems due to - the different encodings of time between the client and server. 
This - includes the possibility of race conditions if the document has - changed between the time it was first requested and the If-Modified- - Since date of a subsequent request, and the possibility of clock- - skew-related problems if the If-Modified-Since date is derived from - the client's clock without correction to the server's clock. - Corrections for different time bases between client and server are at - best approximate due to network latency. - - - - -Fielding, et. al. Standards Track [Page 120] - -RFC 2068 HTTP/1.1 January 1997 - - -14.25 If-Match - - The If-Match request-header field is used with a method to make it - conditional. A client that has one or more entities previously - obtained from the resource can verify that one of those entities is - current by including a list of their associated entity tags in the - If-Match header field. The purpose of this feature is to allow - efficient updates of cached information with a minimum amount of - transaction overhead. It is also used, on updating requests, to - prevent inadvertent modification of the wrong version of a resource. - As a special case, the value "*" matches any current entity of the - resource. - - If-Match = "If-Match" ":" ( "*" | 1#entity-tag ) - - If any of the entity tags match the entity tag of the entity that - would have been returned in the response to a similar GET request - (without the If-Match header) on that resource, or if "*" is given - and any current entity exists for that resource, then the server MAY - perform the requested method as if the If-Match header field did not - exist. - - A server MUST use the strong comparison function (see section 3.11) - to compare the entity tags in If-Match. - - If none of the entity tags match, or if "*" is given and no current - entity exists, the server MUST NOT perform the requested method, and - MUST return a 412 (Precondition Failed) response. 
This behavior is - most useful when the client wants to prevent an updating method, such - as PUT, from modifying a resource that has changed since the client - last retrieved it. - - If the request would, without the If-Match header field, result in - anything other than a 2xx status, then the If-Match header MUST be - ignored. - - The meaning of "If-Match: *" is that the method SHOULD be performed - if the representation selected by the origin server (or by a cache, - possibly using the Vary mechanism, see section 14.43) exists, and - MUST NOT be performed if the representation does not exist. - - - - - - - - - - - -Fielding, et. al. Standards Track [Page 121] - -RFC 2068 HTTP/1.1 January 1997 - - - A request intended to update a resource (e.g., a PUT) MAY include an - If-Match header field to signal that the request method MUST NOT be - applied if the entity corresponding to the If-Match value (a single - entity tag) is no longer a representation of that resource. This - allows the user to indicate that they do not wish the request to be - successful if the resource has been changed without their knowledge. - Examples: - - If-Match: "xyzzy" - If-Match: "xyzzy", "r2d2xxxx", "c3piozzzz" - If-Match: * - -14.26 If-None-Match - - The If-None-Match request-header field is used with a method to make - it conditional. A client that has one or more entities previously - obtained from the resource can verify that none of those entities is - current by including a list of their associated entity tags in the - If-None-Match header field. The purpose of this feature is to allow - efficient updates of cached information with a minimum amount of - transaction overhead. It is also used, on updating requests, to - prevent inadvertent modification of a resource which was not known to - exist. - - As a special case, the value "*" matches any current entity of the - resource. 
- - If-None-Match = "If-None-Match" ":" ( "*" | 1#entity-tag ) - - If any of the entity tags match the entity tag of the entity that - would have been returned in the response to a similar GET request - (without the If-None-Match header) on that resource, or if "*" is - given and any current entity exists for that resource, then the - server MUST NOT perform the requested method. Instead, if the request - method was GET or HEAD, the server SHOULD respond with a 304 (Not - Modified) response, including the cache-related entity-header fields - (particularly ETag) of one of the entities that matched. For all - other request methods, the server MUST respond with a status of 412 - (Precondition Failed). - - See section 13.3.3 for rules on how to determine if two entity tags - match. The weak comparison function can only be used with GET or HEAD - requests. - - If none of the entity tags match, or if "*" is given and no current - entity exists, then the server MAY perform the requested method as if - the If-None-Match header field did not exist. - - - - -Fielding, et. al. Standards Track [Page 122] - -RFC 2068 HTTP/1.1 January 1997 - - - If the request would, without the If-None-Match header field, result - in anything other than a 2xx status, then the If-None-Match header - MUST be ignored. - - The meaning of "If-None-Match: *" is that the method MUST NOT be - performed if the representation selected by the origin server (or by - a cache, possibly using the Vary mechanism, see section 14.43) - exists, and SHOULD be performed if the representation does not exist. - This feature may be useful in preventing races between PUT - operations. 
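The If-None-Match rules in this section can be read as a small decision procedure. The following is a minimal sketch under stated assumptions, not a reference implementation: the function names and the convention of returning a blocking status code or None are hypothetical, the list is split naively on commas (so it assumes no commas inside an opaque tag), and the weak comparison is taken to mean "opaque tags equal, weakness ignored" as section 13.3.3 is generally understood.

```python
from typing import Optional, Tuple


def parse_entity_tag(raw: str) -> Tuple[bool, str]:
    """Split one entity-tag into (is_weak, opaque_tag)."""
    raw = raw.strip()
    is_weak = raw.startswith("W/")
    if is_weak:
        raw = raw[2:]
    if len(raw) >= 2 and raw.startswith('"') and raw.endswith('"'):
        raw = raw[1:-1]
    return is_weak, raw


def tags_match(a: Tuple[bool, str], b: Tuple[bool, str], weak_ok: bool) -> bool:
    """Weak comparison ignores the W/ marker; strong comparison forbids it."""
    if weak_ok:
        return a[1] == b[1]
    return not a[0] and not b[0] and a[1] == b[1]


def evaluate_if_none_match(header_value: str,
                           current_etag: Optional[str],
                           method: str,
                           status_without_header: int = 200) -> Optional[int]:
    """Return 304 or 412 if the method must not be performed, else None.

    current_etag is None when no current entity exists for the resource.
    """
    # The header is ignored if the request would not yield a 2xx anyway.
    if not 200 <= status_without_header < 300:
        return None
    is_retrieval = method in ("GET", "HEAD")
    blocked_status = 304 if is_retrieval else 412
    if header_value.strip() == "*":
        return blocked_status if current_etag is not None else None
    if current_etag is None:
        return None
    current = parse_entity_tag(current_etag)
    # Naive split: assumes opaque tags contain no commas.
    for raw in header_value.split(","):
        if tags_match(parse_entity_tag(raw), current, weak_ok=is_retrieval):
            return blocked_status
    return None
```

A caller would perform the requested method only when the function returns None; otherwise it would send the returned status (with the cache-related entity-header fields in the 304 case).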
- - Examples: - - If-None-Match: "xyzzy" - If-None-Match: W/"xyzzy" - If-None-Match: "xyzzy", "r2d2xxxx", "c3piozzzz" - If-None-Match: W/"xyzzy", W/"r2d2xxxx", W/"c3piozzzz" - If-None-Match: * - -14.27 If-Range - - If a client has a partial copy of an entity in its cache, and wishes - to have an up-to-date copy of the entire entity in its cache, it - could use the Range request-header with a conditional GET (using - either or both of If-Unmodified-Since and If-Match.) However, if the - condition fails because the entity has been modified, the client - would then have to make a second request to obtain the entire current - entity-body. - - The If-Range header allows a client to "short-circuit" the second - request. Informally, its meaning is `if the entity is unchanged, send - me the part(s) that I am missing; otherwise, send me the entire new - entity.' - - If-Range = "If-Range" ":" ( entity-tag | HTTP-date ) - - If the client has no entity tag for an entity, but does have a Last- - Modified date, it may use that date in a If-Range header. (The server - can distinguish between a valid HTTP-date and any form of entity-tag - by examining no more than two characters.) The If-Range header should - only be used together with a Range header, and must be ignored if the - request does not include a Range header, or if the server does not - support the sub-range operation. - - - - - - - - -Fielding, et. al. Standards Track [Page 123] - -RFC 2068 HTTP/1.1 January 1997 - - - If the entity tag given in the If-Range header matches the current - entity tag for the entity, then the server should provide the - specified sub-range of the entity using a 206 (Partial content) - response. If the entity tag does not match, then the server should - return the entire entity using a 200 (OK) response. - -14.28 If-Unmodified-Since - - The If-Unmodified-Since request-header field is used with a method to - make it conditional. 
If the requested resource has not been modified - since the time specified in this field, the server should perform the - requested operation as if the If-Unmodified-Since header were not - present. - - If the requested variant has been modified since the specified time, - the server MUST NOT perform the requested operation, and MUST return - a 412 (Precondition Failed). - - If-Unmodified-Since = "If-Unmodified-Since" ":" HTTP-date - - An example of the field is: - - If-Unmodified-Since: Sat, 29 Oct 1994 19:43:31 GMT - - If the request normally (i.e., without the If-Unmodified-Since - header) would result in anything other than a 2xx status, the If- - Unmodified-Since header should be ignored. - - If the specified date is invalid, the header is ignored. - -14.29 Last-Modified - - The Last-Modified entity-header field indicates the date and time at - which the origin server believes the variant was last modified. - - Last-Modified = "Last-Modified" ":" HTTP-date - - An example of its use is - - Last-Modified: Tue, 15 Nov 1994 12:45:26 GMT - - The exact meaning of this header field depends on the implementation - of the origin server and the nature of the original resource. For - files, it may be just the file system last-modified time. For - entities with dynamically included parts, it may be the most recent - of the set of last-modify times for its component parts. For database - gateways, it may be the last-update time stamp of the record. For - virtual objects, it may be the last time the internal state changed. - - - -Fielding, et. al. Standards Track [Page 124] - -RFC 2068 HTTP/1.1 January 1997 - - - An origin server MUST NOT send a Last-Modified date which is later - than the server's time of message origination. In such cases, where - the resource's last modification would indicate some time in the - future, the server MUST replace that date with the message - origination date. 
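The rule that a Last-Modified date must never be later than the time of message origination amounts to taking the earlier of two timestamps. The sketch below illustrates that, assuming both values are timezone-aware UTC datetimes; the function name is hypothetical and only the Python standard library is used.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def last_modified_header(resource_mtime: datetime,
                         message_origination: datetime) -> str:
    """Format a Last-Modified value, clamped to the origination time.

    Both arguments are assumed to be aware datetimes in UTC.
    """
    # A modification time in the future is replaced by the origination date.
    effective = min(resource_mtime, message_origination)
    return format_datetime(effective, usegmt=True)


# Example: a resource whose recorded modification time lies in the future.
origination = datetime(1994, 11, 15, 8, 12, 31, tzinfo=timezone.utc)
future_mtime = datetime(1994, 11, 15, 12, 45, 26, tzinfo=timezone.utc)
clamped = last_modified_header(future_mtime, origination)
```

Reading the modification time and the origination time as close together as possible, as the section recommends, keeps the clamped value meaningful for resources that change near the time of the response.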
- - An origin server should obtain the Last-Modified value of the entity - as close as possible to the time that it generates the Date value of - its response. This allows a recipient to make an accurate assessment - of the entity's modification time, especially if the entity changes - near the time that the response is generated. - - HTTP/1.1 servers SHOULD send Last-Modified whenever feasible. - -14.30 Location - - The Location response-header field is used to redirect the recipient - to a location other than the Request-URI for completion of the - request or identification of a new resource. For 201 (Created) - responses, the Location is that of the new resource which was created - by the request. For 3xx responses, the location SHOULD indicate the - server's preferred URL for automatic redirection to the resource. The - field value consists of a single absolute URL. - - Location = "Location" ":" absoluteURI - - An example is - - Location: http://www.w3.org/pub/WWW/People.html - - Note: The Content-Location header field (section 14.15) differs - from Location in that the Content-Location identifies the original - location of the entity enclosed in the request. It is therefore - possible for a response to contain header fields for both Location - and Content-Location. Also see section 13.10 for cache requirements - of some methods. - -14.31 Max-Forwards - - The Max-Forwards request-header field may be used with the TRACE - method (section 9.8) to limit the number of proxies or gateways - that can forward the request to the next inbound server. This can be - useful when the client is attempting to trace a request chain which - appears to be failing or looping in mid-chain. - - Max-Forwards = "Max-Forwards" ":" 1*DIGIT - - - - - -Fielding, et. al. Standards Track [Page 125] - -RFC 2068 HTTP/1.1 January 1997 - - - The Max-Forwards value is a decimal integer indicating the remaining - number of times this request message may be forwarded.
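As a rough illustration of how a proxy or gateway might apply this counter, the sketch below follows the check-and-decrement rule of this section: a value of zero means the recipient answers as the final recipient, and any larger value is forwarded reduced by one. The function name, the "respond"/"forward" return convention, and the exact-case header lookup are assumptions made for the example; the section itself does not say how a malformed value should be handled, so the sketch simply raises an error.

```python
from typing import Dict, Tuple


def handle_max_forwards(method: str,
                        headers: Dict[str, str]) -> Tuple[str, Dict[str, str]]:
    """Decide whether to forward a request, updating Max-Forwards if needed.

    Returns ("respond", headers) when this recipient should answer the
    request itself, or ("forward", updated_headers) otherwise.
    """
    updated = dict(headers)  # never mutate the caller's mapping
    value = updated.get("Max-Forwards")
    # The field is ignored for methods other than TRACE, and when absent.
    if method != "TRACE" or value is None:
        return "forward", updated
    if not value.strip().isdigit():
        # Assumption of this sketch: reject values that are not 1*DIGIT.
        raise ValueError("Max-Forwards must be a decimal integer")
    remaining = int(value)
    if remaining == 0:
        # Respond as the final recipient (200 with the request as the body).
        return "respond", updated
    updated["Max-Forwards"] = str(remaining - 1)
    return "forward", updated
```

Returning a fresh dictionary keeps the received message intact, which is convenient when the received request has to be echoed back as the entity-body of the TRACE response.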
- - Each proxy or gateway recipient of a TRACE request containing a Max- - Forwards header field SHOULD check and update its value prior to - forwarding the request. If the received value is zero (0), the - recipient SHOULD NOT forward the request; instead, it SHOULD respond - as the final recipient with a 200 (OK) response containing the - received request message as the response entity-body (as described in - section 9.8). If the received Max-Forwards value is greater than - zero, then the forwarded message SHOULD contain an updated Max- - Forwards field with a value decremented by one (1). - - The Max-Forwards header field SHOULD be ignored for all other methods - defined by this specification and for any extension methods for which - it is not explicitly referred to as part of that method definition. - -14.32 Pragma - - The Pragma general-header field is used to include implementation- - specific directives that may apply to any recipient along the - request/response chain. All pragma directives specify optional - behavior from the viewpoint of the protocol; however, some systems - MAY require that behavior be consistent with the directives. - - Pragma = "Pragma" ":" 1#pragma-directive - - pragma-directive = "no-cache" | extension-pragma - extension-pragma = token [ "=" ( token | quoted-string ) ] - - When the no-cache directive is present in a request message, an - application SHOULD forward the request toward the origin server even - if it has a cached copy of what is being requested. This pragma - directive has the same semantics as the no-cache cache-directive (see - section 14.9) and is defined here for backwards compatibility with - HTTP/1.0. Clients SHOULD include both header fields when a no-cache - request is sent to a server not known to be HTTP/1.1 compliant. 
- - Pragma directives MUST be passed through by a proxy or gateway - application, regardless of their significance to that application, - since the directives may be applicable to all recipients along the - request/response chain. It is not possible to specify a pragma for a - specific recipient; however, any pragma directive not relevant to a - recipient SHOULD be ignored by that recipient. - - - - - - - -Fielding, et. al. Standards Track [Page 126] - -RFC 2068 HTTP/1.1 January 1997 - - - HTTP/1.1 clients SHOULD NOT send the Pragma request-header. HTTP/1.1 - caches SHOULD treat "Pragma: no-cache" as if the client had sent - "Cache-Control: no-cache". No new Pragma directives will be defined - in HTTP. - -14.33 Proxy-Authenticate - - The Proxy-Authenticate response-header field MUST be included as part - of a 407 (Proxy Authentication Required) response. The field value - consists of a challenge that indicates the authentication scheme and - parameters applicable to the proxy for this Request-URI. - - Proxy-Authenticate = "Proxy-Authenticate" ":" challenge - - The HTTP access authentication process is described in section 11. - Unlike WWW-Authenticate, the Proxy-Authenticate header field applies - only to the current connection and SHOULD NOT be passed on to - downstream clients. However, an intermediate proxy may need to obtain - its own credentials by requesting them from the downstream client, - which in some circumstances will appear as if the proxy is forwarding - the Proxy-Authenticate header field. - -14.34 Proxy-Authorization - - The Proxy-Authorization request-header field allows the client to - identify itself (or its user) to a proxy which requires - authentication. The Proxy-Authorization field value consists of - credentials containing the authentication information of the user - agent for the proxy and/or realm of the resource being requested. 
- - Proxy-Authorization = "Proxy-Authorization" ":" credentials - - The HTTP access authentication process is described in section 11. - Unlike Authorization, the Proxy-Authorization header field applies - only to the next outbound proxy that demanded authentication using - the Proxy-Authenticate field. When multiple proxies are used in a - chain, the Proxy-Authorization header field is consumed by the first - outbound proxy that was expecting to receive credentials. A proxy MAY - relay the credentials from the client request to the next proxy if - that is the mechanism by which the proxies cooperatively authenticate - a given request. - -14.35 Public - - The Public response-header field lists the set of methods supported - by the server. The purpose of this field is strictly to inform the - recipient of the capabilities of the server regarding unusual - methods. The methods listed may or may not be applicable to the - - - -Fielding, et. al. Standards Track [Page 127] - -RFC 2068 HTTP/1.1 January 1997 - - - Request-URI; the Allow header field (section 14.7) MAY be used to - indicate methods allowed for a particular URI. - - Public = "Public" ":" 1#method - - Example of use: - - Public: OPTIONS, MGET, MHEAD, GET, HEAD - - This header field applies only to the server directly connected to - the client (i.e., the nearest neighbor in a chain of connections). If - the response passes through a proxy, the proxy MUST either remove the - Public header field or replace it with one applicable to its own - capabilities. - -14.36 Range - -14.36.1 Byte Ranges - - Since all HTTP entities are represented in HTTP messages as sequences - of bytes, the concept of a byte range is meaningful for any HTTP - entity. (However, not all clients and servers need to support byte- - range operations.) - - Byte range specifications in HTTP apply to the sequence of bytes in - the entity-body (not necessarily the same as the message-body). 
- - A byte range operation may specify a single range of bytes, or a set - of ranges within a single entity. - - ranges-specifier = byte-ranges-specifier - - byte-ranges-specifier = bytes-unit "=" byte-range-set - - byte-range-set = 1#( byte-range-spec | suffix-byte-range-spec ) - - byte-range-spec = first-byte-pos "-" [last-byte-pos] - - first-byte-pos = 1*DIGIT - - last-byte-pos = 1*DIGIT - - The first-byte-pos value in a byte-range-spec gives the byte-offset - of the first byte in a range. The last-byte-pos value gives the - byte-offset of the last byte in the range; that is, the byte - positions specified are inclusive. Byte offsets start at zero. - - - - - -Fielding, et. al. Standards Track [Page 128] - -RFC 2068 HTTP/1.1 January 1997 - - - If the last-byte-pos value is present, it must be greater than or - equal to the first-byte-pos in that byte-range-spec, or the byte- - range-spec is invalid. The recipient of an invalid byte-range-spec - must ignore it. - - If the last-byte-pos value is absent, or if the value is greater than - or equal to the current length of the entity-body, last-byte-pos is - taken to be equal to one less than the current length of the entity- - body in bytes. - - By its choice of last-byte-pos, a client can limit the number of - bytes retrieved without knowing the size of the entity. - - suffix-byte-range-spec = "-" suffix-length - - suffix-length = 1*DIGIT - - A suffix-byte-range-spec is used to specify the suffix of the - entity-body, of a length given by the suffix-length value. (That is, - this form specifies the last N bytes of an entity-body.) If the - entity is shorter than the specified suffix-length, the entire - entity-body is used. 
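The byte-range grammar and the rules for last-byte-pos and suffix-length can be turned into concrete offsets as follows. This is a minimal sketch, assuming the entity length is known; the function name is hypothetical, and dropping ranges that turn out empty (for example, a first-byte-pos beyond the end of the entity) is a choice of this sketch rather than something the rules above prescribe.

```python
from typing import List, Tuple


def resolve_byte_range_set(spec: str, entity_length: int) -> List[Tuple[int, int]]:
    """Resolve a byte-ranges-specifier into inclusive (first, last) offsets.

    Invalid byte-range-spec items are ignored, as the text requires.
    """
    unit, _, range_set = spec.partition("=")
    if unit.strip() != "bytes":
        raise ValueError("only the 'bytes' unit is handled in this sketch")
    resolved = []
    for item in range_set.split(","):
        first_text, sep, last_text = item.strip().partition("-")
        if not sep:
            continue  # no "-" at all: not a range spec
        if first_text == "":
            # suffix-byte-range-spec: the last N bytes of the entity-body.
            if not last_text.isdigit():
                continue
            first = max(entity_length - int(last_text), 0)
            last = entity_length - 1
        else:
            if not first_text.isdigit():
                continue
            first = int(first_text)
            if last_text == "":
                last = entity_length - 1  # open-ended range
            elif last_text.isdigit():
                last = int(last_text)
                if last < first:
                    continue  # invalid byte-range-spec, ignored
                last = min(last, entity_length - 1)
            else:
                continue
        if first <= last:  # sketch-level choice: drop empty ranges
            resolved.append((first, last))
    return resolved
```

For an entity-body of length 10000, `resolve_byte_range_set("bytes=-500", 10000)` and `resolve_byte_range_set("bytes=9500-", 10000)` both resolve to the single range (9500, 9999).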
- - Examples of byte-ranges-specifier values (assuming an entity-body of - length 10000): - - o The first 500 bytes (byte offsets 0-499, inclusive): - - bytes=0-499 - - o The second 500 bytes (byte offsets 500-999, inclusive): - - bytes=500-999 - - o The final 500 bytes (byte offsets 9500-9999, inclusive): - - bytes=-500 - - o Or - - bytes=9500- - - o The first and last bytes only (bytes 0 and 9999): - - bytes=0-0,-1 - - - - - - -Fielding, et. al. Standards Track [Page 129] - -RFC 2068 HTTP/1.1 January 1997 - - - o Several legal but not canonical specifications of the second - 500 bytes (byte offsets 500-999, inclusive): - - bytes=500-600,601-999 - - bytes=500-700,601-999 - -14.36.2 Range Retrieval Requests - - HTTP retrieval requests using conditional or unconditional GET - methods may request one or more sub-ranges of the entity, instead of - the entire entity, using the Range request header, which applies to - the entity returned as the result of the request: - - Range = "Range" ":" ranges-specifier - - A server MAY ignore the Range header. However, HTTP/1.1 origin - servers and intermediate caches SHOULD support byte ranges when - possible, since Range supports efficient recovery from partially - failed transfers, and supports efficient partial retrieval of large - entities. - - If the server supports the Range header and the specified range or - ranges are appropriate for the entity: - - o The presence of a Range header in an unconditional GET modifies - what is returned if the GET is otherwise successful. In other - words, the response carries a status code of 206 (Partial - Content) instead of 200 (OK). - - o The presence of a Range header in a conditional GET (a request - using one or both of If-Modified-Since and If-None-Match, or - one or both of If-Unmodified-Since and If-Match) modifies what - is returned if the GET is otherwise successful and the condition - is true. 
It does not affect the 304 (Not Modified) response - returned if the conditional is false. - - In some cases, it may be more appropriate to use the If-Range header - (see section 14.27) in addition to the Range header. - - If a proxy that supports ranges receives a Range request, forwards - the request to an inbound server, and receives an entire entity in - reply, it SHOULD only return the requested range to its client. It - SHOULD store the entire received response in its cache, if that is - consistent with its cache allocation policies. - - - - - - -Fielding, et. al. Standards Track [Page 130] - -RFC 2068 HTTP/1.1 January 1997 - - -14.37 Referer - - The Referer[sic] request-header field allows the client to specify, - for the server's benefit, the address (URI) of the resource from - which the Request-URI was obtained (the "referrer", although the - header field is misspelled.) The Referer request-header allows a - server to generate lists of back-links to resources for interest, - logging, optimized caching, etc. It also allows obsolete or mistyped - links to be traced for maintenance. The Referer field MUST NOT be - sent if the Request-URI was obtained from a source that does not have - its own URI, such as input from the user keyboard. - - Referer = "Referer" ":" ( absoluteURI | relativeURI ) - - Example: - - Referer: http://www.w3.org/hypertext/DataSources/Overview.html - - If the field value is a partial URI, it SHOULD be interpreted - relative to the Request-URI. The URI MUST NOT include a fragment. - - Note: Because the source of a link may be private information or - may reveal an otherwise private information source, it is strongly - recommended that the user be able to select whether or not the - Referer field is sent. For example, a browser client could have a - toggle switch for browsing openly/anonymously, which would - respectively enable/disable the sending of Referer and From - information. 
- -14.38 Retry-After - - The Retry-After response-header field can be used with a 503 (Service - Unavailable) response to indicate how long the service is expected to - be unavailable to the requesting client. The value of this field can - be either an HTTP-date or an integer number of seconds (in decimal) - after the time of the response. - - Retry-After = "Retry-After" ":" ( HTTP-date | delta-seconds ) - - Two examples of its use are - - Retry-After: Fri, 31 Dec 1999 23:59:59 GMT - Retry-After: 120 - - In the latter example, the delay is 2 minutes. - - - - - - -Fielding, et. al. Standards Track [Page 131] - -RFC 2068 HTTP/1.1 January 1997 - - -14.39 Server - - The Server response-header field contains information about the - software used by the origin server to handle the request. The field - can contain multiple product tokens (section 3.8) and comments - identifying the server and any significant subproducts. The product - tokens are listed in order of their significance for identifying the - application. - - Server = "Server" ":" 1*( product | comment ) - - Example: - - Server: CERN/3.0 libwww/2.17 - - If the response is being forwarded through a proxy, the proxy - application MUST NOT modify the Server response-header. Instead, it - SHOULD include a Via field (as described in section 14.44). - - Note: Revealing the specific software version of the server may - allow the server machine to become more vulnerable to attacks - against software that is known to contain security holes. Server - implementers are encouraged to make this field a configurable - option. - -14.40 Transfer-Encoding - - The Transfer-Encoding general-header field indicates what (if any) - type of transformation has been applied to the message body in order - to safely transfer it between the sender and the recipient. This - differs from the Content-Encoding in that the transfer coding is a - property of the message, not of the entity. 
- - Transfer-Encoding = "Transfer-Encoding" ":" 1#transfer-coding - - Transfer codings are defined in section 3.6. An example is: - - Transfer-Encoding: chunked - - Many older HTTP/1.0 applications do not understand the Transfer- - Encoding header. - -14.41 Upgrade - - The Upgrade general-header allows the client to specify what - additional communication protocols it supports and would like to use - if the server finds it appropriate to switch protocols. The server - - - -Fielding, et. al. Standards Track [Page 132] - -RFC 2068 HTTP/1.1 January 1997 - - - MUST use the Upgrade header field within a 101 (Switching Protocols) - response to indicate which protocol(s) are being switched. - - Upgrade = "Upgrade" ":" 1#product - - For example, - - Upgrade: HTTP/2.0, SHTTP/1.3, IRC/6.9, RTA/x11 - - The Upgrade header field is intended to provide a simple mechanism - for transition from HTTP/1.1 to some other, incompatible protocol. It - does so by allowing the client to advertise its desire to use another - protocol, such as a later version of HTTP with a higher major version - number, even though the current request has been made using HTTP/1.1. - This eases the difficult transition between incompatible protocols by - allowing the client to initiate a request in the more commonly - supported protocol while indicating to the server that it would like - to use a "better" protocol if available (where "better" is determined - by the server, possibly according to the nature of the method and/or - resource being requested). - - The Upgrade header field only applies to switching application-layer - protocols upon the existing transport-layer connection. Upgrade - cannot be used to insist on a protocol change; its acceptance and use - by the server is optional.
The capabilities and nature of the - application-layer communication after the protocol change is entirely - dependent upon the new protocol chosen, although the first action - after changing the protocol MUST be a response to the initial HTTP - request containing the Upgrade header field. - - The Upgrade header field only applies to the immediate connection. - Therefore, the upgrade keyword MUST be supplied within a Connection - header field (section 14.10) whenever Upgrade is present in an - HTTP/1.1 message. - - The Upgrade header field cannot be used to indicate a switch to a - protocol on a different connection. For that purpose, it is more - appropriate to use a 301, 302, 303, or 305 redirection response. - - This specification only defines the protocol name "HTTP" for use by - the family of Hypertext Transfer Protocols, as defined by the HTTP - version rules of section 3.1 and future updates to this - specification. Any token can be used as a protocol name; however, it - will only be useful if both the client and server associate the name - with the same protocol. - - - - - - -Fielding, et. al. Standards Track [Page 133] - -RFC 2068 HTTP/1.1 January 1997 - - -14.42 User-Agent - - The User-Agent request-header field contains information about the - user agent originating the request. This is for statistical purposes, - the tracing of protocol violations, and automated recognition of user - agents for the sake of tailoring responses to avoid particular user - agent limitations. User agents SHOULD include this field with - requests. The field can contain multiple product tokens (section 3.8) - and comments identifying the agent and any subproducts which form a - significant part of the user agent. By convention, the product tokens - are listed in order of their significance for identifying the - application. 
- - User-Agent = "User-Agent" ":" 1*( product | comment ) - - Example: - - User-Agent: CERN-LineMode/2.15 libwww/2.17b3 - -14.43 Vary - - The Vary response-header field is used by a server to signal that the - response entity was selected from the available representations of - the response using server-driven negotiation (section 12). Field- - names listed in Vary headers are those of request-headers. The Vary - field value indicates either that the given set of header fields - encompass the dimensions over which the representation might vary, or - that the dimensions of variance are unspecified ("*") and thus may - vary over any aspect of future requests. - - Vary = "Vary" ":" ( "*" | 1#field-name ) - - An HTTP/1.1 server MUST include an appropriate Vary header field with - any cachable response that is subject to server-driven negotiation. - Doing so allows a cache to properly interpret future requests on that - resource and informs the user agent about the presence of negotiation - on that resource. A server SHOULD include an appropriate Vary header - field with a non-cachable response that is subject to server-driven - negotiation, since this might provide the user agent with useful - information about the dimensions over which the response might vary. - - The set of header fields named by the Vary field value is known as - the "selecting" request-headers. - - When the cache receives a subsequent request whose Request-URI - specifies one or more cache entries including a Vary header, the - cache MUST NOT use such a cache entry to construct a response to the - new request unless all of the headers named in the cached Vary header - - - -Fielding, et. al. Standards Track [Page 134] - -RFC 2068 HTTP/1.1 January 1997 - - - are present in the new request, and all of the stored selecting - request-headers from the previous request match the corresponding - headers in the new request. 
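The cache rule for Vary can be sketched as a predicate over the stored and the new request headers. The code below is an illustration under stated assumptions: the function names and the header representation (lower-cased field name mapped to a list of field values) are hypothetical, and the normalization is only a crude stand-in for the whitespace and field-combining equivalence the specification defines, since it splits on every comma and would mishandle commas inside quoted strings.

```python
from typing import Dict, List

# Lower-cased field name -> list of field values, one per header line.
Headers = Dict[str, List[str]]


def _normalize(values: List[str]) -> str:
    """Combine repeated fields and squeeze whitespace (approximation)."""
    elements = []
    for value in values:
        for part in value.split(","):
            part = " ".join(part.split())
            if part:
                elements.append(part)
    return ", ".join(elements)


def can_use_cached_entry(vary: str,
                         stored_request: Headers,
                         new_request: Headers) -> bool:
    """Return True if a cached entry may be used for the new request."""
    if vary.strip() == "*":
        # Unspecified dimensions: only the origin server can decide.
        return False
    for name in vary.split(","):
        name = name.strip().lower()  # field names are case-insensitive
        if not name:
            continue
        # Every named header must be present in the new request. Treating a
        # header missing from the stored request as a mismatch is a
        # conservative choice of this sketch.
        if name not in new_request or name not in stored_request:
            return False
        if _normalize(stored_request[name]) != _normalize(new_request[name]):
            return False
    return True
```

With a stored request carrying `Accept-Language: en, fr`, a new request sending `Accept-Language: en,fr` or two separate `Accept-Language` lines for `en` and `fr` would be treated as matching, while one sending `de` would not.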
- - The selecting request-headers from two requests are defined to match - if and only if the selecting request-headers in the first request can - be transformed to the selecting request-headers in the second request - by adding or removing linear whitespace (LWS) at places where this is - allowed by the corresponding BNF, and/or combining multiple message- - header fields with the same field name following the rules about - message headers in section 4.2. - - A Vary field value of "*" signals that unspecified parameters, - possibly other than the contents of request-header fields (e.g., the - network address of the client), play a role in the selection of the - response representation. Subsequent requests on that resource can - only be properly interpreted by the origin server, and thus a cache - MUST forward a (possibly conditional) request even when it has a - fresh response cached for the resource. See section 13.6 for use of - the Vary header by caches. - - A Vary field value consisting of a list of field-names signals that - the representation selected for the response is based on a selection - algorithm which considers ONLY the listed request-header field values - in selecting the most appropriate representation. A cache MAY assume - that the same selection will be made for future requests with the - same values for the listed field names, for the duration of time in - which the response is fresh. - - The field-names given are not limited to the set of standard - request-header fields defined by this specification. Field names are - case-insensitive. - -14.44 Via - - The Via general-header field MUST be used by gateways and proxies to - indicate the intermediate protocols and recipients between the user - agent and the server on requests, and between the origin server and - the client on responses. 
It is analogous to the "Received" field of - RFC 822 and is intended to be used for tracking message forwards, - avoiding request loops, and identifying the protocol capabilities of - all senders along the request/response chain. - - - - - - - - - -Fielding, et. al. Standards Track [Page 135] - -RFC 2068 HTTP/1.1 January 1997 - - - Via = "Via" ":" 1#( received-protocol received-by [ comment ] ) - - received-protocol = [ protocol-name "/" ] protocol-version - protocol-name = token - protocol-version = token - received-by = ( host [ ":" port ] ) | pseudonym - pseudonym = token - - The received-protocol indicates the protocol version of the message - received by the server or client along each segment of the - request/response chain. The received-protocol version is appended to - the Via field value when the message is forwarded so that information - about the protocol capabilities of upstream applications remains - visible to all recipients. - - The protocol-name is optional if and only if it would be "HTTP". The - received-by field is normally the host and optional port number of a - recipient server or client that subsequently forwarded the message. - However, if the real host is considered to be sensitive information, - it MAY be replaced by a pseudonym. If the port is not given, it MAY - be assumed to be the default port of the received-protocol. - - Multiple Via field values represent each proxy or gateway that has - forwarded the message. Each recipient MUST append its information - such that the end result is ordered according to the sequence of - forwarding applications. - - Comments MAY be used in the Via header field to identify the software - of the recipient proxy or gateway, analogous to the User-Agent and - Server header fields. However, all comments in the Via field are - optional and MAY be removed by any recipient prior to forwarding the - message. 
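The Via grammar and the append rule can be illustrated with a small helper that a forwarding proxy might use. This is a sketch only: the function name and parameters are hypothetical, and no validation of tokens or comments is attempted.

```python
from typing import Optional


def append_via(existing: Optional[str],
               protocol_version: str,
               received_by: str,
               protocol_name: str = "HTTP",
               comment: Optional[str] = None) -> str:
    """Append this recipient's entry to a Via field value.

    existing is the Via value received with the message, or None.
    received_by is a host[:port] or a pseudonym.
    """
    # The protocol-name may be omitted only when it is "HTTP".
    if protocol_name == "HTTP":
        received_protocol = protocol_version
    else:
        received_protocol = f"{protocol_name}/{protocol_version}"
    entry = f"{received_protocol} {received_by}"
    if comment:
        entry += f" ({comment})"  # comments are optional and may be dropped
    # Appending keeps entries in the order the message was forwarded.
    return f"{existing}, {entry}" if existing else entry
```

For instance, a proxy named "fred" that received an HTTP/1.0 request would produce `1.0 fred`, and a second proxy at nowhere.com that received it over HTTP/1.1 would extend that to `1.0 fred, 1.1 nowhere.com (Apache/1.1)`.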
- - For example, a request message could be sent from an HTTP/1.0 user - agent to an internal proxy code-named "fred", which uses HTTP/1.1 to - forward the request to a public proxy at nowhere.com, which completes - the request by forwarding it to the origin server at www.ics.uci.edu. - The request received by www.ics.uci.edu would then have the following - Via header field: - - Via: 1.0 fred, 1.1 nowhere.com (Apache/1.1) - - Proxies and gateways used as a portal through a network firewall - SHOULD NOT, by default, forward the names and ports of hosts within - the firewall region. This information SHOULD only be propagated if - explicitly enabled. If not enabled, the received-by host of any host - behind the firewall SHOULD be replaced by an appropriate pseudonym - for that host. - - - -Fielding, et. al. Standards Track [Page 136] - -RFC 2068 HTTP/1.1 January 1997 - - - For organizations that have strong privacy requirements for hiding - internal structures, a proxy MAY combine an ordered subsequence of - Via header field entries with identical received-protocol values into - a single such entry. For example, - - Via: 1.0 ricky, 1.1 ethel, 1.1 fred, 1.0 lucy - - could be collapsed to - - Via: 1.0 ricky, 1.1 mertz, 1.0 lucy - - Applications SHOULD NOT combine multiple entries unless they are all - under the same organizational control and the hosts have already been - replaced by pseudonyms. Applications MUST NOT combine entries which - have different received-protocol values. - -14.45 Warning - - The Warning response-header field is used to carry additional - information about the status of a response which may not be reflected - by the response status code. This information is typically, though - not exclusively, used to warn about a possible lack of semantic - transparency from caching operations. 
- - Warning headers are sent with responses using: - - Warning = "Warning" ":" 1#warning-value - - warning-value = warn-code SP warn-agent SP warn-text - warn-code = 2DIGIT - warn-agent = ( host [ ":" port ] ) | pseudonym - ; the name or pseudonym of the server adding - ; the Warning header, for use in debugging - warn-text = quoted-string - - A response may carry more than one Warning header. - - The warn-text should be in a natural language and character set that - is most likely to be intelligible to the human user receiving the - response. This decision may be based on any available knowledge, - such as the location of the cache or user, the Accept-Language field - in a request, the Content-Language field in a response, etc. The - default language is English and the default character set is ISO- - 8859-1. - - If a character set other than ISO-8859-1 is used, it MUST be encoded - in the warn-text using the method described in RFC 1522 [14]. - - - - -Fielding, et. al. Standards Track [Page 137] - -RFC 2068 HTTP/1.1 January 1997 - - - Any server or cache may add Warning headers to a response. New - Warning headers should be added after any existing Warning headers. A - cache MUST NOT delete any Warning header that it received with a - response. However, if a cache successfully validates a cache entry, - it SHOULD remove any Warning headers previously attached to that - entry except as specified for specific Warning codes. It MUST then - add any Warning headers received in the validating response. In other - words, Warning headers are those that would be attached to the most - recent relevant response. - - When multiple Warning headers are attached to a response, the user - agent SHOULD display as many of them as possible, in the order that - they appear in the response. 
If it is not possible to display all of - the warnings, the user agent should follow these heuristics: - - o Warnings that appear early in the response take priority over those - appearing later in the response. - o Warnings in the user's preferred character set take priority over - warnings in other character sets but with identical warn-codes and - warn-agents. - - Systems that generate multiple Warning headers should order them with - this user agent behavior in mind. - - This is a list of the currently-defined warn-codes, each with a - recommended warn-text in English, and a description of its meaning. - -10 Response is stale - MUST be included whenever the returned response is stale. A cache may - add this warning to any response, but may never remove it until the - response is known to be fresh. - -11 Revalidation failed - MUST be included if a cache returns a stale response because an - attempt to revalidate the response failed, due to an inability to - reach the server. A cache may add this warning to any response, but - may never remove it until the response is successfully revalidated. - -12 Disconnected operation - SHOULD be included if the cache is intentionally disconnected from - the rest of the network for a period of time. - -13 Heuristic expiration - MUST be included if the cache heuristically chose a freshness - lifetime greater than 24 hours and the response's age is greater than - 24 hours. - - - - - -Fielding, et. al. Standards Track [Page 138] - -RFC 2068 HTTP/1.1 January 1997 - - -14 Transformation applied - MUST be added by an intermediate cache or proxy if it applies any - transformation changing the content-coding (as specified in the - Content-Encoding header) or media-type (as specified in the - Content-Type header) of the response, unless this Warning code - already appears in the response. MUST NOT be deleted from a response - even after revalidation. 
- -99 Miscellaneous warning - The warning text may include arbitrary information to be presented to - a human user, or logged. A system receiving this warning MUST NOT - take any automated action. - -14.46 WWW-Authenticate - - The WWW-Authenticate response-header field MUST be included in 401 - (Unauthorized) response messages. The field value consists of at - least one challenge that indicates the authentication scheme(s) and - parameters applicable to the Request-URI. - - WWW-Authenticate = "WWW-Authenticate" ":" 1#challenge - - The HTTP access authentication process is described in section 11. - User agents MUST take special care in parsing the WWW-Authenticate - field value if it contains more than one challenge, or if more than - one WWW-Authenticate header field is provided, since the contents of - a challenge may itself contain a comma-separated list of - authentication parameters. - -15 Security Considerations - - This section is meant to inform application developers, information - providers, and users of the security limitations in HTTP/1.1 as - described by this document. The discussion does not include - definitive solutions to the problems revealed, though it does make - some suggestions for reducing security risks. - -15.1 Authentication of Clients - - The Basic authentication scheme is not a secure method of user - authentication, nor does it in any way protect the entity, which is - transmitted in clear text across the physical network used as the - carrier. HTTP does not prevent additional authentication schemes and - encryption mechanisms from being employed to increase security or the - addition of enhancements (such as schemes to use one-time passwords) - to Basic authentication. - - - - - -Fielding, et. al. Standards Track [Page 139] - -RFC 2068 HTTP/1.1 January 1997 - - - The most serious flaw in Basic authentication is that it results in - the essentially clear text transmission of the user's password over - the physical network. 
It is this problem which Digest Authentication - attempts to address. - - Because Basic authentication involves the clear text transmission of - passwords it SHOULD never be used (without enhancements) to protect - sensitive or valuable information. - - A common use of Basic authentication is for identification purposes - -- requiring the user to provide a user name and password as a means - of identification, for example, for purposes of gathering accurate - usage statistics on a server. When used in this way it is tempting to - think that there is no danger in its use if illicit access to the - protected documents is not a major concern. This is only correct if - the server issues both user name and password to the users and in - particular does not allow the user to choose his or her own password. - The danger arises because naive users frequently reuse a single - password to avoid the task of maintaining multiple passwords. - - If a server permits users to select their own passwords, then the - threat is not only illicit access to documents on the server but also - illicit access to the accounts of all users who have chosen to use - their account password. If users are allowed to choose their own - password that also means the server must maintain files containing - the (presumably encrypted) passwords. Many of these may be the - account passwords of users perhaps at distant sites. The owner or - administrator of such a system could conceivably incur liability if - this information is not maintained in a secure fashion. - - Basic Authentication is also vulnerable to spoofing by counterfeit - servers. If a user can be led to believe that he is connecting to a - host containing information protected by basic authentication when in - fact he is connecting to a hostile server or gateway then the - attacker can request a password, store it for later use, and feign an - error. This type of attack is not possible with Digest Authentication - [32]. 
Server implementers SHOULD guard against the possibility of - this sort of counterfeiting by gateways or CGI scripts. In particular - it is very dangerous for a server to simply turn over a connection to - a gateway since that gateway can then use the persistent connection - mechanism to engage in multiple transactions with the client while - impersonating the original server in a way that is not detectable by - the client. - -15.2 Offering a Choice of Authentication Schemes - - An HTTP/1.1 server may return multiple challenges with a 401 - (Unauthorized) response, and each challenge may use a different - - - -Fielding, et. al. Standards Track [Page 140] - -RFC 2068 HTTP/1.1 January 1997 - - - scheme. The order of the challenges returned to the user agent is in - the order that the server would prefer they be chosen. The server - should order its challenges with the "most secure" authentication - scheme first. A user agent should choose as the challenge to be made - to the user the first one that the user agent understands. - - When the server offers choices of authentication schemes using the - WWW-Authenticate header, the "security" of the authentication is only - as strong as the weakest scheme offered, since a malicious user could - capture the set of challenges and try to - authenticate him/herself using the weakest of the authentication - schemes. Thus, the ordering serves more to protect the user's - credentials than the server's information. - - A possible man-in-the-middle (MITM) attack would be to add a weak - authentication scheme to the set of choices, hoping that the client - will use one that exposes the user's credentials (e.g. password). For - this reason, the client should always use the strongest scheme that - it understands from the choices accepted. - - An even better MITM attack would be to remove all offered choices, - and to insert a challenge that requests Basic authentication.
For - this reason, user agents that are concerned about this kind of attack - could remember the strongest authentication scheme ever requested by - a server and produce a warning message that requires user - confirmation before using a weaker one. A particularly insidious way - to mount such a MITM attack would be to offer a "free" proxy caching - service to gullible users. - -15.3 Abuse of Server Log Information - - A server is in the position to save personal data about a user's - requests which may identify their reading patterns or subjects of - interest. This information is clearly confidential in nature and its - handling may be constrained by law in certain countries. People using - the HTTP protocol to provide data are responsible for ensuring that - such material is not distributed without the permission of any - individuals that are identifiable by the published results. - -15.4 Transfer of Sensitive Information - - Like any generic data transfer protocol, HTTP cannot regulate the - content of the data that is transferred, nor is there any a priori - method of determining the sensitivity of any particular piece of - information within the context of any given request. Therefore, - applications SHOULD supply as much control over this information as - possible to the provider of that information. Four header fields are - worth special mention in this context: Server, Via, Referer and From. - - - - -Fielding, et. al. Standards Track [Page 141] - -RFC 2068 HTTP/1.1 January 1997 - - - Revealing the specific software version of the server may allow the - server machine to become more vulnerable to attacks against software - that is known to contain security holes. Implementers SHOULD make the - Server header field a configurable option. - - Proxies which serve as a portal through a network firewall SHOULD - take special precautions regarding the transfer of header information - that identifies the hosts behind the firewall. 
In particular, they - SHOULD remove, or replace with sanitized versions, any Via fields - generated behind the firewall. - - The Referer field allows reading patterns to be studied and reverse - links drawn. Although it can be very useful, its power can be abused - if user details are not separated from the information contained in - the Referer. Even when the personal information has been removed, the - Referer field may indicate a private document's URI whose publication - would be inappropriate. - - The information sent in the From field might conflict with the user's - privacy interests or their site's security policy, and hence it - SHOULD NOT be transmitted without the user being able to disable, - enable, and modify the contents of the field. The user MUST be able - to set the contents of this field within a user preference or - application defaults configuration. - - We suggest, though do not require, that a convenient toggle interface - be provided for the user to enable or disable the sending of From and - Referer information. - -15.5 Attacks Based On File and Path Names - - Implementations of HTTP origin servers SHOULD be careful to restrict - the documents returned by HTTP requests to be only those that were - intended by the server administrators. If an HTTP server translates - HTTP URIs directly into file system calls, the server MUST take - special care not to serve files that were not intended to be - delivered to HTTP clients. For example, UNIX, Microsoft Windows, and - other operating systems use ".." as a path component to indicate a - directory level above the current one. On such a system, an HTTP - server MUST disallow any such construct in the Request-URI if it - would otherwise allow access to a resource outside those intended to - be accessible via the HTTP server. 
Similarly, files intended for - reference only internally to the server (such as access control - files, configuration files, and script code) MUST be protected from - inappropriate retrieval, since they might contain sensitive - information. Experience has shown that minor bugs in such HTTP server - implementations have turned into security risks. - - - - -Fielding, et. al. Standards Track [Page 142] - -RFC 2068 HTTP/1.1 January 1997 - - -15.6 Personal Information - - HTTP clients are often privy to large amounts of personal information - (e.g. the user's name, location, mail address, passwords, encryption - keys, etc.), and SHOULD be very careful to prevent unintentional - leakage of this information via the HTTP protocol to other sources. - We very strongly recommend that a convenient interface be provided - for the user to control dissemination of such information, and that - designers and implementers be particularly careful in this area. - History shows that errors in this area are often both serious - security and/or privacy problems, and often generate highly adverse - publicity for the implementer's company. - -15.7 Privacy Issues Connected to Accept Headers - - Accept request-headers can reveal information about the user to all - servers which are accessed. The Accept-Language header in particular - can reveal information the user would consider to be of a private - nature, because the understanding of particular languages is often - strongly correlated to the membership of a particular ethnic group. - User agents which offer the option to configure the contents of an - Accept-Language header to be sent in every request are strongly - encouraged to let the configuration process include a message which - makes the user aware of the loss of privacy involved. 
- - An approach that limits the loss of privacy would be for a user agent - to omit the sending of Accept-Language headers by default, and to ask - the user whether it should start sending Accept-Language headers to a - server if it detects, by looking for any Vary response-header fields - generated by the server, that such sending could improve the quality - of service. - - Elaborate user-customized accept header fields sent in every request, - in particular if these include quality values, can be used by servers - as relatively reliable and long-lived user identifiers. Such user - identifiers would allow content providers to do click-trail tracking, - and would allow collaborating content providers to match cross-server - click-trails or form submissions of individual users. Note that for - many users not behind a proxy, the network address of the host - running the user agent will also serve as a long-lived user - identifier. In environments where proxies are used to enhance - privacy, user agents should be conservative in offering accept header - configuration options to end users. As an extreme privacy measure, - proxies could filter the accept headers in relayed requests. General - purpose user agents which provide a high degree of header - configurability should warn users about the loss of privacy which can - be involved. - - - - -Fielding, et. al. Standards Track [Page 143] - -RFC 2068 HTTP/1.1 January 1997 - - -15.8 DNS Spoofing - - Clients using HTTP rely heavily on the Domain Name Service, and are - thus generally prone to security attacks based on the deliberate - mis-association of IP addresses and DNS names. Clients need to be - cautious in assuming the continuing validity of an IP number/DNS name - association. - - In particular, HTTP clients SHOULD rely on their name resolver for - confirmation of an IP number/DNS name association, rather than - caching the result of previous host name lookups. 
Many platforms - already can cache host name lookups locally when appropriate, and - they SHOULD be configured to do so. These lookups should be cached, - however, only when the TTL (Time To Live) information reported by the - name server makes it likely that the cached information will remain - useful. - - If HTTP clients cache the results of host name lookups in order to - achieve a performance improvement, they MUST observe the TTL - information reported by DNS. - - If HTTP clients do not observe this rule, they could be spoofed when - a previously-accessed server's IP address changes. As network - renumbering is expected to become increasingly common, the - possibility of this form of attack will grow. Observing this - requirement thus reduces this potential security vulnerability. - - This requirement also improves the load-balancing behavior of clients - for replicated servers using the same DNS name and reduces the - likelihood of a user's experiencing failure in accessing sites which - use that strategy. - -15.9 Location Headers and Spoofing - - If a single server supports multiple organizations that do not trust - one another, then it must check the values of Location and Content- - Location headers in responses that are generated under control of - said organizations to make sure that they do not attempt to - invalidate resources over which they have no authority. - -16 Acknowledgments - - This specification makes heavy use of the augmented BNF and generic - constructs defined by David H. Crocker for RFC 822. Similarly, it - reuses many of the definitions provided by Nathaniel Borenstein and - Ned Freed for MIME. We hope that their inclusion in this - specification will help reduce past confusion over the relationship - between HTTP and Internet mail message formats. - - - -Fielding, et. al. Standards Track [Page 144] - -RFC 2068 HTTP/1.1 January 1997 - - - The HTTP protocol has evolved considerably over the past four years. 
- It has benefited from a large and active developer community--the - many people who have participated on the www-talk mailing list--and - it is that community which has been most responsible for the success - of HTTP and of the World-Wide Web in general. Marc Andreessen, Robert - Cailliau, Daniel W. Connolly, Bob Denny, John Franks, Jean-Francois - Groff, Phillip M. Hallam-Baker, Hakon W. Lie, Ari Luotonen, Rob - McCool, Lou Montulli, Dave Raggett, Tony Sanders, and Marc - VanHeyningen deserve special recognition for their efforts in - defining early aspects of the protocol. - - This document has benefited greatly from the comments of all those - participating in the HTTP-WG. In addition to those already mentioned, - the following individuals have contributed to this specification: - - Gary Adams Albert Lunde - Harald Tveit Alvestrand John C. Mallery - Keith Ball Jean-Philippe Martin-Flatin - Brian Behlendorf Larry Masinter - Paul Burchard Mitra - Maurizio Codogno David Morris - Mike Cowlishaw Gavin Nicol - Roman Czyborra Bill Perry - Michael A. Dolan Jeffrey Perry - David J. Fiander Scott Powers - Alan Freier Owen Rees - Marc Hedlund Luigi Rizzo - Greg Herlihy David Robinson - Koen Holtman Marc Salomon - Alex Hopmann Rich Salz - Bob Jernigan Allan M. Schiffman - Shel Kaphan Jim Seidman - Rohit Khare Chuck Shotton - John Klensin Eric W. Sink - Martijn Koster Simon E. Spero - Alexei Kosut Richard N. Taylor - David M. Kristol Robert S. Thau - Daniel LaLiberte Bill (BearHeart) Weinman - Ben Laurie Francois Yergeau - Paul J. Leach Mary Ellen Zurko - Daniel DuBois - - Much of the content and presentation of the caching design is due to - suggestions and comments from individuals including: Shel Kaphan, - Paul Leach, Koen Holtman, David Morris, and Larry Masinter. - - - - - - -Fielding, et. al. 
Standards Track [Page 145] - -RFC 2068 HTTP/1.1 January 1997 - - - Most of the specification of ranges is based on work originally done - by Ari Luotonen and John Franks, with additional input from Steve - Zilles. - - Thanks to the "cave men" of Palo Alto. You know who you are. - - Jim Gettys (the current editor of this document) wishes particularly - to thank Roy Fielding, the previous editor of this document, along - with John Klensin, Jeff Mogul, Paul Leach, Dave Kristol, Koen - Holtman, John Franks, Alex Hopmann, and Larry Masinter for their - help. - -17 References - - [1] Alvestrand, H., "Tags for the identification of languages", RFC - 1766, UNINETT, March 1995. - - [2] Anklesaria, F., McCahill, M., Lindner, P., Johnson, D., Torrey, - D., and B. Alberti. "The Internet Gopher Protocol: (a distributed - document search and retrieval protocol)", RFC 1436, University of - Minnesota, March 1993. - - [3] Berners-Lee, T., "Universal Resource Identifiers in WWW", A - Unifying Syntax for the Expression of Names and Addresses of Objects - on the Network as used in the World-Wide Web", RFC 1630, CERN, June - 1994. - - [4] Berners-Lee, T., Masinter, L., and M. McCahill, "Uniform Resource - Locators (URL)", RFC 1738, CERN, Xerox PARC, University of Minnesota, - December 1994. - - [5] Berners-Lee, T., and D. Connolly, "HyperText Markup Language - Specification - 2.0", RFC 1866, MIT/LCS, November 1995. - - [6] Berners-Lee, T., Fielding, R., and H. Frystyk, "Hypertext - Transfer Protocol -- HTTP/1.0.", RFC 1945 MIT/LCS, UC Irvine, May - 1996. - - [7] Freed, N., and N. Borenstein, "Multipurpose Internet Mail - Extensions (MIME) Part One: Format of Internet Message Bodies", RFC - 2045, Innosoft, First Virtual, November 1996. - - [8] Braden, R., "Requirements for Internet hosts - application and - support", STD 3, RFC 1123, IETF, October 1989. - - [9] Crocker, D., "Standard for the Format of ARPA Internet Text - Messages", STD 11, RFC 822, UDEL, August 1982. 
- - - - -Fielding, et. al. Standards Track [Page 146] - -RFC 2068 HTTP/1.1 January 1997 - - - [10] Davis, F., Kahle, B., Morris, H., Salem, J., Shen, T., Wang, R., - Sui, J., and M. Grinbaum. "WAIS Interface Protocol Prototype - Functional Specification", (v1.5), Thinking Machines Corporation, - April 1990. - - [11] Fielding, R., "Relative Uniform Resource Locators", RFC 1808, UC - Irvine, June 1995. - - [12] Horton, M., and R. Adams. "Standard for interchange of USENET - messages", RFC 1036, AT&T Bell Laboratories, Center for Seismic - Studies, December 1987. - - [13] Kantor, B., and P. Lapsley. "Network News Transfer Protocol." A - Proposed Standard for the Stream-Based Transmission of News", RFC - 977, UC San Diego, UC Berkeley, February 1986. - - [14] Moore, K., "MIME (Multipurpose Internet Mail Extensions) Part - Three: Message Header Extensions for Non-ASCII Text", RFC 2047, - University of Tennessee, November 1996. - - [15] Nebel, E., and L. Masinter. "Form-based File Upload in HTML", - RFC 1867, Xerox Corporation, November 1995. - - [16] Postel, J., "Simple Mail Transfer Protocol", STD 10, RFC 821, - USC/ISI, August 1982. - - [17] Postel, J., "Media Type Registration Procedure", RFC 2048, - USC/ISI, November 1996. - - [18] Postel, J., and J. Reynolds, "File Transfer Protocol (FTP)", STD - 9, RFC 959, USC/ISI, October 1985. - - [19] Reynolds, J., and J. Postel, "Assigned Numbers", STD 2, RFC - 1700, USC/ISI, October 1994. - - [20] Sollins, K., and L. Masinter, "Functional Requirements for - Uniform Resource Names", RFC 1737, MIT/LCS, Xerox Corporation, - December 1994. - - [21] US-ASCII. Coded Character Set - 7-Bit American Standard Code for - Information Interchange. Standard ANSI X3.4-1986, ANSI, 1986. - - [22] ISO-8859. International Standard -- Information Processing -- - 8-bit Single-Byte Coded Graphic Character Sets -- - Part 1: Latin alphabet No. 1, ISO 8859-1:1987. - Part 2: Latin alphabet No. 2, ISO 8859-2, 1987. - Part 3: Latin alphabet No. 
3, ISO 8859-3, 1988. - Part 4: Latin alphabet No. 4, ISO 8859-4, 1988. - - - -Fielding, et. al. Standards Track [Page 147] - -RFC 2068 HTTP/1.1 January 1997 - - - Part 5: Latin/Cyrillic alphabet, ISO 8859-5, 1988. - Part 6: Latin/Arabic alphabet, ISO 8859-6, 1987. - Part 7: Latin/Greek alphabet, ISO 8859-7, 1987. - Part 8: Latin/Hebrew alphabet, ISO 8859-8, 1988. - Part 9: Latin alphabet No. 5, ISO 8859-9, 1990. - - [23] Meyers, J., and M. Rose "The Content-MD5 Header Field", RFC - 1864, Carnegie Mellon, Dover Beach Consulting, October, 1995. - - [24] Carpenter, B., and Y. Rekhter, "Renumbering Needs Work", RFC - 1900, IAB, February 1996. - - [25] Deutsch, P., "GZIP file format specification version 4.3." RFC - 1952, Aladdin Enterprises, May 1996. - - [26] Venkata N. Padmanabhan and Jeffrey C. Mogul. Improving HTTP - Latency. Computer Networks and ISDN Systems, v. 28, pp. 25-35, Dec. - 1995. Slightly revised version of paper in Proc. 2nd International - WWW Conf. '94: Mosaic and the Web, Oct. 1994, which is available at - http://www.ncsa.uiuc.edu/SDG/IT94/Proceedings/DDay/mogul/ - HTTPLatency.html. - - [27] Joe Touch, John Heidemann, and Katia Obraczka, "Analysis of HTTP - Performance", <URL: http://www.isi.edu/lsam/ib/http-perf/>, - USC/Information Sciences Institute, June 1996 - - [28] Mills, D., "Network Time Protocol, Version 3, Specification, - Implementation and Analysis", RFC 1305, University of Delaware, March - 1992. - - [29] Deutsch, P., "DEFLATE Compressed Data Format Specification - version 1.3." RFC 1951, Aladdin Enterprises, May 1996. - - [30] Spero, S., "Analysis of HTTP Performance Problems" - <URL:http://sunsite.unc.edu/mdma-release/http-prob.html>. - - [31] Deutsch, P., and J-L. Gailly, "ZLIB Compressed Data Format - Specification version 3.3", RFC 1950, Aladdin Enterprises, Info-ZIP, - May 1996. - - [32] Franks, J., Hallam-Baker, P., Hostetler, J., Leach, P., - Luotonen, A., Sink, E., and L. 
Stewart, "An Extension to HTTP : - Digest Access Authentication", RFC 2069, January 1997. - - - - - - - - -Fielding, et. al. Standards Track [Page 148] - -RFC 2068 HTTP/1.1 January 1997 - - -18 Authors' Addresses - - Roy T. Fielding - Department of Information and Computer Science - University of California - Irvine, CA 92717-3425, USA - - Fax: +1 (714) 824-4056 - EMail: fielding@ics.uci.edu - - - Jim Gettys - MIT Laboratory for Computer Science - 545 Technology Square - Cambridge, MA 02139, USA - - Fax: +1 (617) 258 8682 - EMail: jg@w3.org - - - Jeffrey C. Mogul - Western Research Laboratory - Digital Equipment Corporation - 250 University Avenue - Palo Alto, California, 94305, USA - - EMail: mogul@wrl.dec.com - - - Henrik Frystyk Nielsen - W3 Consortium - MIT Laboratory for Computer Science - 545 Technology Square - Cambridge, MA 02139, USA - - Fax: +1 (617) 258 8682 - EMail: frystyk@w3.org - - - Tim Berners-Lee - Director, W3 Consortium - MIT Laboratory for Computer Science - 545 Technology Square - Cambridge, MA 02139, USA - - Fax: +1 (617) 258 8682 - EMail: timbl@w3.org - - - - -Fielding, et. al. Standards Track [Page 149] - -RFC 2068 HTTP/1.1 January 1997 - - -19 Appendices - -19.1 Internet Media Type message/http - - In addition to defining the HTTP/1.1 protocol, this document serves - as the specification for the Internet media type "message/http". The - following is to be registered with IANA. - - Media Type name: message - Media subtype name: http - Required parameters: none - Optional parameters: version, msgtype - - version: The HTTP-Version number of the enclosed message - (e.g., "1.1"). If not present, the version can be - determined from the first line of the body. - - msgtype: The message type -- "request" or "response". If not - present, the type can be determined from the first - line of the body. 
- - Encoding considerations: only "7bit", "8bit", or "binary" are - permitted - - Security considerations: none - -19.2 Internet Media Type multipart/byteranges - - When an HTTP message includes the content of multiple ranges (for - example, a response to a request for multiple non-overlapping - ranges), these are transmitted as a multipart MIME message. The - multipart media type for this purpose is called - "multipart/byteranges". - - The multipart/byteranges media type includes two or more parts, each - with its own Content-Type and Content-Range fields. The parts are - separated using a MIME boundary parameter. - - Media Type name: multipart - Media subtype name: byteranges - Required parameters: boundary - Optional parameters: none - - Encoding considerations: only "7bit", "8bit", or "binary" are - permitted - - Security considerations: none - - - - -Fielding, et. al. Standards Track [Page 150] - -RFC 2068 HTTP/1.1 January 1997 - - -For example: - - HTTP/1.1 206 Partial content - Date: Wed, 15 Nov 1995 06:25:24 GMT - Last-modified: Wed, 15 Nov 1995 04:58:08 GMT - Content-type: multipart/byteranges; boundary=THIS_STRING_SEPARATES - - --THIS_STRING_SEPARATES - Content-type: application/pdf - Content-range: bytes 500-999/8000 - - ...the first range... - --THIS_STRING_SEPARATES - Content-type: application/pdf - Content-range: bytes 7000-7999/8000 - - ...the second range - --THIS_STRING_SEPARATES-- - -19.3 Tolerant Applications - - Although this document specifies the requirements for the generation - of HTTP/1.1 messages, not all applications will be correct in their - implementation. We therefore recommend that operational applications - be tolerant of deviations whenever those deviations can be - interpreted unambiguously. - - Clients SHOULD be tolerant in parsing the Status-Line and servers - tolerant when parsing the Request-Line. In particular, they SHOULD - accept any amount of SP or HT characters between fields, even though - only a single SP is required. 
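The tolerance recommended above for the Status-Line might be sketched in Python as follows. This is an illustration under stated assumptions rather than a reference implementation: `StatusLine` and `parse_status_line` are hypothetical names, and accepting a missing Reason-Phrase is an extra leniency of this sketch that the text does not itself require.

```python
import re
from typing import NamedTuple


class StatusLine(NamedTuple):
    version: str   # e.g. "HTTP/1.1"
    code: int      # three-digit Status-Code
    reason: str    # Reason-Phrase, possibly empty


# Fields are separated by one or more SP or HT characters, although a
# strictly conforming sender emits exactly one SP.
_STATUS_LINE = re.compile(r"^(HTTP/\d+\.\d+)[ \t]+(\d{3})(?:[ \t]+(.*))?$")


def parse_status_line(line: str) -> StatusLine:
    """Parse a Status-Line, tolerating extra SP/HT between fields."""
    m = _STATUS_LINE.match(line.rstrip("\r\n"))
    if m is None:
        raise ValueError(f"malformed Status-Line: {line!r}")
    return StatusLine(m.group(1), int(m.group(2)), m.group(3) or "")
```

A server could apply the same idea to the Request-Line by splitting on runs of SP or HT instead of on a single SP.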
- - The line terminator for message-header fields is the sequence CRLF. - However, we recommend that applications, when parsing such headers, - recognize a single LF as a line terminator and ignore the leading CR. - - The character set of an entity-body should be labeled as the lowest - common denominator of the character codes used within that body, with - the exception that no label is preferred over the labels US-ASCII or - ISO-8859-1. - - Additional rules for requirements on parsing and encoding of dates - and other potential problems with date encodings include: - - o HTTP/1.1 clients and caches should assume that an RFC-850 date - which appears to be more than 50 years in the future is in fact - in the past (this helps solve the "year 2000" problem). - - - - -Fielding, et. al. Standards Track [Page 151] - -RFC 2068 HTTP/1.1 January 1997 - - - o An HTTP/1.1 implementation may internally represent a parsed - Expires date as earlier than the proper value, but MUST NOT - internally represent a parsed Expires date as later than the - proper value. - - o All expiration-related calculations must be done in GMT. The - local time zone MUST NOT influence the calculation or comparison - of an age or expiration time. - - o If an HTTP header incorrectly carries a date value with a time - zone other than GMT, it must be converted into GMT using the most - conservative possible conversion. - -19.4 Differences Between HTTP Entities and MIME Entities - - HTTP/1.1 uses many of the constructs defined for Internet Mail (RFC - 822) and the Multipurpose Internet Mail Extensions (MIME ) to allow - entities to be transmitted in an open variety of representations and - with extensible mechanisms. However, MIME [7] discusses mail, and - HTTP has a few features that are different from those described in - MIME. 
These differences were carefully chosen to optimize - performance over binary connections, to allow greater freedom in the - use of new media types, to make date comparisons easier, and to - acknowledge the practice of some early HTTP servers and clients. - - This appendix describes specific areas where HTTP differs from MIME. - Proxies and gateways to strict MIME environments SHOULD be aware of - these differences and provide the appropriate conversions where - necessary. Proxies and gateways from MIME environments to HTTP also - need to be aware of the differences because some conversions may be - required. - -19.4.1 Conversion to Canonical Form - - MIME requires that an Internet mail entity be converted to canonical - form prior to being transferred. Section 3.7.1 of this document - describes the forms allowed for subtypes of the "text" media type - when transmitted over HTTP. MIME requires that content with a type of - "text" represent line breaks as CRLF and forbids the use of CR or LF - outside of line break sequences. HTTP allows CRLF, bare CR, and bare - LF to indicate a line break within text content when a message is - transmitted over HTTP. - - Where it is possible, a proxy or gateway from HTTP to a strict MIME - environment SHOULD translate all line breaks within the text media - types described in section 3.7.1 of this document to the MIME - canonical form of CRLF. Note, however, that this may be complicated - by the presence of a Content-Encoding and by the fact that HTTP - - - -Fielding, et. al. Standards Track [Page 152] - -RFC 2068 HTTP/1.1 January 1997 - - - allows the use of some character sets which do not use octets 13 and - 10 to represent CR and LF, as is the case for some multi-byte - character sets. - -19.4.2 Conversion of Date Formats - - HTTP/1.1 uses a restricted set of date formats (section 3.3.1) to - simplify the process of date comparison. 
Proxies and gateways from - other protocols SHOULD ensure that any Date header field present in a - message conforms to one of the HTTP/1.1 formats and rewrite the date - if necessary. - -19.4.3 Introduction of Content-Encoding - - MIME does not include any concept equivalent to HTTP/1.1's Content- - Encoding header field. Since this acts as a modifier on the media - type, proxies and gateways from HTTP to MIME-compliant protocols MUST - either change the value of the Content-Type header field or decode - the entity-body before forwarding the message. (Some experimental - applications of Content-Type for Internet mail have used a media-type - parameter of ";conversions=<content-coding>" to perform an equivalent - function as Content-Encoding. However, this parameter is not part of - MIME.) - -19.4.4 No Content-Transfer-Encoding - - HTTP does not use the Content-Transfer-Encoding (CTE) field of MIME. - Proxies and gateways from MIME-compliant protocols to HTTP MUST - remove any non-identity CTE ("quoted-printable" or "base64") encoding - prior to delivering the response message to an HTTP client. - - Proxies and gateways from HTTP to MIME-compliant protocols are - responsible for ensuring that the message is in the correct format - and encoding for safe transport on that protocol, where "safe - transport" is defined by the limitations of the protocol being used. - Such a proxy or gateway SHOULD label the data with an appropriate - Content-Transfer-Encoding if doing so will improve the likelihood of - safe transport over the destination protocol. - -19.4.5 HTTP Header Fields in Multipart Body-Parts - - In MIME, most header fields in multipart body-parts are generally - ignored unless the field name begins with "Content-". In HTTP/1.1, - multipart body-parts may contain any HTTP header fields which are - significant to the meaning of that part. - - - - - - -Fielding, et. al. 
Standards Track [Page 153] - -RFC 2068 HTTP/1.1 January 1997 - - -19.4.6 Introduction of Transfer-Encoding - - HTTP/1.1 introduces the Transfer-Encoding header field (section - 14.40). Proxies/gateways MUST remove any transfer coding prior to - forwarding a message via a MIME-compliant protocol. - - A process for decoding the "chunked" transfer coding (section 3.6) - can be represented in pseudo-code as: - - length := 0 - read chunk-size, chunk-ext (if any) and CRLF - while (chunk-size > 0) { - read chunk-data and CRLF - append chunk-data to entity-body - length := length + chunk-size - read chunk-size and CRLF - } - read entity-header - while (entity-header not empty) { - append entity-header to existing header fields - read entity-header - } - Content-Length := length - Remove "chunked" from Transfer-Encoding - -19.4.7 MIME-Version - - HTTP is not a MIME-compliant protocol (see appendix 19.4). However, - HTTP/1.1 messages may include a single MIME-Version general-header - field to indicate what version of the MIME protocol was used to - construct the message. Use of the MIME-Version header field indicates - that the message is in full compliance with the MIME protocol. - Proxies/gateways are responsible for ensuring full compliance (where - possible) when exporting HTTP messages to strict MIME environments. - - MIME-Version = "MIME-Version" ":" 1*DIGIT "." 1*DIGIT - - MIME version "1.0" is the default for use in HTTP/1.1. However, - HTTP/1.1 message parsing and semantics are defined by this document - and not the MIME specification. - -19.5 Changes from HTTP/1.0 - - This section summarizes major differences between versions HTTP/1.0 - and HTTP/1.1. - - - - - - -Fielding, et. al. 
Standards Track [Page 154] - -RFC 2068 HTTP/1.1 January 1997 - - -19.5.1 Changes to Simplify Multi-homed Web Servers and Conserve IP - Addresses - - The requirements that clients and servers support the Host request- - header, report an error if the Host request-header (section 14.23) is - missing from an HTTP/1.1 request, and accept absolute URIs (section - 5.1.2) are among the most important changes defined by this - specification. - - Older HTTP/1.0 clients assumed a one-to-one relationship of IP - addresses and servers; there was no other established mechanism for - distinguishing the intended server of a request than the IP address - to which that request was directed. The changes outlined above will - allow the Internet, once older HTTP clients are no longer common, to - support multiple Web sites from a single IP address, greatly - simplifying large operational Web servers, where allocation of many - IP addresses to a single host has created serious problems. The - Internet will also be able to recover the IP addresses that have been - allocated for the sole purpose of allowing special-purpose domain - names to be used in root-level HTTP URLs. Given the rate of growth of - the Web, and the number of servers already deployed, it is extremely - important that all implementations of HTTP (including updates to - existing HTTP/1.0 applications) correctly implement these - requirements: - - o Both clients and servers MUST support the Host request-header. - - o Host request-headers are required in HTTP/1.1 requests. - - o Servers MUST report a 400 (Bad Request) error if an HTTP/1.1 - request does not include a Host request-header. - - o Servers MUST accept absolute URIs. - - - - - - - - - - - - - - - - - - -Fielding, et. al. 
Standards Track [Page 155] - -RFC 2068 HTTP/1.1 January 1997 - - -19.6 Additional Features - - This appendix documents protocol elements used by some existing HTTP - implementations, but not consistently and correctly across most - HTTP/1.1 applications. Implementers should be aware of these - features, but cannot rely upon their presence in, or interoperability - with, other HTTP/1.1 applications. Some of these describe proposed - experimental features, and some describe features that experimental - deployment found lacking that are now addressed in the base HTTP/1.1 - specification. - -19.6.1 Additional Request Methods - -19.6.1.1 PATCH - - The PATCH method is similar to PUT except that the entity contains a - list of differences between the original version of the resource - identified by the Request-URI and the desired content of the resource - after the PATCH action has been applied. The list of differences is - in a format defined by the media type of the entity (e.g., - "application/diff") and MUST include sufficient information to allow - the server to recreate the changes necessary to convert the original - version of the resource to the desired version. - - If the request passes through a cache and the Request-URI identifies - a currently cached entity, that entity MUST be removed from the - cache. Responses to this method are not cachable. - - The actual method for determining how the patched resource is placed, - and what happens to its predecessor, is defined entirely by the - origin server. If the original version of the resource being patched - included a Content-Version header field, the request entity MUST - include a Derived-From header field corresponding to the value of the - original Content-Version header field. Applications are encouraged to - use these fields for constructing versioning relationships and - resolving version conflicts. - - PATCH requests must obey the message transmission requirements set - out in section 8.2. 
- - Caches that implement PATCH should invalidate cached responses as - defined in section 13.10 for PUT. - -19.6.1.2 LINK - - The LINK method establishes one or more Link relationships between - the existing resource identified by the Request-URI and other - existing resources. The difference between LINK and other methods - - - -Fielding, et. al. Standards Track [Page 156] - -RFC 2068 HTTP/1.1 January 1997 - - - allowing links to be established between resources is that the LINK - method does not allow any message-body to be sent in the request and - does not directly result in the creation of new resources. - - If the request passes through a cache and the Request-URI identifies - a currently cached entity, that entity MUST be removed from the - cache. Responses to this method are not cachable. - - Caches that implement LINK should invalidate cached responses as - defined in section 13.10 for PUT. - -19.6.1.3 UNLINK - - The UNLINK method removes one or more Link relationships from the - existing resource identified by the Request-URI. These relationships - may have been established using the LINK method or by any other - method supporting the Link header. The removal of a link to a - resource does not imply that the resource ceases to exist or becomes - inaccessible for future references. - - If the request passes through a cache and the Request-URI identifies - a currently cached entity, that entity MUST be removed from the - cache. Responses to this method are not cachable. - - Caches that implement UNLINK should invalidate cached responses as - defined in section 13.10 for PUT. 
- -19.6.2 Additional Header Field Definitions - -19.6.2.1 Alternates - - The Alternates response-header field has been proposed as a means for - the origin server to inform the client about other available - representations of the requested resource, along with their - distinguishing attributes, and thus providing a more reliable means - for a user agent to perform subsequent selection of another - representation which better fits the desires of its user (described - as agent-driven negotiation in section 12). - - - - - - - - - - - - - -Fielding, et. al. Standards Track [Page 157] - -RFC 2068 HTTP/1.1 January 1997 - - - The Alternates header field is orthogonal to the Vary header field in - that both may coexist in a message without affecting the - interpretation of the response or the available representations. It - is expected that Alternates will provide a significant improvement - over the server-driven negotiation provided by the Vary field for - those resources that vary over common dimensions like type and - language. - - The Alternates header field will be defined in a future - specification. - -19.6.2.2 Content-Version - - The Content-Version entity-header field defines the version tag - associated with a rendition of an evolving entity. Together with the - Derived-From field described in section 19.6.2.3, it allows a group - of people to work simultaneously on the creation of a work as an - iterative process. The field should be used to allow evolution of a - particular work along a single path rather than derived works or - renditions in different representations. 
- - Content-Version = "Content-Version" ":" quoted-string - - Examples of the Content-Version field include: - - Content-Version: "2.1.2" - Content-Version: "Fred 19950116-12:26:48" - Content-Version: "2.5a4-omega7" - -19.6.2.3 Derived-From - - The Derived-From entity-header field can be used to indicate the - version tag of the resource from which the enclosed entity was - derived before modifications were made by the sender. This field is - used to help manage the process of merging successive changes to a - resource, particularly when such changes are being made in parallel - and from multiple sources. - - Derived-From = "Derived-From" ":" quoted-string - - An example use of the field is: - - Derived-From: "2.1.1" - - The Derived-From field is required for PUT and PATCH requests if the - entity being sent was previously retrieved from the same URI and a - Content-Version header was included with the entity when it was last - retrieved. - - - -Fielding, et. al. Standards Track [Page 158] - -RFC 2068 HTTP/1.1 January 1997 - - -19.6.2.4 Link - - The Link entity-header field provides a means for describing a - relationship between two resources, generally between the requested - resource and some other resource. An entity MAY include multiple Link - values. Links at the metainformation level typically indicate - relationships like hierarchical structure and navigation paths. The - Link field is semantically equivalent to the <LINK> element in - HTML.[5] - - Link = "Link" ":" #("<" URI ">" *( ";" link-param ) - - link-param = ( ( "rel" "=" relationship ) - | ( "rev" "=" relationship ) - | ( "title" "=" quoted-string ) - | ( "anchor" "=" <"> URI <"> ) - | ( link-extension ) ) - - link-extension = token [ "=" ( token | quoted-string ) ] - - relationship = sgml-name - | ( <"> sgml-name *( SP sgml-name) <"> ) - - sgml-name = ALPHA *( ALPHA | DIGIT | "." 
| "-" ) - - Relationship values are case-insensitive and MAY be extended within - the constraints of the sgml-name syntax. The title parameter MAY be - used to label the destination of a link such that it can be used as - identification within a human-readable menu. The anchor parameter MAY - be used to indicate a source anchor other than the entire current - resource, such as a fragment of this resource or a third resource. - - Examples of usage include: - - Link: <http://www.cern.ch/TheBook/chapter2>; rel="Previous" - - Link: <mailto:timbl@w3.org>; rev="Made"; title="Tim Berners-Lee" - - The first example indicates that chapter2 is previous to this - resource in a logical navigation path. The second indicates that the - person responsible for making the resource available is identified by - the given e-mail address. - -19.6.2.5 URI - - The URI header field has, in past versions of this specification, - been used as a combination of the existing Location, Content- - Location, and Vary header fields as well as the future Alternates - - - -Fielding, et. al. Standards Track [Page 159] - -RFC 2068 HTTP/1.1 January 1997 - - - field (above). Its primary purpose has been to include a list of - additional URIs for the resource, including names and mirror - locations. However, it has become clear that the combination of many - different functions within this single field has been a barrier to - consistently and correctly implementing any of those functions. - Furthermore, we believe that the identification of names and mirror - locations would be better performed via the Link header field. The - URI header field is therefore deprecated in favor of those other - fields. - - URI-header = "URI" ":" 1#( "<" URI ">" ) - -19.7 Compatibility with Previous Versions - - It is beyond the scope of a protocol specification to mandate - compliance with previous versions. HTTP/1.1 was deliberately - designed, however, to make supporting previous versions easy. 
It is - worth noting that at the time of composing this specification, we - would expect commercial HTTP/1.1 servers to: - - o recognize the format of the Request-Line for HTTP/0.9, 1.0, and 1.1 - requests; - - o understand any valid request in the format of HTTP/0.9, 1.0, or - 1.1; - - o respond appropriately with a message in the same major version used - by the client. - - And we would expect HTTP/1.1 clients to: - - o recognize the format of the Status-Line for HTTP/1.0 and 1.1 - responses; - - o understand any valid response in the format of HTTP/0.9, 1.0, or - 1.1. - - For most implementations of HTTP/1.0, each connection is established - by the client prior to the request and closed by the server after - sending the response. A few implementations implement the Keep-Alive - version of persistent connections described in section 19.7.1.1. - - - - - - - - - - -Fielding, et. al. Standards Track [Page 160] - -RFC 2068 HTTP/1.1 January 1997 - - -19.7.1 Compatibility with HTTP/1.0 Persistent Connections - - Some clients and servers may wish to be compatible with some previous - implementations of persistent connections in HTTP/1.0 clients and - servers. Persistent connections in HTTP/1.0 must be explicitly - negotiated as they are not the default behavior. HTTP/1.0 - experimental implementations of persistent connections are faulty, - and the new facilities in HTTP/1.1 are designed to rectify these - problems. The problem was that some existing 1.0 clients may be - sending Keep-Alive to a proxy server that doesn't understand - Connection, which would then erroneously forward it to the next - inbound server, which would establish the Keep-Alive connection and - result in a hung HTTP/1.0 proxy waiting for the close on the - response. The result is that HTTP/1.0 clients must be prevented from - using Keep-Alive when talking to proxies. - - However, talking to proxies is the most important use of persistent - connections, so that prohibition is clearly unacceptable. 
Therefore, - we need some other mechanism for indicating a persistent connection - is desired, which is safe to use even when talking to an old proxy - that ignores Connection. Persistent connections are the default for - HTTP/1.1 messages; we introduce a new keyword (Connection: close) for - declaring non-persistence. - - The following describes the original HTTP/1.0 form of persistent - connections. - - When it connects to an origin server, an HTTP client MAY send the - Keep-Alive connection-token in addition to the Persist connection- - token: - - Connection: Keep-Alive - - An HTTP/1.0 server would then respond with the Keep-Alive connection - token and the client may proceed with an HTTP/1.0 (or Keep-Alive) - persistent connection. - - An HTTP/1.1 server may also establish persistent connections with - HTTP/1.0 clients upon receipt of a Keep-Alive connection token. - However, a persistent connection with an HTTP/1.0 client cannot make - use of the chunked transfer-coding, and therefore MUST use a - Content-Length for marking the ending boundary of each message. - - A client MUST NOT send the Keep-Alive connection token to a proxy - server as HTTP/1.0 proxy servers do not obey the rules of HTTP/1.1 - for parsing the Connection header field. - - - - - -Fielding, et. al. Standards Track [Page 161] - -RFC 2068 HTTP/1.1 January 1997 - - -19.7.1.1 The Keep-Alive Header - - When the Keep-Alive connection-token has been transmitted with a - request or a response, a Keep-Alive header field MAY also be - included. The Keep-Alive header field takes the following form: - - Keep-Alive-header = "Keep-Alive" ":" 0# keepalive-param - - keepalive-param = param-name "=" value - - The Keep-Alive header itself is optional, and is used only if a - parameter is being sent. HTTP/1.1 does not define any parameters. - - If the Keep-Alive header is sent, the corresponding connection token - MUST be transmitted. 
The Keep-Alive header MUST be ignored if - received without the connection token. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -Fielding, et. al. Standards Track [Page 162] - diff --git a/docs/specs/rfc2109.txt b/docs/specs/rfc2109.txt deleted file mode 100644 index 432fdcc6..00000000 --- a/docs/specs/rfc2109.txt +++ /dev/null @@ -1,1179 +0,0 @@ - - - - - - -Network Working Group D. Kristol -Request for Comments: 2109 Bell Laboratories, Lucent Technologies -Category: Standards Track L. Montulli - Netscape Communications - February 1997 - - - HTTP State Management Mechanism - -Status of this Memo - - This document specifies an Internet standards track protocol for the - Internet community, and requests discussion and suggestions for - improvements. Please refer to the current edition of the "Internet - Official Protocol Standards" (STD 1) for the standardization state - and status of this protocol. Distribution of this memo is unlimited. - -1. ABSTRACT - - This document specifies a way to create a stateful session with HTTP - requests and responses. It describes two new headers, Cookie and - Set-Cookie, which carry state information between participating - origin servers and user agents. The method described here differs - from Netscape's Cookie proposal, but it can interoperate with - HTTP/1.0 user agents that use Netscape's method. (See the HISTORICAL - section.) - -2. TERMINOLOGY - - The terms user agent, client, server, proxy, and origin server have - the same meaning as in the HTTP/1.0 specification. - - Fully-qualified host name (FQHN) means either the fully-qualified - domain name (FQDN) of a host (i.e., a completely specified domain - name ending in a top-level domain such as .com or .uk), or the - numeric Internet Protocol (IP) address of a host. The fully - qualified domain name is preferred; use of numeric IP addresses is - strongly discouraged. 
- - The terms request-host and request-URI refer to the values the client - would send to the server as, respectively, the host (but not port) - and abs_path portions of the absoluteURI (http_URL) of the HTTP - request line. Note that request-host must be a FQHN. - - - - - - - - -Kristol & Montulli Standards Track [Page 1] - -RFC 2109 HTTP State Management Mechanism February 1997 - - - Host names can be specified either as an IP address or a FQHN - string. Sometimes we compare one host name with another. Host A's - name domain-matches host B's if - - * both host names are IP addresses and their host name strings match - exactly; or - - * both host names are FQDN strings and their host name strings match - exactly; or - - * A is a FQDN string and has the form NB, where N is a non-empty name - string, B has the form .B', and B' is a FQDN string. (So, x.y.com - domain-matches .y.com but not y.com.) - - Note that domain-match is not a commutative operation: a.b.c.com - domain-matches .c.com, but not the reverse. - - Because it was used in Netscape's original implementation of state - management, we will use the term cookie to refer to the state - information that passes between an origin server and user agent, and - that gets stored by the user agent. - -3. STATE AND SESSIONS - - This document describes a way to create stateful sessions with HTTP - requests and responses. Currently, HTTP servers respond to each - client request without relating that request to previous or - subsequent requests; the technique allows clients and servers that - wish to exchange state information to place HTTP requests and - responses within a larger context, which we term a "session". This - context might be used to create, for example, a "shopping cart", in - which user selections can be aggregated before purchase, or a - magazine browsing system, in which a user's previous reading affects - which offerings are presented.
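The domain-match rule defined in the TERMINOLOGY section above can be sketched in a few lines of Python. This is an illustrative reading of the three bullet points, not a reference implementation: the helper names and the dotted-quad test for "is an IP address" are assumptions made for the example.

```python
import re

# Assumption: an "IP address" is recognised by a simple dotted-quad check.
_IPV4 = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")


def is_ip_address(host: str) -> bool:
    return _IPV4.match(host) is not None


def domain_match(a: str, b: str) -> bool:
    """Return True if host name A domain-matches host name B."""
    # Rules 1 and 2: two IP addresses, or two FQDN strings, match
    # when their host name strings are identical.
    if a == b:
        return True
    # Rule 3: A has the form N + B, where N is non-empty and B is ".B'".
    if is_ip_address(a) or not b.startswith(".") or len(b) < 2:
        return False
    return a.endswith(b) and len(a) > len(b)
```

Swapping the arguments shows why the operation is not commutative: domain_match("a.b.c.com", ".c.com") holds, while domain_match(".c.com", "a.b.c.com") does not.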
- - There are, of course, many different potential contexts and thus many - different potential types of session. The designers' paradigm for - sessions created by the exchange of cookies has these key attributes: - - 1. Each session has a beginning and an end. - - 2. Each session is relatively short-lived. - - 3. Either the user agent or the origin server may terminate a - session. - - 4. The session is implicit in the exchange of state information. - - - - -Kristol & Montulli Standards Track [Page 2] - -RFC 2109 HTTP State Management Mechanism February 1997 - - -4. OUTLINE - - We outline here a way for an origin server to send state information - to the user agent, and for the user agent to return the state - information to the origin server. The goal is to have a minimal - impact on HTTP and user agents. Only origin servers that need to - maintain sessions would suffer any significant impact, and that - impact can largely be confined to Common Gateway Interface (CGI) - programs, unless the server provides more sophisticated state - management support. (See Implementation Considerations, below.) - -4.1 Syntax: General - - The two state management headers, Set-Cookie and Cookie, have common - syntactic properties involving attribute-value pairs. The following - grammar uses the notation, and tokens DIGIT (decimal digits) and - token (informally, a sequence of non-special, non-white space - characters) from the HTTP/1.1 specification [RFC 2068] to describe - their syntax. - - av-pairs = av-pair *(";" av-pair) - av-pair = attr ["=" value] ; optional value - attr = token - value = word - word = token | quoted-string - - Attributes (names) (attr) are case-insensitive. White space is - permitted between tokens. Note that while the above syntax - description shows value as optional, most attrs require them. - - NOTE: The syntax above allows whitespace between the attribute and - the = sign. 
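The av-pairs grammar above can be read as a small parser. The following Python sketch is one possible interpretation, offered for illustration only: the function name parse_av_pairs is hypothetical, backslash escapes inside a quoted-string are left undecoded, and an attribute without a value is reported as None.

```python
def parse_av_pairs(text: str) -> list:
    """Parse 'attr[=value]; attr[=value]; ...' into (attr, value) tuples."""
    pairs = []
    i, n = 0, len(text)
    while i < n:
        # Scan to the next ";" that is not inside a quoted-string.
        start, in_quotes = i, False
        while i < n and (in_quotes or text[i] != ";"):
            if text[i] == "\\" and in_quotes:
                i += 1                    # skip the escaped character
            elif text[i] == '"':
                in_quotes = not in_quotes
            i += 1
        segment = text[start:i].strip()   # white space is permitted between tokens
        i += 1                            # step over the ";"
        if not segment:
            continue
        attr, sep, value = segment.partition("=")
        attr = attr.strip().lower()       # attribute names are case-insensitive
        if not sep:
            pairs.append((attr, None))    # av-pair with the optional value omitted
            continue
        value = value.strip()
        if len(value) >= 2 and value[0] == value[-1] == '"':
            value = value[1:-1]           # drop the quotes of a quoted-string
        pairs.append((attr, value))
    return pairs
```

Lower-casing the attribute name is one way to honour the case-insensitivity stated above; an implementation that needs the original spelling could keep both forms.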
- -4.2 Origin Server Role - -4.2.1 General - - The origin server initiates a session, if it so desires. (Note that - "session" here does not refer to a persistent network connection but - to a logical session created from HTTP requests and responses. The - presence or absence of a persistent connection should have no effect - on the use of cookie-derived sessions). To initiate a session, the - origin server returns an extra response header to the client, Set- - Cookie. (The details follow later.) - - A user agent returns a Cookie request header (see below) to the - origin server if it chooses to continue a session. The origin server - may ignore it or use it to determine the current state of the - - - -Kristol & Montulli Standards Track [Page 3] - -RFC 2109 HTTP State Management Mechanism February 1997 - - - session. It may send back to the client a Set-Cookie response header - with the same or different information, or it may send no Set-Cookie - header at all. The origin server effectively ends a session by - sending the client a Set-Cookie header with Max-Age=0. - - Servers may return Set-Cookie response headers with any response. - User agents should send Cookie request headers, subject to other - rules detailed below, with every request. - - An origin server may include multiple Set-Cookie headers in a - response. Note that an intervening gateway could fold multiple such - headers into a single header. - -4.2.2 Set-Cookie Syntax - - The syntax for the Set-Cookie response header is - - set-cookie = "Set-Cookie:" cookies - cookies = 1#cookie - cookie = NAME "=" VALUE *(";" cookie-av) - NAME = attr - VALUE = value - cookie-av = "Comment" "=" value - | "Domain" "=" value - | "Max-Age" "=" value - | "Path" "=" value - | "Secure" - | "Version" "=" 1*DIGIT - - Informally, the Set-Cookie response header comprises the token Set- - Cookie:, followed by a comma-separated list of one or more cookies.
- Each cookie begins with a NAME=VALUE pair, followed by zero or more - semi-colon-separated attribute-value pairs. The syntax for - attribute-value pairs was shown earlier. The specific attributes and - the semantics of their values follow. The NAME=VALUE attribute- - value pair must come first in each cookie. The others, if present, - can occur in any order. If an attribute appears more than once in a - cookie, the behavior is undefined. - - NAME=VALUE - Required. The name of the state information ("cookie") is NAME, - and its value is VALUE. NAMEs that begin with $ are reserved for - other uses and must not be used by applications. - - - - - - - - -Kristol & Montulli Standards Track [Page 4] - -RFC 2109 HTTP State Management Mechanism February 1997 - - - The VALUE is opaque to the user agent and may be anything the - origin server chooses to send, possibly in a server-selected - printable ASCII encoding. "Opaque" implies that the content is of - interest and relevance only to the origin server. The content - may, in fact, be readable by anyone that examines the Set-Cookie - header. - - Comment=comment - Optional. Because cookies can contain private information about a - user, the Comment attribute allows an origin server to document its - intended use of a cookie. The user can inspect the information to - decide whether to initiate or continue a session with this cookie. - - Domain=domain - Optional. The Domain attribute specifies the domain for which the - cookie is valid. An explicitly specified domain must always start - with a dot. - - Max-Age=delta-seconds - Optional. The Max-Age attribute defines the lifetime of the - cookie, in seconds. The delta-seconds value is a decimal non- - negative integer. After delta-seconds seconds elapse, the client - should discard the cookie. A value of zero means the cookie - should be discarded immediately. - - Path=path - Optional. The Path attribute specifies the subset of URLs to - which this cookie applies.
- - Secure - Optional. The Secure attribute (with no value) directs the user - agent to use only (unspecified) secure means to contact the origin - server whenever it sends back this cookie. - - The user agent (possibly under the user's control) may determine - what level of security it considers appropriate for "secure" - cookies. The Secure attribute should be considered security - advice from the server to the user agent, indicating that it is in - the session's interest to protect the cookie contents. - - Version=version - Required. The Version attribute, a decimal integer, identifies to - which version of the state management specification the cookie - conforms. For this specification, Version=1 applies. - - - - - - - -Kristol & Montulli Standards Track [Page 5] - -RFC 2109 HTTP State Management Mechanism February 1997 - - -4.2.3 Controlling Caching - - An origin server must be cognizant of the effect of possible caching - of both the returned resource and the Set-Cookie header. Caching - "public" documents is desirable. For example, if the origin server - wants to use a public document such as a "front door" page as a - sentinel to indicate the beginning of a session for which a Set- - Cookie response header must be generated, the page should be stored - in caches "pre-expired" so that the origin server will see further - requests. "Private documents", for example those that contain - information strictly private to a session, should not be cached in - shared caches. - - If the cookie is intended for use by a single user, the Set-cookie - header should not be cached. A Set-cookie header that is intended to - be shared by multiple users may be cached. - - The origin server should send the following additional HTTP/1.1 - response headers, depending on circumstances: - - * To suppress caching of the Set-Cookie header: Cache-control: no- - cache="set-cookie". 
- - and one of the following: - - * To suppress caching of a private document in shared caches: Cache- - control: private. - - * To allow caching of a document and require that it be validated - before returning it to the client: Cache-control: must-revalidate. - - * To allow caching of a document, but to require that proxy caches - (not user agent caches) validate it before returning it to the - client: Cache-control: proxy-revalidate. - - * To allow caching of a document and request that it be validated - before returning it to the client (by "pre-expiring" it): - Cache-control: max-age=0. Not all caches will revalidate the - document in every case. - - HTTP/1.1 servers must send Expires: old-date (where old-date is a - date long in the past) on responses containing Set-Cookie response - headers unless they know for certain (by out-of-band means) that - there are no downstream HTTP/1.0 proxies. HTTP/1.1 servers may send - other Cache-Control directives that permit caching by HTTP/1.1 - proxies in addition to the Expires: old-date directive; the Cache- - Control directive will override the Expires: old-date for HTTP/1.1 - proxies. - -4.3 User Agent Role - -4.3.1 Interpreting Set-Cookie - - The user agent keeps separate track of state information that arrives - via Set-Cookie response headers from each origin server (as - distinguished by name or IP address and port). The user agent - applies these defaults for optional attributes that are missing: - - Version Defaults to "old cookie" behavior as originally specified by - Netscape. See the HISTORICAL section. - - Domain Defaults to the request-host. (Note that there is no dot at - the beginning of request-host.) - - Max-Age The default behavior is to discard the cookie when the user - agent exits.
- - Path Defaults to the path of the request URL that generated the - Set-Cookie response, up to, but not including, the - right-most /. - - Secure If absent, the user agent may send the cookie over an - insecure channel. - -4.3.2 Rejecting Cookies - - To prevent possible security or privacy violations, a user agent - rejects a cookie (shall not store its information) if any of the - following is true: - - * The value for the Path attribute is not a prefix of the request- - URI. - - * The value for the Domain attribute contains no embedded dots or - does not start with a dot. - - * The value for the request-host does not domain-match the Domain - attribute. - - * The request-host is a FQDN (not IP address) and has the form HD, - where D is the value of the Domain attribute, and H is a string - that contains one or more dots. - - Examples: - - * A Set-Cookie from request-host y.x.foo.com for Domain=.foo.com - would be rejected, because H is y.x and contains a dot. - - - -Kristol & Montulli Standards Track [Page 7] - -RFC 2109 HTTP State Management Mechanism February 1997 - - - * A Set-Cookie from request-host x.foo.com for Domain=.foo.com would - be accepted. - - * A Set-Cookie with Domain=.com or Domain=.com., will always be - rejected, because there is no embedded dot. - - * A Set-Cookie with Domain=ajax.com will be rejected because the - value for Domain does not begin with a dot. - -4.3.3 Cookie Management - - If a user agent receives a Set-Cookie response header whose NAME is - the same as a pre-existing cookie, and whose Domain and Path - attribute values exactly (string) match those of a pre-existing - cookie, the new cookie supersedes the old. However, if the Set- - Cookie has a value for Max-Age of zero, the (old and new) cookie is - discarded. Otherwise cookies accumulate until they expire (resources - permitting), at which time they are discarded. 
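The rejection rules of section 4.3.2 can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function names `domain_match` and `should_reject` are hypothetical, the domain-match test is a simplified reading of the definition given earlier in the RFC, and attribute values are assumed to have had their surrounding quotes removed already.

```python
import ipaddress
from typing import Optional


def _is_ip(host: str) -> bool:
    """Return True if host parses as an IPv4 or IPv6 address literal."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def domain_match(host: str, domain: str) -> bool:
    """Simplified domain-match: identical strings, or an FQDN host that
    ends with a Domain value beginning with a dot."""
    host, domain = host.lower(), domain.lower()
    if host == domain:
        return True
    if _is_ip(host):
        return False  # IP addresses only match exactly
    return (domain.startswith(".")
            and len(host) > len(domain)
            and host.endswith(domain))


def should_reject(request_host: str, request_path: str,
                  domain: Optional[str] = None,
                  path: Optional[str] = None) -> bool:
    """Apply the four rejection rules to one Set-Cookie (assumed unquoted)."""
    # Rule 1: Path must be a prefix of the request-URI.
    if path is not None and not request_path.startswith(path):
        return True
    if domain is None:
        return False  # Domain defaults to the request-host itself
    # Rule 2: Domain must start with a dot and contain an embedded dot.
    if not domain.startswith(".") or "." not in domain[1:-1]:
        return True
    # Rule 3: request-host must domain-match the Domain attribute.
    if not domain_match(request_host, domain):
        return True
    # Rule 4: in request-host = H + Domain, H must not contain a dot.
    prefix = request_host[: -len(domain)]
    return "." in prefix
```

Under these assumptions, the four examples listed in section 4.3.2 give the outcomes the text describes: only the x.foo.com case is accepted.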
- - Because user agents have finite space in which to store cookies, they - may also discard older cookies to make space for newer ones, using, - for example, a least-recently-used algorithm, along with constraints - on the maximum number of cookies that each origin server may set. - - If a Set-Cookie response header includes a Comment attribute, the - user agent should store that information in a human-readable form - with the cookie and should display the comment text as part of a - cookie inspection user interface. - - User agents should allow the user to control cookie destruction. An - infrequently-used cookie may function as a "preferences file" for - network applications, and a user may wish to keep it even if it is - the least-recently-used cookie. One possible implementation would be - an interface that allows the permanent storage of a cookie through a - checkbox (or, conversely, its immediate destruction). - - Privacy considerations dictate that the user have considerable - control over cookie management. The PRIVACY section contains more - information. - -4.3.4 Sending Cookies to the Origin Server - - When it sends a request to an origin server, the user agent sends a - Cookie request header to the origin server if it has cookies that are - applicable to the request, based on - - * the request-host; - - - - -Kristol & Montulli Standards Track [Page 8] - -RFC 2109 HTTP State Management Mechanism February 1997 - - - * the request-URI; - - * the cookie's age. - - The syntax for the header is: - - cookie = "Cookie:" cookie-version - 1*((";" | ",") cookie-value) - cookie-value = NAME "=" VALUE [";" path] [";" domain] - cookie-version = "$Version" "=" value - NAME = attr - VALUE = value - path = "$Path" "=" value - domain = "$Domain" "=" value - - The value of the cookie-version attribute must be the value from the - Version attribute, if any, of the corresponding Set-Cookie response - header. Otherwise the value for cookie-version is 0. 
The value for - the path attribute must be the value from the Path attribute, if any, - of the corresponding Set-Cookie response header. Otherwise the - attribute should be omitted from the Cookie request header. The - value for the domain attribute must be the value from the Domain - attribute, if any, of the corresponding Set-Cookie response header. - Otherwise the attribute should be omitted from the Cookie request - header. - - Note that there is no Comment attribute in the Cookie request header - corresponding to the one in the Set-Cookie response header. The user - agent does not return the comment information to the origin server. - - The following rules apply to choosing applicable cookie-values from - among all the cookies the user agent has. - - Domain Selection - The origin server's fully-qualified host name must domain-match - the Domain attribute of the cookie. - - Path Selection - The Path attribute of the cookie must match a prefix of the - request-URI. - - Max-Age Selection - Cookies that have expired should have been discarded and thus - are not forwarded to an origin server. - - - - - - - -Kristol & Montulli Standards Track [Page 9] - -RFC 2109 HTTP State Management Mechanism February 1997 - - - If multiple cookies satisfy the criteria above, they are ordered in - the Cookie header such that those with more specific Path attributes - precede those with less specific. Ordering with respect to other - attributes (e.g., Domain) is unspecified. - - Note: For backward compatibility, the separator in the Cookie header - is semi-colon (;) everywhere. A server should also accept comma (,) - as the separator between cookie-values for future compatibility. - -4.3.5 Sending Cookies in Unverifiable Transactions - - Users must have control over sessions in order to ensure privacy. - (See PRIVACY section below.) 
To simplify implementation and to - prevent an additional layer of complexity where adequate safeguards - exist, however, this document distinguishes between transactions that - are verifiable and those that are unverifiable. A transaction is - verifiable if the user has the option to review the request-URI prior - to its use in the transaction. A transaction is unverifiable if the - user does not have that option. Unverifiable transactions typically - arise when a user agent automatically requests inlined or embedded - entities or when it resolves redirection (3xx) responses from an - origin server. Typically the origin transaction, the transaction - that the user initiates, is verifiable, and that transaction may - directly or indirectly induce the user agent to make unverifiable - transactions. - - When it makes an unverifiable transaction, a user agent must enable a - session only if a cookie with a domain attribute D was sent or - received in its origin transaction, such that the host name in the - Request-URI of the unverifiable transaction domain-matches D. - - This restriction prevents a malicious service author from using - unverifiable transactions to induce a user agent to start or continue - a session with a server in a different domain. The starting or - continuation of such sessions could be contrary to the privacy - expectations of the user, and could also be a security problem. - - User agents may offer configurable options that allow the user agent, - or any autonomous programs that the user agent executes, to ignore - the above rule, so long as these override options default to "off". - - Many current user agents already provide a review option that would - render many links verifiable. For instance, some user agents display - the URL that would be referenced for a particular link when the mouse - pointer is placed over that link. The user can therefore determine - whether to visit that site before causing the browser to do so. 
- (Though not implemented on current user agents, a similar technique - could be used for a button used to submit a form -- the user agent - - - -Kristol & Montulli Standards Track [Page 10] - -RFC 2109 HTTP State Management Mechanism February 1997 - - - could display the action to be taken if the user were to select that - button.) However, even this would not make all links verifiable; for - example, links to automatically loaded images would not normally be - subject to "mouse pointer" verification. - - Many user agents also provide the option for a user to view the HTML - source of a document, or to save the source to an external file where - it can be viewed by another application. While such an option does - provide a crude review mechanism, some users might not consider it - acceptable for this purpose. - -4.4 How an Origin Server Interprets the Cookie Header - - A user agent returns much of the information in the Set-Cookie header - to the origin server when the Path attribute matches that of a new - request. When it receives a Cookie header, the origin server should - treat cookies with NAMEs whose prefix is $ specially, as an attribute - for the adjacent cookie. The value for such a NAME is to be - interpreted as applying to the lexically (left-to-right) most recent - cookie whose name does not have the $ prefix. If there is no - previous cookie, the value applies to the cookie mechanism as a - whole. For example, consider the cookie - - Cookie: $Version="1"; Customer="WILE_E_COYOTE"; - $Path="/acme" - - $Version applies to the cookie mechanism as a whole (and gives the - version number for the cookie mechanism). $Path is an attribute - whose value (/acme) defines the Path attribute that was used when the - Customer cookie was defined in a Set-Cookie response header. - -4.5 Caching Proxy Role - - One reason for separating state information from both a URL and - document content is to facilitate the scaling that caching permits. 
- To support cookies, a caching proxy must obey these rules already in - the HTTP specification: - - * Honor requests from the cache, if possible, based on cache validity - rules. - - * Pass along a Cookie request header in any request that the proxy - must make of another server. - - * Return the response to the client. Include any Set-Cookie response - header. - - - - - -Kristol & Montulli Standards Track [Page 11] - -RFC 2109 HTTP State Management Mechanism February 1997 - - - * Cache the received response subject to the control of the usual - headers, such as Expires, Cache-control: no-cache, and Cache- - control: private, - - * Cache the Set-Cookie subject to the control of the usual header, - Cache-control: no-cache="set-cookie". (The Set-Cookie header - should usually not be cached.) - - Proxies must not introduce Set-Cookie (Cookie) headers of their own - in proxy responses (requests). - -5. EXAMPLES - -5.1 Example 1 - - Most detail of request and response headers has been omitted. Assume - the user agent has no stored cookies. - - 1. User Agent -> Server - - POST /acme/login HTTP/1.1 - [form data] - - User identifies self via a form. - - 2. Server -> User Agent - - HTTP/1.1 200 OK - Set-Cookie: Customer="WILE_E_COYOTE"; Version="1"; Path="/acme" - - Cookie reflects user's identity. - - 3. User Agent -> Server - - POST /acme/pickitem HTTP/1.1 - Cookie: $Version="1"; Customer="WILE_E_COYOTE"; $Path="/acme" - [form data] - - User selects an item for "shopping basket." - - 4. Server -> User Agent - - HTTP/1.1 200 OK - Set-Cookie: Part_Number="Rocket_Launcher_0001"; Version="1"; - Path="/acme" - - Shopping basket contains an item. - - - - -Kristol & Montulli Standards Track [Page 12] - -RFC 2109 HTTP State Management Mechanism February 1997 - - - 5. 
User Agent -> Server - - POST /acme/shipping HTTP/1.1 - Cookie: $Version="1"; - Customer="WILE_E_COYOTE"; $Path="/acme"; - Part_Number="Rocket_Launcher_0001"; $Path="/acme" - [form data] - - User selects shipping method from form. - - 6. Server -> User Agent - - HTTP/1.1 200 OK - Set-Cookie: Shipping="FedEx"; Version="1"; Path="/acme" - - New cookie reflects shipping method. - - 7. User Agent -> Server - - POST /acme/process HTTP/1.1 - Cookie: $Version="1"; - Customer="WILE_E_COYOTE"; $Path="/acme"; - Part_Number="Rocket_Launcher_0001"; $Path="/acme"; - Shipping="FedEx"; $Path="/acme" - [form data] - - User chooses to process order. - - 8. Server -> User Agent - - HTTP/1.1 200 OK - - Transaction is complete. - - The user agent makes a series of requests on the origin server, after - each of which it receives a new cookie. All the cookies have the - same Path attribute and (default) domain. Because the request URLs - all have /acme as a prefix, and that matches the Path attribute, each - request contains all the cookies received so far. - -5.2 Example 2 - - This example illustrates the effect of the Path attribute. All - detail of request and response headers has been omitted. Assume the - user agent has no stored cookies. - - Imagine the user agent has received, in response to earlier requests, - the response headers - - - -Kristol & Montulli Standards Track [Page 13] - -RFC 2109 HTTP State Management Mechanism February 1997 - - - Set-Cookie: Part_Number="Rocket_Launcher_0001"; Version="1"; - Path="/acme" - - and - - Set-Cookie: Part_Number="Riding_Rocket_0023"; Version="1"; - Path="/acme/ammo" - - A subsequent request by the user agent to the (same) server for URLs - of the form /acme/ammo/... 
would include the following request - header: - - Cookie: $Version="1"; - Part_Number="Riding_Rocket_0023"; $Path="/acme/ammo"; - Part_Number="Rocket_Launcher_0001"; $Path="/acme" - - Note that the NAME=VALUE pair for the cookie with the more specific - Path attribute, /acme/ammo, comes before the one with the less - specific Path attribute, /acme. Further note that the same cookie - name appears more than once. - - A subsequent request by the user agent to the (same) server for a URL - of the form /acme/parts/ would include the following request header: - - Cookie: $Version="1"; Part_Number="Rocket_Launcher_0001"; $Path="/acme" - - Here, the second cookie's Path attribute /acme/ammo is not a prefix - of the request URL, /acme/parts/, so the cookie does not get - forwarded to the server. - -6. IMPLEMENTATION CONSIDERATIONS - - Here we speculate on likely or desirable details for an origin server - that implements state management. - -6.1 Set-Cookie Content - - An origin server's content should probably be divided into disjoint - application areas, some of which require the use of state - information. The application areas can be distinguished by their - request URLs. The Set-Cookie header can incorporate information - about the application areas by setting the Path attribute for each - one. - - The session information can obviously be clear or encoded text that - describes state. However, if it grows too large, it can become - unwieldy. Therefore, an implementor might choose for the session - information to be a key to a server-side resource. Of course, using - - - -Kristol & Montulli Standards Track [Page 14] - -RFC 2109 HTTP State Management Mechanism February 1997 - - - a database creates some problems that this state management - specification was meant to avoid, namely: - - 1. keeping real state on the server side; - - 2. how and when to garbage-collect the database entry, in case the - user agent terminates the session by, for example, exiting. 
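The option described in section 6.1, where the cookie VALUE is only a key to a server-side resource, might look like the sketch below. It is one possible answer to the garbage-collection question raised above, using an idle timeout. The class name `SessionStore`, the 30-minute default, and the injectable clock are assumptions made for illustration; nothing here is prescribed by the specification.

```python
import secrets
import time
from typing import Any, Callable, Dict, Optional, Tuple


class SessionStore:
    """Maps opaque cookie VALUEs to server-side state, with idle expiry."""

    def __init__(self, max_idle_seconds: float = 1800.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._max_idle = max_idle_seconds
        self._clock = clock  # injectable so expiry can be tested
        self._entries: Dict[str, Tuple[float, Dict[str, Any]]] = {}

    def create(self, state: Dict[str, Any]) -> str:
        """Store state and return an unguessable key to send as the VALUE."""
        key = secrets.token_urlsafe(16)
        self._entries[key] = (self._clock(), state)
        return key

    def lookup(self, key: str) -> Optional[Dict[str, Any]]:
        """Return the state for key, or None if unknown or expired.

        Unknown keys are not an error: a bad Cookie value must not
        cause a failure on the server.
        """
        entry = self._entries.get(key)
        if entry is None:
            return None
        last_seen, state = entry
        if self._clock() - last_seen > self._max_idle:
            del self._entries[key]
            return None
        self._entries[key] = (self._clock(), state)  # refresh idle timer
        return state

    def collect_garbage(self) -> int:
        """Drop every entry idle for longer than the limit; return the count."""
        now = self._clock()
        stale = [k for k, (seen, _) in self._entries.items()
                 if now - seen > self._max_idle]
        for k in stale:
            del self._entries[k]
        return len(stale)
```

Because the user agent gives no signal when it exits, the server cannot know that a session has ended; a timeout plus a periodic `collect_garbage()` call is a common compromise.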
- -6.2 Stateless Pages - - Caching benefits the scalability of WWW. Therefore it is important - to reduce the number of documents that have state embedded in them - inherently. For example, if a shopping-basket-style application - always displays a user's current basket contents on each page, those - pages cannot be cached, because each user's basket's contents would - be different. On the other hand, if each page contains just a link - that allows the user to "Look at My Shopping Basket", the page can be - cached. - -6.3 Implementation Limits - - Practical user agent implementations have limits on the number and - size of cookies that they can store. In general, user agents' cookie - support should have no fixed limits. They should strive to store as - many frequently-used cookies as possible. Furthermore, general-use - user agents should provide each of the following minimum capabilities - individually, although not necessarily simultaneously: - - * at least 300 cookies - - * at least 4096 bytes per cookie (as measured by the size of the - characters that comprise the cookie non-terminal in the syntax - description of the Set-Cookie header) - - * at least 20 cookies per unique host or domain name - - User agents created for specific purposes or for limited-capacity - devices should provide at least 20 cookies of 4096 bytes, to ensure - that the user can interact with a session-based origin server. - - The information in a Set-Cookie response header must be retained in - its entirety. If for some reason there is inadequate space to store - the cookie, it must be discarded, not truncated. - - Applications should use as few and as small cookies as possible, and - they should cope gracefully with the loss of a cookie. 
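A user agent's store that respects the limits of section 6.3 could be sketched as below. The capacities are set to exactly the stated minima purely for illustration (a real user agent may allow more), eviction is least-recently-used as suggested in section 4.3.3, and the class name `CookieJar` and its methods are hypothetical. The key point is that an oversized cookie is discarded whole, never truncated.

```python
from collections import OrderedDict
from typing import Optional, Tuple

MAX_COOKIES = 300        # total cookies (the stated minimum, used as a cap)
MAX_PER_HOST = 20        # cookies per unique host or domain name
MAX_COOKIE_BYTES = 4096  # size of one cookie as sent in Set-Cookie


class CookieJar:
    """Least-recently-used cookie store keyed by (host, name)."""

    def __init__(self) -> None:
        self._cookies: "OrderedDict[Tuple[str, str], str]" = OrderedDict()

    def __len__(self) -> int:
        return len(self._cookies)

    def store(self, host: str, name: str, cookie_text: str) -> bool:
        """Store one cookie; return False if it had to be discarded."""
        if len(cookie_text.encode("utf-8")) > MAX_COOKIE_BYTES:
            return False  # too large: discard entirely, do not truncate
        key = (host, name)
        self._cookies.pop(key, None)  # a new cookie supersedes the old one
        same_host = [k for k in self._cookies if k[0] == host]
        if len(same_host) >= MAX_PER_HOST:
            del self._cookies[same_host[0]]  # oldest entry for this host
        if len(self._cookies) >= MAX_COOKIES:
            self._cookies.popitem(last=False)  # oldest entry overall
        self._cookies[key] = cookie_text
        return True

    def get(self, host: str, name: str) -> Optional[str]:
        """Return the stored cookie text and mark it as recently used."""
        key = (host, name)
        if key not in self._cookies:
            return None
        self._cookies.move_to_end(key)
        return self._cookies[key]
```

Capping per-host storage also limits the flooding attack described in section 6.3.1, since one server can then only push out its own cookies.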
- - - - - -Kristol & Montulli Standards Track [Page 15] - -RFC 2109 HTTP State Management Mechanism February 1997 - - -6.3.1 Denial of Service Attacks - - User agents may choose to set an upper bound on the number of cookies - to be stored from a given host or domain name or on the size of the - cookie information. Otherwise a malicious server could attempt to - flood a user agent with many cookies, or large cookies, on successive - responses, which would force out cookies the user agent had received - from other servers. However, the minima specified above should still - be supported. - -7. PRIVACY - -7.1 User Agent Control - - An origin server could create a Set-Cookie header to track the path - of a user through the server. Users may object to this behavior as - an intrusive accumulation of information, even if their identity is - not evident. (Identity might become evident if a user subsequently - fills out a form that contains identifying information.) This state - management specification therefore requires that a user agent give - the user control over such a possible intrusion, although the - interface through which the user is given this control is left - unspecified. However, the control mechanisms provided shall at least - allow the user - - * to completely disable the sending and saving of cookies. - - * to determine whether a stateful session is in progress. - - * to control the saving of a cookie on the basis of the cookie's - Domain attribute. - - Such control could be provided by, for example, mechanisms - - * to notify the user when the user agent is about to send a cookie - to the origin server, offering the option not to begin a session. - - * to display a visual indication that a stateful session is in - progress. - - * to let the user decide which cookies, if any, should be saved - when the user concludes a window or user agent session. - - * to let the user examine the contents of a cookie at any time. 
- - A user agent usually begins execution with no remembered state - information. It should be possible to configure a user agent never - to send Cookie headers, in which case it can never sustain state with - - - -Kristol & Montulli Standards Track [Page 16] - -RFC 2109 HTTP State Management Mechanism February 1997 - - - an origin server. (The user agent would then behave like one that is - unaware of how to handle Set-Cookie response headers.) - - When the user agent terminates execution, it should let the user - discard all state information. Alternatively, the user agent may ask - the user whether state information should be retained; the default - should be "no". If the user chooses to retain state information, it - would be restored the next time the user agent runs. - - NOTE: User agents should probably be cautious about using files to - store cookies long-term. If a user runs more than one instance of - the user agent, the cookies could be commingled or otherwise messed - up. - -7.2 Protocol Design - - The restrictions on the value of the Domain attribute, and the rules - concerning unverifiable transactions, are meant to reduce the ways - that cookies can "leak" to the "wrong" site. The intent is to - restrict cookies to one, or a closely related set of hosts. - Therefore a request-host is limited as to what values it can set for - Domain. We consider it acceptable for hosts host1.foo.com and - host2.foo.com to share cookies, but not a.com and b.com. - - Similarly, a server can only set a Path for cookies that are related - to the request-URI. - -8. SECURITY CONSIDERATIONS - -8.1 Clear Text - - The information in the Set-Cookie and Cookie headers is unprotected. - Two consequences are: - - 1. Any sensitive information that is conveyed in them is exposed - to intruders. - - 2. A malicious intermediary could alter the headers as they travel - in either direction, with unpredictable results. 
- - These facts imply that information of a personal and/or financial - nature should only be sent over a secure channel. For less sensitive - information, or when the content of the header is a database key, an - origin server should be vigilant to prevent a bad Cookie value from - causing failures. - -8.2 Cookie Spoofing - - Proper application design can avoid spoofing attacks from related - domains. Consider: - - 1. User agent makes request to victim.cracker.edu, gets back - cookie session_id="1234" and sets the default domain - victim.cracker.edu. - - 2. User agent makes request to spoof.cracker.edu, gets back - cookie session_id="1111", with Domain=".cracker.edu". - - 3. User agent makes request to victim.cracker.edu again, and - passes - - Cookie: $Version="1"; - session_id="1234"; - session_id="1111"; $Domain=".cracker.edu" - - The server at victim.cracker.edu should detect that the second - cookie was not one it originated by noticing that the Domain - attribute is not for itself and ignore it. - -8.3 Unexpected Cookie Sharing - - A user agent should make every attempt to prevent the sharing of - session information between hosts that are in different domains. - Embedded or inlined objects may cause particularly severe privacy - problems if they can be used to share cookies between disparate - hosts. For example, a malicious server could embed cookie - information for host a.com in a URI for a CGI on host b.com. User - agent implementors are strongly encouraged to prevent this sort of - exchange whenever possible. - -9. OTHER, SIMILAR, PROPOSALS - - Three other proposals have been made to accomplish similar goals. - This specification is an amalgam of Kristol's State-Info proposal and - Netscape's Cookie proposal.
- - Brian Behlendorf proposed a Session-ID header that would be user- - agent-initiated and could be used by an origin server to track - "clicktrails". It would not carry any origin-server-defined state, - however. Phillip Hallam-Baker has proposed another client-defined - session ID mechanism for similar purposes. - - - - - - -Kristol & Montulli Standards Track [Page 18] - -RFC 2109 HTTP State Management Mechanism February 1997 - - - While both session IDs and cookies can provide a way to sustain - stateful sessions, their intended purpose is different, and, - consequently, the privacy requirements for them are different. A - user initiates session IDs to allow servers to track progress through - them, or to distinguish multiple users on a shared machine. Cookies - are server-initiated, so the cookie mechanism described here gives - users control over something that would otherwise take place without - the users' awareness. Furthermore, cookies convey rich, server- - selected information, whereas session IDs comprise user-selected, - simple information. - -10. HISTORICAL - -10.1 Compatibility With Netscape's Implementation - - HTTP/1.0 clients and servers may use Set-Cookie and Cookie headers - that reflect Netscape's original cookie proposal. These notes cover - inter-operation between "old" and "new" cookies. - -10.1.1 Extended Cookie Header - - This proposal adds attribute-value pairs to the Cookie request header - in a compatible way. An "old" client that receives a "new" cookie - will ignore attributes it does not understand; it returns what it - does understand to the origin server. A "new" client always sends - cookies in the new form. - - An "old" server that receives a "new" cookie will see what it thinks - are many cookies with names that begin with a $, and it will ignore - them. (The "old" server expects these cookies to be separated by - semi-colon, not comma.) 
A "new" server can detect cookies that have - passed through an "old" client, because they lack a $Version - attribute. - -10.1.2 Expires and Max-Age - - Netscape's original proposal defined an Expires header that took a - date value in a fixed-length variant format in place of Max-Age: - - Wdy, DD-Mon-YY HH:MM:SS GMT - - Note that the Expires date format contains embedded spaces, and that - "old" cookies did not have quotes around values. Clients that - implement to this specification should be aware of "old" cookies and - Expires. - - - - - - -Kristol & Montulli Standards Track [Page 19] - -RFC 2109 HTTP State Management Mechanism February 1997 - - -10.1.3 Punctuation - - In Netscape's original proposal, the values in attribute-value pairs - did not accept "-quoted strings. Origin servers should be cautious - about sending values that require quotes unless they know the - receiving user agent understands them (i.e., "new" cookies). A - ("new") user agent should only use quotes around values in Cookie - headers when the cookie's version(s) is (are) all compliant with this - specification or later. - - In Netscape's original proposal, no whitespace was permitted around - the = that separates attribute-value pairs. Therefore such - whitespace should be used with caution in new implementations. - -10.2 Caching and HTTP/1.0 - - Some caches, such as those conforming to HTTP/1.0, will inevitably - cache the Set-Cookie header, because there was no mechanism to - suppress caching of headers prior to HTTP/1.1. This caching can lead - to security problems. Documents transmitted by an origin server - along with Set-Cookie headers will usually either be uncachable, or - will be "pre-expired". As long as caches obey instructions not to - cache documents (following Expires: <a date in the past> or Pragma: - no-cache (HTTP/1.0), or Cache-control: no-cache (HTTP/1.1)) - uncachable documents present no problem. However, pre-expired - documents may be stored in caches. 
They require validation (a - conditional GET) on each new request, but some cache operators loosen - the rules for their caches, and sometimes serve expired documents - without first validating them. This combination of factors can lead - to cookies meant for one user later being sent to another user. The - Set-Cookie header is stored in the cache, and, although the document - is stale (expired), the cache returns the document in response to - later requests, including cached headers. - -11. ACKNOWLEDGEMENTS - - This document really represents the collective efforts of the - following people, in addition to the authors: Roy Fielding, Marc - Hedlund, Ted Hardie, Koen Holtman, Shel Kaphan, Rohit Khare. - - - - - - - - - - - - -Kristol & Montulli Standards Track [Page 20] - -RFC 2109 HTTP State Management Mechanism February 1997 - - -12. AUTHORS' ADDRESSES - - David M. Kristol - Bell Laboratories, Lucent Technologies - 600 Mountain Ave. Room 2A-227 - Murray Hill, NJ 07974 - - Phone: (908) 582-2250 - Fax: (908) 582-5809 - EMail: dmk@bell-labs.com - - - Lou Montulli - Netscape Communications Corp. - 501 E. Middlefield Rd. - Mountain View, CA 94043 - - Phone: (415) 528-2600 - EMail: montulli@netscape.com - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -Kristol & Montulli Standards Track [Page 21] - diff --git a/docs/specs/rfc2145.txt b/docs/specs/rfc2145.txt deleted file mode 100644 index b6db4d58..00000000 --- a/docs/specs/rfc2145.txt +++ /dev/null @@ -1,395 +0,0 @@ - - - - - - -Network Working Group J. C. Mogul -Request for Comments: 2145 DEC -Category: Informational R. Fielding - UC Irvine - J. Gettys - DEC - H. Frystyk - MIT/LCS - May 1997 - - Use and Interpretation of - HTTP Version Numbers - -Status of this Memo - - This memo provides information for the Internet community. This memo - does not specify an Internet standard of any kind. Distribution of - this memo is unlimited. - - Distribution of this document is unlimited. 
Please send comments to - the HTTP working group at <http-wg@cuckoo.hpl.hp.com>. Discussions - of the working group are archived at - <URL:http://www.ics.uci.edu/pub/ietf/http/>. General discussions - about HTTP and the applications which use HTTP should take place on - the <www-talk@w3.org> mailing list. - -Abstract - - HTTP request and response messages include an HTTP protocol version - number. Some confusion exists concerning the proper use and - interpretation of HTTP version numbers, and concerning - interoperability of HTTP implementations of different protocol - versions. This document is an attempt to clarify the situation. It - is not a modification of the intended meaning of the existing - HTTP/1.0 and HTTP/1.1 documents, but it does describe the intention - of the authors of those documents, and can be considered definitive - when there is any ambiguity in those documents concerning HTTP - version numbers, for all versions of HTTP. - - - - - - - - - - - - - -Mogul, et. al. Informational [Page 1] - -RFC 2145 HTTP Version Numbers May 1997 - - -TABLE OF CONTENTS - - 1 Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . 2 - 1.1 Robustness Principle . . . . . . . . . . . . . . . . . . 3 - 2 HTTP version numbers. . . . . . . . . . . . . . . . . . . . . . 3 - 2.1 Proxy behavior. . . . . . . . . . . . . . . . . . . . . . . . 4 - 2.2 Compatibility between minor versions of the same major - version. . . . . . . . . . . . . . . . . . . . . . . . 4 - 2.3 Which version number to send in a message. . . . . . . . 5 - 3 Security Considerations . . . . . . . . . . . . . . . . . . . . 6 - 4 References. . . . . . . . . . . . . . . . . . . . . . . . . . . 6 - 5 Authors' addresses. . . . . . . . . . . . . . . . . . . . . . . 6 - -1 Introduction - - HTTP request and response messages include an HTTP protocol version - number. 
According to section 3.1 of the HTTP/1.1 specification [2], - - HTTP uses a "<major>.<minor>" numbering scheme to indicate - versions of the protocol. The protocol versioning policy is - intended to allow the sender to indicate the format of a message - and its capacity for understanding further HTTP communication, - rather than the features obtained via that communication. No - change is made to the version number for the addition of message - components which do not affect communication behavior or which - only add to extensible field values. The <minor> number is - incremented when the changes made to the protocol add features - which do not change the general message parsing algorithm, but - which may add to the message semantics and imply additional - capabilities of the sender. The <major> number is incremented when - the format of a message within the protocol is changed. - - The same language appears in the description of HTTP/1.0 [1]. - - Many readers of these documents have expressed some confusion about - the intended meaning of this policy. Also, some people who wrote - HTTP implementations before RFC1945 [1] was issued were not aware of - the intentions behind the introduction of version numbers in - HTTP/1.0. This has led to debate and inconsistency regarding the use - and interpretation of HTTP version numbers, and has led to - interoperability problems in certain cases. - - - - - - - - - - -Mogul, et. al. Informational [Page 2] - -RFC 2145 HTTP Version Numbers May 1997 - - - This document is an attempt to clarify the situation. It is not a - modification of the intended meaning of the existing HTTP/1.0 and - HTTP/1.1 documents, but it does describe the intention of the authors - of those documents. In any case where either of those two documents - is ambiguous regarding the use and interpretation of HTTP version - numbers, this document should be considered the definitive as to the - intentions of the designers of HTTP. 
- - The specification described in this document is not part of the - specification of any individual version of HTTP, such as HTTP/1.0 or - HTTP/1.1. Rather, this document describes the use of HTTP version - numbers in any version of HTTP (except for HTTP/0.9, which did not - include version numbers). - - No vendor or other provider of an HTTP implementation should claim - any compliance with any IETF HTTP specification unless the - implementation conditionally complies with the rules in this - document. - -1.1 Robustness Principle - - RFC791 [4] defines the "robustness principle" in section 3.2: - - an implementation must be conservative in its sending - behavior, and liberal in its receiving behavior. - - This principle applies to HTTP, as well. It is the fundamental basis - for interpreting any part of the HTTP specification that might still - be ambiguous. In particular, implementations of HTTP SHOULD NOT - reject messages or generate errors unnecessarily. - -2 HTTP version numbers - - We start by restating the language quoted above from section 3.1 of - the HTTP/1.1 specification [2]: - - It is, and has always been, the explicit intent of the - HTTP specification that the interpretation of an HTTP message - header does not change between minor versions of the same major - version. - - It is, and has always been, the explicit intent of the - HTTP specification that an implementation receiving a message - header that it does not understand MUST ignore that header. (The - word "ignore" has a special meaning for proxies; see section 2.1 - below.) - - - - - -Mogul, et. al. Informational [Page 3] - -RFC 2145 HTTP Version Numbers May 1997 - - - To make this as clear as possible: The major version sent in a - message MAY indicate the interpretation of other header fields. The - minor version sent in a message MUST NOT indicate the interpretation - of other header fields. 
This reflects the principle that the minor - version labels the capability of the sender, not the interpretation - of the message. - - Note: In a future version of HTTP, we may introduce a mechanism - that explicitly requires a receiving implementation to reject a - message if it does not understand certain headers. For example, - this might be implemented by means of a header that lists a set of - other message headers that must be understood by the recipient. - Any implementation claiming at least conditional compliance with - this future version of HTTP would have to implement this - mechanism. However, no implementation claiming compliance with a - lower HTTP version (in particular, HTTP/1.1) will have to - implement this mechanism. - - This future change may be required to support the Protocol - Extension Protocol (PEP) [3]. - - One consequence of these rules is that an HTTP/1.1 message sent to an - HTTP/1.0 recipient (or a recipient whose version is unknown) MUST be - constructed so that it remains a valid HTTP/1.0 message when all - headers not defined in the HTTP/1.0 specification [1] are removed. - -2.1 Proxy behavior - - A proxy MUST forward an unknown header, unless it is protected by a - Connection header. A proxy implementing an HTTP version >= 1.1 MUST - NOT forward unknown headers that are protected by a Connection - header, as described in section 14.10 of the HTTP/1.1 specification - [2]. - - We remind the reader that that HTTP version numbers are hop-by-hop - components of HTTP messages, and are not end-to-end. That is, an - HTTP proxy never "forwards" an HTTP version number in either a - request or response. - -2.2 Compatibility between minor versions of the same major version - - An implementation of HTTP/x.b sending a message to a recipient whose - version is known to be HTTP/x.a, a < b, MAY send a header that is not - defined in the specification for HTTP/x.a. 
For example, an HTTP/1.1 - server may send a "Cache-control" header to an HTTP/1.0 client; this - may be useful if the immediate recipient is an HTTP/1.0 proxy, but - the ultimate recipient is an HTTP/1.1 client. - - - - -Mogul, et. al. Informational [Page 4] - -RFC 2145 HTTP Version Numbers May 1997 - - - An implementation of HTTP/x.b sending a message to a recipient whose - version is known to be HTTP/x.a, a < b, MUST NOT depend on the - recipient understanding a header not defined in the specification for - HTTP/x.a. For example, HTTP/1.0 clients cannot be expected to - understand chunked encodings, and so an HTTP/1.1 server must never - send "Transfer-Encoding: chunked" in response to an HTTP/1.0 request. - -2.3 Which version number to send in a message - - The most strenuous debate over the use of HTTP version numbers has - centered on the problem of implementations that do not follow the - robustness principle, and which fail to produce useful results when - they receive a message with an HTTP minor version higher than the - minor version they implement. We consider these implementations - buggy, but we recognize that the robustness principle also implies - that message senders should make concessions to buggy implementations - when this is truly necessary for interoperation. - - An HTTP client SHOULD send a request version equal to the highest - version for which the client is at least conditionally compliant, and - whose major version is no higher than the highest version supported - by the server, if this is known. An HTTP client MUST NOT send a - version for which it is not at least conditionally compliant. - - An HTTP client MAY send a lower request version, if it is known that - the server incorrectly implements the HTTP specification, but only - after the client has determined that the server is actually buggy. 
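
The client-side rules quoted above from section 2.3 of RFC 2145 can be sketched roughly as follows. This is a minimal illustration, not code from libsoup or from the RFC: the function names, the `server_known_buggy` flag, and the `fallback` parameter are all hypothetical, and raising an error when no compliant version has an acceptable major version is an assumption, since the RFC does not say what a client should do in that case.

```python
from typing import Optional, Sequence, Tuple

Version = Tuple[int, int]  # (major, minor), e.g. (1, 1) for HTTP/1.1


def choose_request_version(
    compliant: Sequence[Version],
    server_highest: Optional[Version] = None,
    server_known_buggy: bool = False,
    fallback: Optional[Version] = None,
) -> Version:
    """Pick the version a client places in its request line (hypothetical helper)."""
    if not compliant:
        raise ValueError("client must be compliant with at least one version")

    candidates = list(compliant)
    if server_highest is not None:
        # Only the MAJOR version is capped by what the server supports;
        # a higher minor version is still allowed.
        candidates = [v for v in candidates if v[0] <= server_highest[0]]
        if not candidates:
            # Assumption: the RFC does not cover this case, so fail loudly.
            raise ValueError("no compliant version with an acceptable major version")

    best = max(candidates)

    # A lower version MAY be sent, but only once the server is known to be buggy,
    # and never a version the client is not compliant with.
    if server_known_buggy and fallback is not None:
        if fallback not in compliant:
            raise ValueError("fallback must be a version the client complies with")
        return min(best, fallback)
    return best


def format_version(version: Version) -> str:
    """Render a (major, minor) pair as it appears on the wire."""
    return f"HTTP/{version[0]}.{version[1]}"
```

Note that a client compliant with HTTP/1.0 and HTTP/1.1 still selects `(1, 1)` when the server is only known to support HTTP/1.0, because the cap applies to the major version alone; the downgrade to `(1, 0)` happens only when the caller has separately determined that the server mishandles higher minor versions.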
- - An HTTP server SHOULD send a response version equal to the highest - version for which the server is at least conditionally compliant, and - whose major version is less than or equal to the one received in the - request. An HTTP server MUST NOT send a version for which it is not - at least conditionally compliant. A server MAY send a 505 (HTTP - Version Not Supported) response if cannot send a response using the - major version used in the client's request. - - An HTTP server MAY send a lower response version, if it is known or - suspected that the client incorrectly implements the HTTP - specification, but this should not be the default, and this SHOULD - NOT be done if the request version is HTTP/1.1 or greater. - - - - - - - - - - - -Mogul, et. al. Informational [Page 5] - -RFC 2145 HTTP Version Numbers May 1997 - - -3 Security Considerations - - None, except to the extent that security mechanisms introduced in one - version of HTTP might depend on the proper interpretation of HTTP - version numbers in older implementations. - -4 References - - 1. Berners-Lee, T., R. Fielding, and H. Frystyk. Hypertext - Transfer Protocol -- HTTP/1.0. RFC 1945, HTTP Working Group, May, - 1996. - - 2. Fielding, Roy T., Jim Gettys, Jeffrey C. Mogul, Henrik Frystyk - Nielsen, and Tim Berners-Lee. Hypertext Transfer Protocol -- - HTTP/1.1. RFC 2068, HTTP Working Group, January, 1997. - - 3. Khare, Rohit. HTTP/1.2 Extension Protocol (PEP). HTTP Working - Group, Work in Progress. - - 4. Postel, Jon. Internet Protocol. RFC 791, NIC, September, 1981. - -5 Authors' addresses - - Jeffrey C. Mogul - Western Research Laboratory - Digital Equipment Corporation - 250 University Avenue - Palo Alto, California, 94305, USA - Email: mogul@wrl.dec.com - - Roy T. 
Fielding - Department of Information and Computer Science - University of California - Irvine, CA 92717-3425, USA - Fax: +1 (714) 824-4056 - Email: fielding@ics.uci.edu - - Jim Gettys - MIT Laboratory for Computer Science - 545 Technology Square - Cambridge, MA 02139, USA - Fax: +1 (617) 258 8682 - Email: jg@w3.org - - - - - - - - -Mogul, et. al. Informational [Page 6] - -RFC 2145 HTTP Version Numbers May 1997 - - - Henrik Frystyk Nielsen - W3 Consortium - MIT Laboratory for Computer Science - 545 Technology Square - Cambridge, MA 02139, USA - Fax: +1 (617) 258 8682 - Email: frystyk@w3.org - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -Mogul, et. al. Informational [Page 7] - diff --git a/docs/specs/rfc2324.txt b/docs/specs/rfc2324.txt deleted file mode 100644 index a85921a9..00000000 --- a/docs/specs/rfc2324.txt +++ /dev/null @@ -1,563 +0,0 @@ - - - - - - -Network Working Group L. Masinter -Request for Comments: 2324 1 April 1998 -Category: Informational - - - Hyper Text Coffee Pot Control Protocol (HTCPCP/1.0) - -Status of this Memo - - This memo provides information for the Internet community. It does - not specify an Internet standard of any kind. Distribution of this - memo is unlimited. - -Copyright Notice - - Copyright (C) The Internet Society (1998). All Rights Reserved. - -Abstract - - This document describes HTCPCP, a protocol for controlling, - monitoring, and diagnosing coffee pots. - -1. Rationale and Scope - - There is coffee all over the world. Increasingly, in a world in which - computing is ubiquitous, the computists want to make coffee. Coffee - brewing is an art, but the distributed intelligence of the web- - connected world transcends art. Thus, there is a strong, dark, rich - requirement for a protocol designed espressoly for the brewing of - coffee. Coffee is brewed using coffee pots. Networked coffee pots - require a control protocol if they are to be controlled. 
- - Increasingly, home and consumer devices are being connected to the - Internet. Early networking experiments demonstrated vending devices - connected to the Internet for status monitoring [COKE]. One of the - first remotely _operated_ machine to be hooked up to the Internet, - the Internet Toaster, (controlled via SNMP) was debuted in 1990 - [RFC2235]. - - The demand for ubiquitous appliance connectivity that is causing the - consumption of the IPv4 address space. Consumers want remote control - of devices such as coffee pots so that they may wake up to freshly - brewed coffee, or cause coffee to be prepared at a precise time after - the completion of dinner preparations. - - - - - - - -Masinter Informational [Page 1] - -RFC 2324 HTCPCP/1.0 1 April 1998 - - - This document specifies a Hyper Text Coffee Pot Control Protocol - (HTCPCP), which permits the full request and responses necessary to - control all devices capable of making the popular caffeinated hot - beverages. - - HTTP 1.1 ([RFC2068]) permits the transfer of web objects from origin - servers to clients. The web is world-wide. HTCPCP is based on HTTP. - This is because HTTP is everywhere. It could not be so pervasive - without being good. Therefore, HTTP is good. If you want good coffee, - HTCPCP needs to be good. To make HTCPCP good, it is good to base - HTCPCP on HTTP. - - Future versions of this protocol may include extensions for espresso - machines and similar devices. - -2. HTCPCP Protocol - - The HTCPCP protocol is built on top of HTTP, with the addition of a - few new methods, header fields and return codes. All HTCPCP servers - should be referred to with the "coffee:" URI scheme (Section 4). - -2.1 HTCPCP Added Methods - -2.1.1 The BREW method, and the use of POST - - Commands to control a coffee pot are sent from client to coffee - server using either the BREW or POST method, and a message body with - Content-Type set to "application/coffee-pot-command". 
- - A coffee pot server MUST accept both the BREW and POST method - equivalently. However, the use of POST for causing actions to happen - is deprecated. - - Coffee pots heat water using electronic mechanisms, so there is no - fire. Thus, no firewalls are necessary, and firewall control policy - is irrelevant. However, POST may be a trademark for coffee, and so - the BREW method has been added. The BREW method may be used with - other HTTP-based protocols (e.g., the Hyper Text Brewery Control - Protocol). - -2.1.2 GET method - - In HTTP, the GET method is used to mean "retrieve whatever - information (in the form of an entity) identified by the Request- - URI." If the Request-URI refers to a data-producing process, it is - the produced data which shall be returned as the entity in the - response and not the source text of the process, unless that text - happens to be the output of the process. - - - -Masinter Informational [Page 2] - -RFC 2324 HTCPCP/1.0 1 April 1998 - - - In HTCPCP, the resources associated with a coffee pot are physical, - and not information resources. The "data" for most coffee URIs - contain no caffeine. - -2.1.3 PROPFIND method - - If a cup of coffee is data, metadata about the brewed resource is - discovered using the PROPFIND method [WEBDAV]. - -2.1.4 WHEN method - - When coffee is poured, and milk is offered, it is necessary for the - holder of the recipient of milk to say "when" at the time when - sufficient milk has been introduced into the coffee. For this - purpose, the "WHEN" method has been added to HTCPCP. Enough? Say - WHEN. - -2.2 Coffee Pot Header fields - - HTCPCP recommends several HTTP header fields and defines some new - ones. - -2.2.1 Recommended header fields - -2.2.1.1 The "safe" response header field. - - [SAFE] defines a HTTP response header field, "Safe", which can be - used to indicate that repeating a HTTP request is safe. 
The inclusion - of a "Safe: Yes" header field allows a client to repeat a previous - request if the result of the request might be repeated. - - The actual safety of devices for brewing coffee varies widely, and - may depend, in fact, on conditions in the client rather than just in - the server. Thus, this protocol includes an extension to the "Safe" - response header: - - Safe = "Safe" ":" safe-nature - safe-nature = "yes" | "no" | conditionally-safe - conditionally-safe = "if-" safe-condition - safe-condition = "user-awake" | token - - indication will allow user agents to handle retries of some safe - requests, in particular safe POST requests, in a more user-friendly - way. - - - - - - - -Masinter Informational [Page 3] - -RFC 2324 HTCPCP/1.0 1 April 1998 - - -2.2.2 New header fields - -2.2.2.1 The Accept-Additions header field - - In HTTP, the "Accept" request-header field is used to specify media - types which are acceptable for the response. However, in HTCPCP, the - response may result in additional actions on the part of the - automated pot. For this reason, HTCPCP adds a new header field, - "Accept-Additions": - - - Accept-Additions = "Accept-Additions" ":" - #( addition-range [ accept-params ] ) - - addition-type = ( "*" - | milk-type - | syrup-type - | sweetener-type - | spice-type - | alcohol-type - ) *( ";" parameter ) - milk-type = ( "Cream" | "Half-and-half" | "Whole-milk" - | "Part-Skim" | "Skim" | "Non-Dairy" ) - syrup-type = ( "Vanilla" | "Almond" | "Raspberry" - | "Chocolate" ) - alcohol-type = ( "Whisky" | "Rum" | "Kahlua" | "Aquavit" ) - -2.2.3 Omitted Header Fields - - No options were given for decaffeinated coffee. What's the point? - -2.3 HTCPCP return codes - - Normal HTTP return codes are used to indicate difficulties of the - HTCPCP server. This section identifies special interpretations and - new return codes. 
- -2.3.1 406 Not Acceptable - - This return code is normally interpreted as "The resource identified - by the request is only capable of generating response entities which - have content characteristics not acceptable according to the accept - headers sent in the request. In HTCPCP, this response code MAY be - returned if the operator of the coffee pot cannot comply with the - Accept-Addition request. Unless the request was a HEAD request, the - response SHOULD include an entity containing a list of available - coffee additions. - - - - -Masinter Informational [Page 4] - -RFC 2324 HTCPCP/1.0 1 April 1998 - - - In practice, most automated coffee pots cannot currently provide - additions. - -2.3.2 418 I'm a teapot - - Any attempt to brew coffee with a teapot should result in the error - code "418 I'm a teapot". The resulting entity body MAY be short and - stout. - -3. The "coffee" URI scheme - - Because coffee is international, there are international coffee URI - schemes. All coffee URL schemes are written with URL encoding of the - UTF-8 encoding of the characters that spell the word for "coffee" in - any of 29 languages, following the conventions for - internationalization in URIs [URLI18N]. - -coffee-url = coffee-scheme ":" [ "//" host ] - ["/" pot-designator ] ["?" 
additions-list ] - -coffee-scheme = ( "koffie" ; Afrikaans, Dutch - | "q%C3%A6hv%C3%A6" ; Azerbaijani - | "%D9%82%D9%87%D9%88%D8%A9" ; Arabic - | "akeita" ; Basque - | "koffee" ; Bengali - | "kahva" ; Bosnian - | "kafe" ; Bulgarian, Czech - | "caf%C3%E8" ; Catalan, French, Galician - | "%E5%92%96%E5%95%A1" ; Chinese - | "kava" ; Croatian - | "k%C3%A1va ; Czech - | "kaffe" ; Danish, Norwegian, Swedish - | "coffee" ; English - | "kafo" ; Esperanto - | "kohv" ; Estonian - | "kahvi" ; Finnish - | "%4Baffee" ; German - | "%CE%BA%CE%B1%CF%86%CE%AD" ; Greek - | "%E0%A4%95%E0%A5%8C%E0%A4%AB%E0%A5%80" ; Hindi - | "%E3%82%B3%E3%83%BC%E3%83%92%E3%83%BC" ; Japanese - | "%EC%BB%A4%ED%94%BC" ; Korean - | "%D0%BA%D0%BE%D1%84%D0%B5" ; Russian - | "%E0%B8%81%E0%B8%B2%E0%B9%81%E0%B8%9F" ; Thai - ) - - pot-designator = "pot-" integer ; for machines with multiple pots - additions-list = #( addition ) - - - - -Masinter Informational [Page 5] - -RFC 2324 HTCPCP/1.0 1 April 1998 - - - All alternative coffee-scheme forms are equivalent. However, the use - of coffee-scheme in various languages MAY be interpreted as an - indication of the kind of coffee produced by the coffee pot. Note - that while URL scheme names are case-independent, capitalization is - important for German and thus the initial "K" must be encoded. - -4. The "message/coffeepot" media type - - The entity body of a POST or BREW request MUST be of Content-Type - "message/coffeepot". Since most of the information for controlling - the coffee pot is conveyed by the additional headers, the content of - "message/coffeepot" contains only a coffee-message-body: - - coffee-message-body = "start" | "stop" - -5. Operational constraints - - This section lays out some of the operational issues with deployment - of HTCPCP ubiquitously. - -5.1 Timing Considerations - - A robust quality of service is required between the coffee pot user - and the coffee pot service. 
Coffee pots SHOULD use the Network Time - Protocol [NTP] to synchronize their clocks to a globally accurate - time standard. - - Telerobotics has been an expensive technology. However, with the - advent of the Cambridge Coffee Pot [CAM], the use of the web (rather - than SNMP) for remote system monitoring and management has been - proven. Additional coffee pot maintenance tasks might be - accomplished by remote robotics. - - Web data is normally static. Therefore to save data transmission and - time, Web browser programs store each Web page retrieved by a user on - the user's computer. Thus, if the user wants to return to that page, - it is now stored locally and does not need to be requested again from - the server. An image used for robot control or for monitoring a - changing scene is dynamic. A fresh version needs to be retrieved from - the server each time it is accessed. - -5.2 Crossing firewalls - - In most organizations HTTP traffic crosses firewalls fairly easily. - Modern coffee pots do not use fire. However, a "firewall" is useful - for protection of any source from any manner of heat, and not just - fire. Every home computer network SHOULD be protected by a firewall - from sources of heat. However, remote control of coffee pots is - - - -Masinter Informational [Page 6] - -RFC 2324 HTCPCP/1.0 1 April 1998 - - - important from outside the home. Thus, it is important that HTCPCP - cross firewalls easily. - - By basing HTCPCP on HTTP and using port 80, it will get all of HTTP's - firewall-crossing virtues. Of course, the home firewalls will require - reconfiguration or new versions in order to accommodate HTCPCP- - specific methods, headers and trailers, but such upgrades will be - easily accommodated. Most home network system administrators drink - coffee, and are willing to accommodate the needs of tunnelling - HTCPCP. - -6. System management considerations - - Coffee pot monitoring using HTTP protocols has been an early - application of the web. 
In the earliest instance, coffee pot - monitoring was an early (and appropriate) use of ATM networks [CAM]. - - The traditional technique [CAM] was to attach a frame-grabber to a - video camera, and feed the images to a web server. This was an - appropriate application of ATM networks. In this coffee pot - installation, the Trojan Room of Cambridge University laboratories - was used to give a web interface to monitor a common coffee pot. of - us involved in related research and, being poor, impoverished - academics, we only had one coffee filter machine between us, which - lived in the corridor just outside the Trojan Room. However, being - highly dedicated and hard-working academics, we got through a lot of - coffee, and when a fresh pot was brewed, it often didn't last long. - - This service was created as the first application to use a new RPC - mechanism designed in the Cambridge Computer Laboratory - MSRPC2. It - runs over MSNL (Multi-Service Network Layer) - a network layer - protocol designed for ATM networks. - - Coffee pots on the Internet may be managed using the Coffee Pot MIB - [CPMIB]. - -7. Security Considerations - - Anyone who gets in between me and my morning coffee should be - insecure. - - Unmoderated access to unprotected coffee pots from Internet users - might lead to several kinds of "denial of coffee service" attacks. - The improper use of filtration devices might admit trojan grounds. - Filtration is not a good virus protection method. - - - - - - -Masinter Informational [Page 7] - -RFC 2324 HTCPCP/1.0 1 April 1998 - - - Putting coffee grounds into Internet plumbing may result in clogged - plumbing, which would entail the services of an Internet Plumber - [PLUMB], who would, in turn, require an Internet Plumber's Helper. - - Access authentication will be discussed in a separate memo. - -8. 
Acknowledgements - - Many thanks to the many contributors to this standard, including Roy - Fielding, Mark Day, Keith Moore, Carl Uno-Manros, Michael Slavitch, - and Martin Duerst. The inspiration of the Prancing Pony, the CMU - Coke Machine, the Cambridge Coffee Pot, the Internet Toaster, and - other computer controlled remote devices have led to this valuable - creation. - -9. References - - [RFC2068] Fielding, R., Gettys, J., Mogul, J., Frystyk, H., and T. - Berners-Lee, "Hypertext Transfer Protocol -- HTTP/1.1", RFC 2068, - January 1997. - - [RFC2186] Wessels, D., and K. Claffy, "Internet Cache Protocol (ICP), - version 2," RFC 2186, September 1997 - - [CPMIB] Slavitch, M., "Definitions of Managed Objects for Drip-Type - Heated Beverage Hardware Devices using SMIv2", RFC 2325, 1 April - 1998. - - [HTSVMP] Q. Stafford-Fraser, "Hyper Text Sandwich Van Monitoring - Protocol, Version 3.2". In preparation. - - [RFC2295] Holtman, K., and A. Mutz, "Transparent Content Negotiation - in HTTP", RFC 2295, March 1998. - - [SAFE] K. Holtman. "The Safe Response Header Field", September 1997. - - [CAM] "The Trojan Room Coffee Machine", D. Gordon and M. Johnson, - University of Cambridge Computer Lab, - <http://www.cl.cam.ac.uk/coffee/coffee.html> - - [CBIO] "The Trojan Room Coffee Pot, a (non-technical) biography", Q. - Stafford-Fraser, University of Cambridge Computer Lab, - <http://www.cl.cam.ac.uk/coffee/qsf/coffee.html>. - - [RFC2235] Zakon, R., "Hobbes' Internet Timeline", FYI 32, RFC 2230, - November 1997. See also - <http://www.internode.com.au/images/toaster2.jpg> - - - - -Masinter Informational [Page 8] - -RFC 2324 HTCPCP/1.0 1 April 1998 - - - [NTP] Mills, D., "Network Time Protocol (Version 3) Specification, - Implementation and Analysis", RFC 1305, March 1992. - - [URLI18N] Masinter, L., "Using UTF8 for non-ASCII Characters in - Extended URIs" Work in Progress. - - [PLUMB] B. Metcalfe, "Internet Plumber of the Year: Jim Gettys", - Infoworld, February 2, 1998. 
- - [COKE] D. Nichols, "Coke machine history", C. Everhart, "Interesting - uses of networking", <http://www- - cse.ucsd.edu/users/bsy/coke.history.txt>. - -10. Author's Address - - Larry Masinter - Xerox Palo Alto Research Center - 3333 Coyote Hill Road - Palo Alto, CA 94304 - - EMail: masinter@parc.xerox.com - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -Masinter Informational [Page 9] - -RFC 2324 HTCPCP/1.0 1 April 1998 - - -11. Full Copyright Statement - - Copyright (C) The Internet Society (1998). All Rights Reserved. - - This document and translations of it may be copied and furnished to - others, and derivative works that comment on or otherwise explain it - or assist in its implementation may be prepared, copied, published - and distributed, in whole or in part, without restriction of any - kind, provided that the above copyright notice and this paragraph are - included on all such copies and derivative works. However, this - document itself may not be modified in any way, such as by removing - the copyright notice or references to the Internet Society or other - Internet organizations, except as needed for the purpose of - developing Internet standards in which case the procedures for - copyrights defined in the Internet Standards process must be - followed, or as required to translate it into languages other than - English. - - The limited permissions granted above are perpetual and will not be - revoked by the Internet Society or its successors or assigns. - - This document and the information contained herein is provided on an - "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING - TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING - BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION - HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF - MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. 
- - - - - - - - - - - - - - - - - - - - - - - - -Masinter Informational [Page 10] - diff --git a/docs/specs/rfc2388.txt b/docs/specs/rfc2388.txt deleted file mode 100644 index ffb9b6c9..00000000 --- a/docs/specs/rfc2388.txt +++ /dev/null @@ -1,507 +0,0 @@ - - - - - - -Network Working Group L. Masinter -Request for Comments: 2388 Xerox Corporation -Category: Standards Track August 1998 - - - Returning Values from Forms: multipart/form-data - -Status of this Memo - - This document specifies an Internet standards track protocol for the - Internet community, and requests discussion and suggestions for - improvements. Please refer to the current edition of the "Internet - Official Protocol Standards" (STD 1) for the standardization state - and status of this protocol. Distribution of this memo is unlimited. - -Copyright Notice - - Copyright (C) The Internet Society (1998). All Rights Reserved. - -1. Abstract - - This specification defines an Internet Media Type, multipart/form- - data, which can be used by a wide variety of applications and - transported by a wide variety of protocols as a way of returning a - set of values as the result of a user filling out a form. - -2. Introduction - - In many applications, it is possible for a user to be presented with - a form. The user will fill out the form, including information that - is typed, generated by user input, or included from files that the - user has selected. When the form is filled out, the data from the - form is sent from the user to the receiving application. - - The definition of MultiPart/Form-Data is derived from one of those - applications, originally set out in [RFC1867] and subsequently - incorporated into [HTML40], where forms are expressed in HTML, and in - which the form values are sent via HTTP or electronic mail. This - representation is widely implemented in numerous web browsers and web - servers. 
- - However, multipart/form-data can be used for forms that are presented - using representations other than HTML (spreadsheets, Portable - Document Format, etc), and for transport using other means than - electronic mail or HTTP. This document defines the representation of - form values independently of the application for which it is used. - - - - - -Masinter Standards Track [Page 1] - -RFC 2388 multipart/form-data August 1998 - - -3. Definition of multipart/form-data - - The media-type multipart/form-data follows the rules of all multipart - MIME data streams as outlined in [RFC 2046]. In forms, there are a - series of fields to be supplied by the user who fills out the form. - Each field has a name. Within a given form, the names are unique. - - "multipart/form-data" contains a series of parts. Each part is - expected to contain a content-disposition header [RFC 2183] where the - disposition type is "form-data", and where the disposition contains - an (additional) parameter of "name", where the value of that - parameter is the original field name in the form. For example, a part - might contain a header: - - Content-Disposition: form-data; name="user" - - with the value corresponding to the entry of the "user" field. - - Field names originally in non-ASCII character sets may be encoded - within the value of the "name" parameter using the standard method - described in RFC 2047. - - As with all multipart MIME types, each part has an optional - "Content-Type", which defaults to text/plain. If the contents of a - file are returned via filling out a form, then the file input is - identified as the appropriate media type, if known, or - "application/octet-stream". If multiple files are to be returned as - the result of a single form entry, they should be represented as a - "multipart/mixed" part embedded within the "multipart/form-data". 
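
The part layout described in section 3 of RFC 2388 above can be sketched as a small encoder. This is an illustrative sketch under stated assumptions, not libsoup's implementation: `encode_multipart_formdata` and the `Field` tuple are hypothetical names, field names and file names are assumed to be plain ASCII without double quotes (the RFC 2047 and RFC 2231 encodings mentioned in the text are out of scope here), and the boundary check is deliberately conservative.

```python
import uuid
from typing import Iterable, Optional, Tuple

# (field name, value bytes, optional file name, optional content type)
Field = Tuple[str, bytes, Optional[str], Optional[str]]


def encode_multipart_formdata(
    fields: Iterable[Field], boundary: Optional[str] = None
) -> Tuple[str, bytes]:
    """Return (Content-Type header value, body) for the given form fields."""
    if boundary is None:
        boundary = uuid.uuid4().hex  # random token, unlikely to occur in the data
    delimiter = b"--" + boundary.encode("ascii")

    body = bytearray()
    for name, value, filename, content_type in fields:
        # The boundary must not occur in any of the data.
        if delimiter in value:
            raise ValueError("boundary occurs in field data; choose another")

        # Assumption: name and filename are ASCII and contain no '"' characters.
        disposition = f'form-data; name="{name}"'
        if filename is not None:
            disposition += f'; filename="{filename}"'
            # File inputs get their media type if known, else octet-stream.
            content_type = content_type or "application/octet-stream"

        body += delimiter + b"\r\n"
        body += f"Content-Disposition: {disposition}\r\n".encode("ascii")
        if content_type is not None:
            # When omitted, the part defaults to text/plain.
            body += f"Content-Type: {content_type}\r\n".encode("ascii")
        body += b"\r\n" + value + b"\r\n"

    body += delimiter + b"--\r\n"  # closing delimiter
    return f"multipart/form-data; boundary={boundary}", bytes(body)
```

A caller would pass something like `[("user", b"alice", None, None), ("upload", data, "a.bin", None)]` and send the returned header value as the request's Content-Type alongside the body. Multiple files for a single field, which the text says should be nested as "multipart/mixed", are not handled by this sketch.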
- - Each part may be encoded and the "content-transfer-encoding" header - supplied if the value of that part does not conform to the default - encoding. - -4. Use of multipart/form-data - -4.1 Boundary - - As with other multipart types, a boundary is selected that does not - occur in any of the data. Each field of the form is sent, in the - order defined by the sending appliction and form, as a part of the - multipart stream. Each part identifies the INPUT name within the - original form. Each part should be labelled with an appropriate - content-type if the media type is known (e.g., inferred from the file - extension or operating system typing information) or as - "application/octet-stream". - - - - - -Masinter Standards Track [Page 2] - -RFC 2388 multipart/form-data August 1998 - - -4.2 Sets of files - - If the value of a form field is a set of files rather than a single - file, that value can be transferred together using the - "multipart/mixed" format. - -4.3 Encoding - - While the HTTP protocol can transport arbitrary binary data, the - default for mail transport is the 7BIT encoding. The value supplied - for a part may need to be encoded and the "content-transfer-encoding" - header supplied if the value does not conform to the default - encoding. [See section 5 of RFC 2046 for more details.] - -4.4 Other attributes - - Forms may request file inputs from the user; the form software may - include the file name and other file attributes, as specified in [RFC - 2184]. - - The original local file name may be supplied as well, either as a - "filename" parameter either of the "content-disposition: form-data" - header or, in the case of multiple files, in a "content-disposition: - file" header of the subpart. The sending application MAY supply a - file name; if the file name of the sender's operating system is not - in US-ASCII, the file name might be approximated, or encoded using - the method of RFC 2231. 
- - This is a convenience for those cases where the files supplied by the - form might contain references to each other, e.g., a TeX file and its - .sty auxiliary style description. - -4.5 Charset of text in form data - - Each part of a multipart/form-data is supposed to have a content- - type. In the case where a field element is text, the charset - parameter for the text indicates the character encoding used. - - For example, a form with a text field in which a user typed 'Joe owes - <eu>100' where <eu> is the Euro symbol might have form data returned - as: - - --AaB03x - content-disposition: form-data; name="field1" - content-type: text/plain;charset=windows-1250 - content-transfer-encoding: quoted-printable - - - - - -Masinter Standards Track [Page 3] - -RFC 2388 multipart/form-data August 1998 - - - Joe owes =80100. - --AaB03x - -5. Operability considerations - -5.1 Compression, encryption - - Some of the data in forms may be compressed or encrypted, using other - MIME mechanisms. This is a function of the application that is - generating the form-data. - -5.2 Other data encodings rather than multipart - - Various people have suggested using new mime top-level type - "aggregate", e.g., aggregate/mixed or a content-transfer-encoding of - "packet" to express indeterminate-length binary data, rather than - relying on the multipart-style boundaries. While this would be - useful, the "multipart" mechanisms are well established, simple to - implement on both the sending client and receiving server, and as - efficient as other methods of dealing with multiple combinations of - binary data. - - The multipart/form-data encoding has a high overhead and performance - impact if there are many fields with short values. However, in - practice, for the forms in use, for example, in HTML, the average - overhead is not significant. 
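The section 4.5 example (a text field labelled charset=windows-1250 and sent as quoted-printable) can be reproduced with the Python standard library. This is only a sketch of the encoding steps for that one value; it assumes Python's "cp1250" codec as the implementation of windows-1250 and does not build the surrounding part headers.

```python
import quopri

# The value the user typed, with the Euro sign written as a Unicode escape.
text = "Joe owes \u20ac100."

# Step 1: apply the charset named in the part's content-type header.
raw = text.encode("cp1250")

# Step 2: apply the content-transfer-encoding so the bytes survive 7BIT transport.
encoded = quopri.encodestring(raw)

# A receiver reverses both steps, using the labels carried in the part headers.
decoded = quopri.decodestring(encoded).decode("cp1250")
```

The charset parameter is what lets the receiver map the single non-ASCII byte back to the Euro sign; without that label the byte would be ambiguous.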
- -5.3 Remote files with third-party transfer - - In some scenarios, the user operating the form software might want to - specify a URL for remote data rather than a local file. In this case, - is there a way to allow the browser to send to the client a pointer - to the external data rather than the entire contents? This capability - could be implemented, for example, by having the client send to the - server data of type "message/external-body" with "access-type" set - to, say, "uri", and the URL of the remote data in the body of the - message. - -5.4 Non-ASCII field names - - Note that MIME headers are generally required to consist only of 7- - bit data in the US-ASCII character set. Hence field names should be - encoded according to the method in RFC 2047 if they contain - characters outside of that set. - - - - - - - -Masinter Standards Track [Page 4] - -RFC 2388 multipart/form-data August 1998 - - -5.5 Ordered fields and duplicated field names - - The relationship of the ordering of fields within a form and the - ordering of returned values within "multipart/form-data" is not - defined by this specification, nor is the handling of the case where - a form has multiple fields with the same name. While HTML-based forms - may send back results in the order received, and intermediaries - should not reorder the results, there are some systems which might - not define a natural order for form fields. - -5.6 Interoperability with web applications - - Many web applications use the "application/x-url-encoded" method for - returning data from forms. This format is quite compact, e.g.: - - name=Xavier+Xantico&verdict=Yes&colour=Blue&happy=sad&Utf%F6r=Send - - however, there is no opportunity to label the enclosed data with - content type, apply a charset, or use other encoding mechanisms. 
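The compact URL-encoded string shown above can be regenerated with Python's urllib.parse, which also makes the labelling problem concrete. This is an illustrative sketch: the choice of ISO-8859-1 ("latin-1") as the sender's character set is an assumption inferred from the %F6 escape in the example, since the format itself carries no charset label.

```python
from urllib.parse import parse_qsl, urlencode

pairs = [
    ("name", "Xavier Xantico"),
    ("verdict", "Yes"),
    ("colour", "Blue"),
    ("happy", "sad"),
    ("Utf\u00f6r", "Send"),  # field name containing a non-ASCII letter
]

# The sender picks a character set, but nothing in the output records it.
encoded = urlencode(pairs, encoding="latin-1")

# A receiver that guesses the same character set recovers the field name...
right_guess = parse_qsl(encoded, encoding="latin-1")

# ...while one that guesses UTF-8 cannot decode the %F6 byte.
wrong_guess = parse_qsl(encoded, encoding="utf-8")
```

With multipart/form-data the same value would travel in a part whose headers can state the charset explicitly, so the receiver does not have to guess.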
- - Many form-interpreting programs (primarly web browsers) now implement - and generate multipart/form-data, but an existing application might - need to optionally support both the application/x-url-encoded format - as well. - -5.7 Correlating form data with the original form - - This specification provides no specific mechanism by which - multipart/form-data can be associated with the form that caused it to - be transmitted. This separation is intentional; many different forms - might be used for transmitting the same data. In practice, - applications may supply a specific form processing resource (in HTML, - the ACTION attribute in a FORM tag) for each different form. - Alternatively, data about the form might be encoded in a "hidden - field" (a field which is part of the form but which has a fixed value - to be transmitted back to the form-data processor.) - -6. Security Considerations - - The data format described in this document introduces no new security - considerations outside of those introduced by the protocols that use - it and of the component elements. It is important when interpreting - content-disposition to not overwrite files in the recipients address - space inadvertently. - - User applications that request form information from users must be - careful not to cause a user to send information to the requestor or a - third party unwillingly or unwittingly. For example, a form might - - - -Masinter Standards Track [Page 5] - -RFC 2388 multipart/form-data August 1998 - - - request 'spam' information to be sent to an unintended third party, - or private information to be sent to someone that the user might not - actually intend. While this is primarily an issue for the - representation and interpretation of forms themselves, rather than - the data representation of the result of form transmission, the - transportation of private information must be done in a way that does - not expose it to unwanted prying. 
- - With the introduction of form-data that can reasonably send back the - content of files from user's file space, the possibility that a user - might be sent an automated script that fills out a form and then - sends the user's local file to another address arises. Thus, - additional caution is required when executing automated scripting - where form-data might include user's files. - -7. Author's Address - - Larry Masinter - Xerox Palo Alto Research Center - 3333 Coyote Hill Road - Palo Alto, CA 94304 - - Fax: +1 650 812 4333 - EMail: masinter@parc.xerox.com - - - - - - - - - - - - - - - - - - - - - - - - - - - -Masinter Standards Track [Page 6] - -RFC 2388 multipart/form-data August 1998 - - -Appendix A. Media type registration for multipart/form-data - - Media Type name: - multipart - - Media subtype name: - form-data - - Required parameters: - none - - Optional parameters: - none - - Encoding considerations: - No additional considerations other than as for other multipart - types. - - Security Considerations - Applications which receive forms and process them must be careful - not to supply data back to the requesting form processing site that - was not intended to be sent by the recipient. This is a - consideration for any application that generates a multipart/form- - data. - - The multipart/form-data type introduces no new security - considerations for recipients beyond what might occur with any of - the enclosed parts. - - - - - - - - - - - - - - - - - - - - - - - -Masinter Standards Track [Page 7] - -RFC 2388 multipart/form-data August 1998 - - -References - - [RFC 2046] Freed, N., and N. Borenstein, "Multipurpose Internet Mail - Extensions (MIME) Part Two: Media Types", RFC 2046, - November 1996. - - [RFC 2047] Moore, K., "MIME (Multipurpose Internet Mail Extensions) - Part Three: Message Header Extensions for Non-ASCII Text", - RFC 2047, November 1996. - - [RFC 2231] Freed, N., and K. 
Moore, "MIME Parameter Value and Encoded - Word Extensions: Character Sets, Languages, and - Continuations", RFC 2231, November 1997. - - [RFC 1806] Troost, R., and S. Dorner, "Communicating Presentation - Information in Internet Messages: The Content-Disposition - Header", RFC 1806, June 1995. - - [RFC 1867] Nebel, E., and L. Masinter, "Form-based File Upload in - HTML", RFC 1867, November 1995. - - [RFC 2183] Troost, R., Dorner, S., and K. Moore, "Communicating - Presentation Information in Internet Messages: The - Content-Disposition Header Field", RFC 2183, August 1997. - - [RFC 2184] Freed, N., and K. Moore, "MIME Parameter Value and Encoded - Word Extensions: Character Sets, Languages, and - Continuations", RFC 2184, August 1997. - - [HTML40] D. Raggett, A. Le Hors, I. Jacobs. "HTML 4.0 - Specification", World Wide Web Consortium Technical Report - "REC-html40", December, 1997. <http://www.w3.org/TR/REC- - html40/> - - - - - - - - - - - - - - - - - - -Masinter Standards Track [Page 8] - -RFC 2388 multipart/form-data August 1998 - - -Full Copyright Statement - - Copyright (C) The Internet Society (1998). All Rights Reserved. - - This document and translations of it may be copied and furnished to - others, and derivative works that comment on or otherwise explain it - or assist in its implementation may be prepared, copied, published - and distributed, in whole or in part, without restriction of any - kind, provided that the above copyright notice and this paragraph are - included on all such copies and derivative works. However, this - document itself may not be modified in any way, such as by removing - the copyright notice or references to the Internet Society or other - Internet organizations, except as needed for the purpose of - developing Internet standards in which case the procedures for - copyrights defined in the Internet Standards process must be - followed, or as required to translate it into languages other than - English. 
- - The limited permissions granted above are perpetual and will not be - revoked by the Internet Society or its successors or assigns. - - This document and the information contained herein is provided on an - "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING - TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING - BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION - HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF - MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. - - - - - - - - - - - - - - - - - - - - - - - - -Masinter Standards Track [Page 9] - diff --git a/docs/specs/rfc2518.txt b/docs/specs/rfc2518.txt deleted file mode 100644 index 81d40387..00000000 --- a/docs/specs/rfc2518.txt +++ /dev/null @@ -1,5267 +0,0 @@ - - - - - - -Network Working Group Y. Goland -Request for Comments: 2518 Microsoft -Category: Standards Track E. Whitehead - UC Irvine - A. Faizi - Netscape - S. Carter - Novell - D. Jensen - Novell - February 1999 - - - HTTP Extensions for Distributed Authoring -- WEBDAV - -Status of this Memo - - This document specifies an Internet standards track protocol for the - Internet community, and requests discussion and suggestions for - improvements. Please refer to the current edition of the "Internet - Official Protocol Standards" (STD 1) for the standardization state - and status of this protocol. Distribution of this memo is unlimited. - -Copyright Notice - - Copyright (C) The Internet Society (1999). All Rights Reserved. - -Abstract - - This document specifies a set of methods, headers, and content-types - ancillary to HTTP/1.1 for the management of resource properties, - creation and management of resource collections, namespace - manipulation, and resource locking (collision avoidance). 
- -Table of Contents - - ABSTRACT............................................................1 - 1 INTRODUCTION .....................................................5 - 2 NOTATIONAL CONVENTIONS ...........................................7 - 3 TERMINOLOGY ......................................................7 - 4 DATA MODEL FOR RESOURCE PROPERTIES ...............................8 - 4.1 The Resource Property Model ...................................8 - 4.2 Existing Metadata Proposals ...................................8 - 4.3 Properties and HTTP Headers ...................................9 - 4.4 Property Values ...............................................9 - 4.5 Property Names ...............................................10 - 4.6 Media Independent Links ......................................10 - 5 COLLECTIONS OF WEB RESOURCES ....................................11 - - - -Goland, et al. Standards Track [Page 1] - -RFC 2518 WEBDAV February 1999 - - - 5.1 HTTP URL Namespace Model .....................................11 - 5.2 Collection Resources .........................................11 - 5.3 Creation and Retrieval of Collection Resources ...............12 - 5.4 Source Resources and Output Resources ........................13 - 6 LOCKING .........................................................14 - 6.1 Exclusive Vs. 
Shared Locks ...................................14 - 6.2 Required Support .............................................16 - 6.3 Lock Tokens ..................................................16 - 6.4 opaquelocktoken Lock Token URI Scheme ........................16 - 6.4.1 Node Field Generation Without the IEEE 802 Address ........17 - 6.5 Lock Capability Discovery ....................................19 - 6.6 Active Lock Discovery ........................................19 - 6.7 Usage Considerations .........................................19 - 7 WRITE LOCK ......................................................20 - 7.1 Methods Restricted by Write Locks ............................20 - 7.2 Write Locks and Lock Tokens ..................................20 - 7.3 Write Locks and Properties ...................................20 - 7.4 Write Locks and Null Resources ...............................21 - 7.5 Write Locks and Collections ..................................21 - 7.6 Write Locks and the If Request Header ........................22 - 7.6.1 Example - Write Lock ......................................22 - 7.7 Write Locks and COPY/MOVE ....................................23 - 7.8 Refreshing Write Locks .......................................23 - 8 HTTP METHODS FOR DISTRIBUTED AUTHORING ..........................23 - 8.1 PROPFIND .....................................................24 - 8.1.1 Example - Retrieving Named Properties .....................25 - 8.1.2 Example - Using allprop to Retrieve All Properties ........26 - 8.1.3 Example - Using propname to Retrieve all Property Names ...29 - 8.2 PROPPATCH ....................................................31 - 8.2.1 Status Codes for use with 207 (Multi-Status) ..............31 - 8.2.2 Example - PROPPATCH .......................................32 - 8.3 MKCOL Method .................................................33 - 8.3.1 Request ...................................................33 - 8.3.2 Status Codes 
..............................................33 - 8.3.3 Example - MKCOL ...........................................34 - 8.4 GET, HEAD for Collections ....................................34 - 8.5 POST for Collections .........................................35 - 8.6 DELETE .......................................................35 - 8.6.1 DELETE for Non-Collection Resources .......................35 - 8.6.2 DELETE for Collections ....................................36 - 8.7 PUT ..........................................................36 - 8.7.1 PUT for Non-Collection Resources ..........................36 - 8.7.2 PUT for Collections .......................................37 - 8.8 COPY Method ..................................................37 - 8.8.1 COPY for HTTP/1.1 resources ...............................37 - 8.8.2 COPY for Properties .......................................38 - 8.8.3 COPY for Collections ......................................38 - 8.8.4 COPY and the Overwrite Header .............................39 - - - -Goland, et al. 
Standards Track [Page 2] - -RFC 2518 WEBDAV February 1999 - - - 8.8.5 Status Codes ..............................................39 - 8.8.6 Example - COPY with Overwrite .............................40 - 8.8.7 Example - COPY with No Overwrite ..........................40 - 8.8.8 Example - COPY of a Collection ............................41 - 8.9 MOVE Method ..................................................42 - 8.9.1 MOVE for Properties .......................................42 - 8.9.2 MOVE for Collections ......................................42 - 8.9.3 MOVE and the Overwrite Header .............................43 - 8.9.4 Status Codes ..............................................43 - 8.9.5 Example - MOVE of a Non-Collection ........................44 - 8.9.6 Example - MOVE of a Collection ............................44 - 8.10 LOCK Method ..................................................45 - 8.10.1 Operation .................................................46 - 8.10.2 The Effect of Locks on Properties and Collections .........46 - 8.10.3 Locking Replicated Resources ..............................46 - 8.10.4 Depth and Locking .........................................46 - 8.10.5 Interaction with other Methods ............................47 - 8.10.6 Lock Compatibility Table ..................................47 - 8.10.7 Status Codes ..............................................48 - 8.10.8 Example - Simple Lock Request .............................48 - 8.10.9 Example - Refreshing a Write Lock .........................49 - 8.10.10 Example - Multi-Resource Lock Request ....................50 - 8.11 UNLOCK Method ................................................51 - 8.11.1 Example - UNLOCK ..........................................52 - 9 HTTP HEADERS FOR DISTRIBUTED AUTHORING ..........................52 - 9.1 DAV Header ...................................................52 - 9.2 Depth Header .................................................52 - 9.3 Destination Header 
...........................................54 - 9.4 If Header ....................................................54 - 9.4.1 No-tag-list Production ....................................55 - 9.4.2 Tagged-list Production ....................................55 - 9.4.3 not Production ............................................56 - 9.4.4 Matching Function .........................................56 - 9.4.5 If Header and Non-DAV Compliant Proxies ...................57 - 9.5 Lock-Token Header ............................................57 - 9.6 Overwrite Header .............................................57 - 9.7 Status-URI Response Header ...................................57 - 9.8 Timeout Request Header .......................................58 - 10 STATUS CODE EXTENSIONS TO HTTP/1.1 ............................59 - 10.1 102 Processing ...............................................59 - 10.2 207 Multi-Status .............................................59 - 10.3 422 Unprocessable Entity .....................................60 - 10.4 423 Locked ...................................................60 - 10.5 424 Failed Dependency ........................................60 - 10.6 507 Insufficient Storage .....................................60 - 11 MULTI-STATUS RESPONSE .........................................60 - 12 XML ELEMENT DEFINITIONS .......................................61 - 12.1 activelock XML Element .......................................61 - - - -Goland, et al. 
Standards Track [Page 3] - -RFC 2518 WEBDAV February 1999 - - - 12.1.1 depth XML Element .........................................61 - 12.1.2 locktoken XML Element .....................................61 - 12.1.3 timeout XML Element .......................................61 - 12.2 collection XML Element .......................................62 - 12.3 href XML Element .............................................62 - 12.4 link XML Element .............................................62 - 12.4.1 dst XML Element ...........................................62 - 12.4.2 src XML Element ...........................................62 - 12.5 lockentry XML Element ........................................63 - 12.6 lockinfo XML Element .........................................63 - 12.7 lockscope XML Element ........................................63 - 12.7.1 exclusive XML Element .....................................63 - 12.7.2 shared XML Element ........................................63 - 12.8 locktype XML Element .........................................64 - 12.8.1 write XML Element .........................................64 - 12.9 multistatus XML Element ......................................64 - 12.9.1 response XML Element ......................................64 - 12.9.2 responsedescription XML Element ...........................65 - 12.10 owner XML Element ...........................................65 - 12.11 prop XML element ............................................66 - 12.12 propertybehavior XML element ................................66 - 12.12.1 keepalive XML element ....................................66 - 12.12.2 omit XML element .........................................67 - 12.13 propertyupdate XML element ..................................67 - 12.13.1 remove XML element .......................................67 - 12.13.2 set XML element ..........................................67 - 12.14 propfind XML Element ........................................68 - 
12.14.1 allprop XML Element ......................................68 - 12.14.2 propname XML Element .....................................68 - 13 DAV PROPERTIES ................................................68 - 13.1 creationdate Property ........................................69 - 13.2 displayname Property .........................................69 - 13.3 getcontentlanguage Property ..................................69 - 13.4 getcontentlength Property ....................................69 - 13.5 getcontenttype Property ......................................70 - 13.6 getetag Property .............................................70 - 13.7 getlastmodified Property .....................................70 - 13.8 lockdiscovery Property .......................................71 - 13.8.1 Example - Retrieving the lockdiscovery Property ...........71 - 13.9 resourcetype Property ........................................72 - 13.10 source Property .............................................72 - 13.10.1 Example - A source Property ..............................72 - 13.11 supportedlock Property ......................................73 - 13.11.1 Example - Retrieving the supportedlock Property ..........73 - 14 INSTRUCTIONS FOR PROCESSING XML IN DAV ........................74 - 15 DAV COMPLIANCE CLASSES ........................................75 - 15.1 Class 1 ......................................................75 - 15.2 Class 2 ......................................................75 - - - -Goland, et al. 
Standards Track [Page 4] - -RFC 2518 WEBDAV February 1999 - - - 16 INTERNATIONALIZATION CONSIDERATIONS ...........................76 - 17 SECURITY CONSIDERATIONS .......................................77 - 17.1 Authentication of Clients ....................................77 - 17.2 Denial of Service ............................................78 - 17.3 Security through Obscurity ...................................78 - 17.4 Privacy Issues Connected to Locks ............................78 - 17.5 Privacy Issues Connected to Properties .......................79 - 17.6 Reduction of Security due to Source Link .....................79 - 17.7 Implications of XML External Entities ........................79 - 17.8 Risks Connected with Lock Tokens .............................80 - 18 IANA CONSIDERATIONS ...........................................80 - 19 INTELLECTUAL PROPERTY .........................................81 - 20 ACKNOWLEDGEMENTS ..............................................82 - 21 REFERENCES ....................................................82 - 21.1 Normative References .........................................82 - 21.2 Informational References .....................................83 - 22 AUTHORS' ADDRESSES ............................................84 - 23 APPENDICES ....................................................86 - 23.1 Appendix 1 - WebDAV Document Type Definition .................86 - 23.2 Appendix 2 - ISO 8601 Date and Time Profile ..................88 - 23.3 Appendix 3 - Notes on Processing XML Elements ................89 - 23.3.1 Notes on Empty XML Elements ...............................89 - 23.3.2 Notes on Illegal XML Processing ...........................89 - 23.4 Appendix 4 -- XML Namespaces for WebDAV ......................92 - 23.4.1 Introduction ..............................................92 - 23.4.2 Meaning of Qualified Names ................................92 - 24 FULL COPYRIGHT STATEMENT ......................................94 - - - 
-1 Introduction - - This document describes an extension to the HTTP/1.1 protocol that - allows clients to perform remote web content authoring operations. - This extension provides a coherent set of methods, headers, request - entity body formats, and response entity body formats that provide - operations for: - - Properties: The ability to create, remove, and query information - about Web pages, such as their authors, creation dates, etc. Also, - the ability to link pages of any media type to related pages. - - Collections: The ability to create sets of documents and to retrieve - a hierarchical membership listing (like a directory listing in a file - system). - - - - - - -Goland, et al. Standards Track [Page 5] - -RFC 2518 WEBDAV February 1999 - - - Locking: The ability to keep more than one person from working on a - document at the same time. This prevents the "lost update problem," - in which modifications are lost as first one author then another - writes changes without merging the other author's changes. - - Namespace Operations: The ability to instruct the server to copy and - move Web resources. - - Requirements and rationale for these operations are described in a - companion document, "Requirements for a Distributed Authoring and - Versioning Protocol for the World Wide Web" [RFC2291]. - - The sections below provide a detailed introduction to resource - properties (section 4), collections of resources (section 5), and - locking operations (section 6). These sections introduce the - abstractions manipulated by the WebDAV-specific HTTP methods - described in section 8, "HTTP Methods for Distributed Authoring". - - In HTTP/1.1, method parameter information was exclusively encoded in - HTTP headers. Unlike HTTP/1.1, WebDAV encodes method parameter - information either in an Extensible Markup Language (XML) [REC-XML] - request entity body, or in an HTTP header. 
The use of XML to encode - method parameters was motivated by the ability to add extra XML - elements to existing structures, providing extensibility; and by - XML's ability to encode information in ISO 10646 character sets, - providing internationalization support. As a rule of thumb, - parameters are encoded in XML entity bodies when they have unbounded - length, or when they may be shown to a human user and hence require - encoding in an ISO 10646 character set. Otherwise, parameters are - encoded within HTTP headers. Section 9 describes the new HTTP - headers used with WebDAV methods. - - In addition to encoding method parameters, XML is used in WebDAV to - encode the responses from methods, providing the extensibility and - internationalization advantages of XML for method output, as well as - input. - - XML elements used in this specification are defined in section 12. - - The XML namespace extension (Appendix 4) is also used in this - specification in order to allow for new XML elements to be added - without fear of colliding with other element names. - - While the status codes provided by HTTP/1.1 are sufficient to - describe most error conditions encountered by WebDAV methods, there - are some errors that do not fall neatly into the existing categories. - New status codes developed for the WebDAV methods are defined in - section 10. Since some WebDAV methods may operate over many - - - -Goland, et al. Standards Track [Page 6] - -RFC 2518 WEBDAV February 1999 - - - resources, the Multi-Status response has been introduced to return - status information for multiple resources. The Multi-Status response - is described in section 11. - - WebDAV employs the property mechanism to store information about the - current state of the resource. For example, when a lock is taken out - on a resource, a lock information property describes the current - state of the lock. Section 13 defines the properties used within the - WebDAV specification. 
- - Finishing off the specification are sections on what it means to be - compliant with this specification (section 15), on - internationalization support (section 16), and on security (section - 17). - -2 Notational Conventions - - Since this document describes a set of extensions to the HTTP/1.1 - protocol, the augmented BNF used herein to describe protocol elements - is exactly the same as described in section 2.1 of [RFC2068]. Since - this augmented BNF uses the basic production rules provided in - section 2.2 of [RFC2068], these rules apply to this document as well. - - The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", - "SHOULD", SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this - document are to be interpreted as described in RFC 2119 [RFC2119]. - -3 Terminology - - URI/URL - A Uniform Resource Identifier and Uniform Resource Locator, - respectively. These terms (and the distinction between them) are - defined in [RFC2396]. - - Collection - A resource that contains a set of URIs, termed member - URIs, which identify member resources and meets the requirements in - section 5 of this specification. - - Member URI - A URI which is a member of the set of URIs contained by - a collection. - - Internal Member URI - A Member URI that is immediately relative to - the URI of the collection (the definition of immediately relative is - given in section 5.2). - - Property - A name/value pair that contains descriptive information - about a resource. - - - - - -Goland, et al. Standards Track [Page 7] - -RFC 2518 WEBDAV February 1999 - - - Live Property - A property whose semantics and syntax are enforced by - the server. For example, the live "getcontentlength" property has - its value, the length of the entity returned by a GET request, - automatically calculated by the server. - - Dead Property - A property whose semantics and syntax are not - enforced by the server. 
The server only records the value of a dead - property; the client is responsible for maintaining the consistency - of the syntax and semantics of a dead property. - - Null Resource - A resource which responds with a 404 (Not Found) to - any HTTP/1.1 or DAV method except for PUT, MKCOL, OPTIONS and LOCK. - A NULL resource MUST NOT appear as a member of its parent collection. - -4 Data Model for Resource Properties - -4.1 The Resource Property Model - - Properties are pieces of data that describe the state of a resource. - Properties are data about data. - - Properties are used in distributed authoring environments to provide - for efficient discovery and management of resources. For example, a - 'subject' property might allow for the indexing of all resources by - their subject, and an 'author' property might allow for the discovery - of what authors have written which documents. - - The DAV property model consists of name/value pairs. The name of a - property identifies the property's syntax and semantics, and provides - an address by which to refer to its syntax and semantics. - - There are two categories of properties: "live" and "dead". A live - property has its syntax and semantics enforced by the server. Live - properties include cases where a) the value of a property is read- - only, maintained by the server, and b) the value of the property is - maintained by the client, but the server performs syntax checking on - submitted values. All instances of a given live property MUST comply - with the definition associated with that property name. A dead - property has its syntax and semantics enforced by the client; the - server merely records the value of the property verbatim. - -4.2 Existing Metadata Proposals - - Properties have long played an essential role in the maintenance of - large document repositories, and many current proposals contain some - notion of a property, or discuss web metadata more generally. 
These - include PICS [REC-PICS], PICS-NG, XML, Web Collections, and several - proposals on representing relationships within HTML. Work on PICS-NG - - - -Goland, et al. Standards Track [Page 8] - -RFC 2518 WEBDAV February 1999 - - - and Web Collections has been subsumed by the Resource Description - Framework (RDF) metadata activity of the World Wide Web Consortium. - RDF consists of a network-based data model and an XML representation - of that model. - - Some proposals come from a digital library perspective. These - include the Dublin Core [RFC2413] metadata set and the Warwick - Framework [WF], a container architecture for different metadata - schemas. The literature includes many examples of metadata, - including MARC [USMARC], a bibliographic metadata format, and a - technical report bibliographic format employed by the Dienst system - [RFC1807]. Additionally, the proceedings from the first IEEE Metadata - conference describe many community-specific metadata sets. - - Participants of the 1996 Metadata II Workshop in Warwick, UK [WF], - noted that "new metadata sets will develop as the networked - infrastructure matures" and "different communities will propose, - design, and be responsible for different types of metadata." These - observations can be corroborated by noting that many community- - specific sets of metadata already exist, and there is significant - motivation for the development of new forms of metadata as many - communities increasingly make their data available in digital form, - requiring a metadata format to assist data location and cataloging. - -4.3 Properties and HTTP Headers - - Properties already exist, in a limited sense, in HTTP message - headers. However, in distributed authoring environments a relatively - large number of properties are needed to describe the state of a - resource, and setting/returning them all through HTTP headers is - inefficient. 
Thus a mechanism is needed which allows a principal to - identify a set of properties in which the principal is interested and - to set or retrieve just those properties. - -4.4 Property Values - - The value of a property when expressed in XML MUST be well formed. - - XML has been chosen because it is a flexible, self-describing, - structured data format that supports rich schema definitions, and - because of its support for multiple character sets. XML's self- - describing nature allows any property's value to be extended by - adding new elements. Older clients will not break when they - encounter extensions because they will still have the data specified - in the original schema and will ignore elements they do not - understand. XML's support for multiple character sets allows any - human-readable property to be encoded and read in a character set - familiar to the user. XML's support for multiple human languages, - - - -Goland, et al. Standards Track [Page 9] - -RFC 2518 WEBDAV February 1999 - - - using the "xml:lang" attribute, handles cases where the same - character set is employed by multiple human languages. - -4.5 Property Names - - A property name is a universally unique identifier that is associated - with a schema that provides information about the syntax and - semantics of the property. - - Because a property's name is universally unique, clients can depend - upon consistent behavior for a particular property across multiple - resources, on the same and across different servers, so long as that - property is "live" on the resources in question, and the - implementation of the live property is faithful to its definition. - - The XML namespace mechanism, which is based on URIs [RFC2396], is - used to name properties because it prevents namespace collisions and - provides for varying degrees of administrative control. - - The property namespace is flat; that is, no hierarchy of properties - is explicitly recognized. 
Thus, if a property A and a property A/B - exist on a resource, there is no recognition of any relationship - between the two properties. It is expected that a separate - specification will eventually be produced which will address issues - relating to hierarchical properties. - - Finally, it is not possible to define the same property twice on a - single resource, as this would cause a collision in the resource's - property namespace. - -4.6 Media Independent Links - - Although HTML resources support links to other resources, the Web - needs more general support for links between resources of any media - type (media types are also known as MIME types, or content types). - WebDAV provides such links. A WebDAV link is a special type of - property value, formally defined in section 12.4, that allows typed - connections to be established between resources of any media type. - The property value consists of source and destination Uniform - Resource Identifiers (URIs); the property name identifies the link - type. - - - - - - - - - - -Goland, et al. Standards Track [Page 10] - -RFC 2518 WEBDAV February 1999 - - -5 Collections of Web Resources - - This section provides a description of a new type of Web resource, - the collection, and discusses its interactions with the HTTP URL - namespace. The purpose of a collection resource is to model - collection-like objects (e.g., file system directories) within a - server's namespace. - - All DAV compliant resources MUST support the HTTP URL namespace model - specified herein. - -5.1 HTTP URL Namespace Model - - The HTTP URL namespace is a hierarchical namespace where the - hierarchy is delimited with the "/" character. - - An HTTP URL namespace is said to be consistent if it meets the - following conditions: for every URL in the HTTP hierarchy there - exists a collection that contains that URL as an internal member. - The root, or top-level collection of the namespace under - consideration is exempt from the previous rule. 
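The consistency condition of section 5.1 can be illustrated with a minimal sketch. It assumes the namespace is available as an in-memory mapping from each collection URL to its set of internal member URLs; the names parent_collection and is_consistent are hypothetical and are not defined by the specification.

```python
def parent_collection(url: str) -> str:
    """Return the URL of the collection that would contain `url`."""
    # Drop a trailing slash (used by collection URLs), then cut the last segment.
    trimmed = url[:-1] if url.endswith("/") else url
    return trimmed[: trimmed.rfind("/") + 1]


def is_consistent(collections: dict, urls: set, root: str) -> bool:
    """Check that every URL except the root is an internal member of its parent.

    `collections` maps a collection URL to the set of its internal member URLs
    (an assumed data layout, chosen only for this illustration).
    """
    for url in urls:
        if url == root:
            continue  # the top-level collection is exempt from the rule
        if url not in collections.get(parent_collection(url), set()):
            return False
    return True


root = "http://foo.com/"
urls = {root, "http://foo.com/bar/", "http://foo.com/bar/blah"}
collections = {
    root: {"http://foo.com/bar/"},
    "http://foo.com/bar/": {"http://foo.com/bar/blah"},
}
consistent = is_consistent(collections, urls, root)

# Removing the member entry for blah leaves a URL with no containing collection.
broken = {root: {"http://foo.com/bar/"}, "http://foo.com/bar/": set()}
inconsistent = is_consistent(broken, urls, root)
```

The sketch only checks the condition; it does not model how a server would prevent methods from producing an inconsistent namespace.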
- - Neither HTTP/1.1 nor WebDAV require that the entire HTTP URL - namespace be consistent. However, certain WebDAV methods are - prohibited from producing results that cause namespace - inconsistencies. - - Although implicit in [RFC2068] and [RFC2396], any resource, including - collection resources, MAY be identified by more than one URI. For - example, a resource could be identified by multiple HTTP URLs. - -5.2 Collection Resources - - A collection is a resource whose state consists of at least a list of - internal member URIs and a set of properties, but which may have - additional state such as entity bodies returned by GET. An internal - member URI MUST be immediately relative to a base URI of the - collection. That is, the internal member URI is equal to a - containing collection's URI plus an additional segment for non- - collection resources, or additional segment plus trailing slash "/" - for collection resources, where segment is defined in section 3.3 of - [RFC2396]. - - Any given internal member URI MUST only belong to the collection - once, i.e., it is illegal to have multiple instances of the same URI - in a collection. Properties defined on collections behave exactly as - do properties on non-collection resources. - - - - -Goland, et al. Standards Track [Page 11] - -RFC 2518 WEBDAV February 1999 - - - For all WebDAV compliant resources A and B, identified by URIs U and - V, for which U is immediately relative to V, B MUST be a collection - that has U as an internal member URI. So, if the resource with URL - http://foo.com/bar/blah is WebDAV compliant and if the resource with - URL http://foo.com/bar/ is WebDAV compliant then the resource with - URL http://foo.com/bar/ must be a collection and must contain URL - http://foo.com/bar/blah as an internal member. - - Collection resources MAY list the URLs of non-WebDAV compliant - children in the HTTP URL namespace hierarchy as internal members but - are not required to do so. 
For example, if the resource with URL - http://foo.com/bar/blah is not WebDAV compliant and the URL - http://foo.com/bar/ identifies a collection then URL - http://foo.com/bar/blah may or may not be an internal member of the - collection with URL http://foo.com/bar/. - - If a WebDAV compliant resource has no WebDAV compliant children in - the HTTP URL namespace hierarchy then the WebDAV compliant resource - is not required to be a collection. - - There is a standing convention that when a collection is referred to - by its name without a trailing slash, the trailing slash is - automatically appended. Due to this, a resource may accept a URI - without a trailing "/" to point to a collection. In this case it - SHOULD return a content-location header in the response pointing to - the URI ending with the "/". For example, if a client invokes a - method on http://foo.bar/blah (no trailing slash), the resource - http://foo.bar/blah/ (trailing slash) may respond as if the operation - were invoked on it, and should return a content-location header with - http://foo.bar/blah/ in it. In general clients SHOULD use the "/" - form of collection names. - - A resource MAY be a collection but not be WebDAV compliant. That is, - the resource may comply with all the rules set out in this - specification regarding how a collection is to behave without - necessarily supporting all methods that a WebDAV compliant resource - is required to support. In such a case the resource may return the - DAV:resourcetype property with the value DAV:collection but MUST NOT - return a DAV header containing the value "1" on an OPTIONS response. - -5.3 Creation and Retrieval of Collection Resources - - This document specifies the MKCOL method to create new collection - resources, rather than using the existing HTTP/1.1 PUT or POST - method, for the following reasons: - - - - - - -Goland, et al. 
Standards Track [Page 12] - -RFC 2518 WEBDAV February 1999 - - - In HTTP/1.1, the PUT method is defined to store the request body at - the location specified by the Request-URI. While a description - format for a collection can readily be constructed for use with PUT, - the implications of sending such a description to the server are - undesirable. For example, if a description of a collection that - omitted some existing resources were PUT to a server, this might be - interpreted as a command to remove those members. This would extend - PUT to perform DELETE functionality, which is undesirable since it - changes the semantics of PUT, and makes it difficult to control - DELETE functionality with an access control scheme based on methods. - - While the POST method is sufficiently open-ended that a "create a - collection" POST command could be constructed, this is undesirable - because it would be difficult to separate access control for - collection creation from other uses of POST. - - The exact definition of the behavior of GET and PUT on collections is - defined later in this document. - -5.4 Source Resources and Output Resources - - For many resources, the entity returned by a GET method exactly - matches the persistent state of the resource, for example, a GIF file - stored on a disk. For this simple case, the URI at which a resource - is accessed is identical to the URI at which the source (the - persistent state) of the resource is accessed. This is also the case - for HTML source files that are not processed by the server prior to - transmission. - - However, the server can sometimes process HTML resources before they - are transmitted as a return entity body. For example, a server- - side-include directive within an HTML file might instruct a server to - replace the directive with another value, such as the current date. - In this case, what is returned by GET (HTML plus date) differs from - the persistent state of the resource (HTML plus directive). 
- Typically there is no way to access the HTML resource containing the - unprocessed directive. - - Sometimes the entity returned by GET is the output of a data- - producing process that is described by one or more source resources - (that may not even have a location in the URI namespace). A single - data-producing process may dynamically generate the state of a - potentially large number of output resources. An example of this is - a CGI script that describes a "finger" gateway process that maps part - of the namespace of a server into finger requests, such as - http://www.foo.bar.org/finger_gateway/user@host. - - - - - -Goland, et al. Standards Track [Page 13] - -RFC 2518 WEBDAV February 1999 - - - In the absence of distributed authoring capabilities, it is - acceptable to have no mapping of source resource(s) to the URI - namespace. In fact, preventing access to the source resource(s) has - desirable security benefits. However, if remote editing of the - source resource(s) is desired, the source resource(s) should be given - a location in the URI namespace. This source location should not be - one of the locations at which the generated output is retrievable, - since in general it is impossible for the server to differentiate - requests for source resources from requests for process output - resources. There is often a many-to-many relationship between source - resources and output resources. - - On WebDAV compliant servers the URI of the source resource(s) may be - stored in a link on the output resource with type DAV:source (see - section 13.10 for a description of the source link property). - Storing the source URIs in links on the output resources places the - burden of discovering the source on the authoring client. Note that - the value of a source link is not guaranteed to point to the correct - source. Source links may break or incorrect values may be entered. - Also note that not all servers will allow the client to set the - source link value. 
For example a server which generates source links - on the fly for its CGI files will most likely not allow a client to - set the source link value. - -6 Locking - - The ability to lock a resource provides a mechanism for serializing - access to that resource. Using a lock, an authoring client can - provide a reasonable guarantee that another principal will not modify - a resource while it is being edited. In this way, a client can - prevent the "lost update" problem. - - This specification allows locks to vary over two client-specified - parameters, the number of principals involved (exclusive vs. shared) - and the type of access to be granted. This document defines locking - for only one access type, write. However, the syntax is extensible, - and permits the eventual specification of locking for other access - types. - -6.1 Exclusive Vs. Shared Locks - - The most basic form of lock is an exclusive lock. This is a lock - where the access right in question is only granted to a single - principal. The need for this arbitration results from a desire to - avoid having to merge results. - - - - - - -Goland, et al. Standards Track [Page 14] - -RFC 2518 WEBDAV February 1999 - - - However, there are times when the goal of a lock is not to exclude - others from exercising an access right but rather to provide a - mechanism for principals to indicate that they intend to exercise - their access rights. Shared locks are provided for this case. A - shared lock allows multiple principals to receive a lock. Hence any - principal with appropriate access can get the lock. - - With shared locks there are two trust sets that affect a resource. - The first trust set is created by access permissions. Principals who - are trusted, for example, may have permission to write to the - resource. 
Among those who have access permission to write to the - resource, the set of principals who have taken out a shared lock also - must trust each other, creating a (typically) smaller trust set - within the access permission write set. - - Starting with every possible principal on the Internet, in most - situations the vast majority of these principals will not have write - access to a given resource. Of the small number who do have write - access, some principals may decide to guarantee their edits are free - from overwrite conflicts by using exclusive write locks. Others may - decide they trust their collaborators will not overwrite their work - (the potential set of collaborators being the set of principals who - have write permission) and use a shared lock, which informs their - collaborators that a principal may be working on the resource. - - The WebDAV extensions to HTTP do not need to provide all of the - communications paths necessary for principals to coordinate their - activities. When using shared locks, principals may use any out of - band communication channel to coordinate their work (e.g., face-to- - face interaction, written notes, post-it notes on the screen, - telephone conversation, Email, etc.) The intent of a shared lock is - to let collaborators know who else may be working on a resource. - - Shared locks are included because experience from web distributed - authoring systems has indicated that exclusive locks are often too - rigid. An exclusive lock is used to enforce a particular editing - process: take out an exclusive lock, read the resource, perform - edits, write the resource, release the lock. This editing process - has the problem that locks are not always properly released, for - example when a program crashes, or when a lock owner leaves without - unlocking a resource. 
While both timeouts and administrative action - can be used to remove an offending lock, neither mechanism may be - available when needed; the timeout may be long or the administrator - may not be available. - - - - - - - -Goland, et al. Standards Track [Page 15] - -RFC 2518 WEBDAV February 1999 - - -6.2 Required Support - - A WebDAV compliant server is not required to support locking in any - form. If the server does support locking it may choose to support - any combination of exclusive and shared locks for any access types. - - The reason for this flexibility is that locking policy strikes to the - very heart of the resource management and versioning systems employed - by various storage repositories. These repositories require control - over what sort of locking will be made available. For example, some - repositories only support shared write locks while others only - provide support for exclusive write locks while yet others use no - locking at all. As each system is sufficiently different to merit - exclusion of certain locking features, this specification leaves - locking as the sole axis of negotiation within WebDAV. - -6.3 Lock Tokens - - A lock token is a type of state token, represented as a URI, which - identifies a particular lock. A lock token is returned by every - successful LOCK operation in the lockdiscovery property in the - response body, and can also be found through lock discovery on a - resource. - - Lock token URIs MUST be unique across all resources for all time. - This uniqueness constraint allows lock tokens to be submitted across - resources and servers without fear of confusion. - - This specification provides a lock token URI scheme called - opaquelocktoken that meets the uniqueness requirements. However - resources are free to return any URI scheme so long as it meets the - uniqueness requirements. - - Having a lock token provides no special access rights. Anyone can - find out anyone else's lock token by performing lock discovery. 
- Locks MUST be enforced based upon whatever authentication mechanism - is used by the server, not based on the secrecy of the token values. - -6.4 opaquelocktoken Lock Token URI Scheme - - The opaquelocktoken URI scheme is designed to be unique across all - resources for all time. Due to this uniqueness quality, a client may - submit an opaque lock token in an If header on a resource other than - the one that returned it. - - All resources MUST recognize the opaquelocktoken scheme and, at - minimum, recognize that the lock token does not refer to an - outstanding lock on the resource. - - - -Goland, et al. Standards Track [Page 16] - -RFC 2518 WEBDAV February 1999 - - - In order to guarantee uniqueness across all resources for all time - the opaquelocktoken requires the use of the Universal Unique - Identifier (UUID) mechanism, as described in [ISO-11578]. - - Opaquelocktoken generators, however, have a choice of how they create - these tokens. They can either generate a new UUID for every lock - token they create or they can create a single UUID and then add - extension characters. If the second method is selected then the - program generating the extensions MUST guarantee that the same - extension will never be used twice with the associated UUID. - - OpaqueLockToken-URI = "opaquelocktoken:" UUID [Extension] ; The UUID - production is the string representation of a UUID, as defined in - [ISO-11578]. Note that white space (LWS) is not allowed between - elements of this production. - - Extension = path ; path is defined in section 3.2.1 of RFC 2068 - [RFC2068] - -6.4.1 Node Field Generation Without the IEEE 802 Address - - UUIDs, as defined in [ISO-11578], contain a "node" field that - contains one of the IEEE 802 addresses for the server machine. As - noted in section 17.8, there are several security risks associated - with exposing a machine's IEEE 802 address. 
This section provides an - alternate mechanism for generating the "node" field of a UUID which - does not employ an IEEE 802 address. WebDAV servers MAY use this - algorithm for creating the node field when generating UUIDs. The - text in this section is originally from an Internet-Draft by Paul - Leach and Rich Salz, who are noted here to properly attribute their - work. - - The ideal solution is to obtain a 47 bit cryptographic quality random - number, and use it as the low 47 bits of the node ID, with the most - significant bit of the first octet of the node ID set to 1. This bit - is the unicast/multicast bit, which will never be set in IEEE 802 - addresses obtained from network cards; hence, there can never be a - conflict between UUIDs generated by machines with and without network - cards. - - If a system does not have a primitive to generate cryptographic - quality random numbers, then in most systems there are usually a - fairly large number of sources of randomness available from which one - can be generated. Such sources are system specific, but often - include: - - - - - - -Goland, et al. Standards Track [Page 17] - -RFC 2518 WEBDAV February 1999 - - - - the percent of memory in use - - the size of main memory in bytes - - the amount of free main memory in bytes - - the size of the paging or swap file in bytes - - free bytes of paging or swap file - - the total size of user virtual address space in bytes - - the total available user address space bytes - - the size of boot disk drive in bytes - - the free disk space on boot drive in bytes - - the current time - - the amount of time since the system booted - - the individual sizes of files in various system directories - - the creation, last read, and modification times of files in - various system directories - - the utilization factors of various system resources (heap, etc.) 
- - current mouse cursor position - - current caret position - - current number of running processes, threads - - handles or IDs of the desktop window and the active window - - the value of stack pointer of the caller - - the process and thread ID of caller - - various processor architecture specific performance counters - (instructions executed, cache misses, TLB misses) - - (Note that it is precisely the above kinds of sources of randomness - that are used to seed cryptographic quality random number generators - on systems without special hardware for their construction.) - - In addition, items such as the computer's name and the name of the - operating system, while not strictly speaking random, will help - differentiate the results from those obtained by other systems. - - The exact algorithm to generate a node ID using these data is system - specific, because both the data available and the functions to obtain - them are often very system specific. However, assuming that one can - concatenate all the values from the randomness sources into a buffer, - and that a cryptographic hash function such as MD5 is available, then - any 6 bytes of the MD5 hash of the buffer, with the multicast bit - (the high bit of the first byte) set will be an appropriately random - node ID. - - Other hash functions, such as SHA-1, can also be used. The only - requirement is that the result be suitably random, in the sense that - the outputs from a set of uniformly distributed inputs are themselves - uniformly distributed, and that a single bit change in the input can - be expected to cause half of the output bits to change. - - - - - -Goland, et al. Standards Track [Page 18] - -RFC 2518 WEBDAV February 1999 - - -6.5 Lock Capability Discovery - - Since server lock support is optional, a client trying to lock a - resource on a server can either try the lock and hope for the best, - or perform some form of discovery to determine what lock capabilities - the server supports.
This is known as lock capability discovery. - Lock capability discovery differs from discovery of supported access - control types, since there may be access control types without - corresponding lock types. A client can determine what lock types the - server supports by retrieving the supportedlock property. - - Any DAV compliant resource that supports the LOCK method MUST support - the supportedlock property. - -6.6 Active Lock Discovery - - If another principal locks a resource that a principal wishes to - access, it is useful for the second principal to be able to find out - who the first principal is. For this purpose the lockdiscovery - property is provided. This property lists all outstanding locks, - describes their type, and where available, provides their lock token. - - Any DAV compliant resource that supports the LOCK method MUST support - the lockdiscovery property. - -6.7 Usage Considerations - - Although the locking mechanisms specified here provide some help in - preventing lost updates, they cannot guarantee that updates will - never be lost. Consider the following scenario: - - Two clients A and B are interested in editing the resource ' - index.html'. Client A is an HTTP client rather than a WebDAV client, - and so does not know how to perform locking. - Client A doesn't lock the document, but does a GET and begins - editing. - Client B does LOCK, performs a GET and begins editing. - Client B finishes editing, performs a PUT, then an UNLOCK. - Client A performs a PUT, overwriting and losing all of B's changes. - - There are several reasons why the WebDAV protocol itself cannot - prevent this situation. First, it cannot force all clients to use - locking because it must be compatible with HTTP clients that do not - comprehend locking. Second, it cannot require servers to support - locking because of the variety of repository implementations, some of - which rely on reservations and merging rather than on locking. 
- Finally, being stateless, it cannot enforce a sequence of operations - like LOCK / GET / PUT / UNLOCK. - - - -Goland, et al. Standards Track [Page 19] - -RFC 2518 WEBDAV February 1999 - - - WebDAV servers that support locking can reduce the likelihood that - clients will accidentally overwrite each other's changes by requiring - clients to lock resources before modifying them. Such servers would - effectively prevent HTTP 1.0 and HTTP 1.1 clients from modifying - resources. - - WebDAV clients can be good citizens by using a lock / retrieve / - write /unlock sequence of operations (at least by default) whenever - they interact with a WebDAV server that supports locking. - - HTTP 1.1 clients can be good citizens, avoiding overwriting other - clients' changes, by using entity tags in If-Match headers with any - requests that would modify resources. - - Information managers may attempt to prevent overwrites by - implementing client-side procedures requiring locking before - modifying WebDAV resources. - -7 Write Lock - - This section describes the semantics specific to the write lock type. - The write lock is a specific instance of a lock type, and is the only - lock type described in this specification. - -7.1 Methods Restricted by Write Locks - - A write lock MUST prevent a principal without the lock from - successfully executing a PUT, POST, PROPPATCH, LOCK, UNLOCK, MOVE, - DELETE, or MKCOL on the locked resource. All other current methods, - GET in particular, function independently of the lock. - - Note, however, that as new methods are created it will be necessary - to specify how they interact with a write lock. - -7.2 Write Locks and Lock Tokens - - A successful request for an exclusive or shared write lock MUST - result in the generation of a unique lock token associated with the - requesting principal. Thus if five principals have a shared write - lock on the same resource there will be five lock tokens, one for - each principal. 
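The rules of sections 7.1 and 7.2 can be sketched as a small in-memory lock table for a single resource. This is an illustration under stated assumptions, not a server implementation: the WriteLocks class and its method names are hypothetical, Python's random-based uuid.uuid4() stands in for whatever UUID generator a server would actually use, and the sketch does not model exclusive versus shared compatibility or the If header.

```python
import uuid

# Methods that section 7.1 says a write lock must block for principals
# that do not hold the lock.
WRITE_LOCKED_METHODS = {
    "PUT", "POST", "PROPPATCH", "LOCK", "UNLOCK", "MOVE", "DELETE", "MKCOL",
}


class WriteLocks:
    """Hypothetical lock table for one resource (principal -> lock token)."""

    def __init__(self) -> None:
        self.tokens = {}

    def grant(self, principal: str) -> str:
        # Section 7.2: each successful lock request yields its own unique token.
        token = f"opaquelocktoken:{uuid.uuid4()}"
        self.tokens[principal] = token
        return token

    def allows(self, principal: str, method: str) -> bool:
        # Unlocked resources and unrestricted methods (GET, for example) pass.
        if not self.tokens or method.upper() not in WRITE_LOCKED_METHODS:
            return True
        return principal in self.tokens


locks = WriteLocks()
alice_token = locks.grant("alice")
bob_token = locks.grant("bob")  # a second holder of a shared write lock
```

With two principals holding a shared write lock there are two distinct tokens, and a third principal can still GET the resource but cannot PUT to it.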
- -7.3 Write Locks and Properties - - While those without a write lock may not alter a property on a - resource it is still possible for the values of live properties to - change, even while locked, due to the requirements of their schemas. - - - - -Goland, et al. Standards Track [Page 20] - -RFC 2518 WEBDAV February 1999 - - - Only dead properties and live properties defined to respect locks are - guaranteed not to change while write locked. - -7.4 Write Locks and Null Resources - - It is possible to assert a write lock on a null resource in order to - lock the name. - - A write locked null resource, referred to as a lock-null resource, - MUST respond with a 404 (Not Found) or 405 (Method Not Allowed) to - any HTTP/1.1 or DAV methods except for PUT, MKCOL, OPTIONS, PROPFIND, - LOCK, and UNLOCK. A lock-null resource MUST appear as a member of - its parent collection. Additionally the lock-null resource MUST have - defined on it all mandatory DAV properties. Most of these - properties, such as all the get* properties, will have no value as a - lock-null resource does not support the GET method. Lock-Null - resources MUST have defined values for lockdiscovery and - supportedlock properties. - - Until a method such as PUT or MKCOL is successfully executed on the - lock-null resource the resource MUST stay in the lock-null state. - However, once a PUT or MKCOL is successfully executed on a lock-null - resource the resource ceases to be in the lock-null state. - - If the resource is unlocked, for any reason, without a PUT, MKCOL, or - similar method having been successfully executed upon it then the - resource MUST return to the null state. - -7.5 Write Locks and Collections - - A write lock on a collection, whether created by a "Depth: 0" or - "Depth: infinity" lock request, prevents the addition or removal of - member URIs of the collection by non-lock owners. 
As a consequence, - when a principal issues a PUT or POST request to create a new - resource under a URI which needs to be an internal member of a write - locked collection to maintain HTTP namespace consistency, or issues a - DELETE to remove a resource which has a URI which is an existing - internal member URI of a write locked collection, this request MUST - fail if the principal does not have a write lock on the collection. - - However, if a write lock request is issued to a collection containing - member URIs identifying resources that are currently locked in a - manner which conflicts with the write lock, the request MUST fail - with a 423 (Locked) status code. - - If a lock owner causes the URI of a resource to be added as an - internal member URI of a locked collection then the new resource MUST - be automatically added to the lock. This is the only mechanism that - - - -Goland, et al. Standards Track [Page 21] - -RFC 2518 WEBDAV February 1999 - - - allows a resource to be added to a write lock. Thus, for example, if - the collection /a/b/ is write locked and the resource /c is moved to - /a/b/c then resource /a/b/c will be added to the write lock. - -7.6 Write Locks and the If Request Header - - If a user agent is not required to have knowledge about a lock when - requesting an operation on a locked resource, the following scenario - might occur. Program A, run by User A, takes out a write lock on a - resource. Program B, also run by User A, has no knowledge of the - lock taken out by Program A, yet performs a PUT to the locked - resource. In this scenario, the PUT succeeds because locks are - associated with a principal, not a program, and thus program B, - because it is acting with principal A's credential, is allowed to - perform the PUT. However, had program B known about the lock, it - would not have overwritten the resource, preferring instead to - present a dialog box describing the conflict to the user. 
Due to - this scenario, a mechanism is needed to prevent different programs - from accidentally ignoring locks taken out by other programs with the - same authorization. - - In order to prevent these collisions a lock token MUST be submitted - by an authorized principal in the If header for all locked resources - that a method may interact with or the method MUST fail. For - example, if a resource is to be moved and both the source and - destination are locked then two lock tokens must be submitted, one - for the source and the other for the destination. - -7.6.1 Example - Write Lock - - >>Request - - COPY /~fielding/index.html HTTP/1.1 - Host: www.ics.uci.edu - Destination: http://www.ics.uci.edu/users/f/fielding/index.html - If: <http://www.ics.uci.edu/users/f/fielding/index.html> - (<opaquelocktoken:f81d4fae-7dec-11d0-a765-00a0c91e6bf6>) - - >>Response - - HTTP/1.1 204 No Content - - In this example, even though both the source and destination are - locked, only one lock token must be submitted, for the lock on the - destination. This is because the source resource is not modified by - a COPY, and hence unaffected by the write lock. In this example, user - agent authentication has previously occurred via a mechanism outside - the scope of the HTTP protocol, in the underlying transport layer. - - - -Goland, et al. Standards Track [Page 22] - -RFC 2518 WEBDAV February 1999 - - -7.7 Write Locks and COPY/MOVE - - A COPY method invocation MUST NOT duplicate any write locks active on - the source. However, as previously noted, if the COPY copies the - resource into a collection that is locked with "Depth: infinity", - then the resource will be added to the lock. - - A successful MOVE request on a write locked resource MUST NOT move - the write lock with the resource. However, the resource is subject to - being added to an existing lock at the destination, as specified in - section 7.5. 
For example, if the MOVE makes the resource a child of a - collection that is locked with "Depth: infinity", then the resource - will be added to that collection's lock. Additionally, if a resource - locked with "Depth: infinity" is moved to a destination that is - within the scope of the same lock (e.g., within the namespace tree - covered by the lock), the moved resource will again be added to the - lock. In both these examples, as specified in section 7.6, an If - header must be submitted containing a lock token for both the source - and destination. - -7.8 Refreshing Write Locks - - A client MUST NOT submit the same write lock request twice. Note - that a client is always aware it is resubmitting the same lock - request because it must include the lock token in the If header in - order to make the request for a resource that is already locked. - - However, a client may submit a LOCK method with an If header but - without a body. This form of LOCK MUST only be used to "refresh" a - lock. Meaning, at minimum, that any timers associated with the lock - MUST be re-set. - - A server may return a Timeout header with a lock refresh that is - different than the Timeout header returned when the lock was - originally requested. Additionally clients may submit Timeout - headers of arbitrary value with their lock refresh requests. - Servers, as always, may ignore Timeout headers submitted by the - client. - - If an error is received in response to a refresh LOCK request the - client SHOULD assume that the lock was not refreshed. - -8 HTTP Methods for Distributed Authoring - - The following new HTTP methods use XML as a request and response - format. All DAV compliant clients and resources MUST use XML parsers - that are compliant with [REC-XML]. All XML used in either requests - or responses MUST be, at minimum, well formed. If a server receives - - - -Goland, et al. 
Standards Track [Page 23] - -RFC 2518 WEBDAV February 1999 - - - ill-formed XML in a request it MUST reject the entire request with a - 400 (Bad Request). If a client receives ill-formed XML in a response - then it MUST NOT assume anything about the outcome of the executed - method and SHOULD treat the server as malfunctioning. - -8.1 PROPFIND - - The PROPFIND method retrieves properties defined on the resource - identified by the Request-URI, if the resource does not have any - internal members, or on the resource identified by the Request-URI - and potentially its member resources, if the resource is a collection - that has internal member URIs. All DAV compliant resources MUST - support the PROPFIND method and the propfind XML element (section - 12.14) along with all XML elements defined for use with that element. - - A client may submit a Depth header with a value of "0", "1", or - "infinity" with a PROPFIND on a collection resource with internal - member URIs. DAV compliant servers MUST support the "0", "1" and - "infinity" behaviors. By default, the PROPFIND method without a Depth - header MUST act as if a "Depth: infinity" header was included. - - A client may submit a propfind XML element in the body of the request - method describing what information is being requested. It is - possible to request particular property values, all property values, - or a list of the names of the resource's properties. A client may - choose not to submit a request body. An empty PROPFIND request body - MUST be treated as a request for the names and values of all - properties. - - All servers MUST support returning a response of content type - text/xml or application/xml that contains a multistatus XML element - that describes the results of the attempts to retrieve the various - properties. - - If there is an error retrieving a property then a proper error result - MUST be included in the response. 
A request to retrieve the value of - a property which does not exist is an error and MUST be noted, if the - response uses a multistatus XML element, with a response XML element - which contains a 404 (Not Found) status value. - - Consequently, the multistatus XML element for a collection resource - with member URIs MUST include a response XML element for each member - URI of the collection, to whatever depth was requested. Each response - XML element MUST contain an href XML element that gives the URI of - the resource on which the properties in the prop XML element are - defined. Results for a PROPFIND on a collection resource with - internal member URIs are returned as a flat list whose order of - entries is not significant. - - - -Goland, et al. Standards Track [Page 24] - -RFC 2518 WEBDAV February 1999 - - - In the case of allprop and propname, if a principal does not have the - right to know whether a particular property exists then the property - should be silently excluded from the response. - - The results of this method SHOULD NOT be cached. - -8.1.1 Example - Retrieving Named Properties - - >>Request - - PROPFIND /file HTTP/1.1 - Host: www.foo.bar - Content-type: text/xml; charset="utf-8" - Content-Length: xxxx - - <?xml version="1.0" encoding="utf-8" ?> - <D:propfind xmlns:D="DAV:"> - <D:prop xmlns:R="http://www.foo.bar/boxschema/"> - <R:bigbox/> - <R:author/> - <R:DingALing/> - <R:Random/> - </D:prop> - </D:propfind> - - >>Response - - HTTP/1.1 207 Multi-Status - Content-Type: text/xml; charset="utf-8" - Content-Length: xxxx - - <?xml version="1.0" encoding="utf-8" ?> - <D:multistatus xmlns:D="DAV:"> - <D:response> - <D:href>http://www.foo.bar/file</D:href> - <D:propstat> - <D:prop xmlns:R="http://www.foo.bar/boxschema/"> - <R:bigbox> - <R:BoxType>Box type A</R:BoxType> - </R:bigbox> - <R:author> - <R:Name>J.J. 
Johnson</R:Name> - </R:author> - </D:prop> - <D:status>HTTP/1.1 200 OK</D:status> - </D:propstat> - <D:propstat> - <D:prop><R:DingALing/><R:Random/></D:prop> - - - -Goland, et al. Standards Track [Page 25] - -RFC 2518 WEBDAV February 1999 - - - <D:status>HTTP/1.1 403 Forbidden</D:status> - <D:responsedescription> The user does not have access to - the DingALing property. - </D:responsedescription> - </D:propstat> - </D:response> - <D:responsedescription> There has been an access violation error. - </D:responsedescription> - </D:multistatus> - - In this example, PROPFIND is executed on a non-collection resource - http://www.foo.bar/file. The propfind XML element specifies the name - of four properties whose values are being requested. In this case - only two properties were returned, since the principal issuing the - request did not have sufficient access rights to see the third and - fourth properties. - -8.1.2 Example - Using allprop to Retrieve All Properties - - >>Request - - PROPFIND /container/ HTTP/1.1 - Host: www.foo.bar - Depth: 1 - Content-Type: text/xml; charset="utf-8" - Content-Length: xxxx - - <?xml version="1.0" encoding="utf-8" ?> - <D:propfind xmlns:D="DAV:"> - <D:allprop/> - </D:propfind> - - >>Response - - HTTP/1.1 207 Multi-Status - Content-Type: text/xml; charset="utf-8" - Content-Length: xxxx - - <?xml version="1.0" encoding="utf-8" ?> - <D:multistatus xmlns:D="DAV:"> - <D:response> - <D:href>http://www.foo.bar/container/</D:href> - <D:propstat> - <D:prop xmlns:R="http://www.foo.bar/boxschema/"> - <R:bigbox> - <R:BoxType>Box type A</R:BoxType> - </R:bigbox> - <R:author> - - - -Goland, et al. 
Standards Track [Page 26] - -RFC 2518 WEBDAV February 1999 - - - <R:Name>Hadrian</R:Name> - </R:author> - <D:creationdate> - 1997-12-01T17:42:21-08:00 - </D:creationdate> - <D:displayname> - Example collection - </D:displayname> - <D:resourcetype><D:collection/></D:resourcetype> - <D:supportedlock> - <D:lockentry> - <D:lockscope><D:exclusive/></D:lockscope> - <D:locktype><D:write/></D:locktype> - </D:lockentry> - <D:lockentry> - <D:lockscope><D:shared/></D:lockscope> - <D:locktype><D:write/></D:locktype> - </D:lockentry> - </D:supportedlock> - </D:prop> - <D:status>HTTP/1.1 200 OK</D:status> - </D:propstat> - </D:response> - <D:response> - <D:href>http://www.foo.bar/container/front.html</D:href> - <D:propstat> - <D:prop xmlns:R="http://www.foo.bar/boxschema/"> - <R:bigbox> - <R:BoxType>Box type B</R:BoxType> - </R:bigbox> - <D:creationdate> - 1997-12-01T18:27:21-08:00 - </D:creationdate> - <D:displayname> - Example HTML resource - </D:displayname> - <D:getcontentlength> - 4525 - </D:getcontentlength> - <D:getcontenttype> - text/html - </D:getcontenttype> - <D:getetag> - zzyzx - </D:getetag> - <D:getlastmodified> - Monday, 12-Jan-98 09:25:56 GMT - </D:getlastmodified> - - - -Goland, et al. 
Standards Track [Page 27] - -RFC 2518 WEBDAV February 1999 - - - <D:resourcetype/> - <D:supportedlock> - <D:lockentry> - <D:lockscope><D:exclusive/></D:lockscope> - <D:locktype><D:write/></D:locktype> - </D:lockentry> - <D:lockentry> - <D:lockscope><D:shared/></D:lockscope> - <D:locktype><D:write/></D:locktype> - </D:lockentry> - </D:supportedlock> - </D:prop> - <D:status>HTTP/1.1 200 OK</D:status> - </D:propstat> - </D:response> - </D:multistatus> - - In this example, PROPFIND was invoked on the resource - http://www.foo.bar/container/ with a Depth header of 1, meaning the - request applies to the resource and its children, and a propfind XML - element containing the allprop XML element, meaning the request - should return the name and value of all properties defined on each - resource. - - The resource http://www.foo.bar/container/ has six properties defined - on it: - - http://www.foo.bar/boxschema/bigbox, - http://www.foo.bar/boxschema/author, DAV:creationdate, - DAV:displayname, DAV:resourcetype, and DAV:supportedlock. - - The last four properties are WebDAV-specific, defined in section 13. - Since GET is not supported on this resource, the get* properties - (e.g., getcontentlength) are not defined on this resource. The DAV- - specific properties assert that "container" was created on December - 1, 1997, at 5:42:21PM, in a time zone 8 hours west of GMT - (creationdate), has a name of "Example collection" (displayname), a - collection resource type (resourcetype), and supports exclusive write - and shared write locks (supportedlock). - - The resource http://www.foo.bar/container/front.html has nine - properties defined on it: - - http://www.foo.bar/boxschema/bigbox (another instance of the "bigbox" - property type), DAV:creationdate, DAV:displayname, - DAV:getcontentlength, DAV:getcontenttype, DAV:getetag, - DAV:getlastmodified, DAV:resourcetype, and DAV:supportedlock. - - - - -Goland, et al. 
Standards Track [Page 28] - -RFC 2518 WEBDAV February 1999 - - - The DAV-specific properties assert that "front.html" was created on - December 1, 1997, at 6:27:21PM, in a time zone 8 hours west of GMT - (creationdate), has a name of "Example HTML resource" (displayname), - a content length of 4525 bytes (getcontentlength), a MIME type of - "text/html" (getcontenttype), an entity tag of "zzyzx" (getetag), was - last modified on Monday, January 12, 1998, at 09:25:56 GMT - (getlastmodified), has an empty resource type, meaning that it is not - a collection (resourcetype), and supports both exclusive write and - shared write locks (supportedlock). - -8.1.3 Example - Using propname to Retrieve all Property Names - - >>Request - - PROPFIND /container/ HTTP/1.1 - Host: www.foo.bar - Content-Type: text/xml; charset="utf-8" - Content-Length: xxxx - - <?xml version="1.0" encoding="utf-8" ?> - <propfind xmlns="DAV:"> - <propname/> - </propfind> - - >>Response - - HTTP/1.1 207 Multi-Status - Content-Type: text/xml; charset="utf-8" - Content-Length: xxxx - - <?xml version="1.0" encoding="utf-8" ?> - <multistatus xmlns="DAV:"> - <response> - <href>http://www.foo.bar/container/</href> - <propstat> - <prop xmlns:R="http://www.foo.bar/boxschema/"> - <R:bigbox/> - <R:author/> - <creationdate/> - <displayname/> - <resourcetype/> - <supportedlock/> - </prop> - <status>HTTP/1.1 200 OK</status> - </propstat> - </response> - <response> - <href>http://www.foo.bar/container/front.html</href> - - - -Goland, et al. 
Standards Track [Page 29] - -RFC 2518 WEBDAV February 1999 - - - <propstat> - <prop xmlns:R="http://www.foo.bar/boxschema/"> - <R:bigbox/> - <creationdate/> - <displayname/> - <getcontentlength/> - <getcontenttype/> - <getetag/> - <getlastmodified/> - <resourcetype/> - <supportedlock/> - </prop> - <status>HTTP/1.1 200 OK</status> - </propstat> - </response> - </multistatus> - - - In this example, PROPFIND is invoked on the collection resource - http://www.foo.bar/container/, with a propfind XML element containing - the propname XML element, meaning the name of all properties should - be returned. Since no Depth header is present, it assumes its - default value of "infinity", meaning the name of the properties on - the collection and all its progeny should be returned. - - Consistent with the previous example, resource - http://www.foo.bar/container/ has six properties defined on it, - http://www.foo.bar/boxschema/bigbox, - http://www.foo.bar/boxschema/author, DAV:creationdate, - DAV:displayname, DAV:resourcetype, and DAV:supportedlock. - - The resource http://www.foo.bar/container/front.html, a member of the - "container" collection, has nine properties defined on it, - http://www.foo.bar/boxschema/bigbox, DAV:creationdate, - DAV:displayname, DAV:getcontentlength, DAV:getcontenttype, - DAV:getetag, DAV:getlastmodified, DAV:resourcetype, and - DAV:supportedlock. - - This example also demonstrates the use of XML namespace scoping, and - the default namespace. Since the "xmlns" attribute does not contain - an explicit "shorthand name" (prefix) letter, the namespace applies - by default to all enclosed elements. Hence, all elements which do - not explicitly state the namespace to which they belong are members - of the "DAV:" namespace schema. - - - - - - - -Goland, et al. 
Standards Track [Page 30] - -RFC 2518 WEBDAV February 1999 - - -8.2 PROPPATCH - - The PROPPATCH method processes instructions specified in the request - body to set and/or remove properties defined on the resource - identified by the Request-URI. - - All DAV compliant resources MUST support the PROPPATCH method and - MUST process instructions that are specified using the - propertyupdate, set, and remove XML elements of the DAV schema. - Execution of the directives in this method is, of course, subject to - access control constraints. DAV compliant resources SHOULD support - the setting of arbitrary dead properties. - - The request message body of a PROPPATCH method MUST contain the - propertyupdate XML element. Instruction processing MUST occur in the - order instructions are received (i.e., from top to bottom). - Instructions MUST either all be executed or none executed. Thus if - any error occurs during processing all executed instructions MUST be - undone and a proper error result returned. Instruction processing - details can be found in the definition of the set and remove - instructions in section 12.13. - -8.2.1 Status Codes for use with 207 (Multi-Status) - - The following are examples of response codes one would expect to be - used in a 207 (Multi-Status) response for this method. Note, - however, that unless explicitly prohibited any 2/3/4/5xx series - response code may be used in a 207 (Multi-Status) response. - - 200 (OK) - The command succeeded. As there can be a mixture of sets - and removes in a body, a 201 (Created) seems inappropriate. - - 403 (Forbidden) - The client, for reasons the server chooses not to - specify, cannot alter one of the properties. - - 409 (Conflict) - The client has provided a value whose semantics are - not appropriate for the property. This includes trying to set read- - only properties. 
- - 423 (Locked) - The specified resource is locked and the client either - is not a lock owner or the lock type requires a lock token to be - submitted and the client did not submit it. - - 507 (Insufficient Storage) - The server did not have sufficient space - to record the property. - - - - - - -Goland, et al. Standards Track [Page 31] - -RFC 2518 WEBDAV February 1999 - - -8.2.2 Example - PROPPATCH - - >>Request - - PROPPATCH /bar.html HTTP/1.1 - Host: www.foo.com - Content-Type: text/xml; charset="utf-8" - Content-Length: xxxx - - <?xml version="1.0" encoding="utf-8" ?> - <D:propertyupdate xmlns:D="DAV:" - xmlns:Z="http://www.w3.com/standards/z39.50/"> - <D:set> - <D:prop> - <Z:Authors> - <Z:Author>Jim Whitehead</Z:Author> - <Z:Author>Roy Fielding</Z:Author> - </Z:Authors> - </D:prop> - </D:set> - <D:remove> - <D:prop><Z:Copyright-Owner/></D:prop> - </D:remove> - </D:propertyupdate> - - >>Response - - HTTP/1.1 207 Multi-Status - Content-Type: text/xml; charset="utf-8" - Content-Length: xxxx - - <?xml version="1.0" encoding="utf-8" ?> - <D:multistatus xmlns:D="DAV:" - xmlns:Z="http://www.w3.com/standards/z39.50/"> - <D:response> - <D:href>http://www.foo.com/bar.html</D:href> - <D:propstat> - <D:prop><Z:Authors/></D:prop> - <D:status>HTTP/1.1 424 Failed Dependency</D:status> - </D:propstat> - <D:propstat> - <D:prop><Z:Copyright-Owner/></D:prop> - <D:status>HTTP/1.1 409 Conflict</D:status> - </D:propstat> - <D:responsedescription> Copyright Owner can not be deleted or - altered.</D:responsedescription> - </D:response> - </D:multistatus> - - - -Goland, et al. Standards Track [Page 32] - -RFC 2518 WEBDAV February 1999 - - - In this example, the client requests the server to set the value of - the http://www.w3.com/standards/z39.50/Authors property, and to - remove the property http://www.w3.com/standards/z39.50/Copyright- - Owner. Since the Copyright-Owner property could not be removed, no - property modifications occur. 
The 424 (Failed Dependency) status - code for the Authors property indicates this action would have - succeeded if it were not for the conflict with removing the - Copyright-Owner property. - -8.3 MKCOL Method - - The MKCOL method is used to create a new collection. All DAV - compliant resources MUST support the MKCOL method. - -8.3.1 Request - - MKCOL creates a new collection resource at the location specified by - the Request-URI. If the resource identified by the Request-URI is - non-null then the MKCOL MUST fail. During MKCOL processing, a server - MUST make the Request-URI a member of its parent collection, unless - the Request-URI is "/". If no such ancestor exists, the method MUST - fail. When the MKCOL operation creates a new collection resource, - all ancestors MUST already exist, or the method MUST fail with a 409 - (Conflict) status code. For example, if a request to create - collection /a/b/c/d/ is made, and neither /a/b/ nor /a/b/c/ exists, - the request must fail. - - When MKCOL is invoked without a request body, the newly created - collection SHOULD have no members. - - A MKCOL request message may contain a message body. The behavior of - a MKCOL request when the body is present is limited to creating - collections, members of a collection, bodies of members and - properties on the collections or members. If the server receives a - MKCOL request entity type it does not support or understand it MUST - respond with a 415 (Unsupported Media Type) status code. The exact - behavior of MKCOL for various request media types is undefined in - this document, and will be specified in separate documents. - -8.3.2 Status Codes - - Responses from a MKCOL request MUST NOT be cached as MKCOL has non- - idempotent semantics. - - 201 (Created) - The collection or structured resource was created in - its entirety. - - - - - -Goland, et al. 
Standards Track [Page 33] - -RFC 2518 WEBDAV February 1999 - - - 403 (Forbidden) - This indicates at least one of two conditions: 1) - the server does not allow the creation of collections at the given - location in its namespace, or 2) the parent collection of the - Request-URI exists but cannot accept members. - - 405 (Method Not Allowed) - MKCOL can only be executed on a - deleted/non-existent resource. - - 409 (Conflict) - A collection cannot be made at the Request-URI until - one or more intermediate collections have been created. - - 415 (Unsupported Media Type)- The server does not support the request - type of the body. - - 507 (Insufficient Storage) - The resource does not have sufficient - space to record the state of the resource after the execution of this - method. - -8.3.3 Example - MKCOL - - This example creates a collection called /webdisc/xfiles/ on the - server www.server.org. - - >>Request - - MKCOL /webdisc/xfiles/ HTTP/1.1 - Host: www.server.org - - >>Response - - HTTP/1.1 201 Created - -8.4 GET, HEAD for Collections - - The semantics of GET are unchanged when applied to a collection, - since GET is defined as, "retrieve whatever information (in the form - of an entity) is identified by the Request-URI" [RFC2068]. GET when - applied to a collection may return the contents of an "index.html" - resource, a human-readable view of the contents of the collection, or - something else altogether. Hence it is possible that the result of a - GET on a collection will bear no correlation to the membership of the - collection. - - Similarly, since the definition of HEAD is a GET without a response - message body, the semantics of HEAD are unmodified when applied to - collection resources. - - - - - -Goland, et al. 
Standards Track [Page 34] - -RFC 2518 WEBDAV February 1999 - - -8.5 POST for Collections - - Since by definition the actual function performed by POST is - determined by the server and often depends on the particular - resource, the behavior of POST when applied to collections cannot be - meaningfully modified because it is largely undefined. Thus the - semantics of POST are unmodified when applied to a collection. - -8.6 DELETE - - 8.6.1 DELETE for Non-Collection Resources - - If the DELETE method is issued to a non-collection resource whose - URIs are an internal member of one or more collections, then during - DELETE processing a server MUST remove any URI for the resource - identified by the Request-URI from collections which contain it as a - member. - -8.6.2 DELETE for Collections - - The DELETE method on a collection MUST act as if a "Depth: infinity" - header was used on it. A client MUST NOT submit a Depth header with - a DELETE on a collection with any value but infinity. - - DELETE instructs that the collection specified in the Request-URI and - all resources identified by its internal member URIs are to be - deleted. - - If any resource identified by a member URI cannot be deleted then all - of the member's ancestors MUST NOT be deleted, so as to maintain - namespace consistency. - - Any headers included with DELETE MUST be applied in processing every - resource to be deleted. - - When the DELETE method has completed processing it MUST result in a - consistent namespace. - - If an error occurs with a resource other than the resource identified - in the Request-URI then the response MUST be a 207 (Multi-Status). - 424 (Failed Dependency) errors SHOULD NOT be in the 207 (Multi- - Status). They can be safely left out because the client will know - that the ancestors of a resource could not be deleted when the client - receives an error for the ancestor's progeny. Additionally 204 (No - Content) errors SHOULD NOT be returned in the 207 (Multi-Status). 
- The reason for this prohibition is that 204 (No Content) is the - default success code. - - - - -Goland, et al. Standards Track [Page 35] - -RFC 2518 WEBDAV February 1999 - - -8.6.2.1 Example - DELETE - - >>Request - - DELETE /container/ HTTP/1.1 - Host: www.foo.bar - - >>Response - - HTTP/1.1 207 Multi-Status - Content-Type: text/xml; charset="utf-8" - Content-Length: xxxx - - <?xml version="1.0" encoding="utf-8" ?> - <d:multistatus xmlns:d="DAV:"> - <d:response> - <d:href>http://www.foo.bar/container/resource3</d:href> - <d:status>HTTP/1.1 423 Locked</d:status> - </d:response> - </d:multistatus> - - In this example the attempt to delete - http://www.foo.bar/container/resource3 failed because it is locked, - and no lock token was submitted with the request. Consequently, the - attempt to delete http://www.foo.bar/container/ also failed. Thus the - client knows that the attempt to delete http://www.foo.bar/container/ - must have also failed since the parent can not be deleted unless its - child has also been deleted. Even though a Depth header has not been - included, a depth of infinity is assumed because the method is on a - collection. - -8.7 PUT - -8.7.1 PUT for Non-Collection Resources - - A PUT performed on an existing resource replaces the GET response - entity of the resource. Properties defined on the resource may be - recomputed during PUT processing but are not otherwise affected. For - example, if a server recognizes the content type of the request body, - it may be able to automatically extract information that could be - profitably exposed as properties. - - A PUT that would result in the creation of a resource without an - appropriately scoped parent collection MUST fail with a 409 - (Conflict). - - - - - - -Goland, et al. 
Standards Track [Page 36] - -RFC 2518 WEBDAV February 1999 - - -8.7.2 PUT for Collections - - As defined in the HTTP/1.1 specification [RFC2068], the "PUT method - requests that the enclosed entity be stored under the supplied - Request-URI." Since submission of an entity representing a - collection would implicitly encode creation and deletion of - resources, this specification intentionally does not define a - transmission format for creating a collection using PUT. Instead, - the MKCOL method is defined to create collections. - - When the PUT operation creates a new non-collection resource all - ancestors MUST already exist. If all ancestors do not exist, the - method MUST fail with a 409 (Conflict) status code. For example, if - resource /a/b/c/d.html is to be created and /a/b/c/ does not exist, - then the request must fail. - -8.8 COPY Method - - The COPY method creates a duplicate of the source resource, - identified by the Request-URI, in the destination resource, - identified by the URI in the Destination header. The Destination - header MUST be present. The exact behavior of the COPY method - depends on the type of the source resource. - - All WebDAV compliant resources MUST support the COPY method. - However, support for the COPY method does not guarantee the ability - to copy a resource. For example, separate programs may control - resources on the same server. As a result, it may not be possible to - copy a resource to a location that appears to be on the same server. - -8.8.1 COPY for HTTP/1.1 resources - - When the source resource is not a collection the result of the COPY - method is the creation of a new resource at the destination whose - state and behavior match that of the source resource as closely as - possible. After a successful COPY invocation, all properties on the - source resource MUST be duplicated on the destination resource, - subject to modifying headers and XML elements, following the - definition for copying properties. 
Since the environment at the - destination may be different than at the source due to factors - outside the scope of control of the server, such as the absence of - resources required for correct operation, it may not be possible to - completely duplicate the behavior of the resource at the destination. - Subsequent alterations to the destination resource will not modify - the source resource. Subsequent alterations to the source resource - will not modify the destination resource. - - - - - -Goland, et al. Standards Track [Page 37] - -RFC 2518 WEBDAV February 1999 - - -8.8.2. COPY for Properties - - The following section defines how properties on a resource are - handled during a COPY operation. - - Live properties SHOULD be duplicated as identically behaving live - properties at the destination resource. If a property cannot be - copied live, then its value MUST be duplicated, octet-for-octet, in - an identically named, dead property on the destination resource - subject to the effects of the propertybehavior XML element. - - The propertybehavior XML element can specify that properties are - copied on best effort, that all live properties must be successfully - copied or the method must fail, or that a specified list of live - properties must be successfully copied or the method must fail. The - propertybehavior XML element is defined in section 12.12. - -8.8.3 COPY for Collections - - The COPY method on a collection without a Depth header MUST act as if - a Depth header with value "infinity" was included. A client may - submit a Depth header on a COPY on a collection with a value of "0" - or "infinity". DAV compliant servers MUST support the "0" and - "infinity" Depth header behaviors. 
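The Depth defaulting rule for COPY on a collection stated above can be sketched as a small helper. This is a minimal illustration, not part of RFC 2518: the function name effective_copy_depth, the dictionary representation of request headers, and the choice to raise ValueError for any value other than "0" or "infinity" are assumptions made for the example.

```python
from typing import Mapping

# Depth values a client may send with COPY on a collection, per the text above.
ALLOWED_COPY_DEPTHS = ("0", "infinity")


def effective_copy_depth(headers: Mapping[str, str]) -> str:
    """Return the Depth to apply to a COPY on a collection.

    Hypothetical helper: `headers` is assumed to be a plain mapping of
    header names to values, as a server framework might provide.
    """
    # HTTP header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value.strip().lower()
                  for name, value in headers.items()}
    depth = normalized.get("depth")
    if depth is None:
        # No Depth header: act as if "Depth: infinity" was included.
        return "infinity"
    if depth not in ALLOWED_COPY_DEPTHS:
        # Assumption: this sketch rejects other values (such as "1");
        # the quoted text does not say how a server should respond.
        raise ValueError(f"unsupported Depth for COPY: {depth!r}")
    return depth
```

A server sketch would call effective_copy_depth(request_headers) once before walking the collection, then copy only the collection and its properties for "0", or recurse through all members for "infinity".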
- - A COPY of depth infinity instructs that the collection resource - identified by the Request-URI is to be copied to the location - identified by the URI in the Destination header, and all its internal - member resources are to be copied to a location relative to it, - recursively through all levels of the collection hierarchy. - - A COPY of "Depth: 0" only instructs that the collection and its - properties, but not resources identified by its internal member URIs, - are to be copied. - - Any headers included with a COPY MUST be applied in processing every - resource to be copied with the exception of the Destination header. - - The Destination header only specifies the destination URI for the - Request-URI. When applied to members of the collection identified by - the Request-URI the value of Destination is to be modified to reflect - the current location in the hierarchy. So, if the Request-URI is - /a/ with Host header value http://fun.com/ and the Destination is - http://fun.com/b/ then when http://fun.com/a/c/d is processed it must - use a Destination of http://fun.com/b/c/d. - - - - - - -Goland, et al. Standards Track [Page 38] - -RFC 2518 WEBDAV February 1999 - - - When the COPY method has completed processing it MUST have created a - consistent namespace at the destination (see section 5.1 for the - definition of namespace consistency). However, if an error occurs - while copying an internal collection, the server MUST NOT copy any - resources identified by members of this collection (i.e., the server - must skip this subtree), as this would create an inconsistent - namespace. After detecting an error, the COPY operation SHOULD try to - finish as much of the original copy operation as possible (i.e., the - server should still attempt to copy other subtrees and their members, - that are not descendants of an error-causing collection). 
So, for - example, if an infinite depth copy operation is performed on - collection /a/, which contains collections /a/b/ and /a/c/, and an - error occurs copying /a/b/, an attempt should still be made to copy - /a/c/. Similarly, after encountering an error copying a non- - collection resource as part of an infinite depth copy, the server - SHOULD try to finish as much of the original copy operation as - possible. - - If an error in executing the COPY method occurs with a resource other - than the resource identified in the Request-URI then the response - MUST be a 207 (Multi-Status). - - The 424 (Failed Dependency) status code SHOULD NOT be returned in the - 207 (Multi-Status) response from a COPY method. These responses can - be safely omitted because the client will know that the progeny of a - resource could not be copied when the client receives an error for - the parent. Additionally 201 (Created)/204 (No Content) status codes - SHOULD NOT be returned as values in 207 (Multi-Status) responses from - COPY methods. They, too, can be safely omitted because they are the - default success codes. - -8.8.4 COPY and the Overwrite Header - - If a resource exists at the destination and the Overwrite header is - "T" then prior to performing the copy the server MUST perform a - DELETE with "Depth: infinity" on the destination resource. If the - Overwrite header is set to "F" then the operation will fail. - -8.8.5 Status Codes - - 201 (Created) - The source resource was successfully copied. The - copy operation resulted in the creation of a new resource. - - 204 (No Content) - The source resource was successfully copied to a - pre-existing destination resource. - - 403 (Forbidden) _ The source and destination URIs are the same. - - - - -Goland, et al. Standards Track [Page 39] - -RFC 2518 WEBDAV February 1999 - - - 409 (Conflict) _ A resource cannot be created at the destination - until one or more intermediate collections have been created. 
- - 412 (Precondition Failed) - The server was unable to maintain the - liveness of the properties listed in the propertybehavior XML element - or the Overwrite header is "F" and the state of the destination - resource is non-null. - - 423 (Locked) - The destination resource was locked. - - 502 (Bad Gateway) - This may occur when the destination is on another - server and the destination server refuses to accept the resource. - - 507 (Insufficient Storage) - The destination resource does not have - sufficient space to record the state of the resource after the - execution of this method. - -8.8.6 Example - COPY with Overwrite - - This example shows resource - http://www.ics.uci.edu/~fielding/index.html being copied to the - location http://www.ics.uci.edu/users/f/fielding/index.html. The 204 - (No Content) status code indicates the existing resource at the - destination was overwritten. - - >>Request - - COPY /~fielding/index.html HTTP/1.1 - Host: www.ics.uci.edu - Destination: http://www.ics.uci.edu/users/f/fielding/index.html - - >>Response - - HTTP/1.1 204 No Content - -8.8.7 Example - COPY with No Overwrite - - The following example shows the same copy operation being performed, - but with the Overwrite header set to "F." A response of 412 - (Precondition Failed) is returned because the destination resource - has a non-null state. - - >>Request - - COPY /~fielding/index.html HTTP/1.1 - Host: www.ics.uci.edu - Destination: http://www.ics.uci.edu/users/f/fielding/index.html - Overwrite: F - - - -Goland, et al. 
Standards Track [Page 40] - -RFC 2518 WEBDAV February 1999 - - - >>Response - - HTTP/1.1 412 Precondition Failed - -8.8.8 Example - COPY of a Collection - - >>Request - - COPY /container/ HTTP/1.1 - Host: www.foo.bar - Destination: http://www.foo.bar/othercontainer/ - Depth: infinity - Content-Type: text/xml; charset="utf-8" - Content-Length: xxxx - - <?xml version="1.0" encoding="utf-8" ?> - <d:propertybehavior xmlns:d="DAV:"> - <d:keepalive>*</d:keepalive> - </d:propertybehavior> - - >>Response - - HTTP/1.1 207 Multi-Status - Content-Type: text/xml; charset="utf-8" - Content-Length: xxxx - - <?xml version="1.0" encoding="utf-8" ?> - <d:multistatus xmlns:d="DAV:"> - <d:response> - <d:href>http://www.foo.bar/othercontainer/R2/</d:href> - <d:status>HTTP/1.1 412 Precondition Failed</d:status> - </d:response> - </d:multistatus> - - The Depth header is unnecessary as the default behavior of COPY on a - collection is to act as if a "Depth: infinity" header had been - submitted. In this example most of the resources, along with the - collection, were copied successfully. However the collection R2 - failed, most likely due to a problem with maintaining the liveness of - properties (this is specified by the propertybehavior XML element). - Because there was an error copying R2, none of R2's members were - copied. However no errors were listed for those members due to the - error minimization rules given in section 8.8.3. - - - - - - - - -Goland, et al. Standards Track [Page 41] - -RFC 2518 WEBDAV February 1999 - - -8.9 MOVE Method - - The MOVE operation on a non-collection resource is the logical - equivalent of a copy (COPY), followed by consistency maintenance - processing, followed by a delete of the source, where all three - actions are performed atomically. 
The consistency maintenance step - allows the server to perform updates caused by the move, such as - updating all URIs other than the Request-URI which identify the - source resource, to point to the new destination resource. - Consequently, the Destination header MUST be present on all MOVE - methods and MUST follow all COPY requirements for the COPY part of - the MOVE method. All DAV compliant resources MUST support the MOVE - method. However, support for the MOVE method does not guarantee the - ability to move a resource to a particular destination. - - For example, separate programs may actually control different sets of - resources on the same server. Therefore, it may not be possible to - move a resource within a namespace that appears to belong to the same - server. - - If a resource exists at the destination, the destination resource - will be DELETEd as a side-effect of the MOVE operation, subject to - the restrictions of the Overwrite header. - -8.9.1 MOVE for Properties - - The behavior of properties on a MOVE, including the effects of the - propertybehavior XML element, MUST be the same as specified in - section 8.8.2. - -8.9.2 MOVE for Collections - - A MOVE with "Depth: infinity" instructs that the collection - identified by the Request-URI be moved to the URI specified in the - Destination header, and all resources identified by its internal - member URIs are to be moved to locations relative to it, recursively - through all levels of the collection hierarchy. - - The MOVE method on a collection MUST act as if a "Depth: infinity" - header was used on it. A client MUST NOT submit a Depth header on a - MOVE on a collection with any value but "infinity". - - Any headers included with MOVE MUST be applied in processing every - resource to be moved with the exception of the Destination header. - - The behavior of the Destination header is the same as given for COPY - on collections. - - - - -Goland, et al. 
Standards Track [Page 42] - -RFC 2518 WEBDAV February 1999 - - - When the MOVE method has completed processing it MUST have created a - consistent namespace at both the source and destination (see section - 5.1 for the definition of namespace consistency). However, if an - error occurs while moving an internal collection, the server MUST NOT - move any resources identified by members of the failed collection - (i.e., the server must skip the error-causing subtree), as this would - create an inconsistent namespace. In this case, after detecting the - error, the move operation SHOULD try to finish as much of the - original move as possible (i.e., the server should still attempt to - move other subtrees and the resources identified by their members, - that are not descendents of an error-causing collection). So, for - example, if an infinite depth move is performed on collection /a/, - which contains collections /a/b/ and /a/c/, and an error occurs - moving /a/b/, an attempt should still be made to try moving /a/c/. - Similarly, after encountering an error moving a non-collection - resource as part of an infinite depth move, the server SHOULD try to - finish as much of the original move operation as possible. - - If an error occurs with a resource other than the resource identified - in the Request-URI then the response MUST be a 207 (Multi-Status). - - The 424 (Failed Dependency) status code SHOULD NOT be returned in the - 207 (Multi-Status) response from a MOVE method. These errors can be - safely omitted because the client will know that the progeny of a - resource could not be moved when the client receives an error for the - parent. Additionally 201 (Created)/204 (No Content) responses SHOULD - NOT be returned as values in 207 (Multi-Status) responses from a - MOVE. These responses can be safely omitted because they are the - default success codes. 
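The collection rules above (a Destination rewritten for each member, the subtree under a failed collection skipped, and 424 plus the default success codes left out of the 207 body) can be sketched roughly as follows. This is a minimal illustration, not a server implementation: the function name, the flat list of paths, and the `fails` mapping of paths to status codes are all assumptions made for the example, and collections are assumed to end in "/".

```python
def copy_collection(paths, source, destination, fails):
    """Plan a Depth: infinity COPY or MOVE of `source` to `destination`.

    paths: iterable of paths in the source namespace (collections end in "/").
    fails: hypothetical dict mapping a source path to the error status
           its copy is assumed to produce.
    Returns (copied, errors): copied maps source path -> destination path,
    errors maps source path -> status code to list in the 207 body.
    """
    copied = {}
    errors = {}
    skipped = []  # failed collections whose members must not be copied
    # Sorting puts every collection before its own members.
    for path in sorted(paths):
        if not path.startswith(source):
            continue
        if any(path.startswith(prefix) for prefix in skipped):
            # Member of a failed collection: skipped and not reported,
            # so no 424 (Failed Dependency) entry is produced.
            continue
        if path in fails:
            errors[path] = fails[path]
            if path.endswith("/"):
                skipped.append(path)
            continue
        # Rewrite the Destination relative to the member's position.
        copied[path] = destination + path[len(source):]
    return copied, errors


copied, errors = copy_collection(
    ["/a/", "/a/b/", "/a/b/x", "/a/c/", "/a/c/d"], "/a/", "/b/", {"/a/b/": 412}
)
```

With these inputs /a/c/d is mapped to /b/c/d, as in the Destination example given for COPY, and only /a/b/ appears in `errors`; successful members are not listed because 201 and 204 are the default success codes.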
- -8.9.3 MOVE and the Overwrite Header - - If a resource exists at the destination and the Overwrite header is - "T" then prior to performing the move the server MUST perform a - DELETE with "Depth: infinity" on the destination resource. If the - Overwrite header is set to "F" then the operation will fail. - -8.9.4 Status Codes - - 201 (Created) - The source resource was successfully moved, and a new - resource was created at the destination. - - 204 (No Content) - The source resource was successfully moved to a - pre-existing destination resource. - - 403 (Forbidden) _ The source and destination URIs are the same. - - - - - -Goland, et al. Standards Track [Page 43] - -RFC 2518 WEBDAV February 1999 - - - 409 (Conflict) _ A resource cannot be created at the destination - until one or more intermediate collections have been created. - - 412 (Precondition Failed) - The server was unable to maintain the - liveness of the properties listed in the propertybehavior XML element - or the Overwrite header is "F" and the state of the destination - resource is non-null. - - 423 (Locked) - The source or the destination resource was locked. - - 502 (Bad Gateway) - This may occur when the destination is on another - server and the destination server refuses to accept the resource. - -8.9.5 Example - MOVE of a Non-Collection - - This example shows resource - http://www.ics.uci.edu/~fielding/index.html being moved to the - location http://www.ics.uci.edu/users/f/fielding/index.html. The - contents of the destination resource would have been overwritten if - the destination resource had been non-null. In this case, since - there was nothing at the destination resource, the response code is - 201 (Created). 
- - >>Request - - MOVE /~fielding/index.html HTTP/1.1 - Host: www.ics.uci.edu - Destination: http://www.ics.uci.edu/users/f/fielding/index.html - - >>Response - - HTTP/1.1 201 Created - Location: http://www.ics.uci.edu/users/f/fielding/index.html - - -8.9.6 Example - MOVE of a Collection - - >>Request - - MOVE /container/ HTTP/1.1 - Host: www.foo.bar - Destination: http://www.foo.bar/othercontainer/ - Overwrite: F - If: (<opaquelocktoken:fe184f2e-6eec-41d0-c765-01adc56e6bb4>) - (<opaquelocktoken:e454f3f3-acdc-452a-56c7-00a5c91e4b77>) - Content-Type: text/xml; charset="utf-8" - Content-Length: xxxx - - - - -Goland, et al. Standards Track [Page 44] - -RFC 2518 WEBDAV February 1999 - - - <?xml version="1.0" encoding="utf-8" ?> - <d:propertybehavior xmlns:d='DAV:'> - <d:keepalive>*</d:keepalive> - </d:propertybehavior> - - >>Response - - HTTP/1.1 207 Multi-Status - Content-Type: text/xml; charset="utf-8" - Content-Length: xxxx - - <?xml version="1.0" encoding="utf-8" ?> - <d:multistatus xmlns:d='DAV:'> - <d:response> - <d:href>http://www.foo.bar/othercontainer/C2/</d:href> - <d:status>HTTP/1.1 423 Locked</d:status> - </d:response> - </d:multistatus> - - In this example the client has submitted a number of lock tokens with - the request. A lock token will need to be submitted for every - resource, both source and destination, anywhere in the scope of the - method, that is locked. In this case the proper lock token was not - submitted for the destination http://www.foo.bar/othercontainer/C2/. - This means that the resource /container/C2/ could not be moved. - Because there was an error copying /container/C2/, none of - /container/C2's members were copied. However no errors were listed - for those members due to the error minimization rules given in - section 8.8.3. User agent authentication has previously occurred via - a mechanism outside the scope of the HTTP protocol, in an underlying - transport layer. 
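The interaction between the Overwrite header and the 201, 204 and 412 outcomes in sections 8.9.3 and 8.9.4 can be sketched as a small decision function. The function name and its return shape are assumptions for illustration; the other status codes (403, 409, 423, 502) are not modeled here.

```python
def move_status(destination_exists, overwrite=None):
    """Return (status, delete_first) for a MOVE or COPY onto a destination.

    destination_exists: True if the destination resource is non-null.
    overwrite: value of the Overwrite header, or None if the header is absent.
    delete_first: True if the destination must be removed with a
                  "Depth: infinity" DELETE before the operation proceeds.
    """
    if overwrite is None:
        overwrite = "T"  # an absent Overwrite header is treated as "T"
    if overwrite not in ("T", "F"):
        raise ValueError("Overwrite must be 'T' or 'F'")
    if not destination_exists:
        return 201, False  # a new resource is created at the destination
    if overwrite == "F":
        return 412, False  # destination is non-null and must not be replaced
    return 204, True       # existing destination is deleted, then replaced


status, delete_first = move_status(destination_exists=True, overwrite="F")
```

Raising an exception for a value other than "T" or "F" is a choice made for the sketch; the text above does not say how a server should answer a malformed Overwrite header.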
- -8.10 LOCK Method - - The following sections describe the LOCK method, which is used to - take out a lock of any access type. These sections on the LOCK - method describe only those semantics that are specific to the LOCK - method and are independent of the access type of the lock being - requested. - - Any resource which supports the LOCK method MUST, at minimum, support - the XML request and response formats defined herein. - - - - - - - - - -Goland, et al. Standards Track [Page 45] - -RFC 2518 WEBDAV February 1999 - - -8.10.1 Operation - - A LOCK method invocation creates the lock specified by the lockinfo - XML element on the Request-URI. Lock method requests SHOULD have a - XML request body which contains an owner XML element for this lock - request, unless this is a refresh request. The LOCK request may have - a Timeout header. - - Clients MUST assume that locks may arbitrarily disappear at any time, - regardless of the value given in the Timeout header. The Timeout - header only indicates the behavior of the server if "extraordinary" - circumstances do not occur. For example, an administrator may remove - a lock at any time or the system may crash in such a way that it - loses the record of the lock's existence. The response MUST contain - the value of the lockdiscovery property in a prop XML element. - - In order to indicate the lock token associated with a newly created - lock, a Lock-Token response header MUST be included in the response - for every successful LOCK request for a new lock. Note that the - Lock-Token header would not be returned in the response for a - successful refresh LOCK request because a new lock was not created. - -8.10.2 The Effect of Locks on Properties and Collections - - The scope of a lock is the entire state of the resource, including - its body and associated properties. As a result, a lock on a - resource MUST also lock the resource's properties. 
- - For collections, a lock also affects the ability to add or remove - members. The nature of the effect depends upon the type of access - control involved. - -8.10.3 Locking Replicated Resources - - A resource may be made available through more than one URI. However - locks apply to resources, not URIs. Therefore a LOCK request on a - resource MUST NOT succeed if can not be honored by all the URIs - through which the resource is addressable. - -8.10.4 Depth and Locking - - The Depth header may be used with the LOCK method. Values other than - 0 or infinity MUST NOT be used with the Depth header on a LOCK - method. All resources that support the LOCK method MUST support the - Depth header. - - A Depth header of value 0 means to just lock the resource specified - by the Request-URI. - - - -Goland, et al. Standards Track [Page 46] - -RFC 2518 WEBDAV February 1999 - - - If the Depth header is set to infinity then the resource specified in - the Request-URI along with all its internal members, all the way down - the hierarchy, are to be locked. A successful result MUST return a - single lock token which represents all the resources that have been - locked. If an UNLOCK is successfully executed on this token, all - associated resources are unlocked. If the lock cannot be granted to - all resources, a 409 (Conflict) status code MUST be returned with a - response entity body containing a multistatus XML element describing - which resource(s) prevented the lock from being granted. Hence, - partial success is not an option. Either the entire hierarchy is - locked or no resources are locked. - - If no Depth header is submitted on a LOCK request then the request - MUST act as if a "Depth:infinity" had been submitted. - -8.10.5 Interaction with other Methods - - The interaction of a LOCK with various methods is dependent upon the - lock type. However, independent of lock type, a successful DELETE of - a resource MUST cause all of its locks to be removed. 
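The all-or-nothing rule for "Depth: infinity" locks in section 8.10.4 can be sketched as below. This is a simplified illustration under several assumptions: only exclusive locks are modeled, the `locks` dictionary and the flat list of paths stand in for real server state, collections end in "/", and the function name is hypothetical.

```python
import uuid


def lock_infinity(locks, paths, root):
    """Try to lock `root` and everything beneath it under a single token.

    locks: dict mapping already-locked paths to their lock tokens.
    paths: iterable of all paths known to the (hypothetical) server.
    Returns (status, token, conflicts).
    """
    scope = [p for p in paths if p == root or p.startswith(root)]
    conflicts = [p for p in scope if p in locks]
    if conflicts:
        # Partial success is not an option: nothing is locked, and the
        # conflicting resources are reported back to the client.
        return 409, None, conflicts
    token = "opaquelocktoken:" + str(uuid.uuid4())
    for p in scope:
        locks[p] = token  # one token represents every locked resource
    return 200, token, []


locks = {"/webdav/secret": "opaquelocktoken:held-by-someone-else"}
result = lock_infinity(locks, ["/webdav/", "/webdav/a", "/webdav/secret"], "/webdav/")
```

The 409 status follows the wording of section 8.10.4; the example in section 8.10.10 reports the same kind of failure with a 207 (Multi-Status) response instead.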
- -8.10.6 Lock Compatibility Table - - The table below describes the behavior that occurs when a lock - request is made on a resource. - - Current lock state/ | Shared Lock | Exclusive - Lock request | | Lock - =====================+=================+============== - None | True | True - ---------------------+-----------------+-------------- - Shared Lock | True | False - ---------------------+-----------------+-------------- - Exclusive Lock | False | False* - ------------------------------------------------------ - - Legend: True = lock may be granted. False = lock MUST NOT be - granted. *=It is illegal for a principal to request the same lock - twice. - - The current lock state of a resource is given in the leftmost column, - and lock requests are listed in the first row. The intersection of a - row and column gives the result of a lock request. For example, if a - shared lock is held on a resource, and an exclusive lock is - requested, the table entry is "false", indicating the lock must not - be granted. - - - - - -Goland, et al. Standards Track [Page 47] - -RFC 2518 WEBDAV February 1999 - - -8.10.7 Status Codes - - 200 (OK) - The lock request succeeded and the value of the - lockdiscovery property is included in the body. - - 412 (Precondition Failed) - The included lock token was not - enforceable on this resource or the server could not satisfy the - request in the lockinfo XML element. - - 423 (Locked) - The resource is locked, so the method has been - rejected. - -8.10.8 Example - Simple Lock Request - - >>Request - - LOCK /workspace/webdav/proposal.doc HTTP/1.1 - Host: webdav.sb.aol.com - Timeout: Infinite, Second-4100000000 - Content-Type: text/xml; charset="utf-8" - Content-Length: xxxx - Authorization: Digest username="ejw", - realm="ejw@webdav.sb.aol.com", nonce="...", - uri="/workspace/webdav/proposal.doc", - response="...", opaque="..." 
- - <?xml version="1.0" encoding="utf-8" ?> - <D:lockinfo xmlns:D='DAV:'> - <D:lockscope><D:exclusive/></D:lockscope> - <D:locktype><D:write/></D:locktype> - <D:owner> - <D:href>http://www.ics.uci.edu/~ejw/contact.html</D:href> - </D:owner> - </D:lockinfo> - - >>Response - - HTTP/1.1 200 OK - Content-Type: text/xml; charset="utf-8" - Content-Length: xxxx - - <?xml version="1.0" encoding="utf-8" ?> - <D:prop xmlns:D="DAV:"> - <D:lockdiscovery> - <D:activelock> - <D:locktype><D:write/></D:locktype> - <D:lockscope><D:exclusive/></D:lockscope> - <D:depth>Infinity</D:depth> - - - -Goland, et al. Standards Track [Page 48] - -RFC 2518 WEBDAV February 1999 - - - <D:owner> - <D:href> - http://www.ics.uci.edu/~ejw/contact.html - </D:href> - </D:owner> - <D:timeout>Second-604800</D:timeout> - <D:locktoken> - <D:href> - opaquelocktoken:e71d4fae-5dec-22d6-fea5-00a0c91e6be4 - </D:href> - </D:locktoken> - </D:activelock> - </D:lockdiscovery> - </D:prop> - - This example shows the successful creation of an exclusive write lock - on resource http://webdav.sb.aol.com/workspace/webdav/proposal.doc. - The resource http://www.ics.uci.edu/~ejw/contact.html contains - contact information for the owner of the lock. The server has an - activity-based timeout policy in place on this resource, which causes - the lock to automatically be removed after 1 week (604800 seconds). - Note that the nonce, response, and opaque fields have not been - calculated in the Authorization request header. - -8.10.9 Example - Refreshing a Write Lock - - >>Request - - LOCK /workspace/webdav/proposal.doc HTTP/1.1 - Host: webdav.sb.aol.com - Timeout: Infinite, Second-4100000000 - If: (<opaquelocktoken:e71d4fae-5dec-22d6-fea5-00a0c91e6be4>) - Authorization: Digest username="ejw", - realm="ejw@webdav.sb.aol.com", nonce="...", - uri="/workspace/webdav/proposal.doc", - response="...", opaque="..." 
- - >>Response - - HTTP/1.1 200 OK - Content-Type: text/xml; charset="utf-8" - Content-Length: xxxx - - <?xml version="1.0" encoding="utf-8" ?> - <D:prop xmlns:D="DAV:"> - <D:lockdiscovery> - <D:activelock> - <D:locktype><D:write/></D:locktype> - - - -Goland, et al. Standards Track [Page 49] - -RFC 2518 WEBDAV February 1999 - - - <D:lockscope><D:exclusive/></D:lockscope> - <D:depth>Infinity</D:depth> - <D:owner> - <D:href> - http://www.ics.uci.edu/~ejw/contact.html - </D:href> - </D:owner> - <D:timeout>Second-604800</D:timeout> - <D:locktoken> - <D:href> - opaquelocktoken:e71d4fae-5dec-22d6-fea5-00a0c91e6be4 - </D:href> - </D:locktoken> - </D:activelock> - </D:lockdiscovery> - </D:prop> - - This request would refresh the lock, resetting any time outs. Notice - that the client asked for an infinite time out but the server choose - to ignore the request. In this example, the nonce, response, and - opaque fields have not been calculated in the Authorization request - header. - -8.10.10 Example - Multi-Resource Lock Request - - >>Request - - LOCK /webdav/ HTTP/1.1 - Host: webdav.sb.aol.com - Timeout: Infinite, Second-4100000000 - Depth: infinity - Content-Type: text/xml; charset="utf-8" - Content-Length: xxxx - Authorization: Digest username="ejw", - realm="ejw@webdav.sb.aol.com", nonce="...", - uri="/workspace/webdav/proposal.doc", - response="...", opaque="..." - - <?xml version="1.0" encoding="utf-8" ?> - <D:lockinfo xmlns:D="DAV:"> - <D:locktype><D:write/></D:locktype> - <D:lockscope><D:exclusive/></D:lockscope> - <D:owner> - <D:href>http://www.ics.uci.edu/~ejw/contact.html</D:href> - </D:owner> - </D:lockinfo> - - >>Response - - - -Goland, et al. 
Standards Track [Page 50] - -RFC 2518 WEBDAV February 1999 - - - HTTP/1.1 207 Multi-Status - Content-Type: text/xml; charset="utf-8" - Content-Length: xxxx - - <?xml version="1.0" encoding="utf-8" ?> - <D:multistatus xmlns:D="DAV:"> - <D:response> - <D:href>http://webdav.sb.aol.com/webdav/secret</D:href> - <D:status>HTTP/1.1 403 Forbidden</D:status> - </D:response> - <D:response> - <D:href>http://webdav.sb.aol.com/webdav/</D:href> - <D:propstat> - <D:prop><D:lockdiscovery/></D:prop> - <D:status>HTTP/1.1 424 Failed Dependency</D:status> - </D:propstat> - </D:response> - </D:multistatus> - - This example shows a request for an exclusive write lock on a - collection and all its children. In this request, the client has - specified that it desires an infinite length lock, if available, - otherwise a timeout of 4.1 billion seconds, if available. The request - entity body contains the contact information for the principal taking - out the lock, in this case a web page URL. - - The error is a 403 (Forbidden) response on the resource - http://webdav.sb.aol.com/webdav/secret. Because this resource could - not be locked, none of the resources were locked. Note also that the - lockdiscovery property for the Request-URI has been included as - required. In this example the lockdiscovery property is empty which - means that there are no outstanding locks on the resource. - - In this example, the nonce, response, and opaque fields have not been - calculated in the Authorization request header. - -8.11 UNLOCK Method - - The UNLOCK method removes the lock identified by the lock token in - the Lock-Token request header from the Request-URI, and all other - resources included in the lock. If all resources which have been - locked under the submitted lock token can not be unlocked then the - UNLOCK request MUST fail. - - Any DAV compliant resource which supports the LOCK method MUST - support the UNLOCK method. - - - - - -Goland, et al. 
Standards Track [Page 51] - -RFC 2518 WEBDAV February 1999 - - -8.11.1 Example - UNLOCK - - >>Request - - UNLOCK /workspace/webdav/info.doc HTTP/1.1 - Host: webdav.sb.aol.com - Lock-Token: <opaquelocktoken:a515cfa4-5da4-22e1-f5b5-00a0451e6bf7> - Authorization: Digest username="ejw", - realm="ejw@webdav.sb.aol.com", nonce="...", - uri="/workspace/webdav/proposal.doc", - response="...", opaque="..." - - >>Response - - HTTP/1.1 204 No Content - - In this example, the lock identified by the lock token - "opaquelocktoken:a515cfa4-5da4-22e1-f5b5-00a0451e6bf7" is - successfully removed from the resource - http://webdav.sb.aol.com/workspace/webdav/info.doc. If this lock - included more than just one resource, the lock is removed from all - resources included in the lock. The 204 (No Content) status code is - used instead of 200 (OK) because there is no response entity body. - - In this example, the nonce, response, and opaque fields have not been - calculated in the Authorization request header. - -9 HTTP Headers for Distributed Authoring - -9.1 DAV Header - - DAV = "DAV" ":" "1" ["," "2"] ["," 1#extend] - - This header indicates that the resource supports the DAV schema and - protocol as specified. All DAV compliant resources MUST return the - DAV header on all OPTIONS responses. - - The value is a list of all compliance classes that the resource - supports. Note that above a comma has already been added to the 2. - This is because a resource can not be level 2 compliant unless it is - also level 1 compliant. Please refer to section 15 for more details. - In general, however, support for one compliance class does not entail - support for any other. - -9.2 Depth Header - - Depth = "Depth" ":" ("0" | "1" | "infinity") - - - - -Goland, et al. 
Standards Track [Page 52] - -RFC 2518 WEBDAV February 1999 - - - The Depth header is used with methods executed on resources which - could potentially have internal members to indicate whether the - method is to be applied only to the resource ("Depth: 0"), to the - resource and its immediate children, ("Depth: 1"), or the resource - and all its progeny ("Depth: infinity"). - - The Depth header is only supported if a method's definition - explicitly provides for such support. - - The following rules are the default behavior for any method that - supports the Depth header. A method may override these defaults by - defining different behavior in its definition. - - Methods which support the Depth header may choose not to support all - of the header's values and may define, on a case by case basis, the - behavior of the method if a Depth header is not present. For example, - the MOVE method only supports "Depth: infinity" and if a Depth header - is not present will act as if a "Depth: infinity" header had been - applied. - - Clients MUST NOT rely upon methods executing on members of their - hierarchies in any particular order or on the execution being atomic - unless the particular method explicitly provides such guarantees. - - Upon execution, a method with a Depth header will perform as much of - its assigned task as possible and then return a response specifying - what it was able to accomplish and what it failed to do. - - So, for example, an attempt to COPY a hierarchy may result in some of - the members being copied and some not. - - Any headers on a method that has a defined interaction with the Depth - header MUST be applied to all resources in the scope of the method - except where alternative behavior is explicitly defined. For example, - an If-Match header will have its value applied against every resource - in the method's scope and will cause the method to fail if the header - fails to match. 
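The three Depth values described above can be illustrated by computing which resources fall within the scope of a method. The sketch assumes a flat list of paths in which collections end in "/"; the function name is hypothetical, and whether a given method accepts each value is left to that method's definition.

```python
def resources_in_scope(paths, target, depth):
    """Return the resources a method applies to for a given Depth value.

    paths: iterable of all known paths (collections end in "/").
    target: the Request-URI path.
    depth: "0", "1" or "infinity".
    """
    if depth not in ("0", "1", "infinity"):
        raise ValueError("Depth must be '0', '1' or 'infinity'")
    if depth == "0" or not target.endswith("/"):
        return [target]  # only the resource itself
    scope = [target]
    for path in sorted(paths):
        if path == target or not path.startswith(target):
            continue
        # Path relative to the target, without a trailing slash.
        rest = path[len(target):].rstrip("/")
        # Immediate children have no further "/" in the relative path.
        if depth == "infinity" or "/" not in rest:
            scope.append(path)
    return scope


members = ["/a/", "/a/b/", "/a/b/x", "/a/c"]
children = resources_in_scope(members, "/a/", "1")
```

Here "Depth: 1" selects /a/, /a/b/ and /a/c, while "Depth: infinity" also includes /a/b/x.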
- - If a resource, source or destination, within the scope of the method - with a Depth header is locked in such a way as to prevent the - successful execution of the method, then the lock token for that - resource MUST be submitted with the request in the If request header. - - The Depth header only specifies the behavior of the method with - regards to internal children. If a resource does not have internal - children then the Depth header MUST be ignored. - - - - - -Goland, et al. Standards Track [Page 53] - -RFC 2518 WEBDAV February 1999 - - - Please note, however, that it is always an error to submit a value - for the Depth header that is not allowed by the method's definition. - Thus submitting a "Depth: 1" on a COPY, even if the resource does not - have internal members, will result in a 400 (Bad Request). The method - should fail not because the resource doesn't have internal members, - but because of the illegal value in the header. - -9.3 Destination Header - - Destination = "Destination" ":" absoluteURI - - The Destination header specifies the URI which identifies a - destination resource for methods such as COPY and MOVE, which take - two URIs as parameters. Note that the absoluteURI production is - defined in [RFC2396]. - -9.4 If Header - - If = "If" ":" ( 1*No-tag-list | 1*Tagged-list) - No-tag-list = List - Tagged-list = Resource 1*List - Resource = Coded-URL - List = "(" 1*(["Not"](State-token | "[" entity-tag "]")) ")" - State-token = Coded-URL - Coded-URL = "<" absoluteURI ">" - - The If header is intended to have similar functionality to the If- - Match header defined in section 14.25 of [RFC2068]. However the If - header is intended for use with any URI which represents state - information, referred to as a state token, about a resource as well - as ETags. A typical example of a state token is a lock token, and - lock tokens are the only state tokens defined in this specification. - - All DAV compliant resources MUST honor the If header. 
- - The If header's purpose is to describe a series of state lists. If - the state of the resource to which the header is applied does not - match any of the specified state lists then the request MUST fail - with a 412 (Precondition Failed). If one of the described state - lists matches the state of the resource then the request may succeed. - - Note that the absoluteURI production is defined in [RFC2396]. - - - - - - - - - -Goland, et al. Standards Track [Page 54] - -RFC 2518 WEBDAV February 1999 - - -9.4.1 No-tag-list Production - - The No-tag-list production describes a series of state tokens and - ETags. If multiple No-tag-list productions are used then one only - needs to match the state of the resource for the method to be allowed - to continue. - - If a method, due to the presence of a Depth or Destination header, is - applied to multiple resources then the No-tag-list production MUST be - applied to each resource the method is applied to. - -9.4.1.1 Example - No-tag-list If Header - - If: (<locktoken:a-write-lock-token> ["I am an ETag"]) (["I am another - ETag"]) - - The previous header would require that any resources within the scope - of the method must either be locked with the specified lock token and - in the state identified by the "I am an ETag" ETag or in the state - identified by the second ETag "I am another ETag". To put the matter - more plainly one can think of the previous If header as being in the - form (or (and <locktoken:a-write-lock-token> ["I am an ETag"]) (and - ["I am another ETag"])). - -9.4.2 Tagged-list Production - - The tagged-list production scopes a list production. That is, it - specifies that the lists following the resource specification only - apply to the specified resource. The scope of the resource - production begins with the list production immediately following the - resource production and ends with the next resource production, if - any. 
- - When the If header is applied to a particular resource, the Tagged- - list productions MUST be searched to determine if any of the listed - resources match the operand resource(s) for the current method. If - none of the resource productions match the current resource then the - header MUST be ignored. If one of the resource productions does - match the name of the resource under consideration then the list - productions following the resource production MUST be applied to the - resource in the manner specified in the previous section. - - The same URI MUST NOT appear more than once in a resource production - in an If header. - - - - - - - -Goland, et al. Standards Track [Page 55] - -RFC 2518 WEBDAV February 1999 - - -9.4.2.1 Example - Tagged List If header - - COPY /resource1 HTTP/1.1 - Host: www.foo.bar - Destination: http://www.foo.bar/resource2 - If: <http://www.foo.bar/resource1> (<locktoken:a-write-lock-token> - [W/"A weak ETag"]) (["strong ETag"]) - <http://www.bar.bar/random>(["another strong ETag"]) - - In this example http://www.foo.bar/resource1 is being copied to - http://www.foo.bar/resource2. When the method is first applied to - http://www.foo.bar/resource1, resource1 must be in the state - specified by "(<locktoken:a-write-lock-token> [W/"A weak ETag"]) - (["strong ETag"])", that is, it either must be locked with a lock - token of "locktoken:a-write-lock-token" and have a weak entity tag - W/"A weak ETag" or it must have a strong entity tag "strong ETag". - - That is the only success condition since the resource - http://www.bar.bar/random never has the method applied to it (the - only other resource listed in the If header) and - http://www.foo.bar/resource2 is not listed in the If header. - -9.4.3 not Production - - Every state token or ETag is either current, and hence describes the - state of a resource, or is not current, and does not describe the - state of a resource. 
The boolean operation of matching a state token - or ETag to the current state of a resource thus resolves to a true or - false value. The not production is used to reverse that value. The - scope of the not production is the state-token or entity-tag - immediately following it. - - If: (Not <locktoken:write1> <locktoken:write2>) - - When submitted with a request, this If header requires that all - operand resources must not be locked with locktoken:write1 and must - be locked with locktoken:write2. - -9.4.4 Matching Function - - When performing If header processing, the definition of a matching - state token or entity tag is as follows. - - Matching entity tag: Where the entity tag matches an entity tag - associated with that resource. - - Matching state token: Where there is an exact match between the state - token in the If header and any state token on the resource. - - - -Goland, et al. Standards Track [Page 56] - -RFC 2518 WEBDAV February 1999 - - -9.4.5 If Header and Non-DAV Compliant Proxies - - Non-DAV compliant proxies will not honor the If header, since they - will not understand the If header, and HTTP requires non-understood - headers to be ignored. When communicating with HTTP/1.1 proxies, the - "Cache-Control: no-cache" request header MUST be used so as to - prevent the proxy from improperly trying to service the request from - its cache. When dealing with HTTP/1.0 proxies the "Pragma: no-cache" - request header MUST be used for the same reason. - -9.5 Lock-Token Header - - Lock-Token = "Lock-Token" ":" Coded-URL - - The Lock-Token request header is used with the UNLOCK method to - identify the lock to be removed. The lock token in the Lock-Token - request header MUST identify a lock that contains the resource - identified by Request-URI as a member. - - The Lock-Token response header is used with the LOCK method to - indicate the lock token created as a result of a successful LOCK - request to create a new lock. 
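
As a small illustration of the Lock-Token grammar above, the following Python sketch extracts the lock token from a header value. It assumes that Coded-URL is an absolute URI wrapped in angle brackets, as used by the If header examples; the helper name parse_lock_token is hypothetical and the URI itself is not validated.

```python
import re

# Coded-URL = "<" absoluteURI ">"; the URI itself is not validated here.
_CODED_URL = re.compile(r"\s*<([^<>\s]+)>\s*")


def parse_lock_token(header_value):
    """Return the lock token URI carried in a Lock-Token header value."""
    m = _CODED_URL.fullmatch(header_value)
    if m is None:
        raise ValueError("Lock-Token value is not a Coded-URL: %r" % header_value)
    return m.group(1)


print(parse_lock_token("<opaquelocktoken:f81de2ad-7f3d-a1b2-4f3c-00a0c91a9d76>"))
```

A server handling UNLOCK would then look up the returned token among the locks that cover the Request-URI.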
- -9.6 Overwrite Header - - Overwrite = "Overwrite" ":" ("T" | "F") - - The Overwrite header specifies whether the server should overwrite - the state of a non-null destination resource during a COPY or MOVE. - A value of "F" states that the server must not perform the COPY or - MOVE operation if the state of the destination resource is non-null. - If the overwrite header is not included in a COPY or MOVE request - then the resource MUST treat the request as if it has an overwrite - header of value "T". While the Overwrite header appears to duplicate - the functionality of the If-Match: * header of HTTP/1.1, If-Match - applies only to the Request-URI, and not to the Destination of a COPY - or MOVE. - - If a COPY or MOVE is not performed due to the value of the Overwrite - header, the method MUST fail with a 412 (Precondition Failed) status - code. - - All DAV compliant resources MUST support the Overwrite header. - -9.7 Status-URI Response Header - - The Status-URI response header may be used with the 102 (Processing) - status code to inform the client as to the status of a method. - - - -Goland, et al. Standards Track [Page 57] - -RFC 2518 WEBDAV February 1999 - - - Status-URI = "Status-URI" ":" *(Status-Code Coded-URL) ; Status-Code - is defined in 6.1.1 of [RFC2068] - - The URIs listed in the header are source resources which have been - affected by the outstanding method. The status code indicates the - resolution of the method on the identified resource. So, for - example, if a MOVE method on a collection is outstanding and a 102 - (Processing) response with a Status-URI response header is returned, - the included URIs will indicate resources that have had move - attempted on them and what the result was. 
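
The Overwrite rules of section 9.6 reduce to a short precondition check, sketched here in Python. The function name overwrite_precheck and its two parameters are hypothetical; raising ValueError on a malformed value is a choice made for this sketch, since section 9.6 does not say how a server should treat values other than "T" and "F".

```python
def overwrite_precheck(overwrite_header, destination_exists):
    """Return None if a COPY or MOVE may proceed, else the failure status.

    overwrite_header is the raw header value, or None when the header is absent.
    """
    # An absent header is treated as "T", as section 9.6 requires.
    value = "T" if overwrite_header is None else overwrite_header.strip()
    if value not in ("T", "F"):
        raise ValueError("Overwrite must be T or F, got %r" % overwrite_header)
    if value == "F" and destination_exists:
        return 412  # Precondition Failed
    return None


print(overwrite_precheck("F", destination_exists=True))
print(overwrite_precheck(None, destination_exists=True))
```

Only the combination of "F" with a non-null destination blocks the method; every other combination lets it proceed.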
- -9.8 Timeout Request Header - - TimeOut = "Timeout" ":" 1#TimeType - TimeType = ("Second-" DAVTimeOutVal | "Infinite" | Other) - DAVTimeOutVal = 1*digit - Other = "Extend" field-value ; See section 4.2 of [RFC2068] - - Clients may include Timeout headers in their LOCK requests. However, - the server is not required to honor or even consider these requests. - Clients MUST NOT submit a Timeout request header with any method - other than a LOCK method. - - A Timeout request header MUST contain at least one TimeType and may - contain multiple TimeType entries. The purpose of listing multiple - TimeType entries is to indicate multiple different values and value - types that are acceptable to the client. The client lists the - TimeType entries in order of preference. - - Timeout response values MUST use a Second value, Infinite, or a - TimeType the client has indicated familiarity with. The server may - assume a client is familiar with any TimeType submitted in a Timeout - header. - - The "Second" TimeType specifies the number of seconds that will - elapse between granting of the lock at the server, and the automatic - removal of the lock. The timeout value for TimeType "Second" MUST - NOT be greater than 2^32-1. - - The timeout counter SHOULD be restarted any time an owner of the lock - sends a method to any member of the lock, including unsupported - methods, or methods which are unsuccessful. However the lock MUST be - refreshed if a refresh LOCK method is successfully received. - - If the timeout expires then the lock may be lost. Specifically, if - the server wishes to harvest the lock upon time-out, the server - SHOULD act as if an UNLOCK method was executed by the server on the - resource using the lock token of the timed-out lock, performed with - - - -Goland, et al. Standards Track [Page 58] - -RFC 2518 WEBDAV February 1999 - - - its override authority. 
Thus logs should be updated with the - disposition of the lock, notifications should be sent, etc., just as - they would be for an UNLOCK request. - - Servers are advised to pay close attention to the values submitted by - clients, as they will be indicative of the type of activity the - client intends to perform. For example, an applet running in a - browser may need to lock a resource, but because of the instability - of the environment within which the applet is running, the applet may - be turned off without warning. As a result, the applet is likely to - ask for a relatively small timeout value so that if the applet dies, - the lock can be quickly harvested. However, a document management - system is likely to ask for an extremely long timeout because its - user may be planning on going off-line. - - A client MUST NOT assume that just because the time-out has expired - the lock has been lost. - -10 Status Code Extensions to HTTP/1.1 - - The following status codes are added to those defined in HTTP/1.1 - [RFC2068]. - -10.1 102 Processing - - The 102 (Processing) status code is an interim response used to - inform the client that the server has accepted the complete request, - but has not yet completed it. This status code SHOULD only be sent - when the server has a reasonable expectation that the request will - take significant time to complete. As guidance, if a method is taking - longer than 20 seconds (a reasonable, but arbitrary value) to process - the server SHOULD return a 102 (Processing) response. The server MUST - send a final response after the request has been completed. - - Methods can potentially take a long period of time to process, - especially methods that support the Depth header. In such cases the - client may time-out the connection while waiting for a response. To - prevent this the server may return a 102 (Processing) status code to - indicate to the client that the server is still processing the - method. 
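
Returning to the Timeout request header of section 9.8, the grammar and the 2^32-1 limit can be sketched as a small Python parser. The name parse_timeout is hypothetical, and two simplifications are assumed: "Infinite" is represented as math.inf, and any other TimeType is kept as an uninterpreted string (commas inside such values are not handled).

```python
import math
import re

MAX_SECONDS = 2**32 - 1  # upper bound for the "Second" TimeType
_SECOND = re.compile(r"Second-([0-9]+)")


def parse_timeout(value):
    """Return the client's TimeType preferences in the order listed."""
    preferences = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        m = _SECOND.fullmatch(item)
        if item == "Infinite":
            preferences.append(math.inf)
        elif m is not None:
            seconds = int(m.group(1))
            if seconds > MAX_SECONDS:
                raise ValueError("Second value exceeds 2^32-1: %d" % seconds)
            preferences.append(seconds)
        else:
            preferences.append(item)  # "Other" TimeType, left uninterpreted
    if not preferences:
        raise ValueError("Timeout header needs at least one TimeType")
    return preferences


print(parse_timeout("Infinite, Second-4100000000"))
```

Because the server is free to ignore the request, the parsed list is only a hint; the value the server actually grants is whatever it returns in its response.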
- -10.2 207 Multi-Status - - The 207 (Multi-Status) status code provides status for multiple - independent operations (see section 11 for more information). - - - - - - -Goland, et al. Standards Track [Page 59] - -RFC 2518 WEBDAV February 1999 - - -10.3 422 Unprocessable Entity - - The 422 (Unprocessable Entity) status code means the server - understands the content type of the request entity (hence a - 415(Unsupported Media Type) status code is inappropriate), and the - syntax of the request entity is correct (thus a 400 (Bad Request) - status code is inappropriate) but was unable to process the contained - instructions. For example, this error condition may occur if an XML - request body contains well-formed (i.e., syntactically correct), but - semantically erroneous XML instructions. - -10.4 423 Locked - - The 423 (Locked) status code means the source or destination resource - of a method is locked. - -10.5 424 Failed Dependency - - The 424 (Failed Dependency) status code means that the method could - not be performed on the resource because the requested action - depended on another action and that action failed. For example, if a - command in a PROPPATCH method fails then, at minimum, the rest of the - commands will also fail with 424 (Failed Dependency). - -10.6 507 Insufficient Storage - - The 507 (Insufficient Storage) status code means the method could not - be performed on the resource because the server is unable to store - the representation needed to successfully complete the request. This - condition is considered to be temporary. If the request which - received this status code was the result of a user action, the - request MUST NOT be repeated until it is requested by a separate user - action. 
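
The PROPPATCH example given for 424 (Failed Dependency) in section 10.5 can be sketched as follows. The sketch assumes the all-or-nothing behavior of PROPPATCH, so that every command which would otherwise have succeeded is reported as 424 once any command fails; the function name proppatch_statuses and the property names are hypothetical.

```python
FAILED_DEPENDENCY = 424


def proppatch_statuses(outcomes):
    """Map each property name to the status reported for one PROPPATCH.

    outcomes: list of (property_name, error_status or None) in request order.
    """
    if all(error is None for _, error in outcomes):
        return {name: 200 for name, _ in outcomes}
    # At least one command failed: the failing commands keep their own status,
    # all remaining commands are reported as 424 (Failed Dependency).
    return {
        name: FAILED_DEPENDENCY if error is None else error
        for name, error in outcomes
    }


print(proppatch_statuses([("authors", None), ("copyright-owner", 409)]))
```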
- -11 Multi-Status Response - - The default 207 (Multi-Status) response body is a text/xml or - application/xml HTTP entity that contains a single XML element called - multistatus, which contains a set of XML elements called response - which contain 200, 300, 400, and 500 series status codes generated - during the method invocation. 100 series status codes SHOULD NOT be - recorded in a response XML element. - - - - - - - - - -Goland, et al. Standards Track [Page 60] - -RFC 2518 WEBDAV February 1999 - - -12 XML Element Definitions - - In the section below, the final line of each section gives the - element type declaration using the format defined in [REC-XML]. The - "Value" field, where present, specifies further restrictions on the - allowable contents of the XML element using BNF (i.e., to further - restrict the values of a PCDATA element). - -12.1 activelock XML Element - - Name: activelock - Namespace: DAV: - Purpose: Describes a lock on a resource. - - <!ELEMENT activelock (lockscope, locktype, depth, owner?, timeout?, - locktoken?) > - -12.1.1 depth XML Element - - Name: depth - Namespace: DAV: - Purpose: The value of the Depth header. - Value: "0" | "1" | "infinity" - - <!ELEMENT depth (#PCDATA) > - -12.1.2 locktoken XML Element - - Name: locktoken - Namespace: DAV: - Purpose: The lock token associated with a lock. - Description: The href contains one or more opaque lock token URIs - which all refer to the same lock (i.e., the OpaqueLockToken-URI - production in section 6.4). - - <!ELEMENT locktoken (href+) > - -12.1.3 timeout XML Element - - Name: timeout - Namespace: DAV: - Purpose: The timeout associated with a lock - Value: TimeType ;Defined in section 9.8 - - <!ELEMENT timeout (#PCDATA) > - - - - - - -Goland, et al. Standards Track [Page 61] - -RFC 2518 WEBDAV February 1999 - - -12.2 collection XML Element - - Name: collection - Namespace: DAV: - Purpose: Identifies the associated resource as a collection. 
The - resourcetype property of a collection resource MUST have this value. - - <!ELEMENT collection EMPTY > - -12.3 href XML Element - - Name: href - Namespace: DAV: - Purpose: Identifies the content of the element as a URI. - Value: URI ; See section 3.2.1 of [RFC2068] - - <!ELEMENT href (#PCDATA)> - -12.4 link XML Element - - Name: link - Namespace: DAV: - Purpose: Identifies the property as a link and contains the source - and destination of that link. - Description: The link XML element is used to provide the sources and - destinations of a link. The name of the property containing the link - XML element provides the type of the link. Link is a multi-valued - element, so multiple links may be used together to indicate multiple - links with the same type. The values in the href XML elements inside - the src and dst XML elements of the link XML element MUST NOT be - rejected if they point to resources which do not exist. - - <!ELEMENT link (src+, dst+) > - -12.4.1 dst XML Element - - Name: dst - Namespace: DAV: - Purpose: Indicates the destination of a link - Value: URI - - <!ELEMENT dst (#PCDATA) > - -12.4.2 src XML Element - - Name: src - Namespace: DAV: - Purpose: Indicates the source of a link. - - - -Goland, et al. Standards Track [Page 62] - -RFC 2518 WEBDAV February 1999 - - - Value: URI - - <!ELEMENT src (#PCDATA) > - -12.5 lockentry XML Element - - Name: lockentry - Namespace: DAV: - Purpose: Defines the types of locks that can be used with the - resource. - - <!ELEMENT lockentry (lockscope, locktype) > - -12.6 lockinfo XML Element - - Name: lockinfo - Namespace: DAV: - Purpose: The lockinfo XML element is used with a LOCK method to - specify the type of lock the client wishes to have created. - - <!ELEMENT lockinfo (lockscope, locktype, owner?) > - -12.7 lockscope XML Element - - Name: lockscope - Namespace: DAV: - Purpose: Specifies whether a lock is an exclusive lock, or a - shared lock. 
- - <!ELEMENT lockscope (exclusive | shared) > - -12.7.1 exclusive XML Element - - Name: exclusive - Namespace: DAV: - Purpose: Specifies an exclusive lock - - <!ELEMENT exclusive EMPTY > - -12.7.2 shared XML Element - - Name: shared - Namespace: DAV: - Purpose: Specifies a shared lock - - <!ELEMENT shared EMPTY > - - - - - -Goland, et al. Standards Track [Page 63] - -RFC 2518 WEBDAV February 1999 - - -12.8 locktype XML Element - - Name: locktype - Namespace: DAV: - Purpose: Specifies the access type of a lock. At present, this - specification only defines one lock type, the write lock. - - <!ELEMENT locktype (write) > - -12.8.1 write XML Element - - Name: write - Namespace: DAV: - Purpose: Specifies a write lock. - - <!ELEMENT write EMPTY > - -12.9 multistatus XML Element - - Name: multistatus - Namespace: DAV: - Purpose: Contains multiple response messages. - Description: The responsedescription at the top level is used to - provide a general message describing the overarching nature of the - response. If this value is available an application may use it - instead of presenting the individual response descriptions contained - within the responses. - - <!ELEMENT multistatus (response+, responsedescription?) > - -12.9.1 response XML Element - - Name: response - Namespace: DAV: - Purpose: Holds a single response describing the effect of a - method on resource and/or its properties. - Description: A particular href MUST NOT appear more than once as the - child of a response XML element under a multistatus XML element. - This requirement is necessary in order to keep processing costs for a - response to linear time. Essentially, this prevents having to search - in order to group together all the responses by href. There are, - however, no requirements regarding ordering based on href values. - - <!ELEMENT response (href, ((href*, status)|(propstat+)), - responsedescription?) > - - - - - - -Goland, et al. 
Standards Track [Page 64] - -RFC 2518 WEBDAV February 1999 - - -12.9.1.1 propstat XML Element - - Name: propstat - Namespace: DAV: - Purpose: Groups together a prop and status element that is - associated with a particular href element. - Description: The propstat XML element MUST contain one prop XML - element and one status XML element. The contents of the prop XML - element MUST only list the names of properties to which the result in - the status element applies. - - <!ELEMENT propstat (prop, status, responsedescription?) > - -12.9.1.2 status XML Element - - Name: status - Namespace: DAV: - Purpose: Holds a single HTTP status-line - Value: status-line ;status-line defined in [RFC2068] - - <!ELEMENT status (#PCDATA) > - -12.9.2 responsedescription XML Element - - Name: responsedescription - Namespace: DAV: - Purpose: Contains a message that can be displayed to the user - explaining the nature of the response. - Description: This XML element provides information suitable to be - presented to a user. - - <!ELEMENT responsedescription (#PCDATA) > - -12.10 owner XML Element - - Name: owner - Namespace: DAV: - Purpose: Provides information about the principal taking out a - lock. - Description: The owner XML element provides information sufficient - for either directly contacting a principal (such as a telephone - number or Email URI), or for discovering the principal (such as the - URL of a homepage) who owns a lock. - - <!ELEMENT owner ANY> - - - - - - -Goland, et al. Standards Track [Page 65] - -RFC 2518 WEBDAV February 1999 - - -12.11 prop XML element - - Name: prop - Namespace: DAV: - Purpose: Contains properties related to a resource. - Description: The prop XML element is a generic container for - properties defined on resources. All elements inside a prop XML - element MUST define properties related to the resource. No other - elements may be used inside of a prop element. 
- - <!ELEMENT prop ANY> - -12.12 propertybehavior XML element - - Name: propertybehavior Namespace: DAV: Purpose: Specifies - how properties are handled during a COPY or MOVE. - Description: The propertybehavior XML element specifies how - properties are handled during a COPY or MOVE. If this XML element is - not included in the request body then the server is expected to act - as defined by the default property handling behavior of the - associated method. All WebDAV compliant resources MUST support the - propertybehavior XML element. - - <!ELEMENT propertybehavior (omit | keepalive) > - -12.12.1 keepalive XML element - - Name: keepalive - Namespace: DAV: - Purpose: Specifies requirements for the copying/moving of live - properties. - Description: If a list of URIs is included as the value of keepalive - then the named properties MUST be "live" after they are copied - (moved) to the destination resource of a COPY (or MOVE). If the - value "*" is given for the keepalive XML element, this designates - that all live properties on the source resource MUST be live on the - destination. If the requirements specified by the keepalive element - can not be honored then the method MUST fail with a 412 (Precondition - Failed). All DAV compliant resources MUST support the keepalive XML - element for use with the COPY and MOVE methods. - Value: "*" ; #PCDATA value can only be "*" - - <!ELEMENT keepalive (#PCDATA | href+) > - - - - - - - - -Goland, et al. Standards Track [Page 66] - -RFC 2518 WEBDAV February 1999 - - -12.12.2 omit XML element - - Name: omit - Namespace: DAV: - Purpose: The omit XML element instructs the server that it should - use best effort to copy properties but a failure to copy a property - MUST NOT cause the method to fail. Description: The default behavior - for a COPY or MOVE is to copy/move all properties or fail the method. 
- In certain circumstances, such as when a server copies a resource - over another protocol such as FTP, it may not be possible to - copy/move the properties associated with the resource. Thus any - attempt to copy/move over FTP would always have to fail because - properties could not be moved over, even as dead properties. All DAV - compliant resources MUST support the omit XML element on COPY/MOVE - methods. - - <!ELEMENT omit EMPTY > - -12.13 propertyupdate XML element - - Name: propertyupdate - Namespace: DAV: - Purpose: Contains a request to alter the properties on a - resource. - Description: This XML element is a container for the information - required to modify the properties on the resource. This XML element - is multi-valued. - - <!ELEMENT propertyupdate (remove | set)+ > - -12.13.1 remove XML element - - Name: remove - Namespace: DAV: - Purpose: Lists the DAV properties to be removed from a resource. - Description: Remove instructs that the properties specified in prop - should be removed. Specifying the removal of a property that does - not exist is not an error. All the XML elements in a prop XML - element inside of a remove XML element MUST be empty, as only the - names of properties to be removed are required. - - <!ELEMENT remove (prop) > - -12.13.2 set XML element - - Name: set - Namespace: DAV: - Purpose: Lists the DAV property values to be set for a resource. - - - -Goland, et al. Standards Track [Page 67] - -RFC 2518 WEBDAV February 1999 - - - Description: The set XML element MUST contain only a prop XML - element. The elements contained by the prop XML element inside the - set XML element MUST specify the name and value of properties that - are set on the resource identified by Request-URI. If a property - already exists then its value is replaced. 
Language tagging - information in the property's value (in the "xml:lang" attribute, if - present) MUST be persistently stored along with the property, and - MUST be subsequently retrievable using PROPFIND. - - <!ELEMENT set (prop) > - -12.14 propfind XML Element - - Name: propfind - Namespace: DAV: - Purpose: Specifies the properties to be returned from a PROPFIND - method. Two special elements are specified for use with propfind, - allprop and propname. If prop is used inside propfind it MUST only - contain property names, not values. - - <!ELEMENT propfind (allprop | propname | prop) > - -12.14.1 allprop XML Element - - Name: allprop Namespace: DAV: Purpose: The allprop XML - element specifies that all property names and values on the resource - are to be returned. - - <!ELEMENT allprop EMPTY > - -12.14.2 propname XML Element - - Name: propname Namespace: DAV: Purpose: The propname XML - element specifies that only a list of property names on the resource - is to be returned. - - <!ELEMENT propname EMPTY > - -13 DAV Properties - - For DAV properties, the name of the property is also the same as the - name of the XML element that contains its value. In the section - below, the final line of each section gives the element type - declaration using the format defined in [REC-XML]. The "Value" field, - where present, specifies further restrictions on the allowable - contents of the XML element using BNF (i.e., to further restrict the - values of a PCDATA element). - - - - -Goland, et al. Standards Track [Page 68] - -RFC 2518 WEBDAV February 1999 - - -13.1 creationdate Property - - Name: creationdate - Namespace: DAV: - Purpose: Records the time and date the resource was created. - Value: date-time ; See Appendix 2 - Description: The creationdate property should be defined on all DAV - compliant resources. If present, it contains a timestamp of the - moment when the resource was created (i.e., the moment it had non- - null state). 
- - <!ELEMENT creationdate (#PCDATA) > - -13.2 displayname Property - - Name: displayname - Namespace: DAV: - Purpose: Provides a name for the resource that is suitable for - presentation to a user. - Description: The displayname property should be defined on all DAV - compliant resources. If present, the property contains a description - of the resource that is suitable for presentation to a user. - - <!ELEMENT displayname (#PCDATA) > - -13.3 getcontentlanguage Property - - Name: getcontentlanguage - Namespace: DAV: - Purpose: Contains the Content-Language header returned by a GET - without accept headers - Description: The getcontentlanguage property MUST be defined on any - DAV compliant resource that returns the Content-Language header on a - GET. - Value: language-tag ;language-tag is defined in section 14.13 - of [RFC2068] - - <!ELEMENT getcontentlanguage (#PCDATA) > - -13.4 getcontentlength Property - - Name: getcontentlength - Namespace: DAV: - Purpose: Contains the Content-Length header returned by a GET - without accept headers. - Description: The getcontentlength property MUST be defined on any - DAV compliant resource that returns the Content-Length header in - response to a GET. - - - -Goland, et al. Standards Track [Page 69] - -RFC 2518 WEBDAV February 1999 - - - Value: content-length ; see section 14.14 of [RFC2068] - - <!ELEMENT getcontentlength (#PCDATA) > - -13.5 getcontenttype Property - - Name: getcontenttype - Namespace: DAV: - Purpose: Contains the Content-Type header returned by a GET - without accept headers. - Description: This getcontenttype property MUST be defined on any DAV - compliant resource that returns the Content-Type header in response - to a GET. - Value: media-type ; defined in section 3.7 of [RFC2068] - - <!ELEMENT getcontenttype (#PCDATA) > - -13.6 getetag Property - - Name: getetag - Namespace: DAV: - Purpose: Contains the ETag header returned by a GET without - accept headers. 
- Description: The getetag property MUST be defined on any DAV - compliant resource that returns the Etag header. - Value: entity-tag ; defined in section 3.11 of [RFC2068] - - <!ELEMENT getetag (#PCDATA) > - -13.7 getlastmodified Property - - Name: getlastmodified - Namespace: DAV: - Purpose: Contains the Last-Modified header returned by a GET - method without accept headers. - Description: Note that the last-modified date on a resource may - reflect changes in any part of the state of the resource, not - necessarily just a change to the response to the GET method. For - example, a change in a property may cause the last-modified date to - change. The getlastmodified property MUST be defined on any DAV - compliant resource that returns the Last-Modified header in response - to a GET. - Value: HTTP-date ; defined in section 3.3.1 of [RFC2068] - - <!ELEMENT getlastmodified (#PCDATA) > - - - - - - -Goland, et al. Standards Track [Page 70] - -RFC 2518 WEBDAV February 1999 - - -13.8 lockdiscovery Property - - Name: lockdiscovery - Namespace: DAV: - Purpose: Describes the active locks on a resource - Description: The lockdiscovery property returns a listing of who has - a lock, what type of lock he has, the timeout type and the time - remaining on the timeout, and the associated lock token. The server - is free to withhold any or all of this information if the requesting - principal does not have sufficient access rights to see the requested - data. 
- - <!ELEMENT lockdiscovery (activelock)* > - -13.8.1 Example - Retrieving the lockdiscovery Property - - >>Request - - PROPFIND /container/ HTTP/1.1 - Host: www.foo.bar - Content-Length: xxxx - Content-Type: text/xml; charset="utf-8" - - <?xml version="1.0" encoding="utf-8" ?> - <D:propfind xmlns:D='DAV:'> - <D:prop><D:lockdiscovery/></D:prop> - </D:propfind> - - >>Response - - HTTP/1.1 207 Multi-Status - Content-Type: text/xml; charset="utf-8" - Content-Length: xxxx - - <?xml version="1.0" encoding="utf-8" ?> - <D:multistatus xmlns:D='DAV:'> - <D:response> - <D:href>http://www.foo.bar/container/</D:href> - <D:propstat> - <D:prop> - <D:lockdiscovery> - <D:activelock> - <D:locktype><D:write/></D:locktype> - <D:lockscope><D:exclusive/></D:lockscope> - <D:depth>0</D:depth> - <D:owner>Jane Smith</D:owner> - <D:timeout>Infinite</D:timeout> - <D:locktoken> - - - -Goland, et al. Standards Track [Page 71] - -RFC 2518 WEBDAV February 1999 - - - <D:href> - opaquelocktoken:f81de2ad-7f3d-a1b2-4f3c-00a0c91a9d76 - </D:href> - </D:locktoken> - </D:activelock> - </D:lockdiscovery> - </D:prop> - <D:status>HTTP/1.1 200 OK</D:status> - </D:propstat> - </D:response> - </D:multistatus> - - This resource has a single exclusive write lock on it, with an - infinite timeout. - -13.9 resourcetype Property - - Name: resourcetype - Namespace: DAV: - Purpose: Specifies the nature of the resource. - Description: The resourcetype property MUST be defined on all DAV - compliant resources. The default value is empty. - - <!ELEMENT resourcetype ANY > - -13.10 source Property - - Name: source - Namespace: DAV: - Purpose: The destination of the source link identifies the - resource that contains the unprocessed source of the link's source. 
- Description: The source of the link (src) is typically the URI of the - output resource on which the link is defined, and there is typically - only one destination (dst) of the link, which is the URI where the - unprocessed source of the resource may be accessed. When more than - one link destination exists, this specification asserts no policy on - ordering. - - <!ELEMENT source (link)* > - -13.10.1 Example - A source Property - - <?xml version="1.0" encoding="utf-8" ?> - <D:prop xmlns:D="DAV:" xmlns:F="http://www.foocorp.com/Project/"> - <D:source> - <D:link> - <F:projfiles>Source</F:projfiles> - <D:src>http://foo.bar/program</D:src> - - - -Goland, et al. Standards Track [Page 72] - -RFC 2518 WEBDAV February 1999 - - - <D:dst>http://foo.bar/src/main.c</D:dst> - </D:link> - <D:link> - <F:projfiles>Library</F:projfiles> - <D:src>http://foo.bar/program</D:src> - <D:dst>http://foo.bar/src/main.lib</D:dst> - </D:link> - <D:link> - <F:projfiles>Makefile</F:projfiles> - <D:src>http://foo.bar/program</D:src> - <D:dst>http://foo.bar/src/makefile</D:dst> - </D:link> - </D:source> - </D:prop> - - In this example the resource http://foo.bar/program has a source - property that contains three links. Each link contains three - elements, two of which, src and dst, are part of the DAV schema - defined in this document, and one which is defined by the schema - http://www.foocorp.com/project/ (Source, Library, and Makefile). A - client which only implements the elements in the DAV spec will not - understand the foocorp elements and will ignore them, thus seeing the - expected source and destination links. An enhanced client may know - about the foocorp elements and be able to present the user with - additional information about the links. This example demonstrates - the power of XML markup, allowing element values to be enhanced - without breaking older clients. 
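
The behavior described for a client that only implements the DAV elements can be illustrated with Python's standard xml.etree.ElementTree module. This is a sketch under stated assumptions: the function name source_links is hypothetical, and the input is a shortened copy of the example property above. Because the code only looks up elements in the DAV: namespace, the foocorp elements are skipped without any special handling.

```python
import xml.etree.ElementTree as ET

DAV = "{DAV:}"  # ElementTree spells namespaced tags as "{namespace}name"


def source_links(prop_xml):
    """Return one (sources, destinations) pair of URI lists per DAV: link."""
    root = ET.fromstring(prop_xml)
    links = []
    for link in root.iter(DAV + "link"):
        sources = [(e.text or "").strip() for e in link.findall(DAV + "src")]
        destinations = [(e.text or "").strip() for e in link.findall(DAV + "dst")]
        links.append((sources, destinations))
    return links


prop = b"""<?xml version="1.0" encoding="utf-8" ?>
<D:prop xmlns:D="DAV:" xmlns:F="http://www.foocorp.com/Project/">
  <D:source>
    <D:link>
      <F:projfiles>Source</F:projfiles>
      <D:src>http://foo.bar/program</D:src>
      <D:dst>http://foo.bar/src/main.c</D:dst>
    </D:link>
    <D:link>
      <F:projfiles>Makefile</F:projfiles>
      <D:src>http://foo.bar/program</D:src>
      <D:dst>http://foo.bar/src/makefile</D:dst>
    </D:link>
  </D:source>
</D:prop>"""
for sources, destinations in source_links(prop):
    print(sources, "->", destinations)
```

An enhanced client would additionally read the F:projfiles child of each link; the older client shown here still sees every source and destination.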
- -13.11 supportedlock Property - - Name: supportedlock - Namespace: DAV: - Purpose: To provide a listing of the lock capabilities supported - by the resource. - Description: The supportedlock property of a resource returns a - listing of the combinations of scope and access types which may be - specified in a lock request on the resource. Note that the actual - contents are themselves controlled by access controls so a server is - not required to provide information the client is not authorized to - see. - - <!ELEMENT supportedlock (lockentry)* > - -13.11.1 Example - Retrieving the supportedlock Property - - >>Request - - PROPFIND /container/ HTTP/1.1 - - - -Goland, et al. Standards Track [Page 73] - -RFC 2518 WEBDAV February 1999 - - - Host: www.foo.bar - Content-Length: xxxx - Content-Type: text/xml; charset="utf-8" - - <?xml version="1.0" encoding="utf-8" ?> - <D:propfind xmlns:D="DAV:"> - <D:prop><D:supportedlock/></D:prop> - </D:propfind> - - >>Response - - HTTP/1.1 207 Multi-Status - Content-Type: text/xml; charset="utf-8" - Content-Length: xxxx - - <?xml version="1.0" encoding="utf-8" ?> - <D:multistatus xmlns:D="DAV:"> - <D:response> - <D:href>http://www.foo.bar/container/</D:href> - <D:propstat> - <D:prop> - <D:supportedlock> - <D:lockentry> - <D:lockscope><D:exclusive/></D:lockscope> - <D:locktype><D:write/></D:locktype> - </D:lockentry> - <D:lockentry> - <D:lockscope><D:shared/></D:lockscope> - <D:locktype><D:write/></D:locktype> - </D:lockentry> - </D:supportedlock> - </D:prop> - <D:status>HTTP/1.1 200 OK</D:status> - </D:propstat> - </D:response> - </D:multistatus> - -14 Instructions for Processing XML in DAV - - All DAV compliant resources MUST ignore any unknown XML element and - all its children encountered while processing a DAV method that uses - XML as its command language. 
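
The rule that unknown XML elements and their children are ignored can be shown on the supportedlock response of section 13.11.1. The Python sketch below uses the standard xml.etree.ElementTree module; the function names are hypothetical, the X:future element and its namespace are invented to stand in for an unknown extension, and the propstat status is not inspected.

```python
import xml.etree.ElementTree as ET

DAV = "{DAV:}"
SCOPES = {DAV + "exclusive": "exclusive", DAV + "shared": "shared"}
TYPES = {DAV + "write": "write"}


def _known_child(parent, known):
    """Return the name of the first child listed in `known`, skipping others."""
    if parent is None:
        return None
    for child in parent:
        if child.tag in known:
            return known[child.tag]
    return None


def supported_locks(multistatus_xml):
    """Map each response href to its (lockscope, locktype) combinations."""
    root = ET.fromstring(multistatus_xml)
    result = {}
    for response in root.findall(DAV + "response"):
        href = (response.findtext(DAV + "href") or "").strip()
        entries = []
        for entry in response.iter(DAV + "lockentry"):
            scope = _known_child(entry.find(DAV + "lockscope"), SCOPES)
            lock_type = _known_child(entry.find(DAV + "locktype"), TYPES)
            if scope and lock_type:
                entries.append((scope, lock_type))
        result[href] = entries
    return result


body = b"""<?xml version="1.0" encoding="utf-8" ?>
<D:multistatus xmlns:D="DAV:" xmlns:X="http://example.com/hypothetical">
  <D:response>
    <D:href>http://www.foo.bar/container/</D:href>
    <D:propstat>
      <D:prop>
        <D:supportedlock>
          <D:lockentry>
            <D:lockscope><D:exclusive/></D:lockscope>
            <D:locktype><D:write/></D:locktype>
            <X:future>ignored</X:future>
          </D:lockentry>
          <D:lockentry>
            <D:lockscope><D:shared/></D:lockscope>
            <D:locktype><D:write/></D:locktype>
          </D:lockentry>
        </D:supportedlock>
      </D:prop>
      <D:status>HTTP/1.1 200 OK</D:status>
    </D:propstat>
  </D:response>
</D:multistatus>"""
print(supported_locks(body))
```

Looking elements up by their qualified DAV: names, rather than walking every child, is what makes the unknown element harmless here.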
- - This restriction also applies to the processing, by clients, of DAV - property values where unknown XML elements SHOULD be ignored unless - the property's schema declares otherwise. - - - - - -Goland, et al. Standards Track [Page 74] - -RFC 2518 WEBDAV February 1999 - - - This restriction does not apply to setting dead DAV properties on the - server where the server MUST record unknown XML elements. - - Additionally, this restriction does not apply to the use of XML where - XML happens to be the content type of the entity body, for example, - when used as the body of a PUT. - - Since XML can be transported as text/xml or application/xml, a DAV - server MUST accept DAV method requests with XML parameters - transported as either text/xml or application/xml, and DAV client - MUST accept XML responses using either text/xml or application/xml. - -15 DAV Compliance Classes - - A DAV compliant resource can choose from two classes of compliance. - A client can discover the compliance classes of a resource by - executing OPTIONS on the resource, and examining the "DAV" header - which is returned. - - Since this document describes extensions to the HTTP/1.1 protocol, - minimally all DAV compliant resources, clients, and proxies MUST be - compliant with [RFC2068]. - - Compliance classes are not necessarily sequential. A resource that is - class 2 compliant must also be class 1 compliant; but if additional - compliance classes are defined later, a resource that is class 1, 2, - and 4 compliant might not be class 3 compliant. Also note that - identifiers other than numbers may be used as compliance class - identifiers. - -15.1 Class 1 - - A class 1 compliant resource MUST meet all "MUST" requirements in all - sections of this document. - - Class 1 compliant resources MUST return, at minimum, the value "1" in - the DAV header on all responses to the OPTIONS method. 
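
Discovering compliance classes from an OPTIONS response can be sketched as below. The sketch assumes the DAV header value is a comma-separated list of class identifiers and that the caller has already extracted the header value from the response; the function names are hypothetical. Identifiers are kept as strings because, as noted above, they need not be numbers and need not be sequential.

```python
def compliance_classes(dav_header):
    """Return the set of compliance class identifiers in a DAV header value."""
    return {part.strip() for part in dav_header.split(",") if part.strip()}


def is_class1(dav_header):
    """True if the resource advertises class 1 compliance."""
    return "1" in compliance_classes(dav_header)


print(sorted(compliance_classes("1, 2, 4")))
print(is_class1("1, 2, 4"))
```

Treating the value as a set avoids assuming that a resource advertising class 4 also supports class 3.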
- -15.2 Class 2 - - A class 2 compliant resource MUST meet all class 1 requirements and - support the LOCK method, the supportedlock property, the - lockdiscovery property, the Time-Out response header and the Lock- - Token request header. A class "2" compliant resource SHOULD also - support the Time-Out request header and the owner XML element. - - Class 2 compliant resources MUST return, at minimum, the values "1" - and "2" in the DAV header on all responses to the OPTIONS method. - - - -Goland, et al. Standards Track [Page 75] - -RFC 2518 WEBDAV February 1999 - - -16 Internationalization Considerations - - In the realm of internationalization, this specification complies - with the IETF Character Set Policy [RFC2277]. In this specification, - human-readable fields can be found either in the value of a property, - or in an error message returned in a response entity body. In both - cases, the human-readable content is encoded using XML, which has - explicit provisions for character set tagging and encoding, and - requires that XML processors read XML elements encoded, at minimum, - using the UTF-8 [UTF-8] encoding of the ISO 10646 multilingual plane. - XML examples in this specification demonstrate use of the charset - parameter of the Content-Type header, as defined in [RFC2376], as - well as the XML "encoding" attribute, which together provide charset - identification information for MIME and XML processors. - - XML also provides a language tagging capability for specifying the - language of the contents of a particular XML element. XML uses - either IANA registered language tags (see [RFC1766]) or ISO 639 - language tags [ISO-639] in the "xml:lang" attribute of an XML element - to identify the language of its content and attributes. - - WebDAV applications MUST support the character set tagging, character - set encoding, and the language tagging functionality of the XML - specification. 
Implementors of WebDAV applications are strongly - encouraged to read "XML Media Types" [RFC2376] for instruction on - which MIME media type to use for XML transport, and on use of the - charset parameter of the Content-Type header. - - Names used within this specification fall into three categories: - names of protocol elements such as methods and headers, names of XML - elements, and names of properties. Naming of protocol elements - follows the precedent of HTTP, using English names encoded in USASCII - for methods and headers. Since these protocol elements are not - visible to users, and are in fact simply long token identifiers, they - do not need to support encoding in multiple character sets. - Similarly, though the names of XML elements used in this - specification are English names encoded in UTF-8, these names are not - visible to the user, and hence do not need to support multiple - character set encodings. - - The name of a property defined on a resource is a URI. Although some - applications (e.g., a generic property viewer) will display property - URIs directly to their users, it is expected that the typical - application will use a fixed set of properties, and will provide a - mapping from the property name URI to a human-readable field when - displaying the property name to a user. It is only in the case where - - - - - -Goland, et al. Standards Track [Page 76] - -RFC 2518 WEBDAV February 1999 - - - the set of properties is not known ahead of time that an application - need display a property name URI to a user. We recommend that - applications provide human-readable property names wherever feasible. - - For error reporting, we follow the convention of HTTP/1.1 status - codes, including with each status code a short, English description - of the code (e.g., 423 (Locked)). 
While the possibility exists that - a poorly crafted user agent would display this message to a user, - internationalized applications will ignore this message, and display - an appropriate message in the user's language and character set. - - Since interoperation of clients and servers does not require locale - information, this specification does not specify any mechanism for - transmission of this information. - -17 Security Considerations - - This section is provided to detail issues concerning security - implications of which WebDAV applications need to be aware. - - All of the security considerations of HTTP/1.1 (discussed in - [RFC2068]) and XML (discussed in [RFC2376]) also apply to WebDAV. In - addition, the security risks inherent in remote authoring require - stronger authentication technology, introduce several new privacy - concerns, and may increase the hazards from poor server design. - These issues are detailed below. - -17.1 Authentication of Clients - - Due to their emphasis on authoring, WebDAV servers need to use - authentication technology to protect not just access to a network - resource, but the integrity of the resource as well. Furthermore, - the introduction of locking functionality requires support for - authentication. - - A password sent in the clear over an insecure channel is an - inadequate means for protecting the accessibility and integrity of a - resource as the password may be intercepted. Since Basic - authentication for HTTP/1.1 performs essentially clear text - transmission of a password, Basic authentication MUST NOT be used to - authenticate a WebDAV client to a server unless the connection is - secure. Furthermore, a WebDAV server MUST NOT send Basic - authentication credentials in a WWW-Authenticate header unless the - connection is secure. 
Examples of secure connections include a - Transport Layer Security (TLS) connection employing a strong cipher - suite with mutual authentication of client and server, or a - connection over a network which is physically secure, for example, an - isolated network in a building with restricted access. - - - -Goland, et al. Standards Track [Page 77] - -RFC 2518 WEBDAV February 1999 - - - WebDAV applications MUST support the Digest authentication scheme - [RFC2069]. Since Digest authentication verifies that both parties to - a communication know a shared secret, a password, without having to - send that secret in the clear, Digest authentication avoids the - security problems inherent in Basic authentication while providing a - level of authentication which is useful in a wide range of scenarios. - -17.2 Denial of Service - - Denial of service attacks are of special concern to WebDAV servers. - WebDAV plus HTTP enables denial of service attacks on every part of a - system's resources. - - The underlying storage can be attacked by PUTting extremely large - files. - - Asking for recursive operations on large collections can attack - processing time. - - Making multiple pipelined requests on multiple connections can attack - network connections. - - WebDAV servers need to be aware of the possibility of a denial of - service attack at all levels. - -17.3 Security through Obscurity - - WebDAV provides, through the PROPFIND method, a mechanism for listing - the member resources of a collection. This greatly diminishes the - effectiveness of security or privacy techniques that rely only on the - difficulty of discovering the names of network resources. Users of - WebDAV servers are encouraged to use access control techniques to - prevent unwanted access to resources, rather than depending on the - relative obscurity of their resource names. 
- -17.4 Privacy Issues Connected to Locks - - When submitting a lock request a user agent may also submit an owner - XML field giving contact information for the person taking out the - lock (for those cases where a person, rather than a robot, is taking - out the lock). This contact information is stored in a lockdiscovery - property on the resource, and can be used by other collaborators to - begin negotiation over access to the resource. However, in many - cases this contact information can be very private, and should not be - widely disseminated. Servers SHOULD limit read access to the - lockdiscovery property as appropriate. Furthermore, user agents - - - - - -Goland, et al. Standards Track [Page 78] - -RFC 2518 WEBDAV February 1999 - - - SHOULD provide control over whether contact information is sent at - all, and if contact information is sent, control over exactly what - information is sent. - -17.5 Privacy Issues Connected to Properties - - Since property values are typically used to hold information such as - the author of a document, there is the possibility that privacy - concerns could arise stemming from widespread access to a resource's - property data. To reduce the risk of inadvertent release of private - information via properties, servers are encouraged to develop access - control mechanisms that separate read access to the resource body and - read access to the resource's properties. This allows a user to - control the dissemination of their property data without overly - restricting access to the resource's contents. - -17.6 Reduction of Security due to Source Link - - HTTP/1.1 warns against providing read access to script code because - it may contain sensitive information. Yet WebDAV, via its source - link facility, can potentially provide a URI for script resources so - they may be authored. For HTTP/1.1, a server could reasonably - prevent access to source resources due to the predominance of read- - only access. 
WebDAV, with its emphasis on authoring, encourages read - and write access to source resources, and provides the source link - facility to identify the source. This reduces the security benefits - of eliminating access to source resources. Users and administrators - of WebDAV servers should be very cautious when allowing remote - authoring of scripts, limiting read and write access to the source - resources to authorized principals. - -17.7 Implications of XML External Entities - - XML supports a facility known as "external entities", defined in - section 4.2.2 of [REC-XML], which instruct an XML processor to - retrieve and perform an inline include of XML located at a particular - URI. An external XML entity can be used to append or modify the - document type declaration (DTD) associated with an XML document. An - external XML entity can also be used to include XML within the - content of an XML document. For non-validating XML, such as the XML - used in this specification, including an external XML entity is not - required by [REC-XML]. However, [REC-XML] does state that an XML - processor may, at its discretion, include the external XML entity. - - External XML entities have no inherent trustworthiness and are - subject to all the attacks that are endemic to any HTTP GET request. - Furthermore, it is possible for an external XML entity to modify the - DTD, and hence affect the final form of an XML document, in the worst - - - -Goland, et al. Standards Track [Page 79] - -RFC 2518 WEBDAV February 1999 - - - case significantly modifying its semantics, or exposing the XML - processor to the security risks discussed in [RFC2376]. Therefore, - implementers must be aware that external XML entities should be - treated as untrustworthy. - - There is also the scalability risk that would accompany a widely - deployed application which made use of external XML entities. 
In - this situation, it is possible that there would be significant - numbers of requests for one external XML entity, potentially - overloading any server which fields requests for the resource - containing the external XML entity. - -17.8 Risks Connected with Lock Tokens - - This specification, in section 6.4, requires the use of Universal - Unique Identifiers (UUIDs) for lock tokens, in order to guarantee - their uniqueness across space and time. UUIDs, as defined in [ISO- - 11578], contain a "node" field which "consists of the IEEE address, - usually the host address. For systems with multiple IEEE 802 nodes, - any available node address can be used." Since a WebDAV server will - issue many locks over its lifetime, the implication is that it will - also be publicly exposing its IEEE 802 address. - - There are several risks associated with exposure of IEEE 802 - addresses. Using the IEEE 802 address: - - * It is possible to track the movement of hardware from subnet to - subnet. - - * It may be possible to identify the manufacturer of the hardware - running a WebDAV server. - - * It may be possible to determine the number of each type of computer - running WebDAV. - - Section 6.4.1 of this specification details an alternate mechanism - for generating the "node" field of a UUID without using an IEEE 802 - address, which alleviates the risks associated with exposure of IEEE - 802 addresses by using an alternate source of uniqueness. - -18 IANA Considerations - - This document defines two namespaces, the namespace of property - names, and the namespace of WebDAV-specific XML elements used within - property values. - - - - - - -Goland, et al. Standards Track [Page 80] - -RFC 2518 WEBDAV February 1999 - - - URIs are used for both names, for several reasons. Assignment of a - URI does not require a request to a central naming authority, and - hence allows WebDAV property names and XML elements to be quickly - defined by any WebDAV user or application.
URIs also provide a - unique address space, ensuring that the distributed users of WebDAV - will not have collisions among the property names and XML elements - they create. - - This specification defines a distinguished set of property names and - XML elements that are understood by all WebDAV applications. The - property names and XML elements in this specification are all derived - from the base URI DAV: by adding a suffix to this URI, for example, - DAV:creationdate for the "creationdate" property. - - This specification also defines a URI scheme for the encoding of lock - tokens, the opaquelocktoken URI scheme described in section 6.4. - - To ensure correct interoperation based on this specification, IANA - must reserve the URI namespaces starting with "DAV:" and with - "opaquelocktoken:" for use by this specification, its revisions, and - related WebDAV specifications. - -19 Intellectual Property - - The following notice is copied from RFC 2026 [RFC2026], section 10.4, - and describes the position of the IETF concerning intellectual - property claims made against this document. - - The IETF takes no position regarding the validity or scope of any - intellectual property or other rights that might be claimed to - pertain to the implementation or use other technology described in - this document or the extent to which any license under such rights - might or might not be available; neither does it represent that it - has made any effort to identify any such rights. Information on the - IETF's procedures with respect to rights in standards-track and - standards-related documentation can be found in BCP-11. Copies of - claims of rights made available for publication and any assurances of - licenses to be made available, or the result of an attempt made to - obtain a general license or permission for the use of such - proprietary rights by implementors or users of this specification can - be obtained from the IETF Secretariat. 
- - The IETF invites any interested party to bring to its attention any - copyrights, patents or patent applications, or other proprietary - rights which may cover technology that may be required to practice - this standard. Please address the information to the IETF Executive - Director. - - - - -Goland, et al. Standards Track [Page 81] - -RFC 2518 WEBDAV February 1999 - - -20 Acknowledgements - - A specification such as this thrives on piercing critical review and - withers from apathetic neglect. The authors gratefully acknowledge - the contributions of the following people, whose insights were so - valuable at every stage of our work. - - Terry Allen, Harald Alvestrand, Jim Amsden, Becky Anderson, Alan - Babich, Sanford Barr, Dylan Barrell, Bernard Chester, Tim Berners- - Lee, Dan Connolly, Jim Cunningham, Ron Daniel, Jr., Jim Davis, Keith - Dawson, Mark Day, Brian Deen, Martin Duerst, David Durand, Lee - Farrell, Chuck Fay, Wesley Felter, Roy Fielding, Mark Fisher, Alan - Freier, George Florentine, Jim Gettys, Phill Hallam-Baker, Dennis - Hamilton, Steve Henning, Mead Himelstein, Alex Hopmann, Andre van der - Hoek, Ben Laurie, Paul Leach, Ora Lassila, Karen MacArthur, Steven - Martin, Larry Masinter, Michael Mealling, Keith Moore, Thomas Narten, - Henrik Nielsen, Kenji Ota, Bob Parker, Glenn Peterson, Jon Radoff, - Saveen Reddy, Henry Sanders, Christopher Seiwald, Judith Slein, Mike - Spreitzer, Einar Stefferud, Greg Stein, Ralph Swick, Kenji Takahashi, - Richard N. Taylor, Robert Thau, John Turner, Sankar Virdhagriswaran, - Fabio Vitali, Gregory Woodhouse, and Lauren Wood. - - Two from this list deserve special mention. The contributions by - Larry Masinter have been invaluable, both in helping the formation of - the working group and in patiently coaching the authors along the - way. In so many ways he has set high standards we have toiled to - meet. 
The contributions of Judith Slein in clarifying the - requirements, and in patiently reviewing draft after draft, both - improved this specification and expanded our minds on document - management. - - We would also like to thank John Turner for developing the XML DTD. - -21 References - -21.1 Normative References - - [RFC1766] Alvestrand, H., "Tags for the Identification of - Languages", RFC 1766, March 1995. - - [RFC2277] Alvestrand, H., "IETF Policy on Character Sets and - Languages", BCP 18, RFC 2277, January 1998. - - [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate - Requirement Levels", BCP 14, RFC 2119, March 1997. - - - - - - -Goland, et al. Standards Track [Page 82] - -RFC 2518 WEBDAV February 1999 - - - [RFC2396] Berners-Lee, T., Fielding, R. and L. Masinter, - "Uniform Resource Identifiers (URI): Generic Syntax", - RFC 2396, August 1998. - - [REC-XML] T. Bray, J. Paoli, C. M. Sperberg-McQueen, - "Extensible Markup Language (XML)." World Wide Web - Consortium Recommendation REC-xml-19980210. - http://www.w3.org/TR/1998/REC-xml-19980210. - - [REC-XML-NAMES] T. Bray, D. Hollander, A. Layman, "Namespaces in - XML". World Wide Web Consortium Recommendation REC- - xml-names-19990114. http://www.w3.org/TR/1999/REC- - xml-names-19990114/ - - [RFC2069] Franks, J., Hallam-Baker, P., Hostetler, J., Leach, - P, Luotonen, A., Sink, E. and L. Stewart, "An - Extension to HTTP : Digest Access Authentication", - RFC 2069, January 1997. - - [RFC2068] Fielding, R., Gettys, J., Mogul, J., Frystyk, H. and - T. Berners-Lee, "Hypertext Transfer Protocol -- - HTTP/1.1", RFC 2068, January 1997. - - [ISO-639] ISO (International Organization for Standardization). - ISO 639:1988. "Code for the representation of names - of languages." - - [ISO-8601] ISO (International Organization for Standardization). - ISO 8601:1988. "Data elements and interchange formats - - Information interchange - Representation of dates - and times." 
- - [ISO-11578] ISO (International Organization for Standardization). - ISO/IEC 11578:1996. "Information technology - Open - Systems Interconnection - Remote Procedure Call - (RPC)" - - [RFC2141] Moats, R., "URN Syntax", RFC 2141, May 1997. - - [UTF-8] Yergeau, F., "UTF-8, a transformation format of - Unicode and ISO 10646", RFC 2279, January 1998. - -21.2 Informational References - - [RFC2026] Bradner, S., "The Internet Standards Process - Revision - 3", BCP 9, RFC 2026, October 1996. - - - - - -Goland, et al. Standards Track [Page 83] - -RFC 2518 WEBDAV February 1999 - - - [RFC1807] Lasher, R. and D. Cohen, "A Format for Bibliographic - Records", RFC 1807, June 1995. - - [WF] C. Lagoze, "The Warwick Framework: A Container - Architecture for Diverse Sets of Metadata", D-Lib - Magazine, July/August 1996. - http://www.dlib.org/dlib/july96/lagoze/07lagoze.html - - [USMARC] Network Development and MARC Standards, Office, ed. 1994. - "USMARC Format for Bibliographic Data", 1994. Washington, - DC: Cataloging Distribution Service, Library of Congress. - - [REC-PICS] J. Miller, T. Krauskopf, P. Resnick, W. Treese, "PICS - Label Distribution Label Syntax and Communication - Protocols" Version 1.1, World Wide Web Consortium - Recommendation REC-PICS-labels-961031. - http://www.w3.org/pub/WWW/TR/REC-PICS-labels-961031.html. - - [RFC2291] Slein, J., Vitali, F., Whitehead, E. and D. Durand, - "Requirements for Distributed Authoring and Versioning - Protocol for the World Wide Web", RFC 2291, February 1998. - - [RFC2413] Weibel, S., Kunze, J., Lagoze, C. and M. Wolf, "Dublin - Core Metadata for Resource Discovery", RFC 2413, September - 1998. - - [RFC2376] Whitehead, E. and M. Murata, "XML Media Types", RFC 2376, - July 1998. - -22 Authors' Addresses - - Y. Y. Goland - Microsoft Corporation - One Microsoft Way - Redmond, WA 98052-6399 - - EMail: yarong@microsoft.com - - - E. J. Whitehead, Jr. - Dept. 
Of Information and Computer Science - University of California, Irvine - Irvine, CA 92697-3425 - - EMail: ejw@ics.uci.edu - - - - - - -Goland, et al. Standards Track [Page 84] - -RFC 2518 WEBDAV February 1999 - - - A. Faizi - Netscape - 685 East Middlefield Road - Mountain View, CA 94043 - - EMail: asad@netscape.com - - - S. R. Carter - Novell - 1555 N. Technology Way - M/S ORM F111 - Orem, UT 84097-2399 - - EMail: srcarter@novell.com - - - D. Jensen - Novell - 1555 N. Technology Way - M/S ORM F111 - Orem, UT 84097-2399 - - EMail: dcjensen@novell.com - - - - - - - - - - - - - - - - - - - - - - - - - - - -Goland, et al. Standards Track [Page 85] - -RFC 2518 WEBDAV February 1999 - - -23 Appendices - -23.1 Appendix 1 - WebDAV Document Type Definition - - This section provides a document type definition, following the rules - in [REC-XML], for the XML elements used in the protocol stream and in - the values of properties. It collects the element definitions given - in sections 12 and 13. - - <!DOCTYPE webdav-1.0 [ - - <!--============ XML Elements from Section 12 ==================--> - - <!ELEMENT activelock (lockscope, locktype, depth, owner?, timeout?, - locktoken?) > - - <!ELEMENT lockentry (lockscope, locktype) > - <!ELEMENT lockinfo (lockscope, locktype, owner?) > - - <!ELEMENT locktype (write) > - <!ELEMENT write EMPTY > - - <!ELEMENT lockscope (exclusive | shared) > - <!ELEMENT exclusive EMPTY > - <!ELEMENT shared EMPTY > - - <!ELEMENT depth (#PCDATA) > - - <!ELEMENT owner ANY > - - <!ELEMENT timeout (#PCDATA) > - - <!ELEMENT locktoken (href+) > - - <!ELEMENT href (#PCDATA) > - - <!ELEMENT link (src+, dst+) > - <!ELEMENT dst (#PCDATA) > - <!ELEMENT src (#PCDATA) > - - <!ELEMENT multistatus (response+, responsedescription?) > - - <!ELEMENT response (href, ((href*, status)|(propstat+)), - responsedescription?) > - <!ELEMENT status (#PCDATA) > - <!ELEMENT propstat (prop, status, responsedescription?) 
> - <!ELEMENT responsedescription (#PCDATA) > - - - - -Goland, et al. Standards Track [Page 86] - -RFC 2518 WEBDAV February 1999 - - - <!ELEMENT prop ANY > - - <!ELEMENT propertybehavior (omit | keepalive) > - <!ELEMENT omit EMPTY > - - <!ELEMENT keepalive (#PCDATA | href+) > - - <!ELEMENT propertyupdate (remove | set)+ > - <!ELEMENT remove (prop) > - <!ELEMENT set (prop) > - - <!ELEMENT propfind (allprop | propname | prop) > - <!ELEMENT allprop EMPTY > - <!ELEMENT propname EMPTY > - - <!ELEMENT collection EMPTY > - - <!--=========== Property Elements from Section 13 ===============--> - <!ELEMENT creationdate (#PCDATA) > - <!ELEMENT displayname (#PCDATA) > - <!ELEMENT getcontentlanguage (#PCDATA) > - <!ELEMENT getcontentlength (#PCDATA) > - <!ELEMENT getcontenttype (#PCDATA) > - <!ELEMENT getetag (#PCDATA) > - <!ELEMENT getlastmodified (#PCDATA) > - <!ELEMENT lockdiscovery (activelock)* > - <!ELEMENT resourcetype ANY > - <!ELEMENT source (link)* > - <!ELEMENT supportedlock (lockentry)* > - ]> - - - - - - - - - - - - - - - - - - - - - -Goland, et al. Standards Track [Page 87] - -RFC 2518 WEBDAV February 1999 - - -23.2 Appendix 2 - ISO 8601 Date and Time Profile - - The creationdate property specifies the use of the ISO 8601 date - format [ISO-8601]. This section defines a profile of the ISO 8601 - date format for use with this specification. This profile is quoted - from an Internet-Draft by Chris Newman, and is mentioned here to - properly attribute his work. - - date-time = full-date "T" full-time - - full-date = date-fullyear "-" date-month "-" date-mday - full-time = partial-time time-offset - - date-fullyear = 4DIGIT - date-month = 2DIGIT ; 01-12 - date-mday = 2DIGIT ; 01-28, 01-29, 01-30, 01-31 based on - month/year - time-hour = 2DIGIT ; 00-23 - time-minute = 2DIGIT ; 00-59 - time-second = 2DIGIT ; 00-59, 00-60 based on leap second rules - time-secfrac = "." 
1*DIGIT - time-numoffset = ("+" / "-") time-hour ":" time-minute - time-offset = "Z" / time-numoffset - - partial-time = time-hour ":" time-minute ":" time-second - [time-secfrac] - - Numeric offsets are calculated as local time minus UTC (Coordinated - Universal Time). So the equivalent time in UTC can be determined by - subtracting the offset from the local time. For example, 18:50:00- - 04:00 is the same time as 22:50:00Z. - - If the time in UTC is known, but the offset to local time is unknown, - this can be represented with an offset of "-00:00". This differs - from an offset of "Z" which implies that UTC is the preferred - reference point for the specified time. - - - - - - - - - - - - - - - -Goland, et al. Standards Track [Page 88] - -RFC 2518 WEBDAV February 1999 - - -23.3 Appendix 3 - Notes on Processing XML Elements - -23.3.1 Notes on Empty XML Elements - - XML supports two mechanisms for indicating that an XML element does - not have any content. The first is to declare an XML element of the - form <A></A>. The second is to declare an XML element of the form - <A/>. The two XML elements are semantically identical. - - It is a violation of the XML specification to use the <A></A> form if - the associated DTD declares the element to be EMPTY (e.g., <!ELEMENT - A EMPTY>). If such a statement is included, then the empty element - format, <A/> must be used. If the element is not declared to be - EMPTY, then either form <A></A> or <A/> may be used for empty - elements. - - 23.3.2 Notes on Illegal XML Processing - - XML is a flexible data format that makes it easy to submit data that - appears legal but in fact is not. The philosophy of "Be flexible in - what you accept and strict in what you send" still applies, but it - must not be applied inappropriately. XML is extremely flexible in - dealing with issues of white space, element ordering, inserting new - elements, etc.
This flexibility does not require extension, - especially not in the area of the meaning of elements. - - There is no kindness in accepting illegal combinations of XML - elements. At best it will cause an unwanted result and at worst it - can cause real damage. - -23.3.2.1 Example - XML Syntax Error - - The following request body for a PROPFIND method is illegal. - - <?xml version="1.0" encoding="utf-8" ?> - <D:propfind xmlns:D="DAV:"> - <D:allprop/> - <D:propname/> - </D:propfind> - - The definition of the propfind element only allows for the allprop or - the propname element, not both. Thus the above is an error and must - be responded to with a 400 (Bad Request). - - - - - - - - -Goland, et al. Standards Track [Page 89] - -RFC 2518 WEBDAV February 1999 - - - Imagine, however, that a server wanted to be "kind" and decided to - pick the allprop element as the true element and respond to it. A - client running over a bandwidth limited line who intended to execute - a propname would be in for a big surprise if the server treated the - command as an allprop. - - Additionally, if a server were lenient and decided to reply to this - request, the results would vary randomly from server to server, with - some servers executing the allprop directive, and others executing - the propname directive. This reduces interoperability rather than - increasing it. - -23.3.2.2 Example - Unknown XML Element - - The previous example was illegal because it contained two elements - that were explicitly banned from appearing together in the propfind - element. However, XML is an extensible language, so one can imagine - new elements being defined for use with propfind. Below is the - request body of a PROPFIND and, like the previous example, must be - rejected with a 400 (Bad Request) by a server that does not - understand the expired-props element. 
- - <?xml version="1.0" encoding="utf-8" ?> - <D:propfind xmlns:D="DAV:" - xmlns:E="http://www.foo.bar/standards/props/"> - <E:expired-props/> - </D:propfind> - - To understand why a 400 (Bad Request) is returned let us look at the - request body as the server unfamiliar with expired-props sees it. - - <?xml version="1.0" encoding="utf-8" ?> - <D:propfind xmlns:D="DAV:" - xmlns:E="http://www.foo.bar/standards/props/"> - </D:propfind> - - As the server does not understand the expired-props element, - according to the WebDAV-specific XML processing rules specified in - section 14, it must ignore it. Thus the server sees an empty - propfind, which by the definition of the propfind element is illegal. - - Please note that had the extension been additive it would not - necessarily have resulted in a 400 (Bad Request). For example, - imagine the following request body for a PROPFIND: - - <?xml version="1.0" encoding="utf-8" ?> - <D:propfind xmlns:D="DAV:" - xmlns:E="http://www.foo.bar/standards/props/"> - - - -Goland, et al. Standards Track [Page 90] - -RFC 2518 WEBDAV February 1999 - - - <D:propname/> - <E:leave-out>*boss*</E:leave-out> - </D:propfind> - - The previous example contains the fictitious element leave-out. Its - purpose is to prevent the return of any property whose name matches - the submitted pattern. If the previous example were submitted to a - server unfamiliar with leave-out, the only result would be that the - leave-out element would be ignored and a propname would be executed. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -Goland, et al. Standards Track [Page 91] - -RFC 2518 WEBDAV February 1999 - - -23.4 Appendix 4 -- XML Namespaces for WebDAV - -23.4.1 Introduction - - All DAV compliant systems MUST support the XML namespace extensions - as specified in [REC-XML-NAMES]. 
- -23.4.2 Meaning of Qualified Names - - [Note to the reader: This section does not appear in [REC-XML-NAMES], - but is necessary to avoid ambiguity for WebDAV XML processors.] - - WebDAV compliant XML processors MUST interpret a qualified name as a - URI constructed by appending the LocalPart to the namespace name URI. - - Example - - <del:glider xmlns:del="http://www.del.jensen.org/"> - <del:glidername> - Johnny Updraft - </del:glidername> - <del:glideraccidents/> - </del:glider> - - In this example, the qualified element name "del:glider" is - interpreted as the URL "http://www.del.jensen.org/glider". - - <bar:glider xmlns:bar="http://www.del.jensen.org/"> - <bar:glidername> - Johnny Updraft - </bar:glidername> - <bar:glideraccidents/> - </bar:glider> - - Even though this example is syntactically different from the previous - example, it is semantically identical. Each instance of the - namespace name "bar" is replaced with "http://www.del.jensen.org/" - and then appended to the local name for each element tag. The - resulting tag names in this example are exactly the same as for the - previous example. - - <foo:r xmlns:foo="http://www.del.jensen.org/glide"> - <foo:rname> - Johnny Updraft - </foo:rname> - <foo:raccidents/> - </foo:r> - - - - -Goland, et al. Standards Track [Page 92] - -RFC 2518 WEBDAV February 1999 - - - This example is semantically identical to the two previous ones. - Each instance of the namespace name "foo" is replaced with - "http://www.del.jensen.org/glide" which is then appended to the local - name for each element tag, the resulting tag names are identical to - those in the previous examples. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -Goland, et al. Standards Track [Page 93] - -RFC 2518 WEBDAV February 1999 - - -24. Full Copyright Statement - - Copyright (C) The Internet Society (1999). All Rights Reserved.
- - This document and translations of it may be copied and furnished to - others, and derivative works that comment on or otherwise explain it - or assist in its implementation may be prepared, copied, published - and distributed, in whole or in part, without restriction of any - kind, provided that the above copyright notice and this paragraph are - included on all such copies and derivative works. However, this - document itself may not be modified in any way, such as by removing - the copyright notice or references to the Internet Society or other - Internet organizations, except as needed for the purpose of - developing Internet standards in which case the procedures for - copyrights defined in the Internet Standards process must be - followed, or as required to translate it into languages other than - English. - - The limited permissions granted above are perpetual and will not be - revoked by the Internet Society or its successors or assigns. - - This document and the information contained herein is provided on an - "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING - TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING - BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION - HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF - MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. - - - - - - - - - - - - - - - - - - - - - - - - -Goland, et al. Standards Track [Page 94] - diff --git a/docs/specs/rfc2616.txt b/docs/specs/rfc2616.txt deleted file mode 100644 index 32f6f69d..00000000 --- a/docs/specs/rfc2616.txt +++ /dev/null @@ -1,9934 +0,0 @@ - -[[ Text in double brackets is from the unofficial errata at ]] -[[ http://skrb.org/ietf/http_errata.html ]] - - -Network Working Group R. Fielding -Request for Comments: 2616 UC Irvine -Obsoletes: 2068 J. Gettys -Category: Standards Track Compaq/W3C - J. Mogul - Compaq - H. Frystyk - W3C/MIT - L. Masinter - Xerox - P. Leach - Microsoft - T. 
Berners-Lee - W3C/MIT - June 1999 - - - Hypertext Transfer Protocol -- HTTP/1.1 - -Status of this Memo - - This document specifies an Internet standards track protocol for the - Internet community, and requests discussion and suggestions for - improvements. Please refer to the current edition of the "Internet - Official Protocol Standards" (STD 1) for the standardization state - and status of this protocol. Distribution of this memo is unlimited. - -Copyright Notice - - Copyright (C) The Internet Society (1999). All Rights Reserved. - -Abstract - - The Hypertext Transfer Protocol (HTTP) is an application-level - protocol for distributed, collaborative, hypermedia information - systems. It is a generic, stateless, protocol which can be used for - many tasks beyond its use for hypertext, such as name servers and - distributed object management systems, through extension of its - request methods, error codes and headers [47]. A feature of HTTP is - the typing and negotiation of data representation, allowing systems - to be built independently of the data being transferred. - - HTTP has been in use by the World-Wide Web global information - initiative since 1990. This specification defines the protocol - referred to as "HTTP/1.1", and is an update to RFC 2068 [33]. - - - - - - -Fielding, et al. 
Standards Track [Page 1] - -RFC 2616 HTTP/1.1 June 1999 - - -Table of Contents - - 1 Introduction ...................................................7 - 1.1 Purpose......................................................7 - 1.2 Requirements .................................................8 - 1.3 Terminology ..................................................8 - 1.4 Overall Operation ...........................................12 - 2 Notational Conventions and Generic Grammar ....................14 - 2.1 Augmented BNF ...............................................14 - 2.2 Basic Rules .................................................15 - 3 Protocol Parameters ...........................................17 - 3.1 HTTP Version ................................................17 - 3.2 Uniform Resource Identifiers ................................18 - 3.2.1 General Syntax ...........................................19 - 3.2.2 http URL .................................................19 - 3.2.3 URI Comparison ...........................................20 - 3.3 Date/Time Formats ...........................................20 - 3.3.1 Full Date ................................................20 - 3.3.2 Delta Seconds ............................................21 - 3.4 Character Sets ..............................................21 - 3.4.1 Missing Charset ..........................................22 - 3.5 Content Codings .............................................23 - 3.6 Transfer Codings ............................................24 - 3.6.1 Chunked Transfer Coding ..................................25 - 3.7 Media Types .................................................26 - 3.7.1 Canonicalization and Text Defaults .......................27 - 3.7.2 Multipart Types ..........................................27 - 3.8 Product Tokens ..............................................28 - 3.9 Quality Values ..............................................29 - 3.10 Language Tags 
...............................................29 - 3.11 Entity Tags .................................................30 - 3.12 Range Units .................................................30 - 4 HTTP Message ..................................................31 - 4.1 Message Types ...............................................31 - 4.2 Message Headers .............................................31 - 4.3 Message Body ................................................32 - 4.4 Message Length ..............................................33 - 4.5 General Header Fields .......................................34 - 5 Request .......................................................35 - 5.1 Request-Line ................................................35 - 5.1.1 Method ...................................................36 - 5.1.2 Request-URI ..............................................36 - 5.2 The Resource Identified by a Request ........................38 - 5.3 Request Header Fields .......................................38 - 6 Response ......................................................39 - 6.1 Status-Line .................................................39 - 6.1.1 Status Code and Reason Phrase ............................39 - 6.2 Response Header Fields ......................................41 - - - -Fielding, et al. 
Standards Track [Page 2] - -RFC 2616 HTTP/1.1 June 1999 - - - 7 Entity ........................................................42 - 7.1 Entity Header Fields ........................................42 - 7.2 Entity Body .................................................43 - 7.2.1 Type .....................................................43 - 7.2.2 Entity Length ............................................43 - 8 Connections ...................................................44 - 8.1 Persistent Connections ......................................44 - 8.1.1 Purpose ..................................................44 - 8.1.2 Overall Operation ........................................45 - 8.1.3 Proxy Servers ............................................46 - 8.1.4 Practical Considerations .................................46 - 8.2 Message Transmission Requirements ...........................47 - 8.2.1 Persistent Connections and Flow Control ..................47 - 8.2.2 Monitoring Connections for Error Status Messages .........48 - 8.2.3 Use of the 100 (Continue) Status .........................48 - 8.2.4 Client Behavior if Server Prematurely Closes Connection ..50 - 9 Method Definitions ............................................51 - 9.1 Safe and Idempotent Methods .................................51 - 9.1.1 Safe Methods .............................................51 - 9.1.2 Idempotent Methods .......................................51 - 9.2 OPTIONS .....................................................52 - 9.3 GET .........................................................53 - 9.4 HEAD ........................................................54 - 9.5 POST ........................................................54 - 9.6 PUT .........................................................55 - 9.7 DELETE ......................................................56 - 9.8 TRACE .......................................................56 - 9.9 CONNECT 
.....................................................57 - 10 Status Code Definitions ......................................57 - 10.1 Informational 1xx ...........................................57 - 10.1.1 100 Continue .............................................58 - 10.1.2 101 Switching Protocols ..................................58 - 10.2 Successful 2xx ..............................................58 - 10.2.1 200 OK ...................................................58 - 10.2.2 201 Created ..............................................59 - 10.2.3 202 Accepted .............................................59 - 10.2.4 203 Non-Authoritative Information ........................59 - 10.2.5 204 No Content ...........................................60 - 10.2.6 205 Reset Content ........................................60 - 10.2.7 206 Partial Content ......................................60 - 10.3 Redirection 3xx .............................................61 - 10.3.1 300 Multiple Choices .....................................61 - 10.3.2 301 Moved Permanently ....................................62 - 10.3.3 302 Found ................................................62 - 10.3.4 303 See Other ............................................63 - 10.3.5 304 Not Modified .........................................63 - 10.3.6 305 Use Proxy ............................................64 - 10.3.7 306 (Unused) .............................................64 - - - -Fielding, et al. 
Standards Track [Page 3] - -RFC 2616 HTTP/1.1 June 1999 - - - 10.3.8 307 Temporary Redirect ...................................65 - 10.4 Client Error 4xx ............................................65 - 10.4.1 400 Bad Request .........................................65 - 10.4.2 401 Unauthorized ........................................66 - 10.4.3 402 Payment Required ....................................66 - 10.4.4 403 Forbidden ...........................................66 - 10.4.5 404 Not Found ...........................................66 - 10.4.6 405 Method Not Allowed ..................................66 - 10.4.7 406 Not Acceptable ......................................67 - 10.4.8 407 Proxy Authentication Required .......................67 - 10.4.9 408 Request Timeout .....................................67 - 10.4.10 409 Conflict ............................................67 - 10.4.11 410 Gone ................................................68 - 10.4.12 411 Length Required .....................................68 - 10.4.13 412 Precondition Failed .................................68 - 10.4.14 413 Request Entity Too Large ............................69 - 10.4.15 414 Request-URI Too Long ................................69 - 10.4.16 415 Unsupported Media Type ..............................69 - 10.4.17 416 Requested Range Not Satisfiable .....................69 - 10.4.18 417 Expectation Failed ..................................70 - 10.5 Server Error 5xx ............................................70 - 10.5.1 500 Internal Server Error ................................70 - 10.5.2 501 Not Implemented ......................................70 - 10.5.3 502 Bad Gateway ..........................................70 - 10.5.4 503 Service Unavailable ..................................70 - 10.5.5 504 Gateway Timeout ......................................71 - 10.5.6 505 HTTP Version Not Supported ...........................71 - 11 Access Authentication 
........................................71 - 12 Content Negotiation ..........................................71 - 12.1 Server-driven Negotiation ...................................72 - 12.2 Agent-driven Negotiation ....................................73 - 12.3 Transparent Negotiation .....................................74 - 13 Caching in HTTP ..............................................74 - 13.1.1 Cache Correctness ........................................75 - 13.1.2 Warnings .................................................76 - 13.1.3 Cache-control Mechanisms .................................77 - 13.1.4 Explicit User Agent Warnings .............................78 - 13.1.5 Exceptions to the Rules and Warnings .....................78 - 13.1.6 Client-controlled Behavior ...............................79 - 13.2 Expiration Model ............................................79 - 13.2.1 Server-Specified Expiration ..............................79 - 13.2.2 Heuristic Expiration .....................................80 - 13.2.3 Age Calculations .........................................80 - 13.2.4 Expiration Calculations ..................................83 - 13.2.5 Disambiguating Expiration Values .........................84 - 13.2.6 Disambiguating Multiple Responses ........................84 - 13.3 Validation Model ............................................85 - 13.3.1 Last-Modified Dates ......................................86 - - - -Fielding, et al. 
Standards Track [Page 4] - -RFC 2616 HTTP/1.1 June 1999 - - - 13.3.2 Entity Tag Cache Validators ..............................86 - 13.3.3 Weak and Strong Validators ...............................86 - 13.3.4 Rules for When to Use Entity Tags and Last-Modified Dates.89 - 13.3.5 Non-validating Conditionals ..............................90 - 13.4 Response Cacheability .......................................91 - 13.5 Constructing Responses From Caches ..........................92 - 13.5.1 End-to-end and Hop-by-hop Headers ........................92 - 13.5.2 Non-modifiable Headers ...................................92 - 13.5.3 Combining Headers ........................................94 - 13.5.4 Combining Byte Ranges ....................................95 - 13.6 Caching Negotiated Responses ................................95 - 13.7 Shared and Non-Shared Caches ................................96 - 13.8 Errors or Incomplete Response Cache Behavior ................97 - 13.9 Side Effects of GET and HEAD ................................97 - 13.10 Invalidation After Updates or Deletions ...................97 - 13.11 Write-Through Mandatory ...................................98 - 13.12 Cache Replacement .........................................99 - 13.13 History Lists .............................................99 - 14 Header Field Definitions ....................................100 - 14.1 Accept .....................................................100 - 14.2 Accept-Charset .............................................102 - 14.3 Accept-Encoding ............................................102 - 14.4 Accept-Language ............................................104 - 14.5 Accept-Ranges ..............................................105 - 14.6 Age ........................................................106 - 14.7 Allow ......................................................106 - 14.8 Authorization ..............................................107 - 14.9 Cache-Control 
..............................................108 - 14.9.1 What is Cacheable .......................................109 - 14.9.2 What May be Stored by Caches ............................110 - 14.9.3 Modifications of the Basic Expiration Mechanism .........111 - 14.9.4 Cache Revalidation and Reload Controls ..................113 - 14.9.5 No-Transform Directive ..................................115 - 14.9.6 Cache Control Extensions ................................116 - 14.10 Connection ...............................................117 - 14.11 Content-Encoding .........................................118 - 14.12 Content-Language .........................................118 - 14.13 Content-Length ...........................................119 - 14.14 Content-Location .........................................120 - 14.15 Content-MD5 ..............................................121 - 14.16 Content-Range ............................................122 - 14.17 Content-Type .............................................124 - 14.18 Date .....................................................124 - 14.18.1 Clockless Origin Server Operation ......................125 - 14.19 ETag .....................................................126 - 14.20 Expect ...................................................126 - 14.21 Expires ..................................................127 - 14.22 From .....................................................128 - - - -Fielding, et al. 
Standards Track [Page 5] - -RFC 2616 HTTP/1.1 June 1999 - - - 14.23 Host .....................................................128 - 14.24 If-Match .................................................129 - 14.25 If-Modified-Since ........................................130 - 14.26 If-None-Match ............................................132 - 14.27 If-Range .................................................133 - 14.28 If-Unmodified-Since ......................................134 - 14.29 Last-Modified ............................................134 - 14.30 Location .................................................135 - 14.31 Max-Forwards .............................................136 - 14.32 Pragma ...................................................136 - 14.33 Proxy-Authenticate .......................................137 - 14.34 Proxy-Authorization ......................................137 - 14.35 Range ....................................................138 - 14.35.1 Byte Ranges ...........................................138 - 14.35.2 Range Retrieval Requests ..............................139 - 14.36 Referer ..................................................140 - 14.37 Retry-After ..............................................141 - 14.38 Server ...................................................141 - 14.39 TE .......................................................142 - 14.40 Trailer ..................................................143 - 14.41 Transfer-Encoding..........................................143 - 14.42 Upgrade ..................................................144 - 14.43 User-Agent ...............................................145 - 14.44 Vary .....................................................145 - 14.45 Via ......................................................146 - 14.46 Warning ..................................................148 - 14.47 WWW-Authenticate .........................................150 - 15 Security Considerations 
.......................................150 - 15.1 Personal Information....................................151 - 15.1.1 Abuse of Server Log Information .........................151 - 15.1.2 Transfer of Sensitive Information .......................151 - 15.1.3 Encoding Sensitive Information in URI's .................152 - 15.1.4 Privacy Issues Connected to Accept Headers ..............152 - 15.2 Attacks Based On File and Path Names .......................153 - 15.3 DNS Spoofing ...............................................154 - 15.4 Location Headers and Spoofing ..............................154 - 15.5 Content-Disposition Issues .................................154 - 15.6 Authentication Credentials and Idle Clients ................155 - 15.7 Proxies and Caching ........................................155 - 15.7.1 Denial of Service Attacks on Proxies....................156 - 16 Acknowledgments .............................................156 - 17 References ..................................................158 - 18 Authors' Addresses ..........................................162 - 19 Appendices ..................................................164 - 19.1 Internet Media Type message/http and application/http ......164 - 19.2 Internet Media Type multipart/byteranges ...................165 - 19.3 Tolerant Applications ......................................166 - 19.4 Differences Between HTTP Entities and RFC 2045 Entities ....167 - - - -Fielding, et al. 
Standards Track [Page 6] - -RFC 2616 HTTP/1.1 June 1999 - - - 19.4.1 MIME-Version ............................................167 - 19.4.2 Conversion to Canonical Form ............................167 - 19.4.3 Conversion of Date Formats ..............................168 - 19.4.4 Introduction of Content-Encoding ........................168 - 19.4.5 No Content-Transfer-Encoding ............................168 - 19.4.6 Introduction of Transfer-Encoding .......................169 - 19.4.7 MHTML and Line Length Limitations .......................169 - 19.5 Additional Features ........................................169 - 19.5.1 Content-Disposition .....................................170 - 19.6 Compatibility with Previous Versions .......................170 - 19.6.1 Changes from HTTP/1.0 ...................................171 - 19.6.2 Compatibility with HTTP/1.0 Persistent Connections ......172 - 19.6.3 Changes from RFC 2068 ...................................172 - 20 Index .......................................................175 - 21 Full Copyright Statement ....................................176 - -1 Introduction - -1.1 Purpose - - The Hypertext Transfer Protocol (HTTP) is an application-level - protocol for distributed, collaborative, hypermedia information - systems. HTTP has been in use by the World-Wide Web global - information initiative since 1990. The first version of HTTP, - referred to as HTTP/0.9, was a simple protocol for raw data transfer - across the Internet. HTTP/1.0, as defined by RFC 1945 [6], improved - the protocol by allowing messages to be in the format of MIME-like - messages, containing metainformation about the data transferred and - modifiers on the request/response semantics. However, HTTP/1.0 does - not sufficiently take into consideration the effects of hierarchical - proxies, caching, the need for persistent connections, or virtual - hosts. 
In addition, the proliferation of incompletely-implemented - applications calling themselves "HTTP/1.0" has necessitated a - protocol version change in order for two communicating applications - to determine each other's true capabilities. - - This specification defines the protocol referred to as "HTTP/1.1". - This protocol includes more stringent requirements than HTTP/1.0 in - order to ensure reliable implementation of its features. - - Practical information systems require more functionality than simple - retrieval, including search, front-end update, and annotation. HTTP - allows an open-ended set of methods and headers that indicate the - purpose of a request [47]. It builds on the discipline of reference - provided by the Uniform Resource Identifier (URI) [3], as a location - (URL) [4] or name (URN) [20], for indicating the resource to which a - - - - - -Fielding, et al. Standards Track [Page 7] - -RFC 2616 HTTP/1.1 June 1999 - - - method is to be applied. Messages are passed in a format similar to - that used by Internet mail [9] as defined by the Multipurpose - Internet Mail Extensions (MIME) [7]. - - HTTP is also used as a generic protocol for communication between - user agents and proxies/gateways to other Internet systems, including - those supported by the SMTP [16], NNTP [13], FTP [18], Gopher [2], - and WAIS [10] protocols. In this way, HTTP allows basic hypermedia - access to resources available from diverse applications. - -1.2 Requirements - - The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", - "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this - document are to be interpreted as described in RFC 2119 [34]. - - An implementation is not compliant if it fails to satisfy one or more - of the MUST or REQUIRED level requirements for the protocols it - implements. 
An implementation that satisfies all the MUST or REQUIRED - level and all the SHOULD level requirements for its protocols is said - to be "unconditionally compliant"; one that satisfies all the MUST - level requirements but not all the SHOULD level requirements for its - protocols is said to be "conditionally compliant." - -1.3 Terminology - - This specification uses a number of terms to refer to the roles - played by participants in, and objects of, the HTTP communication. - - connection - A transport layer virtual circuit established between two programs - for the purpose of communication. - - message - The basic unit of HTTP communication, consisting of a structured - sequence of octets matching the syntax defined in section 4 and - transmitted via the connection. - - request - An HTTP request message, as defined in section 5. - - response - An HTTP response message, as defined in section 6. - - - - - - - - -Fielding, et al. Standards Track [Page 8] - -RFC 2616 HTTP/1.1 June 1999 - - - resource - A network data object or service that can be identified by a URI, - as defined in section 3.2. Resources may be available in multiple - representations (e.g. multiple languages, data formats, size, and - resolutions) or vary in other ways. - - entity - The information transferred as the payload of a request or - response. An entity consists of metainformation in the form of - entity-header fields and content in the form of an entity-body, as - described in section 7. - - representation - An entity included with a response that is subject to content - negotiation, as described in section 12. There may exist multiple - representations associated with a particular response status. - - content negotiation - The mechanism for selecting the appropriate representation when - servicing a request, as described in section 12. The - representation of entities in any response can be negotiated - (including error responses). 
- - variant - A resource may have one, or more than one, representation(s) - associated with it at any given instant. Each of these - representations is termed a `variant'. Use of the term `variant' - does not necessarily imply that the resource is subject to content - negotiation. - - client - A program that establishes connections for the purpose of sending - requests. - - user agent - The client which initiates a request. These are often browsers, - editors, spiders (web-traversing robots), or other end user tools. - - server - An application program that accepts connections in order to - service requests by sending back responses. Any given program may - be capable of being both a client and a server; our use of these - terms refers only to the role being performed by the program for a - particular connection, rather than to the program's capabilities - in general. Likewise, any server may act as an origin server, - proxy, gateway, or tunnel, switching behavior based on the nature - of each request. - - - - -Fielding, et al. Standards Track [Page 9] - -RFC 2616 HTTP/1.1 June 1999 - - - origin server - The server on which a given resource resides or is to be created. - - proxy - An intermediary program which acts as both a server and a client - for the purpose of making requests on behalf of other clients. - Requests are serviced internally or by passing them on, with - possible translation, to other servers. A proxy MUST implement - both the client and server requirements of this specification. A - "transparent proxy" is a proxy that does not modify the request or - response beyond what is required for proxy authentication and - identification. A "non-transparent proxy" is a proxy that modifies - the request or response in order to provide some added service to - the user agent, such as group annotation services, media type - transformation, protocol reduction, or anonymity filtering. 
Except - where either transparent or non-transparent behavior is explicitly - stated, the HTTP proxy requirements apply to both types of - proxies. - - gateway - A server which acts as an intermediary for some other server. - Unlike a proxy, a gateway receives requests as if it were the - origin server for the requested resource; the requesting client - may not be aware that it is communicating with a gateway. - - tunnel - An intermediary program which is acting as a blind relay between - two connections. Once active, a tunnel is not considered a party - to the HTTP communication, though the tunnel may have been - initiated by an HTTP request. The tunnel ceases to exist when both - ends of the relayed connections are closed. - - cache - A program's local store of response messages and the subsystem - that controls its message storage, retrieval, and deletion. A - cache stores cacheable responses in order to reduce the response - time and network bandwidth consumption on future, equivalent - requests. Any client or server may include a cache, though a cache - cannot be used by a server that is acting as a tunnel. - - cacheable - A response is cacheable if a cache is allowed to store a copy of - the response message for use in answering subsequent requests. The - rules for determining the cacheability of HTTP responses are - defined in section 13. Even if a resource is cacheable, there may - be additional constraints on whether a cache can use the cached - copy for a particular request. - - - - -Fielding, et al. Standards Track [Page 10] - -RFC 2616 HTTP/1.1 June 1999 - - - first-hand - A response is first-hand if it comes directly and without - unnecessary delay from the origin server, perhaps via one or more - proxies. A response is also first-hand if its validity has just - been checked directly with the origin server. 
- - explicit expiration time - The time at which the origin server intends that an entity should - no longer be returned by a cache without further validation. - - heuristic expiration time - An expiration time assigned by a cache when no explicit expiration - time is available. - - age - The age of a response is the time since it was sent by, or - successfully validated with, the origin server. - - freshness lifetime - The length of time between the generation of a response and its - expiration time. - - fresh - A response is fresh if its age has not yet exceeded its freshness - lifetime. - - stale - A response is stale if its age has passed its freshness lifetime. - - semantically transparent - A cache behaves in a "semantically transparent" manner, with - respect to a particular response, when its use affects neither the - requesting client nor the origin server, except to improve - performance. When a cache is semantically transparent, the client - receives exactly the same response (except for hop-by-hop headers) - that it would have received had its request been handled directly - by the origin server. - - validator - A protocol element (e.g., an entity tag or a Last-Modified time) - that is used to find out whether a cache entry is an equivalent - copy of an entity. - - upstream/downstream - Upstream and downstream describe the flow of a message: all - messages flow from upstream to downstream. - - - - - -Fielding, et al. Standards Track [Page 11] - -RFC 2616 HTTP/1.1 June 1999 - - - inbound/outbound - Inbound and outbound refer to the request and response paths for - messages: "inbound" means "traveling toward the origin server", - and "outbound" means "traveling toward the user agent" - -1.4 Overall Operation - - The HTTP protocol is a request/response protocol. 
A client sends a - request to the server in the form of a request method, URI, and - protocol version, followed by a MIME-like message containing request - modifiers, client information, and possible body content over a - connection with a server. The server responds with a status line, - including the message's protocol version and a success or error code, - followed by a MIME-like message containing server information, entity - metainformation, and possible entity-body content. The relationship - between HTTP and MIME is described in appendix 19.4. - - Most HTTP communication is initiated by a user agent and consists of - a request to be applied to a resource on some origin server. In the - simplest case, this may be accomplished via a single connection (v) - between the user agent (UA) and the origin server (O). - - request chain ------------------------> - UA -------------------v------------------- O - <----------------------- response chain - - A more complicated situation occurs when one or more intermediaries - are present in the request/response chain. There are three common - forms of intermediary: proxy, gateway, and tunnel. A proxy is a - forwarding agent, receiving requests for a URI in its absolute form, - rewriting all or part of the message, and forwarding the reformatted - request toward the server identified by the URI. A gateway is a - receiving agent, acting as a layer above some other server(s) and, if - necessary, translating the requests to the underlying server's - protocol. A tunnel acts as a relay point between two connections - without changing the messages; tunnels are used when the - communication needs to pass through an intermediary (such as a - firewall) even when the intermediary cannot understand the contents - of the messages. 
- - request chain --------------------------------------> - UA -----v----- A -----v----- B -----v----- C -----v----- O - <------------------------------------- response chain - - The figure above shows three intermediaries (A, B, and C) between the - user agent and origin server. A request or response message that - travels the whole chain will pass through four separate connections. - This distinction is important because some HTTP communication options - - - -Fielding, et al. Standards Track [Page 12] - -RFC 2616 HTTP/1.1 June 1999 - - - may apply only to the connection with the nearest, non-tunnel - neighbor, only to the end-points of the chain, or to all connections - along the chain. Although the diagram is linear, each participant may - be engaged in multiple, simultaneous communications. For example, B - may be receiving requests from many clients other than A, and/or - forwarding requests to servers other than C, at the same time that it - is handling A's request. - - Any party to the communication which is not acting as a tunnel may - employ an internal cache for handling requests. The effect of a cache - is that the request/response chain is shortened if one of the - participants along the chain has a cached response applicable to that - request. The following illustrates the resulting chain if B has a - cached copy of an earlier response from O (via C) for a request which - has not been cached by UA or A. - - request chain ----------> - UA -----v----- A -----v----- B - - - - - - C - - - - - - O - <--------- response chain - - Not all responses are usefully cacheable, and some requests may - contain modifiers which place special requirements on cache behavior. - HTTP requirements for cache behavior and cacheable responses are - defined in section 13. - - In fact, there are a wide variety of architectures and configurations - of caches and proxies currently being experimented with or deployed - across the World Wide Web. 
These systems include national hierarchies - of proxy caches to save transoceanic bandwidth, systems that - broadcast or multicast cache entries, organizations that distribute - subsets of cached data via CD-ROM, and so on. HTTP systems are used - in corporate intranets over high-bandwidth links, and for access via - PDAs with low-power radio links and intermittent connectivity. The - goal of HTTP/1.1 is to support the wide diversity of configurations - already deployed while introducing protocol constructs that meet the - needs of those who build web applications that require high - reliability and, failing that, at least reliable indications of - failure. - - HTTP communication usually takes place over TCP/IP connections. The - default port is TCP 80 [19], but other ports can be used. This does - not preclude HTTP from being implemented on top of any other protocol - on the Internet, or on other networks. HTTP only presumes a reliable - transport; any protocol that provides such guarantees can be used; - the mapping of the HTTP/1.1 request and response structures onto the - transport data units of the protocol in question is outside the scope - of this specification. - - - - -Fielding, et al. Standards Track [Page 13] - -RFC 2616 HTTP/1.1 June 1999 - - - In HTTP/1.0, most implementations used a new connection for each - request/response exchange. In HTTP/1.1, a connection may be used for - one or more request/response exchanges, although connections may be - closed for a variety of reasons (see section 8.1). - -2 Notational Conventions and Generic Grammar - -2.1 Augmented BNF - - All of the mechanisms specified in this document are described in - both prose and an augmented Backus-Naur Form (BNF) similar to that - used by RFC 822 [9]. Implementors will need to be familiar with the - notation in order to understand this specification. 
The augmented BNF - includes the following constructs: - - name = definition - The name of a rule is simply the name itself (without any - enclosing "<" and ">") and is separated from its definition by the - equal "=" character. White space is only significant in that - indentation of continuation lines is used to indicate a rule - definition that spans more than one line. Certain basic rules are - in uppercase, such as SP, LWS, HT, CRLF, DIGIT, ALPHA, etc. Angle - brackets are used within definitions whenever their presence will - facilitate discerning the use of rule names. - - "literal" - Quotation marks surround literal text. Unless stated otherwise, - the text is case-insensitive. - - rule1 | rule2 - Elements separated by a bar ("|") are alternatives, e.g., "yes | - no" will accept yes or no. - - (rule1 rule2) - Elements enclosed in parentheses are treated as a single element. - Thus, "(elem (foo | bar) elem)" allows the token sequences "elem - foo elem" and "elem bar elem". - - *rule - The character "*" preceding an element indicates repetition. The - full form is "<n>*<m>element" indicating at least <n> and at most - <m> occurrences of element. Default values are 0 and infinity so - that "*(element)" allows any number, including zero; "1*element" - requires at least one; and "1*2element" allows one or two. - - [rule] - Square brackets enclose optional elements; "[foo bar]" is - equivalent to "*1(foo bar)". - - - -Fielding, et al. Standards Track [Page 14] - -RFC 2616 HTTP/1.1 June 1999 - - - N rule - Specific repetition: "<n>(element)" is equivalent to - "<n>*<n>(element)"; that is, exactly <n> occurrences of (element). - Thus 2DIGIT is a 2-digit number, and 3ALPHA is a string of three - alphabetic characters. - - #rule - A construct "#" is defined, similar to "*", for defining lists of - elements. 
The full form is "<n>#<m>element" indicating at least - <n> and at most <m> elements, each separated by one or more commas - (",") and OPTIONAL linear white space (LWS). This makes the usual - form of lists very easy; a rule such as - ( *LWS element *( *LWS "," *LWS element )) - can be shown as - 1#element - Wherever this construct is used, null elements are allowed, but do - not contribute to the count of elements present. That is, - "(element), , (element) " is permitted, but counts as only two - elements. Therefore, where at least one element is required, at - least one non-null element MUST be present. Default values are 0 - and infinity so that "#element" allows any number, including zero; - "1#element" requires at least one; and "1#2element" allows one or - two. - - ; comment - A semi-colon, set off some distance to the right of rule text, - starts a comment that continues to the end of line. This is a - simple way of including useful notes in parallel with the - specifications. - - implied *LWS - The grammar described by this specification is word-based. Except - where noted otherwise, linear white space (LWS) can be included - between any two adjacent words (token or quoted-string), and - between adjacent words and separators, without changing the - interpretation of a field. At least one delimiter (LWS and/or - - separators) MUST exist between any two tokens (for the definition - of "token" below), since they would otherwise be interpreted as a - single token. - -2.2 Basic Rules - - The following rules are used throughout this specification to - describe basic parsing constructs. The US-ASCII coded character set - is defined by ANSI X3.4-1986 [21]. - - - - - -Fielding, et al. 
Standards Track [Page 15] - -RFC 2616 HTTP/1.1 June 1999 - - - OCTET = <any 8-bit sequence of data> - CHAR = <any US-ASCII character (octets 0 - 127)> - UPALPHA = <any US-ASCII uppercase letter "A".."Z"> - LOALPHA = <any US-ASCII lowercase letter "a".."z"> - ALPHA = UPALPHA | LOALPHA - DIGIT = <any US-ASCII digit "0".."9"> - CTL = <any US-ASCII control character - (octets 0 - 31) and DEL (127)> - CR = <US-ASCII CR, carriage return (13)> - LF = <US-ASCII LF, linefeed (10)> - SP = <US-ASCII SP, space (32)> - HT = <US-ASCII HT, horizontal-tab (9)> - <"> = <US-ASCII double-quote mark (34)> - - HTTP/1.1 defines the sequence CR LF as the end-of-line marker for all - protocol elements except the entity-body (see appendix 19.3 for - tolerant applications). The end-of-line marker within an entity-body - is defined by its associated media type, as described in section 3.7. - - CRLF = CR LF - - HTTP/1.1 header field values can be folded onto multiple lines if the - continuation line begins with a space or horizontal tab. All linear - white space, including folding, has the same semantics as SP. A - recipient MAY replace any linear white space with a single SP before - interpreting the field value or forwarding the message downstream. - - LWS = [CRLF] 1*( SP | HT ) - - The TEXT rule is only used for descriptive field contents and values - that are not intended to be interpreted by the message parser. Words - of *TEXT MAY contain characters from character sets other than ISO- - 8859-1 [22] only when encoded according to the rules of RFC 2047 - [14]. - - TEXT = <any OCTET except CTLs, - but including LWS> - - A CRLF is allowed in the definition of TEXT only as part of a header - field continuation. It is expected that the folding LWS will be - replaced with a single SP before interpretation of the TEXT value. - - Hexadecimal numeric characters are used in several protocol elements. 
- - HEX = "A" | "B" | "C" | "D" | "E" | "F" - | "a" | "b" | "c" | "d" | "e" | "f" | DIGIT - - - - - -Fielding, et al. Standards Track [Page 16] - -RFC 2616 HTTP/1.1 June 1999 - - - Many HTTP/1.1 header field values consist of words separated by LWS - or special characters. These special characters MUST be in a quoted - string to be used within a parameter value (as defined in section - 3.6). - - token = 1*<any CHAR except CTLs or separators> - separators = "(" | ")" | "<" | ">" | "@" - | "," | ";" | ":" | "\" | <"> - | "/" | "[" | "]" | "?" | "=" - | "{" | "}" | SP | HT - - Comments can be included in some HTTP header fields by surrounding - the comment text with parentheses. Comments are only allowed in - fields containing "comment" as part of their field value definition. - In all other fields, parentheses are considered part of the field - value. - - comment = "(" *( ctext | quoted-pair | comment ) ")" - ctext = <any TEXT excluding "(" and ")"> - - A string of text is parsed as a single word if it is quoted using - double-quote marks. - - quoted-string = ( <"> *(qdtext | quoted-pair ) <"> ) - qdtext = <any TEXT except <">> - - The backslash character ("\") MAY be used as a single-character - quoting mechanism only within quoted-string and comment constructs. - - quoted-pair = "\" CHAR - -3 Protocol Parameters - -3.1 HTTP Version - - HTTP uses a "<major>.<minor>" numbering scheme to indicate versions - of the protocol. The protocol versioning policy is intended to allow - the sender to indicate the format of a message and its capacity for - understanding further HTTP communication, rather than the features - obtained via that communication. No change is made to the version - number for the addition of message components which do not affect - communication behavior or which only add to extensible field values. 
- The <minor> number is incremented when the changes made to the - protocol add features which do not change the general message parsing - algorithm, but which may add to the message semantics and imply - additional capabilities of the sender. The <major> number is - incremented when the format of a message within the protocol is - changed. See RFC 2145 [36] for a fuller explanation. - - - -Fielding, et al. Standards Track [Page 17] - -RFC 2616 HTTP/1.1 June 1999 - - - The version of an HTTP message is indicated by an HTTP-Version field - in the first line of the message. [[HTTP-Version is case-sensitive.]] - - HTTP-Version = "HTTP" "/" 1*DIGIT "." 1*DIGIT - - Note that the major and minor numbers MUST be treated as separate - integers and that each MAY be incremented higher than a single digit. - Thus, HTTP/2.4 is a lower version than HTTP/2.13, which in turn is - lower than HTTP/12.3. Leading zeros MUST be ignored by recipients and - MUST NOT be sent. - - An application that sends a request or response message that includes - HTTP-Version of "HTTP/1.1" MUST be at least conditionally compliant - with this specification. Applications that are at least conditionally - compliant with this specification SHOULD use an HTTP-Version of - "HTTP/1.1" in their messages, and MUST do so for any message that is - not compatible with HTTP/1.0. For more details on when to send - specific HTTP-Version values, see RFC 2145 [36]. - - The HTTP version of an application is the highest HTTP version for - which the application is at least conditionally compliant. - - Proxy and gateway applications need to be careful when forwarding - messages in protocol versions different from that of the application. - Since the protocol version indicates the protocol capability of the - sender, a proxy/gateway MUST NOT send a message with a version - indicator which is greater than its actual version. 
If a higher - version request is received, the proxy/gateway MUST either downgrade - the request version, or respond with an error, or switch to tunnel - behavior. - - Due to interoperability problems with HTTP/1.0 proxies discovered - since the publication of RFC 2068[33], caching proxies MUST, gateways - MAY, and tunnels MUST NOT upgrade the request to the highest version - they support. The proxy/gateway's response to that request MUST be in - the same major version as the request. - - Note: Converting between versions of HTTP may involve modification - of header fields required or forbidden by the versions involved. - -3.2 Uniform Resource Identifiers - - URIs have been known by many names: WWW addresses, Universal Document - Identifiers, Universal Resource Identifiers [3], and finally the - combination of Uniform Resource Locators (URL) [4] and Names (URN) - [20]. As far as HTTP is concerned, Uniform Resource Identifiers are - simply formatted strings which identify--via name, location, or any - other characteristic--a resource. - - - -Fielding, et al. Standards Track [Page 18] - -RFC 2616 HTTP/1.1 June 1999 - - -3.2.1 General Syntax - - URIs in HTTP can be represented in absolute form or relative to some - known base URI [11], depending upon the context of their use. The two - forms are differentiated by the fact that absolute URIs always begin - with a scheme name followed by a colon. For definitive information on - URL syntax and semantics, see "Uniform Resource Identifiers (URI): - Generic Syntax and Semantics," RFC 2396 [42] (which replaces RFCs - 1738 [4] and RFC 1808 [11]). This specification adopts the - definitions of "URI-reference", "absoluteURI", "relativeURI", "port", - "host","abs_path", "rel_path", and "authority" from that - specification. - - The HTTP protocol does not place any a priori limit on the length of - a URI. 
Servers MUST be able to handle the URI of any resource they - serve, and SHOULD be able to handle URIs of unbounded length if they - provide GET-based forms that could generate such URIs. A server - SHOULD return 414 (Request-URI Too Long) status if a URI is longer - than the server can handle (see section 10.4.15). - - Note: Servers ought to be cautious about depending on URI lengths - above 255 bytes, because some older client or proxy - implementations might not properly support these lengths. - -3.2.2 http URL - - The "http" scheme is used to locate network resources via the HTTP - protocol. This section defines the scheme-specific syntax and - semantics for http URLs. - - http_URL = "http:" "//" host [ ":" port ] [ abs_path [ "?" query ]] - - If the port is empty or not given, port 80 is assumed. The semantics - are that the identified resource is located at the server listening - for TCP connections on that port of that host, and the Request-URI - for the resource is abs_path (section 5.1.2). The use of IP addresses - in URLs SHOULD be avoided whenever possible (see RFC 1900 [24]). If - the abs_path is not present in the URL, it MUST be given as "/" when - used as a Request-URI for a resource (section 5.1.2). If a proxy - receives a host name which is not a fully qualified domain name, it - MAY add its domain to the host name it received. If a proxy receives - a fully qualified domain name, the proxy MUST NOT change the host - name. - - - - - - - - -Fielding, et al. 
Standards Track [Page 19] - -RFC 2616 HTTP/1.1 June 1999 - - -3.2.3 URI Comparison - - When comparing two URIs to decide if they match or not, a client - SHOULD use a case-sensitive octet-by-octet comparison of the entire - URIs, with these exceptions: - - - A port that is empty or not given is equivalent to the default - port for that URI-reference; - - - Comparisons of host names MUST be case-insensitive; - - - Comparisons of scheme names MUST be case-insensitive; - - - An empty abs_path is equivalent to an abs_path of "/". - - Characters other than those in the "reserved" and "unsafe" sets (see - RFC 2396 [42]) are equivalent to their ""%" HEX HEX" encoding. - [[ Ignore reference to "unsafe" set. ]] - - For example, the following three URIs are equivalent: - - http://abc.com:80/~smith/home.html - http://ABC.com/%7Esmith/home.html - http://ABC.com:/%7esmith/home.html - -3.3 Date/Time Formats - -3.3.1 Full Date - - HTTP applications have historically allowed three different formats - for the representation of date/time stamps: - - Sun, 06 Nov 1994 08:49:37 GMT ; RFC 822, updated by RFC 1123 - Sunday, 06-Nov-94 08:49:37 GMT ; RFC 850, obsoleted by RFC 1036 - Sun Nov 6 08:49:37 1994 ; ANSI C's asctime() format - - The first format is preferred as an Internet standard and represents - a fixed-length subset of that defined by RFC 1123 [8] (an update to - RFC 822 [9]). The second format is in common use, but is based on the - obsolete RFC 850 [12] date format and lacks a four-digit year. - HTTP/1.1 clients and servers that parse the date value MUST accept - all three formats (for compatibility with HTTP/1.0), though they MUST - only generate the RFC 1123 format for representing HTTP-date values - in header fields. See section 19.3 for further information. 
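The acceptance rule above (parse all three formats, generate only the RFC 1123 form) can be sketched as follows. This is a minimal illustration, not part of the removed specification text: the names `parse_http_date` and `format_http_date` are hypothetical, and the sketch assumes the process runs in a C/English locale so that `%a`, `%A`, and `%b` match the English day and month names. `strptime` is also somewhat more lenient than the HTTP-date grammar (for example about letter case), so a strict implementation would need extra checks.

```python
from datetime import datetime, timezone

# The three historical layouts listed in section 3.3.1, as strptime patterns.
# Assumption: C/English locale, so day and month names are the English ones.
_HTTP_DATE_FORMATS = (
    "%a, %d %b %Y %H:%M:%S GMT",  # RFC 822, updated by RFC 1123 (preferred)
    "%A, %d-%b-%y %H:%M:%S GMT",  # RFC 850 (two-digit year)
    "%a %b %d %H:%M:%S %Y",       # ANSI C asctime(); GMT is assumed
)


def parse_http_date(value: str) -> datetime:
    """Try each accepted layout in turn and return an aware UTC datetime."""
    for fmt in _HTTP_DATE_FORMATS:
        try:
            parsed = datetime.strptime(value, fmt)
        except ValueError:
            continue  # not this layout, try the next one
        # HTTP treats GMT as exactly equal to UTC.
        return parsed.replace(tzinfo=timezone.utc)
    raise ValueError(f"not a recognized HTTP-date: {value!r}")


def format_http_date(moment: datetime) -> str:
    """Generate only the RFC 1123 form; expects a timezone-aware datetime."""
    return moment.astimezone(timezone.utc).strftime("%a, %d %b %Y %H:%M:%S GMT")
```

Trying the preferred format first keeps the common case cheap; the two legacy formats are only attempted when it fails.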
- - Note: Recipients of date values are encouraged to be robust in - accepting date values that may have been sent by non-HTTP - applications, as is sometimes the case when retrieving or posting - messages via proxies/gateways to SMTP or NNTP. - - - -Fielding, et al. Standards Track [Page 20] - -RFC 2616 HTTP/1.1 June 1999 - - - All HTTP date/time stamps MUST be represented in Greenwich Mean Time - (GMT), without exception. For the purposes of HTTP, GMT is exactly - equal to UTC (Coordinated Universal Time). This is indicated in the - first two formats by the inclusion of "GMT" as the three-letter - abbreviation for time zone, and MUST be assumed when reading the - asctime format. HTTP-date is case sensitive and MUST NOT include - additional LWS beyond that specifically included as SP in the - grammar. - - HTTP-date = rfc1123-date | rfc850-date | asctime-date - rfc1123-date = wkday "," SP date1 SP time SP "GMT" - rfc850-date = weekday "," SP date2 SP time SP "GMT" - asctime-date = wkday SP date3 SP time SP 4DIGIT - date1 = 2DIGIT SP month SP 4DIGIT - ; day month year (e.g., 02 Jun 1982) - date2 = 2DIGIT "-" month "-" 2DIGIT - ; day-month-year (e.g., 02-Jun-82) - date3 = month SP ( 2DIGIT | ( SP 1DIGIT )) - ; month day (e.g., Jun 2) - time = 2DIGIT ":" 2DIGIT ":" 2DIGIT - ; 00:00:00 - 23:59:59 - wkday = "Mon" | "Tue" | "Wed" - | "Thu" | "Fri" | "Sat" | "Sun" - weekday = "Monday" | "Tuesday" | "Wednesday" - | "Thursday" | "Friday" | "Saturday" | "Sunday" - month = "Jan" | "Feb" | "Mar" | "Apr" - | "May" | "Jun" | "Jul" | "Aug" - | "Sep" | "Oct" | "Nov" | "Dec" - - Note: HTTP requirements for the date/time stamp format apply only - to their usage within the protocol stream. Clients and servers are - not required to use these formats for user presentation, request - logging, etc. - -3.3.2 Delta Seconds - - Some HTTP header fields allow a time value to be specified as an - integer number of seconds, represented in decimal, after the time - that the message was received. 
- - delta-seconds = 1*DIGIT - -3.4 Character Sets - - HTTP uses the same definition of the term "character set" as that - described for MIME: - - - - - -Fielding, et al. Standards Track [Page 21] - -RFC 2616 HTTP/1.1 June 1999 - - - The term "character set" is used in this document to refer to a - method used with one or more tables to convert a sequence of octets - into a sequence of characters. Note that unconditional conversion in - the other direction is not required, in that not all characters may - be available in a given character set and a character set may provide - more than one sequence of octets to represent a particular character. - This definition is intended to allow various kinds of character - encoding, from simple single-table mappings such as US-ASCII to - complex table switching methods such as those that use ISO-2022's - techniques. However, the definition associated with a MIME character - set name MUST fully specify the mapping to be performed from octets - to characters. In particular, use of external profiling information - to determine the exact mapping is not permitted. - - Note: This use of the term "character set" is more commonly - referred to as a "character encoding." However, since HTTP and - MIME share the same registry, it is important that the terminology - also be shared. - - HTTP character sets are identified by case-insensitive tokens. The - complete set of tokens is defined by the IANA Character Set registry - [19]. - - charset = token - -[[ HTTP uses charset in two contexts: within an Accept-Charset request ]] -[[ header (in which the charset value is an unquoted token) and as the ]] -[[ value of a parameter in a Content-type header (within a request or ]] -[[ response), in which case the parameter value of the charset ]] -[[ parameter may be quoted. 
]] - - Although HTTP allows an arbitrary token to be used as a charset - value, any token that has a predefined value within the IANA - Character Set registry [19] MUST represent the character set defined - by that registry. Applications SHOULD limit their use of character - sets to those defined by the IANA registry. - - Implementors should be aware of IETF character set requirements [38] - [41]. - -3.4.1 Missing Charset - - Some HTTP/1.0 software has interpreted a Content-Type header without - charset parameter incorrectly to mean "recipient should guess." - Senders wishing to defeat this behavior MAY include a charset - parameter even when the charset is ISO-8859-1 and SHOULD do so when - it is known that it will not confuse the recipient. - - Unfortunately, some older HTTP/1.0 clients did not deal properly with - an explicit charset parameter. HTTP/1.1 recipients MUST respect the - charset label provided by the sender; and those user agents that have - a provision to "guess" a charset MUST use the charset from the - - - - - -Fielding, et al. Standards Track [Page 22] - -RFC 2616 HTTP/1.1 June 1999 - - - content-type field if they support that charset, rather than the - recipient's preference, when initially displaying a document. See - section 3.7.1. - -3.5 Content Codings - - Content coding values indicate an encoding transformation that has - been or can be applied to an entity. Content codings are primarily - used to allow a document to be compressed or otherwise usefully - transformed without losing the identity of its underlying media type - and without loss of information. Frequently, the entity is stored in - coded form, transmitted directly, and only decoded by the recipient. - - content-coding = token - - All content-coding values are case-insensitive. HTTP/1.1 uses - content-coding values in the Accept-Encoding (section 14.3) and - Content-Encoding (section 14.11) header fields. 
Although the value - describes the content-coding, what is more important is that it - indicates what decoding mechanism will be required to remove the - encoding. - - The Internet Assigned Numbers Authority (IANA) acts as a registry for - content-coding value tokens. Initially, the registry contains the - following tokens: - - gzip An encoding format produced by the file compression program - "gzip" (GNU zip) as described in RFC 1952 [25]. This format is a - Lempel-Ziv coding (LZ77) with a 32 bit CRC. - - compress - The encoding format produced by the common UNIX file compression - program "compress". This format is an adaptive Lempel-Ziv-Welch - coding (LZW). - - Use of program names for the identification of encoding formats - is not desirable and is discouraged for future encodings. Their - use here is representative of historical practice, not good - design. For compatibility with previous implementations of HTTP, - applications SHOULD consider "x-gzip" and "x-compress" to be - equivalent to "gzip" and "compress" respectively. - - deflate - The "zlib" format defined in RFC 1950 [31] in combination with - the "deflate" compression mechanism described in RFC 1951 [29]. - - - - - - -Fielding, et al. Standards Track [Page 23] - -RFC 2616 HTTP/1.1 June 1999 - - - identity - The default (identity) encoding; the use of no transformation - whatsoever. This content-coding is used only in the Accept- - Encoding header, and SHOULD NOT be used in the Content-Encoding - header. - - New content-coding value tokens SHOULD be registered; to allow - interoperability between clients and servers, specifications of the - content coding algorithms needed to implement a new value SHOULD be - publicly available and adequate for independent implementation, and - conform to the purpose of content coding defined in this section. 
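The two comparison rules in this section (content-coding values are case-insensitive, and "x-gzip" and "x-compress" are treated as equivalent to "gzip" and "compress") could be applied with a small normalization step. The sketch below is illustrative only; `normalize_content_coding` is a hypothetical helper name, not an API of any library.

```python
# Legacy aliases that section 3.5 says SHOULD be treated as equivalent.
_LEGACY_ALIASES = {"x-gzip": "gzip", "x-compress": "compress"}


def normalize_content_coding(token: str) -> str:
    """Lower-case a content-coding token and fold the legacy x- aliases."""
    lowered = token.strip().lower()
    return _LEGACY_ALIASES.get(lowered, lowered)
```

Normalizing once, before any lookup, means later code can compare coding names with plain string equality.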
- -3.6 Transfer Codings - - Transfer-coding values are used to indicate an encoding - transformation that has been, can be, or may need to be applied to an - entity-body in order to ensure "safe transport" through the network. - This differs from a content coding in that the transfer-coding is a - property of the message, not of the original entity. - - transfer-coding = "chunked" | transfer-extension - transfer-extension = token *( ";" parameter ) - - Parameters are in the form of attribute/value pairs. - - parameter = attribute "=" value - attribute = token - value = token | quoted-string - - All transfer-coding values are case-insensitive. HTTP/1.1 uses - transfer-coding values in the TE header field (section 14.39) and in - the Transfer-Encoding header field (section 14.41). - - Whenever a transfer-coding is applied to a message-body, the set of - transfer-codings MUST include "chunked", unless the message is - terminated by closing the connection. When the "chunked" transfer- - coding is used, it MUST be the last transfer-coding applied to the - message-body. The "chunked" transfer-coding MUST NOT be applied more - than once to a message-body. These rules allow the recipient to - determine the transfer-length of the message (section 4.4). - - Transfer-codings are analogous to the Content-Transfer-Encoding - values of MIME [7], which were designed to enable safe transport of - binary data over a 7-bit transport service. However, safe transport - has a different focus for an 8bit-clean transfer protocol. In HTTP, - the only unsafe characteristic of message-bodies is the difficulty in - determining the exact body length (section 7.2.2), or the desire to - encrypt data over a shared transport. - - - -Fielding, et al. Standards Track [Page 24] - -RFC 2616 HTTP/1.1 June 1999 - - - The Internet Assigned Numbers Authority (IANA) acts as a registry for - transfer-coding value tokens. 
Initially, the registry contains the - following tokens: "chunked" (section 3.6.1), "identity" (section - 3.6.2), "gzip" (section 3.5), "compress" (section 3.5), and "deflate" - (section 3.5). - - [[ Remove reference to "identity" token ]] - - New transfer-coding value tokens SHOULD be registered in the same way - as new content-coding value tokens (section 3.5). - - A server which receives an entity-body with a transfer-coding it does - not understand SHOULD return 501 (Unimplemented), and close the - connection. A server MUST NOT send transfer-codings to an HTTP/1.0 - client. - -3.6.1 Chunked Transfer Coding - - The chunked encoding modifies the body of a message in order to - transfer it as a series of chunks, each with its own size indicator, - followed by an OPTIONAL trailer containing entity-header fields. This - allows dynamically produced content to be transferred along with the - information necessary for the recipient to verify that it has - received the full message. - - Chunked-Body = *chunk - last-chunk - trailer - CRLF - - chunk = chunk-size [ chunk-extension ] CRLF - chunk-data CRLF - chunk-size = 1*HEX - last-chunk = 1*("0") [ chunk-extension ] CRLF - - chunk-extension= *( ";" chunk-ext-name [ "=" chunk-ext-val ] ) - chunk-ext-name = token - chunk-ext-val = token | quoted-string - chunk-data = chunk-size(OCTET) - trailer = *(entity-header CRLF) - - The chunk-size field is a string of hex digits indicating the size of - the chunk. The chunked encoding is ended by any chunk whose size is - zero, followed by the trailer, which is terminated by an empty line. - - [[ "the size of the chunk" means "the size of the chunk-data in ]] - [[ octets" ]] - - The trailer allows the sender to include additional HTTP header - fields at the end of the message. The Trailer header field can be - used to indicate which header fields are included in a trailer (see - section 14.40). - - - - -Fielding, et al. 
Standards Track [Page 25] - -RFC 2616 HTTP/1.1 June 1999 - - - A server using chunked transfer-coding in a response MUST NOT use the - trailer for any header fields unless at least one of the following is - true: - - a)the request included a TE header field that indicates "trailers" is - acceptable in the transfer-coding of the response, as described in - section 14.39; or, - - b)the server is the origin server for the response, the trailer - fields consist entirely of optional metadata, and the recipient - could use the message (in a manner acceptable to the origin server) - without receiving this metadata. In other words, the origin server - is willing to accept the possibility that the trailer fields might - be silently discarded along the path to the client. - - This requirement prevents an interoperability failure when the - message is being received by an HTTP/1.1 (or later) proxy and - forwarded to an HTTP/1.0 recipient. It avoids a situation where - compliance with the protocol would have necessitated a possibly - infinite buffer on the proxy. - - An example process for decoding a Chunked-Body is presented in - appendix 19.4.6. - - All HTTP/1.1 applications MUST be able to receive and decode the - "chunked" transfer-coding, and MUST ignore chunk-extension extensions - they do not understand. - -3.7 Media Types - - HTTP uses Internet Media Types [17] in the Content-Type (section - 14.17) and Accept (section 14.1) header fields in order to provide - open and extensible data typing and type negotiation. - - media-type = type "/" subtype *( ";" parameter ) - type = token - subtype = token - - Parameters MAY follow the type/subtype in the form of attribute/value - pairs (as defined in section 3.6). - - The type, subtype, and parameter attribute names are case- - insensitive. Parameter values might or might not be case-sensitive, - depending on the semantics of the parameter name. 
Linear white space - (LWS) MUST NOT be used between the type and subtype, nor between an - attribute and its value. The presence or absence of a parameter might - be significant to the processing of a media-type, depending on its - definition within the media type registry. - - - -Fielding, et al. Standards Track [Page 26] - -RFC 2616 HTTP/1.1 June 1999 - - - Note that some older HTTP applications do not recognize media type - parameters. When sending data to older HTTP applications, - implementations SHOULD only use media type parameters when they are - required by that type/subtype definition. - - Media-type values are registered with the Internet Assigned Number - Authority (IANA [19]). The media type registration process is - outlined in RFC 1590 [17]. Use of non-registered media types is - discouraged. - - [[ "RFC 1590" should be "RFC 2048" ]] - -3.7.1 Canonicalization and Text Defaults - - Internet media types are registered with a canonical form. An - entity-body transferred via HTTP messages MUST be represented in the - appropriate canonical form prior to its transmission except for - "text" types, as defined in the next paragraph. - - When in canonical form, media subtypes of the "text" type use CRLF as - the text line break. HTTP relaxes this requirement and allows the - transport of text media with plain CR or LF alone representing a line - break when it is done consistently for an entire entity-body. HTTP - applications MUST accept CRLF, bare CR, and bare LF as being - representative of a line break in text media received via HTTP. In - addition, if the text is represented in a character set that does not - use octets 13 and 10 for CR and LF respectively, as is the case for - some multi-byte character sets, HTTP allows the use of whatever octet - sequences are defined by that character set to represent the - equivalent of CR and LF for line breaks. 
This flexibility regarding - line breaks applies only to text media in the entity-body; a bare CR - or LF MUST NOT be substituted for CRLF within any of the HTTP control - structures (such as header fields and multipart boundaries). - - If an entity-body is encoded with a content-coding, the underlying - data MUST be in a form defined above prior to being encoded. - - The "charset" parameter is used with some media types to define the - character set (section 3.4) of the data. When no explicit charset - parameter is provided by the sender, media subtypes of the "text" - type are defined to have a default charset value of "ISO-8859-1" when - received via HTTP. Data in character sets other than "ISO-8859-1" or - its subsets MUST be labeled with an appropriate charset value. See - section 3.4.1 for compatibility problems. - -3.7.2 Multipart Types - - MIME provides for a number of "multipart" types -- encapsulations of - one or more entities within a single message-body. All multipart - types share a common syntax, as defined in section 5.1.1 of RFC 2046 - - - -Fielding, et al. Standards Track [Page 27] - -RFC 2616 HTTP/1.1 June 1999 - - - [40], and MUST include a boundary parameter as part of the media type - value. The message body is itself a protocol element and MUST - therefore use only CRLF to represent line breaks between body-parts. - Unlike in RFC 2046, the epilogue of any multipart message MUST be - empty; HTTP applications MUST NOT transmit the epilogue (even if the - original multipart contains an epilogue). These restrictions exist in - order to preserve the self-delimiting nature of a multipart message- - body, wherein the "end" of the message-body is indicated by the - ending multipart boundary. - - In general, HTTP treats a multipart message-body no differently than - any other media type: strictly as payload. 
The one exception is the - "multipart/byteranges" type (appendix 19.2) when it appears in a 206 - (Partial Content) response, which will be interpreted by some HTTP - caching mechanisms as described in sections 13.5.4 and 14.16. In all - other cases, an HTTP user agent SHOULD follow the same or similar - behavior as a MIME user agent would upon receipt of a multipart type. - The MIME header fields within each body-part of a multipart message- - body do not have any significance to HTTP beyond that defined by - their MIME semantics. - - In general, an HTTP user agent SHOULD follow the same or similar - behavior as a MIME user agent would upon receipt of a multipart type. - If an application receives an unrecognized multipart subtype, the - application MUST treat it as being equivalent to "multipart/mixed". - - Note: The "multipart/form-data" type has been specifically defined - for carrying form data suitable for processing via the POST - request method, as described in RFC 1867 [15]. - -3.8 Product Tokens - - Product tokens are used to allow communicating applications to - identify themselves by software name and version. Most fields using - product tokens also allow sub-products which form a significant part - of the application to be listed, separated by white space. By - convention, the products are listed in order of their significance - for identifying the application. - - product = token ["/" product-version] - product-version = token - - Examples: - - User-Agent: CERN-LineMode/2.15 libwww/2.17b3 - Server: Apache/0.8.4 - - - - - -Fielding, et al. Standards Track [Page 28] - -RFC 2616 HTTP/1.1 June 1999 - - - Product tokens SHOULD be short and to the point. They MUST NOT be - used for advertising or other non-essential information. 
Although any - token character MAY appear in a product-version, this token SHOULD - only be used for a version identifier (i.e., successive versions of - the same product SHOULD only differ in the product-version portion of - the product value). - -3.9 Quality Values - - HTTP content negotiation (section 12) uses short "floating point" - numbers to indicate the relative importance ("weight") of various - negotiable parameters. A weight is normalized to a real number in - the range 0 through 1, where 0 is the minimum and 1 the maximum - value. If a parameter has a quality value of 0, then content with - this parameter is `not acceptable' for the client. HTTP/1.1 - applications MUST NOT generate more than three digits after the - decimal point. User configuration of these values SHOULD also be - limited in this fashion. - - qvalue = ( "0" [ "." 0*3DIGIT ] ) - | ( "1" [ "." 0*3("0") ] ) - - "Quality values" is a misnomer, since these values merely represent - relative degradation in desired quality. - -3.10 Language Tags - - A language tag identifies a natural language spoken, written, or - otherwise conveyed by human beings for communication of information - to other human beings. Computer languages are explicitly excluded. - HTTP uses language tags within the Accept-Language and Content- - Language fields. - - The syntax and registry of HTTP language tags is the same as that - defined by RFC 1766 [1]. In summary, a language tag is composed of 1 - or more parts: A primary language tag and a possibly empty series of - subtags: - - language-tag = primary-tag *( "-" subtag ) - primary-tag = 1*8ALPHA - subtag = 1*8ALPHA - - [[ Updated by RFC 3066: subtags may now contain digits ]] - - White space is not allowed within the tag and all tags are case- - insensitive. The name space of language tags is administered by the - IANA. Example tags include: - - en, en-US, en-cockney, i-cherokee, x-pig-latin - - - - -Fielding, et al. 
Standards Track [Page 29] - -RFC 2616 HTTP/1.1 June 1999 - - - where any two-letter primary-tag is an ISO-639 language abbreviation - and any two-letter initial subtag is an ISO-3166 country code. (The - last three tags above are not registered tags; all but the last are - examples of tags which could be registered in future.) - -3.11 Entity Tags - - Entity tags are used for comparing two or more entities from the same - requested resource. HTTP/1.1 uses entity tags in the ETag (section - 14.19), If-Match (section 14.24), If-None-Match (section 14.26), and - If-Range (section 14.27) header fields. The definition of how they - are used and compared as cache validators is in section 13.3.3. An - entity tag consists of an opaque quoted string, possibly prefixed by - a weakness indicator. - - entity-tag = [ weak ] opaque-tag - weak = "W/" - opaque-tag = quoted-string - - A "strong entity tag" MAY be shared by two entities of a resource - only if they are equivalent by octet equality. - - A "weak entity tag," indicated by the "W/" prefix, MAY be shared by - two entities of a resource only if the entities are equivalent and - could be substituted for each other with no significant change in - semantics. A weak entity tag can only be used for weak comparison. - - An entity tag MUST be unique across all versions of all entities - associated with a particular resource. A given entity tag value MAY - be used for entities obtained by requests on different URIs. The use - of the same entity tag value in conjunction with entities obtained by - requests on different URIs does not imply the equivalence of those - entities. - -3.12 Range Units - - HTTP/1.1 allows a client to request that only part (a range of) the - response entity be included within the response. HTTP/1.1 uses range - units in the Range (section 14.35) and Content-Range (section 14.16) - header fields. An entity can be broken down into subranges according - to various structural units. 
- - range-unit = bytes-unit | other-range-unit - bytes-unit = "bytes" - other-range-unit = token - - The only range unit defined by HTTP/1.1 is "bytes". HTTP/1.1 - implementations MAY ignore ranges specified using other units. - - - -Fielding, et al. Standards Track [Page 30] - -RFC 2616 HTTP/1.1 June 1999 - - - HTTP/1.1 has been designed to allow implementations of applications - that do not depend on knowledge of ranges. - -4 HTTP Message - -4.1 Message Types - - HTTP messages consist of requests from client to server and responses - from server to client. - - HTTP-message = Request | Response ; HTTP/1.1 messages - - Request (section 5) and Response (section 6) messages use the generic - message format of RFC 822 [9] for transferring entities (the payload - of the message). Both types of message consist of a start-line, zero - or more header fields (also known as "headers"), an empty line (i.e., - a line with nothing preceding the CRLF) indicating the end of the - header fields, and possibly a message-body. - - generic-message = start-line - *(message-header CRLF) - CRLF - [ message-body ] - start-line = Request-Line | Status-Line - - In the interest of robustness, servers SHOULD ignore any empty - line(s) received where a Request-Line is expected. In other words, if - the server is reading the protocol stream at the beginning of a - message and receives a CRLF first, it should ignore the CRLF. - - Certain buggy HTTP/1.0 client implementations generate extra CRLF's - after a POST request. To restate what is explicitly forbidden by the - BNF, an HTTP/1.1 client MUST NOT preface or follow a request with an - extra CRLF. - -4.2 Message Headers - - HTTP header fields, which include general-header (section 4.5), - request-header (section 5.3), response-header (section 6.2), and - entity-header (section 7.1) fields, follow the same generic format as - that given in Section 3.1 of RFC 822 [9]. 
Each header field consists - of a name followed by a colon (":") and the field value. Field names - are case-insensitive. The field value MAY be preceded by any amount - of LWS, though a single SP is preferred. Header fields can be - extended over multiple lines by preceding each extra line with at - least one SP or HT. Applications ought to follow "common form", where - one is known or indicated, when generating HTTP constructs, since - there might exist some implementations that fail to accept anything - - - -Fielding, et al. Standards Track [Page 31] - -RFC 2616 HTTP/1.1 June 1999 - - - beyond the common forms. - - message-header = field-name ":" [ field-value ] - field-name = token - field-value = *( field-content | LWS ) - field-content = <the OCTETs making up the field-value - and consisting of either *TEXT or combinations - of token, separators, and quoted-string> - - The field-content does not include any leading or trailing LWS: - linear white space occurring before the first non-whitespace - character of the field-value or after the last non-whitespace - character of the field-value. Such leading or trailing LWS MAY be - removed without changing the semantics of the field value. Any LWS - that occurs between field-content MAY be replaced with a single SP - before interpreting the field value or forwarding the message - downstream. - - The order in which header fields with differing field names are - received is not significant. However, it is "good practice" to send - general-header fields first, followed by request-header or response- - header fields, and ending with the entity-header fields. - - Multiple message-header fields with the same field-name MAY be - present in a message if and only if the entire field-value for that - header field is defined as a comma-separated list [i.e., #(values)]. 
- It MUST be possible to combine the multiple header fields into one - "field-name: field-value" pair, without changing the semantics of the - message, by appending each subsequent field-value to the first, each - separated by a comma. The order in which header fields with the same - field-name are received is therefore significant to the - interpretation of the combined field value, and thus a proxy MUST NOT - change the order of these field values when a message is forwarded. - -4.3 Message Body - - The message-body (if any) of an HTTP message is used to carry the - entity-body associated with the request or response. The message-body - differs from the entity-body only when a transfer-coding has been - applied, as indicated by the Transfer-Encoding header field (section - 14.41). - - message-body = entity-body - | <entity-body encoded as per Transfer-Encoding> - - Transfer-Encoding MUST be used to indicate any transfer-codings - applied by an application to ensure safe and proper transfer of the - message. Transfer-Encoding is a property of the message, not of the - - - -Fielding, et al. Standards Track [Page 32] - -RFC 2616 HTTP/1.1 June 1999 - - - entity, and thus MAY be added or removed by any application along the - request/response chain. (However, section 3.6 places restrictions on - when certain transfer-codings may be used.) - - The rules for when a message-body is allowed in a message differ for - requests and responses. - - The presence of a message-body in a request is signaled by the - inclusion of a Content-Length or Transfer-Encoding header field in - the request's message-headers. A message-body MUST NOT be included in - a request if the specification of the request method (section 5.1.1) - does not allow sending an entity-body in requests. 
A server SHOULD - read and forward a message-body on any request; if the request method - does not include defined semantics for an entity-body, then the - message-body SHOULD be ignored when handling the request. - - For response messages, whether or not a message-body is included with - a message is dependent on both the request method and the response - status code (section 6.1.1). All responses to the HEAD request method - MUST NOT include a message-body, even though the presence of entity- - header fields might lead one to believe they do. All 1xx - (informational), 204 (no content), and 304 (not modified) responses - MUST NOT include a message-body. All other responses do include a - message-body, although it MAY be of zero length. - -4.4 Message Length - - The transfer-length of a message is the length of the message-body as - it appears in the message; that is, after any transfer-codings have - been applied. When a message-body is included with a message, the - transfer-length of that body is determined by one of the following - (in order of precedence): - - 1.Any response message which "MUST NOT" include a message-body (such - as the 1xx, 204, and 304 responses and any response to a HEAD - request) is always terminated by the first empty line after the - header fields, regardless of the entity-header fields present in - the message. - - 2.If a Transfer-Encoding header field (section 14.41) is present and - has any value other than "identity", then the transfer-length is - defined by use of the "chunked" transfer-coding (section 3.6), - unless the message is terminated by closing the connection. - - [[ Remove 'and has any value other than "identity"' ]] - - 3.If a Content-Length header field (section 14.13) is present, its - decimal value in OCTETs represents both the entity-length and the - transfer-length. The Content-Length header field MUST NOT be sent - if these two lengths are different (i.e., if a Transfer-Encoding - - - -Fielding, et al. 
Standards Track [Page 33] - -RFC 2616 HTTP/1.1 June 1999 - - - header field is present). If a message is received with both a - Transfer-Encoding header field and a Content-Length header field, - the latter MUST be ignored. - - 4.If the message uses the media type "multipart/byteranges", and the - transfer-length is not otherwise specified, then this self- - delimiting media type defines the transfer-length. This media type - MUST NOT be used unless the sender knows that the recipient can parse - it; the presence in a request of a Range header with multiple byte- - range specifiers from a 1.1 client implies that the client can parse - multipart/byteranges responses. - - A range header might be forwarded by a 1.0 proxy that does not - understand multipart/byteranges; in this case the server MUST - delimit the message using methods defined in items 1,3 or 5 of - this section. - - 5.By the server closing the connection. (Closing the connection - cannot be used to indicate the end of a request body, since that - would leave no possibility for the server to send back a response.) - - For compatibility with HTTP/1.0 applications, HTTP/1.1 requests - containing a message-body MUST include a valid Content-Length header - field unless the server is known to be HTTP/1.1 compliant. If a - request contains a message-body and a Content-Length is not given, - the server SHOULD respond with 400 (bad request) if it cannot - determine the length of the message, or with 411 (length required) if - it wishes to insist on receiving a valid Content-Length. - - All HTTP/1.1 applications that receive entities MUST accept the - "chunked" transfer-coding (section 3.6), thus allowing this mechanism - to be used for messages when the message length cannot be determined - in advance. - - Messages MUST NOT include both a Content-Length header field and a - non-identity transfer-coding. If the message does include a non- - identity transfer-coding, the Content-Length MUST be ignored.
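The order of precedence in section 4.4 can be sketched as a small decision function. This is a minimal illustration under stated assumptions, not code from libsoup or from the RFC: the name `body_framing`, its parameters, the lower-cased `headers` dictionary, and the returned labels are all hypothetical. Following the errata notes in this section, any Transfer-Encoding header field takes priority over Content-Length.

```python
def body_framing(headers, is_response, status=None, request_method=None):
    """Return how the message-body is delimited (RFC 2616 section 4.4).

    `headers` maps lower-cased field names to values (an assumption of
    this sketch). The result is "none", "chunked", ("length", n),
    "multipart/byteranges", or "close".
    """
    # Rule 1: responses that MUST NOT include a message-body.
    if is_response and (
        request_method == "HEAD"
        or status in (204, 304)
        or (status is not None and 100 <= status <= 199)
    ):
        return "none"

    # Rule 2: Transfer-Encoding wins; Content-Length is ignored if present.
    transfer_encoding = headers.get("transfer-encoding")
    if transfer_encoding is not None:
        codings = [c.strip().lower() for c in transfer_encoding.split(",")]
        if codings[-1] == "chunked":
            return "chunked"
        if is_response:
            return "close"  # terminated by closing the connection
        # A request cannot be delimited by closing the connection.
        raise ValueError("request body length cannot be determined")

    # Rule 3: Content-Length gives the length in OCTETs.
    content_length = headers.get("content-length")
    if content_length is not None:
        value = content_length.strip()
        if not (value.isascii() and value.isdigit()):
            raise ValueError("invalid Content-Length")
        return ("length", int(value))

    # Rule 4: self-delimiting multipart/byteranges.
    media_type = headers.get("content-type", "").split(";")[0].strip().lower()
    if media_type == "multipart/byteranges":
        return "multipart/byteranges"

    # Rule 5: only a response can end by closing the connection; a request
    # without Content-Length or Transfer-Encoding has no message-body.
    return "close" if is_response else "none"
```

A caller would map the `ValueError` cases to a 400 (Bad Request) or 411 (Length Required) response, as the text above permits.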
- - [[ Remove "non-identity" both times ]] - - When a Content-Length is given in a message where a message-body is - allowed, its field value MUST exactly match the number of OCTETs in - the message-body. HTTP/1.1 user agents MUST notify the user when an - invalid length is received and detected. - -4.5 General Header Fields - - There are a few header fields which have general applicability for - both request and response messages, but which do not apply to the - entity being transferred. These header fields apply only to the - - - -Fielding, et al. Standards Track [Page 34] - -RFC 2616 HTTP/1.1 June 1999 - - - message being transmitted. - - general-header = Cache-Control ; Section 14.9 - | Connection ; Section 14.10 - | Date ; Section 14.18 - | Pragma ; Section 14.32 - | Trailer ; Section 14.40 - | Transfer-Encoding ; Section 14.41 - | Upgrade ; Section 14.42 - | Via ; Section 14.45 - | Warning ; Section 14.46 - - General-header field names can be extended reliably only in - combination with a change in the protocol version. However, new or - experimental header fields may be given the semantics of general - header fields if all parties in the communication recognize them to - be general-header fields. Unrecognized header fields are treated as - entity-header fields. - -5 Request - - A request message from a client to a server includes, within the - first line of that message, the method to be applied to the resource, - the identifier of the resource, and the protocol version in use. - - Request = Request-Line ; Section 5.1 - *(( general-header ; Section 4.5 - | request-header ; Section 5.3 - | entity-header ) CRLF) ; Section 7.1 - CRLF - [ message-body ] ; Section 4.3 - -5.1 Request-Line - - The Request-Line begins with a method token, followed by the - Request-URI and the protocol version, and ending with CRLF. The - elements are separated by SP characters. No CR or LF is allowed - except in the final CRLF sequence. 
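The Request-Line layout just described (method token, single SP, Request-URI, single SP, HTTP-Version, CRLF) can be sketched as a strict parser. This is an illustrative sketch only: `parse_request_line` is a hypothetical name, the token character class follows the general token rule of this specification, and the Request-URI is accepted as any run of characters without SP, CR or LF rather than being validated against the URI grammar.

```python
import re

# token = 1*<any CHAR except CTLs or separators>
_TOKEN = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+"

_REQUEST_LINE = re.compile(
    r"(?P<method>" + _TOKEN + r") "
    r"(?P<uri>[^ \r\n]+) "
    r"HTTP/(?P<major>[0-9]+)\.(?P<minor>[0-9]+)\r\n"
)


def parse_request_line(line):
    """Split a Request-Line into (method, request_uri, (major, minor)).

    Raises ValueError if the line does not match the layout, for example
    when elements are separated by more than one SP or the line does not
    end in CRLF.
    """
    match = _REQUEST_LINE.fullmatch(line)
    if match is None:
        raise ValueError("malformed Request-Line")
    version = (int(match.group("major")), int(match.group("minor")))
    # The method is case-sensitive, so it is returned unchanged.
    return match.group("method"), match.group("uri"), version
```

For instance, `parse_request_line("OPTIONS * HTTP/1.1\r\n")` yields the method `OPTIONS`, the Request-URI `*`, and the version pair `(1, 1)`.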
- - Request-Line = Method SP Request-URI SP HTTP-Version CRLF - - - - - - - - - - - -Fielding, et al. Standards Track [Page 35] - -RFC 2616 HTTP/1.1 June 1999 - - -5.1.1 Method - - The Method token indicates the method to be performed on the - resource identified by the Request-URI. The method is case-sensitive. - - Method = "OPTIONS" ; Section 9.2 - | "GET" ; Section 9.3 - | "HEAD" ; Section 9.4 - | "POST" ; Section 9.5 - | "PUT" ; Section 9.6 - | "DELETE" ; Section 9.7 - | "TRACE" ; Section 9.8 - | "CONNECT" ; Section 9.9 - | extension-method - extension-method = token - - The list of methods allowed by a resource can be specified in an - Allow header field (section 14.7). The return code of the response - always notifies the client whether a method is currently allowed on a - resource, since the set of allowed methods can change dynamically. An - origin server SHOULD return the status code 405 (Method Not Allowed) - if the method is known by the origin server but not allowed for the - requested resource, and 501 (Not Implemented) if the method is - unrecognized or not implemented by the origin server. The methods GET - and HEAD MUST be supported by all general-purpose servers. All other - methods are OPTIONAL; however, if the above methods are implemented, - they MUST be implemented with the same semantics as those specified - in section 9. - -5.1.2 Request-URI - - The Request-URI is a Uniform Resource Identifier (section 3.2) and - identifies the resource upon which to apply the request. - - Request-URI = "*" | absoluteURI | abs_path | authority - [[ Request-URI = "*" | absoluteURI | abs_path [ "?" query ] | authority ]] - - The four options for Request-URI are dependent on the nature of the - request. The asterisk "*" means that the request does not apply to a - particular resource, but to the server itself, and is only allowed - when the method used does not necessarily apply to a resource. 
One - example would be - - OPTIONS * HTTP/1.1 - - The absoluteURI form is REQUIRED when the request is being made to a - proxy. The proxy is requested to forward the request or service it - from a valid cache, and return the response. Note that the proxy MAY - forward the request on to another proxy or directly to the server - - - -Fielding, et al. Standards Track [Page 36] - -RFC 2616 HTTP/1.1 June 1999 - - - specified by the absoluteURI. In order to avoid request loops, a - proxy MUST be able to recognize all of its server names, including - any aliases, local variations, and the numeric IP address. An example - Request-Line would be: - - GET http://www.w3.org/pub/WWW/TheProject.html HTTP/1.1 - - To allow for transition to absoluteURIs in all requests in future - versions of HTTP, all HTTP/1.1 servers MUST accept the absoluteURI - form in requests, even though HTTP/1.1 clients will only generate - them in requests to proxies. - - The authority form is only used by the CONNECT method (section 9.9). - - The most common form of Request-URI is that used to identify a - resource on an origin server or gateway. In this case the absolute - path of the URI MUST be transmitted (see section 3.2.1, abs_path) as - the Request-URI, and the network location of the URI (authority) MUST - be transmitted in a Host header field. For example, a client wishing - to retrieve the resource above directly from the origin server would - create a TCP connection to port 80 of the host "www.w3.org" and send - the lines: - - GET /pub/WWW/TheProject.html HTTP/1.1 - Host: www.w3.org - - followed by the remainder of the Request. Note that the absolute path - cannot be empty; if none is present in the original URI, it MUST be - given as "/" (the server root). - - The Request-URI is transmitted in the format specified in section - 3.2.1. If the Request-URI is encoded using the "% HEX HEX" encoding - [42], the origin server MUST decode the Request-URI in order to - properly interpret the request. 
Servers SHOULD respond to invalid - Request-URIs with an appropriate status code. - - A transparent proxy MUST NOT rewrite the "abs_path" part of the - received Request-URI when forwarding it to the next inbound server, - except as noted above to replace a null abs_path with "/". - - Note: The "no rewrite" rule prevents the proxy from changing the - meaning of the request when the origin server is improperly using - a non-reserved URI character for a reserved purpose. Implementors - should be aware that some pre-HTTP/1.1 proxies have been known to - rewrite the Request-URI. - - - - - - -Fielding, et al. Standards Track [Page 37] - -RFC 2616 HTTP/1.1 June 1999 - - -5.2 The Resource Identified by a Request - - The exact resource identified by an Internet request is determined by - examining both the Request-URI and the Host header field. - - An origin server that does not allow resources to differ by the - requested host MAY ignore the Host header field value when - determining the resource identified by an HTTP/1.1 request. (But see - section 19.6.1.1 for other requirements on Host support in HTTP/1.1.) - - An origin server that does differentiate resources based on the host - requested (sometimes referred to as virtual hosts or vanity host - names) MUST use the following rules for determining the requested - resource on an HTTP/1.1 request: - - 1. If Request-URI is an absoluteURI, the host is part of the - Request-URI. Any Host header field value in the request MUST be - ignored. - - 2. If the Request-URI is not an absoluteURI, and the request includes - a Host header field, the host is determined by the Host header - field value. - - 3. If the host as determined by rule 1 or 2 is not a valid host on - the server, the response MUST be a 400 (Bad Request) error message. 
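The three rules above can be sketched as follows. This is a minimal illustration, not libsoup code: `effective_host`, `BadRequest`, and the `valid_hosts` set are hypothetical names, and the comparison is a plain case-insensitive match on the authority string, with no port normalization.

```python
from urllib.parse import urlsplit


class BadRequest(Exception):
    """Stands in for a 400 (Bad Request) response in this sketch."""


def effective_host(request_uri, host_header, valid_hosts):
    """Pick the host for a request per RFC 2616 section 5.2.

    `valid_hosts` is a set of lower-cased host names served here (an
    assumption of this sketch). Returns None when neither rule 1 nor
    rule 2 yields a host, leaving any heuristics to the caller.
    """
    parts = urlsplit(request_uri)
    if parts.scheme and parts.netloc:
        # Rule 1: absoluteURI; any Host header field value is ignored.
        host = parts.netloc
    elif host_header is not None:
        # Rule 2: not an absoluteURI; use the Host header field value.
        host = host_header.strip()
    else:
        return None

    host = host.lower()
    if host not in valid_hosts:
        # Rule 3: the host is not valid on this server.
        raise BadRequest(host)
    return host
```

With `valid_hosts = {"www.w3.org"}`, both the absoluteURI request from section 5.1.2 and the abs_path request with `Host: www.w3.org` resolve to the same host.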
- - Recipients of an HTTP/1.0 request that lacks a Host header field MAY - attempt to use heuristics (e.g., examination of the URI path for - something unique to a particular host) in order to determine what - exact resource is being requested. - -5.3 Request Header Fields - - The request-header fields allow the client to pass additional - information about the request, and about the client itself, to the - server. These fields act as request modifiers, with semantics - equivalent to the parameters on a programming language method - invocation. - - request-header = Accept ; Section 14.1 - | Accept-Charset ; Section 14.2 - | Accept-Encoding ; Section 14.3 - | Accept-Language ; Section 14.4 - | Authorization ; Section 14.8 - | Expect ; Section 14.20 - | From ; Section 14.22 - | Host ; Section 14.23 - | If-Match ; Section 14.24 - - - -Fielding, et al. Standards Track [Page 38] - -RFC 2616 HTTP/1.1 June 1999 - - - | If-Modified-Since ; Section 14.25 - | If-None-Match ; Section 14.26 - | If-Range ; Section 14.27 - | If-Unmodified-Since ; Section 14.28 - | Max-Forwards ; Section 14.31 - | Proxy-Authorization ; Section 14.34 - | Range ; Section 14.35 - | Referer ; Section 14.36 - | TE ; Section 14.39 - | User-Agent ; Section 14.43 - - Request-header field names can be extended reliably only in - combination with a change in the protocol version. However, new or - experimental header fields MAY be given the semantics of request- - header fields if all parties in the communication recognize them to - be request-header fields. Unrecognized header fields are treated as - entity-header fields. - -6 Response - - After receiving and interpreting a request message, a server responds - with an HTTP response message. 
- - Response = Status-Line ; Section 6.1 - *(( general-header ; Section 4.5 - | response-header ; Section 6.2 - | entity-header ) CRLF) ; Section 7.1 - CRLF - [ message-body ] ; Section 7.2 - -6.1 Status-Line - - The first line of a Response message is the Status-Line, consisting - of the protocol version followed by a numeric status code and its - associated textual phrase, with each element separated by SP - characters. No CR or LF is allowed except in the final CRLF sequence. - - Status-Line = HTTP-Version SP Status-Code SP Reason-Phrase CRLF - -6.1.1 Status Code and Reason Phrase - - The Status-Code element is a 3-digit integer result code of the - attempt to understand and satisfy the request. These codes are fully - defined in section 10. The Reason-Phrase is intended to give a short - textual description of the Status-Code. The Status-Code is intended - for use by automata and the Reason-Phrase is intended for the human - user. The client is not required to examine or display the Reason- - Phrase. - - - -Fielding, et al. Standards Track [Page 39] - -RFC 2616 HTTP/1.1 June 1999 - - - The first digit of the Status-Code defines the class of response. The - last two digits do not have any categorization role. There are 5 - values for the first digit: - - - 1xx: Informational - Request received, continuing process - - - 2xx: Success - The action was successfully received, - understood, and accepted - - - 3xx: Redirection - Further action must be taken in order to - complete the request - - - 4xx: Client Error - The request contains bad syntax or cannot - be fulfilled - - - 5xx: Server Error - The server failed to fulfill an apparently - valid request - - The individual values of the numeric status codes defined for - HTTP/1.1, and an example set of corresponding Reason-Phrase's, are - presented below. The reason phrases listed here are only - recommendations -- they MAY be replaced by local equivalents without - affecting the protocol. 
- - Status-Code = - "100" ; Section 10.1.1: Continue - | "101" ; Section 10.1.2: Switching Protocols - | "200" ; Section 10.2.1: OK - | "201" ; Section 10.2.2: Created - | "202" ; Section 10.2.3: Accepted - | "203" ; Section 10.2.4: Non-Authoritative Information - | "204" ; Section 10.2.5: No Content - | "205" ; Section 10.2.6: Reset Content - | "206" ; Section 10.2.7: Partial Content - | "300" ; Section 10.3.1: Multiple Choices - | "301" ; Section 10.3.2: Moved Permanently - | "302" ; Section 10.3.3: Found - | "303" ; Section 10.3.4: See Other - | "304" ; Section 10.3.5: Not Modified - | "305" ; Section 10.3.6: Use Proxy - | "307" ; Section 10.3.8: Temporary Redirect - | "400" ; Section 10.4.1: Bad Request - | "401" ; Section 10.4.2: Unauthorized - | "402" ; Section 10.4.3: Payment Required - | "403" ; Section 10.4.4: Forbidden - | "404" ; Section 10.4.5: Not Found - | "405" ; Section 10.4.6: Method Not Allowed - | "406" ; Section 10.4.7: Not Acceptable - - - -Fielding, et al. Standards Track [Page 40] - -RFC 2616 HTTP/1.1 June 1999 - - - | "407" ; Section 10.4.8: Proxy Authentication Required - | "408" ; Section 10.4.9: Request Time-out - | "409" ; Section 10.4.10: Conflict - | "410" ; Section 10.4.11: Gone - | "411" ; Section 10.4.12: Length Required - | "412" ; Section 10.4.13: Precondition Failed - | "413" ; Section 10.4.14: Request Entity Too Large - | "414" ; Section 10.4.15: Request-URI Too Large - | "415" ; Section 10.4.16: Unsupported Media Type - | "416" ; Section 10.4.17: Requested range not satisfiable - | "417" ; Section 10.4.18: Expectation Failed - | "500" ; Section 10.5.1: Internal Server Error - | "501" ; Section 10.5.2: Not Implemented - | "502" ; Section 10.5.3: Bad Gateway - | "503" ; Section 10.5.4: Service Unavailable - | "504" ; Section 10.5.5: Gateway Time-out - | "505" ; Section 10.5.6: HTTP Version not supported - | extension-code - - extension-code = 3DIGIT - Reason-Phrase = *<TEXT, excluding CR, LF> - - HTTP status codes are extensible. 
HTTP applications are not required - to understand the meaning of all registered status codes, though such - understanding is obviously desirable. However, applications MUST - understand the class of any status code, as indicated by the first - digit, and treat any unrecognized response as being equivalent to the - x00 status code of that class, with the exception that an - unrecognized response MUST NOT be cached. For example, if an - unrecognized status code of 431 is received by the client, it can - safely assume that there was something wrong with its request and - treat the response as if it had received a 400 status code. In such - cases, user agents SHOULD present to the user the entity returned - with the response, since that entity is likely to include human- - readable information which will explain the unusual status. - -6.2 Response Header Fields - - The response-header fields allow the server to pass additional - information about the response which cannot be placed in the Status- - Line. These header fields give information about the server and about - further access to the resource identified by the Request-URI. - - response-header = Accept-Ranges ; Section 14.5 - | Age ; Section 14.6 - | ETag ; Section 14.19 - | Location ; Section 14.30 - | Proxy-Authenticate ; Section 14.33 - - - -Fielding, et al. Standards Track [Page 41] - -RFC 2616 HTTP/1.1 June 1999 - - - | Retry-After ; Section 14.37 - | Server ; Section 14.38 - | Vary ; Section 14.44 - | WWW-Authenticate ; Section 14.47 - - Response-header field names can be extended reliably only in - combination with a change in the protocol version. However, new or - experimental header fields MAY be given the semantics of response- - header fields if all parties in the communication recognize them to - be response-header fields. Unrecognized header fields are treated as - entity-header fields. 
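The class-based fallback for unrecognized status codes described in section 6.1.1 can be sketched as follows. This is an illustrative sketch: `effective_status` and `KNOWN_STATUS_CODES` are hypothetical names, and the set of known codes is simply the list enumerated in the Status-Code grammar of section 6.1.1.

```python
# Status codes enumerated in the Status-Code grammar of section 6.1.1.
KNOWN_STATUS_CODES = frozenset(
    [100, 101]
    + list(range(200, 207))
    + [300, 301, 302, 303, 304, 305, 307]
    + list(range(400, 418))
    + list(range(500, 506))
)


def effective_status(code):
    """Return (code_to_act_on, recognized) for a received status code.

    An unrecognized code is treated as the x00 code of its class. When
    `recognized` is False the response MUST NOT be cached.
    """
    if not 100 <= code <= 599:
        raise ValueError("status code outside the five defined classes")
    if code in KNOWN_STATUS_CODES:
        return code, True
    return (code // 100) * 100, False
```

Applied to the example in the text, `effective_status(431)` gives `(400, False)`: the client acts as if it had received a 400 and does not cache the response.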
- -7 Entity - - Request and Response messages MAY transfer an entity if not otherwise - restricted by the request method or response status code. An entity - consists of entity-header fields and an entity-body, although some - responses will only include the entity-headers. - - In this section, both sender and recipient refer to either the client - or the server, depending on who sends and who receives the entity. - -7.1 Entity Header Fields - - Entity-header fields define metainformation about the entity-body or, - if no body is present, about the resource identified by the request. - Some of this metainformation is OPTIONAL; some might be REQUIRED by - portions of this specification. - - entity-header = Allow ; Section 14.7 - | Content-Encoding ; Section 14.11 - | Content-Language ; Section 14.12 - | Content-Length ; Section 14.13 - | Content-Location ; Section 14.14 - | Content-MD5 ; Section 14.15 - | Content-Range ; Section 14.16 - | Content-Type ; Section 14.17 - | Expires ; Section 14.21 - | Last-Modified ; Section 14.29 - | extension-header - - extension-header = message-header - - The extension-header mechanism allows additional entity-header fields - to be defined without changing the protocol, but these fields cannot - be assumed to be recognizable by the recipient. Unrecognized header - fields SHOULD be ignored by the recipient and MUST be forwarded by - transparent proxies. - - - -Fielding, et al. Standards Track [Page 42] - -RFC 2616 HTTP/1.1 June 1999 - - -7.2 Entity Body - - The entity-body (if any) sent with an HTTP request or response is in - a format and encoding defined by the entity-header fields. - - entity-body = *OCTET - - An entity-body is only present in a message when a message-body is - present, as described in section 4.3. The entity-body is obtained - from the message-body by decoding any Transfer-Encoding that might - have been applied to ensure safe and proper transfer of the message. 
- -7.2.1 Type - - When an entity-body is included with a message, the data type of that - body is determined via the header fields Content-Type and Content- - Encoding. These define a two-layer, ordered encoding model: - - entity-body := Content-Encoding( Content-Type( data ) ) - - Content-Type specifies the media type of the underlying data. - Content-Encoding may be used to indicate any additional content - codings applied to the data, usually for the purpose of data - compression, that are a property of the requested resource. There is - no default encoding. - - Any HTTP/1.1 message containing an entity-body SHOULD include a - Content-Type header field defining the media type of that body. If - and only if the media type is not given by a Content-Type field, the - recipient MAY attempt to guess the media type via inspection of its - content and/or the name extension(s) of the URI used to identify the - resource. If the media type remains unknown, the recipient SHOULD - treat it as type "application/octet-stream". - -7.2.2 Entity Length - - The entity-length of a message is the length of the message-body - before any transfer-codings have been applied. Section 4.4 defines - how the transfer-length of a message-body is determined. - - - - - - - - - - - - -Fielding, et al. Standards Track [Page 43] - -RFC 2616 HTTP/1.1 June 1999 - - -8 Connections - -8.1 Persistent Connections - -8.1.1 Purpose - - Prior to persistent connections, a separate TCP connection was - established to fetch each URL, increasing the load on HTTP servers - and causing congestion on the Internet. The use of inline images and - other associated data often require a client to make multiple - requests of the same server in a short amount of time. Analysis of - these performance problems and results from a prototype - implementation are available [26] [30]. Implementation experience and - measurements of actual HTTP/1.1 (RFC 2068) implementations show good - results [39]. 
Alternatives have also been explored, for example, - T/TCP [27]. - - Persistent HTTP connections have a number of advantages: - - - By opening and closing fewer TCP connections, CPU time is saved - in routers and hosts (clients, servers, proxies, gateways, - tunnels, or caches), and memory used for TCP protocol control - blocks can be saved in hosts. - - - HTTP requests and responses can be pipelined on a connection. - Pipelining allows a client to make multiple requests without - waiting for each response, allowing a single TCP connection to - be used much more efficiently, with much lower elapsed time. - - - Network congestion is reduced by reducing the number of packets - caused by TCP opens, and by allowing TCP sufficient time to - determine the congestion state of the network. - - - Latency on subsequent requests is reduced since there is no time - spent in TCP's connection opening handshake. - - - HTTP can evolve more gracefully, since errors can be reported - without the penalty of closing the TCP connection. Clients using - future versions of HTTP might optimistically try a new feature, - but if communicating with an older server, retry with old - semantics after an error is reported. - - HTTP implementations SHOULD implement persistent connections. - - - - - - - - -Fielding, et al. Standards Track [Page 44] - -RFC 2616 HTTP/1.1 June 1999 - - -8.1.2 Overall Operation - - A significant difference between HTTP/1.1 and earlier versions of - HTTP is that persistent connections are the default behavior of any - HTTP connection. That is, unless otherwise indicated, the client - SHOULD assume that the server will maintain a persistent connection, - even after error responses from the server. - - Persistent connections provide a mechanism by which a client and a - server can signal the close of a TCP connection. This signaling takes - place using the Connection header field (section 14.10). 
Once a close - has been signaled, the client MUST NOT send any more requests on that - connection. - -8.1.2.1 Negotiation - - An HTTP/1.1 server MAY assume that a HTTP/1.1 client intends to - maintain a persistent connection unless a Connection header including - the connection-token "close" was sent in the request. If the server - chooses to close the connection immediately after sending the - response, it SHOULD send a Connection header including the - connection-token close. - - An HTTP/1.1 client MAY expect a connection to remain open, but would - decide to keep it open based on whether the response from a server - contains a Connection header with the connection-token close. In case - the client does not want to maintain a connection for more than that - request, it SHOULD send a Connection header including the - connection-token close. - - If either the client or the server sends the close token in the - Connection header, that request becomes the last one for the - connection. - - Clients and servers SHOULD NOT assume that a persistent connection is - maintained for HTTP versions less than 1.1 unless it is explicitly - signaled. See section 19.6.2 for more information on backward - compatibility with HTTP/1.0 clients. - - In order to remain persistent, all messages on the connection MUST - have a self-defined message length (i.e., one not defined by closure - of the connection), as described in section 4.4. - - - - - - - - - -Fielding, et al. Standards Track [Page 45] - -RFC 2616 HTTP/1.1 June 1999 - - -8.1.2.2 Pipelining - - A client that supports persistent connections MAY "pipeline" its - requests (i.e., send multiple requests without waiting for each - response). A server MUST send its responses to those requests in the - same order that the requests were received. - - Clients which assume persistent connections and pipeline immediately - after connection establishment SHOULD be prepared to retry their - connection if the first pipelined attempt fails. 
If a client does - such a retry, it MUST NOT pipeline before it knows the connection is - persistent. Clients MUST also be prepared to resend their requests if - the server closes the connection before sending all of the - corresponding responses. - - Clients SHOULD NOT pipeline requests using non-idempotent methods or - non-idempotent sequences of methods (see section 9.1.2). Otherwise, a - premature termination of the transport connection could lead to - indeterminate results. A client wishing to send a non-idempotent - request SHOULD wait to send that request until it has received the - response status for the previous request. - -8.1.3 Proxy Servers - - It is especially important that proxies correctly implement the - properties of the Connection header field as specified in section - 14.10. - - The proxy server MUST signal persistent connections separately with - its clients and the origin servers (or other proxy servers) that it - connects to. Each persistent connection applies to only one transport - link. - - A proxy server MUST NOT establish a HTTP/1.1 persistent connection - with an HTTP/1.0 client (but see RFC 2068 [33] for information and - discussion of the problems with the Keep-Alive header implemented by - many HTTP/1.0 clients). - -8.1.4 Practical Considerations - - Servers will usually have some time-out value beyond which they will - no longer maintain an inactive connection. Proxy servers might make - this a higher value since it is likely that the client will be making - more connections through the same server. The use of persistent - connections places no requirements on the length (or existence) of - this time-out for either the client or the server. - - - - - -Fielding, et al. Standards Track [Page 46] - -RFC 2616 HTTP/1.1 June 1999 - - - When a client or server wishes to time-out it SHOULD issue a graceful - close on the transport connection. 
Clients and servers SHOULD both - constantly watch for the other side of the transport close, and - respond to it as appropriate. If a client or server does not detect - the other side's close promptly it could cause unnecessary resource - drain on the network. - - A client, server, or proxy MAY close the transport connection at any - time. For example, a client might have started to send a new request - at the same time that the server has decided to close the "idle" - connection. From the server's point of view, the connection is being - closed while it was idle, but from the client's point of view, a - request is in progress. - - This means that clients, servers, and proxies MUST be able to recover - from asynchronous close events. Client software SHOULD reopen the - transport connection and retransmit the aborted sequence of requests - without user interaction so long as the request sequence is - idempotent (see section 9.1.2). Non-idempotent methods or sequences - MUST NOT be automatically retried, although user agents MAY offer a - human operator the choice of retrying the request(s). Confirmation by - user-agent software with semantic understanding of the application - MAY substitute for user confirmation. The automatic retry SHOULD NOT - be repeated if the second sequence of requests fails. - - Servers SHOULD always respond to at least one request per connection, - if at all possible. Servers SHOULD NOT close a connection in the - middle of transmitting a response, unless a network or client failure - is suspected. - - Clients that use persistent connections SHOULD limit the number of - simultaneous connections that they maintain to a given server. A - single-user client SHOULD NOT maintain more than 2 connections with - any server or proxy. A proxy SHOULD use up to 2*N connections to - another server or proxy, where N is the number of simultaneously - active users. These guidelines are intended to improve HTTP response - times and avoid congestion. 
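The retry guidance above for asynchronous close events can be approximated with a small check. This is a hedged sketch, not part of the RFC: `IDEMPOTENT_METHODS` and `may_retry_automatically` are hypothetical names, and checking method names alone is only a necessary condition, since section 9.1.2 notes that a sequence of idempotent methods can still be non-idempotent as a whole.

```python
# Illustrative sketch; names are hypothetical and not taken from RFC 2616 or libsoup.

# Methods that section 9.1.2 describes as idempotent.
IDEMPOTENT_METHODS = frozenset({"GET", "HEAD", "PUT", "DELETE", "OPTIONS", "TRACE"})


def may_retry_automatically(methods, retries_so_far):
    """Decide whether an aborted request sequence may be resent without asking the user.

    methods        -- method names of the aborted sequence, in order
    retries_so_far -- automatic retries already attempted for this sequence
    """
    # The automatic retry is not repeated if the second attempt also fails.
    if retries_so_far >= 1:
        return False
    # Any non-idempotent method rules out an automatic retry.
    # Caveat: this does not detect sequences that are non-idempotent as a whole.
    return all(method.upper() in IDEMPOTENT_METHODS for method in methods)
```

A caller would fall back to asking the user (or to application-specific confirmation) whenever this returns `False`.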
- -8.2 Message Transmission Requirements - -8.2.1 Persistent Connections and Flow Control - - HTTP/1.1 servers SHOULD maintain persistent connections and use TCP's - flow control mechanisms to resolve temporary overloads, rather than - terminating connections with the expectation that clients will retry. - The latter technique can exacerbate network congestion. - - - - - -Fielding, et al. Standards Track [Page 47] - -RFC 2616 HTTP/1.1 June 1999 - - -8.2.2 Monitoring Connections for Error Status Messages - - An HTTP/1.1 (or later) client sending a message-body SHOULD monitor - the network connection for an error status while it is transmitting - the request. If the client sees an error status, it SHOULD - immediately cease transmitting the body. If the body is being sent - using a "chunked" encoding (section 3.6), a zero length chunk and - empty trailer MAY be used to prematurely mark the end of the message. - If the body was preceded by a Content-Length header, the client MUST - close the connection. - -8.2.3 Use of the 100 (Continue) Status - - The purpose of the 100 (Continue) status (see section 10.1.1) is to - allow a client that is sending a request message with a request body - to determine if the origin server is willing to accept the request - (based on the request headers) before the client sends the request - body. In some cases, it might either be inappropriate or highly - inefficient for the client to send the body if the server will reject - the message without looking at the body. - - Requirements for HTTP/1.1 clients: - - - If a client will wait for a 100 (Continue) response before - sending the request body, it MUST send an Expect request-header - field (section 14.20) with the "100-continue" expectation. - - - A client MUST NOT send an Expect request-header field (section - 14.20) with the "100-continue" expectation if it does not intend - to send a request body. 
- - Because of the presence of older implementations, the protocol allows - ambiguous situations in which a client may send "Expect: 100- - continue" without receiving either a 417 (Expectation Failed) status - or a 100 (Continue) status. Therefore, when a client sends this - header field to an origin server (possibly via a proxy) from which it - has never seen a 100 (Continue) status, the client SHOULD NOT wait - for an indefinite period before sending the request body. - - Requirements for HTTP/1.1 origin servers: - - - Upon receiving a request which includes an Expect request-header - field with the "100-continue" expectation, an origin server MUST - either respond with 100 (Continue) status and continue to read - from the input stream, or respond with a final status code. The - origin server MUST NOT wait for the request body before sending - the 100 (Continue) response. If it responds with a final status - code, it MAY close the transport connection or it MAY continue - - - -Fielding, et al. Standards Track [Page 48] - -RFC 2616 HTTP/1.1 June 1999 - - - to read and discard the rest of the request. It MUST NOT - perform the requested method if it returns a final status code. - - - An origin server SHOULD NOT send a 100 (Continue) response if - the request message does not include an Expect request-header - field with the "100-continue" expectation, and MUST NOT send a - 100 (Continue) response if such a request comes from an HTTP/1.0 - (or earlier) client. There is an exception to this rule: for - compatibility with RFC 2068, a server MAY send a 100 (Continue) - status in response to an HTTP/1.1 PUT or POST request that does - not include an Expect request-header field with the "100- - continue" expectation. This exception, the purpose of which is - to minimize any client processing delays associated with an - undeclared wait for 100 (Continue) status, applies only to - HTTP/1.1 requests, and not to requests with any other HTTP- - version value. 
- - - An origin server MAY omit a 100 (Continue) response if it has - already received some or all of the request body for the - corresponding request. - - - An origin server that sends a 100 (Continue) response MUST - ultimately send a final status code, once the request body is - received and processed, unless it terminates the transport - connection prematurely. - - - If an origin server receives a request that does not include an - Expect request-header field with the "100-continue" expectation, - the request includes a request body, and the server responds - with a final status code before reading the entire request body - from the transport connection, then the server SHOULD NOT close - the transport connection until it has read the entire request, - or until the client closes the connection. Otherwise, the client - might not reliably receive the response message. However, this - requirement is not be construed as preventing a server from - defending itself against denial-of-service attacks, or from - badly broken client implementations. - - Requirements for HTTP/1.1 proxies: - - - If a proxy receives a request that includes an Expect request- - header field with the "100-continue" expectation, and the proxy - either knows that the next-hop server complies with HTTP/1.1 or - higher, or does not know the HTTP version of the next-hop - server, it MUST forward the request, including the Expect header - field. - - - - - -Fielding, et al. Standards Track [Page 49] - -RFC 2616 HTTP/1.1 June 1999 - - - - If the proxy knows that the version of the next-hop server is - HTTP/1.0 or lower, it MUST NOT forward the request, and it MUST - respond with a 417 (Expectation Failed) status. - - - Proxies SHOULD maintain a cache recording the HTTP version - numbers received from recently-referenced next-hop servers. 
- - - A proxy MUST NOT forward a 100 (Continue) response if the - request message was received from an HTTP/1.0 (or earlier) - client and did not include an Expect request-header field with - the "100-continue" expectation. This requirement overrides the - general rule for forwarding of 1xx responses (see section 10.1). - -8.2.4 Client Behavior if Server Prematurely Closes Connection - - If an HTTP/1.1 client sends a request which includes a request body, - but which does not include an Expect request-header field with the - "100-continue" expectation, and if the client is not directly - connected to an HTTP/1.1 origin server, and if the client sees the - connection close before receiving any status from the server, the - client SHOULD retry the request. If the client does retry this - request, it MAY use the following "binary exponential backoff" - algorithm to be assured of obtaining a reliable response: - - 1. Initiate a new connection to the server - - 2. Transmit the request-headers - - 3. Initialize a variable R to the estimated round-trip time to the - server (e.g., based on the time it took to establish the - connection), or to a constant value of 5 seconds if the round- - trip time is not available. - - 4. Compute T = R * (2**N), where N is the number of previous - retries of this request. - - 5. Wait either for an error response from the server, or for T - seconds (whichever comes first) - - 6. If no error response is received, after T seconds transmit the - body of the request. - - 7. If client sees that the connection is closed prematurely, - repeat from step 1 until the request is accepted, an error - response is received, or the user becomes impatient and - terminates the retry process. - - - - - -Fielding, et al. Standards Track [Page 50] - -RFC 2616 HTTP/1.1 June 1999 - - - If at any point an error status is received, the client - - - SHOULD NOT continue and - - - SHOULD close the connection if it has not completed sending the - request message. 
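The wait time in steps 3 and 4 of the "binary exponential backoff" algorithm above, T = R * (2**N), can be worked through in a few lines. This is a minimal sketch of the arithmetic only, not of a full client: `body_wait_seconds` and `DEFAULT_RTT_SECONDS` are hypothetical names, and the connection handling in the other steps is left out.

```python
# Illustrative sketch; names are hypothetical and not taken from RFC 2616 or libsoup.

# Step 3: constant used when no round-trip estimate is available.
DEFAULT_RTT_SECONDS = 5.0


def body_wait_seconds(previous_retries, estimated_rtt=None):
    """Return T, the number of seconds to wait before sending the request body.

    previous_retries -- N, the number of previous retries of this request
    estimated_rtt    -- R in seconds, or None if no estimate is available
    """
    if previous_retries < 0:
        raise ValueError("previous_retries must be non-negative")
    r = DEFAULT_RTT_SECONDS if estimated_rtt is None else estimated_rtt
    # Step 4: T = R * (2**N)
    return r * (2 ** previous_retries)
```

For example, with no round-trip estimate the first attempt waits 5 seconds, and after three previous retries the wait is 5 * 2**3 = 40 seconds.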
- -9 Method Definitions - - The set of common methods for HTTP/1.1 is defined below. Although - this set can be expanded, additional methods cannot be assumed to - share the same semantics for separately extended clients and servers. - - The Host request-header field (section 14.23) MUST accompany all - HTTP/1.1 requests. - -9.1 Safe and Idempotent Methods - -9.1.1 Safe Methods - - Implementors should be aware that the software represents the user in - their interactions over the Internet, and should be careful to allow - the user to be aware of any actions they might take which may have an - unexpected significance to themselves or others. - - In particular, the convention has been established that the GET and - HEAD methods SHOULD NOT have the significance of taking an action - other than retrieval. These methods ought to be considered "safe". - This allows user agents to represent other methods, such as POST, PUT - and DELETE, in a special way, so that the user is made aware of the - fact that a possibly unsafe action is being requested. - - Naturally, it is not possible to ensure that the server does not - generate side-effects as a result of performing a GET request; in - fact, some dynamic resources consider that a feature. The important - distinction here is that the user did not request the side-effects, - so therefore cannot be held accountable for them. - -9.1.2 Idempotent Methods - - Methods can also have the property of "idempotence" in that (aside - from error or expiration issues) the side-effects of N > 0 identical - requests is the same as for a single request. The methods GET, HEAD, - PUT and DELETE share this property. Also, the methods OPTIONS and - TRACE SHOULD NOT have side effects, and so are inherently idempotent. - - - - - - -Fielding, et al. 
Standards Track [Page 51] - -RFC 2616 HTTP/1.1 June 1999 - - - However, it is possible that a sequence of several requests is non- - idempotent, even if all of the methods executed in that sequence are - idempotent. (A sequence is idempotent if a single execution of the - entire sequence always yields a result that is not changed by a - reexecution of all, or part, of that sequence.) For example, a - sequence is non-idempotent if its result depends on a value that is - later modified in the same sequence. - - A sequence that never has side effects is idempotent, by definition - (provided that no concurrent operations are being executed on the - same set of resources). - -9.2 OPTIONS - - The OPTIONS method represents a request for information about the - communication options available on the request/response chain - identified by the Request-URI. This method allows the client to - determine the options and/or requirements associated with a resource, - or the capabilities of a server, without implying a resource action - or initiating a resource retrieval. - - Responses to this method are not cacheable. - - If the OPTIONS request includes an entity-body (as indicated by the - presence of Content-Length or Transfer-Encoding), then the media type - MUST be indicated by a Content-Type field. Although this - specification does not define any use for such a body, future - extensions to HTTP might use the OPTIONS body to make more detailed - queries on the server. A server that does not support such an - extension MAY discard the request body. - - If the Request-URI is an asterisk ("*"), the OPTIONS request is - intended to apply to the server in general rather than to a specific - resource. Since a server's communication options typically depend on - the resource, the "*" request is only useful as a "ping" or "no-op" - type of method; it does nothing beyond allowing the client to test - the capabilities of the server. 
For example, this can be used to test - a proxy for HTTP/1.1 compliance (or lack thereof). - - If the Request-URI is not an asterisk, the OPTIONS request applies - only to the options that are available when communicating with that - resource. - - A 200 response SHOULD include any header fields that indicate - optional features implemented by the server and applicable to that - resource (e.g., Allow), possibly including extensions not defined by - this specification. The response body, if any, SHOULD also include - information about the communication options. The format for such a - - - -Fielding, et al. Standards Track [Page 52] - -RFC 2616 HTTP/1.1 June 1999 - - - body is not defined by this specification, but might be defined by - future extensions to HTTP. Content negotiation MAY be used to select - the appropriate response format. If no response body is included, the - response MUST include a Content-Length field with a field-value of - "0". - - The Max-Forwards request-header field MAY be used to target a - specific proxy in the request chain. When a proxy receives an OPTIONS - request on an absoluteURI for which request forwarding is permitted, - the proxy MUST check for a Max-Forwards field. If the Max-Forwards - field-value is zero ("0"), the proxy MUST NOT forward the message; - instead, the proxy SHOULD respond with its own communication options. - If the Max-Forwards field-value is an integer greater than zero, the - proxy MUST decrement the field-value when it forwards the request. If - no Max-Forwards field is present in the request, then the forwarded - request MUST NOT include a Max-Forwards field. - -9.3 GET - - The GET method means retrieve whatever information (in the form of an - entity) is identified by the Request-URI. 
If the Request-URI refers - to a data-producing process, it is the produced data which shall be - returned as the entity in the response and not the source text of the - process, unless that text happens to be the output of the process. - - The semantics of the GET method change to a "conditional GET" if the - request message includes an If-Modified-Since, If-Unmodified-Since, - If-Match, If-None-Match, or If-Range header field. A conditional GET - method requests that the entity be transferred only under the - circumstances described by the conditional header field(s). The - conditional GET method is intended to reduce unnecessary network - usage by allowing cached entities to be refreshed without requiring - multiple requests or transferring data already held by the client. - - The semantics of the GET method change to a "partial GET" if the - request message includes a Range header field. A partial GET requests - that only part of the entity be transferred, as described in section - 14.35. The partial GET method is intended to reduce unnecessary - network usage by allowing partially-retrieved entities to be - completed without transferring data already held by the client. - - The response to a GET request is cacheable if and only if it meets - the requirements for HTTP caching described in section 13. - - See section 15.1.3 for security considerations when used for forms. - - - - - - -Fielding, et al. Standards Track [Page 53] - -RFC 2616 HTTP/1.1 June 1999 - - -9.4 HEAD - - The HEAD method is identical to GET except that the server MUST NOT - return a message-body in the response. The metainformation contained - in the HTTP headers in response to a HEAD request SHOULD be identical - to the information sent in response to a GET request. This method can - be used for obtaining metainformation about the entity implied by the - request without transferring the entity-body itself. 
This method is - often used for testing hypertext links for validity, accessibility, - and recent modification. - - The response to a HEAD request MAY be cacheable in the sense that the - information contained in the response MAY be used to update a - previously cached entity from that resource. If the new field values - indicate that the cached entity differs from the current entity (as - would be indicated by a change in Content-Length, Content-MD5, ETag - or Last-Modified), then the cache MUST treat the cache entry as - stale. - -9.5 POST - - The POST method is used to request that the origin server accept the - entity enclosed in the request as a new subordinate of the resource - identified by the Request-URI in the Request-Line. POST is designed - to allow a uniform method to cover the following functions: - -[[ Should be: ]] -[[ The POST method is used to request that the origin server accept the ]] -[[ entity enclosed in the request as data to be processed by the resource ]] -[[ identified by the Request-URI in the Request-Line. POST is designed ]] -[[ to allow a uniform method to cover the following functions: ]] - - - Annotation of existing resources; - - - Posting a message to a bulletin board, newsgroup, mailing list, - or similar group of articles; - - - Providing a block of data, such as the result of submitting a - form, to a data-handling process; - - - Extending a database through an append operation. - - The actual function performed by the POST method is determined by the - server and is usually dependent on the Request-URI. The posted entity - is subordinate to that URI in the same way that a file is subordinate - to a directory containing it, a news article is subordinate to a - newsgroup to which it is posted, or a record is subordinate to a - database. - - [[ Remove second sentence ("The posted entity is subordinate") above ]] - - The action performed by the POST method might not result in a - resource that can be identified by a URI. 
In this case, either 200 - (OK) or 204 (No Content) is the appropriate response status, - depending on whether or not the response includes an entity that - describes the result. - - - -Fielding, et al. Standards Track [Page 54] - -RFC 2616 HTTP/1.1 June 1999 - - - If a resource has been created on the origin server, the response - SHOULD be 201 (Created) and contain an entity which describes the - status of the request and refers to the new resource, and a Location - header (see section 14.30). - - Responses to this method are not cacheable, unless the response - includes appropriate Cache-Control or Expires header fields. However, - the 303 (See Other) response can be used to direct the user agent to - retrieve a cacheable resource. - - POST requests MUST obey the message transmission requirements set out - in section 8.2. - - See section 15.1.3 for security considerations. - -9.6 PUT - - The PUT method requests that the enclosed entity be stored under the - supplied Request-URI. If the Request-URI refers to an already - existing resource, the enclosed entity SHOULD be considered as a - modified version of the one residing on the origin server. If the - Request-URI does not point to an existing resource, and that URI is - capable of being defined as a new resource by the requesting user - agent, the origin server can create the resource with that URI. If a - new resource is created, the origin server MUST inform the user agent - via the 201 (Created) response. If an existing resource is modified, - either the 200 (OK) or 204 (No Content) response codes SHOULD be sent - to indicate successful completion of the request. If the resource - could not be created or modified with the Request-URI, an appropriate - error response SHOULD be given that reflects the nature of the - problem. The recipient of the entity MUST NOT ignore any Content-* - (e.g. 
Content-Range) headers that it does not understand or implement - and MUST return a 501 (Not Implemented) response in such cases. - - If the request passes through a cache and the Request-URI identifies - one or more currently cached entities, those entries SHOULD be - treated as stale. Responses to this method are not cacheable. - - The fundamental difference between the POST and PUT requests is - reflected in the different meaning of the Request-URI. The URI in a - POST request identifies the resource that will handle the enclosed - entity. That resource might be a data-accepting process, a gateway to - some other protocol, or a separate entity that accepts annotations. - In contrast, the URI in a PUT request identifies the entity enclosed - with the request -- the user agent knows what URI is intended and the - server MUST NOT attempt to apply the request to some other resource. - If the server desires that the request be applied to a different URI, - - - - -Fielding, et al. Standards Track [Page 55] - -RFC 2616 HTTP/1.1 June 1999 - - - it MUST send a 301 (Moved Permanently) response; the user agent MAY - then make its own decision regarding whether or not to redirect the - request. - - A single resource MAY be identified by many different URIs. For - example, an article might have a URI for identifying "the current - version" which is separate from the URI identifying each particular - version. In this case, a PUT request on a general URI might result in - several other URIs being defined by the origin server. - - HTTP/1.1 does not define how a PUT method affects the state of an - origin server. - - PUT requests MUST obey the message transmission requirements set out - in section 8.2. - - Unless otherwise specified for a particular entity-header, the - entity-headers in the PUT request SHOULD be applied to the resource - created or modified by the PUT. 
- -9.7 DELETE - - The DELETE method requests that the origin server delete the resource - identified by the Request-URI. This method MAY be overridden by human - intervention (or other means) on the origin server. The client cannot - be guaranteed that the operation has been carried out, even if the - status code returned from the origin server indicates that the action - has been completed successfully. However, the server SHOULD NOT - indicate success unless, at the time the response is given, it - intends to delete the resource or move it to an inaccessible - location. - - A successful response SHOULD be 200 (OK) if the response includes an - entity describing the status, 202 (Accepted) if the action has not - yet been enacted, or 204 (No Content) if the action has been enacted - but the response does not include an entity. - - If the request passes through a cache and the Request-URI identifies - one or more currently cached entities, those entries SHOULD be - treated as stale. Responses to this method are not cacheable. - -9.8 TRACE - - The TRACE method is used to invoke a remote, application-layer loop- - back of the request message. The final recipient of the request - SHOULD reflect the message received back to the client as the - entity-body of a 200 (OK) response. The final recipient is either the - - - - -Fielding, et al. Standards Track [Page 56] - -RFC 2616 HTTP/1.1 June 1999 - - - origin server or the first proxy or gateway to receive a Max-Forwards - value of zero (0) in the request (see section 14.31). A TRACE request - MUST NOT include an entity. - - TRACE allows the client to see what is being received at the other - end of the request chain and use that data for testing or diagnostic - information. The value of the Via header field (section 14.45) is of - particular interest, since it acts as a trace of the request chain. 
- Use of the Max-Forwards header field allows the client to limit the - length of the request chain, which is useful for testing a chain of - proxies forwarding messages in an infinite loop. - - If the request is valid, the response SHOULD contain the entire - request message in the entity-body, with a Content-Type of - "message/http". Responses to this method MUST NOT be cached. - -9.9 CONNECT - - This specification reserves the method name CONNECT for use with a - proxy that can dynamically switch to being a tunnel (e.g. SSL - tunneling [44]). - -10 Status Code Definitions - - Each Status-Code is described below, including a description of which - method(s) it can follow and any metainformation required in the - response. - -10.1 Informational 1xx - - This class of status code indicates a provisional response, - consisting only of the Status-Line and optional headers, and is - terminated by an empty line. There are no required headers for this - class of status code. Since HTTP/1.0 did not define any 1xx status - codes, servers MUST NOT send a 1xx response to an HTTP/1.0 client - except under experimental conditions. - - A client MUST be prepared to accept one or more 1xx status responses - prior to a regular response, even if the client does not expect a 100 - (Continue) status message. Unexpected 1xx status responses MAY be - ignored by a user agent. - - Proxies MUST forward 1xx responses, unless the connection between the - proxy and its client has been closed, or unless the proxy itself - requested the generation of the 1xx response. (For example, if a - - - - - - -Fielding, et al. Standards Track [Page 57] - -RFC 2616 HTTP/1.1 June 1999 - - - proxy adds a "Expect: 100-continue" field when it forwards a request, - then it need not forward the corresponding 100 (Continue) - response(s).) - -10.1.1 100 Continue - - The client SHOULD continue with its request. 
This interim response is - used to inform the client that the initial part of the request has - been received and has not yet been rejected by the server. The client - SHOULD continue by sending the remainder of the request or, if the - request has already been completed, ignore this response. The server - MUST send a final response after the request has been completed. See - section 8.2.3 for detailed discussion of the use and handling of this - status code. - -10.1.2 101 Switching Protocols - - The server understands and is willing to comply with the client's - request, via the Upgrade message header field (section 14.42), for a - change in the application protocol being used on this connection. The - server will switch protocols to those defined by the response's - Upgrade header field immediately after the empty line which - terminates the 101 response. - - The protocol SHOULD be switched only when it is advantageous to do - so. For example, switching to a newer version of HTTP is advantageous - over older versions, and switching to a real-time, synchronous - protocol might be advantageous when delivering resources that use - such features. - -10.2 Successful 2xx - - This class of status code indicates that the client's request was - successfully received, understood, and accepted. - -10.2.1 200 OK - - The request has succeeded. The information returned with the response - is dependent on the method used in the request, for example: - - GET an entity corresponding to the requested resource is sent in - the response; - - HEAD the entity-header fields corresponding to the requested - resource are sent in the response without any message-body; - - POST an entity describing or containing the result of the action; - - - - -Fielding, et al. Standards Track [Page 58] - -RFC 2616 HTTP/1.1 June 1999 - - - TRACE an entity containing the request message as received by the - end server. 
- -10.2.2 201 Created - - The request has been fulfilled and resulted in a new resource being - created. The newly created resource can be referenced by the URI(s) - returned in the entity of the response, with the most specific URI - for the resource given by a Location header field. The response - SHOULD include an entity containing a list of resource - characteristics and location(s) from which the user or user agent can - choose the one most appropriate. The entity format is specified by - the media type given in the Content-Type header field. The origin - server MUST create the resource before returning the 201 status code. - If the action cannot be carried out immediately, the server SHOULD - respond with 202 (Accepted) response instead. - - A 201 response MAY contain an ETag response header field indicating - the current value of the entity tag for the requested variant just - created, see section 14.19. - -10.2.3 202 Accepted - - The request has been accepted for processing, but the processing has - not been completed. The request might or might not eventually be - acted upon, as it might be disallowed when processing actually takes - place. There is no facility for re-sending a status code from an - asynchronous operation such as this. - - The 202 response is intentionally non-committal. Its purpose is to - allow a server to accept a request for some other process (perhaps a - batch-oriented process that is only run once per day) without - requiring that the user agent's connection to the server persist - until the process is completed. The entity returned with this - response SHOULD include an indication of the request's current status - and either a pointer to a status monitor or some estimate of when the - user can expect the request to be fulfilled. 
- -10.2.4 203 Non-Authoritative Information - - The returned metainformation in the entity-header is not the - definitive set as available from the origin server, but is gathered - from a local or a third-party copy. The set presented MAY be a subset - or superset of the original version. For example, including local - annotation information about the resource might result in a superset - of the metainformation known by the origin server. Use of this - response code is not required and is only appropriate when the - response would otherwise be 200 (OK). - - - -Fielding, et al. Standards Track [Page 59] - -RFC 2616 HTTP/1.1 June 1999 - - -10.2.5 204 No Content - - The server has fulfilled the request but does not need to return an - entity-body, and might want to return updated metainformation. The - response MAY include new or updated metainformation in the form of - entity-headers, which if present SHOULD be associated with the - requested variant. - - If the client is a user agent, it SHOULD NOT change its document view - from that which caused the request to be sent. This response is - primarily intended to allow input for actions to take place without - causing a change to the user agent's active document view, although - any new or updated metainformation SHOULD be applied to the document - currently in the user agent's active view. - - The 204 response MUST NOT include a message-body, and thus is always - terminated by the first empty line after the header fields. - -10.2.6 205 Reset Content - - The server has fulfilled the request and the user agent SHOULD reset - the document view which caused the request to be sent. This response - is primarily intended to allow input for actions to take place via - user input, followed by a clearing of the form in which the input is - given so that the user can easily initiate another input action. The - response MUST NOT include an entity. 
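The "no body" rule for 204 and 205 can be enforced where a response is serialized. The following Python sketch is one possible way to do that. The names BODILESS and serialize_response are hypothetical, and the set is deliberately limited to the two codes described in sections 10.2.5 and 10.2.6.

```python
# Status codes that, per the text above, must not carry a body.
# Limited to 204 and 205 for this sketch (an assumption, not a full list).
BODILESS = {204, 205}


def serialize_response(status: int, reason: str, headers: dict,
                       body: bytes = b"") -> bytes:
    """Build a raw HTTP/1.1 response, refusing a body on 204/205."""
    if status in BODILESS and body:
        raise ValueError(f"{status} response must not carry a body")
    lines = [f"HTTP/1.1 {status} {reason}"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # The empty line after the header fields terminates a bodiless response.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body
```

Rejecting the body at serialization time keeps the check in one place instead of relying on every handler to remember it.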
- -10.2.7 206 Partial Content - - The server has fulfilled the partial GET request for the resource. - The request MUST have included a Range header field (section 14.35) - indicating the desired range, and MAY have included an If-Range - header field (section 14.27) to make the request conditional. - - The response MUST include the following header fields: - - - Either a Content-Range header field (section 14.16) indicating - the range included with this response, or a multipart/byteranges - Content-Type including Content-Range fields for each part. If a - Content-Length header field is present in the response, its - value MUST match the actual number of OCTETs transmitted in the - message-body. - - - Date - - - ETag and/or Content-Location, if the header would have been sent - in a 200 response to the same request - - - - -Fielding, et al. Standards Track [Page 60] - -RFC 2616 HTTP/1.1 June 1999 - - - - Expires, Cache-Control, and/or Vary, if the field-value might - differ from that sent in any previous response for the same - variant - - If the 206 response is the result of an If-Range request that used a - strong cache validator (see section 13.3.3), the response SHOULD NOT - include other entity-headers. If the response is the result of an - If-Range request that used a weak validator, the response MUST NOT - include other entity-headers; this prevents inconsistencies between - cached entity-bodies and updated headers. Otherwise, the response - MUST include all of the entity-headers that would have been returned - with a 200 (OK) response to the same request. - -[[ Should be: ]] -[[ If the 206 response is the result of an If-Range request, the ]] -[[ response SHOULD NOT include other entity-headers. Otherwise, the ]] -[[ response MUST include all of the entity-headers that would have ]] -[[ been returned with a 200 (OK) response to the same request. 
]] - - A cache MUST NOT combine a 206 response with other previously cached - content if the ETag or Last-Modified headers do not match exactly, - see 13.5.4. - - A cache that does not support the Range and Content-Range headers - MUST NOT cache 206 (Partial) responses. - -10.3 Redirection 3xx - - This class of status code indicates that further action needs to be - taken by the user agent in order to fulfill the request. The action - required MAY be carried out by the user agent without interaction - with the user if and only if the method used in the second request is - GET or HEAD. A client SHOULD detect infinite redirection loops, since - such loops generate network traffic for each redirection. - - Note: previous versions of this specification recommended a - maximum of five redirections. Content developers should be aware - that there might be clients that implement such a fixed - limitation. - -10.3.1 300 Multiple Choices - - The requested resource corresponds to any one of a set of - representations, each with its own specific location, and agent- - driven negotiation information (section 12) is being provided so that - the user (or user agent) can select a preferred representation and - redirect its request to that location. - - Unless it was a HEAD request, the response SHOULD include an entity - containing a list of resource characteristics and location(s) from - which the user or user agent can choose the one most appropriate. The - entity format is specified by the media type given in the Content- - Type header field. Depending upon the format and the capabilities of - - - - -Fielding, et al. Standards Track [Page 61] - -RFC 2616 HTTP/1.1 June 1999 - - - the user agent, selection of the most appropriate choice MAY be - performed automatically. However, this specification does not define - any standard for such automatic selection. 
- - If the server has a preferred choice of representation, it SHOULD - include the specific URI for that representation in the Location - field; user agents MAY use the Location field value for automatic - redirection. This response is cacheable unless indicated otherwise. - -10.3.2 301 Moved Permanently - - The requested resource has been assigned a new permanent URI and any - future references to this resource SHOULD use one of the returned - URIs. Clients with link editing capabilities ought to automatically - re-link references to the Request-URI to one or more of the new - references returned by the server, where possible. This response is - cacheable unless indicated otherwise. - - The new permanent URI SHOULD be given by the Location field in the - response. Unless the request method was HEAD, the entity of the - response SHOULD contain a short hypertext note with a hyperlink to - the new URI(s). - - If the 301 status code is received in response to a request other - than GET or HEAD, the user agent MUST NOT automatically redirect the - request unless it can be confirmed by the user, since this might - change the conditions under which the request was issued. - -[[ Should be: ]] -[[ If the 301 status code is received in response to a request method ]] -[[ that is known to be "safe", as defined in section 9.1.1, then the ]] -[[ request MAY be automatically redirected by the user agent without ]] -[[ confirmation. Otherwise, the user agent MUST NOT automatically ]] -[[ redirect the request unless it is confirmed by the user, since the ]] -[[ new URI might change the conditions under which the request was ]] -[[ issued. ]] - - Note: When automatically redirecting a POST request after - receiving a 301 status code, some existing HTTP/1.0 user agents - will erroneously change it into a GET request. - -10.3.3 302 Found - - The requested resource resides temporarily under a different URI. 
- Since the redirection might be altered on occasion, the client SHOULD - continue to use the Request-URI for future requests. This response - is only cacheable if indicated by a Cache-Control or Expires header - field. - - The temporary URI SHOULD be given by the Location field in the - response. Unless the request method was HEAD, the entity of the - response SHOULD contain a short hypertext note with a hyperlink to - the new URI(s). - - - - - - - -Fielding, et al. Standards Track [Page 62] - -RFC 2616 HTTP/1.1 June 1999 - - - If the 302 status code is received in response to a request other - than GET or HEAD, the user agent MUST NOT automatically redirect the - request unless it can be confirmed by the user, since this might - change the conditions under which the request was issued. - - [[ See errata to 10.3.3 ]] - - Note: RFC 1945 and RFC 2068 specify that the client is not allowed - to change the method on the redirected request. However, most - existing user agent implementations treat 302 as if it were a 303 - response, performing a GET on the Location field-value regardless - of the original request method. The status codes 303 and 307 have - been added for servers that wish to make unambiguously clear which - kind of reaction is expected of the client. - -10.3.4 303 See Other - - The response to the request can be found under a different URI and - SHOULD be retrieved using a GET method on that resource. This method - exists primarily to allow the output of a POST-activated script to - redirect the user agent to a selected resource. The new URI is not a - substitute reference for the originally requested resource. The 303 - response MUST NOT be cached, but the response to the second - (redirected) request might be cacheable. - - The different URI SHOULD be given by the Location field in the - response. Unless the request method was HEAD, the entity of the - response SHOULD contain a short hypertext note with a hyperlink to - the new URI(s). 
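The redirect rules for 301, 302 and 303 can be summarized in a small decision function. This Python sketch follows the unamended wording of sections 10.3.2 to 10.3.4, not the bracketed errata, and not the common practice (mentioned in the 302 note) of treating 302 like 303. The function name redirect_action and the action labels "follow", "confirm" and "none" are hypothetical.

```python
def redirect_action(status: int, method: str) -> tuple:
    """Return (action, method_for_next_request) for a redirect response."""
    method = method.upper()
    if status == 303:
        # 303 asks for the new URI to be retrieved with GET.
        return ("follow", "GET")
    if status in (301, 302):
        if method in ("GET", "HEAD"):
            # Automatic redirection is allowed for GET and HEAD.
            return ("follow", method)
        # Any other method needs confirmation from the user first.
        return ("confirm", method)
    # Other status codes are outside the scope of this sketch.
    return ("none", method)
```

A POST answered with 302 therefore yields ("confirm", "POST"), while the same POST answered with 303 yields ("follow", "GET").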
- - Note: Many pre-HTTP/1.1 user agents do not understand the 303 - status. When interoperability with such clients is a concern, the - 302 status code may be used instead, since most user agents react - to a 302 response as described here for 303. - -10.3.5 304 Not Modified - - If the client has performed a conditional GET request and access is - allowed, but the document has not been modified, the server SHOULD - respond with this status code. The 304 response MUST NOT contain a - message-body, and thus is always terminated by the first empty line - after the header fields. - - The response MUST include the following header fields: - - - Date, unless its omission is required by section 14.18.1 - - - - - - - -Fielding, et al. Standards Track [Page 63] - -RFC 2616 HTTP/1.1 June 1999 - - - If a clockless origin server obeys these rules, and proxies and - clients add their own Date to any response received without one (as - already specified by [RFC 2068], section 14.19), caches will operate - correctly. - - - ETag and/or Content-Location, if the header would have been sent - in a 200 response to the same request - - - Expires, Cache-Control, and/or Vary, if the field-value might - differ from that sent in any previous response for the same - variant - - If the conditional GET used a strong cache validator (see section - 13.3.3), the response SHOULD NOT include other entity-headers. - Otherwise (i.e., the conditional GET used a weak validator), the - response MUST NOT include other entity-headers; this prevents - inconsistencies between cached entity-bodies and updated headers. - - If a 304 response indicates an entity not currently cached, then the - cache MUST disregard the response and repeat the request without the - conditional. - - If a cache uses a received 304 response to update a cache entry, the - cache MUST update the entry to reflect any new field values given in - the response. 
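The two cache requirements at the end of section 10.3.5 can be sketched as follows in Python. The cache is modelled as a plain dict keyed by URL, with each entry holding a "headers" dict and a "body". That layout and the name apply_not_modified are assumptions made for illustration. Header names are assumed to be normalized to one case before they reach this function.

```python
def apply_not_modified(cache: dict, url: str, headers: dict) -> bool:
    """Apply a 304 response to the cache.

    Returns True if an existing entry was updated, or False if the entity
    is not cached, in which case the caller should disregard the 304 and
    repeat the request without the conditional.
    """
    entry = cache.get(url)
    if entry is None:
        return False
    # Reflect any new field values given in the 304 response.
    # The stored body is left untouched.
    entry["headers"].update(headers)
    return True
```

Returning a boolean rather than raising keeps the "repeat without the conditional" decision with the caller, which owns the request logic.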
- -10.3.6 305 Use Proxy - - The requested resource MUST be accessed through the proxy given by - the Location field. The Location field gives the URI of the proxy. - The recipient is expected to repeat this single request via the - proxy. 305 responses MUST only be generated by origin servers. - - Note: RFC 2068 was not clear that 305 was intended to redirect a - single request, and to be generated by origin servers only. Not - observing these limitations has significant security consequences. - -10.3.7 306 (Unused) - - The 306 status code was used in a previous version of the - specification, is no longer used, and the code is reserved. - - - - - - - - - - -Fielding, et al. Standards Track [Page 64] - -RFC 2616 HTTP/1.1 June 1999 - - -10.3.8 307 Temporary Redirect - - The requested resource resides temporarily under a different URI. - Since the redirection MAY be altered on occasion, the client SHOULD - continue to use the Request-URI for future requests. This response - is only cacheable if indicated by a Cache-Control or Expires header - field. - - The temporary URI SHOULD be given by the Location field in the - response. Unless the request method was HEAD, the entity of the - response SHOULD contain a short hypertext note with a hyperlink to - the new URI(s) , since many pre-HTTP/1.1 user agents do not - understand the 307 status. Therefore, the note SHOULD contain the - information necessary for a user to repeat the original request on - the new URI. - - If the 307 status code is received in response to a request other - than GET or HEAD, the user agent MUST NOT automatically redirect the - request unless it can be confirmed by the user, since this might - change the conditions under which the request was issued. - - [[ See errata to 10.3.3 ]] - -10.4 Client Error 4xx - - The 4xx class of status code is intended for cases in which the - client seems to have erred. 
Except when responding to a HEAD request, - the server SHOULD include an entity containing an explanation of the - error situation, and whether it is a temporary or permanent - condition. These status codes are applicable to any request method. - User agents SHOULD display any included entity to the user. - - If the client is sending data, a server implementation using TCP - SHOULD be careful to ensure that the client acknowledges receipt of - the packet(s) containing the response, before the server closes the - input connection. If the client continues sending data to the server - after the close, the server's TCP stack will send a reset packet to - the client, which may erase the client's unacknowledged input buffers - before they can be read and interpreted by the HTTP application. - -10.4.1 400 Bad Request - - The request could not be understood by the server due to malformed - syntax. The client SHOULD NOT repeat the request without - modifications. - - - - - - - - -Fielding, et al. Standards Track [Page 65] - -RFC 2616 HTTP/1.1 June 1999 - - -10.4.2 401 Unauthorized - - The request requires user authentication. The response MUST include a - WWW-Authenticate header field (section 14.47) containing a challenge - applicable to the requested resource. The client MAY repeat the - request with a suitable Authorization header field (section 14.8). If - the request already included Authorization credentials, then the 401 - response indicates that authorization has been refused for those - credentials. If the 401 response contains the same challenge as the - prior response, and the user agent has already attempted - authentication at least once, then the user SHOULD be presented the - entity that was given in the response, since that entity might - include relevant diagnostic information. HTTP access authentication - is explained in "HTTP Authentication: Basic and Digest Access - Authentication" [43]. 
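The client-side handling of a 401 described in section 10.4.2 comes down to one comparison. The Python sketch below is illustrative only. The function name after_401, its parameters and the two returned labels are hypothetical, and challenges are compared as opaque strings, which is a simplification.

```python
from typing import Optional


def after_401(challenge: str, previous_challenge: Optional[str],
              attempts: int) -> str:
    """Decide what a user agent does next after receiving a 401."""
    if attempts >= 1 and challenge == previous_challenge:
        # Same challenge again after at least one attempt: present the
        # response entity, which may contain diagnostic information.
        return "show_entity"
    # Otherwise the request may be repeated with an Authorization header.
    return "retry_with_credentials"
```

On the first 401 (attempts == 0) the sketch always chooses to retry. It switches to showing the entity only when the server repeats the same challenge.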
- -10.4.3 402 Payment Required - - This code is reserved for future use. - -10.4.4 403 Forbidden - - The server understood the request, but is refusing to fulfill it. - Authorization will not help and the request SHOULD NOT be repeated. - If the request method was not HEAD and the server wishes to make - public why the request has not been fulfilled, it SHOULD describe the - reason for the refusal in the entity. If the server does not wish to - make this information available to the client, the status code 404 - (Not Found) can be used instead. - -10.4.5 404 Not Found - - The server has not found anything matching the Request-URI. No - indication is given of whether the condition is temporary or - permanent. The 410 (Gone) status code SHOULD be used if the server - knows, through some internally configurable mechanism, that an old - resource is permanently unavailable and has no forwarding address. - This status code is commonly used when the server does not wish to - reveal exactly why the request has been refused, or when no other - response is applicable. - -10.4.6 405 Method Not Allowed - - The method specified in the Request-Line is not allowed for the - resource identified by the Request-URI. The response MUST include an - Allow header containing a list of valid methods for the requested - resource. - - - - -Fielding, et al. Standards Track [Page 66] - -RFC 2616 HTTP/1.1 June 1999 - - -10.4.7 406 Not Acceptable - - The resource identified by the request is only capable of generating - response entities which have content characteristics not acceptable - according to the accept headers sent in the request. - - Unless it was a HEAD request, the response SHOULD include an entity - containing a list of available entity characteristics and location(s) - from which the user or user agent can choose the one most - appropriate. The entity format is specified by the media type given - in the Content-Type header field. 
Depending upon the format and the - capabilities of the user agent, selection of the most appropriate - choice MAY be performed automatically. However, this specification - does not define any standard for such automatic selection. - - Note: HTTP/1.1 servers are allowed to return responses which are - not acceptable according to the accept headers sent in the - request. In some cases, this may even be preferable to sending a - 406 response. User agents are encouraged to inspect the headers of - an incoming response to determine if it is acceptable. - - If the response could be unacceptable, a user agent SHOULD - temporarily stop receipt of more data and query the user for a - decision on further actions. - -10.4.8 407 Proxy Authentication Required - - This code is similar to 401 (Unauthorized), but indicates that the - client must first authenticate itself with the proxy. The proxy MUST - return a Proxy-Authenticate header field (section 14.33) containing a - challenge applicable to the proxy for the requested resource. The - client MAY repeat the request with a suitable Proxy-Authorization - header field (section 14.34). HTTP access authentication is explained - in "HTTP Authentication: Basic and Digest Access Authentication" - [43]. - -10.4.9 408 Request Timeout - - The client did not produce a request within the time that the server - was prepared to wait. The client MAY repeat the request without - modifications at any later time. - -10.4.10 409 Conflict - - The request could not be completed due to a conflict with the current - state of the resource. This code is only allowed in situations where - it is expected that the user might be able to resolve the conflict - and resubmit the request. The response body SHOULD include enough - - - -Fielding, et al. Standards Track [Page 67] - -RFC 2616 HTTP/1.1 June 1999 - - - information for the user to recognize the source of the conflict. 
- Ideally, the response entity would include enough information for the - user or user agent to fix the problem; however, that might not be - possible and is not required. - - Conflicts are most likely to occur in response to a PUT request. For - example, if versioning were being used and the entity being PUT - included changes to a resource which conflict with those made by an - earlier (third-party) request, the server might use the 409 response - to indicate that it can't complete the request. In this case, the - response entity would likely contain a list of the differences - between the two versions in a format defined by the response - Content-Type. - -10.4.11 410 Gone - - The requested resource is no longer available at the server and no - forwarding address is known. This condition is expected to be - considered permanent. Clients with link editing capabilities SHOULD - delete references to the Request-URI after user approval. If the - server does not know, or has no facility to determine, whether or not - the condition is permanent, the status code 404 (Not Found) SHOULD be - used instead. This response is cacheable unless indicated otherwise. - - The 410 response is primarily intended to assist the task of web - maintenance by notifying the recipient that the resource is - intentionally unavailable and that the server owners desire that - remote links to that resource be removed. Such an event is common for - limited-time, promotional services and for resources belonging to - individuals no longer working at the server's site. It is not - necessary to mark all permanently unavailable resources as "gone" or - to keep the mark for any length of time -- that is left to the - discretion of the server owner. - -10.4.12 411 Length Required - - The server refuses to accept the request without a defined Content- - Length. 
The client MAY repeat the request if it adds a valid - Content-Length header field containing the length of the message-body - in the request message. - -10.4.13 412 Precondition Failed - - The precondition given in one or more of the request-header fields - evaluated to false when it was tested on the server. This response - code allows the client to place preconditions on the current resource - metainformation (header field data) and thus prevent the requested - method from being applied to a resource other than the one intended. - - - -Fielding, et al. Standards Track [Page 68] - -RFC 2616 HTTP/1.1 June 1999 - - -10.4.14 413 Request Entity Too Large - - The server is refusing to process a request because the request - entity is larger than the server is willing or able to process. The - server MAY close the connection to prevent the client from continuing - the request. - - If the condition is temporary, the server SHOULD include a Retry- - After header field to indicate that it is temporary and after what - time the client MAY try again. - -10.4.15 414 Request-URI Too Long - - The server is refusing to service the request because the Request-URI - is longer than the server is willing to interpret. This rare - condition is only likely to occur when a client has improperly - converted a POST request to a GET request with long query - information, when the client has descended into a URI "black hole" of - redirection (e.g., a redirected URI prefix that points to a suffix of - itself), or when the server is under attack by a client attempting to - exploit security holes present in some servers using fixed-length - buffers for reading or manipulating the Request-URI. - -10.4.16 415 Unsupported Media Type - - The server is refusing to service the request because the entity of - the request is in a format not supported by the requested resource - for the requested method. 
- -10.4.17 416 Requested Range Not Satisfiable - - A server SHOULD return a response with this status code if a request - included a Range request-header field (section 14.35), and none of - the range-specifier values in this field overlap the current extent - of the selected resource, and the request did not include an If-Range - request-header field. (For byte-ranges, this means that the first- - byte-pos of all of the byte-range-spec values were greater than the - current length of the selected resource.) - - When this status code is returned for a byte-range request, the - response SHOULD include a Content-Range entity-header field - specifying the current length of the selected resource (see section - 14.16). This response MUST NOT use the multipart/byteranges content- - type. - - - - - - - -Fielding, et al. Standards Track [Page 69] - -RFC 2616 HTTP/1.1 June 1999 - - -10.4.18 417 Expectation Failed - - The expectation given in an Expect request-header field (see section - 14.20) could not be met by this server, or, if the server is a proxy, - the server has unambiguous evidence that the request could not be met - by the next-hop server. - -10.5 Server Error 5xx - - Response status codes beginning with the digit "5" indicate cases in - which the server is aware that it has erred or is incapable of - performing the request. Except when responding to a HEAD request, the - server SHOULD include an entity containing an explanation of the - error situation, and whether it is a temporary or permanent - condition. User agents SHOULD display any included entity to the - user. These response codes are applicable to any request method. - -10.5.1 500 Internal Server Error - - The server encountered an unexpected condition which prevented it - from fulfilling the request. - -10.5.2 501 Not Implemented - - The server does not support the functionality required to fulfill the - request. 
This is the appropriate response when the server does not - recognize the request method and is not capable of supporting it for - any resource. - -10.5.3 502 Bad Gateway - - The server, while acting as a gateway or proxy, received an invalid - response from the upstream server it accessed in attempting to - fulfill the request. - -10.5.4 503 Service Unavailable - - The server is currently unable to handle the request due to a - temporary overloading or maintenance of the server. The implication - is that this is a temporary condition which will be alleviated after - some delay. If known, the length of the delay MAY be indicated in a - Retry-After header. If no Retry-After is given, the client SHOULD - handle the response as it would for a 500 response. - - Note: The existence of the 503 status code does not imply that a - server must use it when becoming overloaded. Some servers may wish - to simply refuse the connection. - - - - -Fielding, et al. Standards Track [Page 70] - -RFC 2616 HTTP/1.1 June 1999 - - -10.5.5 504 Gateway Timeout - - The server, while acting as a gateway or proxy, did not receive a - timely response from the upstream server specified by the URI (e.g. - HTTP, FTP, LDAP) or some other auxiliary server (e.g. DNS) it needed - to access in attempting to complete the request. - - Note to implementors: some deployed proxies are known to - return 400 or 500 when DNS lookups time out. - -10.5.6 505 HTTP Version Not Supported - - The server does not support, or refuses to support, the HTTP protocol - version that was used in the request message. The server is - indicating that it is unable or unwilling to complete the request - using the same major version as the client, as described in section - 3.1, other than with this error message. The response SHOULD contain - an entity describing why that version is not supported and what other - protocols are supported by that server. 
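As an illustration of the 503 handling in section 10.5.4, here is a minimal sketch of how a client might read Retry-After. The helper name `retry_delay_seconds`, the plain-dict `headers` argument, and the convention of returning None when no delay is known are assumptions made for this example; they are not part of the specification. The sketch accepts both a number of seconds and an HTTP-date as the field value.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_delay_seconds(status, headers, now=None):
    """Return the number of seconds to wait before retrying, or None.

    Hypothetical helper: `headers` is a plain dict of response header fields.
    None means "no delay information", so the caller would handle the
    response as it would a 500.
    """
    if status != 503:
        return None
    value = headers.get("Retry-After")
    if value is None:
        return None
    value = value.strip()
    if value.isdigit():
        # Delay given directly as a number of seconds.
        return int(value)
    try:
        when = parsedate_to_datetime(value)  # delay given as an HTTP-date
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        # Assumption: treat a date without zone information as UTC.
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0, int((when - now).total_seconds()))
```

Returning None rather than raising keeps the "no Retry-After" and "unparseable Retry-After" cases on the same path, which is a design choice of this sketch only.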
- -11 Access Authentication - - HTTP provides several OPTIONAL challenge-response authentication - mechanisms which can be used by a server to challenge a client - request and by a client to provide authentication information. The - general framework for access authentication, and the specification of - "basic" and "digest" authentication, are specified in "HTTP - Authentication: Basic and Digest Access Authentication" [43]. This - specification adopts the definitions of "challenge" and "credentials" - from that specification. - -12 Content Negotiation - - Most HTTP responses include an entity which contains information for - interpretation by a human user. Naturally, it is desirable to supply - the user with the "best available" entity corresponding to the - request. Unfortunately for servers and caches, not all users have the - same preferences for what is "best," and not all user agents are - equally capable of rendering all entity types. For that reason, HTTP - has provisions for several mechanisms for "content negotiation" -- - the process of selecting the best representation for a given response - when there are multiple representations available. - - Note: This is not called "format negotiation" because the - alternate representations may be of the same media type, but use - different capabilities of that type, be in different languages, - etc. - - - - -Fielding, et al. Standards Track [Page 71] - -RFC 2616 HTTP/1.1 June 1999 - - - Any response containing an entity-body MAY be subject to negotiation, - including error responses. - - There are two kinds of content negotiation which are possible in - HTTP: server-driven and agent-driven negotiation. These two kinds of - negotiation are orthogonal and thus may be used separately or in - combination. 
One method of combination, referred to as transparent - negotiation, occurs when a cache uses the agent-driven negotiation - information provided by the origin server in order to provide - server-driven negotiation for subsequent requests. - -12.1 Server-driven Negotiation - - If the selection of the best representation for a response is made by - an algorithm located at the server, it is called server-driven - negotiation. Selection is based on the available representations of - the response (the dimensions over which it can vary; e.g. language, - content-coding, etc.) and the contents of particular header fields in - the request message or on other information pertaining to the request - (such as the network address of the client). - - Server-driven negotiation is advantageous when the algorithm for - selecting from among the available representations is difficult to - describe to the user agent, or when the server desires to send its - "best guess" to the client along with the first response (hoping to - avoid the round-trip delay of a subsequent request if the "best - guess" is good enough for the user). In order to improve the server's - guess, the user agent MAY include request header fields (Accept, - Accept-Language, Accept-Encoding, etc.) which describe its - preferences for such a response. - - Server-driven negotiation has disadvantages: - - 1. It is impossible for the server to accurately determine what - might be "best" for any given user, since that would require - complete knowledge of both the capabilities of the user agent - and the intended use for the response (e.g., does the user want - to view it on screen or print it on paper?). - - 2. Having the user agent describe its capabilities in every - request can be both very inefficient (given that only a small - percentage of responses have multiple representations) and a - potential violation of the user's privacy. - - 3. 
It complicates the implementation of an origin server and the - algorithms for generating responses to a request. - - - - - -Fielding, et al. Standards Track [Page 72] - -RFC 2616 HTTP/1.1 June 1999 - - - 4. It may limit a public cache's ability to use the same response - for multiple users' requests. - - HTTP/1.1 includes the following request-header fields for enabling - server-driven negotiation through description of user agent - capabilities and user preferences: Accept (section 14.1), Accept- - Charset (section 14.2), Accept-Encoding (section 14.3), Accept- - Language (section 14.4), and User-Agent (section 14.43). However, an - origin server is not limited to these dimensions and MAY vary the - response based on any aspect of the request, including information - outside the request-header fields or within extension header fields - not defined by this specification. - - The Vary header field can be used to express the parameters the - server uses to select a representation that is subject to server- - driven negotiation. See section 13.6 for use of the Vary header field - by caches and section 14.44 for use of the Vary header field by - servers. - -12.2 Agent-driven Negotiation - - With agent-driven negotiation, selection of the best representation - for a response is performed by the user agent after receiving an - initial response from the origin server. Selection is based on a list - of the available representations of the response included within the - header fields or entity-body of the initial response, with each - representation identified by its own URI. Selection from among the - representations may be performed automatically (if the user agent is - capable of doing so) or manually by the user selecting from a - generated (possibly hypertext) menu. 
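The automatic selection mentioned above could look like the following sketch. The specification does not define how the list of representations is encoded or how a user agent should choose, so everything here is hypothetical: the `alternates` structure (a list of dicts with "uri" and "language" keys), the function name, and the rule of falling back to the first entry.

```python
def choose_representation(alternates, preferred_languages):
    """Pick the URI of the best alternate for the user's language preferences.

    Hypothetical data model: each alternate is a dict with "uri" and
    "language" keys; `preferred_languages` is ordered from most to least
    preferred.
    """
    for language in preferred_languages:
        for alternate in alternates:
            if alternate.get("language") == language:
                return alternate["uri"]
    # No preference matched: fall back to the first listed alternate, if any.
    return alternates[0]["uri"] if alternates else None
```

The user agent would then issue a second request for the chosen URI, which is the extra round trip that agent-driven negotiation implies.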
- - Agent-driven negotiation is advantageous when the response would vary - over commonly-used dimensions (such as type, language, or encoding), - when the origin server is unable to determine a user agent's - capabilities from examining the request, and generally when public - caches are used to distribute server load and reduce network usage. - - Agent-driven negotiation suffers from the disadvantage of needing a - second request to obtain the best alternate representation. This - second request is only efficient when caching is used. In addition, - this specification does not define any mechanism for supporting - automatic selection, though it also does not prevent any such - mechanism from being developed as an extension and used within - HTTP/1.1. - - - - - - - -Fielding, et al. Standards Track [Page 73] - -RFC 2616 HTTP/1.1 June 1999 - - - HTTP/1.1 defines the 300 (Multiple Choices) and 406 (Not Acceptable) - status codes for enabling agent-driven negotiation when the server is - unwilling or unable to provide a varying response using server-driven - negotiation. - -12.3 Transparent Negotiation - - Transparent negotiation is a combination of both server-driven and - agent-driven negotiation. When a cache is supplied with a form of the - list of available representations of the response (as in agent-driven - negotiation) and the dimensions of variance are completely understood - by the cache, then the cache becomes capable of performing server- - driven negotiation on behalf of the origin server for subsequent - requests on that resource. - - Transparent negotiation has the advantage of distributing the - negotiation work that would otherwise be required of the origin - server and also removing the second request delay of agent-driven - negotiation when the cache is able to correctly guess the right - response. 
- - This specification does not define any mechanism for transparent - negotiation, though it also does not prevent any such mechanism from - being developed as an extension that could be used within HTTP/1.1. - -13 Caching in HTTP - - HTTP is typically used for distributed information systems, where - performance can be improved by the use of response caches. The - HTTP/1.1 protocol includes a number of elements intended to make - caching work as well as possible. Because these elements are - inextricable from other aspects of the protocol, and because they - interact with each other, it is useful to describe the basic caching - design of HTTP separately from the detailed descriptions of methods, - headers, response codes, etc. - - Caching would be useless if it did not significantly improve - performance. The goal of caching in HTTP/1.1 is to eliminate the need - to send requests in many cases, and to eliminate the need to send - full responses in many other cases. The former reduces the number of - network round-trips required for many operations; we use an - "expiration" mechanism for this purpose (see section 13.2). The - latter reduces network bandwidth requirements; we use a "validation" - mechanism for this purpose (see section 13.3). - - Requirements for performance, availability, and disconnected - operation require us to be able to relax the goal of semantic - transparency. The HTTP/1.1 protocol allows origin servers, caches, - - - -Fielding, et al. Standards Track [Page 74] - -RFC 2616 HTTP/1.1 June 1999 - - - and clients to explicitly reduce transparency when necessary. 
- However, because non-transparent operation may confuse non-expert - users, and might be incompatible with certain server applications - (such as those for ordering merchandise), the protocol requires that - transparency be relaxed - - - only by an explicit protocol-level request when relaxed by - client or origin server - - - only with an explicit warning to the end user when relaxed by - cache or client - - Therefore, the HTTP/1.1 protocol provides these important elements: - - 1. Protocol features that provide full semantic transparency when - this is required by all parties. - - 2. Protocol features that allow an origin server or user agent to - explicitly request and control non-transparent operation. - - 3. Protocol features that allow a cache to attach warnings to - responses that do not preserve the requested approximation of - semantic transparency. - - A basic principle is that it must be possible for the clients to - detect any potential relaxation of semantic transparency. - - Note: The server, cache, or client implementor might be faced with - design decisions not explicitly discussed in this specification. - If a decision might affect semantic transparency, the implementor - ought to err on the side of maintaining transparency unless a - careful and complete analysis shows significant benefits in - breaking transparency. - -13.1.1 Cache Correctness - - A correct cache MUST respond to a request with the most up-to-date - response held by the cache that is appropriate to the request (see - sections 13.2.5, 13.2.6, and 13.12) which meets one of the following - conditions: - - 1. It has been checked for equivalence with what the origin server - would have returned by revalidating the response with the - origin server (section 13.3); - - - - - - - -Fielding, et al. Standards Track [Page 75] - -RFC 2616 HTTP/1.1 June 1999 - - - 2. It is "fresh enough" (see section 13.2). 
In the default case, - this means it meets the least restrictive freshness requirement - of the client, origin server, and cache (see section 14.9); if - the origin server so specifies, it is the freshness requirement - of the origin server alone. - - If a stored response is not "fresh enough" by the most - restrictive freshness requirement of both the client and the - origin server, in carefully considered circumstances the cache - MAY still return the response with the appropriate Warning - header (see sections 13.1.5 and 14.46), unless such a response - is prohibited (e.g., by a "no-store" cache-directive, or by a - "no-cache" cache-request-directive; see section 14.9). - - 3. It is an appropriate 304 (Not Modified), 305 (Use Proxy), - or error (4xx or 5xx) response message. - - If the cache cannot communicate with the origin server, then a - correct cache SHOULD respond as above if the response can be - correctly served from the cache; if not, it MUST return an error or - warning indicating that there was a communication failure. - - If a cache receives a response (either an entire response, or a 304 - (Not Modified) response) that it would normally forward to the - requesting client, and the received response is no longer fresh, the - cache SHOULD forward it to the requesting client without adding a new - Warning (but without removing any existing Warning headers). A cache - SHOULD NOT attempt to revalidate a response simply because that - response became stale in transit; this might lead to an infinite - loop. A user agent that receives a stale response without a Warning - MAY display a warning indication to the user. - -13.1.2 Warnings - - Whenever a cache returns a response that is neither first-hand nor - "fresh enough" (in the sense of condition 2 in section 13.1.1), it - MUST attach a warning to that effect, using a Warning general-header. - The Warning header and the currently defined warnings are described - in section 14.46. 
The warning allows clients to take appropriate - action. - - Warnings MAY be used for other purposes, both cache-related and - otherwise. The use of a warning, rather than an error status code, - distinguishes these responses from true failures. - - Warnings are assigned three-digit warn-codes. The first digit - indicates whether the Warning MUST or MUST NOT be deleted from a - stored cache entry after a successful revalidation: - - - -Fielding, et al. Standards Track [Page 76] - -RFC 2616 HTTP/1.1 June 1999 - - - 1xx Warnings that describe the freshness or revalidation status of - the response, and so MUST be deleted after a successful - revalidation. 1xx warn-codes MAY be generated by a cache only when - validating a cached entry. They MUST NOT be generated by clients. - - 2xx Warnings that describe some aspect of the entity body or entity - headers that is not rectified by a revalidation (for example, a - lossy compression of the entity bodies) and which MUST NOT be - deleted after a successful revalidation. - - See section 14.46 for the definitions of the codes themselves. - - HTTP/1.0 caches will cache all Warnings in responses, without - deleting the ones in the first category. Warnings in responses that - are passed to HTTP/1.0 caches carry an extra warning-date field, - which prevents a future HTTP/1.1 recipient from believing an - erroneously cached Warning. - - Warnings also carry a warning text. The text MAY be in any - appropriate natural language (perhaps based on the client's Accept - headers), and include an OPTIONAL indication of what character set is - used. - - Multiple warnings MAY be attached to a response (either by the origin - server or by a cache), including multiple warnings with the same code - number. For example, a server might provide the same warning with - texts in both English and Basque. - - When multiple warnings are attached to a response, it might not be - practical or reasonable to display all of them to the user. 
This - version of HTTP does not specify strict priority rules for deciding - which warnings to display and in what order, but does suggest some - heuristics. - -13.1.3 Cache-control Mechanisms - - The basic cache mechanisms in HTTP/1.1 (server-specified expiration - times and validators) are implicit directives to caches. In some - cases, a server or client might need to provide explicit directives - to the HTTP caches. We use the Cache-Control header for this purpose. - - The Cache-Control header allows a client or server to transmit a - variety of directives in either requests or responses. These - directives typically override the default caching algorithms. As a - general rule, if there is any apparent conflict between header - values, the most restrictive interpretation is applied (that is, the - one that is most likely to preserve semantic transparency). However, - - - - -Fielding, et al. Standards Track [Page 77] - -RFC 2616 HTTP/1.1 June 1999 - - - in some cases, cache-control directives are explicitly specified as - weakening the approximation of semantic transparency (for example, - "max-stale" or "public"). - - The cache-control directives are described in detail in section 14.9. - -13.1.4 Explicit User Agent Warnings - - Many user agents make it possible for users to override the basic - caching mechanisms. For example, the user agent might allow the user - to specify that cached entities (even explicitly stale ones) are - never validated. Or the user agent might habitually add "Cache- - Control: max-stale=3600" to every request. The user agent SHOULD NOT - default to either non-transparent behavior, or behavior that results - in abnormally ineffective caching, but MAY be explicitly configured - to do so by an explicit action of the user. 
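To make the directive syntax concrete, here is a small sketch that splits a Cache-Control field value such as "max-stale=3600" into its directives. The function name `parse_cache_control` and the dict result are assumptions of this example. It is deliberately simplified: it does not handle quoted arguments that themselves contain commas, which a full parser based on the grammar in section 14.9 would need to do.

```python
def parse_cache_control(value):
    """Parse a Cache-Control field value into a dict of directives.

    Simplified sketch: directive names are lower-cased, arguments are kept
    as strings, and directives without an argument map to None. Quoted
    arguments containing commas are NOT handled.
    """
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, separator, argument = part.partition("=")
        name = name.strip().lower()
        directives[name] = argument.strip().strip('"') if separator else None
    return directives
```

For example, `parse_cache_control("no-cache, max-age=0")` yields a dict with a "no-cache" entry mapped to None and a "max-age" entry mapped to the string "0".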
- - If the user has overridden the basic caching mechanisms, the user - agent SHOULD explicitly indicate to the user whenever this results in - the display of information that might not meet the server's - transparency requirements (in particular, if the displayed entity is - known to be stale). Since the protocol normally allows the user agent - to determine if responses are stale or not, this indication need only - be displayed when this actually happens. The indication need not be a - dialog box; it could be an icon (for example, a picture of a rotting - fish) or some other indicator. - - If the user has overridden the caching mechanisms in a way that would - abnormally reduce the effectiveness of caches, the user agent SHOULD - continually indicate this state to the user (for example, by a - display of a picture of currency in flames) so that the user does not - inadvertently consume excess resources or suffer from excessive - latency. - -13.1.5 Exceptions to the Rules and Warnings - - In some cases, the operator of a cache MAY choose to configure it to - return stale responses even when not requested by clients. This - decision ought not be made lightly, but may be necessary for reasons - of availability or performance, especially when the cache is poorly - connected to the origin server. Whenever a cache returns a stale - response, it MUST mark it as such (using a Warning header) enabling - the client software to alert the user that there might be a potential - problem. - - - - - - - -Fielding, et al. Standards Track [Page 78] - -RFC 2616 HTTP/1.1 June 1999 - - - It also allows the user agent to take steps to obtain a first-hand or - fresh response. For this reason, a cache SHOULD NOT return a stale - response if the client explicitly requests a first-hand or fresh one, - unless it is impossible to comply for technical or policy reasons. 
- -13.1.6 Client-controlled Behavior - - While the origin server (and to a lesser extent, intermediate caches, - by their contribution to the age of a response) are the primary - source of expiration information, in some cases the client might need - to control a cache's decision about whether to return a cached - response without validating it. Clients do this using several - directives of the Cache-Control header. - - A client's request MAY specify the maximum age it is willing to - accept of an unvalidated response; specifying a value of zero forces - the cache(s) to revalidate all responses. A client MAY also specify - the minimum time remaining before a response expires. Both of these - options increase constraints on the behavior of caches, and so cannot - further relax the cache's approximation of semantic transparency. - - A client MAY also specify that it will accept stale responses, up to - some maximum amount of staleness. This loosens the constraints on the - caches, and so might violate the origin server's specified - constraints on semantic transparency, but might be necessary to - support disconnected operation, or high availability in the face of - poor connectivity. - -13.2 Expiration Model - -13.2.1 Server-Specified Expiration - - HTTP caching works best when caches can entirely avoid making - requests to the origin server. The primary mechanism for avoiding - requests is for an origin server to provide an explicit expiration - time in the future, indicating that a response MAY be used to satisfy - subsequent requests. In other words, a cache can return a fresh - response without first contacting the server. - - Our expectation is that servers will assign future explicit - expiration times to responses in the belief that the entity is not - likely to change, in a semantically significant way, before the - expiration time is reached. This normally preserves semantic - transparency, as long as the server's expiration times are carefully - chosen. 
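The client-controlled options of section 13.1.6 (a maximum age, a minimum remaining freshness, and a maximum staleness) can be sketched as a single check a cache might run before returning a stored response without validation. This is an illustration only: the function name and parameters are hypothetical, all values are in seconds, and the age and freshness lifetime are assumed to have been computed elsewhere.

```python
def may_serve_without_validation(current_age, freshness_lifetime,
                                 max_age=None, min_fresh=None, max_stale=None):
    """Decide whether a stored response satisfies the client's constraints.

    Hypothetical sketch; None means the client did not send that directive.
    """
    # Maximum age: a value of zero always forces revalidation.
    if max_age is not None and (max_age == 0 or current_age > max_age):
        return False
    remaining = freshness_lifetime - current_age  # zero or negative once stale
    # Minimum time remaining before the response expires.
    if min_fresh is not None and remaining < min_fresh:
        return False
    if remaining > 0:
        return True  # still fresh
    # Stale: acceptable only within the staleness the client allows.
    return max_stale is not None and -remaining <= max_stale
```

The first two checks can only tighten what the cache may return, while the last one loosens it, which mirrors the distinction drawn in section 13.1.6.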
- - - - - -Fielding, et al. Standards Track [Page 79] - -RFC 2616 HTTP/1.1 June 1999 - - - The expiration mechanism applies only to responses taken from a cache - and not to first-hand responses forwarded immediately to the - requesting client. - - If an origin server wishes to force a semantically transparent cache - to validate every request, it MAY assign an explicit expiration time - in the past. This means that the response is always stale, and so the - cache SHOULD validate it before using it for subsequent requests. See - section 14.9.4 for a more restrictive way to force revalidation. - - If an origin server wishes to force any HTTP/1.1 cache, no matter how - it is configured, to validate every request, it SHOULD use the "must- - revalidate" cache-control directive (see section 14.9). - - Servers specify explicit expiration times using either the Expires - header, or the max-age directive of the Cache-Control header. - - An expiration time cannot be used to force a user agent to refresh - its display or reload a resource; its semantics apply only to caching - mechanisms, and such mechanisms need only check a resource's - expiration status when a new request for that resource is initiated. - See section 13.13 for an explanation of the difference between caches - and history mechanisms. - -13.2.2 Heuristic Expiration - - Since origin servers do not always provide explicit expiration times, - HTTP caches typically assign heuristic expiration times, employing - algorithms that use other header values (such as the Last-Modified - time) to estimate a plausible expiration time. The HTTP/1.1 - specification does not provide specific algorithms, but does impose - worst-case constraints on their results. Since heuristic expiration - times might compromise semantic transparency, they ought to be used - cautiously, and we encourage origin servers to provide explicit - expiration times as much as possible. 
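Because the specification leaves the heuristic open, the following is only one possible sketch: it takes a fraction of the interval between the Last-Modified time and the Date of the response. The function name, the choice of Last-Modified as the only input, and the default fraction of 0.1 are assumptions of this example, not requirements.

```python
from email.utils import parsedate_to_datetime


def heuristic_freshness_lifetime(date_header, last_modified_header, fraction=0.1):
    """Estimate a freshness lifetime in seconds from Last-Modified.

    Hypothetical heuristic: a fixed fraction of the time that had elapsed
    between the last modification and the generation of the response.
    """
    date_value = parsedate_to_datetime(date_header)
    last_modified = parsedate_to_datetime(last_modified_header)
    interval = (date_value - last_modified).total_seconds()
    # A Last-Modified time later than Date yields no heuristic freshness.
    return max(0.0, interval * fraction)
```

A resource last modified one hour before the response was generated would get a lifetime of roughly six minutes under these assumptions.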
- -13.2.3 Age Calculations - - In order to know if a cached entry is fresh, a cache needs to know if - its age exceeds its freshness lifetime. We discuss how to calculate - the latter in section 13.2.4; this section describes how to calculate - the age of a response or cache entry. - - In this discussion, we use the term "now" to mean "the current value - of the clock at the host performing the calculation." Hosts that use - HTTP, but especially hosts running origin servers and caches, SHOULD - use NTP [28] or some similar protocol to synchronize their clocks to - a globally accurate time standard. - - - -Fielding, et al. Standards Track [Page 80] - -RFC 2616 HTTP/1.1 June 1999 - - - HTTP/1.1 requires origin servers to send a Date header, if possible, - with every response, giving the time at which the response was - generated (see section 14.18). We use the term "date_value" to denote - the value of the Date header, in a form appropriate for arithmetic - operations. - - HTTP/1.1 uses the Age response-header to convey the estimated age of - the response message when obtained from a cache. The Age field value - is the cache's estimate of the amount of time since the response was - generated or revalidated by the origin server. - - In essence, the Age value is the sum of the time that the response - has been resident in each of the caches along the path from the - origin server, plus the amount of time it has been in transit along - network paths. - - We use the term "age_value" to denote the value of the Age header, in - a form appropriate for arithmetic operations. - - A response's age can be calculated in two entirely independent ways: - - 1. now minus date_value, if the local clock is reasonably well - synchronized to the origin server's clock. If the result is - negative, the result is replaced by zero. - - 2. age_value, if all of the caches along the response path - implement HTTP/1.1. 
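The two estimates listed above can be sketched as follows. The function name `age_estimates`, the plain-dict `headers` argument, and the use of seconds since the epoch for `now` are assumptions made for this illustration.

```python
from email.utils import parsedate_to_datetime


def age_estimates(headers, now):
    """Return (clock_based_age, age_value) in seconds; either may be None.

    Hypothetical helper: `headers` is a dict of response header fields and
    `now` is the local clock in seconds since the epoch.
    """
    clock_based_age = None
    if "Date" in headers:
        date_value = parsedate_to_datetime(headers["Date"]).timestamp()
        # Way 1: now minus date_value, with a negative result replaced by zero.
        clock_based_age = max(0, int(now - date_value))
    age_value = None
    if headers.get("Age", "").strip().isdigit():
        # Way 2: the Age header, as reported by the caches along the path.
        age_value = int(headers["Age"])
    return clock_based_age, age_value
```

Keeping the two values separate lets the caller see which estimate is available before deciding how to use them.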
- - Given that we have two independent ways to compute the age of a - response when it is received, we can combine these as - - corrected_received_age = max(now - date_value, age_value) - - and as long as we have either nearly synchronized clocks or all- - HTTP/1.1 paths, one gets a reliable (conservative) result. - - Because of network-imposed delays, some significant interval might - pass between the time that a server generates a response and the time - it is received at the next outbound cache or client. If uncorrected, - this delay could result in improperly low ages. - - Because the request that resulted in the returned Age value must have - been initiated prior to that Age value's generation, we can correct - for delays imposed by the network by recording the time at which the - request was initiated. Then, when an Age value is received, it MUST - be interpreted relative to the time the request was initiated, not - - - - - -Fielding, et al. Standards Track [Page 81] - -RFC 2616 HTTP/1.1 June 1999 - - - the time that the response was received. This algorithm results in - conservative behavior no matter how much delay is experienced. So, we - compute: - - corrected_initial_age = corrected_received_age - + (now - request_time) - - where "request_time" is the time (according to the local clock) when - the request that elicited this response was sent. - - Summary of age calculation algorithm, when a cache receives a - response: - - /* - * age_value - * is the value of Age: header received by the cache with - * this response. 
- * date_value - * is the value of the origin server's Date: header - * request_time - * is the (local) time when the cache made the request - * that resulted in this cached response - * response_time - * is the (local) time when the cache received the - * response - * now - * is the current (local) time - */ - - apparent_age = max(0, response_time - date_value); - corrected_received_age = max(apparent_age, age_value); - response_delay = response_time - request_time; - corrected_initial_age = corrected_received_age + response_delay; - resident_time = now - response_time; - current_age = corrected_initial_age + resident_time; - - The current_age of a cache entry is calculated by adding the amount - of time (in seconds) since the cache entry was last validated by the - origin server to the corrected_initial_age. When a response is - generated from a cache entry, the cache MUST include a single Age - header field in the response with a value equal to the cache entry's - current_age. - - The presence of an Age header field in a response implies that a - response is not first-hand. However, the converse is not true, since - the lack of an Age header field in a response does not imply that the - - - - - -Fielding, et al. Standards Track [Page 82] - -RFC 2616 HTTP/1.1 June 1999 - - - response is first-hand unless all caches along the request path are - compliant with HTTP/1.1 (i.e., older HTTP caches did not implement - the Age header field). - -13.2.4 Expiration Calculations - - In order to decide whether a response is fresh or stale, we need to - compare its freshness lifetime to its age. The age is calculated as - described in section 13.2.3; this section describes how to calculate - the freshness lifetime, and to determine if a response has expired. - In the discussion below, the values can be represented in any form - appropriate for arithmetic operations. - - We use the term "expires_value" to denote the value of the Expires - header. 
We use the term "max_age_value" to denote an appropriate - value of the number of seconds carried by the "max-age" directive of - the Cache-Control header in a response (see section 14.9.3). - - The max-age directive takes priority over Expires, so if max-age is - present in a response, the calculation is simply: - - freshness_lifetime = max_age_value - - Otherwise, if Expires is present in the response, the calculation is: - - freshness_lifetime = expires_value - date_value - - Note that neither of these calculations is vulnerable to clock skew, - since all of the information comes from the origin server. - - If none of Expires, Cache-Control: max-age, or Cache-Control: s- - maxage (see section 14.9.3) appears in the response, and the response - does not include other restrictions on caching, the cache MAY compute - a freshness lifetime using a heuristic. The cache MUST attach Warning - 113 to any response whose age is more than 24 hours if such warning - has not already been added. - - Also, if the response does have a Last-Modified time, the heuristic - expiration value SHOULD be no more than some fraction of the interval - since that time. A typical setting of this fraction might be 10%. - - The calculation to determine if a response has expired is quite - simple: - - response_is_fresh = (freshness_lifetime > current_age) - - - - - - -Fielding, et al. Standards Track [Page 83] - -RFC 2616 HTTP/1.1 June 1999 - - -13.2.5 Disambiguating Expiration Values - - Because expiration values are assigned optimistically, it is possible - for two caches to contain fresh values for the same resource that are - different. - - If a client performing a retrieval receives a non-first-hand response - for a request that was already fresh in its own cache, and the Date - header in its existing cache entry is newer than the Date on the new - response, then the client MAY ignore the response. 
If so, it MAY - retry the request with a "Cache-Control: max-age=0" directive (see - section 14.9), to force a check with the origin server. - - If a cache has two fresh responses for the same representation with - different validators, it MUST use the one with the more recent Date - header. This situation might arise because the cache is pooling - responses from other caches, or because a client has asked for a - reload or a revalidation of an apparently fresh cache entry. - -13.2.6 Disambiguating Multiple Responses - - Because a client might be receiving responses via multiple paths, so - that some responses flow through one set of caches and other - responses flow through a different set of caches, a client might - receive responses in an order different from that in which the origin - server sent them. We would like the client to use the most recently - generated response, even if older responses are still apparently - fresh. - - Neither the entity tag nor the expiration value can impose an - ordering on responses, since it is possible that a later response - intentionally carries an earlier expiration time. The Date values are - ordered to a granularity of one second. - - When a client tries to revalidate a cache entry, and the response it - receives contains a Date header that appears to be older than the one - for the existing entry, then the client SHOULD repeat the request - unconditionally, and include - - Cache-Control: max-age=0 - - to force any intermediate caches to validate their copies directly - with the origin server, or - - Cache-Control: no-cache - - to force any intermediate caches to obtain a new copy from the origin - server. - - - -Fielding, et al. Standards Track [Page 84] - -RFC 2616 HTTP/1.1 June 1999 - - - If the Date values are equal, then the client MAY use either response - (or MAY, if it is being extremely prudent, request a new response). 
- Servers MUST NOT depend on clients being able to choose - deterministically between responses generated during the same second, - if their expiration times overlap. - -13.3 Validation Model - - When a cache has a stale entry that it would like to use as a - response to a client's request, it first has to check with the origin - server (or possibly an intermediate cache with a fresh response) to - see if its cached entry is still usable. We call this "validating" - the cache entry. Since we do not want to have to pay the overhead of - retransmitting the full response if the cached entry is good, and we - do not want to pay the overhead of an extra round trip if the cached - entry is invalid, the HTTP/1.1 protocol supports the use of - conditional methods. - - The key protocol features for supporting conditional methods are - those concerned with "cache validators." When an origin server - generates a full response, it attaches some sort of validator to it, - which is kept with the cache entry. When a client (user agent or - proxy cache) makes a conditional request for a resource for which it - has a cache entry, it includes the associated validator in the - request. - - The server then checks that validator against the current validator - for the entity, and, if they match (see section 13.3.3), it responds - with a special status code (usually, 304 (Not Modified)) and no - entity-body. Otherwise, it returns a full response (including - entity-body). Thus, we avoid transmitting the full response if the - validator matches, and we avoid an extra round trip if it does not - match. - - In HTTP/1.1, a conditional request looks exactly the same as a normal - request for the same resource, except that it carries a special - header (which includes the validator) that implicitly turns the - method (usually, GET) into a conditional. - - The protocol includes both positive and negative senses of cache- - validating conditions. 
That is, it is possible to request either that - a method be performed if and only if a validator matches or if and - only if no validators match. - - - - - - - - -Fielding, et al. Standards Track [Page 85] - -RFC 2616 HTTP/1.1 June 1999 - - - Note: a response that lacks a validator may still be cached, and - served from cache until it expires, unless this is explicitly - prohibited by a cache-control directive. However, a cache cannot - do a conditional retrieval if it does not have a validator for the - entity, which means it will not be refreshable after it expires. - -13.3.1 Last-Modified Dates - - The Last-Modified entity-header field value is often used as a cache - validator. In simple terms, a cache entry is considered to be valid - if the entity has not been modified since the Last-Modified value. - -13.3.2 Entity Tag Cache Validators - - The ETag response-header field value, an entity tag, provides for an - "opaque" cache validator. This might allow more reliable validation - in situations where it is inconvenient to store modification dates, - where the one-second resolution of HTTP date values is not - sufficient, or where the origin server wishes to avoid certain - paradoxes that might arise from the use of modification dates. - - Entity Tags are described in section 3.11. The headers used with - entity tags are described in sections 14.19, 14.24, 14.26 and 14.44. - -13.3.3 Weak and Strong Validators - - Since both origin servers and caches will compare two validators to - decide if they represent the same or different entities, one normally - would expect that if the entity (the entity-body or any entity- - headers) changes in any way, then the associated validator would - change as well. If this is true, then we call this validator a - "strong validator." - - However, there might be cases when a server prefers to change the - validator only on semantically significant changes, and not when - insignificant aspects of the entity change. 
A validator that does not - always change when the resource changes is a "weak validator." - - Entity tags are normally "strong validators," but the protocol - provides a mechanism to tag an entity tag as "weak." One can think of - a strong validator as one that changes whenever the bits of an entity - changes, while a weak value changes whenever the meaning of an entity - changes. Alternatively, one can think of a strong validator as part - of an identifier for a specific entity, while a weak validator is - part of an identifier for a set of semantically equivalent entities. - - Note: One example of a strong validator is an integer that is - incremented in stable storage every time an entity is changed. - - - -Fielding, et al. Standards Track [Page 86] - -RFC 2616 HTTP/1.1 June 1999 - - - An entity's modification time, if represented with one-second - resolution, could be a weak validator, since it is possible that - the resource might be modified twice during a single second. - - Support for weak validators is optional. However, weak validators - allow for more efficient caching of equivalent objects; for - example, a hit counter on a site is probably good enough if it is - updated every few days or weeks, and any value during that period - is likely "good enough" to be equivalent. - - A "use" of a validator is either when a client generates a request - and includes the validator in a validating header field, or when a - server compares two validators. - - Strong validators are usable in any context. Weak validators are only - usable in contexts that do not depend on exact equality of an entity. - For example, either kind is usable for a conditional GET of a full - entity. However, only a strong validator is usable for a sub-range - retrieval, since otherwise the client might end up with an internally - inconsistent entity. - - Clients MAY issue simple (non-subrange) GET requests with either weak - validators or strong validators. 
Clients MUST NOT use weak validators - in other forms of request. - - The only function that the HTTP/1.1 protocol defines on validators is - comparison. There are two validator comparison functions, depending - on whether the comparison context allows the use of weak validators - or not: - - - The strong comparison function: in order to be considered equal, - both validators MUST be identical in every way, and both MUST - NOT be weak. - - - The weak comparison function: in order to be considered equal, - both validators MUST be identical in every way, but either or - both of them MAY be tagged as "weak" without affecting the - result. - - An entity tag is strong unless it is explicitly tagged as weak. - Section 3.11 gives the syntax for entity tags. - - A Last-Modified time, when used as a validator in a request, is - implicitly weak unless it is possible to deduce that it is strong, - using the following rules: - - - The validator is being compared by an origin server to the - actual current validator for the entity and, - - - -Fielding, et al. Standards Track [Page 87] - -RFC 2616 HTTP/1.1 June 1999 - - - - That origin server reliably knows that the associated entity did - not change twice during the second covered by the presented - validator. - - or - - - The validator is about to be used by a client in an If- - Modified-Since or If-Unmodified-Since header, because the client - has a cache entry for the associated entity, and - - - That cache entry includes a Date value, which gives the time - when the origin server sent the original response, and - - - The presented Last-Modified time is at least 60 seconds before - the Date value. 
- - or - - - The validator is being compared by an intermediate cache to the - validator stored in its cache entry for the entity, and - - - That cache entry includes a Date value, which gives the time - when the origin server sent the original response, and - - - The presented Last-Modified time is at least 60 seconds before - the Date value. - - This method relies on the fact that if two different responses were - sent by the origin server during the same second, but both had the - same Last-Modified time, then at least one of those responses would - have a Date value equal to its Last-Modified time. The arbitrary 60- - second limit guards against the possibility that the Date and Last- - Modified values are generated from different clocks, or at somewhat - different times during the preparation of the response. An - implementation MAY use a value larger than 60 seconds, if it is - believed that 60 seconds is too short. - - If a client wishes to perform a sub-range retrieval on a value for - which it has only a Last-Modified time and no opaque validator, it - MAY do this only if the Last-Modified time is strong in the sense - described here. - - A cache or origin server receiving a conditional request, other than - a full-body GET request, MUST use the strong comparison function to - evaluate the condition. - - These rules allow HTTP/1.1 caches and clients to safely perform sub- - range retrievals on values that have been obtained from HTTP/1.0 - - - -Fielding, et al. Standards Track [Page 88] - -RFC 2616 HTTP/1.1 June 1999 - - - servers. - -13.3.4 Rules for When to Use Entity Tags and Last-Modified Dates - - We adopt a set of rules and recommendations for origin servers, - clients, and caches regarding when various validator types ought to - be used, and for what purposes. - - HTTP/1.1 origin servers: - - - SHOULD send an entity tag validator unless it is not feasible to - generate one. 
- - - MAY send a weak entity tag instead of a strong entity tag, if - performance considerations support the use of weak entity tags, - or if it is unfeasible to send a strong entity tag. - - - SHOULD send a Last-Modified value if it is feasible to send one, - unless the risk of a breakdown in semantic transparency that - could result from using this date in an If-Modified-Since header - would lead to serious problems. - - In other words, the preferred behavior for an HTTP/1.1 origin server - is to send both a strong entity tag and a Last-Modified value. - - In order to be legal, a strong entity tag MUST change whenever the - associated entity value changes in any way. A weak entity tag SHOULD - change whenever the associated entity changes in a semantically - significant way. - - Note: in order to provide semantically transparent caching, an - origin server must avoid reusing a specific strong entity tag - value for two different entities, or reusing a specific weak - entity tag value for two semantically different entities. Cache - entries might persist for arbitrarily long periods, regardless of - expiration times, so it might be inappropriate to expect that a - cache will never again attempt to validate an entry using a - validator that it obtained at some point in the past. - - HTTP/1.1 clients: - - - If an entity tag has been provided by the origin server, MUST - use that entity tag in any cache-conditional request (using If- - Match or If-None-Match). - - - If only a Last-Modified value has been provided by the origin - server, SHOULD use that value in non-subrange cache-conditional - requests (using If-Modified-Since). - - - -Fielding, et al. Standards Track [Page 89] - -RFC 2616 HTTP/1.1 June 1999 - - - - If only a Last-Modified value has been provided by an HTTP/1.0 - origin server, MAY use that value in subrange cache-conditional - requests (using If-Unmodified-Since:). The user agent SHOULD - provide a way to disable this, in case of difficulty. 
- - - If both an entity tag and a Last-Modified value have been - provided by the origin server, SHOULD use both validators in - cache-conditional requests. This allows both HTTP/1.0 and - HTTP/1.1 caches to respond appropriately. - - An HTTP/1.1 origin server, upon receiving a conditional request that - includes both a Last-Modified date (e.g., in an If-Modified-Since or - If-Unmodified-Since header field) and one or more entity tags (e.g., - in an If-Match, If-None-Match, or If-Range header field) as cache - validators, MUST NOT return a response status of 304 (Not Modified) - unless doing so is consistent with all of the conditional header - fields in the request. - - An HTTP/1.1 caching proxy, upon receiving a conditional request that - includes both a Last-Modified date and one or more entity tags as - cache validators, MUST NOT return a locally cached response to the - client unless that cached response is consistent with all of the - conditional header fields in the request. - - Note: The general principle behind these rules is that HTTP/1.1 - servers and clients should transmit as much non-redundant - information as is available in their responses and requests. - HTTP/1.1 systems receiving this information will make the most - conservative assumptions about the validators they receive. - - HTTP/1.0 clients and caches will ignore entity tags. Generally, - last-modified values received or used by these systems will - support transparent and efficient caching, and so HTTP/1.1 origin - servers should provide Last-Modified values. In those rare cases - where the use of a Last-Modified value as a validator by an - HTTP/1.0 system could result in a serious problem, then HTTP/1.1 - origin servers should not provide one. 
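The client rules of section 13.3.4 can be sketched as a small helper. This is only an illustration in Python and not part of the RFC: the function name conditional_headers and its parameters are hypothetical, and it covers only non-subrange cache-conditional GET requests.

```python
from typing import Dict, Optional


def conditional_headers(etag: Optional[str],
                        last_modified: Optional[str]) -> Dict[str, str]:
    """Build validator headers for a non-subrange cache-conditional GET.

    etag and last_modified are the values stored with the cache entry
    (None when the origin server did not provide one).
    """
    headers: Dict[str, str] = {}
    if etag is not None:
        # An entity tag supplied by the origin server MUST be used.
        headers["If-None-Match"] = etag
    if last_modified is not None:
        # A Last-Modified value SHOULD be sent as well; HTTP/1.0 caches
        # ignore entity tags and rely on this header instead.
        headers["If-Modified-Since"] = last_modified
    return headers
```

Sending both validators when both are known lets HTTP/1.0 and HTTP/1.1 caches along the path each respond appropriately.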
- -13.3.5 Non-validating Conditionals - - The principle behind entity tags is that only the service author - knows the semantics of a resource well enough to select an - appropriate cache validation mechanism, and the specification of any - validator comparison function more complex than byte-equality would - open up a can of worms. Thus, comparisons of any other headers - (except Last-Modified, for compatibility with HTTP/1.0) are never - used for purposes of validating a cache entry. - - - - -Fielding, et al. Standards Track [Page 90] - -RFC 2616 HTTP/1.1 June 1999 - - -13.4 Response Cacheability - - Unless specifically constrained by a cache-control (section 14.9) - directive, a caching system MAY always store a successful response - (see section 13.8) as a cache entry, MAY return it without validation - if it is fresh, and MAY return it after successful validation. If - there is neither a cache validator nor an explicit expiration time - associated with a response, we do not expect it to be cached, but - certain caches MAY violate this expectation (for example, when little - or no network connectivity is available). A client can usually detect - that such a response was taken from a cache by comparing the Date - header to the current time. - - Note: some HTTP/1.0 caches are known to violate this expectation - without providing any Warning. - - However, in some cases it might be inappropriate for a cache to - retain an entity, or to return it in response to a subsequent - request. This might be because absolute semantic transparency is - deemed necessary by the service author, or because of security or - privacy considerations. Certain cache-control directives are - therefore provided so that the server can indicate that certain - resource entities, or portions thereof, are not to be cached - regardless of other considerations. 
- - Note that section 14.8 normally prevents a shared cache from saving - and returning a response to a previous request if that request - included an Authorization header. - - A response received with a status code of 200, 203, 206, 300, 301 or - 410 MAY be stored by a cache and used in reply to a subsequent - request, subject to the expiration mechanism, unless a cache-control - directive prohibits caching. However, a cache that does not support - the Range and Content-Range headers MUST NOT cache 206 (Partial - Content) responses. - - A response received with any other status code (e.g. status codes 302 - and 307) MUST NOT be returned in a reply to a subsequent request - unless there are cache-control directives or another header(s) that - explicitly allow it. For example, these include the following: an - Expires header (section 14.21); a "max-age", "s-maxage", "must- - revalidate", "proxy-revalidate", "public" or "private" cache-control - directive (section 14.9). - - - - - - - - -Fielding, et al. Standards Track [Page 91] - -RFC 2616 HTTP/1.1 June 1999 - - -13.5 Constructing Responses From Caches - - The purpose of an HTTP cache is to store information received in - response to requests for use in responding to future requests. In - many cases, a cache simply returns the appropriate parts of a - response to the requester. However, if the cache holds a cache entry - based on a previous response, it might have to combine parts of a new - response with what is held in the cache entry. - -13.5.1 End-to-end and Hop-by-hop Headers - - For the purpose of defining the behavior of caches and non-caching - proxies, we divide HTTP headers into two categories: - - - End-to-end headers, which are transmitted to the ultimate - recipient of a request or response. End-to-end headers in - responses MUST be stored as part of a cache entry and MUST be - transmitted in any response formed from a cache entry. 
- - - Hop-by-hop headers, which are meaningful only for a single - transport-level connection, and are not stored by caches or - forwarded by proxies. - - The following HTTP/1.1 headers are hop-by-hop headers: - - - Connection - - Keep-Alive - - Proxy-Authenticate - - Proxy-Authorization - - TE - - Trailers [[should be "Trailer"]] - - Transfer-Encoding - - Upgrade - - All other headers defined by HTTP/1.1 are end-to-end headers. - - Other hop-by-hop headers MUST be listed in a Connection header, - (section 14.10) to be introduced into HTTP/1.1 (or later). - -13.5.2 Non-modifiable Headers - - Some features of the HTTP/1.1 protocol, such as Digest - Authentication, depend on the value of certain end-to-end headers. A - transparent proxy SHOULD NOT modify an end-to-end header unless the - definition of that header requires or specifically allows that. - - - - - - -Fielding, et al. Standards Track [Page 92] - -RFC 2616 HTTP/1.1 June 1999 - - - A transparent proxy MUST NOT modify any of the following fields in a - request or response, and it MUST NOT add any of these fields if not - already present: - - - Content-Location - - - Content-MD5 - - - ETag - - - Last-Modified - - A transparent proxy MUST NOT modify any of the following fields in a - response: - - - Expires - - but it MAY add any of these fields if not already present. If an - Expires header is added, it MUST be given a field-value identical to - that of the Date header in that response. - - A proxy MUST NOT modify or add any of the following fields in a - message that contains the no-transform cache-control directive, or in - any request: - - - Content-Encoding - - - Content-Range - - - Content-Type - - A non-transparent proxy MAY modify or add these fields to a message - that does not include no-transform, but if it does so, it MUST add a - Warning 214 (Transformation applied) if one does not already appear - in the message (see section 14.46). 
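As an illustration of the hop-by-hop list in section 13.5.1, the following Python sketch filters a header set down to its end-to-end headers before storing or forwarding it. The names HOP_BY_HOP and end_to_end_headers are hypothetical, headers are assumed to be held in a plain dict, and both "Trailers" and the corrected "Trailer" are treated as hop-by-hop.

```python
from typing import Dict

# Hop-by-hop headers listed in section 13.5.1 (lower-cased for comparison).
# Both spellings of the trailer header are included because of the erratum.
HOP_BY_HOP = frozenset({
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailer", "trailers", "transfer-encoding", "upgrade",
})


def end_to_end_headers(headers: Dict[str, str]) -> Dict[str, str]:
    """Return only the headers a cache may store or a proxy may forward."""
    # Any header named in the Connection header is hop-by-hop as well.
    listed = {
        token.strip().lower()
        for name, value in headers.items() if name.lower() == "connection"
        for token in value.split(",") if token.strip()
    }
    drop = HOP_BY_HOP | listed
    return {n: v for n, v in headers.items() if n.lower() not in drop}
```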
- - Warning: unnecessary modification of end-to-end headers might - cause authentication failures if stronger authentication - mechanisms are introduced in later versions of HTTP. Such - authentication mechanisms MAY rely on the values of header fields - not listed here. - - The Content-Length field of a request or response is added or deleted - according to the rules in section 4.4. A transparent proxy MUST - preserve the entity-length (section 7.2.2) of the entity-body, - although it MAY change the transfer-length (section 4.4). - - - - - -Fielding, et al. Standards Track [Page 93] - -RFC 2616 HTTP/1.1 June 1999 - - -13.5.3 Combining Headers - - When a cache makes a validating request to a server, and the server - provides a 304 (Not Modified) response or a 206 (Partial Content) - response, the cache then constructs a response to send to the - requesting client. - - If the status code is 304 (Not Modified), the cache uses the entity- - body stored in the cache entry as the entity-body of this outgoing - response. If the status code is 206 (Partial Content) and the ETag or - Last-Modified headers match exactly, the cache MAY combine the - contents stored in the cache entry with the new contents received in - the response and use the result as the entity-body of this outgoing - response, (see 13.5.4). - - The end-to-end headers stored in the cache entry are used for the - constructed response, except that - - - any stored Warning headers with warn-code 1xx (see section - 14.46) MUST be deleted from the cache entry and the forwarded - response. - - - any stored Warning headers with warn-code 2xx MUST be retained - in the cache entry and the forwarded response. - - - any end-to-end headers provided in the 304 or 206 response MUST - replace the corresponding headers from the cache entry. 
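The three rules above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not a reference implementation: headers are modelled as lists of (name, value) pairs, the function name combine_headers is hypothetical, and the incoming 304 or 206 headers are assumed to have had hop-by-hop headers removed already.

```python
from typing import List, Tuple

Header = Tuple[str, str]


def combine_headers(stored: List[Header], incoming: List[Header]) -> List[Header]:
    """Merge end-to-end headers from a 304/206 response into a cache entry."""
    incoming_names = {name.lower() for name, _ in incoming}
    merged: List[Header] = []
    for name, value in stored:
        lname = name.lower()
        if lname == "warning":
            # A warn-code is three digits; 1xx warnings are deleted,
            # 2xx warnings are retained.
            if value.lstrip()[:1] == "1":
                continue
            merged.append((name, value))
        elif lname in incoming_names:
            # Every stored header with this name is replaced by the
            # headers from the incoming response.
            continue
        else:
            merged.append((name, value))
    merged.extend(incoming)
    return merged
```

Because a stored header is dropped only when the incoming response carries a header of the same name, a 304 or 206 response can update a header but cannot delete one.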
- - Unless the cache decides to remove the cache entry, it MUST also - replace the end-to-end headers stored with the cache entry with - corresponding headers received in the incoming response, except for - Warning headers as described immediately above. If a header field- - name in the incoming response matches more than one header in the - cache entry, all such old headers MUST be replaced. - - In other words, the set of end-to-end headers received in the - incoming response overrides all corresponding end-to-end headers - stored with the cache entry (except for stored Warning headers with - warn-code 1xx, which are deleted even if not overridden). - - Note: this rule allows an origin server to use a 304 (Not - Modified) or a 206 (Partial Content) response to update any header - associated with a previous response for the same entity or sub- - ranges thereof, although it might not always be meaningful or - correct to do so. This rule does not allow an origin server to use - a 304 (Not Modified) or a 206 (Partial Content) response to - entirely delete a header that it had provided with a previous - response. - - - -Fielding, et al. Standards Track [Page 94] - -RFC 2616 HTTP/1.1 June 1999 - - -13.5.4 Combining Byte Ranges - - A response might transfer only a subrange of the bytes of an entity- - body, either because the request included one or more Range - specifications, or because a connection was broken prematurely. After - several such transfers, a cache might have received several ranges of - the same entity-body. - - If a cache has a stored non-empty set of subranges for an entity, and - an incoming response transfers another subrange, the cache MAY - combine the new subrange with the existing set if both the following - conditions are met: - - - Both the incoming response and the cache entry have a cache - validator. - - - The two cache validators match using the strong comparison - function (see section 13.3.3). 
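The two conditions above depend on the strong comparison function of section 13.3.3. A minimal Python sketch follows, assuming the validators are entity tags written in the section 3.11 syntax, where a weak tag carries a W/ prefix. The function names are hypothetical, and Last-Modified validators are deliberately left out.

```python
from typing import Optional, Tuple


def _split(tag: str) -> Tuple[bool, str]:
    """Split an entity tag into (is_weak, opaque_tag)."""
    tag = tag.strip()
    if tag.startswith("W/"):
        return True, tag[2:]
    return False, tag


def strong_match(a: str, b: str) -> bool:
    """Strong comparison: identical in every way and neither tag is weak."""
    weak_a, opaque_a = _split(a)
    weak_b, opaque_b = _split(b)
    return not weak_a and not weak_b and opaque_a == opaque_b


def weak_match(a: str, b: str) -> bool:
    """Weak comparison: identical opaque tags; the weak flag is ignored."""
    return _split(a)[1] == _split(b)[1]


def may_combine_ranges(stored: Optional[str], incoming: Optional[str]) -> bool:
    """Both sides need a validator, and the two must match strongly."""
    if stored is None or incoming is None:
        return False
    return strong_match(stored, incoming)
```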
- - If either requirement is not met, the cache MUST use only the most - recent partial response (based on the Date values transmitted with - every response, and using the incoming response if these values are - equal or missing), and MUST discard the other partial information. - -13.6 Caching Negotiated Responses - - Use of server-driven content negotiation (section 12.1), as indicated - by the presence of a Vary header field in a response, alters the - conditions and procedure by which a cache can use the response for - subsequent requests. See section 14.44 for use of the Vary header - field by servers. - - A server SHOULD use the Vary header field to inform a cache of what - request-header fields were used to select among multiple - representations of a cacheable response subject to server-driven - negotiation. The set of header fields named by the Vary field value - is known as the "selecting" request-headers. - - When the cache receives a subsequent request whose Request-URI - specifies one or more cache entries including a Vary header field, - the cache MUST NOT use such a cache entry to construct a response to - the new request unless all of the selecting request-headers present - in the new request match the corresponding stored request-headers in - the original request. - - The selecting request-headers from two requests are defined to match - if and only if the selecting request-headers in the first request can - be transformed to the selecting request-headers in the second request - - - -Fielding, et al. Standards Track [Page 95] - -RFC 2616 HTTP/1.1 June 1999 - - - by adding or removing linear white space (LWS) at places where this - is allowed by the corresponding BNF, and/or combining multiple - message-header fields with the same field name following the rules - about message headers in section 4.2. 
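The matching rule for selecting request-headers can be approximated as below. This Python sketch makes several simplifying assumptions: request headers are lists of (name, value) pairs, white space is normalised around commas only (not at every place the BNF allows), quoted strings containing commas are not handled, and the function names are hypothetical. A Vary value of "*" is treated as never matching.

```python
import re
from typing import List, Tuple

Header = Tuple[str, str]


def _normalised(headers: List[Header], field: str) -> str:
    """Combine all fields with this name and normalise white space."""
    values = [v for n, v in headers if n.lower() == field]
    parts = [re.sub(r"\s+", " ", p.strip()) for p in ",".join(values).split(",")]
    return ",".join(p for p in parts if p)


def selecting_headers_match(vary: str,
                            stored_request: List[Header],
                            new_request: List[Header]) -> bool:
    """True if the stored entry may be used for the new request."""
    if vary.strip() == "*":
        return False  # only the origin server can interpret the request
    for field in (f.strip().lower() for f in vary.split(",")):
        if not field:
            continue
        if _normalised(stored_request, field) != _normalised(new_request, field):
            return False
    return True
```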
 - - A Vary header field-value of "*" always fails to match and subsequent - requests on that resource can only be properly interpreted by the - origin server. - - If the selecting request header fields for the cached entry do not - match the selecting request header fields of the new request, then - the cache MUST NOT use a cached entry to satisfy the request unless - it first relays the new request to the origin server in a conditional - request and the server responds with 304 (Not Modified), including an - entity tag or Content-Location that indicates the entity to be used. - - If an entity tag was assigned to a cached representation, the - forwarded request SHOULD be conditional and include the entity tags - in an If-None-Match header field from all its cache entries for the - resource. This conveys to the server the set of entities currently - held by the cache, so that if any one of these entities matches the - requested entity, the server can use the ETag header field in its 304 - (Not Modified) response to tell the cache which entry is appropriate. - If the entity-tag of the new response matches that of an existing - entry, the new response SHOULD be used to update the header fields of - the existing entry, and the result MUST be returned to the client. - - If any of the existing cache entries contains only partial content - for the associated entity, its entity-tag SHOULD NOT be included in - the If-None-Match header field unless the request is for a range that - would be fully satisfied by that entry. - - If a cache receives a successful response whose Content-Location - field matches that of an existing cache entry for the same - Request-URI, whose entity-tag differs from that of the existing entry, and - whose Date is more recent than that of the existing entry, the - existing entry SHOULD NOT be returned in response to future requests - and SHOULD be deleted from the cache.
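The construction of the If-None-Match field described in section 13.6 can be sketched as follows. This is a hedged Python illustration only: the function name if_none_match_value is hypothetical, each cache entry is reduced to an (entity tag, is_partial) pair, and the request is assumed to be for the full entity, so entries holding only partial content are always left out.

```python
from typing import Iterable, Optional, Tuple


def if_none_match_value(
        entries: Iterable[Tuple[Optional[str], bool]]) -> Optional[str]:
    """Collect the entity tags of all usable cache entries for a resource."""
    tags = []
    for etag, is_partial in entries:
        if etag is None or is_partial:
            # Entries without a tag cannot be named; partial entries
            # SHOULD NOT be offered for a full-entity request.
            continue
        if etag not in tags:
            tags.append(etag)
    # None means the forwarded request cannot be made conditional this way.
    return ", ".join(tags) if tags else None
```

The ETag returned in a 304 (Not Modified) response then tells the cache which of the offered entries to use.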
- -13.7 Shared and Non-Shared Caches - - For reasons of security and privacy, it is necessary to make a - distinction between "shared" and "non-shared" caches. A non-shared - cache is one that is accessible only to a single user. Accessibility - in this case SHOULD be enforced by appropriate security mechanisms. - All other caches are considered to be "shared." Other sections of - - - - - -Fielding, et al. Standards Track [Page 96] - -RFC 2616 HTTP/1.1 June 1999 - - - this specification place certain constraints on the operation of - shared caches in order to prevent loss of privacy or failure of - access controls. - -13.8 Errors or Incomplete Response Cache Behavior - - A cache that receives an incomplete response (for example, with fewer - bytes of data than specified in a Content-Length header) MAY store - the response. However, the cache MUST treat this as a partial - response. Partial responses MAY be combined as described in section - 13.5.4; the result might be a full response or might still be - partial. A cache MUST NOT return a partial response to a client - without explicitly marking it as such, using the 206 (Partial - Content) status code. A cache MUST NOT return a partial response - using a status code of 200 (OK). - - If a cache receives a 5xx response while attempting to revalidate an - entry, it MAY either forward this response to the requesting client, - or act as if the server failed to respond. In the latter case, it MAY - return a previously received response unless the cached entry - includes the "must-revalidate" cache-control directive (see section - 14.9). - -13.9 Side Effects of GET and HEAD - - Unless the origin server explicitly prohibits the caching of their - responses, the application of GET and HEAD methods to any resources - SHOULD NOT have side effects that would lead to erroneous behavior if - these responses are taken from a cache. 
They MAY still have side - effects, but a cache is not required to consider such side effects in - its caching decisions. Caches are always expected to observe an - origin server's explicit restrictions on caching. - - We note one exception to this rule: since some applications have - traditionally used GETs and HEADs with query URLs (those containing a - "?" in the rel_path part) to perform operations with significant side - effects, caches MUST NOT treat responses to such URIs as fresh unless - the server provides an explicit expiration time. This specifically - means that responses from HTTP/1.0 servers for such URIs SHOULD NOT - be taken from a cache. See section 9.1.1 for related information. - -13.10 Invalidation After Updates or Deletions - - The effect of certain methods performed on a resource at the origin - server might cause one or more existing cache entries to become non- - transparently invalid. That is, although they might continue to be - "fresh," they do not accurately reflect what the origin server would - return for a new request on that resource. - - - -Fielding, et al. Standards Track [Page 97] - -RFC 2616 HTTP/1.1 June 1999 - - - There is no way for the HTTP protocol to guarantee that all such - cache entries are marked invalid. For example, the request that - caused the change at the origin server might not have gone through - the proxy where a cache entry is stored. However, several rules help - reduce the likelihood of erroneous behavior. - - In this section, the phrase "invalidate an entity" means that the - cache will either remove all instances of that entity from its - storage, or will mark these as "invalid" and in need of a mandatory - revalidation before they can be returned in response to a subsequent - request. - - Some HTTP methods MUST cause a cache to invalidate an entity. This is - either the entity referred to by the Request-URI, or by the Location - or Content-Location headers (if present). 
These methods are: - - - PUT - - - DELETE - - - POST - - In order to prevent denial of service attacks, an invalidation based - on the URI in a Location or Content-Location header MUST only be - performed if the host part is the same as in the Request-URI. - -[[ Should be: ]] -[[ An invalidation based on the URI in a Location or Content-Location ]] -[[ header MUST NOT be performed if the host part of that URI differs ]] -[[ from the host part in the Request-URI. This helps prevent denial of ]] -[[ service attacks. ]] - - A cache that passes through requests for methods it does not - understand SHOULD invalidate any entities referred to by the - Request-URI. - -13.11 Write-Through Mandatory - - All methods that might be expected to cause modifications to the - origin server's resources MUST be written through to the origin - server. This currently includes all methods except for GET and HEAD. - A cache MUST NOT reply to such a request from a client before having - transmitted the request to the inbound server, and having received a - corresponding response from the inbound server. This does not prevent - a proxy cache from sending a 100 (Continue) response before the - inbound server has sent its final reply. - - The alternative (known as "write-back" or "copy-back" caching) is not - allowed in HTTP/1.1, due to the difficulty of providing consistent - updates and the problems arising from server, cache, or network - failure prior to write-back. - - - - - - -Fielding, et al. Standards Track [Page 98] - -RFC 2616 HTTP/1.1 June 1999 - - -13.12 Cache Replacement - - If a new cacheable (see sections 14.9.2, 13.2.5, 13.2.6 and 13.8) - response is received from a resource while any existing responses for - the same resource are cached, the cache SHOULD use the new response - to reply to the current request. 
It MAY insert it into cache storage - and MAY, if it meets all other requirements, use it to respond to any - future requests that would previously have caused the old response to - be returned. If it inserts the new response into cache storage the - rules in section 13.5.3 apply. - - Note: a new response that has an older Date header value than - existing cached responses is not cacheable. - -13.13 History Lists - - User agents often have history mechanisms, such as "Back" buttons and - history lists, which can be used to redisplay an entity retrieved - earlier in a session. - - History mechanisms and caches are different. In particular history - mechanisms SHOULD NOT try to show a semantically transparent view of - the current state of a resource. Rather, a history mechanism is meant - to show exactly what the user saw at the time when the resource was - retrieved. - - By default, an expiration time does not apply to history mechanisms. - If the entity is still in storage, a history mechanism SHOULD display - it even if the entity has expired, unless the user has specifically - configured the agent to refresh expired history documents. - - This is not to be construed to prohibit the history mechanism from - telling the user that a view might be stale. - - Note: if history list mechanisms unnecessarily prevent users from - viewing stale resources, this will tend to force service authors - to avoid using HTTP expiration controls and cache controls when - they would otherwise like to. Service authors may consider it - important that users not be presented with error messages or - warning messages when they use navigation controls (such as BACK) - to view previously fetched resources. Even though sometimes such - resources ought not to be cached, or ought to expire quickly, user - interface considerations may force service authors to resort to - other means of preventing caching (e.g. 
"once-only" URLs) in order - not to suffer the effects of improperly functioning history - mechanisms. - - - - - -Fielding, et al. Standards Track [Page 99] - -RFC 2616 HTTP/1.1 June 1999 - - -14 Header Field Definitions - - This section defines the syntax and semantics of all standard - HTTP/1.1 header fields. For entity-header fields, both sender and - recipient refer to either the client or the server, depending on who - sends and who receives the entity. - -14.1 Accept - - The Accept request-header field can be used to specify certain media - types which are acceptable for the response. Accept headers can be - used to indicate that the request is specifically limited to a small - set of desired types, as in the case of a request for an in-line - image. - - Accept = "Accept" ":" - #( media-range [ accept-params ] ) - - media-range = ( "*/*" - | ( type "/" "*" ) - | ( type "/" subtype ) - ) *( ";" parameter ) - accept-params = ";" "q" "=" qvalue *( accept-extension ) - accept-extension = ";" token [ "=" ( token | quoted-string ) ] - - The asterisk "*" character is used to group media types into ranges, - with "*/*" indicating all media types and "type/*" indicating all - subtypes of that type. The media-range MAY include media type - parameters that are applicable to that range. - - Each media-range MAY be followed by one or more accept-params, - beginning with the "q" parameter for indicating a relative quality - factor. The first "q" parameter (if any) separates the media-range - parameter(s) from the accept-params. Quality factors allow the user - or user agent to indicate the relative degree of preference for that - media-range, using the qvalue scale from 0 to 1 (section 3.9). The - default value is q=1. - - Note: Use of the "q" parameter name to separate media type - parameters from Accept extension parameters is due to historical - practice. 
Although this prevents any media type parameter named - "q" from being used with a media range, such an event is believed - to be unlikely given the lack of any "q" parameters in the IANA - media type registry and the rare usage of any media type - parameters in Accept. Future media types are discouraged from - registering any parameter named "q". - - - - - -Fielding, et al. Standards Track [Page 100] - -RFC 2616 HTTP/1.1 June 1999 - - - The example - - Accept: audio/*; q=0.2, audio/basic - - SHOULD be interpreted as "I prefer audio/basic, but send me any audio - type if it is the best available after an 80% mark-down in quality." - - If no Accept header field is present, then it is assumed that the - client accepts all media types. If an Accept header field is present, - and if the server cannot send a response which is acceptable - according to the combined Accept field value, then the server SHOULD - send a 406 (not acceptable) response. - - A more elaborate example is - - Accept: text/plain; q=0.5, text/html, - text/x-dvi; q=0.8, text/x-c - - Verbally, this would be interpreted as "text/html and text/x-c are - the preferred media types, but if they do not exist, then send the - text/x-dvi entity, and if that does not exist, send the text/plain - entity." - - Media ranges can be overridden by more specific media ranges or - specific media types. If more than one media range applies to a given - type, the most specific reference has precedence. For example, - - Accept: text/*, text/html, text/html;level=1, */* - - have the following precedence: - - 1) text/html;level=1 - 2) text/html - 3) text/* - 4) */* - - The media type quality factor associated with a given type is - determined by finding the media range with the highest precedence - which matches that type. 
For example, - - Accept: text/*;q=0.3, text/html;q=0.7, text/html;level=1, - text/html;level=2;q=0.4, */*;q=0.5 - - would cause the following values to be associated: - - text/html;level=1 = 1 - text/html = 0.7 - text/plain = 0.3 - - - -Fielding, et al. Standards Track [Page 101] - -RFC 2616 HTTP/1.1 June 1999 - - - image/jpeg = 0.5 - text/html;level=2 = 0.4 - text/html;level=3 = 0.7 - - Note: A user agent might be provided with a default set of quality - values for certain media ranges. However, unless the user agent is - a closed system which cannot interact with other rendering agents, - this default set ought to be configurable by the user. - -14.2 Accept-Charset - - The Accept-Charset request-header field can be used to indicate what - character sets are acceptable for the response. This field allows - clients capable of understanding more comprehensive or special- - purpose character sets to signal that capability to a server which is - capable of representing documents in those character sets. - - Accept-Charset = "Accept-Charset" ":" - 1#( ( charset | "*" )[ ";" "q" "=" qvalue ] ) - - - Character set values are described in section 3.4. Each charset MAY - be given an associated quality value which represents the user's - preference for that charset. The default value is q=1. An example is - - Accept-Charset: iso-8859-5, unicode-1-1;q=0.8 - - The special value "*", if present in the Accept-Charset field, - matches every character set (including ISO-8859-1) which is not - mentioned elsewhere in the Accept-Charset field. If no "*" is present - in an Accept-Charset field, then all character sets not explicitly - mentioned get a quality value of 0, except for ISO-8859-1, which gets - a quality value of 1 if not explicitly mentioned. - - If no Accept-Charset header is present, the default is that any - character set is acceptable. 
If an Accept-Charset header is present, - and if the server cannot send a response which is acceptable - according to the Accept-Charset header, then the server SHOULD send - an error response with the 406 (not acceptable) status code, though - the sending of an unacceptable response is also allowed. - -14.3 Accept-Encoding - - The Accept-Encoding request-header field is similar to Accept, but - restricts the content-codings (section 3.5) that are acceptable in - the response. - - Accept-Encoding = "Accept-Encoding" ":" - - - -Fielding, et al. Standards Track [Page 102] - -RFC 2616 HTTP/1.1 June 1999 - - - 1#( codings [ ";" "q" "=" qvalue ] ) - codings = ( content-coding | "*" ) - - [[ http://lists.w3.org/Archives/Public/ietf-http-wg/2005AprJun/0029.html ]] - [[ points out that the "1#" must be "#" to make the examples below and ]] - [[ the text of rule 4 correct. ]] - - Examples of its use are: - - Accept-Encoding: compress, gzip - Accept-Encoding: - Accept-Encoding: * - Accept-Encoding: compress;q=0.5, gzip;q=1.0 - Accept-Encoding: gzip;q=1.0, identity; q=0.5, *;q=0 - - A server tests whether a content-coding is acceptable, according to - an Accept-Encoding field, using these rules: - - 1. If the content-coding is one of the content-codings listed in - the Accept-Encoding field, then it is acceptable, unless it is - accompanied by a qvalue of 0. (As defined in section 3.9, a - qvalue of 0 means "not acceptable.") - - 2. The special "*" symbol in an Accept-Encoding field matches any - available content-coding not explicitly listed in the header - field. - - 3. If multiple content-codings are acceptable, then the acceptable - content-coding with the highest non-zero qvalue is preferred. - - 4. The "identity" content-coding is always acceptable, unless - specifically refused because the Accept-Encoding field includes - "identity;q=0", or because the field includes "*;q=0" and does - not explicitly include the "identity" content-coding. 
If the - Accept-Encoding field-value is empty, then only the "identity" - encoding is acceptable. - - If an Accept-Encoding field is present in a request, and if the - server cannot send a response which is acceptable according to the - Accept-Encoding header, then the server SHOULD send an error response - with the 406 (Not Acceptable) status code. - - If no Accept-Encoding field is present in a request, the server MAY - assume that the client will accept any content coding. In this case, - if "identity" is one of the available content-codings, then the - server SHOULD use the "identity" content-coding, unless it has - additional information that a different content-coding is meaningful - to the client. - - Note: If the request does not include an Accept-Encoding field, - and if the "identity" content-coding is unavailable, then - content-codings commonly understood by HTTP/1.0 clients (i.e., - - - -Fielding, et al. Standards Track [Page 103] - -RFC 2616 HTTP/1.1 June 1999 - - - "gzip" and "compress") are preferred; some older clients - improperly display messages sent with other content-codings. The - server might also make this decision based on information about - the particular user-agent or client. - - Note: Most HTTP/1.0 applications do not recognize or obey qvalues - associated with content-codings. This means that qvalues will not - work and are not permitted with x-gzip or x-compress. - -14.4 Accept-Language - - The Accept-Language request-header field is similar to Accept, but - restricts the set of natural languages that are preferred as a - response to the request. Language tags are defined in section 3.10. - - Accept-Language = "Accept-Language" ":" - 1#( language-range [ ";" "q" "=" qvalue ] ) - language-range = ( ( 1*8ALPHA *( "-" 1*8ALPHA ) ) | "*" ) - - Each language-range MAY be given an associated quality value which - represents an estimate of the user's preference for the languages - specified by that range. 
The quality value defaults to "q=1". For - example, - - Accept-Language: da, en-gb;q=0.8, en;q=0.7 - - would mean: "I prefer Danish, but will accept British English and - other types of English." A language-range matches a language-tag if - it exactly equals the tag, or if it exactly equals a prefix of the - tag such that the first tag character following the prefix is "-". - The special range "*", if present in the Accept-Language field, - matches every tag not matched by any other range present in the - Accept-Language field. - - Note: This use of a prefix matching rule does not imply that - language tags are assigned to languages in such a way that it is - always true that if a user understands a language with a certain - tag, then this user will also understand all languages with tags - for which this tag is a prefix. The prefix rule simply allows the - use of prefix tags if this is the case. - - The language quality factor assigned to a language-tag by the - Accept-Language field is the quality value of the longest language- - range in the field that matches the language-tag. If no language- - range in the field matches the tag, the language quality factor - assigned is 0. If no Accept-Language header is present in the - request, the server - - - - -Fielding, et al. Standards Track [Page 104] - -RFC 2616 HTTP/1.1 June 1999 - - - SHOULD assume that all languages are equally acceptable. If an - Accept-Language header is present, then all languages which are - assigned a quality factor greater than 0 are acceptable. - - It might be contrary to the privacy expectations of the user to send - an Accept-Language header with the complete linguistic preferences of - the user in every request. For a discussion of this issue, see - section 15.1.4. - - As intelligibility is highly dependent on the individual user, it is - recommended that client applications make the choice of linguistic - preference available to the user. 
If the choice is not made - available, then the Accept-Language header field MUST NOT be given in - the request. - - Note: When making the choice of linguistic preference available to - the user, we remind implementors of the fact that users are not - familiar with the details of language matching as described above, - and should provide appropriate guidance. As an example, users - might assume that on selecting "en-gb", they will be served any - kind of English document if British English is not available. A - user agent might suggest in such a case to add "en" to get the - best matching behavior. - -14.5 Accept-Ranges - - The Accept-Ranges response-header field allows the server to - indicate its acceptance of range requests for a resource: - - Accept-Ranges = "Accept-Ranges" ":" acceptable-ranges - acceptable-ranges = 1#range-unit | "none" - - Origin servers that accept byte-range requests MAY send - - Accept-Ranges: bytes - - but are not required to do so. Clients MAY generate byte-range - requests without having received this header for the resource - involved. Range units are defined in section 3.12. - - Servers that do not accept any kind of range request for a - resource MAY send - - Accept-Ranges: none - - to advise the client not to attempt a range request. - - - - - -Fielding, et al. Standards Track [Page 105] - -RFC 2616 HTTP/1.1 June 1999 - - -14.6 Age - - The Age response-header field conveys the sender's estimate of the - amount of time since the response (or its revalidation) was - generated at the origin server. A cached response is "fresh" if - its age does not exceed its freshness lifetime. Age values are - calculated as specified in section 13.2.3. - - Age = "Age" ":" age-value - age-value = delta-seconds - - Age values are non-negative decimal integers, representing time in - seconds. 
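As a rough illustration of the Age definition above, the following Python sketch parses an age-value (delta-seconds) and applies the freshness test "age does not exceed freshness lifetime". The names `parse_age`, `is_fresh` and `MAX_AGE_VALUE` are hypothetical and exist only for this example; the saturation at 2^31 follows the overflow value that section 14.6 names, and nothing here is taken from libsoup.

```python
# Hypothetical helpers, written only to illustrate section 14.6.
MAX_AGE_VALUE = 2**31  # value section 14.6 gives for overflowed ages


def parse_age(value: str) -> int:
    """Parse an Age field value (non-negative decimal seconds).

    Values above MAX_AGE_VALUE are saturated to MAX_AGE_VALUE.
    """
    text = value.strip()
    if not (text.isascii() and text.isdigit()):
        raise ValueError(f"not a delta-seconds value: {value!r}")
    return min(int(text), MAX_AGE_VALUE)


def is_fresh(age_seconds: int, freshness_lifetime_seconds: int) -> bool:
    """A response is fresh if its age does not exceed its freshness lifetime."""
    return age_seconds <= freshness_lifetime_seconds
```

Python integers do not overflow, so the saturation step is only there to mirror the rule for caches that use fixed-width arithmetic.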
- - If a cache receives a value larger than the largest positive - integer it can represent, or if any of its age calculations - overflows, it MUST transmit an Age header with a value of - 2147483648 (2^31). An HTTP/1.1 server that includes a cache MUST - include an Age header field in every response generated from its - own cache. Caches SHOULD use an arithmetic type of at least 31 - bits of range. - -14.7 Allow - - The Allow entity-header field lists the set of methods supported - by the resource identified by the Request-URI. The purpose of this - field is strictly to inform the recipient of valid methods - associated with the resource. An Allow header field MUST be - present in a 405 (Method Not Allowed) response. - - Allow = "Allow" ":" #Method - - Example of use: - - Allow: GET, HEAD, PUT - - This field cannot prevent a client from trying other methods. - However, the indications given by the Allow header field value - SHOULD be followed. The actual set of allowed methods is defined - by the origin server at the time of each request. - - The Allow header field MAY be provided with a PUT request to - recommend the methods to be supported by the new or modified - resource. The server is not required to support these methods and - SHOULD include an Allow header in the response giving the actual - supported methods. - - - - - -Fielding, et al. Standards Track [Page 106] - -RFC 2616 HTTP/1.1 June 1999 - - - A proxy MUST NOT modify the Allow header field even if it does not - understand all the methods specified, since the user agent might - have other means of communicating with the origin server. - -14.8 Authorization - - A user agent that wishes to authenticate itself with a server-- - usually, but not necessarily, after receiving a 401 response--does - so by including an Authorization request-header field with the - request. 
The Authorization field value consists of credentials - containing the authentication information of the user agent for - the realm of the resource being requested. - - Authorization = "Authorization" ":" credentials - - HTTP access authentication is described in "HTTP Authentication: - Basic and Digest Access Authentication" [43]. If a request is - authenticated and a realm specified, the same credentials SHOULD - be valid for all other requests within this realm (assuming that - the authentication scheme itself does not require otherwise, such - as credentials that vary according to a challenge value or using - synchronized clocks). - - When a shared cache (see section 13.7) receives a request - containing an Authorization field, it MUST NOT return the - corresponding response as a reply to any other request, unless one - of the following specific exceptions holds: - - 1. If the response includes the "s-maxage" cache-control - directive, the cache MAY use that response in replying to a - subsequent request. But (if the specified maximum age has - passed) a proxy cache MUST first revalidate it with the origin - server, using the request-headers from the new request to allow - the origin server to authenticate the new request. (This is the - defined behavior for s-maxage.) If the response includes "s- - maxage=0", the proxy MUST always revalidate it before re-using - it. - - 2. If the response includes the "must-revalidate" cache-control - directive, the cache MAY use that response in replying to a - subsequent request. But if the response is stale, all caches - MUST first revalidate it with the origin server, using the - request-headers from the new request to allow the origin server - to authenticate the new request. - - 3. If the response includes the "public" cache-control directive, - it MAY be returned in reply to any subsequent request. - - - - -Fielding, et al. 
Standards Track [Page 107] - -RFC 2616 HTTP/1.1 June 1999 - - -14.9 Cache-Control - - The Cache-Control general-header field is used to specify directives - that MUST be obeyed by all caching mechanisms along the - request/response chain. The directives specify behavior intended to - prevent caches from adversely interfering with the request or - response. These directives typically override the default caching - algorithms. Cache directives are unidirectional in that the presence - of a directive in a request does not imply that the same directive is - to be given in the response. - - Note that HTTP/1.0 caches might not implement Cache-Control and - might only implement Pragma: no-cache (see section 14.32). - - Cache directives MUST be passed through by a proxy or gateway - application, regardless of their significance to that application, - since the directives might be applicable to all recipients along the - request/response chain. It is not possible to specify a cache- - directive for a specific cache. 
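A minimal sketch, in Python, of how a recipient might split a Cache-Control field value into individual directives according to the cache-directive grammar of this section. The function name `parse_cache_control` is hypothetical. Lower-casing directive names is an assumption of this sketch, and backslash escapes inside quoted strings are deliberately not handled.

```python
from typing import Dict, Optional


def parse_cache_control(value: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into {directive-name: argument-or-None}."""
    # First pass: split on commas that are not inside a quoted string,
    # so that private="a, b" stays in one piece.
    parts, current, in_quotes = [], [], False
    for ch in value:
        if ch == '"':
            in_quotes = not in_quotes
            current.append(ch)
        elif ch == "," and not in_quotes:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))

    # Second pass: separate each directive name from its optional argument.
    directives: Dict[str, Optional[str]] = {}
    for part in parts:
        part = part.strip()
        if not part:
            continue  # tolerate empty list elements
        name, sep, arg = part.partition("=")
        name = name.strip().lower()
        if not sep:
            directives[name] = None
            continue
        arg = arg.strip()
        if len(arg) >= 2 and arg[0] == '"' and arg[-1] == '"':
            arg = arg[1:-1]  # drop the surrounding quotes
        directives[name] = arg
    return directives
```

A proxy that parses directives this way still has to forward the original field unchanged, since the text above requires cache directives to be passed through regardless of their significance to the forwarding application.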
- - Cache-Control = "Cache-Control" ":" 1#cache-directive - - cache-directive = cache-request-directive - | cache-response-directive - - cache-request-directive = - "no-cache" ; Section 14.9.1 - | "no-store" ; Section 14.9.2 - | "max-age" "=" delta-seconds ; Section 14.9.3, 14.9.4 - | "max-stale" [ "=" delta-seconds ] ; Section 14.9.3 - | "min-fresh" "=" delta-seconds ; Section 14.9.3 - | "no-transform" ; Section 14.9.5 - | "only-if-cached" ; Section 14.9.4 - | cache-extension ; Section 14.9.6 - - cache-response-directive = - "public" ; Section 14.9.1 - | "private" [ "=" <"> 1#field-name <"> ] ; Section 14.9.1 - | "no-cache" [ "=" <"> 1#field-name <"> ]; Section 14.9.1 - | "no-store" ; Section 14.9.2 - | "no-transform" ; Section 14.9.5 - | "must-revalidate" ; Section 14.9.4 - | "proxy-revalidate" ; Section 14.9.4 - | "max-age" "=" delta-seconds ; Section 14.9.3 - | "s-maxage" "=" delta-seconds ; Section 14.9.3 - | cache-extension ; Section 14.9.6 - - cache-extension = token [ "=" ( token | quoted-string ) ] - - - -Fielding, et al. Standards Track [Page 108] - -RFC 2616 HTTP/1.1 June 1999 - - - When a directive appears without any 1#field-name parameter, the - directive applies to the entire request or response. When such a - directive appears with a 1#field-name parameter, it applies only to - the named field or fields, and not to the rest of the request or - response. This mechanism supports extensibility; implementations of - future versions of the HTTP protocol might apply these directives to - header fields not defined in HTTP/1.1. - - The cache-control directives can be broken down into these general - categories: - - - Restrictions on what are cacheable; these may only be imposed by - the origin server. - - - Restrictions on what may be stored by a cache; these may be - imposed by either the origin server or the user agent. - - - Modifications of the basic expiration mechanism; these may be - imposed by either the origin server or the user agent. 
- - - Controls over cache revalidation and reload; these may only be - imposed by a user agent. - - - Control over transformation of entities. - - - Extensions to the caching system. - -14.9.1 What is Cacheable - - By default, a response is cacheable if the requirements of the - request method, request header fields, and the response status - indicate that it is cacheable. Section 13.4 summarizes these defaults - for cacheability. The following Cache-Control response directives - allow an origin server to override the default cacheability of a - response: - - public - Indicates that the response MAY be cached by any cache, even if it - would normally be non-cacheable or cacheable only within a non- - shared cache. (See also Authorization, section 14.8, for - additional details.) - - private - Indicates that all or part of the response message is intended for - a single user and MUST NOT be cached by a shared cache. This - allows an origin server to state that the specified parts of the - - - - - -Fielding, et al. Standards Track [Page 109] - -RFC 2616 HTTP/1.1 June 1999 - - - response are intended for only one user and are not a valid - response for requests by other users. A private (non-shared) cache - MAY cache the response. - - Note: This usage of the word private only controls where the - response may be cached, and cannot ensure the privacy of the - message content. - - no-cache - If the no-cache directive does not specify a field-name, then a - cache MUST NOT use the response to satisfy a subsequent request - without successful revalidation with the origin server. This - allows an origin server to prevent caching even by caches that - have been configured to return stale responses to client requests. - - If the no-cache directive does specify one or more field-names, - then a cache MAY use the response to satisfy a subsequent request, - subject to any other restrictions on caching. 
However, the - specified field-name(s) MUST NOT be sent in the response to a - subsequent request without successful revalidation with the origin - server. This allows an origin server to prevent the re-use of - certain header fields in a response, while still allowing caching - of the rest of the response. - - Note: Most HTTP/1.0 caches will not recognize or obey this - directive. - -14.9.2 What May be Stored by Caches - - no-store - The purpose of the no-store directive is to prevent the - inadvertent release or retention of sensitive information (for - example, on backup tapes). The no-store directive applies to the - entire message, and MAY be sent either in a response or in a - request. If sent in a request, a cache MUST NOT store any part of - either this request or any response to it. If sent in a response, - a cache MUST NOT store any part of either this response or the - request that elicited it. This directive applies to both non- - shared and shared caches. "MUST NOT store" in this context means - that the cache MUST NOT intentionally store the information in - non-volatile storage, and MUST make a best-effort attempt to - remove the information from volatile storage as promptly as - possible after forwarding it. - - Even when this directive is associated with a response, users - might explicitly store such a response outside of the caching - system (e.g., with a "Save As" dialog). History buffers MAY store - such responses as part of their normal operation. - - - -Fielding, et al. Standards Track [Page 110] - -RFC 2616 HTTP/1.1 June 1999 - - - The purpose of this directive is to meet the stated requirements - of certain users and service authors who are concerned about - accidental releases of information via unanticipated accesses to - cache data structures. While the use of this directive might - improve privacy in some cases, we caution that it is NOT in any - way a reliable or sufficient mechanism for ensuring privacy. 
In - particular, malicious or compromised caches might not recognize or - obey this directive, and communications networks might be - vulnerable to eavesdropping. - -14.9.3 Modifications of the Basic Expiration Mechanism - - The expiration time of an entity MAY be specified by the origin - server using the Expires header (see section 14.21). Alternatively, - it MAY be specified using the max-age directive in a response. When - the max-age cache-control directive is present in a cached response, - the response is stale if its current age is greater than the age - value given (in seconds) at the time of a new request for that - resource. The max-age directive on a response implies that the - response is cacheable (i.e., "public") unless some other, more - restrictive cache directive is also present. - - If a response includes both an Expires header and a max-age - directive, the max-age directive overrides the Expires header, even - if the Expires header is more restrictive. This rule allows an origin - server to provide, for a given response, a longer expiration time to - an HTTP/1.1 (or later) cache than to an HTTP/1.0 cache. This might be - useful if certain HTTP/1.0 caches improperly calculate ages or - expiration times, perhaps due to desynchronized clocks. - - Many HTTP/1.0 cache implementations will treat an Expires value that - is less than or equal to the response Date value as being equivalent - to the Cache-Control response directive "no-cache". If an HTTP/1.1 - cache receives such a response, and the response does not include a - Cache-Control header field, it SHOULD consider the response to be - non-cacheable in order to retain compatibility with HTTP/1.0 servers. - - Note: An origin server might wish to use a relatively new HTTP - cache control feature, such as the "private" directive, on a - network including older caches that do not understand that - feature. 
The origin server will need to combine the new feature - with an Expires field whose value is less than or equal to the - Date value. This will prevent older caches from improperly - caching the response. - - - - - - - -Fielding, et al. Standards Track [Page 111] - -RFC 2616 HTTP/1.1 June 1999 - - - s-maxage - If a response includes an s-maxage directive, then for a shared - cache (but not for a private cache), the maximum age specified by - this directive overrides the maximum age specified by either the - max-age directive or the Expires header. The s-maxage directive - also implies the semantics of the proxy-revalidate directive (see - section 14.9.4), i.e., that the shared cache must not use the - entry after it becomes stale to respond to a subsequent request - without first revalidating it with the origin server. The s- - maxage directive is always ignored by a private cache. - - Note that most older caches, not compliant with this specification, - do not implement any cache-control directives. An origin server - wishing to use a cache-control directive that restricts, but does not - prevent, caching by an HTTP/1.1-compliant cache MAY exploit the - requirement that the max-age directive overrides the Expires header, - and the fact that pre-HTTP/1.1-compliant caches do not observe the - max-age directive. - - Other directives allow a user agent to modify the basic expiration - mechanism. These directives MAY be specified on a request: - - max-age - Indicates that the client is willing to accept a response whose - age is no greater than the specified time in seconds. Unless max- - stale directive is also included, the client is not willing to - accept a stale response. - - min-fresh - Indicates that the client is willing to accept a response whose - freshness lifetime is no less than its current age plus the - specified time in seconds. That is, the client wants a response - that will still be fresh for at least the specified number of - seconds. 
- - max-stale - Indicates that the client is willing to accept a response that has - exceeded its expiration time. If max-stale is assigned a value, - then the client is willing to accept a response that has exceeded - its expiration time by no more than the specified number of - seconds. If no value is assigned to max-stale, then the client is - willing to accept a stale response of any age. - - If a cache returns a stale response, either because of a max-stale - directive on a request, or because the cache is configured to - override the expiration time of a response, the cache MUST attach a - Warning header to the stale response, using Warning 110 (Response is - stale). - - - -Fielding, et al. Standards Track [Page 112] - -RFC 2616 HTTP/1.1 June 1999 - - - A cache MAY be configured to return stale responses without - validation, but only if this does not conflict with any "MUST"-level - requirements concerning cache validation (e.g., a "must-revalidate" - cache-control directive). - - If both the new request and the cached entry include "max-age" - directives, then the lesser of the two values is used for determining - the freshness of the cached entry for that request. - -14.9.4 Cache Revalidation and Reload Controls - - Sometimes a user agent might want or need to insist that a cache - revalidate its cache entry with the origin server (and not just with - the next cache along the path to the origin server), or to reload its - cache entry from the origin server. End-to-end revalidation might be - necessary if either the cache or the origin server has overestimated - the expiration time of the cached response. End-to-end reload may be - necessary if the cache entry has become corrupted for some reason. 
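The request directives of section 14.9.3 (max-age, min-fresh, max-stale) can be sketched as a single acceptance check. This is an illustrative reading, not a normative algorithm: the function name and parameters are hypothetical, ages and lifetimes are assumed to be already computed in seconds, and a request max-age is treated as a hard cap on age even when max-stale is present, which is a conservative assumption of this sketch.

```python
import math
from typing import Optional


def request_accepts(
    current_age: float,
    freshness_lifetime: float,
    max_age: Optional[float] = None,
    min_fresh: Optional[float] = None,
    max_stale: Optional[float] = None,
) -> bool:
    """Return True if a cached response satisfies the request directives.

    max_stale is None when the directive is absent, and math.inf when it
    is present without a value (any staleness accepted).
    """
    # max-age: the response's age must be no greater than the given time.
    if max_age is not None and current_age > max_age:
        return False
    # min-fresh: freshness lifetime must be at least current age plus
    # the given time, so the response stays fresh that much longer.
    if min_fresh is not None and freshness_lifetime < current_age + min_fresh:
        return False
    # Without max-stale, a stale response is not acceptable; with it,
    # staleness is limited to the given number of seconds.
    staleness = current_age - freshness_lifetime
    if staleness > 0:
        return max_stale is not None and staleness <= max_stale
    return True
```

When this check passes only because of max-stale, the cache would still need to attach Warning 110 (Response is stale) to the response, as required in section 14.9.3.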
- - End-to-end revalidation may be requested either when the client does - not have its own local cached copy, in which case we call it - "unspecified end-to-end revalidation", or when the client does have a - local cached copy, in which case we call it "specific end-to-end - revalidation." - - The client can specify these three kinds of action using Cache- - Control request directives: - - End-to-end reload - The request includes a "no-cache" cache-control directive or, for - compatibility with HTTP/1.0 clients, "Pragma: no-cache". Field - names MUST NOT be included with the no-cache directive in a - request. The server MUST NOT use a cached copy when responding to - such a request. - - Specific end-to-end revalidation - The request includes a "max-age=0" cache-control directive, which - forces each cache along the path to the origin server to - revalidate its own entry, if any, with the next cache or server. - The initial request includes a cache-validating conditional with - the client's current validator. - - Unspecified end-to-end revalidation - The request includes a "max-age=0" cache-control directive, which - forces each cache along the path to the origin server to - revalidate its own entry, if any, with the next cache or server. - The initial request does not include a cache-validating - - - - -Fielding, et al. Standards Track [Page 113] - -RFC 2616 HTTP/1.1 June 1999 - - - conditional; the first cache along the path (if any) that holds a - cache entry for this resource includes a cache-validating - conditional with its current validator. - - max-age - When an intermediate cache is forced, by means of a max-age=0 - directive, to revalidate its own cache entry, and the client has - supplied its own validator in the request, the supplied validator - might differ from the validator currently stored with the cache - entry. In this case, the cache MAY use either validator in making - its own request without affecting semantic transparency. 
- - However, the choice of validator might affect performance. The - best approach is for the intermediate cache to use its own - validator when making its request. If the server replies with 304 - (Not Modified), then the cache can return its now validated copy - to the client with a 200 (OK) response. If the server replies with - a new entity and cache validator, however, the intermediate cache - can compare the returned validator with the one provided in the - client's request, using the strong comparison function. If the - client's validator is equal to the origin server's, then the - intermediate cache simply returns 304 (Not Modified). Otherwise, - it returns the new entity with a 200 (OK) response. - - If a request includes the no-cache directive, it SHOULD NOT - include min-fresh, max-stale, or max-age. - - only-if-cached - In some cases, such as times of extremely poor network - connectivity, a client may want a cache to return only those - responses that it currently has stored, and not to reload or - revalidate with the origin server. To do this, the client may - include the only-if-cached directive in a request. If it receives - this directive, a cache SHOULD either respond using a cached entry - that is consistent with the other constraints of the request, or - respond with a 504 (Gateway Timeout) status. However, if a group - of caches is being operated as a unified system with good internal - connectivity, such a request MAY be forwarded within that group of - caches. - - must-revalidate - Because a cache MAY be configured to ignore a server's specified - expiration time, and because a client request MAY include a max- - stale directive (which has a similar effect), the protocol also - includes a mechanism for the origin server to require revalidation - of a cache entry on any subsequent use. 
When the must-revalidate - directive is present in a response received by a cache, that cache - MUST NOT use the entry after it becomes stale to respond to a - - - -Fielding, et al. Standards Track [Page 114] - -RFC 2616 HTTP/1.1 June 1999 - - - subsequent request without first revalidating it with the origin - server. (I.e., the cache MUST do an end-to-end revalidation every - time, if, based solely on the origin server's Expires or max-age - value, the cached response is stale.) - - The must-revalidate directive is necessary to support reliable - operation for certain protocol features. In all circumstances an - HTTP/1.1 cache MUST obey the must-revalidate directive; in - particular, if the cache cannot reach the origin server for any - reason, it MUST generate a 504 (Gateway Timeout) response. - - Servers SHOULD send the must-revalidate directive if and only if - failure to revalidate a request on the entity could result in - incorrect operation, such as a silently unexecuted financial - transaction. Recipients MUST NOT take any automated action that - violates this directive, and MUST NOT automatically provide an - unvalidated copy of the entity if revalidation fails. - - Although this is not recommended, user agents operating under - severe connectivity constraints MAY violate this directive but, if - so, MUST explicitly warn the user that an unvalidated response has - been provided. The warning MUST be provided on each unvalidated - access, and SHOULD require explicit user confirmation. - - proxy-revalidate - The proxy-revalidate directive has the same meaning as the must- - revalidate directive, except that it does not apply to non-shared - user agent caches. 
It can be used on a response to an - authenticated request to permit the user's cache to store and - later return the response without needing to revalidate it (since - it has already been authenticated once by that user), while still - requiring proxies that service many users to revalidate each time - (in order to make sure that each user has been authenticated). - Note that such authenticated responses also need the public cache - control directive in order to allow them to be cached at all. - -14.9.5 No-Transform Directive - - no-transform - Implementors of intermediate caches (proxies) have found it useful - to convert the media type of certain entity bodies. A non- - transparent proxy might, for example, convert between image - formats in order to save cache space or to reduce the amount of - traffic on a slow link. - - Serious operational problems occur, however, when these - transformations are applied to entity bodies intended for certain - kinds of applications. For example, applications for medical - - - -Fielding, et al. Standards Track [Page 115] - -RFC 2616 HTTP/1.1 June 1999 - - - imaging, scientific data analysis and those using end-to-end - authentication, all depend on receiving an entity body that is bit - for bit identical to the original entity-body. - - Therefore, if a message includes the no-transform directive, an - intermediate cache or proxy MUST NOT change those headers that are - listed in section 13.5.2 as being subject to the no-transform - directive. This implies that the cache or proxy MUST NOT change - any aspect of the entity-body that is specified by these headers, - including the value of the entity-body itself. - -14.9.6 Cache Control Extensions - - The Cache-Control header field can be extended through the use of one - or more cache-extension tokens, each with an optional assigned value. 
- Informational extensions (those which do not require a change in - cache behavior) MAY be added without changing the semantics of other - directives. Behavioral extensions are designed to work by acting as - modifiers to the existing base of cache directives. Both the new - directive and the standard directive are supplied, such that - applications which do not understand the new directive will default - to the behavior specified by the standard directive, and those that - understand the new directive will recognize it as modifying the - requirements associated with the standard directive. In this way, - extensions to the cache-control directives can be made without - requiring changes to the base protocol. - - This extension mechanism depends on an HTTP cache obeying all of the - cache-control directives defined for its native HTTP-version, obeying - certain extensions, and ignoring all directives that it does not - understand. - - For example, consider a hypothetical new response directive called - community which acts as a modifier to the private directive. We - define this new directive to mean that, in addition to any non-shared - cache, any cache which is shared only by members of the community - named within its value may cache the response. An origin server - wishing to allow the UCI community to use an otherwise private - response in their shared cache(s) could do so by including - - Cache-Control: private, community="UCI" - - A cache seeing this header field will act correctly even if the cache - does not understand the community cache-extension, since it will also - see and understand the private directive and thus default to the safe - behavior. - - - - - -Fielding, et al. 
Standards Track [Page 116] - -RFC 2616 HTTP/1.1 June 1999 - - - Unrecognized cache-directives MUST be ignored; it is assumed that any - cache-directive likely to be unrecognized by an HTTP/1.1 cache will - be combined with standard directives (or the response's default - cacheability) such that the cache behavior will remain minimally - correct even if the cache does not understand the extension(s). - -14.10 Connection - - The Connection general-header field allows the sender to specify - options that are desired for that particular connection and MUST NOT - be communicated by proxies over further connections. - - The Connection header has the following grammar: - - Connection = "Connection" ":" 1#(connection-token) - connection-token = token - - HTTP/1.1 proxies MUST parse the Connection header field before a - message is forwarded and, for each connection-token in this field, - remove any header field(s) from the message with the same name as the - connection-token. Connection options are signaled by the presence of - a connection-token in the Connection header field, not by any - corresponding additional header field(s), since the additional header - field may not be sent if there are no parameters associated with that - connection option. - - Message headers listed in the Connection header MUST NOT include - end-to-end headers, such as Cache-Control. - - HTTP/1.1 defines the "close" connection option for the sender to - signal that the connection will be closed after completion of the - response. For example, - - Connection: close - - in either the request or the response header fields indicates that - the connection SHOULD NOT be considered `persistent' (section 8.1) - after the current request/response is complete. - - HTTP/1.1 applications that do not support persistent connections MUST - include the "close" connection option in every message. 
- -[[ Should say: ]] -[[ An HTTP/1.1 client that does not support persistent connections ]] -[[ MUST include the "close" connection option in every request message. ]] -[[ ]] -[[ An HTTP/1.1 server that does not support persistent connections ]] -[[ MUST include the "close" connection option in every response ]] -[[ message that does not have a 1xx (informational) status code. ]] - - A system receiving an HTTP/1.0 (or lower-version) message that - includes a Connection header MUST, for each connection-token in this - field, remove and ignore any header field(s) from the message with - the same name as the connection-token. This protects against mistaken - forwarding of such header fields by pre-HTTP/1.1 proxies. See section - 19.6.2. - - - -Fielding, et al. Standards Track [Page 117] - -RFC 2616 HTTP/1.1 June 1999 - - -14.11 Content-Encoding - - The Content-Encoding entity-header field is used as a modifier to the - media-type. When present, its value indicates what additional content - codings have been applied to the entity-body, and thus what decoding - mechanisms must be applied in order to obtain the media-type - referenced by the Content-Type header field. Content-Encoding is - primarily used to allow a document to be compressed without losing - the identity of its underlying media type. - - Content-Encoding = "Content-Encoding" ":" 1#content-coding - - Content codings are defined in section 3.5. An example of its use is - - Content-Encoding: gzip - - The content-coding is a characteristic of the entity identified by - the Request-URI. Typically, the entity-body is stored with this - encoding and is only decoded before rendering or analogous usage. - However, a non-transparent proxy MAY modify the content-coding if the - new coding is known to be acceptable to the recipient, unless the - "no-transform" cache-control directive is present in the message. 
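As a rough sketch of what "decoding mechanisms must be applied" means for a recipient, the function below undoes the listed content codings using Python's standard `gzip` and `zlib` modules. The function name `decode_entity_body` is hypothetical, only the gzip, deflate, and identity codings are handled, and codings are undone in reverse of the listed order on the assumption that the header lists them in the order they were applied.

```python
import gzip
import zlib


def decode_entity_body(body: bytes, content_encoding: str) -> bytes:
    """Undo the content codings named in a Content-Encoding field value.

    Hypothetical helper for illustration; unsupported codings raise
    ValueError rather than being guessed at.
    """
    codings = [c.strip().lower() for c in content_encoding.split(",") if c.strip()]
    # Codings are listed in the order applied, so decode last-applied first.
    for coding in reversed(codings):
        if coding in ("gzip", "x-gzip"):
            body = gzip.decompress(body)
        elif coding == "deflate":
            # HTTP "deflate" is the zlib format wrapped around a deflate stream.
            body = zlib.decompress(body)
        elif coding == "identity":
            continue
        else:
            raise ValueError(f"unsupported content-coding: {coding}")
    return body
```

For a body sent with `Content-Encoding: gzip`, calling `decode_entity_body(raw, "gzip")` yields the octets of the media type named by Content-Type.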
- - If the content-coding of an entity is not "identity", then the - response MUST include a Content-Encoding entity-header (section - 14.11) that lists the non-identity content-coding(s) used. - - If the content-coding of an entity in a request message is not - acceptable to the origin server, the server SHOULD respond with a - status code of 415 (Unsupported Media Type). - - If multiple encodings have been applied to an entity, the content - codings MUST be listed in the order in which they were applied. - Additional information about the encoding parameters MAY be provided - by other entity-header fields not defined by this specification. - -14.12 Content-Language - - The Content-Language entity-header field describes the natural - language(s) of the intended audience for the enclosed entity. Note - that this might not be equivalent to all the languages used within - the entity-body. - - Content-Language = "Content-Language" ":" 1#language-tag - - - - - - - -Fielding, et al. Standards Track [Page 118] - -RFC 2616 HTTP/1.1 June 1999 - - - Language tags are defined in section 3.10. The primary purpose of - Content-Language is to allow a user to identify and differentiate - entities according to the user's own preferred language. Thus, if the - body content is intended only for a Danish-literate audience, the - appropriate field is - - Content-Language: da - - If no Content-Language is specified, the default is that the content - is intended for all language audiences. This might mean that the - sender does not consider it to be specific to any natural language, - or that the sender does not know for which language it is intended. - - Multiple languages MAY be listed for content that is intended for - multiple audiences. 
For example, a rendition of the "Treaty of - Waitangi," presented simultaneously in the original Maori and English - versions, would call for - - Content-Language: mi, en - - However, just because multiple languages are present within an entity - does not mean that it is intended for multiple linguistic audiences. - An example would be a beginner's language primer, such as "A First - Lesson in Latin," which is clearly intended to be used by an - English-literate audience. In this case, the Content-Language would - properly only include "en". - - Content-Language MAY be applied to any media type -- it is not - limited to textual documents. - -14.13 Content-Length - - The Content-Length entity-header field indicates the size of the - entity-body, in decimal number of OCTETs, sent to the recipient or, - in the case of the HEAD method, the size of the entity-body that - would have been sent had the request been a GET. - - Content-Length = "Content-Length" ":" 1*DIGIT - - An example is - - Content-Length: 3495 - - Applications SHOULD use this field to indicate the transfer-length of - the message-body, unless this is prohibited by the rules in section - 4.4. - - - - - -Fielding, et al. Standards Track [Page 119] - -RFC 2616 HTTP/1.1 June 1999 - - - Any Content-Length greater than or equal to zero is a valid value. - Section 4.4 describes how to determine the length of a message-body - if a Content-Length is not given. - - Note that the meaning of this field is significantly different from - the corresponding definition in MIME, where it is an optional field - used within the "message/external-body" content-type. In HTTP, it - SHOULD be sent whenever the message's length can be determined prior - to being transferred, unless this is prohibited by the rules in - section 4.4. 
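The Content-Length grammar above (`1*DIGIT`, counting OCTETs rather than characters) can be illustrated with two small helpers. This is a sketch under stated assumptions: the names `parse_content_length` and `content_length_header` are hypothetical, and the transfer-length rules of section 4.4 are not modelled.

```python
import re

# 1*DIGIT: one or more ASCII digits, nothing else (no sign, no whitespace inside).
_DIGITS = re.compile(r"[0-9]+")


def parse_content_length(value: str) -> int:
    """Parse a Content-Length field value, rejecting anything but 1*DIGIT."""
    value = value.strip()
    if not _DIGITS.fullmatch(value):
        raise ValueError(f"invalid Content-Length: {value!r}")
    return int(value)


def content_length_header(body: bytes) -> str:
    """Build the header line for an entity-body; the size is counted in octets."""
    return f"Content-Length: {len(body)}"
```

Because the length is measured in octets, a text body must be encoded before it is counted: `content_length_header("héllo".encode("utf-8"))` reports 6, not 5.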
- -14.14 Content-Location - - The Content-Location entity-header field MAY be used to supply the - resource location for the entity enclosed in the message when that - entity is accessible from a location separate from the requested - resource's URI. A server SHOULD provide a Content-Location for the - variant corresponding to the response entity; especially in the case - where a resource has multiple entities associated with it, and those - entities actually have separate locations by which they might be - individually accessed, the server SHOULD provide a Content-Location - for the particular variant which is returned. - - Content-Location = "Content-Location" ":" - ( absoluteURI | relativeURI ) - - The value of Content-Location also defines the base URI for the - entity. - - The Content-Location value is not a replacement for the original - requested URI; it is only a statement of the location of the resource - corresponding to this particular entity at the time of the request. - Future requests MAY specify the Content-Location URI as the request- - URI if the desire is to identify the source of that particular - entity. - - A cache cannot assume that an entity with a Content-Location - different from the URI used to retrieve it can be used to respond to - later requests on that Content-Location URI. However, the Content- - Location can be used to differentiate between multiple entities - retrieved from a single requested resource, as described in section - 13.6. - - If the Content-Location is a relative URI, the relative URI is - interpreted relative to the Request-URI. - - The meaning of the Content-Location header in PUT or POST requests is - undefined; servers are free to ignore it in those cases. - - - -Fielding, et al. 
Standards Track [Page 120] - -RFC 2616 HTTP/1.1 June 1999 - - -14.15 Content-MD5 - - The Content-MD5 entity-header field, as defined in RFC 1864 [23], is - an MD5 digest of the entity-body for the purpose of providing an - end-to-end message integrity check (MIC) of the entity-body. (Note: a - MIC is good for detecting accidental modification of the entity-body - in transit, but is not proof against malicious attacks.) - - Content-MD5 = "Content-MD5" ":" md5-digest - md5-digest = <base64 of 128 bit MD5 digest as per RFC 1864> - - The Content-MD5 header field MAY be generated by an origin server or - client to function as an integrity check of the entity-body. Only - origin servers or clients MAY generate the Content-MD5 header field; - proxies and gateways MUST NOT generate it, as this would defeat its - value as an end-to-end integrity check. Any recipient of the entity- - body, including gateways and proxies, MAY check that the digest value - in this header field matches that of the entity-body as received. - - The MD5 digest is computed based on the content of the entity-body, - including any content-coding that has been applied, but not including - any transfer-encoding applied to the message-body. If the message is - received with a transfer-encoding, that encoding MUST be removed - prior to checking the Content-MD5 value against the received entity. - - This has the result that the digest is computed on the octets of the - entity-body exactly as, and in the order that, they would be sent if - no transfer-encoding were being applied. - - HTTP extends RFC 1864 to permit the digest to be computed for MIME - composite media-types (e.g., multipart/* and message/rfc822), but - this does not change how the digest is computed as defined in the - preceding paragraph. - - There are several consequences of this. 
The entity-body for composite - types MAY contain many body-parts, each with its own MIME and HTTP - headers (including Content-MD5, Content-Transfer-Encoding, and - Content-Encoding headers). If a body-part has a Content-Transfer- - Encoding or Content-Encoding header, it is assumed that the content - of the body-part has had the encoding applied, and the body-part is - included in the Content-MD5 digest as is -- i.e., after the - application. The Transfer-Encoding header field is not allowed within - body-parts. - - Conversion of all line breaks to CRLF MUST NOT be done before - computing or checking the digest: the line break convention used in - the text actually transmitted MUST be left unaltered when computing - the digest. - - - -Fielding, et al. Standards Track [Page 121] - -RFC 2616 HTTP/1.1 June 1999 - - - Note: while the definition of Content-MD5 is exactly the same for - HTTP as in RFC 1864 for MIME entity-bodies, there are several ways - in which the application of Content-MD5 to HTTP entity-bodies - differs from its application to MIME entity-bodies. One is that - HTTP, unlike MIME, does not use Content-Transfer-Encoding, and - does use Transfer-Encoding and Content-Encoding. Another is that - HTTP more frequently uses binary content types than MIME, so it is - worth noting that, in such cases, the byte order used to compute - the digest is the transmission byte order defined for the type. - Lastly, HTTP allows transmission of text types with any of several - line break conventions and not just the canonical form using CRLF. - -14.16 Content-Range - - The Content-Range entity-header is sent with a partial entity-body to - specify where in the full entity-body the partial body should be - applied. Range units are defined in section 3.12. 
- - Content-Range = "Content-Range" ":" content-range-spec - - content-range-spec = byte-content-range-spec - byte-content-range-spec = bytes-unit SP - byte-range-resp-spec "/" - ( instance-length | "*" ) - - byte-range-resp-spec = (first-byte-pos "-" last-byte-pos) - | "*" - instance-length = 1*DIGIT - - The header SHOULD indicate the total length of the full entity-body, - unless this length is unknown or difficult to determine. The asterisk - "*" character means that the instance-length is unknown at the time - when the response was generated. - - Unlike byte-ranges-specifier values (see section 14.35.1), a byte- - range-resp-spec MUST only specify one range, and MUST contain - absolute byte positions for both the first and last byte of the - range. - - A byte-content-range-spec with a byte-range-resp-spec whose last- - byte-pos value is less than its first-byte-pos value, or whose - instance-length value is less than or equal to its last-byte-pos - value, is invalid. The recipient of an invalid byte-content-range- - spec MUST ignore it and any content transferred along with it. - - A server sending a response with status code 416 (Requested range not - satisfiable) SHOULD include a Content-Range field with a byte-range- - resp-spec of "*". The instance-length specifies the current length of - - - -Fielding, et al. Standards Track [Page 122] - -RFC 2616 HTTP/1.1 June 1999 - - - the selected resource. A response with status code 206 (Partial - Content) MUST NOT include a Content-Range field with a byte-range- - resp-spec of "*". - - Examples of byte-content-range-spec values, assuming that the entity - contains a total of 1234 bytes: - - . The first 500 bytes: - bytes 0-499/1234 - - . The second 500 bytes: - bytes 500-999/1234 - - . All except for the first 500 bytes: - bytes 500-1233/1234 - - . 
The last 500 bytes: - bytes 734-1233/1234 - - When an HTTP message includes the content of a single range (for - example, a response to a request for a single range, or to a request - for a set of ranges that overlap without any holes), this content is - transmitted with a Content-Range header, and a Content-Length header - showing the number of bytes actually transferred. For example, - - HTTP/1.1 206 Partial content - Date: Wed, 15 Nov 1995 06:25:24 GMT - Last-Modified: Wed, 15 Nov 1995 04:58:08 GMT - Content-Range: bytes 21010-47021/47022 - Content-Length: 26012 - Content-Type: image/gif - - When an HTTP message includes the content of multiple ranges (for - example, a response to a request for multiple non-overlapping - ranges), these are transmitted as a multipart message. The multipart - media type used for this purpose is "multipart/byteranges" as defined - in appendix 19.2. See appendix 19.6.3 for a compatibility issue. - - A response to a request for a single range MUST NOT be sent using the - multipart/byteranges media type. A response to a request for - multiple ranges, whose result is a single range, MAY be sent as a - multipart/byteranges media type with one part. A client that cannot - decode a multipart/byteranges message MUST NOT ask for multiple - byte-ranges in a single request. - - When a client requests multiple byte-ranges in one request, the - server SHOULD return them in the order that they appeared in the - request. - - - -Fielding, et al. Standards Track [Page 123] - -RFC 2616 HTTP/1.1 June 1999 - - - If the server ignores a byte-range-spec because it is syntactically - invalid, the server SHOULD treat the request as if the invalid Range - header field did not exist. (Normally, this means return a 200 - response containing the full entity). 
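The Content-Range grammar and validity rules above can be checked mechanically. The following is a minimal sketch, assuming only the `bytes` range unit; the names `ContentRange`, `parse_content_range`, and `range_length` are hypothetical. An invalid value yields `None`, mirroring the rule that a recipient ignores an invalid byte-content-range-spec.

```python
import re
from typing import NamedTuple, Optional


class ContentRange(NamedTuple):
    first: Optional[int]            # None when byte-range-resp-spec is "*"
    last: Optional[int]
    instance_length: Optional[int]  # None when the length is "*" (unknown)


_PATTERN = re.compile(r"bytes (?:([0-9]+)-([0-9]+)|\*)/([0-9]+|\*)")


def parse_content_range(value: str) -> Optional[ContentRange]:
    """Parse a Content-Range field value; return None if it is invalid."""
    match = _PATTERN.fullmatch(value.strip())
    if match is None:
        return None
    first_s, last_s, length_s = match.groups()
    first = int(first_s) if first_s is not None else None
    last = int(last_s) if last_s is not None else None
    length = None if length_s == "*" else int(length_s)
    if first is not None and last is not None:
        if last < first:
            return None  # last-byte-pos less than first-byte-pos
        if length is not None and length <= last:
            return None  # instance-length must exceed last-byte-pos
    return ContentRange(first, last, length)


def range_length(cr: ContentRange) -> Optional[int]:
    """Number of octets the range covers (positions are inclusive)."""
    if cr.first is None or cr.last is None:
        return None
    return cr.last - cr.first + 1
```

Applied to the 206 example above, `bytes 21010-47021/47022` covers 26012 octets, which matches the Content-Length shown in that response.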
- - If the server receives a request (other than one including an If- - Range request-header field) with an unsatisfiable Range request- - header field (that is, all of whose byte-range-spec values have a - first-byte-pos value greater than the current length of the selected - resource), it SHOULD return a response code of 416 (Requested range - not satisfiable) (section 10.4.17). - - Note: clients cannot depend on servers to send a 416 (Requested - range not satisfiable) response instead of a 200 (OK) response for - an unsatisfiable Range request-header, since not all servers - implement this request-header. - -14.17 Content-Type - - The Content-Type entity-header field indicates the media type of the - entity-body sent to the recipient or, in the case of the HEAD method, - the media type that would have been sent had the request been a GET. - - Content-Type = "Content-Type" ":" media-type - - Media types are defined in section 3.7. An example of the field is - - Content-Type: text/html; charset=ISO-8859-4 - - Further discussion of methods for identifying the media type of an - entity is provided in section 7.2.1. - -14.18 Date - - The Date general-header field represents the date and time at which - the message was originated, having the same semantics as orig-date in - RFC 822. The field value is an HTTP-date, as described in section - 3.3.1; it MUST be sent in RFC 1123 [8]-date format. - - Date = "Date" ":" HTTP-date - - An example is - - Date: Tue, 15 Nov 1994 08:12:31 GMT - - Origin servers MUST include a Date header field in all responses, - except in these cases: - - - - -Fielding, et al. Standards Track [Page 124] - -RFC 2616 HTTP/1.1 June 1999 - - - 1. If the response status code is 100 (Continue) or 101 (Switching - Protocols), the response MAY include a Date header field, at - the server's option. - - 2. If the response status code conveys a server error, e.g. 
500 - (Internal Server Error) or 503 (Service Unavailable), and it is - inconvenient or impossible to generate a valid Date. - - 3. If the server does not have a clock that can provide a - reasonable approximation of the current time, its responses - MUST NOT include a Date header field. In this case, the rules - in section 14.18.1 MUST be followed. - - A received message that does not have a Date header field MUST be - assigned one by the recipient if the message will be cached by that - recipient or gatewayed via a protocol which requires a Date. An HTTP - implementation without a clock MUST NOT cache responses without - revalidating them on every use. An HTTP cache, especially a shared - cache, SHOULD use a mechanism, such as NTP [28], to synchronize its - clock with a reliable external standard. - - Clients SHOULD only send a Date header field in messages that include - an entity-body, as in the case of the PUT and POST requests, and even - then it is optional. A client without a clock MUST NOT send a Date - header field in a request. - - The HTTP-date sent in a Date header SHOULD NOT represent a date and - time subsequent to the generation of the message. It SHOULD represent - the best available approximation of the date and time of message - generation, unless the implementation has no means of generating a - reasonably accurate date and time. In theory, the date ought to - represent the moment just before the entity is generated. In - practice, the date can be generated at any time during the message - origination without affecting its semantic value. - -14.18.1 Clockless Origin Server Operation - - Some origin server implementations might not have a clock available. - An origin server without a clock MUST NOT assign Expires or Last- - Modified values to a response, unless these values were associated - with the resource by a system or user with a reliable clock. 
It MAY - assign an Expires value that is known, at or before server - configuration time, to be in the past (this allows "pre-expiration" - of responses without storing separate Expires values for each - resource). - - - - - - -Fielding, et al. Standards Track [Page 125] - -RFC 2616 HTTP/1.1 June 1999 - - -14.19 ETag - - The ETag response-header field provides the current value of the - entity tag for the requested variant. The headers used with entity - tags are described in sections 14.24, 14.26 and 14.44. The entity tag - MAY be used for comparison with other entities from the same resource - (see section 13.3.3). - - ETag = "ETag" ":" entity-tag - - Examples: - - ETag: "xyzzy" - ETag: W/"xyzzy" - ETag: "" - -14.20 Expect - - The Expect request-header field is used to indicate that particular - server behaviors are required by the client. - - Expect = "Expect" ":" 1#expectation - - expectation = "100-continue" | expectation-extension - expectation-extension = token [ "=" ( token | quoted-string ) - *expect-params ] - expect-params = ";" token [ "=" ( token | quoted-string ) ] - - - A server that does not understand or is unable to comply with any of - the expectation values in the Expect field of a request MUST respond - with appropriate error status. The server MUST respond with a 417 - (Expectation Failed) status if any of the expectations cannot be met - or, if there are other problems with the request, some other 4xx - status. - - This header field is defined with extensible syntax to allow for - future extensions. If a server receives a request containing an - Expect field that includes an expectation-extension that it does not - support, it MUST respond with a 417 (Expectation Failed) status. - - Comparison of expectation values is case-insensitive for unquoted - tokens (including the 100-continue token), and is case-sensitive for - quoted-string expectation-extensions. - - - - - - - -Fielding, et al. 
Standards Track [Page 126] - -RFC 2616 HTTP/1.1 June 1999 - - - The Expect mechanism is hop-by-hop: that is, an HTTP/1.1 proxy MUST - return a 417 (Expectation Failed) status if it receives a request - with an expectation that it cannot meet. However, the Expect - request-header itself is end-to-end; it MUST be forwarded if the - request is forwarded. - - Many older HTTP/1.0 and HTTP/1.1 applications do not understand the - Expect header. - - See section 8.2.3 for the use of the 100 (continue) status. - -14.21 Expires - - The Expires entity-header field gives the date/time after which the - response is considered stale. A stale cache entry may not normally be - returned by a cache (either a proxy cache or a user agent cache) - unless it is first validated with the origin server (or with an - intermediate cache that has a fresh copy of the entity). See section - 13.2 for further discussion of the expiration model. - - The presence of an Expires field does not imply that the original - resource will change or cease to exist at, before, or after that - time. - - The format is an absolute date and time as defined by HTTP-date in - section 3.3.1; it MUST be in RFC 1123 date format: - - Expires = "Expires" ":" HTTP-date - - An example of its use is - - Expires: Thu, 01 Dec 1994 16:00:00 GMT - - Note: if a response includes a Cache-Control field with the max- - age directive (see section 14.9.3), that directive overrides the - Expires field. - - HTTP/1.1 clients and caches MUST treat other invalid date formats, - especially including the value "0", as in the past (i.e., "already - expired"). - - To mark a response as "already expired," an origin server sends an - Expires date that is equal to the Date header value. (See the rules - for expiration calculations in section 13.2.4.) - - - - - - - -Fielding, et al. 
Standards Track [Page 127] - -RFC 2616 HTTP/1.1 June 1999 - - - To mark a response as "never expires," an origin server sends an - Expires date approximately one year from the time the response is - sent. HTTP/1.1 servers SHOULD NOT send Expires dates more than one - year in the future. - - The presence of an Expires header field with a date value of some - time in the future on a response that otherwise would by default be - non-cacheable indicates that the response is cacheable, unless - indicated otherwise by a Cache-Control header field (section 14.9). - -14.22 From - - The From request-header field, if given, SHOULD contain an Internet - e-mail address for the human user who controls the requesting user - agent. The address SHOULD be machine-usable, as defined by "mailbox" - in RFC 822 [9] as updated by RFC 1123 [8]: - - From = "From" ":" mailbox - - An example is: - - From: webmaster@w3.org - - This header field MAY be used for logging purposes and as a means for - identifying the source of invalid or unwanted requests. It SHOULD NOT - be used as an insecure form of access protection. The interpretation - of this field is that the request is being performed on behalf of the - person given, who accepts responsibility for the method performed. In - particular, robot agents SHOULD include this header so that the - person responsible for running the robot can be contacted if problems - occur on the receiving end. - - The Internet e-mail address in this field MAY be separate from the - Internet host which issued the request. For example, when a request - is passed through a proxy the original issuer's address SHOULD be - used. - - The client SHOULD NOT send the From header field without the user's - approval, as it might conflict with the user's privacy interests or - their site's security policy. It is strongly recommended that the - user be able to disable, enable, and modify the value of this field - at any time prior to a request. 
- -14.23 Host - - The Host request-header field specifies the Internet host and port - number of the resource being requested, as obtained from the original - URI given by the user or referring resource (generally an HTTP URL, - - - -Fielding, et al. Standards Track [Page 128] - -RFC 2616 HTTP/1.1 June 1999 - - - as described in section 3.2.2). The Host field value MUST represent - the naming authority of the origin server or gateway given by the - original URL. This allows the origin server or gateway to - differentiate between internally-ambiguous URLs, such as the root "/" - URL of a server for multiple host names on a single IP address. - - Host = "Host" ":" host [ ":" port ] ; Section 3.2.2 - - A "host" without any trailing port information implies the default - port for the service requested (e.g., "80" for an HTTP URL). For - example, a request on the origin server for - <http://www.w3.org/pub/WWW/> would properly include: - - GET /pub/WWW/ HTTP/1.1 - Host: www.w3.org - - A client MUST include a Host header field in all HTTP/1.1 request - messages . If the requested URI does not include an Internet host - name for the service being requested, then the Host header field MUST - be given with an empty value. An HTTP/1.1 proxy MUST ensure that any - request message it forwards does contain an appropriate Host header - field that identifies the service being requested by the proxy. All - Internet-based HTTP/1.1 servers MUST respond with a 400 (Bad Request) - status code to any HTTP/1.1 request message which lacks a Host header - field. - - See sections 5.2 and 19.6.1.1 for other requirements relating to - Host. - -14.24 If-Match - - The If-Match request-header field is used with a method to make it - conditional. A client that has one or more entities previously - obtained from the resource can verify that one of those entities is - current by including a list of their associated entity tags in the - If-Match header field. Entity tags are defined in section 3.11. 
The - purpose of this feature is to allow efficient updates of cached - information with a minimum amount of transaction overhead. It is also - used, on updating requests, to prevent inadvertent modification of - the wrong version of a resource. As a special case, the value "*" - matches any current entity of the resource. - - If-Match = "If-Match" ":" ( "*" | 1#entity-tag ) - - If any of the entity tags match the entity tag of the entity that - would have been returned in the response to a similar GET request - (without the If-Match header) on that resource, or if "*" is given - - - - -Fielding, et al. Standards Track [Page 129] - -RFC 2616 HTTP/1.1 June 1999 - - - and any current entity exists for that resource, then the server MAY - perform the requested method as if the If-Match header field did not - exist. - - A server MUST use the strong comparison function (see section 13.3.3) - to compare the entity tags in If-Match. - - If none of the entity tags match, or if "*" is given and no current - entity exists, the server MUST NOT perform the requested method, and - MUST return a 412 (Precondition Failed) response. This behavior is - most useful when the client wants to prevent an updating method, such - as PUT, from modifying a resource that has changed since the client - last retrieved it. - - If the request would, without the If-Match header field, result in - anything other than a 2xx or 412 status, then the If-Match header - MUST be ignored. - - The meaning of "If-Match: *" is that the method SHOULD be performed - if the representation selected by the origin server (or by a cache, - possibly using the Vary mechanism, see section 14.44) exists, and - MUST NOT be performed if the representation does not exist. 
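The If-Match rules quoted above can be sketched in a few lines. This is a minimal illustration, not libsoup's API: the names `strong_match` and `evaluate_if_match` are hypothetical, entity tags are assumed to be passed as raw strings (quotes and any `W/` prefix included), and an existing entity is assumed to always carry an entity tag. The rule that the header is ignored when the request would otherwise fail is left to the caller.

```python
from typing import Optional, Sequence


def strong_match(a: str, b: str) -> bool:
    """Strong comparison: both tags must be strong (no W/ prefix) and identical."""
    return not a.startswith("W/") and not b.startswith("W/") and a == b


def evaluate_if_match(if_match: Sequence[str], current_etag: Optional[str]) -> int:
    """Return 412 if the precondition fails, or 0 meaning "perform the method".

    if_match     -- already-split field value, e.g. ['"xyzzy"', '"r2d2xxxx"'] or ['*']
    current_etag -- entity tag of the current entity, or None if no entity exists
                    (assumption: every existing entity has a tag)
    """
    if list(if_match) == ["*"]:
        # "*" matches any current entity of the resource.
        return 0 if current_etag is not None else 412
    if current_etag is not None and any(
        strong_match(tag, current_etag) for tag in if_match
    ):
        return 0
    # None of the entity tags matched: 412 (Precondition Failed).
    return 412
```

A weak tag never satisfies If-Match in this sketch, because only the strong comparison function is allowed for this header.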
- - A request intended to update a resource (e.g., a PUT) MAY include an - If-Match header field to signal that the request method MUST NOT be - applied if the entity corresponding to the If-Match value (a single - entity tag) is no longer a representation of that resource. This - allows the user to indicate that they do not wish the request to be - successful if the resource has been changed without their knowledge. - Examples: - - If-Match: "xyzzy" - If-Match: "xyzzy", "r2d2xxxx", "c3piozzzz" - If-Match: * - - The result of a request having both an If-Match header field and - either an If-None-Match or an If-Modified-Since header fields is - undefined by this specification. - -14.25 If-Modified-Since - - The If-Modified-Since request-header field is used with a method to - make it conditional: if the requested variant has not been modified - since the time specified in this field, an entity will not be - returned from the server; instead, a 304 (not modified) response will - be returned without any message-body. - - If-Modified-Since = "If-Modified-Since" ":" HTTP-date - - - -Fielding, et al. Standards Track [Page 130] - -RFC 2616 HTTP/1.1 June 1999 - - - An example of the field is: - - If-Modified-Since: Sat, 29 Oct 1994 19:43:31 GMT - - A GET method with an If-Modified-Since header and no Range header - requests that the identified entity be transferred only if it has - been modified since the date given by the If-Modified-Since header. - The algorithm for determining this includes the following cases: - - a) If the request would normally result in anything other than a - 200 (OK) status, or if the passed If-Modified-Since date is - invalid, the response is exactly the same as for a normal GET. - A date which is later than the server's current time is - invalid. - - b) If the variant has been modified since the If-Modified-Since - date, the response is exactly the same as for a normal GET. 
- - c) If the variant has not been modified since a valid If- - Modified-Since date, the server SHOULD return a 304 (Not - Modified) response. - - The purpose of this feature is to allow efficient updates of cached - information with a minimum amount of transaction overhead. - - Note: The Range request-header field modifies the meaning of If- - Modified-Since; see section 14.35 for full details. - - Note: If-Modified-Since times are interpreted by the server, whose - clock might not be synchronized with the client. - - Note: When handling an If-Modified-Since header field, some - servers will use an exact date comparison function, rather than a - less-than function, for deciding whether to send a 304 (Not - Modified) response. To get best results when sending an If- - Modified-Since header field for cache validation, clients are - advised to use the exact date string received in a previous Last- - Modified header field whenever possible. - - Note: If a client uses an arbitrary date in the If-Modified-Since - header instead of a date taken from the Last-Modified header for - the same request, the client should be aware of the fact that this - date is interpreted in the server's understanding of time. The - client should consider unsynchronized clocks and rounding problems - due to the different encodings of time between the client and - server. This includes the possibility of race conditions if the - document has changed between the time it was first requested and - the If-Modified-Since date of a subsequent request, and the - - - -Fielding, et al. Standards Track [Page 131] - -RFC 2616 HTTP/1.1 June 1999 - - - possibility of clock-skew-related problems if the If-Modified- - Since date is derived from the client's clock without correction - to the server's clock. Corrections for different time bases - between client and server are at best approximate due to network - latency. 
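Cases a) to c) above can be sketched as a small decision function. This is only an illustration under stated assumptions: the function name and parameters are hypothetical, the Range interaction is not modelled, and date parsing is delegated to Python's `email.utils.parsedate_to_datetime`, which accepts the RFC 1123 form shown in the example.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def if_modified_since_status(header_value, last_modified, now, normal_status=200):
    """Return the status a GET would get under cases a) to c).

    header_value  -- raw If-Modified-Since field value
    last_modified -- timezone-aware modification time of the variant
    now           -- timezone-aware current server time
    normal_status -- status the same GET would produce without the header
    """
    # Case a): only a request that would otherwise yield 200 is affected.
    if normal_status != 200:
        return normal_status
    try:
        since = parsedate_to_datetime(header_value)
    except (TypeError, ValueError):
        return normal_status  # case a): unparsable date, treat as a normal GET
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)  # HTTP dates are GMT
    if since > now:
        return normal_status  # case a): a date later than server time is invalid
    if last_modified > since:
        return normal_status  # case b): modified since the given date
    return 304  # case c): not modified
```

The comparison here is "less than or equal"; as the notes above warn, some servers use an exact match instead, which is why clients are advised to echo the Last-Modified string unchanged.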
- - The result of a request having both an If-Modified-Since header field - and either an If-Match or an If-Unmodified-Since header field is - undefined by this specification. - -14.26 If-None-Match - - The If-None-Match request-header field is used with a method to make - it conditional. A client that has one or more entities previously - obtained from the resource can verify that none of those entities is - current by including a list of their associated entity tags in the - If-None-Match header field. The purpose of this feature is to allow - efficient updates of cached information with a minimum amount of - transaction overhead. It is also used to prevent a method (e.g. PUT) - from inadvertently modifying an existing resource when the client - believes that the resource does not exist. - - As a special case, the value "*" matches any current entity of the - resource. - - If-None-Match = "If-None-Match" ":" ( "*" | 1#entity-tag ) - - If any of the entity tags match the entity tag of the entity that - would have been returned in the response to a similar GET request - (without the If-None-Match header) on that resource, or if "*" is - given and any current entity exists for that resource, then the - server MUST NOT perform the requested method, unless required to do - so because the resource's modification date fails to match that - supplied in an If-Modified-Since header field in the request. - Instead, if the request method was GET or HEAD, the server SHOULD - respond with a 304 (Not Modified) response, including the cache- - related header fields (particularly ETag) of one of the entities that - matched. For all other request methods, the server MUST respond with - a status of 412 (Precondition Failed). - - See section 13.3.3 for rules on how to determine if two entity tags - match. The weak comparison function can only be used with GET or HEAD - requests. - - - - - - - - -Fielding, et al. 
Standards Track [Page 132] - -RFC 2616 HTTP/1.1 June 1999 - - - If none of the entity tags match, then the server MAY perform the - requested method as if the If-None-Match header field did not exist, - but MUST also ignore any If-Modified-Since header field(s) in the - request. That is, if no entity tags match, then the server MUST NOT - return a 304 (Not Modified) response. - - If the request would, without the If-None-Match header field, result - in anything other than a 2xx or 304 status, then the If-None-Match - header MUST be ignored. (See section 13.3.4 for a discussion of - server behavior when both If-Modified-Since and If-None-Match appear - in the same request.) - - The meaning of "If-None-Match: *" is that the method MUST NOT be - performed if the representation selected by the origin server (or by - a cache, possibly using the Vary mechanism, see section 14.44) - exists, and SHOULD be performed if the representation does not exist. - This feature is intended to be useful in preventing races between PUT - operations. - - Examples: - - If-None-Match: "xyzzy" - If-None-Match: W/"xyzzy" - If-None-Match: "xyzzy", "r2d2xxxx", "c3piozzzz" - If-None-Match: W/"xyzzy", W/"r2d2xxxx", W/"c3piozzzz" - If-None-Match: * - - The result of a request having both an If-None-Match header field and - either an If-Match or an If-Unmodified-Since header fields is - undefined by this specification. - -14.27 If-Range - - If a client has a partial copy of an entity in its cache, and wishes - to have an up-to-date copy of the entire entity in its cache, it - could use the Range request-header with a conditional GET (using - either or both of If-Unmodified-Since and If-Match.) However, if the - condition fails because the entity has been modified, the client - would then have to make a second request to obtain the entire current - entity-body. - - The If-Range header allows a client to "short-circuit" the second - request. 
Informally, its meaning is `if the entity is unchanged, send - me the part(s) that I am missing; otherwise, send me the entire new - entity'. - - If-Range = "If-Range" ":" ( entity-tag | HTTP-date ) - - - - -Fielding, et al. Standards Track [Page 133] - -RFC 2616 HTTP/1.1 June 1999 - - - If the client has no entity tag for an entity, but does have a Last- - Modified date, it MAY use that date in an If-Range header. (The - server can distinguish between a valid HTTP-date and any form of - entity-tag by examining no more than two characters.) The If-Range - header SHOULD only be used together with a Range header, and MUST be - ignored if the request does not include a Range header, or if the - server does not support the sub-range operation. - - If the entity tag given in the If-Range header matches the current - entity tag for the entity, then the server SHOULD provide the - specified sub-range of the entity using a 206 (Partial content) - response. If the entity tag does not match, then the server SHOULD - return the entire entity using a 200 (OK) response. - -14.28 If-Unmodified-Since - - The If-Unmodified-Since request-header field is used with a method to - make it conditional. If the requested resource has not been modified - since the time specified in this field, the server SHOULD perform the - requested operation as if the If-Unmodified-Since header were not - present. - - If the requested variant has been modified since the specified time, - the server MUST NOT perform the requested operation, and MUST return - a 412 (Precondition Failed). - - If-Unmodified-Since = "If-Unmodified-Since" ":" HTTP-date - - An example of the field is: - - If-Unmodified-Since: Sat, 29 Oct 1994 19:43:31 GMT - - If the request normally (i.e., without the If-Unmodified-Since - header) would result in anything other than a 2xx or 412 status, the - If-Unmodified-Since header SHOULD be ignored. - - If the specified date is invalid, the header is ignored. 
- - The result of a request having both an If-Unmodified-Since header - field and either an If-None-Match or an If-Modified-Since header - fields is undefined by this specification. - -14.29 Last-Modified - - The Last-Modified entity-header field indicates the date and time at - which the origin server believes the variant was last modified. - - Last-Modified = "Last-Modified" ":" HTTP-date - - - -Fielding, et al. Standards Track [Page 134] - -RFC 2616 HTTP/1.1 June 1999 - - - An example of its use is - - Last-Modified: Tue, 15 Nov 1994 12:45:26 GMT - - The exact meaning of this header field depends on the implementation - of the origin server and the nature of the original resource. For - files, it may be just the file system last-modified time. For - entities with dynamically included parts, it may be the most recent - of the set of last-modify times for its component parts. For database - gateways, it may be the last-update time stamp of the record. For - virtual objects, it may be the last time the internal state changed. - - An origin server MUST NOT send a Last-Modified date which is later - than the server's time of message origination. In such cases, where - the resource's last modification would indicate some time in the - future, the server MUST replace that date with the message - origination date. - - An origin server SHOULD obtain the Last-Modified value of the entity - as close as possible to the time that it generates the Date value of - its response. This allows a recipient to make an accurate assessment - of the entity's modification time, especially if the entity changes - near the time that the response is generated. - - HTTP/1.1 servers SHOULD send Last-Modified whenever feasible. - -14.30 Location - - The Location response-header field is used to redirect the recipient - to a location other than the Request-URI for completion of the - request or identification of a new resource. 
For 201 (Created) - responses, the Location is that of the new resource which was created - by the request. For 3xx responses, the location SHOULD indicate the - server's preferred URI for automatic redirection to the resource. The - field value consists of a single absolute URI. - - Location = "Location" ":" absoluteURI - [[ [ "#" fragment ] ]] - - An example is: - - Location: http://www.w3.org/pub/WWW/People.html - - Note: The Content-Location header field (section 14.14) differs - from Location in that the Content-Location identifies the original - location of the entity enclosed in the request. It is therefore - possible for a response to contain header fields for both Location - and Content-Location. Also see section 13.10 for cache - requirements of some methods. - - - -Fielding, et al. Standards Track [Page 135] - -RFC 2616 HTTP/1.1 June 1999 - - -14.31 Max-Forwards - - The Max-Forwards request-header field provides a mechanism with the - TRACE (section 9.8) and OPTIONS (section 9.2) methods to limit the - number of proxies or gateways that can forward the request to the - next inbound server. This can be useful when the client is attempting - to trace a request chain which appears to be failing or looping in - mid-chain. - - Max-Forwards = "Max-Forwards" ":" 1*DIGIT - - The Max-Forwards value is a decimal integer indicating the remaining - number of times this request message may be forwarded. - - Each proxy or gateway recipient of a TRACE or OPTIONS request - containing a Max-Forwards header field MUST check and update its - value prior to forwarding the request. If the received value is zero - (0), the recipient MUST NOT forward the request; instead, it MUST - respond as the final recipient. If the received Max-Forwards value is - greater than zero, then the forwarded message MUST contain an updated - Max-Forwards field with a value decremented by one (1). 
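The Max-Forwards check-and-update step can be sketched as follows. The function name and return shape are hypothetical, and raising `ValueError` for a malformed value is an assumption of this sketch; the text above does not say how a non-numeric value should be handled.

```python
from typing import Optional, Tuple


def handle_max_forwards(method: str, value: Optional[str]) -> Tuple[bool, Optional[str]]:
    """Decide whether a proxy may forward a request, and with which Max-Forwards.

    Returns (forward, new_value). forward is False when the recipient must
    respond as the final recipient instead of forwarding.
    """
    # Only TRACE and OPTIONS require the check; other methods pass through.
    if method not in ("TRACE", "OPTIONS") or value is None:
        return True, value
    if not (value.isascii() and value.isdigit()):
        # Assumption: reject anything that is not 1*DIGIT.
        raise ValueError("Max-Forwards must be a decimal integer")
    remaining = int(value)
    if remaining == 0:
        return False, None  # respond as the final recipient
    return True, str(remaining - 1)  # forward with the value decremented by one
```

For example, a TRACE request arriving with `Max-Forwards: 3` would be forwarded with `Max-Forwards: 2`.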
- - The Max-Forwards header field MAY be ignored for all other methods - defined by this specification and for any extension methods for which - it is not explicitly referred to as part of that method definition. - -14.32 Pragma - - The Pragma general-header field is used to include implementation- - specific directives that might apply to any recipient along the - request/response chain. All pragma directives specify optional - behavior from the viewpoint of the protocol; however, some systems - MAY require that behavior be consistent with the directives. - - Pragma = "Pragma" ":" 1#pragma-directive - pragma-directive = "no-cache" | extension-pragma - extension-pragma = token [ "=" ( token | quoted-string ) ] - - When the no-cache directive is present in a request message, an - application SHOULD forward the request toward the origin server even - if it has a cached copy of what is being requested. This pragma - directive has the same semantics as the no-cache cache-directive (see - section 14.9) and is defined here for backward compatibility with - HTTP/1.0. Clients SHOULD include both header fields when a no-cache - request is sent to a server not known to be HTTP/1.1 compliant. - - - - - - -Fielding, et al. Standards Track [Page 136] - -RFC 2616 HTTP/1.1 June 1999 - - - Pragma directives MUST be passed through by a proxy or gateway - application, regardless of their significance to that application, - since the directives might be applicable to all recipients along the - request/response chain. It is not possible to specify a pragma for a - specific recipient; however, any pragma directive not relevant to a - recipient SHOULD be ignored by that recipient. - - HTTP/1.1 caches SHOULD treat "Pragma: no-cache" as if the client had - sent "Cache-Control: no-cache". No new Pragma directives will be - defined in HTTP. 
- - Note: because the meaning of "Pragma: no-cache" as a response - header field is not actually specified, it does not provide a - reliable replacement for "Cache-Control: no-cache" in a response. - -14.33 Proxy-Authenticate - - The Proxy-Authenticate response-header field MUST be included as part - of a 407 (Proxy Authentication Required) response. The field value - consists of a challenge that indicates the authentication scheme and - parameters applicable to the proxy for this Request-URI. - - Proxy-Authenticate = "Proxy-Authenticate" ":" 1#challenge - - The HTTP access authentication process is described in "HTTP - Authentication: Basic and Digest Access Authentication" [43]. Unlike - WWW-Authenticate, the Proxy-Authenticate header field applies only to - the current connection and SHOULD NOT be passed on to downstream - clients. However, an intermediate proxy might need to obtain its own - credentials by requesting them from the downstream client, which in - some circumstances will appear as if the proxy is forwarding the - Proxy-Authenticate header field. - -14.34 Proxy-Authorization - - The Proxy-Authorization request-header field allows the client to - identify itself (or its user) to a proxy which requires - authentication. The Proxy-Authorization field value consists of - credentials containing the authentication information of the user - agent for the proxy and/or realm of the resource being requested. - - Proxy-Authorization = "Proxy-Authorization" ":" credentials - - The HTTP access authentication process is described in "HTTP - Authentication: Basic and Digest Access Authentication" [43]. Unlike - Authorization, the Proxy-Authorization header field applies only to - the next outbound proxy that demanded authentication using the Proxy- - Authenticate field. When multiple proxies are used in a chain, the - - - -Fielding, et al. 
Standards Track [Page 137] - -RFC 2616 HTTP/1.1 June 1999 - - - Proxy-Authorization header field is consumed by the first outbound - proxy that was expecting to receive credentials. A proxy MAY relay - the credentials from the client request to the next proxy if that is - the mechanism by which the proxies cooperatively authenticate a given - request. - -14.35 Range - -14.35.1 Byte Ranges - - Since all HTTP entities are represented in HTTP messages as sequences - of bytes, the concept of a byte range is meaningful for any HTTP - entity. (However, not all clients and servers need to support byte- - range operations.) - - Byte range specifications in HTTP apply to the sequence of bytes in - the entity-body (not necessarily the same as the message-body). - - A byte range operation MAY specify a single range of bytes, or a set - of ranges within a single entity. - - ranges-specifier = byte-ranges-specifier - byte-ranges-specifier = bytes-unit "=" byte-range-set - byte-range-set = 1#( byte-range-spec | suffix-byte-range-spec ) - byte-range-spec = first-byte-pos "-" [last-byte-pos] - first-byte-pos = 1*DIGIT - last-byte-pos = 1*DIGIT - - The first-byte-pos value in a byte-range-spec gives the byte-offset - of the first byte in a range. The last-byte-pos value gives the - byte-offset of the last byte in the range; that is, the byte - positions specified are inclusive. Byte offsets start at zero. - - If the last-byte-pos value is present, it MUST be greater than or - equal to the first-byte-pos in that byte-range-spec, or the byte- - range-spec is syntactically invalid. The recipient of a byte-range- - set that includes one or more syntactically invalid byte-range-spec - values MUST ignore the header field that includes that byte-range- - set. - - If the last-byte-pos value is absent, or if the value is greater than - or equal to the current length of the entity-body, last-byte-pos is - taken to be equal to one less than the current length of the entity- - body in bytes. 
- - By its choice of last-byte-pos, a client can limit the number of - bytes retrieved without knowing the size of the entity. - - - - -Fielding, et al. Standards Track [Page 138] - -RFC 2616 HTTP/1.1 June 1999 - - - suffix-byte-range-spec = "-" suffix-length - suffix-length = 1*DIGIT - - A suffix-byte-range-spec is used to specify the suffix of the - entity-body, of a length given by the suffix-length value. (That is, - this form specifies the last N bytes of an entity-body.) If the - entity is shorter than the specified suffix-length, the entire - entity-body is used. - - If a syntactically valid byte-range-set includes at least one byte- - range-spec whose first-byte-pos is less than the current length of - the entity-body, or at least one suffix-byte-range-spec with a non- - zero suffix-length, then the byte-range-set is satisfiable. - Otherwise, the byte-range-set is unsatisfiable. If the byte-range-set - is unsatisfiable, the server SHOULD return a response with a status - of 416 (Requested range not satisfiable). Otherwise, the server - SHOULD return a response with a status of 206 (Partial Content) - containing the satisfiable ranges of the entity-body. 
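The byte-range rules above (inclusive offsets, clamping of last-byte-pos, suffix ranges, and the satisfiability test) can be sketched as one resolver. The names are hypothetical and the return convention is an assumption of this sketch: `None` means the header is syntactically invalid and must be ignored, an empty list means unsatisfiable (416), and a non-empty list holds the inclusive ranges to serve with 206. For an empty entity the sketch resolves no ranges at all, which is a simplification.

```python
from typing import List, Optional, Tuple


def _is_digits(text: str) -> bool:
    """True for a non-empty run of ASCII digits (1*DIGIT)."""
    return text.isascii() and text.isdigit()


def resolve_byte_ranges(spec: str, length: int) -> Optional[List[Tuple[int, int]]]:
    """Resolve a byte-ranges-specifier against an entity-body of `length` bytes."""
    unit, sep, range_set = spec.partition("=")
    if not sep or unit.strip() != "bytes":
        return None
    resolved: List[Tuple[int, int]] = []
    seen = 0
    for item in range_set.split(","):
        item = item.strip()
        if not item:
            continue  # list syntax tolerates empty elements
        seen += 1
        first, dash, last = item.partition("-")
        if not dash:
            return None
        if first == "":
            # suffix-byte-range-spec: the last N bytes of the entity-body
            if not _is_digits(last):
                return None
            suffix = int(last)
            if suffix > 0 and length > 0:
                resolved.append((max(length - suffix, 0), length - 1))
            continue
        if not _is_digits(first) or (last and not _is_digits(last)):
            return None
        lo = int(first)
        if last and int(last) < lo:
            return None  # last-byte-pos below first-byte-pos is invalid
        if lo >= length:
            continue  # this spec contributes nothing satisfiable
        hi = min(int(last), length - 1) if last else length - 1
        resolved.append((lo, hi))
    if seen == 0:
        return None  # 1# requires at least one element
    return resolved
```

With an entity-body of 10000 bytes, `bytes=-500` and `bytes=9500-` both resolve to the inclusive range 9500-9999.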
- - Examples of byte-ranges-specifier values (assuming an entity-body of - length 10000): - - - The first 500 bytes (byte offsets 0-499, inclusive): bytes=0- - 499 - - - The second 500 bytes (byte offsets 500-999, inclusive): - bytes=500-999 - - - The final 500 bytes (byte offsets 9500-9999, inclusive): - bytes=-500 - - - Or bytes=9500- - - - The first and last bytes only (bytes 0 and 9999): bytes=0-0,-1 - - - Several legal but not canonical specifications of the second 500 - bytes (byte offsets 500-999, inclusive): - bytes=500-600,601-999 - bytes=500-700,601-999 - -14.35.2 Range Retrieval Requests - - HTTP retrieval requests using conditional or unconditional GET - methods MAY request one or more sub-ranges of the entity, instead of - the entire entity, using the Range request header, which applies to - the entity returned as the result of the request: - - Range = "Range" ":" ranges-specifier - - - -Fielding, et al. Standards Track [Page 139] - -RFC 2616 HTTP/1.1 June 1999 - - - A server MAY ignore the Range header. However, HTTP/1.1 origin - servers and intermediate caches ought to support byte ranges when - possible, since Range supports efficient recovery from partially - failed transfers, and supports efficient partial retrieval of large - entities. - - If the server supports the Range header and the specified range or - ranges are appropriate for the entity: - - - The presence of a Range header in an unconditional GET modifies - what is returned if the GET is otherwise successful. In other - words, the response carries a status code of 206 (Partial - Content) instead of 200 (OK). - - - The presence of a Range header in a conditional GET (a request - using one or both of If-Modified-Since and If-None-Match, or - one or both of If-Unmodified-Since and If-Match) modifies what - is returned if the GET is otherwise successful and the - condition is true. It does not affect the 304 (Not Modified) - response returned if the conditional is false. 
- - In some cases, it might be more appropriate to use the If-Range - header (see section 14.27) in addition to the Range header. - - If a proxy that supports ranges receives a Range request, forwards - the request to an inbound server, and receives an entire entity in - reply, it SHOULD only return the requested range to its client. It - SHOULD store the entire received response in its cache if that is - consistent with its cache allocation policies. - -14.36 Referer - - The Referer[sic] request-header field allows the client to specify, - for the server's benefit, the address (URI) of the resource from - which the Request-URI was obtained (the "referrer", although the - header field is misspelled.) The Referer request-header allows a - server to generate lists of back-links to resources for interest, - logging, optimized caching, etc. It also allows obsolete or mistyped - links to be traced for maintenance. The Referer field MUST NOT be - sent if the Request-URI was obtained from a source that does not have - its own URI, such as input from the user keyboard. - - Referer = "Referer" ":" ( absoluteURI | relativeURI ) - - Example: - - Referer: http://www.w3.org/hypertext/DataSources/Overview.html - - - - -Fielding, et al. Standards Track [Page 140] - -RFC 2616 HTTP/1.1 June 1999 - - - If the field value is a relative URI, it SHOULD be interpreted - relative to the Request-URI. The URI MUST NOT include a fragment. See - section 15.1.3 for security considerations. - -14.37 Retry-After - - The Retry-After response-header field can be used with a 503 (Service - Unavailable) response to indicate how long the service is expected to - be unavailable to the requesting client. This field MAY also be used - with any 3xx (Redirection) response to indicate the minimum time the - user-agent is asked to wait before issuing the redirected request. The - value of this field can be either an HTTP-date or an integer number - of seconds (in decimal) after the time of the response. 
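A client-side reading of the two Retry-After forms might look like the sketch below. The function name is hypothetical, `response_time` is assumed to be a timezone-aware timestamp of when the response was received, and returning `None` for an unparsable value is a choice of this sketch rather than something the text prescribes.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def retry_after_deadline(value, response_time):
    """Return the earliest time to retry, or None if the value is unparsable."""
    value = value.strip()
    if value.isascii() and value.isdigit():
        # delta-seconds: an integer number of seconds after the response
        return response_time + timedelta(seconds=int(value))
    try:
        when = parsedate_to_datetime(value)  # HTTP-date form
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)  # HTTP dates are GMT
    return when
```

A value of `120` therefore yields a deadline two minutes after the response time.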
- - Retry-After = "Retry-After" ":" ( HTTP-date | delta-seconds ) - - Two examples of its use are - - Retry-After: Fri, 31 Dec 1999 23:59:59 GMT - Retry-After: 120 - - In the latter example, the delay is 2 minutes. - -14.38 Server - - The Server response-header field contains information about the - software used by the origin server to handle the request. The field - can contain multiple product tokens (section 3.8) and comments - identifying the server and any significant subproducts. The product - tokens are listed in order of their significance for identifying the - application. - - Server = "Server" ":" 1*( product | comment ) - - Example: - - Server: CERN/3.0 libwww/2.17 - - If the response is being forwarded through a proxy, the proxy - application MUST NOT modify the Server response-header. Instead, it - SHOULD include a Via field (as described in section 14.45). - [[ Actually, it MUST ]] - - Note: Revealing the specific software version of the server might - allow the server machine to become more vulnerable to attacks - against software that is known to contain security holes. Server - implementors are encouraged to make this field a configurable - option. - - - - -Fielding, et al. Standards Track [Page 141] - -RFC 2616 HTTP/1.1 June 1999 - - -14.39 TE - - The TE request-header field indicates what extension transfer-codings - it is willing to accept in the response and whether or not it is - willing to accept trailer fields in a chunked transfer-coding. Its - value may consist of the keyword "trailers" and/or a comma-separated - list of extension transfer-coding names with optional accept - parameters (as described in section 3.6). - - TE = "TE" ":" #( t-codings ) - t-codings = "trailers" | ( transfer-extension [ accept-params ] ) - - The presence of the keyword "trailers" indicates that the client is - willing to accept trailer fields in a chunked transfer-coding, as - defined in section 3.6.1. 
This keyword is reserved for use with - transfer-coding values even though it does not itself represent a - transfer-coding. - - Examples of its use are: - - TE: deflate - TE: - TE: trailers, deflate;q=0.5 - - The TE header field only applies to the immediate connection. - Therefore, the keyword MUST be supplied within a Connection header - field (section 14.10) whenever TE is present in an HTTP/1.1 message. - - A server tests whether a transfer-coding is acceptable, according to - a TE field, using these rules: - - 1. The "chunked" transfer-coding is always acceptable. If the - keyword "trailers" is listed, the client indicates that it is - willing to accept trailer fields in the chunked response on - behalf of itself and any downstream clients. The implication is - that, if given, the client is stating that either all - downstream clients are willing to accept trailer fields in the - forwarded response, or that it will attempt to buffer the - response on behalf of downstream recipients. - - Note: HTTP/1.1 does not define any means to limit the size of a - chunked response such that a client can be assured of buffering - the entire response. - - 2. If the transfer-coding being tested is one of the transfer- - codings listed in the TE field, then it is acceptable unless it - is accompanied by a qvalue of 0. (As defined in section 3.9, a - qvalue of 0 means "not acceptable.") - - - -Fielding, et al. Standards Track [Page 142] - -RFC 2616 HTTP/1.1 June 1999 - - - 3. If multiple transfer-codings are acceptable, then the - acceptable transfer-coding with the highest non-zero qvalue is - preferred. The "chunked" transfer-coding always has a qvalue - of 1. - - If the TE field-value is empty or if no TE field is present, the only - transfer-coding is "chunked". A message with no transfer-coding is - always acceptable. 
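The acceptability rules for TE can be sketched as below. Both function names are hypothetical, the field value is assumed to be well-formed (a malformed `q` parameter would raise `ValueError` here), and accept-params other than `q` are ignored.

```python
from typing import Dict, Optional


def parse_te(value: str) -> Dict[str, float]:
    """Map each listed extension transfer-coding to its qvalue (default 1.0)."""
    codings: Dict[str, float] = {}
    for item in value.split(","):
        item = item.strip()
        if not item or item.lower() == "trailers":
            continue  # "trailers" is a keyword, not a transfer-coding
        name, *params = [part.strip() for part in item.split(";")]
        qvalue = 1.0
        for param in params:
            key, _, val = param.partition("=")
            if key.strip().lower() == "q":
                qvalue = float(val)
        codings[name.lower()] = qvalue
    return codings


def is_acceptable(coding: str, te_value: Optional[str]) -> bool:
    """Apply rules 1 and 2: chunked is always fine, others need a non-zero q."""
    coding = coding.lower()
    if coding == "chunked":
        return True
    return parse_te(te_value or "").get(coding, 0.0) > 0.0
```

With `TE: trailers, deflate;q=0.5`, `deflate` is acceptable, while an empty or absent TE field leaves only `chunked`.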
- -14.40 Trailer - - The Trailer general field value indicates that the given set of - header fields is present in the trailer of a message encoded with - chunked transfer-coding. - - Trailer = "Trailer" ":" 1#field-name - - An HTTP/1.1 message SHOULD include a Trailer header field in a - message using chunked transfer-coding with a non-empty trailer. Doing - so allows the recipient to know which header fields to expect in the - trailer. - - If no Trailer header field is present, the trailer SHOULD NOT include - any header fields. See section 3.6.1 for restrictions on the use of - trailer fields in a "chunked" transfer-coding. - - Message header fields listed in the Trailer header field MUST NOT - include the following header fields: - - . Transfer-Encoding - - . Content-Length - - . Trailer - -14.41 Transfer-Encoding - - The Transfer-Encoding general-header field indicates what (if any) - type of transformation has been applied to the message body in order - to safely transfer it between the sender and the recipient. This - differs from the content-coding in that the transfer-coding is a - property of the message, not of the entity. - - Transfer-Encoding = "Transfer-Encoding" ":" 1#transfer-coding - - Transfer-codings are defined in section 3.6. An example is: - - Transfer-Encoding: chunked - - - -Fielding, et al. Standards Track [Page 143] - -RFC 2616 HTTP/1.1 June 1999 - - - If multiple encodings have been applied to an entity, the transfer- - codings MUST be listed in the order in which they were applied. - Additional information about the encoding parameters MAY be provided - by other entity-header fields not defined by this specification. - - Many older HTTP/1.0 applications do not understand the Transfer- - Encoding header. - -14.42 Upgrade - - The Upgrade general-header allows the client to specify what - additional communication protocols it supports and would like to use - if the server finds it appropriate to switch protocols. 
The server - MUST use the Upgrade header field within a 101 (Switching Protocols) - response to indicate which protocol(s) are being switched. - - Upgrade = "Upgrade" ":" 1#product - - For example, - - Upgrade: HTTP/2.0, SHTTP/1.3, IRC/6.9, RTA/x11 - - The Upgrade header field is intended to provide a simple mechanism - for transition from HTTP/1.1 to some other, incompatible protocol. It - does so by allowing the client to advertise its desire to use another - protocol, such as a later version of HTTP with a higher major version - number, even though the current request has been made using HTTP/1.1. - This eases the difficult transition between incompatible protocols by - allowing the client to initiate a request in the more commonly - supported protocol while indicating to the server that it would like - to use a "better" protocol if available (where "better" is determined - by the server, possibly according to the nature of the method and/or - resource being requested). - - The Upgrade header field only applies to switching application-layer - protocols upon the existing transport-layer connection. Upgrade - cannot be used to insist on a protocol change; its acceptance and use - by the server is optional. The capabilities and nature of the - application-layer communication after the protocol change is entirely - dependent upon the new protocol chosen, although the first action - after changing the protocol MUST be a response to the initial HTTP - request containing the Upgrade header field. - - The Upgrade header field only applies to the immediate connection. - Therefore, the upgrade keyword MUST be supplied within a Connection - header field (section 14.10) whenever Upgrade is present in an - HTTP/1.1 message. - - - - -Fielding, et al. Standards Track [Page 144] - -RFC 2616 HTTP/1.1 June 1999 - - - The Upgrade header field cannot be used to indicate a switch to a - protocol on a different connection. 
For that purpose, it is more - appropriate to use a 301, 302, 303, or 305 redirection response. - - This specification only defines the protocol name "HTTP" for use by - the family of Hypertext Transfer Protocols, as defined by the HTTP - version rules of section 3.1 and future updates to this - specification. Any token can be used as a protocol name; however, it - will only be useful if both the client and server associate the name - with the same protocol. - -14.43 User-Agent - - The User-Agent request-header field contains information about the - user agent originating the request. This is for statistical purposes, - the tracing of protocol violations, and automated recognition of user - agents for the sake of tailoring responses to avoid particular user - agent limitations. User agents SHOULD include this field with - requests. The field can contain multiple product tokens (section 3.8) - and comments identifying the agent and any subproducts which form a - significant part of the user agent. By convention, the product tokens - are listed in order of their significance for identifying the - application. - - User-Agent = "User-Agent" ":" 1*( product | comment ) - - Example: - - User-Agent: CERN-LineMode/2.15 libwww/2.17b3 - -14.44 Vary - - The Vary field value indicates the set of request-header fields that - fully determines, while the response is fresh, whether a cache is - permitted to use the response to reply to a subsequent request - without revalidation. For uncacheable or stale responses, the Vary - field value advises the user agent about the criteria that were used - to select the representation. A Vary field value of "*" implies that - a cache cannot determine from the request headers of a subsequent - request whether this response is the appropriate representation. See - section 13.6 for use of the Vary header field by caches. 
- - Vary = "Vary" ":" ( "*" | 1#field-name ) - - An HTTP/1.1 server SHOULD include a Vary header field with any - cacheable response that is subject to server-driven negotiation. - Doing so allows a cache to properly interpret future requests on that - resource and informs the user agent about the presence of negotiation - - - -Fielding, et al. Standards Track [Page 145] - -RFC 2616 HTTP/1.1 June 1999 - - - on that resource. A server MAY include a Vary header field with a - non-cacheable response that is subject to server-driven negotiation, - since this might provide the user agent with useful information about - the dimensions over which the response varies at the time of the - response. - - A Vary field value consisting of a list of field-names signals that - the representation selected for the response is based on a selection - algorithm which considers ONLY the listed request-header field values - in selecting the most appropriate representation. A cache MAY assume - that the same selection will be made for future requests with the - same values for the listed field names, for the duration of time for - which the response is fresh. - - The field-names given are not limited to the set of standard - request-header fields defined by this specification. Field names are - case-insensitive. - - A Vary field value of "*" signals that unspecified parameters not - limited to the request-headers (e.g., the network address of the - client), play a role in the selection of the response representation. - The "*" value MUST NOT be generated by a proxy server; it may only be - generated by an origin server. - -14.45 Via - - The Via general-header field MUST be used by gateways and proxies to - indicate the intermediate protocols and recipients between the user - agent and the server on requests, and between the origin server and - the client on responses. 
It is analogous to the "Received" field of - RFC 822 [9] and is intended to be used for tracking message forwards, - avoiding request loops, and identifying the protocol capabilities of - all senders along the request/response chain. - - Via = "Via" ":" 1#( received-protocol received-by [ comment ] ) - received-protocol = [ protocol-name "/" ] protocol-version - protocol-name = token - protocol-version = token - received-by = ( host [ ":" port ] ) | pseudonym - pseudonym = token - - The received-protocol indicates the protocol version of the message - received by the server or client along each segment of the - request/response chain. The received-protocol version is appended to - the Via field value when the message is forwarded so that information - about the protocol capabilities of upstream applications remains - visible to all recipients. - - - - -Fielding, et al. Standards Track [Page 146] - -RFC 2616 HTTP/1.1 June 1999 - - - The protocol-name is optional if and only if it would be "HTTP". The - received-by field is normally the host and optional port number of a - recipient server or client that subsequently forwarded the message. - However, if the real host is considered to be sensitive information, - it MAY be replaced by a pseudonym. If the port is not given, it MAY - be assumed to be the default port of the received-protocol. - - Multiple Via field values represent each proxy or gateway that has - forwarded the message. Each recipient MUST append its information - such that the end result is ordered according to the sequence of - forwarding applications. - - Comments MAY be used in the Via header field to identify the software - of the recipient proxy or gateway, analogous to the User-Agent and - Server header fields. However, all comments in the Via field are - optional and MAY be removed by any recipient prior to forwarding the - message. 
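As a rough illustration of the Via grammar, the following Python sketch splits a field value into its entries and fills in the default protocol-name "HTTP" when it is omitted. The function names (split_via, parse_via) are hypothetical, and the sketch makes simplifying assumptions: it does not handle quoted-pair escapes inside comments, and it silently skips entries that lack a received-by part.

```python
def split_via(field_value):
    """Split on commas that are not inside a parenthesised comment."""
    parts, depth, current = [], 0, []
    for ch in field_value:
        if ch == "(":
            depth += 1
        elif ch == ")" and depth > 0:
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    parts.append("".join(current).strip())
    return [p for p in parts if p]


def parse_via(field_value):
    """Return a list of (protocol_name, protocol_version, received_by, comment)."""
    entries = []
    for part in split_via(field_value):
        comment = None
        if "(" in part:
            # Everything from the first "(" onward is treated as the comment.
            part, _, rest = part.partition("(")
            comment = rest.rstrip()
            if comment.endswith(")"):
                comment = comment[:-1]
        fields = part.split()
        if len(fields) < 2:
            continue  # malformed entry: no received-by (assumption: skip it)
        received_protocol, received_by = fields[0], fields[1]
        name, sep, version = received_protocol.rpartition("/")
        if not sep:
            # protocol-name may be omitted only when it would be "HTTP"
            name, version = "HTTP", received_protocol
        entries.append((name, version, received_by, comment))
    return entries
```

For the field value "1.0 fred, 1.1 nowhere.com (Apache/1.1)", parse_via yields two entries in forwarding order, both with protocol-name "HTTP", and only the second carries a comment.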
- - For example, a request message could be sent from an HTTP/1.0 user - agent to an internal proxy code-named "fred", which uses HTTP/1.1 to - forward the request to a public proxy at nowhere.com, which completes - the request by forwarding it to the origin server at www.ics.uci.edu. - The request received by www.ics.uci.edu would then have the following - Via header field: - - Via: 1.0 fred, 1.1 nowhere.com (Apache/1.1) - - Proxies and gateways used as a portal through a network firewall - SHOULD NOT, by default, forward the names and ports of hosts within - the firewall region. This information SHOULD only be propagated if - explicitly enabled. If not enabled, the received-by host of any host - behind the firewall SHOULD be replaced by an appropriate pseudonym - for that host. - - For organizations that have strong privacy requirements for hiding - internal structures, a proxy MAY combine an ordered subsequence of - Via header field entries with identical received-protocol values into - a single such entry. For example, - - Via: 1.0 ricky, 1.1 ethel, 1.1 fred, 1.0 lucy - - could be collapsed to - - Via: 1.0 ricky, 1.1 mertz, 1.0 lucy - - - - - - - -Fielding, et al. Standards Track [Page 147] - -RFC 2616 HTTP/1.1 June 1999 - - - Applications SHOULD NOT combine multiple entries unless they are all - under the same organizational control and the hosts have already been - replaced by pseudonyms. Applications MUST NOT combine entries which - have different received-protocol values. - -14.46 Warning - - The Warning general-header field is used to carry additional - information about the status or transformation of a message which - might not be reflected in the message. This information is typically - used to warn about a possible lack of semantic transparency from - caching operations or transformations applied to the entity body of - the message. 
- - Warning headers are sent with responses using: - - Warning = "Warning" ":" 1#warning-value - - warning-value = warn-code SP warn-agent SP warn-text - [SP warn-date] - - warn-code = 3DIGIT - warn-agent = ( host [ ":" port ] ) | pseudonym - ; the name or pseudonym of the server adding - ; the Warning header, for use in debugging - warn-text = quoted-string - warn-date = <"> HTTP-date <"> - - A response MAY carry more than one Warning header. - - The warn-text SHOULD be in a natural language and character set that - is most likely to be intelligible to the human user receiving the - response. This decision MAY be based on any available knowledge, such - as the location of the cache or user, the Accept-Language field in a - request, the Content-Language field in a response, etc. The default - language is English and the default character set is ISO-8859-1. - - If a character set other than ISO-8859-1 is used, it MUST be encoded - in the warn-text using the method described in RFC 2047 [14]. - - Warning headers can in general be applied to any message, however - some specific warn-codes are specific to caches and can only be - applied to response messages. New Warning headers SHOULD be added - after any existing Warning headers. A cache MUST NOT delete any - Warning header that it received with a message. However, if a cache - successfully validates a cache entry, it SHOULD remove any Warning - headers previously attached to that entry except as specified for - - - - -Fielding, et al. Standards Track [Page 148] - -RFC 2616 HTTP/1.1 June 1999 - - - specific Warning codes. It MUST then add any Warning headers received - in the validating response. In other words, Warning headers are those - that would be attached to the most recent relevant response. - - When multiple Warning headers are attached to a response, the user - agent ought to inform the user of as many of them as possible, in the - order that they appear in the response. 
If it is not possible to - inform the user of all of the warnings, the user agent SHOULD follow - these heuristics: - - - Warnings that appear early in the response take priority over - those appearing later in the response. - - - Warnings in the user's preferred character set take priority - over warnings in other character sets but with identical warn- - codes and warn-agents. - - Systems that generate multiple Warning headers SHOULD order them with - this user agent behavior in mind. - - Requirements for the behavior of caches with respect to Warnings are - stated in section 13.1.2. - - This is a list of the currently-defined warn-codes, each with a - recommended warn-text in English, and a description of its meaning. - - 110 Response is stale - MUST be included whenever the returned response is stale. - - 111 Revalidation failed - MUST be included if a cache returns a stale response because an - attempt to revalidate the response failed, due to an inability to - reach the server. - - 112 Disconnected operation - SHOULD be included if the cache is intentionally disconnected from - the rest of the network for a period of time. - - 113 Heuristic expiration - MUST be included if the cache heuristically chose a freshness - lifetime greater than 24 hours and the response's age is greater - than 24 hours. - - 199 Miscellaneous warning - The warning text MAY include arbitrary information to be presented - to a human user, or logged. A system receiving this warning MUST - NOT take any automated action, besides presenting the warning to - the user. - - - -Fielding, et al. 
Standards Track [Page 149] - -RFC 2616 HTTP/1.1 June 1999 - - - 214 Transformation applied - MUST be added by an intermediate cache or proxy if it applies any - transformation changing the content-coding (as specified in the - Content-Encoding header) or media-type (as specified in the - Content-Type header) of the response, or the entity-body of the - response, unless this Warning code already appears in the response. - - 299 Miscellaneous persistent warning - The warning text MAY include arbitrary information to be presented - to a human user, or logged. A system receiving this warning MUST - NOT take any automated action. - - If an implementation sends a message with one or more Warning headers - whose version is HTTP/1.0 or lower, then the sender MUST include in - each warning-value a warn-date that matches the date in the response. - - If an implementation receives a message with a warning-value that - includes a warn-date, and that warn-date is different from the Date - value in the response, then that warning-value MUST be deleted from - the message before storing, forwarding, or using it. (This prevents - bad consequences of naive caching of Warning header fields.) If all - of the warning-values are deleted for this reason, the Warning header - MUST be deleted as well. - -14.47 WWW-Authenticate - - The WWW-Authenticate response-header field MUST be included in 401 - (Unauthorized) response messages. The field value consists of at - least one challenge that indicates the authentication scheme(s) and - parameters applicable to the Request-URI. - - WWW-Authenticate = "WWW-Authenticate" ":" 1#challenge - - The HTTP access authentication process is described in "HTTP - Authentication: Basic and Digest Access Authentication" [43]. 
User - agents are advised to take special care in parsing the WWW- - Authenticate field value as it might contain more than one challenge, - or if more than one WWW-Authenticate header field is provided, the - contents of a challenge itself can contain a comma-separated list of - authentication parameters. - -15 Security Considerations - - This section is meant to inform application developers, information - providers, and users of the security limitations in HTTP/1.1 as - described by this document. The discussion does not include - definitive solutions to the problems revealed, though it does make - some suggestions for reducing security risks. - - - -Fielding, et al. Standards Track [Page 150] - -RFC 2616 HTTP/1.1 June 1999 - - -15.1 Personal Information - - HTTP clients are often privy to large amounts of personal information - (e.g. the user's name, location, mail address, passwords, encryption - keys, etc.), and SHOULD be very careful to prevent unintentional - leakage of this information via the HTTP protocol to other sources. - We very strongly recommend that a convenient interface be provided - for the user to control dissemination of such information, and that - designers and implementors be particularly careful in this area. - History shows that errors in this area often create serious security - and/or privacy problems and generate highly adverse publicity for the - implementor's company. - -15.1.1 Abuse of Server Log Information - - A server is in the position to save personal data about a user's - requests which might identify their reading patterns or subjects of - interest. This information is clearly confidential in nature and its - handling can be constrained by law in certain countries. People using - the HTTP protocol to provide data are responsible for ensuring that - such material is not distributed without the permission of any - individuals that are identifiable by the published results. 
- -15.1.2 Transfer of Sensitive Information - - Like any generic data transfer protocol, HTTP cannot regulate the - content of the data that is transferred, nor is there any a priori - method of determining the sensitivity of any particular piece of - information within the context of any given request. Therefore, - applications SHOULD supply as much control over this information as - possible to the provider of that information. Four header fields are - worth special mention in this context: Server, Via, Referer and From. - - Revealing the specific software version of the server might allow the - server machine to become more vulnerable to attacks against software - that is known to contain security holes. Implementors SHOULD make the - Server header field a configurable option. - - Proxies which serve as a portal through a network firewall SHOULD - take special precautions regarding the transfer of header information - that identifies the hosts behind the firewall. In particular, they - SHOULD remove, or replace with sanitized versions, any Via fields - generated behind the firewall. - - The Referer header allows reading patterns to be studied and reverse - links drawn. Although it can be very useful, its power can be abused - if user details are not separated from the information contained in - - - - -Fielding, et al. Standards Track [Page 151] - -RFC 2616 HTTP/1.1 June 1999 - - - the Referer. Even when the personal information has been removed, the - Referer header might indicate a private document's URI whose - publication would be inappropriate. - - The information sent in the From field might conflict with the user's - privacy interests or their site's security policy, and hence it - SHOULD NOT be transmitted without the user being able to disable, - enable, and modify the contents of the field. The user MUST be able - to set the contents of this field within a user preference or - application defaults configuration. 
- - We suggest, though do not require, that a convenient toggle interface - be provided for the user to enable or disable the sending of From and - Referer information. - - The User-Agent (section 14.43) or Server (section 14.38) header - fields can sometimes be used to determine that a specific client or - server has a particular security hole which might be exploited. - Unfortunately, this same information is often used for other valuable - purposes for which HTTP currently has no better mechanism. - -15.1.3 Encoding Sensitive Information in URI's - - Because the source of a link might be private information or might - reveal an otherwise private information source, it is strongly - recommended that the user be able to select whether or not the - Referer field is sent. For example, a browser client could have a - toggle switch for browsing openly/anonymously, which would - respectively enable/disable the sending of Referer and From - information. - - Clients SHOULD NOT include a Referer header field in a (non-secure) - HTTP request if the referring page was transferred with a secure - protocol. - - Authors of services which use the HTTP protocol SHOULD NOT use GET - based forms for the submission of sensitive data, because this will - cause this data to be encoded in the Request-URI. Many existing - servers, proxies, and user agents will log the request URI in some - place where it might be visible to third parties. Servers can use - POST-based form submission instead. - -15.1.4 Privacy Issues Connected to Accept Headers - - Accept request-headers can reveal information about the user to all - servers which are accessed. The Accept-Language header in particular - can reveal information the user would consider to be of a private - nature, because the understanding of particular languages is often - - - -Fielding, et al. Standards Track [Page 152] - -RFC 2616 HTTP/1.1 June 1999 - - - strongly correlated to the membership of a particular ethnic group. 
- User agents which offer the option to configure the contents of an - Accept-Language header to be sent in every request are strongly - encouraged to let the configuration process include a message which - makes the user aware of the loss of privacy involved. - - An approach that limits the loss of privacy would be for a user agent - to omit the sending of Accept-Language headers by default, and to ask - the user whether or not to start sending Accept-Language headers to a - server if it detects, by looking for any Vary response-header fields - generated by the server, that such sending could improve the quality - of service. - - Elaborate user-customized accept header fields sent in every request, - in particular if these include quality values, can be used by servers - as relatively reliable and long-lived user identifiers. Such user - identifiers would allow content providers to do click-trail tracking, - and would allow collaborating content providers to match cross-server - click-trails or form submissions of individual users. Note that for - many users not behind a proxy, the network address of the host - running the user agent will also serve as a long-lived user - identifier. In environments where proxies are used to enhance - privacy, user agents ought to be conservative in offering accept - header configuration options to end users. As an extreme privacy - measure, proxies could filter the accept headers in relayed requests. - General purpose user agents which provide a high degree of header - configurability SHOULD warn users about the loss of privacy which can - be involved. - -15.2 Attacks Based On File and Path Names - - Implementations of HTTP origin servers SHOULD be careful to restrict - the documents returned by HTTP requests to be only those that were - intended by the server administrators. 
If an HTTP server translates - HTTP URIs directly into file system calls, the server MUST take - special care not to serve files that were not intended to be - delivered to HTTP clients. For example, UNIX, Microsoft Windows, and - other operating systems use ".." as a path component to indicate a - directory level above the current one. On such a system, an HTTP - server MUST disallow any such construct in the Request-URI if it - would otherwise allow access to a resource outside those intended to - be accessible via the HTTP server. Similarly, files intended for - reference only internally to the server (such as access control - files, configuration files, and script code) MUST be protected from - inappropriate retrieval, since they might contain sensitive - information. Experience has shown that minor bugs in such HTTP server - implementations have turned into security risks. - - - - -Fielding, et al. Standards Track [Page 153] - -RFC 2616 HTTP/1.1 June 1999 - - -15.3 DNS Spoofing - - Clients using HTTP rely heavily on the Domain Name Service, and are - thus generally prone to security attacks based on the deliberate - mis-association of IP addresses and DNS names. Clients need to be - cautious in assuming the continuing validity of an IP number/DNS name - association. - - In particular, HTTP clients SHOULD rely on their name resolver for - confirmation of an IP number/DNS name association, rather than - caching the result of previous host name lookups. Many platforms - already can cache host name lookups locally when appropriate, and - they SHOULD be configured to do so. It is proper for these lookups to - be cached, however, only when the TTL (Time To Live) information - reported by the name server makes it likely that the cached - information will remain useful. - - If HTTP clients cache the results of host name lookups in order to - achieve a performance improvement, they MUST observe the TTL - information reported by DNS. 
- - If HTTP clients do not observe this rule, they could be spoofed when - a previously-accessed server's IP address changes. As network - renumbering is expected to become increasingly common [24], the - possibility of this form of attack will grow. Observing this - requirement thus reduces this potential security vulnerability. - - This requirement also improves the load-balancing behavior of clients - for replicated servers using the same DNS name and reduces the - likelihood of a user's experiencing failure in accessing sites which - use that strategy. - -15.4 Location Headers and Spoofing - - If a single server supports multiple organizations that do not trust - one another, then it MUST check the values of Location and Content- - Location headers in responses that are generated under control of - said organizations to make sure that they do not attempt to - invalidate resources over which they have no authority. - -15.5 Content-Disposition Issues - - RFC 1806 [35], from which the often implemented Content-Disposition - (see section 19.5.1) header in HTTP is derived, has a number of very - serious security considerations. Content-Disposition is not part of - the HTTP standard, but since it is widely implemented, we are - documenting its use and risks for implementors. See RFC 2183 [49] - (which updates RFC 1806) for details. - - - -Fielding, et al. Standards Track [Page 154] - -RFC 2616 HTTP/1.1 June 1999 - - -15.6 Authentication Credentials and Idle Clients - - Existing HTTP clients and user agents typically retain authentication - information indefinitely. HTTP/1.1 does not provide a method for a - server to direct clients to discard these cached credentials. This is - a significant defect that requires further extensions to HTTP. 
- Circumstances under which credential caching can interfere with the - application's security model include but are not limited to: - - - Clients which have been idle for an extended period following - which the server might wish to cause the client to reprompt the - user for credentials. - - - Applications which include a session termination indication - (such as a `logout' or `commit' button on a page) after which - the server side of the application `knows' that there is no - further reason for the client to retain the credentials. - - This is currently under separate study. There are a number of work- - arounds to parts of this problem, and we encourage the use of - password protection in screen savers, idle time-outs, and other - methods which mitigate the security problems inherent in this - problem. In particular, user agents which cache credentials are - encouraged to provide a readily accessible mechanism for discarding - cached credentials under user control. - -15.7 Proxies and Caching - - By their very nature, HTTP proxies are men-in-the-middle, and - represent an opportunity for man-in-the-middle attacks. Compromise of - the systems on which the proxies run can result in serious security - and privacy problems. Proxies have access to security-related - information, personal information about individual users and - organizations, and proprietary information belonging to users and - content providers. A compromised proxy, or a proxy implemented or - configured without regard to security and privacy considerations, - might be used in the commission of a wide range of potential attacks. - - Proxy operators should protect the systems on which proxies run as - they would protect any system that contains or transports sensitive - information. In particular, log information gathered at proxies often - contains highly sensitive personal information, and/or information - about organizations. 
Log information should be carefully guarded, and - appropriate guidelines for use developed and followed. (Section - 15.1.1). - - - - - - -Fielding, et al. Standards Track [Page 155] - -RFC 2616 HTTP/1.1 June 1999 - - - Caching proxies provide additional potential vulnerabilities, since - the contents of the cache represent an attractive target for - malicious exploitation. Because cache contents persist after an HTTP - request is complete, an attack on the cache can reveal information - long after a user believes that the information has been removed from - the network. Therefore, cache contents should be protected as - sensitive information. - - Proxy implementors should consider the privacy and security - implications of their design and coding decisions, and of the - configuration options they provide to proxy operators (especially the - default configuration). - - Users of a proxy need to be aware that they are no trustworthier than - the people who run the proxy; HTTP itself cannot solve this problem. - - The judicious use of cryptography, when appropriate, may suffice to - protect against a broad range of security and privacy attacks. Such - cryptography is beyond the scope of the HTTP/1.1 specification. - -15.7.1 Denial of Service Attacks on Proxies - - They exist. They are hard to defend against. Research continues. - Beware. - -16 Acknowledgments - - This specification makes heavy use of the augmented BNF and generic - constructs defined by David H. Crocker for RFC 822 [9]. Similarly, it - reuses many of the definitions provided by Nathaniel Borenstein and - Ned Freed for MIME [7]. We hope that their inclusion in this - specification will help reduce past confusion over the relationship - between HTTP and Internet mail message formats. - - The HTTP protocol has evolved considerably over the years. 
It has - benefited from a large and active developer community--the many - people who have participated on the www-talk mailing list--and it is - that community which has been most responsible for the success of - HTTP and of the World-Wide Web in general. Marc Andreessen, Robert - Cailliau, Daniel W. Connolly, Bob Denny, John Franks, Jean-Francois - Groff, Phillip M. Hallam-Baker, Hakon W. Lie, Ari Luotonen, Rob - McCool, Lou Montulli, Dave Raggett, Tony Sanders, and Marc - VanHeyningen deserve special recognition for their efforts in - defining early aspects of the protocol. - - This document has benefited greatly from the comments of all those - participating in the HTTP-WG. In addition to those already mentioned, - the following individuals have contributed to this specification: - - - -Fielding, et al. Standards Track [Page 156] - -RFC 2616 HTTP/1.1 June 1999 - - - Gary Adams Ross Patterson - Harald Tveit Alvestrand Albert Lunde - Keith Ball John C. Mallery - Brian Behlendorf Jean-Philippe Martin-Flatin - Paul Burchard Mitra - Maurizio Codogno David Morris - Mike Cowlishaw Gavin Nicol - Roman Czyborra Bill Perry - Michael A. Dolan Jeffrey Perry - David J. Fiander Scott Powers - Alan Freier Owen Rees - Marc Hedlund Luigi Rizzo - Greg Herlihy David Robinson - Koen Holtman Marc Salomon - Alex Hopmann Rich Salz - Bob Jernigan Allan M. Schiffman - Shel Kaphan Jim Seidman - Rohit Khare Chuck Shotton - John Klensin Eric W. Sink - Martijn Koster Simon E. Spero - Alexei Kosut Richard N. Taylor - David M. Kristol Robert S. Thau - Daniel LaLiberte Bill (BearHeart) Weinman - Ben Laurie Francois Yergeau - Paul J. Leach Mary Ellen Zurko - Daniel DuBois Josh Cohen - - - Much of the content and presentation of the caching design is due to - suggestions and comments from individuals including: Shel Kaphan, - Paul Leach, Koen Holtman, David Morris, and Larry Masinter. 
- - Most of the specification of ranges is based on work originally done - by Ari Luotonen and John Franks, with additional input from Steve - Zilles. - - Thanks to the "cave men" of Palo Alto. You know who you are. - - Jim Gettys (the current editor of this document) wishes particularly - to thank Roy Fielding, the previous editor of this document, along - with John Klensin, Jeff Mogul, Paul Leach, Dave Kristol, Koen - Holtman, John Franks, Josh Cohen, Alex Hopmann, Scott Lawrence, and - Larry Masinter for their help. And thanks go particularly to Jeff - Mogul and Scott Lawrence for performing the "MUST/MAY/SHOULD" audit. - - - - - - - -Fielding, et al. Standards Track [Page 157] - -RFC 2616 HTTP/1.1 June 1999 - - - The Apache Group, Anselm Baird-Smith, author of Jigsaw, and Henrik - Frystyk implemented RFC 2068 early, and we wish to thank them for the - discovery of many of the problems that this document attempts to - rectify. - -17 References - - [1] Alvestrand, H., "Tags for the Identification of Languages", RFC - 1766, March 1995. - - [2] Anklesaria, F., McCahill, M., Lindner, P., Johnson, D., Torrey, - D. and B. Alberti, "The Internet Gopher Protocol (a distributed - document search and retrieval protocol)", RFC 1436, March 1993. - - [3] Berners-Lee, T., "Universal Resource Identifiers in WWW", RFC - 1630, June 1994. - - [4] Berners-Lee, T., Masinter, L. and M. McCahill, "Uniform Resource - Locators (URL)", RFC 1738, December 1994. - - [5] Berners-Lee, T. and D. Connolly, "Hypertext Markup Language - - 2.0", RFC 1866, November 1995. - - [6] Berners-Lee, T., Fielding, R. and H. Frystyk, "Hypertext Transfer - Protocol -- HTTP/1.0", RFC 1945, May 1996. - - [7] Freed, N. and N. Borenstein, "Multipurpose Internet Mail - Extensions (MIME) Part One: Format of Internet Message Bodies", - RFC 2045, November 1996. - - [8] Braden, R., "Requirements for Internet Hosts -- Communication - Layers", STD 3, RFC 1123, October 1989. 
- - [9] Crocker, D., "Standard for The Format of ARPA Internet Text - Messages", STD 11, RFC 822, August 1982. - - [10] Davis, F., Kahle, B., Morris, H., Salem, J., Shen, T., Wang, R., - Sui, J., and M. Grinbaum, "WAIS Interface Protocol Prototype - Functional Specification," (v1.5), Thinking Machines - Corporation, April 1990. - - [11] Fielding, R., "Relative Uniform Resource Locators", RFC 1808, - June 1995. - - [12] Horton, M. and R. Adams, "Standard for Interchange of USENET - Messages", RFC 1036, December 1987. - - - - - -Fielding, et al. Standards Track [Page 158] - -RFC 2616 HTTP/1.1 June 1999 - - - [13] Kantor, B. and P. Lapsley, "Network News Transfer Protocol", RFC - 977, February 1986. - - [14] Moore, K., "MIME (Multipurpose Internet Mail Extensions) Part - Three: Message Header Extensions for Non-ASCII Text", RFC 2047, - November 1996. - - [15] Nebel, E. and L. Masinter, "Form-based File Upload in HTML", RFC - 1867, November 1995. - - [16] Postel, J., "Simple Mail Transfer Protocol", STD 10, RFC 821, - August 1982. - - [17] Postel, J., "Media Type Registration Procedure", RFC 1590, - November 1996. - -[[ Should be: ]] -[[ [17] Freed, N., Klensin, J., and Postel, J., "Multipurpose Internet ]] -[[ Mail Extensions (MIME) Part Four: "Registration Procedure", ]] -[[ RFC 2048, November 1996. ]] - - [18] Postel, J. and J. Reynolds, "File Transfer Protocol", STD 9, RFC - 959, October 1985. - - [19] Reynolds, J. and J. Postel, "Assigned Numbers", STD 2, RFC 1700, - October 1994. - - [20] Sollins, K. and L. Masinter, "Functional Requirements for - Uniform Resource Names", RFC 1737, December 1994. - - [21] US-ASCII. Coded Character Set - 7-Bit American Standard Code for - Information Interchange. Standard ANSI X3.4-1986, ANSI, 1986. - - [22] ISO-8859. International Standard -- Information Processing -- - 8-bit Single-Byte Coded Graphic Character Sets -- - Part 1: Latin alphabet No. 1, ISO-8859-1:1987. - Part 2: Latin alphabet No. 2, ISO-8859-2, 1987. 
- Part 3: Latin alphabet No. 3, ISO-8859-3, 1988. - Part 4: Latin alphabet No. 4, ISO-8859-4, 1988. - Part 5: Latin/Cyrillic alphabet, ISO-8859-5, 1988. - Part 6: Latin/Arabic alphabet, ISO-8859-6, 1987. - Part 7: Latin/Greek alphabet, ISO-8859-7, 1987. - Part 8: Latin/Hebrew alphabet, ISO-8859-8, 1988. - Part 9: Latin alphabet No. 5, ISO-8859-9, 1990. - - [23] Meyers, J. and M. Rose, "The Content-MD5 Header Field", RFC - 1864, October 1995. - - [24] Carpenter, B. and Y. Rekhter, "Renumbering Needs Work", RFC - 1900, February 1996. - - [25] Deutsch, P., "GZIP file format specification version 4.3", RFC - 1952, May 1996. - - - -Fielding, et al. Standards Track [Page 159] - -RFC 2616 HTTP/1.1 June 1999 - - - [26] Venkata N. Padmanabhan, and Jeffrey C. Mogul. "Improving HTTP - Latency", Computer Networks and ISDN Systems, v. 28, pp. 25-35, - Dec. 1995. Slightly revised version of paper in Proc. 2nd - International WWW Conference '94: Mosaic and the Web, Oct. 1994, - which is available at - http://www.ncsa.uiuc.edu/SDG/IT94/Proceedings/DDay/mogul/HTTPLat - ency.html. - - [27] Joe Touch, John Heidemann, and Katia Obraczka. "Analysis of HTTP - Performance", <URL: http://www.isi.edu/touch/pubs/http-perf96/>, - ISI Research Report ISI/RR-98-463, (original report dated Aug. - 1996), USC/Information Sciences Institute, August 1998. - - [28] Mills, D., "Network Time Protocol (Version 3) Specification, - Implementation and Analysis", RFC 1305, March 1992. - - [29] Deutsch, P., "DEFLATE Compressed Data Format Specification - version 1.3", RFC 1951, May 1996. - - [30] S. Spero, "Analysis of HTTP Performance Problems," - http://sunsite.unc.edu/mdma-release/http-prob.html. - - [31] Deutsch, P. and J. Gailly, "ZLIB Compressed Data Format - Specification version 3.3", RFC 1950, May 1996. - - [32] Franks, J., Hallam-Baker, P., Hostetler, J., Leach, P., - Luotonen, A., Sink, E. and L. Stewart, "An Extension to HTTP: - Digest Access Authentication", RFC 2069, January 1997. 
- - [33] Fielding, R., Gettys, J., Mogul, J., Frystyk, H. and T. - Berners-Lee, "Hypertext Transfer Protocol -- HTTP/1.1", RFC - 2068, January 1997. - - [34] Bradner, S., "Key words for use in RFCs to Indicate Requirement - Levels", BCP 14, RFC 2119, March 1997. - - [35] Troost, R. and Dorner, S., "Communicating Presentation - Information in Internet Messages: The Content-Disposition - Header", RFC 1806, June 1995. - - [36] Mogul, J., Fielding, R., Gettys, J. and H. Frystyk, "Use and - Interpretation of HTTP Version Numbers", RFC 2145, May 1997. - [jg639] - - [37] Palme, J., "Common Internet Message Headers", RFC 2076, February - 1997. [jg640] - - - - - -Fielding, et al. Standards Track [Page 160] - -RFC 2616 HTTP/1.1 June 1999 - - - [38] Yergeau, F., "UTF-8, a transformation format of Unicode and - ISO-10646", RFC 2279, January 1998. [jg641] - - [39] Nielsen, H.F., Gettys, J., Baird-Smith, A., Prud'hommeaux, E., - Lie, H., and C. Lilley. "Network Performance Effects of - HTTP/1.1, CSS1, and PNG," Proceedings of ACM SIGCOMM '97, Cannes - France, September 1997.[jg642] - - [40] Freed, N. and N. Borenstein, "Multipurpose Internet Mail - Extensions (MIME) Part Two: Media Types", RFC 2046, November - 1996. [jg643] - - [41] Alvestrand, H., "IETF Policy on Character Sets and Languages", - BCP 18, RFC 2277, January 1998. [jg644] - - [42] Berners-Lee, T., Fielding, R. and L. Masinter, "Uniform Resource - Identifiers (URI): Generic Syntax and Semantics", RFC 2396, - August 1998. [jg645] - - [43] Franks, J., Hallam-Baker, P., Hostetler, J., Lawrence, S., - Leach, P., Luotonen, A., Sink, E. and L. Stewart, "HTTP - Authentication: Basic and Digest Access Authentication", RFC - 2617, June 1999. [jg646] - - [44] Luotonen, A., "Tunneling TCP based protocols through Web proxy - servers," Work in Progress. [jg647] - - [45] Palme, J. and A. Hopmann, "MIME E-mail Encapsulation of - Aggregate Documents, such as HTML (MHTML)", RFC 2110, March - 1997. 
- - [46] Bradner, S., "The Internet Standards Process -- Revision 3", BCP - 9, RFC 2026, October 1996. - - [47] Masinter, L., "Hyper Text Coffee Pot Control Protocol - (HTCPCP/1.0)", RFC 2324, 1 April 1998. - - [48] Freed, N. and N. Borenstein, "Multipurpose Internet Mail - Extensions (MIME) Part Five: Conformance Criteria and Examples", - RFC 2049, November 1996. - - [49] Troost, R., Dorner, S. and K. Moore, "Communicating Presentation - Information in Internet Messages: The Content-Disposition Header - Field", RFC 2183, August 1997. - - - - - - - -Fielding, et al. Standards Track [Page 161] - -RFC 2616 HTTP/1.1 June 1999 - - -18 Authors' Addresses - - Roy T. Fielding - Information and Computer Science - University of California, Irvine - Irvine, CA 92697-3425, USA - - Fax: +1 (949) 824-1715 - EMail: fielding@ics.uci.edu - - - James Gettys - World Wide Web Consortium - MIT Laboratory for Computer Science - 545 Technology Square - Cambridge, MA 02139, USA - - Fax: +1 (617) 258 8682 - EMail: jg@w3.org - - - Jeffrey C. Mogul - Western Research Laboratory - Compaq Computer Corporation - 250 University Avenue - Palo Alto, California, 94305, USA - - EMail: mogul@wrl.dec.com - - - Henrik Frystyk Nielsen - World Wide Web Consortium - MIT Laboratory for Computer Science - 545 Technology Square - Cambridge, MA 02139, USA - - Fax: +1 (617) 258 8682 - EMail: frystyk@w3.org - - - Larry Masinter - Xerox Corporation - 3333 Coyote Hill Road - Palo Alto, CA 94034, USA - - EMail: masinter@parc.xerox.com - - - - - -Fielding, et al. Standards Track [Page 162] - -RFC 2616 HTTP/1.1 June 1999 - - - Paul J. 
Leach - Microsoft Corporation - 1 Microsoft Way - Redmond, WA 98052, USA - - EMail: paulle@microsoft.com - - - Tim Berners-Lee - Director, World Wide Web Consortium - MIT Laboratory for Computer Science - 545 Technology Square - Cambridge, MA 02139, USA - - Fax: +1 (617) 258 8682 - EMail: timbl@w3.org - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -Fielding, et al. Standards Track [Page 163] - -RFC 2616 HTTP/1.1 June 1999 - - -19 Appendices - -19.1 Internet Media Type message/http and application/http - - In addition to defining the HTTP/1.1 protocol, this document serves - as the specification for the Internet media type "message/http" and - "application/http". The message/http type can be used to enclose a - single HTTP request or response message, provided that it obeys the - MIME restrictions for all "message" types regarding line length and - encodings. The application/http type can be used to enclose a - pipeline of one or more HTTP request or response messages (not - intermixed). The following is to be registered with IANA [17]. - - Media Type name: message - Media subtype name: http - Required parameters: none - Optional parameters: version, msgtype - version: The HTTP-Version number of the enclosed message - (e.g., "1.1"). If not present, the version can be - determined from the first line of the body. - msgtype: The message type -- "request" or "response". If not - present, the type can be determined from the first - line of the body. - Encoding considerations: only "7bit", "8bit", or "binary" are - permitted - Security considerations: none - - Media Type name: application - Media subtype name: http - Required parameters: none - Optional parameters: version, msgtype - version: The HTTP-Version number of the enclosed messages - (e.g., "1.1"). If not present, the version can be - determined from the first line of the body. - msgtype: The message type -- "request" or "response". 
If not - present, the type can be determined from the first - line of the body. - Encoding considerations: HTTP messages enclosed by this type - are in "binary" format; use of an appropriate - Content-Transfer-Encoding is required when - transmitted via E-mail. - Security considerations: none - - - - - - - - - -Fielding, et al. Standards Track [Page 164] - -RFC 2616 HTTP/1.1 June 1999 - - -19.2 Internet Media Type multipart/byteranges - - When an HTTP 206 (Partial Content) response message includes the - content of multiple ranges (a response to a request for multiple - non-overlapping ranges), these are transmitted as a multipart - message-body. The media type for this purpose is called - "multipart/byteranges". - - The multipart/byteranges media type includes two or more parts, each - with its own Content-Type and Content-Range fields. The required - boundary parameter specifies the boundary string used to separate - each body-part. - - Media Type name: multipart - Media subtype name: byteranges - Required parameters: boundary - Optional parameters: none - Encoding considerations: only "7bit", "8bit", or "binary" are - permitted - Security considerations: none - - - For example: - - HTTP/1.1 206 Partial Content - Date: Wed, 15 Nov 1995 06:25:24 GMT - Last-Modified: Wed, 15 Nov 1995 04:58:08 GMT - Content-type: multipart/byteranges; boundary=THIS_STRING_SEPARATES - - --THIS_STRING_SEPARATES - Content-type: application/pdf - Content-range: bytes 500-999/8000 - - ...the first range... - --THIS_STRING_SEPARATES - Content-type: application/pdf - Content-range: bytes 7000-7999/8000 - - ...the second range - --THIS_STRING_SEPARATES-- - - Notes: - - 1) Additional CRLFs may precede the first boundary string in the - entity. - - - - - - -Fielding, et al. Standards Track [Page 165] - -RFC 2616 HTTP/1.1 June 1999 - - - 2) Although RFC 2046 [40] permits the boundary string to be - quoted, some existing implementations handle a quoted boundary - string incorrectly. 
- - 3) A number of browsers and servers were coded to an early draft - of the byteranges specification to use a media type of - multipart/x-byteranges, which is almost, but not quite - compatible with the version documented in HTTP/1.1. - -19.3 Tolerant Applications - - Although this document specifies the requirements for the generation - of HTTP/1.1 messages, not all applications will be correct in their - implementation. We therefore recommend that operational applications - be tolerant of deviations whenever those deviations can be - interpreted unambiguously. - - Clients SHOULD be tolerant in parsing the Status-Line and servers - tolerant when parsing the Request-Line. In particular, they SHOULD - accept any amount of SP or HT characters between fields, even though - only a single SP is required. - - The line terminator for message-header fields is the sequence CRLF. - However, we recommend that applications, when parsing such headers, - recognize a single LF as a line terminator and ignore the leading CR. - - The character set of an entity-body SHOULD be labeled as the lowest - common denominator of the character codes used within that body, with - the exception that not labeling the entity is preferred over labeling - the entity with the labels US-ASCII or ISO-8859-1. See section 3.7.1 - and 3.4.1. - - Additional rules for requirements on parsing and encoding of dates - and other potential problems with date encodings include: - - - HTTP/1.1 clients and caches SHOULD assume that an RFC-850 date - which appears to be more than 50 years in the future is in fact - in the past (this helps solve the "year 2000" problem). - - - An HTTP/1.1 implementation MAY internally represent a parsed - Expires date as earlier than the proper value, but MUST NOT - internally represent a parsed Expires date as later than the - proper value. - - - All expiration-related calculations MUST be done in GMT. 
The - local time zone MUST NOT influence the calculation or comparison - of an age or expiration time. - - - - -Fielding, et al. Standards Track [Page 166] - -RFC 2616 HTTP/1.1 June 1999 - - - - If an HTTP header incorrectly carries a date value with a time - zone other than GMT, it MUST be converted into GMT using the - most conservative possible conversion. - -19.4 Differences Between HTTP Entities and RFC 2045 Entities - - HTTP/1.1 uses many of the constructs defined for Internet Mail (RFC - 822 [9]) and the Multipurpose Internet Mail Extensions (MIME [7]) to - allow entities to be transmitted in an open variety of - representations and with extensible mechanisms. However, RFC 2045 - discusses mail, and HTTP has a few features that are different from - those described in RFC 2045. These differences were carefully chosen - to optimize performance over binary connections, to allow greater - freedom in the use of new media types, to make date comparisons - easier, and to acknowledge the practice of some early HTTP servers - and clients. - - This appendix describes specific areas where HTTP differs from RFC - 2045. Proxies and gateways to strict MIME environments SHOULD be - aware of these differences and provide the appropriate conversions - where necessary. Proxies and gateways from MIME environments to HTTP - also need to be aware of the differences because some conversions - might be required. - -19.4.1 MIME-Version - - HTTP is not a MIME-compliant protocol. However, HTTP/1.1 messages MAY - include a single MIME-Version general-header field to indicate what - version of the MIME protocol was used to construct the message. Use - of the MIME-Version header field indicates that the message is in - full compliance with the MIME protocol (as defined in RFC 2045[7]). - Proxies/gateways are responsible for ensuring full compliance (where - possible) when exporting HTTP messages to strict MIME environments. - - MIME-Version = "MIME-Version" ":" 1*DIGIT "." 
1*DIGIT - - MIME version "1.0" is the default for use in HTTP/1.1. However, - HTTP/1.1 message parsing and semantics are defined by this document - and not the MIME specification. - -19.4.2 Conversion to Canonical Form - - RFC 2045 [7] requires that an Internet mail entity be converted to - canonical form prior to being transferred, as described in section 4 - of RFC 2049 [48]. Section 3.7.1 of this document describes the forms - allowed for subtypes of the "text" media type when transmitted over - HTTP. RFC 2046 requires that content with a type of "text" represent - line breaks as CRLF and forbids the use of CR or LF outside of line - - - -Fielding, et al. Standards Track [Page 167] - -RFC 2616 HTTP/1.1 June 1999 - - - break sequences. HTTP allows CRLF, bare CR, and bare LF to indicate a - line break within text content when a message is transmitted over - HTTP. - - Where it is possible, a proxy or gateway from HTTP to a strict MIME - environment SHOULD translate all line breaks within the text media - types described in section 3.7.1 of this document to the RFC 2049 - canonical form of CRLF. Note, however, that this might be complicated - by the presence of a Content-Encoding and by the fact that HTTP - allows the use of some character sets which do not use octets 13 and - 10 to represent CR and LF, as is the case for some multi-byte - character sets. - - Implementors should note that conversion will break any cryptographic - checksums applied to the original content unless the original content - is already in canonical form. Therefore, the canonical form is - recommended for any content that uses such checksums in HTTP. - -19.4.3 Conversion of Date Formats - - HTTP/1.1 uses a restricted set of date formats (section 3.3.1) to - simplify the process of date comparison. Proxies and gateways from - other protocols SHOULD ensure that any Date header field present in a - message conforms to one of the HTTP/1.1 formats and rewrite the date - if necessary. 
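The date rewriting that section 19.4.3 asks of proxies and gateways can be illustrated with a minimal Python sketch. It assumes the Python standard library's email.utils parser is an acceptable front end for the three date layouts of section 3.3.1; the function name to_http_date is hypothetical and not part of the RFC or of libsoup. Two-digit years are resolved by the library's own pivot, not by the 50-year rule of section 19.3, and a date without a zone (the asctime layout) is assumed to already be in GMT.

```python
from datetime import timezone
from email.utils import format_datetime, parsedate_to_datetime


def to_http_date(value: str) -> str:
    """Rewrite a date string into the RFC 1123 layout preferred by HTTP/1.1.

    Hypothetical helper: accepts whatever email.utils can parse, which
    includes the RFC 1123, RFC 850 and asctime() layouts.
    """
    parsed = parsedate_to_datetime(value)
    if parsed.tzinfo is None:
        # asctime() carries no zone; assume the value is already GMT.
        parsed = parsed.replace(tzinfo=timezone.utc)
    else:
        # Normalise any other offset to GMT before formatting.
        parsed = parsed.astimezone(timezone.utc)
    # usegmt=True emits the literal "GMT" suffix instead of "+0000".
    return format_datetime(parsed, usegmt=True)
```

A gateway might call to_http_date() on each incoming Date header value and forward the returned string in place of the original.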
- -19.4.4 Introduction of Content-Encoding - - RFC 2045 does not include any concept equivalent to HTTP/1.1's - Content-Encoding header field. Since this acts as a modifier on the - media type, proxies and gateways from HTTP to MIME-compliant - protocols MUST either change the value of the Content-Type header - field or decode the entity-body before forwarding the message. (Some - experimental applications of Content-Type for Internet mail have used - a media-type parameter of ";conversions=<content-coding>" to perform - a function equivalent to Content-Encoding. However, this parameter is - not part of RFC 2045.) - -19.4.5 No Content-Transfer-Encoding - - HTTP does not use the Content-Transfer-Encoding (CTE) field of RFC - 2045. Proxies and gateways from MIME-compliant protocols to HTTP MUST - remove any non-identity CTE ("quoted-printable" or "base64") encoding - prior to delivering the response message to an HTTP client. - - [[ "MUST remove any CTE encoding prior to delivering the response ]] - [[ message to an HTTP client." ]] - - Proxies and gateways from HTTP to MIME-compliant protocols are - responsible for ensuring that the message is in the correct format - and encoding for safe transport on that protocol, where "safe - - - -Fielding, et al. Standards Track [Page 168] - -RFC 2616 HTTP/1.1 June 1999 - - - transport" is defined by the limitations of the protocol being used. - Such a proxy or gateway SHOULD label the data with an appropriate - Content-Transfer-Encoding if doing so will improve the likelihood of - safe transport over the destination protocol. - -19.4.6 Introduction of Transfer-Encoding - - HTTP/1.1 introduces the Transfer-Encoding header field (section - 14.41). Proxies/gateways MUST remove any transfer-coding prior to - forwarding a message via a MIME-compliant protocol. 
- - A process for decoding the "chunked" transfer-coding (section 3.6) - can be represented in pseudo-code as: - - length := 0 - read chunk-size, chunk-extension (if any) and CRLF - while (chunk-size > 0) { - read chunk-data and CRLF - append chunk-data to entity-body - length := length + chunk-size - read chunk-size and CRLF - } - read entity-header - while (entity-header not empty) { - append entity-header to existing header fields - read entity-header - } - Content-Length := length - Remove "chunked" from Transfer-Encoding - -19.4.7 MHTML and Line Length Limitations - - HTTP implementations which share code with MHTML [45] implementations - need to be aware of MIME line length limitations. Since HTTP does not - have this limitation, HTTP does not fold long lines. MHTML messages - being transported by HTTP follow all conventions of MHTML, including - line length limitations and folding, canonicalization, etc., since - HTTP transports all message-bodies as payload (see section 3.7.2) and - does not interpret the content or any MIME header lines that might be - contained therein. - -19.5 Additional Features - - RFC 1945 and RFC 2068 document protocol elements used by some - existing HTTP implementations, but not consistently and correctly - across most HTTP/1.1 applications. Implementors are advised to be - aware of these features, but cannot rely upon their presence in, or - interoperability with, other HTTP/1.1 applications. Some of these - - - -Fielding, et al. Standards Track [Page 169] - -RFC 2616 HTTP/1.1 June 1999 - - - describe proposed experimental features, and some describe features - that experimental deployment found lacking that are now addressed in - the base HTTP/1.1 specification. - - A number of other headers, such as Content-Disposition and Title, - from SMTP and MIME are also often implemented (see RFC 2076 [37]). 
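The "chunked" decoding pseudo-code in section 19.4.6 can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not a reference implementation: the function names are hypothetical, the input is assumed to be a binary stream positioned at the first chunk-size line, chunk extensions are discarded, and trailer lines are returned as raw strings instead of being merged into an existing header set.

```python
import io
import re
from typing import BinaryIO, List, Tuple

_HEX = re.compile(rb"[0-9A-Fa-f]+")


def _read_line(stream: BinaryIO) -> bytes:
    """Read one CRLF-terminated line and return it without the CRLF."""
    line = stream.readline()
    if not line.endswith(b"\r\n"):
        raise ValueError("expected a CRLF-terminated line")
    return line[:-2]


def _read_chunk_size(stream: BinaryIO) -> int:
    """Read chunk-size, ignoring any chunk-extension after ';'."""
    size_field = _read_line(stream).split(b";", 1)[0].strip()
    if not _HEX.fullmatch(size_field):
        raise ValueError("chunk-size is not a hexadecimal number")
    return int(size_field, 16)


def decode_chunked(stream: BinaryIO) -> Tuple[bytes, List[str]]:
    """Return (entity-body, trailer lines) for a chunked message body."""
    body = bytearray()
    size = _read_chunk_size(stream)
    while size > 0:
        data = stream.read(size)
        if len(data) != size:
            raise ValueError("stream ended inside chunk-data")
        if stream.read(2) != b"\r\n":
            raise ValueError("chunk-data not followed by CRLF")
        body += data
        size = _read_chunk_size(stream)
    # After the last-chunk, read entity-headers until the empty line.
    trailers: List[str] = []
    line = _read_line(stream)
    while line:
        trailers.append(line.decode("latin-1"))
        line = _read_line(stream)
    return bytes(body), trailers
```

The length of the returned body is the value the pseudo-code assigns to Content-Length; removing "chunked" from Transfer-Encoding is left to the caller.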
- -19.5.1 Content-Disposition - - The Content-Disposition response-header field has been proposed as a - means for the origin server to suggest a default filename if the user - requests that the content is saved to a file. This usage is derived - from the definition of Content-Disposition in RFC 1806 [35]. - - content-disposition = "Content-Disposition" ":" - disposition-type *( ";" disposition-parm ) - disposition-type = "attachment" | disp-extension-token - disposition-parm = filename-parm | disp-extension-parm - filename-parm = "filename" "=" quoted-string - disp-extension-token = token - disp-extension-parm = token "=" ( token | quoted-string ) - - An example is - - Content-Disposition: attachment; filename="fname.ext" - - The receiving user agent SHOULD NOT respect any directory path - information present in the filename-parm parameter, which is the only - parameter believed to apply to HTTP implementations at this time. The - filename SHOULD be treated as a terminal component only. - - If this header is used in a response with the application/octet- - stream content-type, the implied suggestion is that the user agent - should not display the response, but directly enter a `save response - as...' dialog. - - See section 15.5 for Content-Disposition security issues. - -19.6 Compatibility with Previous Versions - - It is beyond the scope of a protocol specification to mandate - compliance with previous versions. HTTP/1.1 was deliberately - designed, however, to make supporting previous versions easy. It is - worth noting that, at the time of composing this specification - (1996), we would expect commercial HTTP/1.1 servers to: - - - recognize the format of the Request-Line for HTTP/0.9, 1.0, and - 1.1 requests; - - - -Fielding, et al. Standards Track [Page 170] - -RFC 2616 HTTP/1.1 June 1999 - - - - understand any valid request in the format of HTTP/0.9, 1.0, or - 1.1; - - - respond appropriately with a message in the same major version - used by the client. 
- - And we would expect HTTP/1.1 clients to: - - - recognize the format of the Status-Line for HTTP/1.0 and 1.1 - responses; - - - understand any valid response in the format of HTTP/0.9, 1.0, or - 1.1. - - For most implementations of HTTP/1.0, each connection is established - by the client prior to the request and closed by the server after - sending the response. Some implementations implement the Keep-Alive - version of persistent connections described in section 19.7.1 of RFC - 2068 [33]. - -19.6.1 Changes from HTTP/1.0 - - This section summarizes major differences between versions HTTP/1.0 - and HTTP/1.1. - -19.6.1.1 Changes to Simplify Multi-homed Web Servers and Conserve IP - Addresses - - The requirements that clients and servers support the Host request- - header, report an error if the Host request-header (section 14.23) is - missing from an HTTP/1.1 request, and accept absolute URIs (section - 5.1.2) are among the most important changes defined by this - specification. - - Older HTTP/1.0 clients assumed a one-to-one relationship of IP - addresses and servers; there was no other established mechanism for - distinguishing the intended server of a request than the IP address - to which that request was directed. The changes outlined above will - allow the Internet, once older HTTP clients are no longer common, to - support multiple Web sites from a single IP address, greatly - simplifying large operational Web servers, where allocation of many - IP addresses to a single host has created serious problems. The - Internet will also be able to recover the IP addresses that have been - allocated for the sole purpose of allowing special-purpose domain - names to be used in root-level HTTP URLs. Given the rate of growth of - the Web, and the number of servers already deployed, it is extremely - - - - - -Fielding, et al. 
Standards Track [Page 171] - -RFC 2616 HTTP/1.1 June 1999 - - - important that all implementations of HTTP (including updates to - existing HTTP/1.0 applications) correctly implement these - requirements: - - - Both clients and servers MUST support the Host request-header. - - - A client that sends an HTTP/1.1 request MUST send a Host header. - - - Servers MUST report a 400 (Bad Request) error if an HTTP/1.1 - request does not include a Host request-header. - - - Servers MUST accept absolute URIs. - -19.6.2 Compatibility with HTTP/1.0 Persistent Connections - - Some clients and servers might wish to be compatible with some - previous implementations of persistent connections in HTTP/1.0 - clients and servers. Persistent connections in HTTP/1.0 are - explicitly negotiated as they are not the default behavior. HTTP/1.0 - experimental implementations of persistent connections are faulty, - and the new facilities in HTTP/1.1 are designed to rectify these - problems. The problem was that some existing 1.0 clients may be - sending Keep-Alive to a proxy server that doesn't understand - Connection, which would then erroneously forward it to the next - inbound server, which would establish the Keep-Alive connection and - result in a hung HTTP/1.0 proxy waiting for the close on the - response. The result is that HTTP/1.0 clients must be prevented from - using Keep-Alive when talking to proxies. - - However, talking to proxies is the most important use of persistent - connections, so that prohibition is clearly unacceptable. Therefore, - we need some other mechanism for indicating a persistent connection - is desired, which is safe to use even when talking to an old proxy - that ignores Connection. Persistent connections are the default for - HTTP/1.1 messages; we introduce a new keyword (Connection: close) for - declaring non-persistence. See section 14.10. 
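The Host requirement of section 19.6.1.1 and the persistence defaults of section 19.6.2 can be sketched from a server's point of view as below. This is a hedged illustration only: the function names are hypothetical, headers are assumed to arrive as a dict with lower-cased field names, the HTTP version is assumed to be a (major, minor) tuple, and the HTTP/1.0 branch assumes the server is talking directly to the client and chooses to honour the Keep-Alive negotiation documented in RFC 2068.

```python
from typing import Dict, Tuple

Version = Tuple[int, int]


def host_check_status(version: Version, headers: Dict[str, str]) -> int:
    """Return 400 if an HTTP/1.1 request lacks a Host header, else 200."""
    if version >= (1, 1) and "host" not in headers:
        return 400  # Bad Request, as required of servers
    return 200


def is_persistent(version: Version, headers: Dict[str, str]) -> bool:
    """Decide whether the connection stays open after this message."""
    tokens = {
        token.strip().lower()
        for token in headers.get("connection", "").split(",")
        if token.strip()
    }
    if version >= (1, 1):
        # Persistent by default; "close" declares non-persistence.
        return "close" not in tokens
    # HTTP/1.0: persistence only when explicitly negotiated.
    return "keep-alive" in tokens
```

A proxy would need stricter handling than this sketch, since the text above explains why Keep-Alive from HTTP/1.0 clients is unsafe when forwarded through proxies that do not understand Connection.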
- - The original HTTP/1.0 form of persistent connections (the Connection: - Keep-Alive and Keep-Alive header) is documented in RFC 2068. [33] - -19.6.3 Changes from RFC 2068 - - This specification has been carefully audited to correct and - disambiguate key word usage; RFC 2068 had many problems in respect to - the conventions laid out in RFC 2119 [34]. - - Clarified which error code should be used for inbound server failures - (e.g. DNS failures). (Section 10.5.5). - - - -Fielding, et al. Standards Track [Page 172] - -RFC 2616 HTTP/1.1 June 1999 - - - CREATE had a race that required an Etag be sent when a resource is - first created. (Section 10.2.2). - - Content-Base was deleted from the specification: it was not - implemented widely, and there is no simple, safe way to introduce it - without a robust extension mechanism. In addition, it is used in a - similar, but not identical fashion in MHTML [45]. - - Transfer-coding and message lengths all interact in ways that - required fixing exactly when chunked encoding is used (to allow for - transfer encoding that may not be self delimiting); it was important - to straighten out exactly how message lengths are computed. (Sections - 3.6, 4.4, 7.2.2, 13.5.2, 14.13, 14.16) - - A content-coding of "identity" was introduced, to solve problems - discovered in caching. (section 3.5) - - Quality Values of zero should indicate that "I don't want something" - to allow clients to refuse a representation. (Section 3.9) - - The use and interpretation of HTTP version numbers has been clarified - by RFC 2145. Require proxies to upgrade requests to highest protocol - version they support to deal with problems discovered in HTTP/1.0 - implementations (Section 3.1) - - Charset wildcarding is introduced to avoid explosion of character set - names in accept headers. (Section 14.2) - - A case was missed in the Cache-Control model of HTTP/1.1; s-maxage - was introduced to add this missing case. 
(Sections 13.4, 14.8, 14.9, - 14.9.3) - - The Cache-Control: max-age directive was not properly defined for - responses. (Section 14.9.3) - - There are situations where a server (especially a proxy) does not - know the full length of a response but is capable of serving a - byterange request. We therefore need a mechanism to allow byteranges - with a content-range not indicating the full length of the message. - (Section 14.16) - - Range request responses would become very verbose if all meta-data - were always returned; by allowing the server to only send needed - headers in a 206 response, this problem can be avoided. (Section - 10.2.7, 13.5.3, and 14.27) - - - - - - -Fielding, et al. Standards Track [Page 173] - -RFC 2616 HTTP/1.1 June 1999 - - - Fix problem with unsatisfiable range requests; there are two cases: - syntactic problems, and range doesn't exist in the document. The 416 - status code was needed to resolve this ambiguity needed to indicate - an error for a byte range request that falls outside of the actual - contents of a document. (Section 10.4.17, 14.16) - - Rewrite of message transmission requirements to make it much harder - for implementors to get it wrong, as the consequences of errors here - can have significant impact on the Internet, and to deal with the - following problems: - - 1. Changing "HTTP/1.1 or later" to "HTTP/1.1", in contexts where - this was incorrectly placing a requirement on the behavior of - an implementation of a future version of HTTP/1.x - - 2. Made it clear that user-agents should retry requests, not - "clients" in general. - - 3. Converted requirements for clients to ignore unexpected 100 - (Continue) responses, and for proxies to forward 100 responses, - into a general requirement for 1xx responses. - - 4. Modified some TCP-specific language, to make it clearer that - non-TCP transports are possible for HTTP. - - 5. 
Require that the origin server MUST NOT wait for the request - body before it sends a required 100 (Continue) response. - - 6. Allow, rather than require, a server to omit 100 (Continue) if - it has already seen some of the request body. - - 7. Allow servers to defend against denial-of-service attacks and - broken clients. - - This change adds the Expect header and 417 status code. The message - transmission requirements fixes are in sections 8.2, 10.4.18, - 8.1.2.2, 13.11, and 14.20. - - Proxies should be able to add Content-Length when appropriate. - (Section 13.5.2) - - Clean up confusion between 403 and 404 responses. (Section 10.4.4, - 10.4.5, and 10.4.11) - - Warnings could be cached incorrectly, or not updated appropriately. - (Section 13.1.2, 13.2.4, 13.5.2, 13.5.3, 14.9.3, and 14.46) Warning - also needed to be a general header, as PUT or other methods may have - need for it in requests. - - - -Fielding, et al. Standards Track [Page 174] - -RFC 2616 HTTP/1.1 June 1999 - - - Transfer-coding had significant problems, particularly with - interactions with chunked encoding. The solution is that transfer- - codings become as full fledged as content-codings. This involves - adding an IANA registry for transfer-codings (separate from content - codings), a new header field (TE) and enabling trailer headers in the - future. Transfer encoding is a major performance benefit, so it was - worth fixing [39]. TE also solves another, obscure, downward - interoperability problem that could have occurred due to interactions - between authentication trailers, chunked encoding and HTTP/1.0 - clients.(Section 3.6, 3.6.1, and 14.39) - - The PATCH, LINK, UNLINK methods were defined but not commonly - implemented in previous versions of this specification. See RFC 2068 - [33]. - - The Alternates, Content-Version, Derived-From, Link, URI, Public and - Content-Base header fields were defined in previous versions of this - specification, but not commonly implemented. 
See RFC 2068 [33]. - -20 Index - - Please see the PostScript version of this RFC for the INDEX. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -Fielding, et al. Standards Track [Page 175] - -RFC 2616 HTTP/1.1 June 1999 - - -21. Full Copyright Statement - - Copyright (C) The Internet Society (1999). All Rights Reserved. - - This document and translations of it may be copied and furnished to - others, and derivative works that comment on or otherwise explain it - or assist in its implementation may be prepared, copied, published - and distributed, in whole or in part, without restriction of any - kind, provided that the above copyright notice and this paragraph are - included on all such copies and derivative works. However, this - document itself may not be modified in any way, such as by removing - the copyright notice or references to the Internet Society or other - Internet organizations, except as needed for the purpose of - developing Internet standards in which case the procedures for - copyrights defined in the Internet Standards process must be - followed, or as required to translate it into languages other than - English. - - The limited permissions granted above are perpetual and will not be - revoked by the Internet Society or its successors or assigns. - - This document and the information contained herein is provided on an - "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING - TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING - BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION - HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF - MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. - -Acknowledgement - - Funding for the RFC Editor function is currently provided by the - Internet Society. - - - - - - - - - - - - - - - - - - - -Fielding, et al. 
Standards Track [Page 176] - diff --git a/docs/specs/rfc2617.txt b/docs/specs/rfc2617.txt deleted file mode 100644 index b8fdf59f..00000000 --- a/docs/specs/rfc2617.txt +++ /dev/null @@ -1,1909 +0,0 @@ - -[[ Text in double brackets is from the unofficial errata at ]] -[[ http://skrb.org/ietf/http_errata.html ]] - -Network Working Group J. Franks -Request for Comments: 2617 Northwestern University -Obsoletes: 2069 P. Hallam-Baker -Category: Standards Track Verisign, Inc. - J. Hostetler - AbiSource, Inc. - S. Lawrence - Agranat Systems, Inc. - P. Leach - Microsoft Corporation - A. Luotonen - Netscape Communications Corporation - L. Stewart - Open Market, Inc. - June 1999 - - - HTTP Authentication: Basic and Digest Access Authentication - -Status of this Memo - - This document specifies an Internet standards track protocol for the - Internet community, and requests discussion and suggestions for - improvements. Please refer to the current edition of the "Internet - Official Protocol Standards" (STD 1) for the standardization state - and status of this protocol. Distribution of this memo is unlimited. - -Copyright Notice - - Copyright (C) The Internet Society (1999). All Rights Reserved. - -Abstract - - "HTTP/1.0", includes the specification for a Basic Access - Authentication scheme. This scheme is not considered to be a secure - method of user authentication (unless used in conjunction with some - external secure system such as SSL [5]), as the user name and - password are passed over the network as cleartext. - - This document also provides the specification for HTTP's - authentication framework, the original Basic authentication scheme - and a scheme based on cryptographic hashes, referred to as "Digest - Access Authentication". It is therefore also intended to serve as a - replacement for RFC 2069 [6]. 
Some optional elements specified by - RFC 2069 have been removed from this specification due to problems - found since its publication; other new elements have been added for - compatibility, those new elements have been made optional, but are - strongly recommended. - - - -Franks, et al. Standards Track [Page 1] - -RFC 2617 HTTP Authentication June 1999 - - - Like Basic, Digest access authentication verifies that both parties - to a communication know a shared secret (a password); unlike Basic, - this verification can be done without sending the password in the - clear, which is Basic's biggest weakness. As with most other - authentication protocols, the greatest sources of risks are usually - found not in the core protocol itself but in policies and procedures - surrounding its use. - -Table of Contents - - 1 Access Authentication................................ 3 - 1.1 Reliance on the HTTP/1.1 Specification............ 3 - 1.2 Access Authentication Framework................... 3 - 2 Basic Authentication Scheme.......................... 5 - 3 Digest Access Authentication Scheme.................. 6 - 3.1 Introduction...................................... 6 - 3.1.1 Purpose......................................... 6 - 3.1.2 Overall Operation............................... 6 - 3.1.3 Representation of digest values................. 7 - 3.1.4 Limitations..................................... 7 - 3.2 Specification of Digest Headers................... 7 - 3.2.1 The WWW-Authenticate Response Header............ 8 - 3.2.2 The Authorization Request Header................ 11 - 3.2.3 The Authentication-Info Header.................. 15 - 3.3 Digest Operation.................................. 17 - 3.4 Security Protocol Negotiation..................... 18 - 3.5 Example........................................... 18 - 3.6 Proxy-Authentication and Proxy-Authorization...... 19 - 4 Security Considerations.............................. 
19 - 4.1 Authentication of Clients using Basic - Authentication.................................... 19 - 4.2 Authentication of Clients using Digest - Authentication.................................... 20 - 4.3 Limited Use Nonce Values.......................... 21 - 4.4 Comparison of Digest with Basic Authentication.... 22 - 4.5 Replay Attacks.................................... 22 - 4.6 Weakness Created by Multiple Authentication - Schemes........................................... 23 - 4.7 Online dictionary attacks......................... 23 - 4.8 Man in the Middle................................. 24 - 4.9 Chosen plaintext attacks.......................... 24 - 4.10 Precomputed dictionary attacks.................... 25 - 4.11 Batch brute force attacks......................... 25 - 4.12 Spoofing by Counterfeit Servers................... 25 - 4.13 Storing passwords................................. 26 - 4.14 Summary........................................... 26 - 5 Sample implementation................................ 27 - 6 Acknowledgments...................................... 31 - - - -Franks, et al. Standards Track [Page 2] - -RFC 2617 HTTP Authentication June 1999 - - - 7 References........................................... 31 - 8 Authors' Addresses................................... 32 - 9 Full Copyright Statement............................. 34 - -1 Access Authentication - -1.1 Reliance on the HTTP/1.1 Specification - - This specification is a companion to the HTTP/1.1 specification [2]. - It uses the augmented BNF section 2.1 of that document, and relies on - both the non-terminals defined in that document and other aspects of - the HTTP/1.1 specification. - -1.2 Access Authentication Framework - - HTTP provides a simple challenge-response authentication mechanism - that MAY be used by a server to challenge a client request and by a - client to provide authentication information. 
It uses an extensible, - case-insensitive token to identify the authentication scheme, - followed by a comma-separated list of attribute-value pairs which - carry the parameters necessary for achieving authentication via that - scheme. - - auth-scheme = token - auth-param = token "=" ( token | quoted-string ) - - The 401 (Unauthorized) response message is used by an origin server - to challenge the authorization of a user agent. This response MUST - include a WWW-Authenticate header field containing at least one - challenge applicable to the requested resource. The 407 (Proxy - Authentication Required) response message is used by a proxy to - challenge the authorization of a client and MUST include a Proxy- - Authenticate header field containing at least one challenge - applicable to the proxy for the requested resource. - - challenge = auth-scheme 1*SP 1#auth-param - - Note: User agents will need to take special care in parsing the WWW- - Authenticate or Proxy-Authenticate header field value if it contains - more than one challenge, or if more than one WWW-Authenticate header - field is provided, since the contents of a challenge may itself - contain a comma-separated list of authentication parameters. - - The authentication parameter realm is defined for all authentication - schemes: - - realm = "realm" "=" realm-value - realm-value = quoted-string - - - -Franks, et al. Standards Track [Page 3] - -RFC 2617 HTTP Authentication June 1999 - - - The realm directive (case-insensitive) is required for all - authentication schemes that issue a challenge. The realm value - (case-sensitive), in combination with the canonical root URL (the - absoluteURI for the server whose abs_path is empty; see section 5.1.2 - of [2]) of the server being accessed, defines the protection space. - These realms allow the protected resources on a server to be - partitioned into a set of protection spaces, each with its own - authentication scheme and/or authorization database. 
The realm value - is a string, generally assigned by the origin server, which may have - additional semantics specific to the authentication scheme. Note that - there may be multiple challenges with the same auth-scheme but - different realms. - - A user agent that wishes to authenticate itself with an origin - server--usually, but not necessarily, after receiving a 401 - (Unauthorized)--MAY do so by including an Authorization header field - with the request. A client that wishes to authenticate itself with a - proxy--usually, but not necessarily, after receiving a 407 (Proxy - Authentication Required)--MAY do so by including a Proxy- - Authorization header field with the request. Both the Authorization - field value and the Proxy-Authorization field value consist of - credentials containing the authentication information of the client - for the realm of the resource being requested. The user agent MUST - choose to use one of the challenges with the strongest auth-scheme it - understands and request credentials from the user based upon that - challenge. - - credentials = auth-scheme #auth-param - - Note that many browsers will only recognize Basic and will require - that it be the first auth-scheme presented. Servers should only - include Basic if it is minimally acceptable. - - The protection space determines the domain over which credentials can - be automatically applied. If a prior request has been authorized, the - same credentials MAY be reused for all other requests within that - protection space for a period of time determined by the - authentication scheme, parameters, and/or user preference. Unless - otherwise defined by the authentication scheme, a single protection - space cannot extend outside the scope of its server. - - If the origin server does not wish to accept the credentials sent - with a request, it SHOULD return a 401 (Unauthorized) response. 
The - response MUST include a WWW-Authenticate header field containing at - least one (possibly new) challenge applicable to the requested - resource. If a proxy does not accept the credentials sent with a - request, it SHOULD return a 407 (Proxy Authentication Required). The - response MUST include a Proxy-Authenticate header field containing a - - - -Franks, et al. Standards Track [Page 4] - -RFC 2617 HTTP Authentication June 1999 - - - (possibly new) challenge applicable to the proxy for the requested - resource. - - The HTTP protocol does not restrict applications to this simple - challenge-response mechanism for access authentication. Additional - mechanisms MAY be used, such as encryption at the transport level or - via message encapsulation, and with additional header fields - specifying authentication information. However, these additional - mechanisms are not defined by this specification. - - Proxies MUST be completely transparent regarding user agent - authentication by origin servers. That is, they must forward the - WWW-Authenticate and Authorization headers untouched, and follow the - rules found in section 14.8 of [2]. Both the Proxy-Authenticate and - the Proxy-Authorization header fields are hop-by-hop headers (see - section 13.5.1 of [2]). - -2 Basic Authentication Scheme - - The "basic" authentication scheme is based on the model that the - client must authenticate itself with a user-ID and a password for - each realm. The realm value should be considered an opaque string - which can only be compared for equality with other realms on that - server. The server will service the request only if it can validate - the user-ID and password for the protection space of the Request-URI. - There are no optional authentication parameters. 
- - For Basic, the framework above is utilized as follows: - - challenge = "Basic" realm - credentials = "Basic" basic-credentials - - Upon receipt of an unauthorized request for a URI within the - protection space, the origin server MAY respond with a challenge like - the following: - - WWW-Authenticate: Basic realm="WallyWorld" - - where "WallyWorld" is the string assigned by the server to identify - the protection space of the Request-URI. A proxy may respond with the - same challenge using the Proxy-Authenticate header field. - - To receive authorization, the client sends the userid and password, - separated by a single colon (":") character, within a base64 [7] - encoded string in the credentials. - - basic-credentials = base64-user-pass - base64-user-pass = <base64 [4] encoding of user-pass, - - - -Franks, et al. Standards Track [Page 5] - -RFC 2617 HTTP Authentication June 1999 - - - except not limited to 76 char/line> - user-pass = userid ":" password - userid = *<TEXT excluding ":"> - password = *TEXT - - Userids might be case sensitive. - - If the user agent wishes to send the userid "Aladdin" and password - "open sesame", it would use the following header field: - - Authorization: Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ== - - A client SHOULD assume that all paths at or deeper than the depth of - the last symbolic element in the path field of the Request-URI also - are within the protection space specified by the Basic realm value of - the current challenge. A client MAY preemptively send the - corresponding Authorization header with requests for resources in - that space without receipt of another challenge from the server. - Similarly, when a client sends a request to a proxy, it may reuse a - userid and password in the Proxy-Authorization header field without - receiving another challenge from the proxy server. See section 4 for - security considerations associated with Basic authentication. 
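The Basic credentials grammar above can be illustrated with a minimal Python sketch. The function names `basic_authorization` and `parse_basic_authorization` are hypothetical and do not come from the RFC or from libsoup. The `latin-1` charset is an assumption as well, since the specification only says the values are TEXT.

```python
import base64
from typing import Tuple


def basic_authorization(userid: str, password: str, charset: str = "latin-1") -> str:
    """Build an Authorization header value for the Basic scheme (illustrative)."""
    # The grammar defines userid as TEXT excluding ":", so a colon in the
    # userid would make the user-pass pair ambiguous for the server.
    if ":" in userid:
        raise ValueError("userid must not contain ':'")
    user_pass = f"{userid}:{password}".encode(charset)  # charset is an assumption
    # b64encode emits one unbroken line, matching "not limited to 76 char/line".
    token = base64.b64encode(user_pass).decode("ascii")
    return f"Basic {token}"


def parse_basic_authorization(value: str, charset: str = "latin-1") -> Tuple[str, str]:
    """Recover (userid, password) from a Basic Authorization header value."""
    scheme, _, token = value.partition(" ")
    # The auth-scheme token is case-insensitive.
    if scheme.lower() != "basic":
        raise ValueError("not a Basic credential")
    user_pass = base64.b64decode(token.strip(), validate=True).decode(charset)
    # Split on the first colon only: the password may itself contain colons.
    userid, sep, password = user_pass.partition(":")
    if not sep:
        raise ValueError("missing ':' separator")
    return userid, password


if __name__ == "__main__":
    # Reproduces the header field shown in the text for "Aladdin" / "open sesame".
    print("Authorization:", basic_authorization("Aladdin", "open sesame"))
```

Because base64 is an encoding and not encryption, the round trip through `parse_basic_authorization` recovers the password unchanged. This is the cleartext weakness that the text attributes to the Basic scheme.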
- -3 Digest Access Authentication Scheme - -3.1 Introduction - -3.1.1 Purpose - - The protocol referred to as "HTTP/1.0" includes the specification for - a Basic Access Authentication scheme[1]. That scheme is not - considered to be a secure method of user authentication, as the user - name and password are passed over the network in an unencrypted form. - This section provides the specification for a scheme that does not - send the password in cleartext, referred to as "Digest Access - Authentication". - - The Digest Access Authentication scheme is not intended to be a - complete answer to the need for security in the World Wide Web. This - scheme provides no encryption of message content. The intent is - simply to create an access authentication method that avoids the most - serious flaws of Basic authentication. - -3.1.2 Overall Operation - - Like Basic Access Authentication, the Digest scheme is based on a - simple challenge-response paradigm. The Digest scheme challenges - using a nonce value. A valid response contains a checksum (by - - - -Franks, et al. Standards Track [Page 6] - -RFC 2617 HTTP Authentication June 1999 - - - default, the MD5 checksum) of the username, the password, the given - nonce value, the HTTP method, and the requested URI. In this way, the - password is never sent in the clear. Just as with the Basic scheme, - the username and password must be prearranged in some fashion not - addressed by this document. - -3.1.3 Representation of digest values - - An optional header allows the server to specify the algorithm used to - create the checksum or digest. By default the MD5 algorithm is used - and that is the only algorithm described in this document. - - For the purposes of this document, an MD5 digest of 128 bits is - represented as 32 ASCII printable characters. The bits in the 128 bit - digest are converted from most significant to least significant bit, - four bits at a time to their ASCII presentation as follows. 
Each four - bits is represented by its familiar hexadecimal notation from the - characters 0123456789abcdef. That is, binary 0000 gets represented by - the character '0', 0001, by '1', and so on up to the representation - of 1111 as 'f'. - -3.1.4 Limitations - - The Digest authentication scheme described in this document suffers - from many known limitations. It is intended as a replacement for - Basic authentication and nothing more. It is a password-based system - and (on the server side) suffers from all the same problems of any - password system. In particular, no provision is made in this protocol - for the initial secure arrangement between user and server to - establish the user's password. - - Users and implementors should be aware that this protocol is not as - secure as Kerberos, and not as secure as any client-side private-key - scheme. Nevertheless it is better than nothing, better than what is - commonly used with telnet and ftp, and better than Basic - authentication. - -3.2 Specification of Digest Headers - - The Digest Access Authentication scheme is conceptually similar to - the Basic scheme. The formats of the modified WWW-Authenticate header - line and the Authorization header line are specified below. In - addition, a new header, Authentication-Info, is specified. - - - - - - - - -Franks, et al. 
Standards Track [Page 7] - -RFC 2617 HTTP Authentication June 1999 - - -3.2.1 The WWW-Authenticate Response Header - - If a server receives a request for an access-protected object, and an - acceptable Authorization header is not sent, the server responds with - a "401 Unauthorized" status code, and a WWW-Authenticate header as - per the framework defined above, which for the digest scheme is - utilized as follows: - - challenge = "Digest" digest-challenge - - digest-challenge = 1#( realm | [ domain ] | nonce | - [ opaque ] |[ stale ] | [ algorithm ] | - [ qop-options ] | [auth-param] ) - - - domain = "domain" "=" <"> URI ( 1*SP URI ) <"> - [[ Should be: ]] - [[ domain = "domain" "=" <"> URI *( 1*SP URI ) <"> ]] - URI = absoluteURI | abs_path - nonce = "nonce" "=" nonce-value - nonce-value = quoted-string - opaque = "opaque" "=" quoted-string - stale = "stale" "=" ( "true" | "false" ) - algorithm = "algorithm" "=" ( "MD5" | "MD5-sess" | - token ) - qop-options = "qop" "=" <"> 1#qop-value <"> - qop-value = "auth" | "auth-int" | token - - The meanings of the values of the directives used above are as - follows: - - realm - A string to be displayed to users so they know which username and - password to use. This string should contain at least the name of - the host performing the authentication and might additionally - indicate the collection of users who might have access. An example - might be "registered_users@gotham.news.com". - - domain - A quoted, space-separated list of URIs, as specified in RFC XURI - [7], that define the protection space. If a URI is an abs_path, it - is relative to the canonical root URL (see section 1.2 above) of - the server being accessed. An absoluteURI in this list may refer to - a different server than the one being accessed. 
The client can use - this list to determine the set of URIs for which the same - authentication information may be sent: any URI that has a URI in - this list as a prefix (after both have been made absolute) may be - assumed to be in the same protection space. If this directive is - omitted or its value is empty, the client should assume that the - protection space consists of all URIs on the responding server. - - - -Franks, et al. Standards Track [Page 8] - -RFC 2617 HTTP Authentication June 1999 - - - This directive is not meaningful in Proxy-Authenticate headers, for - which the protection space is always the entire proxy; if present - it should be ignored. - - nonce - A server-specified data string which should be uniquely generated - each time a 401 response is made. It is recommended that this - string be base64 or hexadecimal data. Specifically, since the - string is passed in the header lines as a quoted string, the - double-quote character is not allowed. - - The contents of the nonce are implementation dependent. The quality - of the implementation depends on a good choice. A nonce might, for - example, be constructed as the base 64 encoding of - - time-stamp H(time-stamp ":" ETag ":" private-key) - - where time-stamp is a server-generated time or other non-repeating - value, ETag is the value of the HTTP ETag header associated with - the requested entity, and private-key is data known only to the - server. With a nonce of this form a server would recalculate the - hash portion after receiving the client authentication header and - reject the request if it did not match the nonce from that header - or if the time-stamp value is not recent enough. In this way the - server can limit the time of the nonce's validity. The inclusion of - the ETag prevents a replay request for an updated version of the - resource. 
(Note: including the IP address of the client in the - nonce would appear to offer the server the ability to limit the - reuse of the nonce to the same client that originally got it. - However, that would break proxy farms, where requests from a single - user often go through different proxies in the farm. Also, IP - address spoofing is not that hard.) - - An implementation might choose not to accept a previously used - nonce or a previously used digest, in order to protect against a - replay attack. Or, an implementation might choose to use one-time - nonces or digests for POST or PUT requests and a time-stamp for GET - requests. For more details on the issues involved see section 4. - of this document. - - The nonce is opaque to the client. - - opaque - A string of data, specified by the server, which should be returned - by the client unchanged in the Authorization header of subsequent - requests with URIs in the same protection space. It is recommended - that this string be base64 or hexadecimal data. - - - - -Franks, et al. Standards Track [Page 9] - -RFC 2617 HTTP Authentication June 1999 - - - stale - A flag, indicating that the previous request from the client was - rejected because the nonce value was stale. If stale is TRUE - (case-insensitive), the client may wish to simply retry the request - with a new encrypted response, without reprompting the user for a - new username and password. The server should only set stale to TRUE - if it receives a request for which the nonce is invalid but with a - valid digest for that nonce (indicating that the client knows the - correct username/password). If stale is FALSE, or anything other - than TRUE, or the stale directive is not present, the username - and/or password are invalid, and new values must be obtained. - - algorithm - A string indicating a pair of algorithms used to produce the digest - and a checksum. If this is not present it is assumed to be "MD5". 
- If the algorithm is not understood, the challenge should be ignored - (and a different one used, if there is more than one). - - In this document the string obtained by applying the digest - algorithm to the data "data" with secret "secret" will be denoted - by KD(secret, data), and the string obtained by applying the - checksum algorithm to the data "data" will be denoted H(data). The - notation unq(X) means the value of the quoted-string X without the - surrounding quotes. - - For the "MD5" and "MD5-sess" algorithms - - H(data) = MD5(data) - - and - - KD(secret, data) = H(concat(secret, ":", data)) - - i.e., the digest is the MD5 of the secret concatenated with a colon - concatenated with the data. The "MD5-sess" algorithm is intended to - allow efficient 3rd party authentication servers; for the - difference in usage, see the description in section 3.2.2.2. - - qop-options - This directive is optional, but is made so only for backward - compatibility with RFC 2069 [6]; it SHOULD be used by all - implementations compliant with this version of the Digest scheme. - If present, it is a quoted string of one or more tokens indicating - the "quality of protection" values supported by the server. The - value "auth" indicates authentication; the value "auth-int" - indicates authentication with integrity protection; see the - - - - - -Franks, et al. Standards Track [Page 10] - -RFC 2617 HTTP Authentication June 1999 - - - descriptions below for calculating the response directive value for - the application of this choice. Unrecognized options MUST be - ignored. - - auth-param - This directive allows for future extensions. Any unrecognized - directive MUST be ignored. - -3.2.2 The Authorization Request Header - - The client is expected to retry the request, passing an Authorization - header line, which is defined according to the framework above, - utilized as follows. 
- - credentials = "Digest" digest-response - digest-response = 1#( username | realm | nonce | digest-uri - | response | [ algorithm ] | [cnonce] | - [opaque] | [message-qop] | - [nonce-count] | [auth-param] ) - - username = "username" "=" username-value - username-value = quoted-string - digest-uri = "uri" "=" digest-uri-value - digest-uri-value = request-uri ; As specified by HTTP/1.1 - message-qop = "qop" "=" qop-value - cnonce = "cnonce" "=" cnonce-value - cnonce-value = nonce-value - nonce-count = "nc" "=" nc-value - nc-value = 8LHEX - response = "response" "=" request-digest - request-digest = <"> 32LHEX <"> - LHEX = "0" | "1" | "2" | "3" | - "4" | "5" | "6" | "7" | - "8" | "9" | "a" | "b" | - "c" | "d" | "e" | "f" - - The values of the opaque and algorithm fields must be those supplied - in the WWW-Authenticate response header for the entity being - requested. - - response - A string of 32 hex digits computed as defined below, which proves - that the user knows a password - - username - The user's name in the specified realm. - - - - - -Franks, et al. Standards Track [Page 11] - -RFC 2617 HTTP Authentication June 1999 - - - digest-uri - The URI from Request-URI of the Request-Line; duplicated here - because proxies are allowed to change the Request-Line in transit. - - qop - Indicates what "quality of protection" the client has applied to - the message. If present, its value MUST be one of the alternatives - the server indicated it supports in the WWW-Authenticate header. - These values affect the computation of the request-digest. Note - that this is a single token, not a quoted list of alternatives as - in WWW- Authenticate. This directive is optional in order to - preserve backward compatibility with a minimal implementation of - RFC 2069 [6], but SHOULD be used if the server indicated that qop - is supported by providing a qop directive in the WWW-Authenticate - header field. 
- - cnonce - This MUST be specified if a qop directive is sent (see above), and - MUST NOT be specified if the server did not send a qop directive in - the WWW-Authenticate header field. The cnonce-value is an opaque - quoted string value provided by the client and used by both client - and server to avoid chosen plaintext attacks, to provide mutual - authentication, and to provide some message integrity protection. - See the descriptions below of the calculation of the response- - digest and request-digest values. - - nonce-count - This MUST be specified if a qop directive is sent (see above), and - MUST NOT be specified if the server did not send a qop directive in - the WWW-Authenticate header field. The nc-value is the hexadecimal - count of the number of requests (including the current request) - that the client has sent with the nonce value in this request. For - example, in the first request sent in response to a given nonce - value, the client sends "nc=00000001". The purpose of this - directive is to allow the server to detect request replays by - maintaining its own copy of this count - if the same nc-value is - seen twice, then the request is a replay. See the description - below of the construction of the request-digest value. - - auth-param - This directive allows for future extensions. Any unrecognized - directive MUST be ignored. - - If a directive or its value is improper, or required directives are - missing, the proper response is 400 Bad Request. If the request- - digest is invalid, then a login failure should be logged, since - repeated login failures from a single client may indicate an attacker - attempting to guess passwords. - - - -Franks, et al. Standards Track [Page 12] - -RFC 2617 HTTP Authentication June 1999 - - - The definition of request-digest above indicates the encoding for its - value. The following definitions show how the value is computed. 
- -3.2.2.1 Request-Digest - - If the "qop" value is "auth" or "auth-int": - - request-digest = <"> < KD ( H(A1), unq(nonce-value) - ":" nc-value - ":" unq(cnonce-value) - ":" unq(qop-value) - ":" H(A2) - ) <"> - - If the "qop" directive is not present (this construction is for - compatibility with RFC 2069): - - request-digest = - <"> < KD ( H(A1), unq(nonce-value) ":" H(A2) ) > - <"> - - See below for the definitions for A1 and A2. - -3.2.2.2 A1 - - If the "algorithm" directive's value is "MD5" or is unspecified, then - A1 is: - - A1 = unq(username-value) ":" unq(realm-value) ":" passwd - - where - - passwd = < user's password > - - If the "algorithm" directive's value is "MD5-sess", then A1 is - calculated only once - on the first request by the client following - receipt of a WWW-Authenticate challenge from the server. It uses the - server nonce from that challenge, and the first client nonce value to - construct A1 as follows: - - A1 = H( unq(username-value) ":" unq(realm-value) - ":" passwd ) - ":" unq(nonce-value) ":" unq(cnonce-value) - - This creates a 'session key' for the authentication of subsequent - requests and responses which is different for each "authentication - session", thus limiting the amount of material hashed with any one - key. (Note: see further discussion of the authentication session in - - - -Franks, et al. Standards Track [Page 13] - -RFC 2617 HTTP Authentication June 1999 - - - section 3.3.) Because the server need only use the hash of the user - credentials in order to create the A1 value, this construction could - be used in conjunction with a third party authentication service so - that the web server would not need the actual password value. The - specification of such a protocol is beyond the scope of this - specification. 
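The two A1 constructions above can be sketched in Python as follows. This is an illustration under stated assumptions, not a reference implementation. `H` follows the notation of the text; the function name `a1` and its parameters are hypothetical. The sketch assumes that `H` yields the 32-character lowercase hex form described in section 3.1.3, and that strings are encoded as `latin-1` before hashing.

```python
import hashlib
from typing import Optional


def H(data: str) -> str:
    """MD5 of data as 32 lowercase hex characters (see section 3.1.3)."""
    # The latin-1 encoding is an assumption made for this sketch.
    return hashlib.md5(data.encode("latin-1")).hexdigest()


def a1(username: str, realm: str, passwd: str, algorithm: str = "MD5",
       nonce: Optional[str] = None, cnonce: Optional[str] = None) -> str:
    """Build the A1 string; all arguments are already unquoted (unq) values."""
    base = f"{username}:{realm}:{passwd}"
    alg = algorithm.lower()
    if alg == "md5":
        # "MD5" or unspecified: A1 is simply user:realm:password.
        return base
    if alg == "md5-sess":
        if nonce is None or cnonce is None:
            raise ValueError("MD5-sess needs the server nonce and the first cnonce")
        # Session variant: hash the credentials once, then bind the result to
        # the nonce pair of this authentication session.
        return f"{H(base)}:{nonce}:{cnonce}"
    raise ValueError(f"unsupported algorithm: {algorithm}")


if __name__ == "__main__":
    print(a1("Mufasa", "myhost@testrealm.com", "Circle Of Life"))
```

For MD5-sess, the caller would compute A1 once per challenge and cache it. A third-party authentication service could then hand the web server only `H(base)` and never the password itself, as the text suggests.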
- -3.2.2.3 A2 - - If the "qop" directive's value is "auth" or is unspecified, then A2 - is: - - A2 = Method ":" digest-uri-value - - If the "qop" value is "auth-int", then A2 is: - - A2 = Method ":" digest-uri-value ":" H(entity-body) - -3.2.2.4 Directive values and quoted-string - - Note that the value of many of the directives, such as "username- - value", are defined as a "quoted-string". However, the "unq" notation - indicates that surrounding quotation marks are removed in forming the - string A1. Thus if the Authorization header includes the fields - - username="Mufasa", realm=myhost@testrealm.com - - and the user Mufasa has password "Circle Of Life" then H(A1) would be - H(Mufasa:myhost@testrealm.com:Circle Of Life) with no quotation marks - in the digested string. - - No white space is allowed in any of the strings to which the digest - function H() is applied unless that white space exists in the quoted - strings or entity body whose contents make up the string to be - digested. For example, the string A1 illustrated above must be - - Mufasa:myhost@testrealm.com:Circle Of Life - - with no white space on either side of the colons, but with the white - space between the words used in the password value. Likewise, the - other strings digested by H() must not have white space on either - side of the colons which delimit their fields unless that white space - was in the quoted strings or entity body being digested. - - Also note that if integrity protection is applied (qop=auth-int), the - H(entity-body) is the hash of the entity body, not the message body - - it is computed before any transfer encoding is applied by the sender - - - - -Franks, et al. Standards Track [Page 14] - -RFC 2617 HTTP Authentication June 1999 - - - and after it has been removed by the recipient. Note that this - includes multipart boundaries and embedded headers in each part of - any multipart content-type. 
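Putting the definitions of request-digest, A1 and A2 together, a client-side computation for the "MD5" algorithm might look like the Python sketch below. `H`, `KD`, A1 and A2 follow the notation of the text; the function names `a2` and `request_digest`, their parameters, and the `latin-1` encoding are assumptions made for illustration. The sample values in the usage lines are taken from the RFC's own example in section 3.5, which lies outside this excerpt.

```python
import hashlib
from typing import Optional


def H(data: bytes) -> str:
    """Checksum function: MD5 rendered as 32 lowercase hex characters."""
    return hashlib.md5(data).hexdigest()


def KD(secret: str, data: str) -> str:
    """Digest function: H(concat(secret, ":", data))."""
    return H(f"{secret}:{data}".encode("latin-1"))  # encoding is an assumption


def a2(method: str, digest_uri: str, qop: Optional[str] = None,
       entity_body: bytes = b"") -> str:
    """Build A2; only "auth-int" also covers a hash of the entity body."""
    if qop == "auth-int":
        return f"{method}:{digest_uri}:{H(entity_body)}"
    return f"{method}:{digest_uri}"


def request_digest(username: str, realm: str, passwd: str, method: str,
                   digest_uri: str, nonce: str, qop: Optional[str] = None,
                   nc: Optional[int] = None, cnonce: Optional[str] = None,
                   entity_body: bytes = b"") -> str:
    """Compute the unquoted request-digest for algorithm "MD5"."""
    # No white space around the colons; spaces inside values are preserved.
    ha1 = H(f"{username}:{realm}:{passwd}".encode("latin-1"))
    ha2 = H(a2(method, digest_uri, qop, entity_body).encode("latin-1"))
    if qop in ("auth", "auth-int"):
        if nc is None or cnonce is None:
            raise ValueError("nc and cnonce are required when qop is sent")
        # nc-value is exactly eight lowercase hex digits (8LHEX).
        data = f"{nonce}:{nc:08x}:{cnonce}:{qop}:{ha2}"
    else:
        # No qop directive: the RFC 2069 compatible construction.
        data = f"{nonce}:{ha2}"
    return KD(ha1, data)


if __name__ == "__main__":
    print(request_digest(
        "Mufasa", "testrealm@host.com", "Circle Of Life",
        "GET", "/dir/index.html",
        nonce="dcd98b7102dd2f0e8b11d0f600bfb0c093",
        qop="auth", nc=1, cnonce="0a4f113b",
    ))  # prints 6629fae49393a05397450978507c4ef1
```

A server would run the same computation with its stored credentials and compare the result with the `response` directive it received. The password itself never crosses the wire.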
- -3.2.2.5 Various considerations - - The "Method" value is the HTTP request method as specified in section - 5.1.1 of [2]. The "request-uri" value is the Request-URI from the - request line as specified in section 5.1.2 of [2]. This may be "*", - an "absoluteURL" or an "abs_path" as specified in section 5.1.2 of - [2], but it MUST agree with the Request-URI. In particular, it MUST - be an "absoluteURL" if the Request-URI is an "absoluteURL". The - "cnonce-value" is an optional client-chosen value whose purpose is - to foil chosen plaintext attacks. - - The authenticating server must assure that the resource designated by - the "uri" directive is the same as the resource specified in the - Request-Line; if they are not, the server SHOULD return a 400 Bad - Request error. (Since this may be a symptom of an attack, server - implementers may want to consider logging such errors.) The purpose - of duplicating information from the request URL in this field is to - deal with the possibility that an intermediate proxy may alter the - client's Request-Line. This altered (but presumably semantically - equivalent) request would not result in the same digest as that - calculated by the client. - - Implementers should be aware of how authenticated transactions - interact with shared caches. The HTTP/1.1 protocol specifies that - when a shared cache (see section 13.7 of [2]) has received a request - containing an Authorization header and a response from relaying that - request, it MUST NOT return that response as a reply to any other - request, unless one of two Cache-Control (see section 14.9 of [2]) - directives was present in the response. If the original response - included the "must-revalidate" Cache-Control directive, the cache MAY - use the entity of that response in replying to a subsequent request, - but MUST first revalidate it with the origin server, using the - request headers from the new request to allow the origin server to - authenticate the new request. 
Alternatively, if the original response - included the "public" Cache-Control directive, the response entity - MAY be returned in reply to any subsequent request. - -3.2.3 The Authentication-Info Header - - The Authentication-Info header is used by the server to communicate - some information regarding the successful authentication in the - response. - - - - - -Franks, et al. Standards Track [Page 15] - -RFC 2617 HTTP Authentication June 1999 - - - AuthenticationInfo = "Authentication-Info" ":" auth-info - auth-info = 1#(nextnonce | [ message-qop ] - | [ response-auth ] | [ cnonce ] - | [nonce-count] ) - nextnonce = "nextnonce" "=" nonce-value - response-auth = "rspauth" "=" response-digest - response-digest = <"> *LHEX <"> - - The value of the nextnonce directive is the nonce the server wishes - the client to use for a future authentication response. The server - may send the Authentication-Info header with a nextnonce field as a - means of implementing one-time or otherwise changing nonces. If the - nextnonce field is present the client SHOULD use it when constructing - the Authorization header for its next request. Failure of the client - to do so may result in a request to re-authenticate from the server - with "stale=TRUE". - - Server implementations should carefully consider the performance - implications of the use of this mechanism; pipelined requests will - not be possible if every response includes a nextnonce directive - that must be used on the next request received by the server. - Consideration should be given to the performance vs. security - tradeoffs of allowing an old nonce value to be used for a limited - time to permit request pipelining. Use of the nonce-count can - retain most of the security advantages of a new server nonce - without the deleterious effects on pipelining. - - message-qop - Indicates the "quality of protection" options applied to the - response by the server.
The value "auth" indicates authentication; - the value "auth-int" indicates authentication with integrity - protection. The server SHOULD use the same value for the message- - qop directive in the response as was sent by the client in the - corresponding request. - - The optional response digest in the "response-auth" directive - supports mutual authentication -- the server proves that it knows the - user's secret, and with qop=auth-int also provides limited integrity - protection of the response. The "response-digest" value is calculated - as for the "request-digest" in the Authorization header, except that - if "qop=auth" or is not specified in the Authorization header for the - request, A2 is - - A2 = ":" digest-uri-value - - and if "qop=auth-int", then A2 is - - A2 = ":" digest-uri-value ":" H(entity-body) - - - -Franks, et al. Standards Track [Page 16] - -RFC 2617 HTTP Authentication June 1999 - - - where "digest-uri-value" is the value of the "uri" directive on the - Authorization header in the request. The "cnonce-value" and "nc- - value" MUST be the ones for the client request to which this message - is the response. The "response-auth", "cnonce", and "nonce-count" - directives MUST BE present if "qop=auth" or "qop=auth-int" is - specified. - - The Authentication-Info header is allowed in the trailer of an HTTP - message transferred via chunked transfer-coding. - -3.3 Digest Operation - - Upon receiving the Authorization header, the server may check its - validity by looking up the password that corresponds to the submitted - username. Then, the server must perform the same digest operation - (e.g., MD5) performed by the client, and compare the result to the - given request-digest value. - - Note that the HTTP server does not actually need to know the user's - cleartext password. As long as H(A1) is available to the server, the - validity of an Authorization header may be verified. 
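The server-side check described above, working only from a stored H(A1), can be sketched as follows, together with the "rspauth" value for the Authentication-Info header. This is a simplified illustration under stated assumptions: the names `verify`, `rspauth` and the dictionary `d` of already-parsed directives are hypothetical, only qop=auth and the no-qop case are handled, and the constant-time comparison is a choice of this sketch, not something the text requires.

```python
import hashlib
import hmac


def h(data: str) -> str:
    """H(data): MD5 digest as 32 lowercase hex digits (UTF-8 assumed)."""
    return hashlib.md5(data.encode("utf-8")).hexdigest()


def _digest(stored_ha1, a2_prefix, d):
    """Shared calculation; a2_prefix is the Method, or "" for rspauth."""
    qop = d.get("qop")
    if qop not in (None, "auth"):
        raise ValueError("this sketch does not cover qop=auth-int")
    ha2 = h(f"{a2_prefix}:{d['uri']}")
    if qop == "auth":
        return h(f"{stored_ha1}:{d['nonce']}:{d['nc']}:{d['cnonce']}:auth:{ha2}")
    return h(f"{stored_ha1}:{d['nonce']}:{ha2}")


def verify(stored_ha1, method, d):
    """Recompute the request-digest and compare it with the one received."""
    expected = _digest(stored_ha1, method, d)
    received = d.get("response", "")
    return hmac.compare_digest(expected.encode("utf-8"), received.encode("utf-8"))


def rspauth(stored_ha1, d):
    """response-digest for Authentication-Info: A2 = ":" digest-uri-value."""
    return _digest(stored_ha1, "", d)


# Directives from the Authorization header in section 3.5.
d = {
    "uri": "/dir/index.html", "qop": "auth", "nc": "00000001",
    "nonce": "dcd98b7102dd2f0e8b11d0f600bfb0c093", "cnonce": "0a4f113b",
    "response": "6629fae49393a05397450978507c4ef1",
}
stored = h("Mufasa:testrealm@host.com:Circle Of Life")  # kept instead of the password
print(verify(stored, "GET", d))
```

Since the method is part of A2, replaying the same directives with a different method fails the check, while the server never needs the cleartext password.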
- - The client response to a WWW-Authenticate challenge for a protection - space starts an authentication session with that protection space. - The authentication session lasts until the client receives another - WWW-Authenticate challenge from any server in the protection space. A - client should remember the username, password, nonce, nonce count and - opaque values associated with an authentication session to use to - construct the Authorization header in future requests within that - protection space. The Authorization header may be included - preemptively; doing so improves server efficiency and avoids extra - round trips for authentication challenges. The server may choose to - accept the old Authorization header information, even though the - nonce value included might not be fresh. Alternatively, the server - may return a 401 response with a new nonce value, causing the client - to retry the request; by specifying stale=TRUE with this response, - the server tells the client to retry with the new nonce, but without - prompting for a new username and password. - - Because the client is required to return the value of the opaque - directive given to it by the server for the duration of a session, - the opaque data may be used to transport authentication session state - information. (Note that any such use can also be accomplished more - easily and safely by including the state in the nonce.) For example, - a server could be responsible for authenticating content that - actually sits on another server. It would achieve this by having the - first 401 response include a domain directive whose value includes a - URI on the second server, and an opaque directive whose value - - - -Franks, et al. Standards Track [Page 17] - -RFC 2617 HTTP Authentication June 1999 - - - contains the state information. The client will retry the request, at - which time the server might respond with a 301/302 redirection, - pointing to the URI on the second server. 
The client will follow the - redirection, and pass an Authorization header, including the - <opaque> data. - - As with the basic scheme, proxies must be completely transparent in - the Digest access authentication scheme. That is, they must forward - the WWW-Authenticate, Authentication-Info and Authorization headers - untouched. If a proxy wants to authenticate a client before a request - is forwarded to the server, it can be done using the Proxy- - Authenticate and Proxy-Authorization headers described in section 3.6 - below. - -3.4 Security Protocol Negotiation - - It is useful for a server to be able to know which security schemes a - client is capable of handling. - - It is possible that a server may want to require Digest as its - authentication method, even if the server does not know that the - client supports it. A client is encouraged to fail gracefully if the - server specifies only authentication schemes it cannot handle. - -3.5 Example - - The following example assumes that an access-protected document is - being requested from the server via a GET request. The URI of the - document is "http://www.nowhere.org/dir/index.html". Both client and - server know that the username for this document is "Mufasa", and the - password is "Circle Of Life" (with one space between each of the - three words). - - The first time the client requests the document, no Authorization - header is sent, so the server responds with: - - HTTP/1.1 401 Unauthorized - WWW-Authenticate: Digest - realm="testrealm@host.com", - qop="auth,auth-int", - nonce="dcd98b7102dd2f0e8b11d0f600bfb0c093", - opaque="5ccc069c403ebaf9f0171e9517f40e41" - - The client may prompt the user for the username and password, after - which it will respond with a new request, including the following - Authorization header: - - - - - -Franks, et al.
Standards Track [Page 18] - -RFC 2617 HTTP Authentication June 1999 - - - Authorization: Digest username="Mufasa", - realm="testrealm@host.com", - nonce="dcd98b7102dd2f0e8b11d0f600bfb0c093", - uri="/dir/index.html", - qop=auth, - nc=00000001, - cnonce="0a4f113b", - response="6629fae49393a05397450978507c4ef1", - opaque="5ccc069c403ebaf9f0171e9517f40e41" - -3.6 Proxy-Authentication and Proxy-Authorization - - The digest authentication scheme may also be used for authenticating - users to proxies, proxies to proxies, or proxies to origin servers by - use of the Proxy-Authenticate and Proxy-Authorization headers. These - headers are instances of the Proxy-Authenticate and Proxy- - Authorization headers specified in sections 10.33 and 10.34 of the - HTTP/1.1 specification [2] and their behavior is subject to - restrictions described there. The transactions for proxy - authentication are very similar to those already described. Upon - receiving a request which requires authentication, the proxy/server - must issue the "407 Proxy Authentication Required" response with a - "Proxy-Authenticate" header. The digest-challenge used in the - Proxy-Authenticate header is the same as that for the WWW- - Authenticate header as defined above in section 3.2.1. - - The client/proxy must then re-issue the request with a Proxy- - Authorization header, with directives as specified for the - Authorization header in section 3.2.2 above. - - On subsequent responses, the server sends Proxy-Authentication-Info - with directives the same as those for the Authentication-Info header - field. - - Note that in principle a client could be asked to authenticate itself - to both a proxy and an end-server, but never in the same response. 
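Since Proxy-Authenticate and Proxy-Authorization carry the same directives as their WWW-Authenticate and Authorization counterparts, one parser can serve both. The following is a rough sketch only: `parse_digest_header` and the regular expression are hypothetical, quoted-pair escapes inside quoted strings are matched but not unescaped, and malformed input is not reported beyond a wrong scheme.

```python
import re

# name = "quoted string"  or  name = token
_DIRECTIVE = re.compile(
    r'([A-Za-z][\w-]*)\s*=\s*(?:"((?:[^"\\]|\\.)*)"|([^\s,"]+))'
)


def parse_digest_header(value: str) -> dict:
    """Split a Digest challenge or credentials value into its directives."""
    parts = value.split(None, 1)
    if len(parts) != 2 or parts[0].lower() != "digest":
        raise ValueError("not a Digest header value")
    directives = {}
    for match in _DIRECTIVE.finditer(parts[1]):
        name = match.group(1).lower()
        quoted, token = match.group(2), match.group(3)
        # The surrounding quotation marks are dropped, as "unq" requires.
        directives[name] = quoted if quoted is not None else token
    return directives


credentials = parse_digest_header(
    'Digest username="Mufasa",\n'
    '   realm="testrealm@host.com",\n'
    '   nonce="dcd98b7102dd2f0e8b11d0f600bfb0c093",\n'
    '   uri="/dir/index.html",\n'
    '   qop=auth,\n'
    '   nc=00000001,\n'
    '   cnonce="0a4f113b",\n'
    '   response="6629fae49393a05397450978507c4ef1",\n'
    '   opaque="5ccc069c403ebaf9f0171e9517f40e41"'
)
print(credentials["username"], credentials["qop"], credentials["nc"])
```

The same function accepts the challenge from the 401 or 407 response, where a quoted value such as qop="auth,auth-int" contains a comma of its own.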
- -4 Security Considerations - -4.1 Authentication of Clients using Basic Authentication - - The Basic authentication scheme is not a secure method of user - authentication, nor does it in any way protect the entity, which is - transmitted in cleartext across the physical network used as the - carrier. HTTP does not prevent additional authentication schemes and - encryption mechanisms from being employed to increase security or the - addition of enhancements (such as schemes to use one-time passwords) - to Basic authentication. - - - -Franks, et al. Standards Track [Page 19] - -RFC 2617 HTTP Authentication June 1999 - - - The most serious flaw in Basic authentication is that it results in - the essentially cleartext transmission of the user's password over - the physical network. It is this problem which Digest Authentication - attempts to address. - - Because Basic authentication involves the cleartext transmission of - passwords it SHOULD NOT be used (without enhancements) to protect - sensitive or valuable information. - - A common use of Basic authentication is for identification purposes - -- requiring the user to provide a user name and password as a means - of identification, for example, for purposes of gathering accurate - usage statistics on a server. When used in this way it is tempting to - think that there is no danger in its use if illicit access to the - protected documents is not a major concern. This is only correct if - the server issues both user name and password to the users and in - particular does not allow the user to choose his or her own password. - The danger arises because naive users frequently reuse a single - password to avoid the task of maintaining multiple passwords. - - If a server permits users to select their own passwords, then the - threat is not only unauthorized access to documents on the server but - also unauthorized access to any other resources on other systems that - the user protects with the same password. 
Furthermore, in the - server's password database, many of the passwords may also be users' - passwords for other sites. The owner or administrator of such a - system could therefore expose all users of the system to the risk of - unauthorized access to all those sites if this information is not - maintained in a secure fashion. - - Basic Authentication is also vulnerable to spoofing by counterfeit - servers. If a user can be led to believe that he is connecting to a - host containing information protected by Basic authentication when, - in fact, he is connecting to a hostile server or gateway, then the - attacker can request a password, store it for later use, and feign an - error. This type of attack is not possible with Digest - Authentication. Server implementers SHOULD guard against the - possibility of this sort of counterfeiting by gateways or CGI - scripts. In particular it is very dangerous for a server to simply - turn over a connection to a gateway. That gateway can then use the - persistent connection mechanism to engage in multiple transactions - with the client while impersonating the original server in a way that - is not detectable by the client. - -4.2 Authentication of Clients using Digest Authentication - - Digest Authentication does not provide a strong authentication - mechanism, when compared to public key based mechanisms, for example. - - - -Franks, et al. Standards Track [Page 20] - -RFC 2617 HTTP Authentication June 1999 - - - However, it is significantly stronger than (e.g.) CRAM-MD5, which has - been proposed for use with LDAP [10], POP and IMAP (see RFC 2195 - [9]). It is intended to replace the much weaker and even more - dangerous Basic mechanism. - - Digest Authentication offers no confidentiality protection beyond - protecting the actual password. All of the rest of the request and - response are available to an eavesdropper. - - Digest Authentication offers only limited integrity protection for - the messages in either direction. 
If qop=auth-int mechanism is used, - those parts of the message used in the calculation of the WWW- - Authenticate and Authorization header field response directive values - (see section 3.2 above) are protected. Most header fields and their - values could be modified as a part of a man-in-the-middle attack. - - Many needs for secure HTTP transactions cannot be met by Digest - Authentication. For those needs TLS or SHTTP are more appropriate - protocols. In particular Digest authentication cannot be used for any - transaction requiring confidentiality protection. Nevertheless many - functions remain for which Digest authentication is both useful and - appropriate. Any service in present use that uses Basic should be - switched to Digest as soon as practical. - -4.3 Limited Use Nonce Values - - The Digest scheme uses a server-specified nonce to seed the - generation of the request-digest value (as specified in section - 3.2.2.1 above). As shown in the example nonce in section 3.2.1, the - server is free to construct the nonce such that it may only be used - from a particular client, for a particular resource, for a limited - period of time or number of uses, or any other restrictions. Doing - so strengthens the protection provided against, for example, replay - attacks (see 4.5). However, it should be noted that the method - chosen for generating and checking the nonce also has performance and - resource implications. For example, a server may choose to allow - each nonce value to be used only once by maintaining a record of - whether or not each recently issued nonce has been returned and - sending a next-nonce directive in the Authentication-Info header - field of every response. This protects against even an immediate - replay attack, but has a high cost checking nonce values, and perhaps - more important will cause authentication failures for any pipelined - requests (presumably returning a stale nonce indication). 
Similarly, - incorporating a request-specific element such as the Etag value for a - resource limits the use of the nonce to that version of the resource - and also defeats pipelining. Thus it may be useful to do so for - methods with side effects but have unacceptable performance for those - that do not. - - - -Franks, et al. Standards Track [Page 21] - -RFC 2617 HTTP Authentication June 1999 - - -4.4 Comparison of Digest with Basic Authentication - - Both Digest and Basic Authentication are very much on the weak end of - the security strength spectrum. But a comparison between the two - points out the utility, even necessity, of replacing Basic by Digest. - - The greatest threat to the type of transactions for which these - protocols are used is network snooping. This kind of transaction - might involve, for example, online access to a database whose use is - restricted to paying subscribers. With Basic authentication an - eavesdropper can obtain the password of the user. This not only - permits him to access anything in the database, but, often worse, - will permit access to anything else the user protects with the same - password. - - By contrast, with Digest Authentication the eavesdropper only gets - access to the transaction in question and not to the user's password. - The information gained by the eavesdropper would permit a replay - attack, but only with a request for the same document, and even that - may be limited by the server's choice of nonce. - -4.5 Replay Attacks - - A replay attack against Digest authentication would usually be - pointless for a simple GET request since an eavesdropper would - already have seen the only document he could obtain with a replay. - This is because the URI of the requested document is digested in the - client request and the server will only deliver that document. By - contrast under Basic Authentication once the eavesdropper has the - user's password, any document protected by that password is open to - him. 
- - Thus, for some purposes, it is necessary to protect against replay - attacks. A good Digest implementation can do this in various ways. - The server created "nonce" value is implementation dependent, but if - it contains a digest of the client IP, a time-stamp, the resource - ETag, and a private server key (as recommended above) then a replay - attack is not simple. An attacker must convince the server that the - request is coming from a false IP address and must cause the server - to deliver the document to an IP address different from the address - to which it believes it is sending the document. An attack can only - succeed in the period before the time-stamp expires. Digesting the - client IP and time-stamp in the nonce permits an implementation which - does not maintain state between transactions. - - For applications where no possibility of replay attack can be - tolerated the server can use one-time nonce values which will not be - honored for a second use. This requires the overhead of the server - - - -Franks, et al. Standards Track [Page 22] - -RFC 2617 HTTP Authentication June 1999 - - - remembering which nonce values have been used until the nonce time- - stamp (and hence the digest built with it) has expired, but it - effectively protects against replay attacks. - - An implementation must give special attention to the possibility of - replay attacks with POST and PUT requests. Unless the server employs - one-time or otherwise limited-use nonces and/or insists on the use of - the integrity protection of qop=auth-int, an attacker could replay - valid credentials from a successful request with counterfeit form - data or other message body. Even with the use of integrity protection - most metadata in header fields is not protected. Proper nonce - generation and checking provides some protection against replay of - previously used valid credentials, but see 4.8. 
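A nonce that digests the client IP, a time-stamp, the resource ETag and a private server key lets the server check freshness without keeping per-client state. The sketch below is one possible arrangement, not the format the specification mandates: the names `make_nonce` and `check_nonce`, the "timestamp-digest" layout, the five-minute lifetime and the use of MD5 (kept only for consistency with the text; a keyed HMAC over a modern hash would be a reasonable substitute) are all assumptions.

```python
import hashlib
import hmac
import time


def _mac(ts, client_ip, etag, private_key):
    """Digest binding the time-stamp to the client, resource and server key."""
    material = f"{ts}:{client_ip}:{etag}:{private_key}"
    return hashlib.md5(material.encode("utf-8")).hexdigest()


def make_nonce(client_ip, etag, private_key, now=None):
    """Build a nonce of the hypothetical form "<timestamp>-<digest>"."""
    ts = str(int(time.time() if now is None else now))
    return f"{ts}-{_mac(ts, client_ip, etag, private_key)}"


def check_nonce(nonce, client_ip, etag, private_key, max_age=300, now=None):
    """Return "ok", "stale" (reply 401 with stale=TRUE) or "invalid"."""
    ts, sep, mac = nonce.partition("-")
    if not sep or not (ts.isascii() and ts.isdigit()):
        return "invalid"
    expected = _mac(ts, client_ip, etag, private_key)
    if not hmac.compare_digest(mac.encode("utf-8"), expected.encode("utf-8")):
        return "invalid"
    current = time.time() if now is None else now
    return "ok" if current - int(ts) <= max_age else "stale"


nonce = make_nonce("192.0.2.10", '"v1"', "server-secret", now=1000)
print(check_nonce(nonce, "192.0.2.10", '"v1"', "server-secret", now=1200))
```

One-time nonces would additionally need a record of the values already seen, kept until their time-stamps expire, which is the overhead the text refers to.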
- -4.6 Weakness Created by Multiple Authentication Schemes - - An HTTP/1.1 server may return multiple challenges with a 401 - (Unauthorized) response, and each challenge may use a different - auth-scheme. A user agent MUST choose to use the strongest auth- - scheme it understands and request credentials from the user based - upon that challenge. - - Note that many browsers will only recognize Basic and will require - that it be the first auth-scheme presented. Servers should only - include Basic if it is minimally acceptable. - - When the server offers choices of authentication schemes using the - WWW-Authenticate header, the strength of the resulting authentication - is only as good as that of the weakest of the authentication - schemes. See section 4.8 below for discussion of particular attack - scenarios that exploit multiple authentication schemes. - -4.7 Online dictionary attacks - - If the attacker can eavesdrop, then it can test any overheard - nonce/response pairs against a list of common words. Such a list is - usually much smaller than the total number of possible passwords. The - cost of computing the response for each password on the list is paid - once for each challenge. - - The server can mitigate this attack by not allowing users to select - passwords that are in a dictionary. - - - - - - - - - -Franks, et al. Standards Track [Page 23] - -RFC 2617 HTTP Authentication June 1999 - - -4.8 Man in the Middle - - Both Basic and Digest authentication are vulnerable to "man in the - middle" (MITM) attacks, for example, from a hostile or compromised - proxy. Clearly, this would present all the problems of eavesdropping. - But it also offers some additional opportunities to the attacker. - - A possible man-in-the-middle attack would be to add a weak - authentication scheme to the set of choices, hoping that the client - will use one that exposes the user's credentials (e.g. password).
For - this reason, the client should always use the strongest scheme that - it understands from the choices offered. - - An even better MITM attack would be to remove all offered choices, - replacing them with a challenge that requests only Basic - authentication, then uses the cleartext credentials from the Basic - authentication to authenticate to the origin server using the - stronger scheme it requested. A particularly insidious way to mount - such a MITM attack would be to offer a "free" proxy caching service - to gullible users. - - User agents should consider measures such as presenting a visual - indication at the time of the credentials request of what - authentication scheme is to be used, or remembering the strongest - authentication scheme ever requested by a server and produce a - warning message before using a weaker one. It might also be a good - idea for the user agent to be configured to demand Digest - authentication in general, or from specific sites. - - Or, a hostile proxy might spoof the client into making a request the - attacker wanted rather than one the client wanted. Of course, this is - still much harder than a comparable attack against Basic - Authentication. - -4.9 Chosen plaintext attacks - - With Digest authentication, a MITM or a malicious server can - arbitrarily choose the nonce that the client will use to compute the - response. This is called a "chosen plaintext" attack. The ability to - choose the nonce is known to make cryptanalysis much easier [8]. - - However, no way to analyze the MD5 one-way function used by Digest - using chosen plaintext is currently known. - - The countermeasure against this attack is for clients to be - configured to require the use of the optional "cnonce" directive; - this allows the client to vary the input to the hash in a way not - chosen by the attacker. - - - -Franks, et al. 
Standards Track [Page 24] - -RFC 2617 HTTP Authentication June 1999 - - -4.10 Precomputed dictionary attacks - - With Digest authentication, if the attacker can execute a chosen - plaintext attack, the attacker can precompute the response for many - common words to a nonce of its choice, and store a dictionary of - (response, password) pairs. Such precomputation can often be done in - parallel on many machines. It can then use the chosen plaintext - attack to acquire a response corresponding to that challenge, and - just look up the password in the dictionary. Even if most passwords - are not in the dictionary, some might be. Since the attacker gets to - pick the challenge, the cost of computing the response for each - password on the list can be amortized over finding many passwords. A - dictionary with 100 million password/response pairs would take about - 3.2 gigabytes of disk storage. - - The countermeasure against this attack is for clients to be - configured to require the use of the optional "cnonce" directive. - -4.11 Batch brute force attacks - - With Digest authentication, a MITM can execute a chosen plaintext - attack, and can gather responses from many users to the same nonce. - It can then find all the passwords within any subset of password - space that would generate one of the nonce/response pairs in a single - pass over that space. It also reduces the time to find the first - password by a factor equal to the number of nonce/response pairs - gathered. This search of the password space can often be done in - parallel on many machines, and even a single machine can search large - subsets of the password space very quickly -- reports exist of - searching all passwords with six or fewer letters in a few hours. - - The countermeasure against this attack is for clients to be - configured to require the use of the optional "cnonce" directive.
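The "cnonce" countermeasure named above only helps if the client value is unpredictable to the attacker. A client might generate it, along with the matching nonce count, roughly as follows. The helper names `new_cnonce` and `format_nc` are hypothetical and the 16-byte length is an arbitrary choice of this sketch. The eight-hex-digit form of the count matches the "nc=00000001" directive in the example of section 3.5.

```python
import secrets


def new_cnonce(nbytes: int = 16) -> str:
    """Return a fresh client nonce from a cryptographic random source."""
    return secrets.token_hex(nbytes)


def format_nc(count: int) -> str:
    """Format the nonce count as exactly eight lowercase hex digits."""
    if not 1 <= count <= 0xFFFFFFFF:
        raise ValueError("nonce count must fit in eight hex digits")
    return f"{count:08x}"


cnonce = new_cnonce()
print(cnonce, format_nc(1))
```

Because the cnonce enters the digested string, a dictionary precomputed against an attacker-chosen server nonce no longer matches the responses the client actually sends.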
- -4.12 Spoofing by Counterfeit Servers - - Basic Authentication is vulnerable to spoofing by counterfeit - servers. If a user can be led to believe that she is connecting to a - host containing information protected by a password she knows, when - in fact she is connecting to a hostile server, then the hostile - server can request a password, store it away for later use, and feign - an error. This type of attack is more difficult with Digest - Authentication -- but the client must know to demand that Digest - authentication be used, perhaps using some of the techniques - described above to counter "man-in-the-middle" attacks. Again, the - user can be helped in detecting this attack by a visual indication of - the authentication mechanism in use with appropriate guidance in - interpreting the implications of each scheme. - - - -Franks, et al. Standards Track [Page 25] - -RFC 2617 HTTP Authentication June 1999 - - -4.13 Storing passwords - - Digest authentication requires that the authenticating agent (usually - the server) store some data derived from the user's name and password - in a "password file" associated with a given realm. Normally this - might contain pairs consisting of username and H(A1), where H(A1) is - the digested value of the username, realm, and password as described - above. - - The security implications of this are that if this password file is - compromised, then an attacker gains immediate access to documents on - the server using this realm. Unlike, say a standard UNIX password - file, this information need not be decrypted in order to access - documents in the server realm associated with this file. On the other - hand, decryption, or more likely a brute force attack, would be - necessary to obtain the user's password. This is the reason that the - realm is part of the digested data stored in the password file. 
It - means that if one Digest authentication password file is compromised, - it does not automatically compromise others with the same username - and password (though it does expose them to brute force attack). - - There are two important security consequences of this. First, the - password file must be protected as if it contained unencrypted - passwords, because for the purpose of accessing documents in its - realm, it effectively does. - - A second consequence of this is that the realm string should be - unique among all realms which any single user is likely to use. In - particular a realm string should include the name of the host doing - the authentication. The inability of the client to authenticate the - server is a weakness of Digest Authentication. - -4.14 Summary - - By modern cryptographic standards Digest Authentication is weak. But - for a large range of purposes it is valuable as a replacement for - Basic Authentication. It remedies some, but not all, weaknesses of - Basic Authentication. Its strength may vary depending on the - implementation. In particular the structure of the nonce (which is - dependent on the server implementation) may affect the ease of - mounting a replay attack. A range of server options is appropriate - since, for example, some implementations may be willing to accept the - server overhead of one-time nonces or digests to eliminate the - possibility of replay. Others may be satisfied with a nonce like the one - recommended above restricted to a single IP address and a single ETag - or with a limited lifetime. - - - - - -Franks, et al. Standards Track [Page 26] - -RFC 2617 HTTP Authentication June 1999 - - - The bottom line is that *any* compliant implementation will be - relatively weak by cryptographic standards, but *any* compliant - implementation will be far superior to Basic Authentication.
- -5 Sample implementation - - [[ WARNING: DigestCalcHA1 IS WRONG ]] - - The following code implements the calculations of H(A1), H(A2), - request-digest and response-digest, and a test program which computes - the values used in the example of section 3.5. It uses the MD5 - implementation from RFC 1321. - - File "digcalc.h": - -#define HASHLEN 16 -typedef char HASH[HASHLEN]; -#define HASHHEXLEN 32 -typedef char HASHHEX[HASHHEXLEN+1]; -#define IN -#define OUT - -/* calculate H(A1) as per HTTP Digest spec */ -void DigestCalcHA1( - IN char * pszAlg, - IN char * pszUserName, - IN char * pszRealm, - IN char * pszPassword, - IN char * pszNonce, - IN char * pszCNonce, - OUT HASHHEX SessionKey - ); - -/* calculate request-digest/response-digest as per HTTP Digest spec */ -void DigestCalcResponse( - IN HASHHEX HA1, /* H(A1) */ - IN char * pszNonce, /* nonce from server */ - IN char * pszNonceCount, /* 8 hex digits */ - IN char * pszCNonce, /* client nonce */ - IN char * pszQop, /* qop-value: "", "auth", "auth-int" */ - IN char * pszMethod, /* method from the request */ - IN char * pszDigestUri, /* requested URL */ - IN HASHHEX HEntity, /* H(entity body) if qop="auth-int" */ - OUT HASHHEX Response /* request-digest or response-digest */ - ); - -File "digcalc.c": - -#include <global.h> -#include <md5.h> - - - -Franks, et al. 
Standards Track [Page 27] - -RFC 2617 HTTP Authentication June 1999 - - -#include <string.h> -#include "digcalc.h" - -void CvtHex( - IN HASH Bin, - OUT HASHHEX Hex - ) -{ - unsigned short i; - unsigned char j; - - for (i = 0; i < HASHLEN; i++) { - j = (Bin[i] >> 4) & 0xf; - if (j <= 9) - Hex[i*2] = (j + '0'); - else - Hex[i*2] = (j + 'a' - 10); - j = Bin[i] & 0xf; - if (j <= 9) - Hex[i*2+1] = (j + '0'); - else - Hex[i*2+1] = (j + 'a' - 10); - }; - Hex[HASHHEXLEN] = '\0'; -}; - -/* calculate H(A1) as per spec */ -void DigestCalcHA1( - IN char * pszAlg, - IN char * pszUserName, - IN char * pszRealm, - IN char * pszPassword, - IN char * pszNonce, - IN char * pszCNonce, - OUT HASHHEX SessionKey - ) -{ - MD5_CTX Md5Ctx; - HASH HA1; - - MD5Init(&Md5Ctx); - MD5Update(&Md5Ctx, pszUserName, strlen(pszUserName)); - MD5Update(&Md5Ctx, ":", 1); - MD5Update(&Md5Ctx, pszRealm, strlen(pszRealm)); - MD5Update(&Md5Ctx, ":", 1); - MD5Update(&Md5Ctx, pszPassword, strlen(pszPassword)); - MD5Final(HA1, &Md5Ctx); - if (stricmp(pszAlg, "md5-sess") == 0) { - - - -Franks, et al. 
Standards Track [Page 28] - -RFC 2617 HTTP Authentication June 1999 - - - MD5Init(&Md5Ctx); - MD5Update(&Md5Ctx, HA1, HASHLEN); - MD5Update(&Md5Ctx, ":", 1); - MD5Update(&Md5Ctx, pszNonce, strlen(pszNonce)); - MD5Update(&Md5Ctx, ":", 1); - MD5Update(&Md5Ctx, pszCNonce, strlen(pszCNonce)); - MD5Final(HA1, &Md5Ctx); - }; - CvtHex(HA1, SessionKey); -}; - -/* calculate request-digest/response-digest as per HTTP Digest spec */ -void DigestCalcResponse( - IN HASHHEX HA1, /* H(A1) */ - IN char * pszNonce, /* nonce from server */ - IN char * pszNonceCount, /* 8 hex digits */ - IN char * pszCNonce, /* client nonce */ - IN char * pszQop, /* qop-value: "", "auth", "auth-int" */ - IN char * pszMethod, /* method from the request */ - IN char * pszDigestUri, /* requested URL */ - IN HASHHEX HEntity, /* H(entity body) if qop="auth-int" */ - OUT HASHHEX Response /* request-digest or response-digest */ - ) -{ - MD5_CTX Md5Ctx; - HASH HA2; - HASH RespHash; - HASHHEX HA2Hex; - - // calculate H(A2) - MD5Init(&Md5Ctx); - MD5Update(&Md5Ctx, pszMethod, strlen(pszMethod)); - MD5Update(&Md5Ctx, ":", 1); - MD5Update(&Md5Ctx, pszDigestUri, strlen(pszDigestUri)); - if (stricmp(pszQop, "auth-int") == 0) { - MD5Update(&Md5Ctx, ":", 1); - MD5Update(&Md5Ctx, HEntity, HASHHEXLEN); - }; - MD5Final(HA2, &Md5Ctx); - CvtHex(HA2, HA2Hex); - - // calculate response - MD5Init(&Md5Ctx); - MD5Update(&Md5Ctx, HA1, HASHHEXLEN); - MD5Update(&Md5Ctx, ":", 1); - MD5Update(&Md5Ctx, pszNonce, strlen(pszNonce)); - MD5Update(&Md5Ctx, ":", 1); - if (*pszQop) { - - - -Franks, et al. 
Standards Track [Page 29] - -RFC 2617 HTTP Authentication June 1999 - - - MD5Update(&Md5Ctx, pszNonceCount, strlen(pszNonceCount)); - MD5Update(&Md5Ctx, ":", 1); - MD5Update(&Md5Ctx, pszCNonce, strlen(pszCNonce)); - MD5Update(&Md5Ctx, ":", 1); - MD5Update(&Md5Ctx, pszQop, strlen(pszQop)); - MD5Update(&Md5Ctx, ":", 1); - }; - MD5Update(&Md5Ctx, HA2Hex, HASHHEXLEN); - MD5Final(RespHash, &Md5Ctx); - CvtHex(RespHash, Response); -}; - -File "digtest.c": - - -#include <stdio.h> -#include "digcalc.h" - -void main(int argc, char ** argv) { - - char * pszNonce = "dcd98b7102dd2f0e8b11d0f600bfb0c093"; - char * pszCNonce = "0a4f113b"; - char * pszUser = "Mufasa"; - char * pszRealm = "testrealm@host.com"; - char * pszPass = "Circle Of Life"; - char * pszAlg = "md5"; - char szNonceCount[9] = "00000001"; - char * pszMethod = "GET"; - char * pszQop = "auth"; - char * pszURI = "/dir/index.html"; - HASHHEX HA1; - HASHHEX HA2 = ""; - HASHHEX Response; - - DigestCalcHA1(pszAlg, pszUser, pszRealm, pszPass, pszNonce, -pszCNonce, HA1); - DigestCalcResponse(HA1, pszNonce, szNonceCount, pszCNonce, pszQop, - pszMethod, pszURI, HA2, Response); - printf("Response = %s\n", Response); -}; - - - - - - - - - - - -Franks, et al. Standards Track [Page 30] - -RFC 2617 HTTP Authentication June 1999 - - -6 Acknowledgments - - Eric W. Sink, of AbiSource, Inc., was one of the original authors - before the specification underwent substantial revision. - - In addition to the authors, valuable discussion instrumental in - creating this document has come from Peter J. Churchyard, Ned Freed, - and David M. Kristol. - - Jim Gettys and Larry Masinter edited this document for update. - -7 References - - [1] Berners-Lee, T., Fielding, R. and H. Frystyk, "Hypertext - Transfer Protocol -- HTTP/1.0", RFC 1945, May 1996. - - [2] Fielding, R., Gettys, J., Mogul, J., Frysyk, H., Masinter, L., - Leach, P. and T. Berners-Lee, "Hypertext Transfer Protocol -- - HTTP/1.1", RFC 2616, June 1999. 
- - [3] Rivest, R., "The MD5 Message-Digest Algorithm", RFC 1321, April - 1992. - - [4] Freed, N. and N. Borenstein. "Multipurpose Internet Mail - Extensions (MIME) Part One: Format of Internet Message Bodies", - RFC 2045, November 1996. - - [5] Dierks, T. and C. Allen "The TLS Protocol, Version 1.0", RFC - 2246, January 1999. - - [6] Franks, J., Hallam-Baker, P., Hostetler, J., Leach, P., - Luotonen, A., Sink, E. and L. Stewart, "An Extension to HTTP : - Digest Access Authentication", RFC 2069, January 1997. - - [7] Berners Lee, T, Fielding, R. and L. Masinter, "Uniform Resource - Identifiers (URI): Generic Syntax", RFC 2396, August 1998. - - [8] Kaliski, B.,Robshaw, M., "Message Authentication with MD5", - CryptoBytes, Sping 1995, RSA Inc, - (http://www.rsa.com/rsalabs/pubs/cryptobytes/spring95/md5.htm) - - [9] Klensin, J., Catoe, R. and P. Krumviede, "IMAP/POP AUTHorize - Extension for Simple Challenge/Response", RFC 2195, September - 1997. - - [10] Morgan, B., Alvestrand, H., Hodges, J., Wahl, M., - "Authentication Methods for LDAP", Work in Progress. - - - - -Franks, et al. Standards Track [Page 31] - -RFC 2617 HTTP Authentication June 1999 - - -8 Authors' Addresses - - John Franks - Professor of Mathematics - Department of Mathematics - Northwestern University - Evanston, IL 60208-2730, USA - - EMail: john@math.nwu.edu - - - Phillip M. Hallam-Baker - Principal Consultant - Verisign Inc. - 301 Edgewater Place - Suite 210 - Wakefield MA 01880, USA - - EMail: pbaker@verisign.com - - - Jeffery L. Hostetler - Software Craftsman - AbiSource, Inc. - 6 Dunlap Court - Savoy, IL 61874 - - EMail: jeff@AbiSource.com - - - Scott D. Lawrence - Agranat Systems, Inc. - 5 Clocktower Place, Suite 400 - Maynard, MA 01754, USA - - EMail: lawrence@agranat.com - - - Paul J. Leach - Microsoft Corporation - 1 Microsoft Way - Redmond, WA 98052, USA - - EMail: paulle@microsoft.com - - - - - - - -Franks, et al. 
Standards Track [Page 32] - -RFC 2617 HTTP Authentication June 1999 - - - Ari Luotonen - Member of Technical Staff - Netscape Communications Corporation - 501 East Middlefield Road - Mountain View, CA 94043, USA - - - Lawrence C. Stewart - Open Market, Inc. - 215 First Street - Cambridge, MA 02142, USA - - EMail: stewart@OpenMarket.com - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -Franks, et al. Standards Track [Page 33] - -RFC 2617 HTTP Authentication June 1999 - - -9. Full Copyright Statement - - Copyright (C) The Internet Society (1999). All Rights Reserved. - - This document and translations of it may be copied and furnished to - others, and derivative works that comment on or otherwise explain it - or assist in its implementation may be prepared, copied, published - and distributed, in whole or in part, without restriction of any - kind, provided that the above copyright notice and this paragraph are - included on all such copies and derivative works. However, this - document itself may not be modified in any way, such as by removing - the copyright notice or references to the Internet Society or other - Internet organizations, except as needed for the purpose of - developing Internet standards in which case the procedures for - copyrights defined in the Internet Standards process must be - followed, or as required to translate it into languages other than - English. - - The limited permissions granted above are perpetual and will not be - revoked by the Internet Society or its successors or assigns. - - This document and the information contained herein is provided on an - "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING - TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING - BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION - HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF - MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. 
- -Acknowledgement - - Funding for the RFC Editor function is currently provided by the - Internet Society. - - - - - - - - - - - - - - - - - - - -Franks, et al. Standards Track [Page 34] - diff --git a/docs/specs/rfc2817.txt b/docs/specs/rfc2817.txt deleted file mode 100644 index d7b7e703..00000000 --- a/docs/specs/rfc2817.txt +++ /dev/null @@ -1,731 +0,0 @@ - - - - - - -Network Working Group R. Khare -Request for Comments: 2817 4K Associates / UC Irvine -Updates: 2616 S. Lawrence -Category: Standards Track Agranat Systems, Inc. - May 2000 - - - Upgrading to TLS Within HTTP/1.1 - -Status of this Memo - - This document specifies an Internet standards track protocol for the - Internet community, and requests discussion and suggestions for - improvements. Please refer to the current edition of the "Internet - Official Protocol Standards" (STD 1) for the standardization state - and status of this protocol. Distribution of this memo is unlimited. - -Copyright Notice - - Copyright (C) The Internet Society (2000). All Rights Reserved. - -Abstract - - This memo explains how to use the Upgrade mechanism in HTTP/1.1 to - initiate Transport Layer Security (TLS) over an existing TCP - connection. This allows unsecured and secured HTTP traffic to share - the same well known port (in this case, http: at 80 rather than - https: at 443). It also enables "virtual hosting", so a single HTTP + - TLS server can disambiguate traffic intended for several hostnames at - a single IP address. - - Since HTTP/1.1 [1] defines Upgrade as a hop-by-hop mechanism, this - memo also documents the HTTP CONNECT method for establishing end-to- - end tunnels across HTTP proxies. Finally, this memo establishes new - IANA registries for public HTTP status codes, as well as public or - private Upgrade product tokens. 
- - This memo does NOT affect the current definition of the 'https' URI - scheme, which already defines a separate namespace - (http://example.org/ and https://example.org/ are not equivalent). - - - - - - - - - - - -Khare & Lawrence Standards Track [Page 1] - -RFC 2817 HTTP Upgrade to TLS May 2000 - - -Table of Contents - - 1. Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . 2 - 2. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 - 2.1 Requirements Terminology . . . . . . . . . . . . . . . . . . . 4 - 3. Client Requested Upgrade to HTTP over TLS . . . . . . . . . . 4 - 3.1 Optional Upgrade . . . . . . . . . . . . . . . . . . . . . . . 4 - 3.2 Mandatory Upgrade . . . . . . . . . . . . . . . . . . . . . . 4 - 3.3 Server Acceptance of Upgrade Request . . . . . . . . . . . . . 4 - 4. Server Requested Upgrade to HTTP over TLS . . . . . . . . . . 5 - 4.1 Optional Advertisement . . . . . . . . . . . . . . . . . . . . 5 - 4.2 Mandatory Advertisement . . . . . . . . . . . . . . . . . . . 5 - 5. Upgrade across Proxies . . . . . . . . . . . . . . . . . . . . 6 - 5.1 Implications of Hop By Hop Upgrade . . . . . . . . . . . . . . 6 - 5.2 Requesting a Tunnel with CONNECT . . . . . . . . . . . . . . . 6 - 5.3 Establishing a Tunnel with CONNECT . . . . . . . . . . . . . . 7 - 6. Rationale for the use of a 4xx (client error) Status Code . . 7 - 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 8 - 7.1 HTTP Status Code Registry . . . . . . . . . . . . . . . . . . 8 - 7.2 HTTP Upgrade Token Registry . . . . . . . . . . . . . . . . . 8 - 8. Security Considerations . . . . . . . . . . . . . . . . . . . 9 - 8.1 Implications for the https: URI Scheme . . . . . . . . . . . . 10 - 8.2 Security Considerations for CONNECT . . . . . . . . . . . . . 10 - References . . . . . . . . . . . . . . . . . . . . . . . . . . 10 - Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . 11 - A. Acknowledgments . . . . . . . . . . . . . . . . . . . . . 
. . 12 - Full Copyright Statement . . . . . . . . . . . . . . . . . . . 13 - -1. Motivation - - The historical practice of deploying HTTP over SSL3 [3] has - distinguished the combination from HTTP alone by a unique URI scheme - and the TCP port number. The scheme 'http' meant the HTTP protocol - alone on port 80, while 'https' meant the HTTP protocol over SSL on - port 443. Parallel well-known port numbers have similarly been - requested -- and in some cases, granted -- to distinguish between - secured and unsecured use of other application protocols (e.g. - snews, ftps). This approach effectively halves the number of - available well known ports. - - At the Washington DC IETF meeting in December 1997, the Applications - Area Directors and the IESG reaffirmed that the practice of issuing - parallel "secure" port numbers should be deprecated. The HTTP/1.1 - Upgrade mechanism can apply Transport Layer Security [6] to an open - HTTP connection. - - - - - - -Khare & Lawrence Standards Track [Page 2] - -RFC 2817 HTTP Upgrade to TLS May 2000 - - - In the nearly two years since, there has been broad acceptance of the - concept behind this proposal, but little interest in implementing - alternatives to port 443 for generic Web browsing. In fact, nothing - in this memo affects the current interpretation of https: URIs. - However, new application protocols built atop HTTP, such as the - Internet Printing Protocol [7], call for just such a mechanism in - order to move ahead in the IETF standards process. - - The Upgrade mechanism also solves the "virtual hosting" problem. - Rather than allocating multiple IP addresses to a single host, an - HTTP/1.1 server will use the Host: header to disambiguate the - intended web service. As HTTP/1.1 usage has grown more prevalent, - more ISPs are offering name-based virtual hosting, thus delaying IP - address space exhaustion. 
- - TLS (and SSL) have been hobbled by the same limitation as earlier - versions of HTTP: the initial handshake does not specify the intended - hostname, relying exclusively on the IP address. Using a cleartext - HTTP/1.1 Upgrade: preamble to the TLS handshake -- choosing the - certificates based on the initial Host: header -- will allow ISPs to - provide secure name-based virtual hosting as well. - -2. Introduction - - TLS, a.k.a., SSL (Secure Sockets Layer), establishes a private end- - to-end connection, optionally including strong mutual authentication, - using a variety of cryptosystems. Initially, a handshake phase uses - three subprotocols to set up a record layer, authenticate endpoints, - set parameters, as well as report errors. Then, there is an ongoing - layered record protocol that handles encryption, compression, and - reassembly for the remainder of the connection. The latter is - intended to be completely transparent. For example, there is no - dependency between TLS's record markers and or certificates and - HTTP/1.1's chunked encoding or authentication. - - Either the client or server can use the HTTP/1.1 [1] Upgrade - mechanism (Section 14.42) to indicate that a TLS-secured connection - is desired or necessary. This memo defines the "TLS/1.0" Upgrade - token, and a new HTTP Status Code, "426 Upgrade Required". - - Section 3 and Section 4 describe the operation of a directly - connected client and server. Intermediate proxies must establish an - end-to-end tunnel before applying those operations, as explained in - Section 5. - - - - - - - -Khare & Lawrence Standards Track [Page 3] - -RFC 2817 HTTP Upgrade to TLS May 2000 - - -2.1 Requirements Terminology - - Keywords "MUST", "MUST NOT", "REQUIRED", "SHOULD", "SHOULD NOT" and - "MAY" that appear in this document are to be interpreted as described - in RFC 2119 [11]. - -3. 
Client Requested Upgrade to HTTP over TLS - - When the client sends an HTTP/1.1 request with an Upgrade header - field containing the token "TLS/1.0", it is requesting the server to - complete the current HTTP/1.1 request after switching to TLS/1.0. - -3.1 Optional Upgrade - - A client MAY offer to switch to secured operation during any clear - HTTP request when an unsecured response would be acceptable: - - GET http://example.bank.com/acct_stat.html?749394889300 HTTP/1.1 - Host: example.bank.com - Upgrade: TLS/1.0 - Connection: Upgrade - - In this case, the server MAY respond to the clear HTTP operation - normally, OR switch to secured operation (as detailed in the next - section). - - Note that HTTP/1.1 [1] specifies "the upgrade keyword MUST be - supplied within a Connection header field (section 14.10) whenever - Upgrade is present in an HTTP/1.1 message". - -3.2 Mandatory Upgrade - - If an unsecured response would be unacceptable, a client MUST send an - OPTIONS request first to complete the switch to TLS/1.0 (if - possible). - - OPTIONS * HTTP/1.1 - Host: example.bank.com - Upgrade: TLS/1.0 - Connection: Upgrade - -3.3 Server Acceptance of Upgrade Request - - As specified in HTTP/1.1 [1], if the server is prepared to initiate - the TLS handshake, it MUST send the intermediate "101 Switching - Protocol" and MUST include an Upgrade response header specifying the - tokens of the protocol stack it is switching to: - - - - -Khare & Lawrence Standards Track [Page 4] - -RFC 2817 HTTP Upgrade to TLS May 2000 - - - HTTP/1.1 101 Switching Protocols - Upgrade: TLS/1.0, HTTP/1.1 - Connection: Upgrade - - Note that the protocol tokens listed in the Upgrade header of a 101 - Switching Protocols response specify an ordered 'bottom-up' stack. - - As specified in HTTP/1.1 [1], Section 10.1.2: "The server will - switch protocols to those defined by the response's Upgrade header - field immediately after the empty line which terminates the 101 - response". 
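The client request of Section 3.2 and the server's intermediate 101 response of Section 3.3 can be sketched in C as follows. This is an illustrative sketch only, not part of RFC 2817: the function names build_tls_upgrade_request and is_switching_protocols are hypothetical, the status-line check assumes the literal prefix "HTTP/1.1 101", and no socket or TLS handling is shown.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from RFC 2817): formats the mandatory-upgrade
 * request of Section 3.2 into buf. Returns the number of bytes written
 * (excluding the terminating NUL), or -1 if buf is too small. */
int build_tls_upgrade_request(char *buf, size_t buflen, const char *host)
{
    int n = snprintf(buf, buflen,
                     "OPTIONS * HTTP/1.1\r\n"
                     "Host: %s\r\n"
                     "Upgrade: TLS/1.0\r\n"
                     "Connection: Upgrade\r\n"
                     "\r\n",
                     host);
    if (n < 0 || (size_t)n >= buflen)
        return -1;
    return n;
}

/* Hypothetical helper: returns 1 if the status line carries status code
 * 101, 0 otherwise. Assumes an "HTTP/1.1" response; the reason phrase
 * is not inspected. */
int is_switching_protocols(const char *status_line)
{
    if (strncmp(status_line, "HTTP/1.1 101", 12) != 0)
        return 0;
    /* The code must end here, so that e.g. "1010" is not accepted. */
    return status_line[12] == ' ' || status_line[12] == '\r' ||
           status_line[12] == '\0';
}
```

A caller would send the formatted request, read the status line, and begin the TLS handshake only once is_switching_protocols reports 1 and the empty line terminating the 101 response has been consumed.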
- - Once the TLS handshake completes successfully, the server MUST - continue with the response to the original request. Any TLS handshake - failure MUST lead to disconnection, per the TLS error alert - specification. - -4. Server Requested Upgrade to HTTP over TLS - - The Upgrade response header field advertises possible protocol - upgrades a server MAY accept. In conjunction with the "426 Upgrade - Required" status code, a server can advertise the exact protocol - upgrade(s) that a client MUST accept to complete the request. - -4.1 Optional Advertisement - - As specified in HTTP/1.1 [1], the server MAY include an Upgrade - header in any response other than 101 or 426 to indicate a - willingness to switch to any (combination) of the protocols listed. - -4.2 Mandatory Advertisement - - A server MAY indicate that a client request can not be completed - without TLS using the "426 Upgrade Required" status code, which MUST - include an an Upgrade header field specifying the token of the - required TLS version. - - HTTP/1.1 426 Upgrade Required - Upgrade: TLS/1.0, HTTP/1.1 - Connection: Upgrade - - The server SHOULD include a message body in the 426 response which - indicates in human readable form the reason for the error and - describes any alternative courses which may be available to the user. - - Note that even if a client is willing to use TLS, it must use the - operations in Section 3 to proceed; the TLS handshake cannot begin - immediately after the 426 response. - - - -Khare & Lawrence Standards Track [Page 5] - -RFC 2817 HTTP Upgrade to TLS May 2000 - - -5. Upgrade across Proxies - - As a hop-by-hop header, Upgrade is negotiated between each pair of - HTTP counterparties. If a User Agent sends a request with an Upgrade - header to a proxy, it is requesting a change to the protocol between - itself and the proxy, not an end-to-end change. 
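A client's handling of the 426 response described in Section 4.2 might look like the following C sketch. It is an assumption-laden illustration rather than text from RFC 2817: upgrade_lists_token and needs_tls_upgrade are hypothetical names, and the token comparison is exact (case-sensitive), which is a simplification.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if the comma-separated Upgrade header
 * value contains token as a whole list element, 0 otherwise.
 * Comparison is exact; this is a simplifying assumption. */
int upgrade_lists_token(const char *value, const char *token)
{
    size_t tlen = strlen(token);
    const char *p = value;

    while (*p != '\0') {
        /* Skip separators and optional whitespace before an element. */
        while (*p == ' ' || *p == '\t' || *p == ',')
            p++;
        /* Measure the element up to the next comma or end of string. */
        const char *start = p;
        while (*p != '\0' && *p != ',')
            p++;
        const char *end = p;
        /* Trim trailing whitespace from the element. */
        while (end > start && (end[-1] == ' ' || end[-1] == '\t'))
            end--;
        if (tlen > 0 && (size_t)(end - start) == tlen &&
            memcmp(start, token, tlen) == 0)
            return 1;
    }
    return 0;
}

/* Hypothetical helper: returns 1 when the response is a 426 whose
 * Upgrade header names TLS/1.0. A NULL header value (for example,
 * when no Upgrade header was received) yields 0. */
int needs_tls_upgrade(int status_code, const char *upgrade_value)
{
    return status_code == 426 && upgrade_value != NULL &&
           upgrade_lists_token(upgrade_value, "TLS/1.0");
}
```

When needs_tls_upgrade returns 1, the client would still have to issue one of the Section 3 requests; the sketch only decides whether that step is called for.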
- - Since TLS, in particular, requires end-to-end connectivity to provide - authentication and prevent man-in-the-middle attacks, this memo - specifies the CONNECT method to establish a tunnel across proxies. - - Once a tunnel is established, any of the operations in Section 3 can - be used to establish a TLS connection. - -5.1 Implications of Hop By Hop Upgrade - - If an origin server receives an Upgrade header from a proxy and - responds with a 101 Switching Protocols response, it is changing the - protocol only on the connection between the proxy and itself. - Similarly, a proxy might return a 101 response to its client to - change the protocol on that connection independently of the protocols - it is using to communicate toward the origin server. - - These scenarios also complicate diagnosis of a 426 response. Since - Upgrade is a hop-by-hop header, a proxy that does not recognize 426 - might remove the accompanying Upgrade header and prevent the client - from determining the required protocol switch. If a client receives - a 426 status without an accompanying Upgrade header, it will need to - request an end to end tunnel connection as described in Section 5.2 - and repeat the request in order to obtain the required upgrade - information. - - This hop-by-hop definition of Upgrade was a deliberate choice. It - allows for incremental deployment on either side of proxies, and for - optimized protocols between cascaded proxies without the knowledge of - the parties that are not a part of the change. - -5.2 Requesting a Tunnel with CONNECT - - A CONNECT method requests that a proxy establish a tunnel connection - on its behalf. 
The Request-URI portion of the Request-Line is always - an 'authority' as defined by URI Generic Syntax [2], which is to say - the host name and port number destination of the requested connection - separated by a colon: - - CONNECT server.example.com:80 HTTP/1.1 - Host: server.example.com:80 - - - - -Khare & Lawrence Standards Track [Page 6] - -RFC 2817 HTTP Upgrade to TLS May 2000 - - - Other HTTP mechanisms can be used normally with the CONNECT method -- - except end-to-end protocol Upgrade requests, of course, since the - tunnel must be established first. - - For example, proxy authentication might be used to establish the - authority to create a tunnel: - - CONNECT server.example.com:80 HTTP/1.1 - Host: server.example.com:80 - Proxy-Authorization: basic aGVsbG86d29ybGQ= - - Like any other pipelined HTTP/1.1 request, data to be tunneled may be - sent immediately after the blank line. The usual caveats also apply: - data may be discarded if the eventual response is negative, and the - connection may be reset with no response if more than one TCP segment - is outstanding. - -5.3 Establishing a Tunnel with CONNECT - - Any successful (2xx) response to a CONNECT request indicates that the - proxy has established a connection to the requested host and port, - and has switched to tunneling the current connection to that server - connection. - - It may be the case that the proxy itself can only reach the requested - origin server through another proxy. In this case, the first proxy - SHOULD make a CONNECT request of that next proxy, requesting a tunnel - to the authority. A proxy MUST NOT respond with any 2xx status code - unless it has either a direct or tunnel connection established to the - authority. - - An origin server which receives a CONNECT request for itself MAY - respond with a 2xx status code to indicate that a connection is - established. 
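The CONNECT exchange of Sections 5.2 and 5.3 can be illustrated with a short C sketch. None of these helpers appear in RFC 2817; split_authority, build_connect_request, and tunnel_established are hypothetical names, the authority is assumed to be of the form "host:port" with a decimal port, and proxy authentication is left out.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: splits "host:port" at the last colon, copies the
 * host into host_out, and stores the port in *port_out.
 * Returns 0 on success, -1 if the authority is malformed. */
int split_authority(const char *authority, char *host_out,
                    size_t host_len, int *port_out)
{
    const char *colon = strrchr(authority, ':');
    if (colon == NULL || colon == authority)
        return -1;
    /* The port must start with a digit (this also rejects an empty port). */
    if (colon[1] < '0' || colon[1] > '9')
        return -1;
    size_t hlen = (size_t)(colon - authority);
    if (hlen >= host_len)
        return -1;
    char *end;
    long port = strtol(colon + 1, &end, 10);
    if (*end != '\0' || port < 1 || port > 65535)
        return -1;
    memcpy(host_out, authority, hlen);
    host_out[hlen] = '\0';
    *port_out = (int)port;
    return 0;
}

/* Hypothetical helper: formats a CONNECT request for the authority.
 * Returns the number of bytes written, or -1 if buf is too small. */
int build_connect_request(char *buf, size_t buflen, const char *authority)
{
    int n = snprintf(buf, buflen,
                     "CONNECT %s HTTP/1.1\r\n"
                     "Host: %s\r\n"
                     "\r\n",
                     authority, authority);
    if (n < 0 || (size_t)n >= buflen)
        return -1;
    return n;
}

/* Any 2xx status means the proxy has switched to tunneling. */
int tunnel_established(int status_code)
{
    return status_code >= 200 && status_code <= 299;
}
```

Once tunnel_established reports 1, the client can proceed with the Section 3 operations over the tunnel.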
- - If at any point either one of the peers gets disconnected, any - outstanding data that came from that peer will be passed to the other - one, and after that also the other connection will be terminated by - the proxy. If there is outstanding data to that peer undelivered, - that data will be discarded. - -6. Rationale for the use of a 4xx (client error) Status Code - - Reliable, interoperable negotiation of Upgrade features requires an - unambiguous failure signal. The 426 Upgrade Required status code - allows a server to definitively state the precise protocol extensions - a given resource must be served with. - - - - -Khare & Lawrence Standards Track [Page 7] - -RFC 2817 HTTP Upgrade to TLS May 2000 - - - It might at first appear that the response should have been some form - of redirection (a 3xx code), by analogy to an old-style redirection - to an https: URI. User agents that do not understand Upgrade: - preclude this. - - Suppose that a 3xx code had been assigned for "Upgrade Required"; a - user agent that did not recognize it would treat it as 300. It would - then properly look for a "Location" header in the response and - attempt to repeat the request at the URL in that header field. Since - it did not know to Upgrade to incorporate the TLS layer, it would at - best fail again at the new URL. - -7. IANA Considerations - - IANA shall create registries for two name spaces, as described in BCP - 26 [10]: - - o HTTP Status Codes - o HTTP Upgrade Tokens - -7.1 HTTP Status Code Registry - - The HTTP Status Code Registry defines the name space for the Status- - Code token in the Status line of an HTTP response. The initial - values for this name space are those specified by: - - 1. Draft Standard for HTTP/1.1 [1] - 2. Web Distributed Authoring and Versioning [4] [defines 420-424] - 3. WebDAV Advanced Collections [5] (Work in Progress) [defines 425] - 4. 
Section 6 [defines 426] - - Values to be added to this name space SHOULD be subject to review in - the form of a standards track document within the IETF Applications - Area. Any such document SHOULD be traceable through statuses of - either 'Obsoletes' or 'Updates' to the Draft Standard for - HTTP/1.1 [1]. - -7.2 HTTP Upgrade Token Registry - - The HTTP Upgrade Token Registry defines the name space for product - tokens used to identify protocols in the Upgrade HTTP header field. - Each registered token should be associated with one or a set of - specifications, and with contact information. - - The Draft Standard for HTTP/1.1 [1] specifies that these tokens obey - the production for 'product': - - - - - -Khare & Lawrence Standards Track [Page 8] - -RFC 2817 HTTP Upgrade to TLS May 2000 - - - product = token ["/" product-version] - product-version = token - - Registrations should be allowed on a First Come First Served basis as - described in BCP 26 [10]. These specifications need not be IETF - documents or be subject to IESG review, but should obey the following - rules: - - 1. A token, once registered, stays registered forever. - 2. The registration MUST name a responsible party for the - registration. - 3. The registration MUST name a point of contact. - 4. The registration MAY name the documentation required for the - token. - 5. The responsible party MAY change the registration at any time. - The IANA will keep a record of all such changes, and make them - available upon request. - 6. The responsible party for the first registration of a "product" - token MUST approve later registrations of a "version" token - together with that "product" token before they can be registered. - 7. If absolutely required, the IESG MAY reassign the responsibility - for a token. This will normally only be used in the case when a - responsible party cannot be contacted. 
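As an illustration of the 'product' production quoted above, a registry tool might check a candidate Upgrade token with a sketch like the following. The function names are hypothetical, and the set of token characters is taken from the HTTP/1.1 definition of 'token' (any US-ASCII character except control characters and separators); nothing here is mandated by this registry text.

```c
#include <string.h>

/* Returns 1 if c may appear in an HTTP/1.1 token: a US-ASCII character
 * that is neither a control character nor a separator. */
static int is_token_char(unsigned char c)
{
    if (c <= 32 || c >= 127)   /* CTLs, SP, HT, DEL, non-ASCII */
        return 0;
    return strchr("()<>@,;:\\\"/[]?={}", c) == NULL;
}

/* Length of the longest token prefix of s (0 if there is none). */
static size_t token_length(const char *s)
{
    size_t n = 0;
    while (is_token_char((unsigned char)s[n]))
        n++;
    return n;
}

/* Hypothetical helper: returns 1 if s matches
 *   product = token ["/" product-version]
 *   product-version = token
 * and 0 otherwise. */
int is_valid_product(const char *s)
{
    size_t n = token_length(s);
    if (n == 0)
        return 0;
    if (s[n] == '\0')          /* bare product name, no version */
        return 1;
    if (s[n] != '/')
        return 0;
    size_t m = token_length(s + n + 1);
    return m > 0 && s[n + 1 + m] == '\0';
}
```

Under this sketch "TLS/1.0" and "HTTP/1.1" are accepted, while "TLS/" and "TLS 1.0" are rejected.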
- - This specification defines the protocol token "TLS/1.0" as the - identifier for the protocol specified by The TLS Protocol [6]. - - It is NOT required that specifications for upgrade tokens be made - publicly available, but the contact information for the registration - SHOULD be. - -8. Security Considerations - - The potential for a man-in-the-middle attack (deleting the Upgrade - header) remains the same as current, mixed http/https practice: - - o Removing the Upgrade header is similar to rewriting web pages to - change https:// links to http:// links. - o The risk is only present if the server is willing to vend such - information over both a secure and an insecure channel in the - first place. - o If the client knows for a fact that a server is TLS-compliant, it - can insist on it by only sending an Upgrade request with a no-op - method like OPTIONS. - o Finally, as the https: specification warns, "users should - carefully examine the certificate presented by the server to - determine if it meets their expectations". - - - - -Khare & Lawrence Standards Track [Page 9] - -RFC 2817 HTTP Upgrade to TLS May 2000 - - - Furthermore, for clients that do not explicitly try to invoke TLS, - servers can use the Upgrade header in any response other than 101 or - 426 to advertise TLS compliance. Since TLS compliance should be - considered a feature of the server and not the resource at hand, it - should be sufficient to send it once, and let clients cache that - fact. - -8.1 Implications for the https: URI Scheme - - While nothing in this memo affects the definition of the 'https' URI - scheme, widespread adoption of this mechanism for HyperText content - could use 'http' to identify both secure and non-secure resources. - - The choice of what security characteristics are required on the - connection is left to the client and server. This allows either - party to use any information available in making this determination. 
- For example, user agents may rely on user preference settings or - information about the security of the network such as 'TLS required - on all POST operations not on my local net', or servers may apply - resource access rules such as 'the FORM on this page must be served - and submitted using TLS'. - -8.2 Security Considerations for CONNECT - - A generic TCP tunnel is fraught with security risks. First, such - authorization should be limited to a small number of known ports. - The Upgrade: mechanism defined here only requires onward tunneling at - port 80. Second, since tunneled data is opaque to the proxy, there - are additional risks to tunneling to other well-known or reserved - ports. A putative HTTP client CONNECTing to port 25 could relay spam - via SMTP, for example. - -References - - [1] Fielding, R., Gettys, J., Mogul, J., Frystyk, H., Masinter, L., - Leach, P. and T. Berners-Lee, "Hypertext Transfer Protocol -- - HTTP/1.1", RFC 2616, June 1999. - - [2] Berners-Lee, T., Fielding, R. and L. Masinter, "URI Generic - Syntax", RFC 2396, August 1998. - - [3] Rescorla, E., "HTTP Over TLS", RFC 2818, May 2000. - - [4] Goland, Y., Whitehead, E., Faizi, A., Carter, S. and D. Jensen, - "Web Distributed Authoring and Versioning", RFC 2518, February - 1999. - - - - - -Khare & Lawrence Standards Track [Page 10] - -RFC 2817 HTTP Upgrade to TLS May 2000 - - - [5] Slein, J., Whitehead, E.J., et al., "WebDAV Advanced Collections - Protocol", Work In Progress. - - [6] Dierks, T. and C. Allen, "The TLS Protocol", RFC 2246, January - 1999. - - [7] Herriot, R., Butler, S., Moore, P. and R. Turner, "Internet - Printing Protocol/1.0: Encoding and Transport", RFC 2565, April - 1999. - - [8] Luotonen, A., "Tunneling TCP based protocols through Web proxy - servers", Work In Progress. (Also available in: Luotonen, Ari. - Web Proxy Servers, Prentice-Hall, 1997 ISBN:0136806120.) - - [9] Rose, M., "Writing I-Ds and RFCs using XML", RFC 2629, June - 1999. - - [10] Narten, T. and H. 
Alvestrand, "Guidelines for Writing an IANA - Considerations Section in RFCs", BCP 26, RFC 2434, October 1998. - - [11] Bradner, S., "Key words for use in RFCs to Indicate Requirement - Levels", BCP 14, RFC 2119, March 1997. - -Authors' Addresses - - Rohit Khare - 4K Associates / UC Irvine - 3207 Palo Verde - Irvine, CA 92612 - US - - Phone: +1 626 806 7574 - EMail: rohit@4K-associates.com - URI: http://www.4K-associates.com/ - - - Scott Lawrence - Agranat Systems, Inc. - 5 Clocktower Place - Suite 400 - Maynard, MA 01754 - US - - Phone: +1 978 461 0888 - EMail: lawrence@agranat.com - URI: http://www.agranat.com/ - - - - - -Khare & Lawrence Standards Track [Page 11] - -RFC 2817 HTTP Upgrade to TLS May 2000 - - -Appendix A. Acknowledgments - - The CONNECT method was originally described in a Work in Progress - titled, "Tunneling TCP based protocols through Web proxy servers", - [8] by Ari Luotonen of Netscape Communications Corporation. It was - widely implemented by HTTP proxies, but was never made a part of any - IETF Standards Track document. The method name CONNECT was reserved, - but not defined in [1]. - - The definition provided here is derived directly from that earlier - memo, with some editorial changes and conformance to the stylistic - conventions since established in other HTTP specifications. - - Additional Thanks to: - - o Paul Hoffman for his work on the STARTTLS command extension for - ESMTP. - o Roy Fielding for assistance with the rationale behind Upgrade: - and its interaction with OPTIONS. - o Eric Rescorla for his work on standardizing the existing https: - practice to compare with. - o Marshall Rose, for the xml2rfc document type description and tools - [9]. - o Jim Whitehead, for sorting out the current range of available HTTP - status codes. - o Henrik Frystyk Nielsen, whose work on the Mandatory extension - mechanism pointed out a hop-by-hop Upgrade still requires - tunneling. 
- o Harald Alvestrand for improvements to the token registration - rules. - - - - - - - - - - - - - - - - - - - - - -Khare & Lawrence Standards Track [Page 12] - -RFC 2817 HTTP Upgrade to TLS May 2000 - - -Full Copyright Statement - - Copyright (C) The Internet Society (2000). All Rights Reserved. - - This document and translations of it may be copied and furnished to - others, and derivative works that comment on or otherwise explain it - or assist in its implementation may be prepared, copied, published - and distributed, in whole or in part, without restriction of any - kind, provided that the above copyright notice and this paragraph are - included on all such copies and derivative works. However, this - document itself may not be modified in any way, such as by removing - the copyright notice or references to the Internet Society or other - Internet organizations, except as needed for the purpose of - developing Internet standards in which case the procedures for - copyrights defined in the Internet Standards process must be - followed, or as required to translate it into languages other than - English. - - The limited permissions granted above are perpetual and will not be - revoked by the Internet Society or its successors or assigns. - - This document and the information contained herein is provided on an - "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING - TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING - BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION - HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF - MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. - -Acknowledgement - - Funding for the RFC Editor function is currently provided by the - Internet Society. 
- - - - - - - - - - - - - - - - - - - -Khare & Lawrence Standards Track [Page 13] - diff --git a/docs/specs/rfc2818.txt b/docs/specs/rfc2818.txt deleted file mode 100644 index 219a1c42..00000000 --- a/docs/specs/rfc2818.txt +++ /dev/null @@ -1,395 +0,0 @@ - - - - - - -Network Working Group E. Rescorla -Request for Comments: 2818 RTFM, Inc. -Category: Informational May 2000 - - - HTTP Over TLS - -Status of this Memo - - This memo provides information for the Internet community. It does - not specify an Internet standard of any kind. Distribution of this - memo is unlimited. - -Copyright Notice - - Copyright (C) The Internet Society (2000). All Rights Reserved. - -Abstract - - This memo describes how to use TLS to secure HTTP connections over - the Internet. Current practice is to layer HTTP over SSL (the - predecessor to TLS), distinguishing secured traffic from insecure - traffic by the use of a different server port. This document - documents that practice using TLS. A companion document describes a - method for using HTTP/TLS over the same port as normal HTTP - [RFC2817]. - -Table of Contents - - 1. Introduction . . . . . . . . . . . . . . . . . . . . . . 2 - 1.1. Requirements Terminology . . . . . . . . . . . . . . . 2 - 2. HTTP Over TLS . . . . . . . . . . . . . . . . . . . . . . 2 - 2.1. Connection Initiation . . . . . . . . . . . . . . . . . 2 - 2.2. Connection Closure . . . . . . . . . . . . . . . . . . 2 - 2.2.1. Client Behavior . . . . . . . . . . . . . . . . . . . 3 - 2.2.2. Server Behavior . . . . . . . . . . . . . . . . . . . 3 - 2.3. Port Number . . . . . . . . . . . . . . . . . . . . . . 4 - 2.4. URI Format . . . . . . . . . . . . . . . . . . . . . . 4 - 3. Endpoint Identification . . . . . . . . . . . . . . . . . 4 - 3.1. Server Identity . . . . . . . . . . . . . . . . . . . . 4 - 3.2. Client Identity . . . . . . . . . . . . . . . . . . . . 5 - References . . . . . . . . . . . . . . . . . . . . . . . . . 6 - Security Considerations . . . . . . . . . 
. . . . . . . . . 6 - Author's Address . . . . . . . . . . . . . . . . . . . . . . 6 - Full Copyright Statement . . . . . . . . . . . . . . . . . . 7 - - - - - - -Rescorla Informational [Page 1] - -RFC 2818 HTTP Over TLS May 2000 - - -1. Introduction - - HTTP [RFC2616] was originally used in the clear on the Internet. - However, increased use of HTTP for sensitive applications has - required security measures. SSL, and its successor TLS [RFC2246] were - designed to provide channel-oriented security. This document - describes how to use HTTP over TLS. - -1.1. Requirements Terminology - - Keywords "MUST", "MUST NOT", "REQUIRED", "SHOULD", "SHOULD NOT" and - "MAY" that appear in this document are to be interpreted as described - in [RFC2119]. - -2. HTTP Over TLS - - Conceptually, HTTP/TLS is very simple. Simply use HTTP over TLS - precisely as you would use HTTP over TCP. - -2.1. Connection Initiation - - The agent acting as the HTTP client should also act as the TLS - client. It should initiate a connection to the server on the - appropriate port and then send the TLS ClientHello to begin the TLS - handshake. When the TLS handshake has finished. The client may then - initiate the first HTTP request. All HTTP data MUST be sent as TLS - "application data". Normal HTTP behavior, including retained - connections should be followed. - -2.2. Connection Closure - - TLS provides a facility for secure connection closure. When a valid - closure alert is received, an implementation can be assured that no - further data will be received on that connection. TLS - implementations MUST initiate an exchange of closure alerts before - closing a connection. A TLS implementation MAY, after sending a - closure alert, close the connection without waiting for the peer to - send its closure alert, generating an "incomplete close". Note that - an implementation which does this MAY choose to reuse the session. 
- This SHOULD only be done when the application knows (typically - through detecting HTTP message boundaries) that it has received all - the message data that it cares about. - - As specified in [RFC2246], any implementation which receives a - connection close without first receiving a valid closure alert (a - "premature close") MUST NOT reuse that session. Note that a - premature close does not call into question the security of the data - already received, but simply indicates that subsequent data might - - - -Rescorla Informational [Page 2] - -RFC 2818 HTTP Over TLS May 2000 - - - have been truncated. Because TLS is oblivious to HTTP - request/response boundaries, it is necessary to examine the HTTP data - itself (specifically the Content-Length header) to determine whether - the truncation occurred inside a message or between messages. - -2.2.1. Client Behavior - - Because HTTP uses connection closure to signal end of server data, - client implementations MUST treat any premature closes as errors and - the data received as potentially truncated. While in some cases the - HTTP protocol allows the client to find out whether truncation took - place so that, if it received the complete reply, it may tolerate - such errors following the principle to "[be] strict when sending and - tolerant when receiving" [RFC1958], often truncation does not show in - the HTTP protocol data; two cases in particular deserve special note: - - A HTTP response without a Content-Length header. Since data length - in this situation is signalled by connection close a premature - close generated by the server cannot be distinguished from a - spurious close generated by an attacker. - - A HTTP response with a valid Content-Length header closed before - all data has been read. Because TLS does not provide document - oriented protection, it is impossible to determine whether the - server has miscomputed the Content-Length or an attacker has - truncated the connection. 
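Editorial illustration (not part of the deleted RFC text): the two truncation cases above can be sketched as a small classifier that a client might run when it sees a premature close. This is a minimal sketch under stated assumptions, not code from libsoup or from the RFC; the names `CloseVerdict` and `classify_premature_close` are hypothetical.

```python
from enum import Enum
from typing import Optional


class CloseVerdict(Enum):
    COMPLETE = "complete"            # every byte promised by Content-Length arrived
    TRUNCATED = "truncated"          # fewer bytes arrived than Content-Length promised
    INDETERMINATE = "indeterminate"  # no Content-Length, so truncation cannot be ruled out


def classify_premature_close(content_length: Optional[int], received: int) -> CloseVerdict:
    """Classify a response body after a close that had no TLS closure alert.

    content_length: value of the Content-Length header, or None if absent
                    (hypothetical parameter name).
    received:       number of body bytes read before the connection closed.
    """
    if content_length is None:
        # Body length was signalled only by connection close; a server close
        # and an attacker's spurious close look identical here.
        return CloseVerdict.INDETERMINATE
    if received < content_length:
        # The close happened inside the message.
        return CloseVerdict.TRUNCATED
    # The close happened between messages.
    return CloseVerdict.COMPLETE
```

A caller would typically treat both `TRUNCATED` and `INDETERMINATE` as errors and only accept the data as a full reply on `COMPLETE`.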
- - There is one exception to the above rule. When encountering a - premature close, a client SHOULD treat as completed all requests for - which it has received as much data as specified in the Content-Length - header. - - A client detecting an incomplete close SHOULD recover gracefully. It - MAY resume a TLS session closed in this fashion. - - Clients MUST send a closure alert before closing the connection. - Clients which are unprepared to receive any more data MAY choose not - to wait for the server's closure alert and simply close the - connection, thus generating an incomplete close on the server side. - -2.2.2. Server Behavior - - RFC 2616 permits an HTTP client to close the connection at any time, - and requires servers to recover gracefully. In particular, servers - SHOULD be prepared to receive an incomplete close from the client, - since the client can often determine when the end of server data is. - Servers SHOULD be willing to resume TLS sessions closed in this - fashion. - - - -Rescorla Informational [Page 3] - -RFC 2818 HTTP Over TLS May 2000 - - - Implementation note: In HTTP implementations which do not use - persistent connections, the server ordinarily expects to be able to - signal end of data by closing the connection. When Content-Length is - used, however, the client may have already sent the closure alert and - dropped the connection. - - Servers MUST attempt to initiate an exchange of closure alerts with - the client before closing the connection. Servers MAY close the - connection after sending the closure alert, thus generating an - incomplete close on the client side. - -2.3. Port Number - - The first data that an HTTP server expects to receive from the client - is the Request-Line production. The first data that a TLS server (and - hence an HTTP/TLS server) expects to receive is the ClientHello. - Consequently, common practice has been to run HTTP/TLS over a - separate port in order to distinguish which protocol is being used. 
- When HTTP/TLS is being run over a TCP/IP connection, the default port - is 443. This does not preclude HTTP/TLS from being run over another - transport. TLS only presumes a reliable connection-oriented data - stream. - -2.4. URI Format - - HTTP/TLS is differentiated from HTTP URIs by using the 'https' - protocol identifier in place of the 'http' protocol identifier. An - example URI specifying HTTP/TLS is: - - https://www.example.com/~smith/home.html - -3. Endpoint Identification - -3.1. Server Identity - - In general, HTTP/TLS requests are generated by dereferencing a URI. - As a consequence, the hostname for the server is known to the client. - If the hostname is available, the client MUST check it against the - server's identity as presented in the server's Certificate message, - in order to prevent man-in-the-middle attacks. - - If the client has external information as to the expected identity of - the server, the hostname check MAY be omitted. (For instance, a - client may be connecting to a machine whose address and hostname are - dynamic but the client knows the certificate that the server will - present.) In such cases, it is important to narrow the scope of - acceptable certificates as much as possible in order to prevent man - - - - -Rescorla Informational [Page 4] - -RFC 2818 HTTP Over TLS May 2000 - - - in the middle attacks. In special cases, it may be appropriate for - the client to simply ignore the server's identity, but it must be - understood that this leaves the connection open to active attack. - - If a subjectAltName extension of type dNSName is present, that MUST - be used as the identity. Otherwise, the (most specific) Common Name - field in the Subject field of the certificate MUST be used. Although - the use of the Common Name is existing practice, it is deprecated and - Certification Authorities are encouraged to use the dNSName instead. - - Matching is performed using the matching rules specified by - [RFC2459]. 
If more than one identity of a given type is present in - the certificate (e.g., more than one dNSName name, a match in any one - of the set is considered acceptable.) Names may contain the wildcard - character * which is considered to match any single domain name - component or component fragment. E.g., *.a.com matches foo.a.com but - not bar.foo.a.com. f*.com matches foo.com but not bar.com. - - In some cases, the URI is specified as an IP address rather than a - hostname. In this case, the iPAddress subjectAltName must be present - in the certificate and must exactly match the IP in the URI. - - If the hostname does not match the identity in the certificate, user - oriented clients MUST either notify the user (clients MAY give the - user the opportunity to continue with the connection in any case) or - terminate the connection with a bad certificate error. Automated - clients MUST log the error to an appropriate audit log (if available) - and SHOULD terminate the connection (with a bad certificate error). - Automated clients MAY provide a configuration setting that disables - this check, but MUST provide a setting which enables it. - - Note that in many cases the URI itself comes from an untrusted - source. The above-described check provides no protection against - attacks where this source is compromised. For example, if the URI was - obtained by clicking on an HTML page which was itself obtained - without using HTTP/TLS, a man in the middle could have replaced the - URI. In order to prevent this form of attack, users should carefully - examine the certificate presented by the server to determine if it - meets their expectations. - -3.2. Client Identity - - Typically, the server has no external knowledge of what the client's - identity ought to be and so checks (other than that the client has a - certificate chain rooted in an appropriate CA) are not possible. 
If a - server has such knowledge (typically from some source external to - HTTP or TLS) it SHOULD check the identity as described above. - - - - -Rescorla Informational [Page 5] - -RFC 2818 HTTP Over TLS May 2000 - - -References - - [RFC2459] Housley, R., Ford, W., Polk, W. and D. Solo, "Internet - Public Key Infrastructure: Part I: X.509 Certificate and - CRL Profile", RFC 2459, January 1999. - - [RFC2616] Fielding, R., Gettys, J., Mogul, J., Frystyk, H., Masinter, - L., Leach, P. and T. Berners-Lee, "Hypertext Transfer - Protocol, HTTP/1.1", RFC 2616, June 1999. - - [RFC2119] Bradner, S., "Key Words for use in RFCs to indicate - Requirement Levels", BCP 14, RFC 2119, March 1997. - - [RFC2246] Dierks, T. and C. Allen, "The TLS Protocol", RFC 2246, - January 1999. - - [RFC2817] Khare, R. and S. Lawrence, "Upgrading to TLS Within - HTTP/1.1", RFC 2817, May 2000. - -Security Considerations - - This entire document is about security. - -Author's Address - - Eric Rescorla - RTFM, Inc. - 30 Newell Road, #16 - East Palo Alto, CA 94303 - - Phone: (650) 328-8631 - EMail: ekr@rtfm.com - - - - - - - - - - - - - - - - - - - -Rescorla Informational [Page 6] - -RFC 2818 HTTP Over TLS May 2000 - - -Full Copyright Statement - - Copyright (C) The Internet Society (2000). All Rights Reserved. - - This document and translations of it may be copied and furnished to - others, and derivative works that comment on or otherwise explain it - or assist in its implementation may be prepared, copied, published - and distributed, in whole or in part, without restriction of any - kind, provided that the above copyright notice and this paragraph are - included on all such copies and derivative works. 
However, this - document itself may not be modified in any way, such as by removing - the copyright notice or references to the Internet Society or other - Internet organizations, except as needed for the purpose of - developing Internet standards in which case the procedures for - copyrights defined in the Internet Standards process must be - followed, or as required to translate it into languages other than - English. - - The limited permissions granted above are perpetual and will not be - revoked by the Internet Society or its successors or assigns. - - This document and the information contained herein is provided on an - "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING - TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING - BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION - HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF - MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. - -Acknowledgement - - Funding for the RFC Editor function is currently provided by the - Internet Society. - - - - - - - - - - - - - - - - - - - -Rescorla Informational [Page 7] - diff --git a/docs/specs/rfc2965.txt b/docs/specs/rfc2965.txt deleted file mode 100644 index 8a4d02b1..00000000 --- a/docs/specs/rfc2965.txt +++ /dev/null @@ -1,1459 +0,0 @@ - - - - - - -Network Working Group D. Kristol -Request for Comments: 2965 Bell Laboratories, Lucent Technologies -Obsoletes: 2109 L. Montulli -Category: Standards Track Epinions.com, Inc. - October 2000 - - - HTTP State Management Mechanism - -Status of this Memo - - This document specifies an Internet standards track protocol for the - Internet community, and requests discussion and suggestions for - improvements. Please refer to the current edition of the "Internet - Official Protocol Standards" (STD 1) for the standardization state - and status of this protocol. Distribution of this memo is unlimited. - -Copyright Notice - - Copyright (C) The Internet Society (2000). 
All Rights Reserved. - -IESG Note - - The IESG notes that this mechanism makes use of the .local top-level - domain (TLD) internally when handling host names that don't contain - any dots, and that this mechanism might not work in the expected way - should an actual .local TLD ever be registered. - -Abstract - - This document specifies a way to create a stateful session with - Hypertext Transfer Protocol (HTTP) requests and responses. It - describes three new headers, Cookie, Cookie2, and Set-Cookie2, which - carry state information between participating origin servers and user - agents. The method described here differs from Netscape's Cookie - proposal [Netscape], but it can interoperate with HTTP/1.0 user - agents that use Netscape's method. (See the HISTORICAL section.) - - This document reflects implementation experience with RFC 2109 and - obsoletes it. - -1. TERMINOLOGY - - The terms user agent, client, server, proxy, origin server, and - http_URL have the same meaning as in the HTTP/1.1 specification - [RFC2616]. The terms abs_path and absoluteURI have the same meaning - as in the URI Syntax specification [RFC2396]. - - - - -Kristol & Montulli Standards Track [Page 1] - -RFC 2965 HTTP State Management Mechanism October 2000 - - - Host name (HN) means either the host domain name (HDN) or the numeric - Internet Protocol (IP) address of a host. The fully qualified domain - name is preferred; use of numeric IP addresses is strongly - discouraged. - - The terms request-host and request-URI refer to the values the client - would send to the server as, respectively, the host (but not port) - and abs_path portions of the absoluteURI (http_URL) of the HTTP - request line. Note that request-host is a HN. - - The term effective host name is related to host name. If a host name - contains no dots, the effective host name is that name with the - string .local appended to it. Otherwise the effective host name is - the same as the host name. 
Note that all effective host names - contain at least one dot. - - The term request-port refers to the port portion of the absoluteURI - (http_URL) of the HTTP request line. If the absoluteURI has no - explicit port, the request-port is the HTTP default, 80. The - request-port of a cookie is the request-port of the request in which - a Set-Cookie2 response header was returned to the user agent. - - Host names can be specified either as an IP address or a HDN string. - Sometimes we compare one host name with another. (Such comparisons - SHALL be case-insensitive.) Host A's name domain-matches host B's if - - * their host name strings string-compare equal; or - - * A is a HDN string and has the form NB, where N is a non-empty - name string, B has the form .B', and B' is a HDN string. (So, - x.y.com domain-matches .Y.com but not Y.com.) - - Note that domain-match is not a commutative operation: a.b.c.com - domain-matches .c.com, but not the reverse. - - The reach R of a host name H is defined as follows: - - * If - - - H is the host domain name of a host; and, - - - H has the form A.B; and - - - A has no embedded (that is, interior) dots; and - - - B has at least one embedded dot, or B is the string "local". - then the reach of H is .B. - - - - -Kristol & Montulli Standards Track [Page 2] - -RFC 2965 HTTP State Management Mechanism October 2000 - - - * Otherwise, the reach of H is H. - - For two strings that represent paths, P1 and P2, P1 path-matches P2 - if P2 is a prefix of P1 (including the case where P1 and P2 string- - compare equal). Thus, the string /tec/waldo path-matches /tec. - - Because it was used in Netscape's original implementation of state - management, we will use the term cookie to refer to the state - information that passes between an origin server and user agent, and - that gets stored by the user agent. 
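Editorial illustration (not part of the deleted RFC text): the terminology above (effective host name, domain-match, reach, path-match) translates fairly directly into string predicates. The following is a minimal sketch assuming host names arrive as plain strings and that anything `ipaddress.ip_address` accepts counts as a numeric IP address rather than a HDN; all function names are hypothetical.

```python
import ipaddress


def is_ip(name: str) -> bool:
    """True if name parses as a numeric IP address (assumption: otherwise it is a HDN)."""
    try:
        ipaddress.ip_address(name)
        return True
    except ValueError:
        return False


def effective_host_name(host: str) -> str:
    """A host name with no dots gets '.local' appended; otherwise it is unchanged."""
    return host if "." in host else host + ".local"


def domain_match(a: str, b: str) -> bool:
    """True if host A's name domain-matches host B's (comparison is case-insensitive)."""
    a, b = a.lower(), b.lower()
    if a == b:
        return True
    # A must be a HDN of the form N + B, with N non-empty and B of the form ".B'".
    return (
        not is_ip(a)
        and b.startswith(".")
        and len(b) > 1
        and len(a) > len(b)
        and a.endswith(b)
    )


def reach(h: str) -> str:
    """Reach of host name H: '.B' when H is a HDN of the form A.B as defined above, else H."""
    if is_ip(h) or "." not in h:
        return h
    first, rest = h.split(".", 1)  # A has no interior dots by construction
    if first and ("." in rest[1:-1] or rest == "local"):
        return "." + rest
    return h


def path_match(p1: str, p2: str) -> bool:
    """P1 path-matches P2 if P2 is a prefix of P1 (equality included)."""
    return p1.startswith(p2)
```

Note that `domain_match` is deliberately asymmetric, mirroring the remark that domain-match is not commutative.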
- -1.1 Requirements - - The key words "MAY", "MUST", "MUST NOT", "OPTIONAL", "RECOMMENDED", - "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT" in this - document are to be interpreted as described in RFC 2119 [RFC2119]. - -2. STATE AND SESSIONS - - This document describes a way to create stateful sessions with HTTP - requests and responses. Currently, HTTP servers respond to each - client request without relating that request to previous or - subsequent requests; the state management mechanism allows clients - and servers that wish to exchange state information to place HTTP - requests and responses within a larger context, which we term a - "session". This context might be used to create, for example, a - "shopping cart", in which user selections can be aggregated before - purchase, or a magazine browsing system, in which a user's previous - reading affects which offerings are presented. - - Neither clients nor servers are required to support cookies. A - server MAY refuse to provide content to a client that does not return - the cookies it sends. - -3. DESCRIPTION - - We describe here a way for an origin server to send state information - to the user agent, and for the user agent to return the state - information to the origin server. The goal is to have a minimal - impact on HTTP and user agents. - -3.1 Syntax: General - - The two state management headers, Set-Cookie2 and Cookie, have common - syntactic properties involving attribute-value pairs. The following - grammar uses the notation, and tokens DIGIT (decimal digits), token - - - - - -Kristol & Montulli Standards Track [Page 3] - -RFC 2965 HTTP State Management Mechanism October 2000 - - - (informally, a sequence of non-special, non-white space characters), - and http_URL from the HTTP/1.1 specification [RFC2616] to describe - their syntax. 
- - av-pairs = av-pair *(";" av-pair) - av-pair = attr ["=" value] ; optional value - attr = token - value = token | quoted-string - - Attributes (names) (attr) are case-insensitive. White space is - permitted between tokens. Note that while the above syntax - description shows value as optional, most attrs require them. - - NOTE: The syntax above allows whitespace between the attribute and - the = sign. - -3.2 Origin Server Role - - 3.2.1 General The origin server initiates a session, if it so - desires. To do so, it returns an extra response header to the - client, Set-Cookie2. (The details follow later.) - - A user agent returns a Cookie request header (see below) to the - origin server if it chooses to continue a session. The origin server - MAY ignore it or use it to determine the current state of the - session. It MAY send back to the client a Set-Cookie2 response - header with the same or different information, or it MAY send no - Set-Cookie2 header at all. The origin server effectively ends a - session by sending the client a Set-Cookie2 header with Max-Age=0. - - Servers MAY return Set-Cookie2 response headers with any response. - User agents SHOULD send Cookie request headers, subject to other - rules detailed below, with every request. - - An origin server MAY include multiple Set-Cookie2 headers in a - response. Note that an intervening gateway could fold multiple such - headers into a single header. 
- - - - - - - - - - - - - - -Kristol & Montulli Standards Track [Page 4] - -RFC 2965 HTTP State Management Mechanism October 2000 - - - 3.2.2 Set-Cookie2 Syntax The syntax for the Set-Cookie2 response - header is - - set-cookie = "Set-Cookie2:" cookies - cookies = 1#cookie - cookie = NAME "=" VALUE *(";" set-cookie-av) - NAME = attr - VALUE = value - set-cookie-av = "Comment" "=" value - | "CommentURL" "=" <"> http_URL <"> - | "Discard" - | "Domain" "=" value - | "Max-Age" "=" value - | "Path" "=" value - | "Port" [ "=" <"> portlist <"> ] - | "Secure" - | "Version" "=" 1*DIGIT - portlist = 1#portnum - portnum = 1*DIGIT - - Informally, the Set-Cookie2 response header comprises the token Set- - Cookie2:, followed by a comma-separated list of one or more cookies. - Each cookie begins with a NAME=VALUE pair, followed by zero or more - semi-colon-separated attribute-value pairs. The syntax for - attribute-value pairs was shown earlier. The specific attributes and - the semantics of their values follows. The NAME=VALUE attribute- - value pair MUST come first in each cookie. The others, if present, - can occur in any order. If an attribute appears more than once in a - cookie, the client SHALL use only the value associated with the first - appearance of the attribute; a client MUST ignore values after the - first. - - The NAME of a cookie MAY be the same as one of the attributes in this - specification. However, because the cookie's NAME must come first in - a Set-Cookie2 response header, the NAME and its VALUE cannot be - confused with an attribute-value pair. - - NAME=VALUE - REQUIRED. The name of the state information ("cookie") is NAME, - and its value is VALUE. NAMEs that begin with $ are reserved and - MUST NOT be used by applications. - - The VALUE is opaque to the user agent and may be anything the - origin server chooses to send, possibly in a server-selected - printable ASCII encoding. 
"Opaque" implies that the content is of - interest and relevance only to the origin server. The content - may, in fact, be readable by anyone that examines the Set-Cookie2 - header. - - - -Kristol & Montulli Standards Track [Page 5] - -RFC 2965 HTTP State Management Mechanism October 2000 - - - Comment=value - OPTIONAL. Because cookies can be used to derive or store private - information about a user, the value of the Comment attribute - allows an origin server to document how it intends to use the - cookie. The user can inspect the information to decide whether to - initiate or continue a session with this cookie. Characters in - value MUST be in UTF-8 encoding. [RFC2279] - - CommentURL="http_URL" - OPTIONAL. Because cookies can be used to derive or store private - information about a user, the CommentURL attribute allows an - origin server to document how it intends to use the cookie. The - user can inspect the information identified by the URL to decide - whether to initiate or continue a session with this cookie. - - Discard - OPTIONAL. The Discard attribute instructs the user agent to - discard the cookie unconditionally when the user agent terminates. - - Domain=value - OPTIONAL. The value of the Domain attribute specifies the domain - for which the cookie is valid. If an explicitly specified value - does not start with a dot, the user agent supplies a leading dot. - - Max-Age=value - OPTIONAL. The value of the Max-Age attribute is delta-seconds, - the lifetime of the cookie in seconds, a decimal non-negative - integer. To handle cached cookies correctly, a client SHOULD - calculate the age of the cookie according to the age calculation - rules in the HTTP/1.1 specification [RFC2616]. When the age is - greater than delta-seconds seconds, the client SHOULD discard the - cookie. A value of zero means the cookie SHOULD be discarded - immediately. - - Path=value - OPTIONAL. 
The value of the Path attribute specifies the subset of - URLs on the origin server to which this cookie applies. - - Port[="portlist"] - OPTIONAL. The Port attribute restricts the port to which a cookie - may be returned in a Cookie request header. Note that the syntax - REQUIREs quotes around the OPTIONAL portlist even if there is only - one portnum in portlist. - - - - - - - - -Kristol & Montulli Standards Track [Page 6] - -RFC 2965 HTTP State Management Mechanism October 2000 - - - Secure - OPTIONAL. The Secure attribute (with no value) directs the user - agent to use only (unspecified) secure means to contact the origin - server whenever it sends back this cookie, to protect the - confidentially and authenticity of the information in the cookie. - - The user agent (possibly with user interaction) MAY determine what - level of security it considers appropriate for "secure" cookies. - The Secure attribute should be considered security advice from the - server to the user agent, indicating that it is in the session's - interest to protect the cookie contents. When it sends a "secure" - cookie back to a server, the user agent SHOULD use no less than - the same level of security as was used when it received the cookie - from the server. - - Version=value - REQUIRED. The value of the Version attribute, a decimal integer, - identifies the version of the state management specification to - which the cookie conforms. For this specification, Version=1 - applies. - - 3.2.3 Controlling Caching An origin server must be cognizant of the - effect of possible caching of both the returned resource and the - Set-Cookie2 header. Caching "public" documents is desirable. 
For - example, if the origin server wants to use a public document such as - a "front door" page as a sentinel to indicate the beginning of a - session for which a Set-Cookie2 response header must be generated, - the page SHOULD be stored in caches "pre-expired" so that the origin - server will see further requests. "Private documents", for example - those that contain information strictly private to a session, SHOULD - NOT be cached in shared caches. - - If the cookie is intended for use by a single user, the Set-Cookie2 - header SHOULD NOT be cached. A Set-Cookie2 header that is intended - to be shared by multiple users MAY be cached. - - The origin server SHOULD send the following additional HTTP/1.1 - response headers, depending on circumstances: - - * To suppress caching of the Set-Cookie2 header: - - Cache-control: no-cache="set-cookie2" - - and one of the following: - - * To suppress caching of a private document in shared caches: - - Cache-control: private - - - -Kristol & Montulli Standards Track [Page 7] - -RFC 2965 HTTP State Management Mechanism October 2000 - - - * To allow caching of a document and require that it be validated - before returning it to the client: - - Cache-Control: must-revalidate, max-age=0 - - * To allow caching of a document, but to require that proxy - caches (not user agent caches) validate it before returning it - to the client: - - Cache-Control: proxy-revalidate, max-age=0 - - * To allow caching of a document and request that it be validated - before returning it to the client (by "pre-expiring" it): - - Cache-control: max-age=0 - - Not all caches will revalidate the document in every case. - - HTTP/1.1 servers MUST send Expires: old-date (where old-date is a - date long in the past) on responses containing Set-Cookie2 response - headers unless they know for certain (by out of band means) that - there are no HTTP/1.0 proxies in the response chain. 
HTTP/1.1 - servers MAY send other Cache-Control directives that permit caching - by HTTP/1.1 proxies in addition to the Expires: old-date directive; - the Cache-Control directive will override the Expires: old-date for - HTTP/1.1 proxies. - -3.3 User Agent Role - - 3.3.1 Interpreting Set-Cookie2 The user agent keeps separate track - of state information that arrives via Set-Cookie2 response headers - from each origin server (as distinguished by name or IP address and - port). The user agent MUST ignore attribute-value pairs whose - attribute it does not recognize. The user agent applies these - defaults for optional attributes that are missing: - - Discard The default behavior is dictated by the presence or absence - of a Max-Age attribute. - - Domain Defaults to the effective request-host. (Note that because - there is no dot at the beginning of effective request-host, - the default Domain can only domain-match itself.) - - Max-Age The default behavior is to discard the cookie when the user - agent exits. - - Path Defaults to the path of the request URL that generated the - Set-Cookie2 response, up to and including the right-most /. - - - -Kristol & Montulli Standards Track [Page 8] - -RFC 2965 HTTP State Management Mechanism October 2000 - - - Port The default behavior is that a cookie MAY be returned to any - request-port. - - Secure If absent, the user agent MAY send the cookie over an - insecure channel. - - 3.3.2 Rejecting Cookies To prevent possible security or privacy - violations, a user agent rejects a cookie according to rules below. - The goal of the rules is to try to limit the set of servers for which - a cookie is valid, based on the values of the Path, Domain, and Port - attributes and the request-URI, request-host and request-port. - - A user agent rejects (SHALL NOT store its information) if the Version - attribute is missing. 
Moreover, a user agent rejects (SHALL NOT - store its information) if any of the following is true of the - attributes explicitly present in the Set-Cookie2 response header: - - * The value for the Path attribute is not a prefix of the - request-URI. - - * The value for the Domain attribute contains no embedded dots, - and the value is not .local. - - * The effective host name that derives from the request-host does - not domain-match the Domain attribute. - - * The request-host is a HDN (not IP address) and has the form HD, - where D is the value of the Domain attribute, and H is a string - that contains one or more dots. - - * The Port attribute has a "port-list", and the request-port was - not in the list. - - Examples: - - * A Set-Cookie2 from request-host y.x.foo.com for Domain=.foo.com - would be rejected, because H is y.x and contains a dot. - - * A Set-Cookie2 from request-host x.foo.com for Domain=.foo.com - would be accepted. - - * A Set-Cookie2 with Domain=.com or Domain=.com., will always be - rejected, because there is no embedded dot. - - * A Set-Cookie2 with Domain=ajax.com will be accepted, and the - value for Domain will be taken to be .ajax.com, because a dot - gets prepended to the value. - - - - -Kristol & Montulli Standards Track [Page 9] - -RFC 2965 HTTP State Management Mechanism October 2000 - - - * A Set-Cookie2 with Port="80,8000" will be accepted if the - request was made to port 80 or 8000 and will be rejected - otherwise. - - * A Set-Cookie2 from request-host example for Domain=.local will - be accepted, because the effective host name for the request- - host is example.local, and example.local domain-matches .local. 
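The rejection rules and examples in 3.3.2 can be sketched in Python as follows. This is an illustrative reading of the rules, not code from libsoup or any other implementation. The helper names (is_ip_address, effective_host, domain_match, rejection_reason) and the attrs dictionary layout are hypothetical. The effective host name and domain-match rules are taken from the terminology section of RFC 2965, which lies outside this excerpt.

```python
import ipaddress
from typing import Optional


def is_ip_address(host: str) -> bool:
    """Return True if host parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def effective_host(host: str) -> str:
    """A host name without dots gets ".local" appended (RFC 2965 terminology)."""
    host = host.lower()
    if "." not in host and not is_ip_address(host):
        return host + ".local"
    return host


def domain_match(host: str, domain: str) -> bool:
    """Equal strings match; otherwise host must have the form N + ".B" with N non-empty."""
    host, domain = host.lower(), domain.lower()
    if host == domain:
        return True
    if is_ip_address(host):
        return False
    return domain.startswith(".") and host.endswith(domain) and len(host) > len(domain)


def rejection_reason(request_host: str, request_port: int, request_path: str,
                     attrs: dict) -> Optional[str]:
    """Return why a Set-Cookie2 is rejected, or None if it is accepted.

    attrs is a hypothetical dict of the attributes explicitly present,
    e.g. {"Version": "1", "Domain": ".foo.com", "Port": [80, 8000]}.
    """
    if "Version" not in attrs:
        return "Version attribute is missing"
    if "Path" in attrs and not request_path.startswith(attrs["Path"]):
        return "Path is not a prefix of the request-URI"
    if "Domain" in attrs:
        domain = attrs["Domain"].lower()
        if not domain.startswith("."):
            domain = "." + domain  # a dot gets prepended to the value
        # An embedded dot is one that is neither the first nor the last character.
        if "." not in domain[1:-1] and domain != ".local":
            return "Domain has no embedded dots"
        if not domain_match(effective_host(request_host), domain):
            return "effective host name does not domain-match Domain"
        host = request_host.lower()
        if not is_ip_address(host) and host.endswith(domain):
            if "." in host[: -len(domain)]:
                return "request-host has the form HD and H contains a dot"
    ports = attrs.get("Port")  # a list of ints when a port-list was given
    if ports and request_port not in ports:
        return "request-port is not in the Port list"
    return None
```

For instance, rejection_reason("x.foo.com", 80, "/", {"Version": "1", "Domain": ".foo.com"}) returns None, while the same call with request-host y.x.foo.com returns a reason string, in line with the first two examples above.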
- - 3.3.3 Cookie Management If a user agent receives a Set-Cookie2 - response header whose NAME is the same as that of a cookie it has - previously stored, the new cookie supersedes the old when: the old - and new Domain attribute values compare equal, using a case- - insensitive string-compare; and, the old and new Path attribute - values string-compare equal (case-sensitive). However, if the Set- - Cookie2 has a value for Max-Age of zero, the (old and new) cookie is - discarded. Otherwise a cookie persists (resources permitting) until - whichever happens first, then gets discarded: its Max-Age lifetime is - exceeded; or, if the Discard attribute is set, the user agent - terminates the session. - - Because user agents have finite space in which to store cookies, they - MAY also discard older cookies to make space for newer ones, using, - for example, a least-recently-used algorithm, along with constraints - on the maximum number of cookies that each origin server may set. - - If a Set-Cookie2 response header includes a Comment attribute, the - user agent SHOULD store that information in a human-readable form - with the cookie and SHOULD display the comment text as part of a - cookie inspection user interface. - - If a Set-Cookie2 response header includes a CommentURL attribute, the - user agent SHOULD store that information in a human-readable form - with the cookie, or, preferably, SHOULD allow the user to follow the - http_URL link as part of a cookie inspection user interface. - - The cookie inspection user interface may include a facility whereby a - user can decide, at the time the user agent receives the Set-Cookie2 - response header, whether or not to accept the cookie. 
A potentially - confusing situation could arise if the following sequence occurs: - - * the user agent receives a cookie that contains a CommentURL - attribute; - - * the user agent's cookie inspection interface is configured so - that it presents a dialog to the user before the user agent - accepts the cookie; - - - - - -Kristol & Montulli Standards Track [Page 10] - -RFC 2965 HTTP State Management Mechanism October 2000 - - - * the dialog allows the user to follow the CommentURL link when - the user agent receives the cookie; and, - - * when the user follows the CommentURL link, the origin server - (or another server, via other links in the returned content) - returns another cookie. - - The user agent SHOULD NOT send any cookies in this context. The user - agent MAY discard any cookie it receives in this context that the - user has not, through some user agent mechanism, deemed acceptable. - - User agents SHOULD allow the user to control cookie destruction, but - they MUST NOT extend the cookie's lifetime beyond that controlled by - the Discard and Max-Age attributes. An infrequently-used cookie may - function as a "preferences file" for network applications, and a user - may wish to keep it even if it is the least-recently-used cookie. One - possible implementation would be an interface that allows the - permanent storage of a cookie through a checkbox (or, conversely, its - immediate destruction). - - Privacy considerations dictate that the user have considerable - control over cookie management. The PRIVACY section contains more - information. - - 3.3.4 Sending Cookies to the Origin Server When it sends a request - to an origin server, the user agent includes a Cookie request header - if it has stored cookies that are applicable to the request, based on - - * the request-host and request-port; - - * the request-URI; - - * the cookie's age. 
- - The syntax for the header is: - -cookie = "Cookie:" cookie-version 1*((";" | ",") cookie-value) -cookie-value = NAME "=" VALUE [";" path] [";" domain] [";" port] -cookie-version = "$Version" "=" value -NAME = attr -VALUE = value -path = "$Path" "=" value -domain = "$Domain" "=" value -port = "$Port" [ "=" <"> value <"> ] - - The value of the cookie-version attribute MUST be the value from the - Version attribute of the corresponding Set-Cookie2 response header. - Otherwise the value for cookie-version is 0. The value for the path - - - -Kristol & Montulli Standards Track [Page 11] - -RFC 2965 HTTP State Management Mechanism October 2000 - - - attribute MUST be the value from the Path attribute, if one was - present, of the corresponding Set-Cookie2 response header. Otherwise - the attribute SHOULD be omitted from the Cookie request header. The - value for the domain attribute MUST be the value from the Domain - attribute, if one was present, of the corresponding Set-Cookie2 - response header. Otherwise the attribute SHOULD be omitted from the - Cookie request header. - - The port attribute of the Cookie request header MUST mirror the Port - attribute, if one was present, in the corresponding Set-Cookie2 - response header. That is, the port attribute MUST be present if the - Port attribute was present in the Set-Cookie2 header, and it MUST - have the same value, if any. Otherwise, if the Port attribute was - absent from the Set-Cookie2 header, the attribute likewise MUST be - omitted from the Cookie request header. - - Note that there is neither a Comment nor a CommentURL attribute in - the Cookie request header corresponding to the ones in the Set- - Cookie2 response header. The user agent does not return the comment - information to the origin server. - - The user agent applies the following rules to choose applicable - cookie-values to send in Cookie request headers from among all the - cookies it has received. 
- - Domain Selection - The origin server's effective host name MUST domain-match the - Domain attribute of the cookie. - - Port Selection - There are three possible behaviors, depending on the Port - attribute in the Set-Cookie2 response header: - - 1. By default (no Port attribute), the cookie MAY be sent to any - port. - - 2. If the attribute is present but has no value (e.g., Port), the - cookie MUST only be sent to the request-port it was received - from. - - 3. If the attribute has a port-list, the cookie MUST only be - returned if the new request-port is one of those listed in - port-list. - - Path Selection - The request-URI MUST path-match the Path attribute of the cookie. - - - - - -Kristol & Montulli Standards Track [Page 12] - -RFC 2965 HTTP State Management Mechanism October 2000 - - - Max-Age Selection - Cookies that have expired should have been discarded and thus are - not forwarded to an origin server. - - If multiple cookies satisfy the criteria above, they are ordered in - the Cookie header such that those with more specific Path attributes - precede those with less specific. Ordering with respect to other - attributes (e.g., Domain) is unspecified. - - Note: For backward compatibility, the separator in the Cookie header - is semi-colon (;) everywhere. A server SHOULD also accept comma (,) - as the separator between cookie-values for future compatibility. - - 3.3.5 Identifying What Version is Understood: Cookie2 The Cookie2 - request header facilitates interoperation between clients and servers - that understand different versions of the cookie specification. 
When - the client sends one or more cookies to an origin server, if at least - one of those cookies contains a $Version attribute whose value is - different from the version that the client understands, then the - client MUST also send a Cookie2 request header, the syntax for which - is - - cookie2 = "Cookie2:" cookie-version - - Here the value for cookie-version is the highest version of cookie - specification (currently 1) that the client understands. The client - needs to send at most one such request header per request. - - 3.3.6 Sending Cookies in Unverifiable Transactions Users MUST have - control over sessions in order to ensure privacy. (See PRIVACY - section below.) To simplify implementation and to prevent an - additional layer of complexity where adequate safeguards exist, - however, this document distinguishes between transactions that are - verifiable and those that are unverifiable. A transaction is - verifiable if the user, or a user-designated agent, has the option to - review the request-URI prior to its use in the transaction. A - transaction is unverifiable if the user does not have that option. - Unverifiable transactions typically arise when a user agent - automatically requests inlined or embedded entities or when it - resolves redirection (3xx) responses from an origin server. - Typically the origin transaction, the transaction that the user - initiates, is verifiable, and that transaction may directly or - indirectly induce the user agent to make unverifiable transactions. - - An unverifiable transaction is to a third-party host if its request- - host U does not domain-match the reach R of the request-host O in the - origin transaction. 
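The third-party test just described can be sketched in Python. This is a rough illustration only. The function names are hypothetical, and the definitions of "reach" and "domain-match" are taken from the terminology section of RFC 2965, which is not part of this excerpt: the reach of a host name of the form A.B is .B when A has no embedded dots and B either has an embedded dot or is the string "local"; otherwise the reach is the host name itself.

```python
import ipaddress


def is_ip_address(host: str) -> bool:
    """Return True if host parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def domain_match(host: str, domain: str) -> bool:
    """Equal strings match; otherwise host must have the form N + ".B" with N non-empty."""
    host, domain = host.lower(), domain.lower()
    if host == domain:
        return True
    if is_ip_address(host):
        return False
    return domain.startswith(".") and host.endswith(domain) and len(host) > len(domain)


def reach(host: str) -> str:
    """Reach of a host name, following the RFC 2965 terminology section."""
    host = host.lower()
    if is_ip_address(host) or "." not in host:
        return host
    # Splitting at the first dot guarantees that A has no embedded dots.
    a, b = host.split(".", 1)
    if a and ("." in b[1:-1] or b == "local"):
        return "." + b
    return host


def is_third_party(request_host: str, origin_host: str) -> bool:
    """True if request-host U does not domain-match the reach R of origin host O."""
    return not domain_match(request_host, reach(origin_host))
```

Under this reading, an embedded image fetched from ads.foo.com while the user is visiting www.foo.com is not a third-party request, because the reach of www.foo.com is .foo.com. The same image fetched from www.bar.com is a third-party request.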
- - - - -Kristol & Montulli Standards Track [Page 13] - -RFC 2965 HTTP State Management Mechanism October 2000 - - - When it makes an unverifiable transaction, a user agent MUST disable - all cookie processing (i.e., MUST NOT send cookies, and MUST NOT - accept any received cookies) if the transaction is to a third-party - host. - - This restriction prevents a malicious service author from using - unverifiable transactions to induce a user agent to start or continue - a session with a server in a different domain. The starting or - continuation of such sessions could be contrary to the privacy - expectations of the user, and could also be a security problem. - - User agents MAY offer configurable options that allow the user agent, - or any autonomous programs that the user agent executes, to ignore - the above rule, so long as these override options default to "off". - - (N.B. Mechanisms may be proposed that will automate overriding the - third-party restrictions under controlled conditions.) - - Many current user agents already provide a review option that would - render many links verifiable. For instance, some user agents display - the URL that would be referenced for a particular link when the mouse - pointer is placed over that link. The user can therefore determine - whether to visit that site before causing the browser to do so. - (Though not implemented on current user agents, a similar technique - could be used for a button used to submit a form -- the user agent - could display the action to be taken if the user were to select that - button.) However, even this would not make all links verifiable; for - example, links to automatically loaded images would not normally be - subject to "mouse pointer" verification. - - Many user agents also provide the option for a user to view the HTML - source of a document, or to save the source to an external file where - it can be viewed by another application. 
While such an option does - provide a crude review mechanism, some users might not consider it - acceptable for this purpose. - -3.4 How an Origin Server Interprets the Cookie Header - - A user agent returns much of the information in the Set-Cookie2 - header to the origin server when the request-URI path-matches the - Path attribute of the cookie. When it receives a Cookie header, the - origin server SHOULD treat cookies with NAMEs whose prefix is $ - specially, as an attribute for the cookie. - - - - - - - - -Kristol & Montulli Standards Track [Page 14] - -RFC 2965 HTTP State Management Mechanism October 2000 - - -3.5 Caching Proxy Role - - One reason for separating state information from both a URL and - document content is to facilitate the scaling that caching permits. - To support cookies, a caching proxy MUST obey these rules already in - the HTTP specification: - - * Honor requests from the cache, if possible, based on cache - validity rules. - - * Pass along a Cookie request header in any request that the - proxy must make of another server. - - * Return the response to the client. Include any Set-Cookie2 - response header. - - * Cache the received response subject to the control of the usual - headers, such as Expires, - - Cache-control: no-cache - - and - - Cache-control: private - - * Cache the Set-Cookie2 subject to the control of the usual - header, - - Cache-control: no-cache="set-cookie2" - - (The Set-Cookie2 header should usually not be cached.) - - Proxies MUST NOT introduce Set-Cookie2 (Cookie) headers of their own - in proxy responses (requests). - -4. EXAMPLES - -4.1 Example 1 - - Most detail of request and response headers has been omitted. Assume - the user agent has no stored cookies. - - 1. User Agent -> Server - - POST /acme/login HTTP/1.1 - [form data] - - User identifies self via a form. - - - -Kristol & Montulli Standards Track [Page 15] - -RFC 2965 HTTP State Management Mechanism October 2000 - - - 2. 
Server -> User Agent - - HTTP/1.1 200 OK - Set-Cookie2: Customer="WILE_E_COYOTE"; Version="1"; Path="/acme" - - Cookie reflects user's identity. - - 3. User Agent -> Server - - POST /acme/pickitem HTTP/1.1 - Cookie: $Version="1"; Customer="WILE_E_COYOTE"; $Path="/acme" - [form data] - - User selects an item for "shopping basket". - - 4. Server -> User Agent - - HTTP/1.1 200 OK - Set-Cookie2: Part_Number="Rocket_Launcher_0001"; Version="1"; - Path="/acme" - - Shopping basket contains an item. - - 5. User Agent -> Server - - POST /acme/shipping HTTP/1.1 - Cookie: $Version="1"; - Customer="WILE_E_COYOTE"; $Path="/acme"; - Part_Number="Rocket_Launcher_0001"; $Path="/acme" - [form data] - - User selects shipping method from form. - - 6. Server -> User Agent - - HTTP/1.1 200 OK - Set-Cookie2: Shipping="FedEx"; Version="1"; Path="/acme" - - New cookie reflects shipping method. - - 7. User Agent -> Server - - POST /acme/process HTTP/1.1 - Cookie: $Version="1"; - Customer="WILE_E_COYOTE"; $Path="/acme"; - Part_Number="Rocket_Launcher_0001"; $Path="/acme"; - Shipping="FedEx"; $Path="/acme" - [form data] - - - -Kristol & Montulli Standards Track [Page 16] - -RFC 2965 HTTP State Management Mechanism October 2000 - - - User chooses to process order. - - 8. Server -> User Agent - - HTTP/1.1 200 OK - - Transaction is complete. - - The user agent makes a series of requests on the origin server, after - each of which it receives a new cookie. All the cookies have the - same Path attribute and (default) domain. Because the request-URIs - all path-match /acme, the Path attribute of each cookie, each request - contains all the cookies received so far. - -4.2 Example 2 - - This example illustrates the effect of the Path attribute. All - detail of request and response headers has been omitted. Assume the - user agent has no stored cookies. 
- - Imagine the user agent has received, in response to earlier requests, - the response headers - - Set-Cookie2: Part_Number="Rocket_Launcher_0001"; Version="1"; - Path="/acme" - - and - - Set-Cookie2: Part_Number="Riding_Rocket_0023"; Version="1"; - Path="/acme/ammo" - - A subsequent request by the user agent to the (same) server for URLs - of the form /acme/ammo/... would include the following request - header: - - Cookie: $Version="1"; - Part_Number="Riding_Rocket_0023"; $Path="/acme/ammo"; - Part_Number="Rocket_Launcher_0001"; $Path="/acme" - - Note that the NAME=VALUE pair for the cookie with the more specific - Path attribute, /acme/ammo, comes before the one with the less - specific Path attribute, /acme. Further note that the same cookie - name appears more than once. - - A subsequent request by the user agent to the (same) server for a URL - of the form /acme/parts/ would include the following request header: - - - - - -Kristol & Montulli Standards Track [Page 17] - -RFC 2965 HTTP State Management Mechanism October 2000 - - - Cookie: $Version="1"; Part_Number="Rocket_Launcher_0001"; - $Path="/acme" - - Here, the second cookie's Path attribute /acme/ammo is not a prefix - of the request URL, /acme/parts/, so the cookie does not get - forwarded to the server. - -5. IMPLEMENTATION CONSIDERATIONS - - Here we provide guidance on likely or desirable details for an origin - server that implements state management. - -5.1 Set-Cookie2 Content - - An origin server's content should probably be divided into disjoint - application areas, some of which require the use of state - information. The application areas can be distinguished by their - request URLs. The Set-Cookie2 header can incorporate information - about the application areas by setting the Path attribute for each - one. - - The session information can obviously be clear or encoded text that - describes state. However, if it grows too large, it can become - unwieldy. 
Therefore, an implementor might choose for the session - information to be a key to a server-side resource. Of course, using - a database creates some problems that this state management - specification was meant to avoid, namely: - - 1. keeping real state on the server side; - - 2. how and when to garbage-collect the database entry, in case the - user agent terminates the session by, for example, exiting. - -5.2 Stateless Pages - - Caching benefits the scalability of WWW. Therefore it is important - to reduce the number of documents that have state embedded in them - inherently. For example, if a shopping-basket-style application - always displays a user's current basket contents on each page, those - pages cannot be cached, because each user's basket's contents would - be different. On the other hand, if each page contains just a link - that allows the user to "Look at My Shopping Basket", the page can be - cached. - - - - - - - - -Kristol & Montulli Standards Track [Page 18] - -RFC 2965 HTTP State Management Mechanism October 2000 - - -5.3 Implementation Limits - - Practical user agent implementations have limits on the number and - size of cookies that they can store. In general, user agents' cookie - support should have no fixed limits. They should strive to store as - many frequently-used cookies as possible. Furthermore, general-use - user agents SHOULD provide each of the following minimum capabilities - individually, although not necessarily simultaneously: - - * at least 300 cookies - - * at least 4096 bytes per cookie (as measured by the characters - that comprise the cookie non-terminal in the syntax description - of the Set-Cookie2 header, and as received in the Set-Cookie2 - header) - - * at least 20 cookies per unique host or domain name - - User agents created for specific purposes or for limited-capacity - devices SHOULD provide at least 20 cookies of 4096 bytes, to ensure - that the user can interact with a session-based origin server. 
- - The information in a Set-Cookie2 response header MUST be retained in - its entirety. If for some reason there is inadequate space to store - the cookie, it MUST be discarded, not truncated. - - Applications should use as few and as small cookies as possible, and - they should cope gracefully with the loss of a cookie. - - 5.3.1 Denial of Service Attacks User agents MAY choose to set an - upper bound on the number of cookies to be stored from a given host - or domain name or on the size of the cookie information. Otherwise a - malicious server could attempt to flood a user agent with many - cookies, or large cookies, on successive responses, which would force - out cookies the user agent had received from other servers. However, - the minima specified above SHOULD still be supported. - -6. PRIVACY - - Informed consent should guide the design of systems that use cookies. - A user should be able to find out how a web site plans to use - information in a cookie and should be able to choose whether or not - those policies are acceptable. Both the user agent and the origin - server must assist informed consent. - - - - - - - -Kristol & Montulli Standards Track [Page 19] - -RFC 2965 HTTP State Management Mechanism October 2000 - - -6.1 User Agent Control - - An origin server could create a Set-Cookie2 header to track the path - of a user through the server. Users may object to this behavior as - an intrusive accumulation of information, even if their identity is - not evident. (Identity might become evident, for example, if a user - subsequently fills out a form that contains identifying information.) - This state management specification therefore requires that a user - agent give the user control over such a possible intrusion, although - the interface through which the user is given this control is left - unspecified. However, the control mechanisms provided SHALL at least - allow the user - - * to completely disable the sending and saving of cookies. 
- - * to determine whether a stateful session is in progress. - - * to control the saving of a cookie on the basis of the cookie's - Domain attribute. - - Such control could be provided, for example, by mechanisms - - * to notify the user when the user agent is about to send a - cookie to the origin server, to offer the option not to begin a - session. - - * to display a visual indication that a stateful session is in - progress. - - * to let the user decide which cookies, if any, should be saved - when the user concludes a window or user agent session. - - * to let the user examine and delete the contents of a cookie at - any time. - - A user agent usually begins execution with no remembered state - information. It SHOULD be possible to configure a user agent never - to send Cookie headers, in which case it can never sustain state with - an origin server. (The user agent would then behave like one that is - unaware of how to handle Set-Cookie2 response headers.) - - When the user agent terminates execution, it SHOULD let the user - discard all state information. Alternatively, the user agent MAY ask - the user whether state information should be retained; the default - should be "no". If the user chooses to retain state information, it - would be restored the next time the user agent runs. - - - - - -Kristol & Montulli Standards Track [Page 20] - -RFC 2965 HTTP State Management Mechanism October 2000 - - - NOTE: User agents should probably be cautious about using files to - store cookies long-term. If a user runs more than one instance of - the user agent, the cookies could be commingled or otherwise - corrupted. - -6.2 Origin Server Role - - An origin server SHOULD promote informed consent by adding CommentURL - or Comment information to the cookies it sends. CommentURL is - preferred because of the opportunity to provide richer information in - a multiplicity of languages. 
- -6.3 Clear Text - - The information in the Set-Cookie2 and Cookie headers is unprotected. - As a consequence: - - 1. Any sensitive information that is conveyed in them is exposed - to intruders. - - 2. A malicious intermediary could alter the headers as they travel - in either direction, with unpredictable results. - - These facts imply that information of a personal and/or financial - nature should only be sent over a secure channel. For less sensitive - information, or when the content of the header is a database key, an - origin server should be vigilant to prevent a bad Cookie value from - causing failures. - - A user agent in a shared user environment poses a further risk. - Using a cookie inspection interface, User B could examine the - contents of cookies that were saved when User A used the machine. - -7. SECURITY CONSIDERATIONS - -7.1 Protocol Design - - The restrictions on the value of the Domain attribute, and the rules - concerning unverifiable transactions, are meant to reduce the ways - that cookies can "leak" to the "wrong" site. The intent is to - restrict cookies to one host, or a closely related set of hosts. - Therefore a request-host is limited as to what values it can set for - Domain. We consider it acceptable for hosts host1.foo.com and - host2.foo.com to share cookies, but not a.com and b.com. - - Similarly, a server can set a Path only for cookies that are related - to the request-URI. - - - - -Kristol & Montulli Standards Track [Page 21] - -RFC 2965 HTTP State Management Mechanism October 2000 - - -7.2 Cookie Spoofing - - Proper application design can avoid spoofing attacks from related - domains. Consider: - - 1. User agent makes request to victim.cracker.edu, gets back - cookie session_id="1234" and sets the default domain - victim.cracker.edu. - - 2. User agent makes request to spoof.cracker.edu, gets back cookie - session-id="1111", with Domain=".cracker.edu". - - 3. 
User agent makes request to victim.cracker.edu again, and - passes - - Cookie: $Version="1"; session_id="1234", - $Version="1"; session_id="1111"; $Domain=".cracker.edu" - - The server at victim.cracker.edu should detect that the second - cookie was not one it originated by noticing that the Domain - attribute is not for itself and ignore it. - -7.3 Unexpected Cookie Sharing - - A user agent SHOULD make every attempt to prevent the sharing of - session information between hosts that are in different domains. - Embedded or inlined objects may cause particularly severe privacy - problems if they can be used to share cookies between disparate - hosts. For example, a malicious server could embed cookie - information for host a.com in a URI for a CGI on host b.com. User - agent implementors are strongly encouraged to prevent this sort of - exchange whenever possible. - -7.4 Cookies For Account Information - - While it is common practice to use them this way, cookies are not - designed or intended to be used to hold authentication information, - such as account names and passwords. Unless such cookies are - exchanged over an encrypted path, the account information they - contain is highly vulnerable to perusal and theft. - -8. OTHER, SIMILAR, PROPOSALS - - Apart from RFC 2109, three other proposals have been made to - accomplish similar goals. This specification began as an amalgam of - Kristol's State-Info proposal [DMK95] and Netscape's Cookie proposal - [Netscape]. - - - - -Kristol & Montulli Standards Track [Page 22] - -RFC 2965 HTTP State Management Mechanism October 2000 - - - Brian Behlendorf proposed a Session-ID header that would be user- - agent-initiated and could be used by an origin server to track - "clicktrails". It would not carry any origin-server-defined state, - however. Phillip Hallam-Baker has proposed another client-defined - session ID mechanism for similar purposes. 
- - While both session IDs and cookies can provide a way to sustain - stateful sessions, their intended purpose is different, and, - consequently, the privacy requirements for them are different. A - user initiates session IDs to allow servers to track progress through - them, or to distinguish multiple users on a shared machine. Cookies - are server-initiated, so the cookie mechanism described here gives - users control over something that would otherwise take place without - the users' awareness. Furthermore, cookies convey rich, server- - selected information, whereas session IDs comprise user-selected, - simple information. - -9. HISTORICAL - -9.1 Compatibility with Existing Implementations - - Existing cookie implementations, based on the Netscape specification, - use the Set-Cookie (not Set-Cookie2) header. User agents that - receive in the same response both a Set-Cookie and Set-Cookie2 - response header for the same cookie MUST discard the Set-Cookie - information and use only the Set-Cookie2 information. Furthermore, a - user agent MUST assume, if it received a Set-Cookie2 response header, - that the sending server complies with this document and will - understand Cookie request headers that also follow this - specification. - - New cookies MUST replace both equivalent old- and new-style cookies. - That is, if a user agent that follows both this specification and - Netscape's original specification receives a Set-Cookie2 response - header, and the NAME and the Domain and Path attributes match (per - the Cookie Management section) a Netscape-style cookie, the - Netscape-style cookie MUST be discarded, and the user agent MUST - retain only the cookie adhering to this specification. - - Older user agents that do not understand this specification, but that - do understand Netscape's original specification, will not recognize - the Set-Cookie2 response header and will receive and send cookies - according to the older specification. 
- - - - - - - - -Kristol & Montulli Standards Track [Page 23] - -RFC 2965 HTTP State Management Mechanism October 2000 - - - A user agent that supports both this specification and Netscape-style - cookies SHOULD send a Cookie request header that follows the older - Netscape specification if it received the cookie in a Set-Cookie - response header and not in a Set-Cookie2 response header. However, - it SHOULD send the following request header as well: - - Cookie2: $Version="1" - - The Cookie2 header advises the server that the user agent understands - new-style cookies. If the server understands new-style cookies, as - well, it SHOULD continue the stateful session by sending a Set- - Cookie2 response header, rather than Set-Cookie. A server that does - not understand new-style cookies will simply ignore the Cookie2 - request header. - -9.2 Caching and HTTP/1.0 - - Some caches, such as those conforming to HTTP/1.0, will inevitably - cache the Set-Cookie2 and Set-Cookie headers, because there was no - mechanism to suppress caching of headers prior to HTTP/1.1. This - caching can lead to security problems. Documents transmitted by an - origin server along with Set-Cookie2 and Set-Cookie headers usually - either will be uncachable, or will be "pre-expired". As long as - caches obey instructions not to cache documents (following Expires: - <a date in the past> or Pragma: no-cache (HTTP/1.0), or Cache- - control: no-cache (HTTP/1.1)) uncachable documents present no - problem. However, pre-expired documents may be stored in caches. - They require validation (a conditional GET) on each new request, but - some cache operators loosen the rules for their caches, and sometimes - serve expired documents without first validating them. This - combination of factors can lead to cookies meant for one user later - being sent to another user. 
The Set-Cookie2 and Set-Cookie headers - are stored in the cache, and, although the document is stale - (expired), the cache returns the document in response to later - requests, including cached headers. - -10. ACKNOWLEDGEMENTS - - This document really represents the collective efforts of the HTTP - Working Group of the IETF and, particularly, the following people, in - addition to the authors: Roy Fielding, Yaron Goland, Marc Hedlund, - Ted Hardie, Koen Holtman, Shel Kaphan, Rohit Khare, Foteos Macrides, - David W. Morris. - - - - - - - - -Kristol & Montulli Standards Track [Page 24] - -RFC 2965 HTTP State Management Mechanism October 2000 - - -11. AUTHORS' ADDRESSES - - David M. Kristol - Bell Laboratories, Lucent Technologies - 600 Mountain Ave. Room 2A-333 - Murray Hill, NJ 07974 - - Phone: (908) 582-2250 - Fax: (908) 582-1239 - EMail: dmk@bell-labs.com - - - Lou Montulli - Epinions.com, Inc. - 2037 Landings Dr. - Mountain View, CA 94301 - - EMail: lou@montulli.org - -12. REFERENCES - - [DMK95] Kristol, D.M., "Proposed HTTP State-Info Mechanism", - available at <http://portal.research.bell- - labs.com/~dmk/state-info.html>, September, 1995. - - [Netscape] "Persistent Client State -- HTTP Cookies", available at - <http://www.netscape.com/newsref/std/cookie_spec.html>, - undated. - - [RFC2109] Kristol, D. and L. Montulli, "HTTP State Management - Mechanism", RFC 2109, February 1997. - - [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate - Requirement Levels", BCP 14, RFC 2119, March 1997. - - [RFC2279] Yergeau, F., "UTF-8, a transformation format of Unicode - and ISO-10646", RFC 2279, January 1998. - - [RFC2396] Berners-Lee, T., Fielding, R. and L. Masinter, "Uniform - Resource Identifiers (URI): Generic Syntax", RFC 2396, - August 1998. - - [RFC2616] Fielding, R., Gettys, J., Mogul, J., Frystyk, H. and T. - Berners-Lee, "Hypertext Transfer Protocol -- HTTP/1.1", - RFC 2616, June 1999. 
- - - - - - -Kristol & Montulli Standards Track [Page 25] - -RFC 2965 HTTP State Management Mechanism October 2000 - - -13. Full Copyright Statement - - Copyright (C) The Internet Society (2000). All Rights Reserved. - - This document and translations of it may be copied and furnished to - others, and derivative works that comment on or otherwise explain it - or assist in its implementation may be prepared, copied, published - and distributed, in whole or in part, without restriction of any - kind, provided that the above copyright notice and this paragraph are - included on all such copies and derivative works. However, this - document itself may not be modified in any way, such as by removing - the copyright notice or references to the Internet Society or other - Internet organizations, except as needed for the purpose of - developing Internet standards in which case the procedures for - copyrights defined in the Internet Standards process must be - followed, or as required to translate it into languages other than - English. - - The limited permissions granted above are perpetual and will not be - revoked by the Internet Society or its successors or assigns. - - This document and the information contained herein is provided on an - "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING - TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING - BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION - HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF - MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. - -Acknowledgement - - Funding for the RFC Editor function is currently provided by the - Internet Society. - - - - - - - - - - - - - - - - - - - -Kristol & Montulli Standards Track [Page 26] - diff --git a/docs/specs/rfc3986.txt b/docs/specs/rfc3986.txt deleted file mode 100644 index c56ed4eb..00000000 --- a/docs/specs/rfc3986.txt +++ /dev/null @@ -1,3419 +0,0 @@ - - - - - - -Network Working Group T. 
Berners-Lee -Request for Comments: 3986 W3C/MIT -STD: 66 R. Fielding -Updates: 1738 Day Software -Obsoletes: 2732, 2396, 1808 L. Masinter -Category: Standards Track Adobe Systems - January 2005 - - - Uniform Resource Identifier (URI): Generic Syntax - -Status of This Memo - - This document specifies an Internet standards track protocol for the - Internet community, and requests discussion and suggestions for - improvements. Please refer to the current edition of the "Internet - Official Protocol Standards" (STD 1) for the standardization state - and status of this protocol. Distribution of this memo is unlimited. - -Copyright Notice - - Copyright (C) The Internet Society (2005). - -Abstract - - A Uniform Resource Identifier (URI) is a compact sequence of - characters that identifies an abstract or physical resource. This - specification defines the generic URI syntax and a process for - resolving URI references that might be in relative form, along with - guidelines and security considerations for the use of URIs on the - Internet. The URI syntax defines a grammar that is a superset of all - valid URIs, allowing an implementation to parse the common components - of a URI reference without knowing the scheme-specific requirements - of every possible identifier. This specification does not define a - generative grammar for URIs; that task is performed by the individual - specifications of each URI scheme. - - - - - - - - - - - - - - - -Berners-Lee, et al. Standards Track [Page 1] - -RFC 3986 URI Generic Syntax January 2005 - - -Table of Contents - - 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 - 1.1. Overview of URIs . . . . . . . . . . . . . . . . . . . . 4 - 1.1.1. Generic Syntax . . . . . . . . . . . . . . . . . 6 - 1.1.2. Examples . . . . . . . . . . . . . . . . . . . . 7 - 1.1.3. URI, URL, and URN . . . . . . . . . . . . . . . 7 - 1.2. Design Considerations . . . . . . . . . . . . . . . . . 8 - 1.2.1. Transcription . . . . . . . . . . . . 
. . . . . 8 - 1.2.2. Separating Identification from Interaction . . . 9 - 1.2.3. Hierarchical Identifiers . . . . . . . . . . . . 10 - 1.3. Syntax Notation . . . . . . . . . . . . . . . . . . . . 11 - 2. Characters . . . . . . . . . . . . . . . . . . . . . . . . . . 11 - 2.1. Percent-Encoding . . . . . . . . . . . . . . . . . . . . 12 - 2.2. Reserved Characters . . . . . . . . . . . . . . . . . . 12 - 2.3. Unreserved Characters . . . . . . . . . . . . . . . . . 13 - 2.4. When to Encode or Decode . . . . . . . . . . . . . . . . 14 - 2.5. Identifying Data . . . . . . . . . . . . . . . . . . . . 14 - 3. Syntax Components . . . . . . . . . . . . . . . . . . . . . . 16 - 3.1. Scheme . . . . . . . . . . . . . . . . . . . . . . . . . 17 - 3.2. Authority . . . . . . . . . . . . . . . . . . . . . . . 17 - 3.2.1. User Information . . . . . . . . . . . . . . . . 18 - 3.2.2. Host . . . . . . . . . . . . . . . . . . . . . . 18 - 3.2.3. Port . . . . . . . . . . . . . . . . . . . . . . 22 - 3.3. Path . . . . . . . . . . . . . . . . . . . . . . . . . . 22 - 3.4. Query . . . . . . . . . . . . . . . . . . . . . . . . . 23 - 3.5. Fragment . . . . . . . . . . . . . . . . . . . . . . . . 24 - 4. Usage . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25 - 4.1. URI Reference . . . . . . . . . . . . . . . . . . . . . 25 - 4.2. Relative Reference . . . . . . . . . . . . . . . . . . . 26 - 4.3. Absolute URI . . . . . . . . . . . . . . . . . . . . . . 27 - 4.4. Same-Document Reference . . . . . . . . . . . . . . . . 27 - 4.5. Suffix Reference . . . . . . . . . . . . . . . . . . . . 27 - 5. Reference Resolution . . . . . . . . . . . . . . . . . . . . . 28 - 5.1. Establishing a Base URI . . . . . . . . . . . . . . . . 28 - 5.1.1. Base URI Embedded in Content . . . . . . . . . . 29 - 5.1.2. Base URI from the Encapsulating Entity . . . . . 29 - 5.1.3. Base URI from the Retrieval URI . . . . . . . . 30 - 5.1.4. Default Base URI . . . . . . . . . . . . . . . . 30 - 5.2. 
Relative Resolution . . . . . . . . . . . . . . . . . . 30 - 5.2.1. Pre-parse the Base URI . . . . . . . . . . . . . 31 - 5.2.2. Transform References . . . . . . . . . . . . . . 31 - 5.2.3. Merge Paths . . . . . . . . . . . . . . . . . . 32 - 5.2.4. Remove Dot Segments . . . . . . . . . . . . . . 33 - 5.3. Component Recomposition . . . . . . . . . . . . . . . . 35 - 5.4. Reference Resolution Examples . . . . . . . . . . . . . 35 - 5.4.1. Normal Examples . . . . . . . . . . . . . . . . 36 - 5.4.2. Abnormal Examples . . . . . . . . . . . . . . . 36 - - - -Berners-Lee, et al. Standards Track [Page 2] - -RFC 3986 URI Generic Syntax January 2005 - - - 6. Normalization and Comparison . . . . . . . . . . . . . . . . . 38 - 6.1. Equivalence . . . . . . . . . . . . . . . . . . . . . . 38 - 6.2. Comparison Ladder . . . . . . . . . . . . . . . . . . . 39 - 6.2.1. Simple String Comparison . . . . . . . . . . . . 39 - 6.2.2. Syntax-Based Normalization . . . . . . . . . . . 40 - 6.2.3. Scheme-Based Normalization . . . . . . . . . . . 41 - 6.2.4. Protocol-Based Normalization . . . . . . . . . . 42 - 7. Security Considerations . . . . . . . . . . . . . . . . . . . 43 - 7.1. Reliability and Consistency . . . . . . . . . . . . . . 43 - 7.2. Malicious Construction . . . . . . . . . . . . . . . . . 43 - 7.3. Back-End Transcoding . . . . . . . . . . . . . . . . . . 44 - 7.4. Rare IP Address Formats . . . . . . . . . . . . . . . . 45 - 7.5. Sensitive Information . . . . . . . . . . . . . . . . . 45 - 7.6. Semantic Attacks . . . . . . . . . . . . . . . . . . . . 45 - 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 46 - 9. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 46 - 10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 46 - 10.1. Normative References . . . . . . . . . . . . . . . . . . 46 - 10.2. Informative References . . . . . . . . . . . . . . . . . 47 - A. Collected ABNF for URI . . . . . . . . . . . . . . . . . . . . 49 - B. 
Parsing a URI Reference with a Regular Expression . . . . . . 50 - C. Delimiting a URI in Context . . . . . . . . . . . . . . . . . 51 - D. Changes from RFC 2396 . . . . . . . . . . . . . . . . . . . . 53 - D.1. Additions . . . . . . . . . . . . . . . . . . . . . . . 53 - D.2. Modifications . . . . . . . . . . . . . . . . . . . . . 53 - Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56 - Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 60 - Full Copyright Statement . . . . . . . . . . . . . . . . . . . . . 61 - - - - - - - - - - - - - - - - - - - - - - - -Berners-Lee, et al. Standards Track [Page 3] - -RFC 3986 URI Generic Syntax January 2005 - - -1. Introduction - - A Uniform Resource Identifier (URI) provides a simple and extensible - means for identifying a resource. This specification of URI syntax - and semantics is derived from concepts introduced by the World Wide - Web global information initiative, whose use of these identifiers - dates from 1990 and is described in "Universal Resource Identifiers - in WWW" [RFC1630]. The syntax is designed to meet the - recommendations laid out in "Functional Recommendations for Internet - Resource Locators" [RFC1736] and "Functional Requirements for Uniform - Resource Names" [RFC1737]. - - This document obsoletes [RFC2396], which merged "Uniform Resource - Locators" [RFC1738] and "Relative Uniform Resource Locators" - [RFC1808] in order to define a single, generic syntax for all URIs. - It obsoletes [RFC2732], which introduced syntax for an IPv6 address. - It excludes portions of RFC 1738 that defined the specific syntax of - individual URI schemes; those portions will be updated as separate - documents. The process for registration of new URI schemes is - defined separately by [BCP35]. Advice for designers of new URI - schemes can be found in [RFC2718]. All significant changes from RFC - 2396 are noted in Appendix D. 
- - This specification uses the terms "character" and "coded character - set" in accordance with the definitions provided in [BCP19], and - "character encoding" in place of what [BCP19] refers to as a - "charset". - -1.1. Overview of URIs - - URIs are characterized as follows: - - Uniform - - Uniformity provides several benefits. It allows different types - of resource identifiers to be used in the same context, even when - the mechanisms used to access those resources may differ. It - allows uniform semantic interpretation of common syntactic - conventions across different types of resource identifiers. It - allows introduction of new types of resource identifiers without - interfering with the way that existing identifiers are used. It - allows the identifiers to be reused in many different contexts, - thus permitting new applications or protocols to leverage a pre- - existing, large, and widely used set of resource identifiers. - - - - - - - -Berners-Lee, et al. Standards Track [Page 4] - -RFC 3986 URI Generic Syntax January 2005 - - - Resource - - This specification does not limit the scope of what might be a - resource; rather, the term "resource" is used in a general sense - for whatever might be identified by a URI. Familiar examples - include an electronic document, an image, a source of information - with a consistent purpose (e.g., "today's weather report for Los - Angeles"), a service (e.g., an HTTP-to-SMS gateway), and a - collection of other resources. A resource is not necessarily - accessible via the Internet; e.g., human beings, corporations, and - bound books in a library can also be resources. Likewise, - abstract concepts can be resources, such as the operators and - operands of a mathematical equation, the types of a relationship - (e.g., "parent" or "employee"), or numeric values (e.g., zero, - one, and infinity). 
- - Identifier - - An identifier embodies the information required to distinguish - what is being identified from all other things within its scope of - identification. Our use of the terms "identify" and "identifying" - refer to this purpose of distinguishing one resource from all - other resources, regardless of how that purpose is accomplished - (e.g., by name, address, or context). These terms should not be - mistaken as an assumption that an identifier defines or embodies - the identity of what is referenced, though that may be the case - for some identifiers. Nor should it be assumed that a system - using URIs will access the resource identified: in many cases, - URIs are used to denote resources without any intention that they - be accessed. Likewise, the "one" resource identified might not be - singular in nature (e.g., a resource might be a named set or a - mapping that varies over time). - - A URI is an identifier consisting of a sequence of characters - matching the syntax rule named <URI> in Section 3. It enables - uniform identification of resources via a separately defined - extensible set of naming schemes (Section 3.1). How that - identification is accomplished, assigned, or enabled is delegated to - each scheme specification. - - This specification does not place any limits on the nature of a - resource, the reasons why an application might seek to refer to a - resource, or the kinds of systems that might use URIs for the sake of - identifying resources. This specification does not require that a - URI persists in identifying the same resource over time, though that - is a common goal of all URI schemes. Nevertheless, nothing in this - - - - - -Berners-Lee, et al. Standards Track [Page 5] - -RFC 3986 URI Generic Syntax January 2005 - - - specification prevents an application from limiting itself to - particular types of resources, or to a subset of URIs that maintains - characteristics desired by that application. 
- - URIs have a global scope and are interpreted consistently regardless - of context, though the result of that interpretation may be in - relation to the end-user's context. For example, "http://localhost/" - has the same interpretation for every user of that reference, even - though the network interface corresponding to "localhost" may be - different for each end-user: interpretation is independent of access. - However, an action made on the basis of that reference will take - place in relation to the end-user's context, which implies that an - action intended to refer to a globally unique thing must use a URI - that distinguishes that resource from all other things. URIs that - identify in relation to the end-user's local context should only be - used when the context itself is a defining aspect of the resource, - such as when an on-line help manual refers to a file on the end- - user's file system (e.g., "file:///etc/hosts"). - -1.1.1. Generic Syntax - - Each URI begins with a scheme name, as defined in Section 3.1, that - refers to a specification for assigning identifiers within that - scheme. As such, the URI syntax is a federated and extensible naming - system wherein each scheme's specification may further restrict the - syntax and semantics of identifiers using that scheme. - - This specification defines those elements of the URI syntax that are - required of all URI schemes or are common to many URI schemes. It - thus defines the syntax and semantics needed to implement a scheme- - independent parsing mechanism for URI references, by which the - scheme-dependent handling of a URI can be postponed until the - scheme-dependent semantics are needed. Likewise, protocols and data - formats that make use of URI references can refer to this - specification as a definition for the range of syntax allowed for all - URIs, including those schemes that have yet to be defined. 
This - decouples the evolution of identification schemes from the evolution - of protocols, data formats, and implementations that make use of - URIs. - - A parser of the generic URI syntax can parse any URI reference into - its major components. Once the scheme is determined, further - scheme-specific parsing can be performed on the components. In other - words, the URI generic syntax is a superset of the syntax of all URI - schemes. - - - - - - -Berners-Lee, et al. Standards Track [Page 6] - -RFC 3986 URI Generic Syntax January 2005 - - -1.1.2. Examples - - The following example URIs illustrate several URI schemes and - variations in their common syntax components: - - ftp://ftp.is.co.za/rfc/rfc1808.txt - - http://www.ietf.org/rfc/rfc2396.txt - - ldap://[2001:db8::7]/c=GB?objectClass?one - - mailto:John.Doe@example.com - - news:comp.infosystems.www.servers.unix - - tel:+1-816-555-1212 - - telnet://192.0.2.16:80/ - - urn:oasis:names:specification:docbook:dtd:xml:4.1.2 - - -1.1.3. URI, URL, and URN - - A URI can be further classified as a locator, a name, or both. The - term "Uniform Resource Locator" (URL) refers to the subset of URIs - that, in addition to identifying a resource, provide a means of - locating the resource by describing its primary access mechanism - (e.g., its network "location"). The term "Uniform Resource Name" - (URN) has been used historically to refer to both URIs under the - "urn" scheme [RFC2141], which are required to remain globally unique - and persistent even when the resource ceases to exist or becomes - unavailable, and to any other URI with the properties of a name. - - An individual scheme does not have to be classified as being just one - of "name" or "locator". Instances of URIs from any given scheme may - have the characteristics of names or locators or both, often - depending on the persistence and care in the assignment of - identifiers by the naming authority, rather than on any quality of - the scheme. 
Future specifications and related documentation should - use the general term "URI" rather than the more restrictive terms - "URL" and "URN" [RFC3305]. - - - - - - - - - -Berners-Lee, et al. Standards Track [Page 7] - -RFC 3986 URI Generic Syntax January 2005 - - -1.2. Design Considerations - -1.2.1. Transcription - - The URI syntax has been designed with global transcription as one of - its main considerations. A URI is a sequence of characters from a - very limited set: the letters of the basic Latin alphabet, digits, - and a few special characters. A URI may be represented in a variety - of ways; e.g., ink on paper, pixels on a screen, or a sequence of - character encoding octets. The interpretation of a URI depends only - on the characters used and not on how those characters are - represented in a network protocol. - - The goal of transcription can be described by a simple scenario. - Imagine two colleagues, Sam and Kim, sitting in a pub at an - international conference and exchanging research ideas. Sam asks Kim - for a location to get more information, so Kim writes the URI for the - research site on a napkin. Upon returning home, Sam takes out the - napkin and types the URI into a computer, which then retrieves the - information to which Kim referred. - - There are several design considerations revealed by the scenario: - - o A URI is a sequence of characters that is not always represented - as a sequence of octets. - - o A URI might be transcribed from a non-network source and thus - should consist of characters that are most likely able to be - entered into a computer, within the constraints imposed by - keyboards (and related input devices) across languages and - locales. - - o A URI often has to be remembered by people, and it is easier for - people to remember a URI when it consists of meaningful or - familiar components. - - These design considerations are not always in alignment. 
For - example, it is often the case that the most meaningful name for a URI - component would require characters that cannot be typed into some - systems. The ability to transcribe a resource identifier from one - medium to another has been considered more important than having a - URI consist of the most meaningful of components. - - In local or regional contexts and with improving technology, users - might benefit from being able to use a wider range of characters; - such use is not defined by this specification. Percent-encoded - octets (Section 2.1) may be used within a URI to represent characters - outside the range of the US-ASCII coded character set if this - - - -Berners-Lee, et al. Standards Track [Page 8] - -RFC 3986 URI Generic Syntax January 2005 - - - representation is allowed by the scheme or by the protocol element in - which the URI is referenced. Such a definition should specify the - character encoding used to map those characters to octets prior to - being percent-encoded for the URI. - -1.2.2. Separating Identification from Interaction - - A common misunderstanding of URIs is that they are only used to refer - to accessible resources. The URI itself only provides - identification; access to the resource is neither guaranteed nor - implied by the presence of a URI. Instead, any operation associated - with a URI reference is defined by the protocol element, data format - attribute, or natural language text in which it appears. - - Given a URI, a system may attempt to perform a variety of operations - on the resource, as might be characterized by words such as "access", - "update", "replace", or "find attributes". Such operations are - defined by the protocols that make use of URIs, not by this - specification. However, we do use a few general terms for describing - common operations on URIs. 
URI "resolution" is the process of - determining an access mechanism and the appropriate parameters - necessary to dereference a URI; this resolution may require several - iterations. To use that access mechanism to perform an action on the - URI's resource is to "dereference" the URI. - - When URIs are used within information retrieval systems to identify - sources of information, the most common form of URI dereference is - "retrieval": making use of a URI in order to retrieve a - representation of its associated resource. A "representation" is a - sequence of octets, along with representation metadata describing - those octets, that constitutes a record of the state of the resource - at the time when the representation is generated. Retrieval is - achieved by a process that might include using the URI as a cache key - to check for a locally cached representation, resolution of the URI - to determine an appropriate access mechanism (if any), and - dereference of the URI for the sake of applying a retrieval - operation. Depending on the protocols used to perform the retrieval, - additional information might be supplied about the resource (resource - metadata) and its relation to other resources. - - URI references in information retrieval systems are designed to be - late-binding: the result of an access is generally determined when it - is accessed and may vary over time or due to other aspects of the - interaction. These references are created in order to be used in the - future: what is being identified is not some specific result that was - obtained in the past, but rather some characteristic that is expected - to be true for future results. In such cases, the resource referred - to by the URI is actually a sameness of characteristics as observed - - - -Berners-Lee, et al. Standards Track [Page 9] - -RFC 3986 URI Generic Syntax January 2005 - - - over time, perhaps elucidated by additional comments or assertions - made by the resource provider. 
- - Although many URI schemes are named after protocols, this does not - imply that use of these URIs will result in access to the resource - via the named protocol. URIs are often used simply for the sake of - identification. Even when a URI is used to retrieve a representation - of a resource, that access might be through gateways, proxies, - caches, and name resolution services that are independent of the - protocol associated with the scheme name. The resolution of some - URIs may require the use of more than one protocol (e.g., both DNS - and HTTP are typically used to access an "http" URI's origin server - when a representation isn't found in a local cache). - -1.2.3. Hierarchical Identifiers - - The URI syntax is organized hierarchically, with components listed in - order of decreasing significance from left to right. For some URI - schemes, the visible hierarchy is limited to the scheme itself: - everything after the scheme component delimiter (":") is considered - opaque to URI processing. Other URI schemes make the hierarchy - explicit and visible to generic parsing algorithms. - - The generic syntax uses the slash ("/"), question mark ("?"), and - number sign ("#") characters to delimit components that are - significant to the generic parser's hierarchical interpretation of an - identifier. In addition to aiding the readability of such - identifiers through the consistent use of familiar syntax, this - uniform representation of hierarchy across naming schemes allows - scheme-independent references to be made relative to that hierarchy. - - It is often the case that a group or "tree" of documents has been - constructed to serve a common purpose, wherein the vast majority of - URI references in these documents point to resources within the tree - rather than outside it. Similarly, documents located at a particular - site are much more likely to refer to other resources at that site - than to resources at remote sites. 
Relative referencing of URIs - allows document trees to be partially independent of their location - and access scheme. For instance, it is possible for a single set of - hypertext documents to be simultaneously accessible and traversable - via each of the "file", "http", and "ftp" schemes if the documents - refer to each other with relative references. Furthermore, such - document trees can be moved, as a whole, without changing any of the - relative references. - - A relative reference (Section 4.2) refers to a resource by describing - the difference within a hierarchical name space between the reference - context and the target URI. The reference resolution algorithm, - - - -Berners-Lee, et al. Standards Track [Page 10] - -RFC 3986 URI Generic Syntax January 2005 - - - presented in Section 5, defines how such a reference is transformed - to the target URI. As relative references can only be used within - the context of a hierarchical URI, designers of new URI schemes - should use a syntax consistent with the generic syntax's hierarchical - components unless there are compelling reasons to forbid relative - referencing within that scheme. - - NOTE: Previous specifications used the terms "partial URI" and - "relative URI" to denote a relative reference to a URI. As some - readers misunderstood those terms to mean that relative URIs are a - subset of URIs rather than a method of referencing URIs, this - specification simply refers to them as relative references. - - All URI references are parsed by generic syntax parsers when used. - However, because hierarchical processing has no effect on an absolute - URI used in a reference unless it contains one or more dot-segments - (complete path segments of "." or "..", as described in Section 3.3), - URI scheme specifications can define opaque identifiers by - disallowing use of slash characters, question mark characters, and - the URIs "scheme:." and "scheme:..". - -1.3. 
Syntax Notation - - This specification uses the Augmented Backus-Naur Form (ABNF) - notation of [RFC2234], including the following core ABNF syntax rules - defined by that specification: ALPHA (letters), CR (carriage return), - DIGIT (decimal digits), DQUOTE (double quote), HEXDIG (hexadecimal - digits), LF (line feed), and SP (space). The complete URI syntax is - collected in Appendix A. - -2. Characters - - The URI syntax provides a method of encoding data, presumably for the - sake of identifying a resource, as a sequence of characters. The URI - characters are, in turn, frequently encoded as octets for transport - or presentation. This specification does not mandate any particular - character encoding for mapping between URI characters and the octets - used to store or transmit those characters. When a URI appears in a - protocol element, the character encoding is defined by that protocol; - without such a definition, a URI is assumed to be in the same - character encoding as the surrounding text. - - The ABNF notation defines its terminal values to be non-negative - integers (codepoints) based on the US-ASCII coded character set - [ASCII]. Because a URI is a sequence of characters, we must invert - that relation in order to understand the URI syntax. Therefore, the - - - - - -Berners-Lee, et al. Standards Track [Page 11] - -RFC 3986 URI Generic Syntax January 2005 - - - integer values used by the ABNF must be mapped back to their - corresponding characters via US-ASCII in order to complete the syntax - rules. - - A URI is composed from a limited set of characters consisting of - digits, letters, and a few graphic symbols. A reserved subset of - those characters may be used to delimit syntax components within a - URI while the remaining characters, including both the unreserved set - and those reserved characters not acting as delimiters, define each - component's identifying data. - -2.1. 
Percent-Encoding - - A percent-encoding mechanism is used to represent a data octet in a - component when that octet's corresponding character is outside the - allowed set or is being used as a delimiter of, or within, the - component. A percent-encoded octet is encoded as a character - triplet, consisting of the percent character "%" followed by the two - hexadecimal digits representing that octet's numeric value. For - example, "%20" is the percent-encoding for the binary octet - "00100000" (ABNF: %x20), which in US-ASCII corresponds to the space - character (SP). Section 2.4 describes when percent-encoding and - decoding is applied. - - pct-encoded = "%" HEXDIG HEXDIG - - The uppercase hexadecimal digits 'A' through 'F' are equivalent to - the lowercase digits 'a' through 'f', respectively. If two URIs - differ only in the case of hexadecimal digits used in percent-encoded - octets, they are equivalent. For consistency, URI producers and - normalizers should use uppercase hexadecimal digits for all percent- - encodings. - -2.2. Reserved Characters - - URIs include components and subcomponents that are delimited by - characters in the "reserved" set. These characters are called - "reserved" because they may (or may not) be defined as delimiters by - the generic syntax, by each scheme-specific syntax, or by the - implementation-specific syntax of a URI's dereferencing algorithm. - If data for a URI component would conflict with a reserved - character's purpose as a delimiter, then the conflicting data must be - percent-encoded before the URI is formed. - - - - - - - - -Berners-Lee, et al. Standards Track [Page 12] - -RFC 3986 URI Generic Syntax January 2005 - - - reserved = gen-delims / sub-delims - - gen-delims = ":" / "/" / "?" / "#" / "[" / "]" / "@" - - sub-delims = "!" 
/ "$" / "&" / "'" / "(" / ")" - / "*" / "+" / "," / ";" / "=" - - The purpose of reserved characters is to provide a set of delimiting - characters that are distinguishable from other data within a URI. - URIs that differ in the replacement of a reserved character with its - corresponding percent-encoded octet are not equivalent. Percent- - encoding a reserved character, or decoding a percent-encoded octet - that corresponds to a reserved character, will change how the URI is - interpreted by most applications. Thus, characters in the reserved - set are protected from normalization and are therefore safe to be - used by scheme-specific and producer-specific algorithms for - delimiting data subcomponents within a URI. - - A subset of the reserved characters (gen-delims) is used as - delimiters of the generic URI components described in Section 3. A - component's ABNF syntax rule will not use the reserved or gen-delims - rule names directly; instead, each syntax rule lists the characters - allowed within that component (i.e., not delimiting it), and any of - those characters that are also in the reserved set are "reserved" for - use as subcomponent delimiters within the component. Only the most - common subcomponents are defined by this specification; other - subcomponents may be defined by a URI scheme's specification, or by - the implementation-specific syntax of a URI's dereferencing - algorithm, provided that such subcomponents are delimited by - characters in the reserved set allowed within that component. - - URI producing applications should percent-encode data octets that - correspond to characters in the reserved set unless these characters - are specifically allowed by the URI scheme to represent data in that - component. If a reserved character is found in a URI component and - no delimiting role is known for that character, then it must be - interpreted as representing the data octet corresponding to that - character's encoding in US-ASCII. - -2.3. 
Unreserved Characters - - Characters that are allowed in a URI but do not have a reserved - purpose are called unreserved. These include uppercase and lowercase - letters, decimal digits, hyphen, period, underscore, and tilde. - - unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~" - - - - - -Berners-Lee, et al. Standards Track [Page 13] - -RFC 3986 URI Generic Syntax January 2005 - - - URIs that differ in the replacement of an unreserved character with - its corresponding percent-encoded US-ASCII octet are equivalent: they - identify the same resource. However, URI comparison implementations - do not always perform normalization prior to comparison (see Section - 6). For consistency, percent-encoded octets in the ranges of ALPHA - (%41-%5A and %61-%7A), DIGIT (%30-%39), hyphen (%2D), period (%2E), - underscore (%5F), or tilde (%7E) should not be created by URI - producers and, when found in a URI, should be decoded to their - corresponding unreserved characters by URI normalizers. - -2.4. When to Encode or Decode - - Under normal circumstances, the only time when octets within a URI - are percent-encoded is during the process of producing the URI from - its component parts. This is when an implementation determines which - of the reserved characters are to be used as subcomponent delimiters - and which can be safely used as data. Once produced, a URI is always - in its percent-encoded form. - - When a URI is dereferenced, the components and subcomponents - significant to the scheme-specific dereferencing process (if any) - must be parsed and separated before the percent-encoded octets within - those components can be safely decoded, as otherwise the data may be - mistaken for component delimiters. The only exception is for - percent-encoded octets corresponding to characters in the unreserved - set, which can be decoded at any time. 
For example, the octet - corresponding to the tilde ("~") character is often encoded as "%7E" - by older URI processing implementations; the "%7E" can be replaced by - "~" without changing its interpretation. - - Because the percent ("%") character serves as the indicator for - percent-encoded octets, it must be percent-encoded as "%25" for that - octet to be used as data within a URI. Implementations must not - percent-encode or decode the same string more than once, as decoding - an already decoded string might lead to misinterpreting a percent - data octet as the beginning of a percent-encoding, or vice versa in - the case of percent-encoding an already percent-encoded string. - -2.5. Identifying Data - - URI characters provide identifying data for each of the URI - components, serving as an external interface for identification - between systems. Although the presence and nature of the URI - production interface is hidden from clients that use its URIs (and is - thus beyond the scope of the interoperability requirements defined by - this specification), it is a frequent source of confusion and errors - in the interpretation of URI character issues. Implementers have to - be aware that there are multiple character encodings involved in the - - - -Berners-Lee, et al. Standards Track [Page 14] - -RFC 3986 URI Generic Syntax January 2005 - - - production and transmission of URIs: local name and data encoding, - public interface encoding, URI character encoding, data format - encoding, and protocol encoding. - - Local names, such as file system names, are stored with a local - character encoding. URI producing applications (e.g., origin - servers) will typically use the local encoding as the basis for - producing meaningful names. 
The URI producer will transform the - local encoding to one that is suitable for a public interface and - then transform the public interface encoding into the restricted set - of URI characters (reserved, unreserved, and percent-encodings). - Those characters are, in turn, encoded as octets to be used as a - reference within a data format (e.g., a document charset), and such - data formats are often subsequently encoded for transmission over - Internet protocols. - - For most systems, an unreserved character appearing within a URI - component is interpreted as representing the data octet corresponding - to that character's encoding in US-ASCII. Consumers of URIs assume - that the letter "X" corresponds to the octet "01011000", and even - when that assumption is incorrect, there is no harm in making it. A - system that internally provides identifiers in the form of a - different character encoding, such as EBCDIC, will generally perform - character translation of textual identifiers to UTF-8 [STD63] (or - some other superset of the US-ASCII character encoding) at an - internal interface, thereby providing more meaningful identifiers - than those resulting from simply percent-encoding the original - octets. - - For example, consider an information service that provides data, - stored locally using an EBCDIC-based file system, to clients on the - Internet through an HTTP server. When an author creates a file with - the name "Laguna Beach" on that file system, the "http" URI - corresponding to that resource is expected to contain the meaningful - string "Laguna%20Beach". If, however, that server produces URIs by - using an overly simplistic raw octet mapping, then the result would - be a URI containing "%D3%81%87%A4%95%81@%C2%85%81%83%88". An - internal transcoding interface fixes this problem by transcoding the - local name to a superset of US-ASCII prior to producing the URI. 
- Naturally, proper interpretation of an incoming URI on such an - interface requires that percent-encoded octets be decoded (e.g., - "%20" to SP) before the reverse transcoding is applied to obtain the - local name. - - In some cases, the internal interface between a URI component and the - identifying data that it has been crafted to represent is much less - direct than a character encoding translation. For example, portions - of a URI might reflect a query on non-ASCII data, or numeric - - - -Berners-Lee, et al. Standards Track [Page 15] - -RFC 3986 URI Generic Syntax January 2005 - - - coordinates on a map. Likewise, a URI scheme may define components - with additional encoding requirements that are applied prior to - forming the component and producing the URI. - - When a new URI scheme defines a component that represents textual - data consisting of characters from the Universal Character Set [UCS], - the data should first be encoded as octets according to the UTF-8 - character encoding [STD63]; then only those octets that do not - correspond to characters in the unreserved set should be percent- - encoded. For example, the character A would be represented as "A", - the character LATIN CAPITAL LETTER A WITH GRAVE would be represented - as "%C3%80", and the character KATAKANA LETTER A would be represented - as "%E3%82%A2". - -3. Syntax Components - - The generic URI syntax consists of a hierarchical sequence of - components referred to as the scheme, authority, path, query, and - fragment. - - URI = scheme ":" hier-part [ "?" query ] [ "#" fragment ] - - hier-part = "//" authority path-abempty - / path-absolute - / path-rootless - / path-empty - - The scheme and path components are required, though the path may be - empty (no characters). When authority is present, the path must - either be empty or begin with a slash ("/") character. When - authority is not present, the path cannot begin with two slash - characters ("//"). 
These restrictions result in five different ABNF
-   rules for a path (Section 3.3), only one of which will match any
-   given URI reference.
-
-   The following are two example URIs and their component parts:
-
-         foo://example.com:8042/over/there?name=ferret#nose
-         \_/   \______________/\_________/ \_________/ \__/
-          |           |            |            |        |
-       scheme     authority       path        query   fragment
-          |   _____________________|__
-         / \ /                        \
-         urn:example:animal:ferret:nose
-
-
-
-
-
-
-
-Berners-Lee, et al.         Standards Track                    [Page 16]
-
-RFC 3986                   URI Generic Syntax               January 2005
-
-
-3.1.  Scheme
-
-   Each URI begins with a scheme name that refers to a specification for
-   assigning identifiers within that scheme. As such, the URI syntax is
-   a federated and extensible naming system wherein each scheme's
-   specification may further restrict the syntax and semantics of
-   identifiers using that scheme.
-
-   Scheme names consist of a sequence of characters beginning with a
-   letter and followed by any combination of letters, digits, plus
-   ("+"), period ("."), or hyphen ("-"). Although schemes are case-
-   insensitive, the canonical form is lowercase and documents that
-   specify schemes must do so with lowercase letters. An implementation
-   should accept uppercase letters as equivalent to lowercase in scheme
-   names (e.g., allow "HTTP" as well as "http") for the sake of
-   robustness but should only produce lowercase scheme names for
-   consistency.
-
-      scheme      = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
-
-   Individual schemes are not specified by this document. The process
-   for registration of new URI schemes is defined separately by [BCP35].
-   The scheme registry maintains the mapping between scheme names and
-   their specifications. Advice for designers of new URI schemes can be
-   found in [RFC2718]. URI scheme specifications must define their own
-   syntax so that all strings matching their scheme-specific syntax will
-   also match the <absolute-URI> grammar, as described in Section 4.3.
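
The scheme rule above is small enough to check with a regular expression. The following is a minimal sketch, not part of the specification: the names SCHEME_RE, is_valid_scheme, and normalize_scheme are hypothetical, and the pattern is a direct transcription of the ABNF under the assumption that ALPHA and DIGIT mean the US-ASCII letters and digits.

```python
import re

# scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
# Hypothetical helper names; the pattern transcribes the ABNF rule.
SCHEME_RE = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*")


def is_valid_scheme(name: str) -> bool:
    """Return True if the whole string matches the scheme rule."""
    return SCHEME_RE.fullmatch(name) is not None


def normalize_scheme(name: str) -> str:
    """Accept either case on input, but produce the lowercase form."""
    if not is_valid_scheme(name):
        raise ValueError(f"not a valid scheme name: {name!r}")
    return name.lower()
```

Accepting "HTTP" while emitting "http" follows the robustness advice in the text: be liberal when reading scheme names and produce only lowercase ones.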
- - When presented with a URI that violates one or more scheme-specific - restrictions, the scheme-specific resolution process should flag the - reference as an error rather than ignore the unused parts; doing so - reduces the number of equivalent URIs and helps detect abuses of the - generic syntax, which might indicate that the URI has been - constructed to mislead the user (Section 7.6). - -3.2. Authority - - Many URI schemes include a hierarchical element for a naming - authority so that governance of the name space defined by the - remainder of the URI is delegated to that authority (which may, in - turn, delegate it further). The generic syntax provides a common - means for distinguishing an authority based on a registered name or - server address, along with optional port and user information. - - The authority component is preceded by a double slash ("//") and is - terminated by the next slash ("/"), question mark ("?"), or number - sign ("#") character, or by the end of the URI. - - - - -Berners-Lee, et al. Standards Track [Page 17] - -RFC 3986 URI Generic Syntax January 2005 - - - authority = [ userinfo "@" ] host [ ":" port ] - - URI producers and normalizers should omit the ":" delimiter that - separates host from port if the port component is empty. Some - schemes do not allow the userinfo and/or port subcomponents. - - If a URI contains an authority component, then the path component - must either be empty or begin with a slash ("/") character. Non- - validating parsers (those that merely separate a URI reference into - its major components) will often ignore the subcomponent structure of - authority, treating it as an opaque string from the double-slash to - the first terminating delimiter, until such time as the URI is - dereferenced. - -3.2.1. User Information - - The userinfo subcomponent may consist of a user name and, optionally, - scheme-specific information about how to gain authorization to access - the resource. 
The user information, if present, is followed by a - commercial at-sign ("@") that delimits it from the host. - - userinfo = *( unreserved / pct-encoded / sub-delims / ":" ) - - Use of the format "user:password" in the userinfo field is - deprecated. Applications should not render as clear text any data - after the first colon (":") character found within a userinfo - subcomponent unless the data after the colon is the empty string - (indicating no password). Applications may choose to ignore or - reject such data when it is received as part of a reference and - should reject the storage of such data in unencrypted form. The - passing of authentication information in clear text has proven to be - a security risk in almost every case where it has been used. - - Applications that render a URI for the sake of user feedback, such as - in graphical hypertext browsing, should render userinfo in a way that - is distinguished from the rest of a URI, when feasible. Such - rendering will assist the user in cases where the userinfo has been - misleadingly crafted to look like a trusted domain name - (Section 7.6). - -3.2.2. Host - - The host subcomponent of authority is identified by an IP literal - encapsulated within square brackets, an IPv4 address in dotted- - decimal form, or a registered name. The host subcomponent is case- - insensitive. The presence of a host subcomponent within a URI does - not imply that the scheme requires access to the given host on the - Internet. In many cases, the host syntax is used only for the sake - - - -Berners-Lee, et al. Standards Track [Page 18] - -RFC 3986 URI Generic Syntax January 2005 - - - of reusing the existing registration process created and deployed for - DNS, thus obtaining a globally unique name without the cost of - deploying another registry. However, such use comes with its own - costs: domain name ownership may change over time for reasons not - anticipated by the URI producer. 
In other cases, the data within the - host component identifies a registered name that has nothing to do - with an Internet host. We use the name "host" for the ABNF rule - because that is its most common purpose, not its only purpose. - - host = IP-literal / IPv4address / reg-name - - The syntax rule for host is ambiguous because it does not completely - distinguish between an IPv4address and a reg-name. In order to - disambiguate the syntax, we apply the "first-match-wins" algorithm: - If host matches the rule for IPv4address, then it should be - considered an IPv4 address literal and not a reg-name. Although host - is case-insensitive, producers and normalizers should use lowercase - for registered names and hexadecimal addresses for the sake of - uniformity, while only using uppercase letters for percent-encodings. - - A host identified by an Internet Protocol literal address, version 6 - [RFC3513] or later, is distinguished by enclosing the IP literal - within square brackets ("[" and "]"). This is the only place where - square bracket characters are allowed in the URI syntax. In - anticipation of future, as-yet-undefined IP literal address formats, - an implementation may use an optional version flag to indicate such a - format explicitly rather than rely on heuristic determination. - - IP-literal = "[" ( IPv6address / IPvFuture ) "]" - - IPvFuture = "v" 1*HEXDIG "." 1*( unreserved / sub-delims / ":" ) - - The version flag does not indicate the IP version; rather, it - indicates future versions of the literal format. As such, - implementations must not provide the version flag for the existing - IPv4 and IPv6 literal address forms described below. If a URI - containing an IP-literal that starts with "v" (case-insensitive), - indicating that the version flag is present, is dereferenced by an - application that does not know the meaning of that version flag, then - the application should return an appropriate error for "address - mechanism not supported". 
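
The "first-match-wins" rule and the bracketed IP-literal form can be sketched as a small classifier. This is an illustration under stated assumptions rather than a validating parser: classify_host and the pattern names are hypothetical, the dec-octet pattern follows the IPv4address grammar given in this section, the IPv6address rule is not checked at all (any bracketed host that is not an IPvFuture is simply reported as an IP-literal), and the characters of a reg-name are not validated.

```python
import re

# dec-octet: 0-9 / 10-99 / 100-199 / 200-249 / 250-255 (no leading zeros)
DEC_OCTET = r"(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])"
IPV4_RE = re.compile(rf"{DEC_OCTET}(?:\.{DEC_OCTET}){{3}}")

# IPvFuture = "v" 1*HEXDIG "." 1*( unreserved / sub-delims / ":" )
IPVFUTURE_RE = re.compile(r"[vV][0-9A-Fa-f]+\.[A-Za-z0-9\-._~!$&'()*+,;=:]+")


def classify_host(host: str) -> str:
    """Classify a host string; hypothetical helper, not a full validator."""
    if host.startswith("[") and host.endswith("]"):
        inner = host[1:-1]
        if IPVFUTURE_RE.fullmatch(inner):
            return "IPvFuture"
        # Assumption: IPv6address syntax is not verified in this sketch.
        return "IP-literal"
    # First-match-wins: try IPv4address before falling back to reg-name.
    if IPV4_RE.fullmatch(host):
        return "IPv4address"
    return "reg-name"
```

Under this ordering a string such as "256.0.2.16" is not rejected; it fails the IPv4address rule and is therefore treated as a registered name.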
- - A host identified by an IPv6 literal address is represented inside - the square brackets without a preceding version flag. The ABNF - provided here is a translation of the text definition of an IPv6 - literal address provided in [RFC3513]. This syntax does not support - IPv6 scoped addressing zone identifiers. - - - - -Berners-Lee, et al. Standards Track [Page 19] - -RFC 3986 URI Generic Syntax January 2005 - - - A 128-bit IPv6 address is divided into eight 16-bit pieces. Each - piece is represented numerically in case-insensitive hexadecimal, - using one to four hexadecimal digits (leading zeroes are permitted). - The eight encoded pieces are given most-significant first, separated - by colon characters. Optionally, the least-significant two pieces - may instead be represented in IPv4 address textual format. A - sequence of one or more consecutive zero-valued 16-bit pieces within - the address may be elided, omitting all their digits and leaving - exactly two consecutive colons in their place to mark the elision. - - IPv6address = 6( h16 ":" ) ls32 - / "::" 5( h16 ":" ) ls32 - / [ h16 ] "::" 4( h16 ":" ) ls32 - / [ *1( h16 ":" ) h16 ] "::" 3( h16 ":" ) ls32 - / [ *2( h16 ":" ) h16 ] "::" 2( h16 ":" ) ls32 - / [ *3( h16 ":" ) h16 ] "::" h16 ":" ls32 - / [ *4( h16 ":" ) h16 ] "::" ls32 - / [ *5( h16 ":" ) h16 ] "::" h16 - / [ *6( h16 ":" ) h16 ] "::" - - ls32 = ( h16 ":" h16 ) / IPv4address - ; least-significant 32 bits of address - - h16 = 1*4HEXDIG - ; 16 bits of address represented in hexadecimal - - A host identified by an IPv4 literal address is represented in - dotted-decimal notation (a sequence of four decimal numbers in the - range 0 to 255, separated by "."), as described in [RFC1123] by - reference to [RFC0952]. Note that other forms of dotted notation may - be interpreted on some platforms, as described in Section 7.4, but - only the dotted-decimal form of four octets is allowed by this - grammar. - - IPv4address = dec-octet "." dec-octet "." 
dec-octet "." dec-octet - - dec-octet = DIGIT ; 0-9 - / %x31-39 DIGIT ; 10-99 - / "1" 2DIGIT ; 100-199 - / "2" %x30-34 DIGIT ; 200-249 - / "25" %x30-35 ; 250-255 - - A host identified by a registered name is a sequence of characters - usually intended for lookup within a locally defined host or service - name registry, though the URI's scheme-specific semantics may require - that a specific registry (or fixed name table) be used instead. The - most common name registry mechanism is the Domain Name System (DNS). - A registered name intended for lookup in the DNS uses the syntax - - - -Berners-Lee, et al. Standards Track [Page 20] - -RFC 3986 URI Generic Syntax January 2005 - - - defined in Section 3.5 of [RFC1034] and Section 2.1 of [RFC1123]. - Such a name consists of a sequence of domain labels separated by ".", - each domain label starting and ending with an alphanumeric character - and possibly also containing "-" characters. The rightmost domain - label of a fully qualified domain name in DNS may be followed by a - single "." and should be if it is necessary to distinguish between - the complete domain name and some local domain. - - reg-name = *( unreserved / pct-encoded / sub-delims ) - - If the URI scheme defines a default for host, then that default - applies when the host subcomponent is undefined or when the - registered name is empty (zero length). For example, the "file" URI - scheme is defined so that no authority, an empty host, and - "localhost" all mean the end-user's machine, whereas the "http" - scheme considers a missing authority or empty host invalid. - - This specification does not mandate a particular registered name - lookup technology and therefore does not restrict the syntax of reg- - name beyond what is necessary for interoperability. 
Instead, it - delegates the issue of registered name syntax conformance to the - operating system of each application performing URI resolution, and - that operating system decides what it will allow for the purpose of - host identification. A URI resolution implementation might use DNS, - host tables, yellow pages, NetInfo, WINS, or any other system for - lookup of registered names. However, a globally scoped naming - system, such as DNS fully qualified domain names, is necessary for - URIs intended to have global scope. URI producers should use names - that conform to the DNS syntax, even when use of DNS is not - immediately apparent, and should limit these names to no more than - 255 characters in length. - - The reg-name syntax allows percent-encoded octets in order to - represent non-ASCII registered names in a uniform way that is - independent of the underlying name resolution technology. Non-ASCII - characters must first be encoded according to UTF-8 [STD63], and then - each octet of the corresponding UTF-8 sequence must be percent- - encoded to be represented as URI characters. URI producing - applications must not use percent-encoding in host unless it is used - to represent a UTF-8 character sequence. When a non-ASCII registered - name represents an internationalized domain name intended for - resolution via the DNS, the name must be transformed to the IDNA - encoding [RFC3490] prior to name lookup. URI producers should - provide these registered names in the IDNA encoding, rather than a - percent-encoding, if they wish to maximize interoperability with - legacy URI resolvers. - - - - - -Berners-Lee, et al. Standards Track [Page 21] - -RFC 3986 URI Generic Syntax January 2005 - - -3.2.3. Port - - The port subcomponent of authority is designated by an optional port - number in decimal following the host and delimited from it by a - single colon (":") character. - - port = *DIGIT - - A scheme may define a default port. 
For example, the "http" scheme - defines a default port of "80", corresponding to its reserved TCP - port number. The type of port designated by the port number (e.g., - TCP, UDP, SCTP) is defined by the URI scheme. URI producers and - normalizers should omit the port component and its ":" delimiter if - port is empty or if its value would be the same as that of the - scheme's default. - -3.3. Path - - The path component contains data, usually organized in hierarchical - form, that, along with data in the non-hierarchical query component - (Section 3.4), serves to identify a resource within the scope of the - URI's scheme and naming authority (if any). The path is terminated - by the first question mark ("?") or number sign ("#") character, or - by the end of the URI. - - If a URI contains an authority component, then the path component - must either be empty or begin with a slash ("/") character. If a URI - does not contain an authority component, then the path cannot begin - with two slash characters ("//"). In addition, a URI reference - (Section 4.1) may be a relative-path reference, in which case the - first path segment cannot contain a colon (":") character. The ABNF - requires five separate rules to disambiguate these cases, only one of - which will match the path substring within a given URI reference. We - use the generic term "path component" to describe the URI substring - matched by the parser to one of these rules. - - path = path-abempty ; begins with "/" or is empty - / path-absolute ; begins with "/" but not "//" - / path-noscheme ; begins with a non-colon segment - / path-rootless ; begins with a segment - / path-empty ; zero characters - - path-abempty = *( "/" segment ) - path-absolute = "/" [ segment-nz *( "/" segment ) ] - path-noscheme = segment-nz-nc *( "/" segment ) - path-rootless = segment-nz *( "/" segment ) - path-empty = 0<pchar> - - - - -Berners-Lee, et al. 
Standards Track [Page 22] - -RFC 3986 URI Generic Syntax January 2005 - - - segment = *pchar - segment-nz = 1*pchar - segment-nz-nc = 1*( unreserved / pct-encoded / sub-delims / "@" ) - ; non-zero-length segment without any colon ":" - - pchar = unreserved / pct-encoded / sub-delims / ":" / "@" - - A path consists of a sequence of path segments separated by a slash - ("/") character. A path is always defined for a URI, though the - defined path may be empty (zero length). Use of the slash character - to indicate hierarchy is only required when a URI will be used as the - context for relative references. For example, the URI - <mailto:fred@example.com> has a path of "fred@example.com", whereas - the URI <foo://info.example.com?fred> has an empty path. - - The path segments "." and "..", also known as dot-segments, are - defined for relative reference within the path name hierarchy. They - are intended for use at the beginning of a relative-path reference - (Section 4.2) to indicate relative position within the hierarchical - tree of names. This is similar to their role within some operating - systems' file directory structures to indicate the current directory - and parent directory, respectively. However, unlike in a file - system, these dot-segments are only interpreted within the URI path - hierarchy and are removed as part of the resolution process (Section - 5.2). - - Aside from dot-segments in hierarchical paths, a path segment is - considered opaque by the generic syntax. URI producing applications - often use the reserved characters allowed in a segment to delimit - scheme-specific or dereference-handler-specific subcomponents. For - example, the semicolon (";") and equals ("=") reserved characters are - often used to delimit parameters and parameter values applicable to - that segment. The comma (",") reserved character is often used for - similar purposes. 
For example, one URI producer might use a segment - such as "name;v=1.1" to indicate a reference to version 1.1 of - "name", whereas another might use a segment such as "name,1.1" to - indicate the same. Parameter types may be defined by scheme-specific - semantics, but in most cases the syntax of a parameter is specific to - the implementation of the URI's dereferencing algorithm. - -3.4. Query - - The query component contains non-hierarchical data that, along with - data in the path component (Section 3.3), serves to identify a - resource within the scope of the URI's scheme and naming authority - (if any). The query component is indicated by the first question - mark ("?") character and terminated by a number sign ("#") character - or by the end of the URI. - - - -Berners-Lee, et al. Standards Track [Page 23] - -RFC 3986 URI Generic Syntax January 2005 - - - query = *( pchar / "/" / "?" ) - - The characters slash ("/") and question mark ("?") may represent data - within the query component. Beware that some older, erroneous - implementations may not handle such data correctly when it is used as - the base URI for relative references (Section 5.1), apparently - because they fail to distinguish query data from path data when - looking for hierarchical separators. However, as query components - are often used to carry identifying information in the form of - "key=value" pairs and one frequently used value is a reference to - another URI, it is sometimes better for usability to avoid percent- - encoding those characters. - -3.5. Fragment - - The fragment identifier component of a URI allows indirect - identification of a secondary resource by reference to a primary - resource and additional identifying information. The identified - secondary resource may be some portion or subset of the primary - resource, some view on representations of the primary resource, or - some other resource defined or described by those representations. 
A - fragment identifier component is indicated by the presence of a - number sign ("#") character and terminated by the end of the URI. - - fragment = *( pchar / "/" / "?" ) - - The semantics of a fragment identifier are defined by the set of - representations that might result from a retrieval action on the - primary resource. The fragment's format and resolution is therefore - dependent on the media type [RFC2046] of a potentially retrieved - representation, even though such a retrieval is only performed if the - URI is dereferenced. If no such representation exists, then the - semantics of the fragment are considered unknown and are effectively - unconstrained. Fragment identifier semantics are independent of the - URI scheme and thus cannot be redefined by scheme specifications. - - Individual media types may define their own restrictions on or - structures within the fragment identifier syntax for specifying - different types of subsets, views, or external references that are - identifiable as secondary resources by that media type. If the - primary resource has multiple representations, as is often the case - for resources whose representation is selected based on attributes of - the retrieval request (a.k.a., content negotiation), then whatever is - identified by the fragment should be consistent across all of those - representations. Each representation should either define the - fragment so that it corresponds to the same secondary resource, - regardless of how it is represented, or should leave the fragment - undefined (i.e., not found). - - - -Berners-Lee, et al. Standards Track [Page 24] - -RFC 3986 URI Generic Syntax January 2005 - - - As with any URI, use of a fragment identifier component does not - imply that a retrieval action will take place. A URI with a fragment - identifier may be used to refer to the secondary resource without any - implication that the primary resource is accessible or will ever be - accessed. 
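
Because the fragment starts at the first number sign and runs to the end of the URI, it can be separated with a single split. The sketch below assumes a hypothetical helper name, split_fragment, and returns None for a missing fragment so that an absent fragment stays distinguishable from an empty one.

```python
def split_fragment(uri_reference: str):
    """Split at the first "#"; return (rest, fragment).

    The fragment is None when no "#" is present and "" when the
    reference ends in a bare "#".
    """
    rest, separator, fragment = uri_reference.partition("#")
    return rest, (fragment if separator else None)
```

Note that "/" and "?" after the number sign belong to the fragment, as the fragment rule allows, so they are left untouched in the second element.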
- - Fragment identifiers have a special role in information retrieval - systems as the primary form of client-side indirect referencing, - allowing an author to specifically identify aspects of an existing - resource that are only indirectly provided by the resource owner. As - such, the fragment identifier is not used in the scheme-specific - processing of a URI; instead, the fragment identifier is separated - from the rest of the URI prior to a dereference, and thus the - identifying information within the fragment itself is dereferenced - solely by the user agent, regardless of the URI scheme. Although - this separate handling is often perceived to be a loss of - information, particularly for accurate redirection of references as - resources move over time, it also serves to prevent information - providers from denying reference authors the right to refer to - information within a resource selectively. Indirect referencing also - provides additional flexibility and extensibility to systems that use - URIs, as new media types are easier to define and deploy than new - schemes of identification. - - The characters slash ("/") and question mark ("?") are allowed to - represent data within the fragment identifier. Beware that some - older, erroneous implementations may not handle this data correctly - when it is used as the base URI for relative references (Section - 5.1). - -4. Usage - - When applications make reference to a URI, they do not always use the - full form of reference defined by the "URI" syntax rule. To save - space and take advantage of hierarchical locality, many Internet - protocol elements and media type formats allow an abbreviation of a - URI, whereas others restrict the syntax to a particular form of URI. - We define the most common forms of reference syntax in this - specification because they impact and depend upon the design of the - generic syntax, requiring a uniform parsing algorithm in order to be - interpreted consistently. - -4.1. 
URI Reference - - URI-reference is used to denote the most common usage of a resource - identifier. - - URI-reference = URI / relative-ref - - - -Berners-Lee, et al. Standards Track [Page 25] - -RFC 3986 URI Generic Syntax January 2005 - - - A URI-reference is either a URI or a relative reference. If the - URI-reference's prefix does not match the syntax of a scheme followed - by its colon separator, then the URI-reference is a relative - reference. - - A URI-reference is typically parsed first into the five URI - components, in order to determine what components are present and - whether the reference is relative. Then, each component is parsed - for its subparts and their validation. The ABNF of URI-reference, - along with the "first-match-wins" disambiguation rule, is sufficient - to define a validating parser for the generic syntax. Readers - familiar with regular expressions should see Appendix B for an - example of a non-validating URI-reference parser that will take any - given string and extract the URI components. - -4.2. Relative Reference - - A relative reference takes advantage of the hierarchical syntax - (Section 1.2.3) to express a URI reference relative to the name space - of another hierarchical URI. - - relative-ref = relative-part [ "?" query ] [ "#" fragment ] - - relative-part = "//" authority path-abempty - / path-absolute - / path-noscheme - / path-empty - - The URI referred to by a relative reference, also known as the target - URI, is obtained by applying the reference resolution algorithm of - Section 5. - - A relative reference that begins with two slash characters is termed - a network-path reference; such references are rarely used. A - relative reference that begins with a single slash character is - termed an absolute-path reference. A relative reference that does - not begin with a slash character is termed a relative-path reference. 
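
The two-step approach described above (split into the five components, then decide what kind of reference it is) can be sketched as follows. The regular expression is the non-validating one given in Appendix B of RFC 3986; the function names split_reference and classify_reference are hypothetical. Because the expression is non-validating, its scheme group is looser than the scheme ABNF, so this sketch should not be read as a conformance check.

```python
import re

# Non-validating expression from RFC 3986, Appendix B.
URI_REF_RE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)


def split_reference(ref: str) -> dict:
    """Split a URI-reference into its five components (None if absent)."""
    m = URI_REF_RE.match(ref)  # every group is optional, so this always matches
    return {
        "scheme": m.group(2),
        "authority": m.group(4),
        "path": m.group(5),
        "query": m.group(7),
        "fragment": m.group(9),
    }


def classify_reference(ref: str) -> str:
    """Name the form of reference using the terms of Sections 4.1 and 4.2."""
    parts = split_reference(ref)
    if parts["scheme"] is not None:
        return "URI"
    if parts["authority"] is not None:
        return "network-path reference"
    if parts["path"].startswith("/"):
        return "absolute-path reference"
    return "relative-path reference"
```

For example, split_reference applied to the first example URI of Section 3 yields the scheme "foo", the authority "example.com:8042", the path "/over/there", the query "name=ferret", and the fragment "nose".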
- - A path segment that contains a colon character (e.g., "this:that") - cannot be used as the first segment of a relative-path reference, as - it would be mistaken for a scheme name. Such a segment must be - preceded by a dot-segment (e.g., "./this:that") to make a relative- - path reference. - - - - - - - - -Berners-Lee, et al. Standards Track [Page 26] - -RFC 3986 URI Generic Syntax January 2005 - - -4.3. Absolute URI - - Some protocol elements allow only the absolute form of a URI without - a fragment identifier. For example, defining a base URI for later - use by relative references calls for an absolute-URI syntax rule that - does not allow a fragment. - - absolute-URI = scheme ":" hier-part [ "?" query ] - - URI scheme specifications must define their own syntax so that all - strings matching their scheme-specific syntax will also match the - <absolute-URI> grammar. Scheme specifications will not define - fragment identifier syntax or usage, regardless of its applicability - to resources identifiable via that scheme, as fragment identification - is orthogonal to scheme definition. However, scheme specifications - are encouraged to include a wide range of examples, including - examples that show use of the scheme's URIs with fragment identifiers - when such usage is appropriate. - -4.4. Same-Document Reference - - When a URI reference refers to a URI that is, aside from its fragment - component (if any), identical to the base URI (Section 5.1), that - reference is called a "same-document" reference. The most frequent - examples of same-document references are relative references that are - empty or include only the number sign ("#") separator followed by a - fragment identifier. - - When a same-document reference is dereferenced for a retrieval - action, the target of that reference is defined to be within the same - entity (representation, document, or message) as the reference; - therefore, a dereference should not result in a new retrieval action. 
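A same-document test can be sketched as follows. This is a minimal sketch, assuming Python's standard urllib.parse module is an acceptable stand-in for the resolution algorithm of Section 5; the function name is_same_document is hypothetical, and no normalization is applied before the comparison.

```python
from urllib.parse import urldefrag, urljoin


def is_same_document(base: str, reference: str) -> bool:
    """Return True if the reference targets the base URI itself,
    ignoring any fragment component (hypothetical helper)."""
    target = urljoin(base, reference)
    # Compare the two URIs with their fragments stripped.
    return urldefrag(target).url == urldefrag(base).url
```

With a base of "http://a/b/c/d;p?q", the references "" and "#s" are reported as same-document references, while "g" is not.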
- - Normalization of the base and target URIs prior to their comparison, - as described in Sections 6.2.2 and 6.2.3, is allowed but rarely - performed in practice. Normalization may increase the set of same- - document references, which may be of benefit to some caching - applications. As such, reference authors should not assume that a - slightly different, though equivalent, reference URI will (or will - not) be interpreted as a same-document reference by any given - application. - -4.5. Suffix Reference - - The URI syntax is designed for unambiguous reference to resources and - extensibility via the URI scheme. However, as URI identification and - usage have become commonplace, traditional media (television, radio, - newspapers, billboards, etc.) have increasingly used a suffix of the - - - -Berners-Lee, et al. Standards Track [Page 27] - -RFC 3986 URI Generic Syntax January 2005 - - - URI as a reference, consisting of only the authority and path - portions of the URI, such as - - www.w3.org/Addressing/ - - or simply a DNS registered name on its own. Such references are - primarily intended for human interpretation rather than for machines, - with the assumption that context-based heuristics are sufficient to - complete the URI (e.g., most registered names beginning with "www" - are likely to have a URI prefix of "http://"). Although there is no - standard set of heuristics for disambiguating a URI suffix, many - client implementations allow them to be entered by the user and - heuristically resolved. - - Although this practice of using suffix references is common, it - should be avoided whenever possible and should never be used in - situations where long-term references are expected. The heuristics - noted above will change over time, particularly when a new URI scheme - becomes popular, and are often incorrect when used out of context. - Furthermore, they can lead to security issues along the lines of - those described in [RFC1535]. 
- - As a URI suffix has the same syntax as a relative-path reference, a - suffix reference cannot be used in contexts where a relative - reference is expected. As a result, suffix references are limited to - places where there is no defined base URI, such as dialog boxes and - off-line advertisements. - -5. Reference Resolution - - This section defines the process of resolving a URI reference within - a context that allows relative references so that the result is a - string matching the <URI> syntax rule of Section 3. - -5.1. Establishing a Base URI - - The term "relative" implies that a "base URI" exists against which - the relative reference is applied. Aside from fragment-only - references (Section 4.4), relative references are only usable when a - base URI is known. A base URI must be established by the parser - prior to parsing URI references that might be relative. A base URI - must conform to the <absolute-URI> syntax rule (Section 4.3). If the - base URI is obtained from a URI reference, then that reference must - be converted to absolute form and stripped of any fragment component - prior to its use as a base URI. - - - - - - -Berners-Lee, et al. Standards Track [Page 28] - -RFC 3986 URI Generic Syntax January 2005 - - - The base URI of a reference can be established in one of four ways, - discussed below in order of precedence. The order of precedence can - be thought of in terms of layers, where the innermost defined base - URI has the highest precedence. This can be visualized graphically - as follows: - - .----------------------------------------------------------. - | .----------------------------------------------------. | - | | .----------------------------------------------. | | - | | | .----------------------------------------. | | | - | | | | .----------------------------------. 
| | | | - | | | | | <relative-reference> | | | | | - | | | | `----------------------------------' | | | | - | | | | (5.1.1) Base URI embedded in content | | | | - | | | `----------------------------------------' | | | - | | | (5.1.2) Base URI of the encapsulating entity | | | - | | | (message, representation, or none) | | | - | | `----------------------------------------------' | | - | | (5.1.3) URI used to retrieve the entity | | - | `----------------------------------------------------' | - | (5.1.4) Default Base URI (application-dependent) | - `----------------------------------------------------------' - -5.1.1. Base URI Embedded in Content - - Within certain media types, a base URI for relative references can be - embedded within the content itself so that it can be readily obtained - by a parser. This can be useful for descriptive documents, such as - tables of contents, which may be transmitted to others through - protocols other than their usual retrieval context (e.g., email or - USENET news). - - It is beyond the scope of this specification to specify how, for each - media type, a base URI can be embedded. The appropriate syntax, when - available, is described by the data format specification associated - with each media type. - -5.1.2. Base URI from the Encapsulating Entity - - If no base URI is embedded, the base URI is defined by the - representation's retrieval context. For a document that is enclosed - within another entity, such as a message or archive, the retrieval - context is that entity. Thus, the default base URI of a - representation is the base URI of the entity in which the - representation is encapsulated. - - - - - - -Berners-Lee, et al. Standards Track [Page 29] - -RFC 3986 URI Generic Syntax January 2005 - - - A mechanism for embedding a base URI within MIME container types - (e.g., the message and multipart types) is defined by MHTML - [RFC2557]. 
Protocols that do not use the MIME message header syntax, - but that do allow some form of tagged metadata to be included within - messages, may define their own syntax for defining a base URI as part - of a message. - -5.1.3. Base URI from the Retrieval URI - - If no base URI is embedded and the representation is not encapsulated - within some other entity, then, if a URI was used to retrieve the - representation, that URI shall be considered the base URI. Note that - if the retrieval was the result of a redirected request, the last URI - used (i.e., the URI that resulted in the actual retrieval of the - representation) is the base URI. - -5.1.4. Default Base URI - - If none of the conditions described above apply, then the base URI is - defined by the context of the application. As this definition is - necessarily application-dependent, failing to define a base URI by - using one of the other methods may result in the same content being - interpreted differently by different types of applications. - - A sender of a representation containing relative references is - responsible for ensuring that a base URI for those references can be - established. Aside from fragment-only references, relative - references can only be used reliably in situations where the base URI - is well defined. - -5.2. Relative Resolution - - This section describes an algorithm for converting a URI reference - that might be relative to a given base URI into the parsed components - of the reference's target. The components can then be recomposed, as - described in Section 5.3, to form the target URI. This algorithm - provides definitive results that can be used to test the output of - other implementations. Applications may implement relative reference - resolution by using some other algorithm, provided that the results - match what would be given by this one. - - - - - - - - - - - -Berners-Lee, et al. Standards Track [Page 30] - -RFC 3986 URI Generic Syntax January 2005 - - -5.2.1. 
Pre-parse the Base URI - - The base URI (Base) is established according to the procedure of - Section 5.1 and parsed into the five main components described in - Section 3. Note that only the scheme component is required to be - present in a base URI; the other components may be empty or - undefined. A component is undefined if its associated delimiter does - not appear in the URI reference; the path component is never - undefined, though it may be empty. - - Normalization of the base URI, as described in Sections 6.2.2 and - 6.2.3, is optional. A URI reference must be transformed to its - target URI before it can be normalized. - -5.2.2. Transform References - - For each URI reference (R), the following pseudocode describes an - algorithm for transforming R into its target URI (T): - - -- The URI reference is parsed into the five URI components - -- - (R.scheme, R.authority, R.path, R.query, R.fragment) = parse(R); - - -- A non-strict parser may ignore a scheme in the reference - -- if it is identical to the base URI's scheme. - -- - if ((not strict) and (R.scheme == Base.scheme)) then - undefine(R.scheme); - endif; - - - - - - - - - - - - - - - - - - - - - - -Berners-Lee, et al. 
Standards Track [Page 31] - -RFC 3986 URI Generic Syntax January 2005 - - - if defined(R.scheme) then - T.scheme = R.scheme; - T.authority = R.authority; - T.path = remove_dot_segments(R.path); - T.query = R.query; - else - if defined(R.authority) then - T.authority = R.authority; - T.path = remove_dot_segments(R.path); - T.query = R.query; - else - if (R.path == "") then - T.path = Base.path; - if defined(R.query) then - T.query = R.query; - else - T.query = Base.query; - endif; - else - if (R.path starts-with "/") then - T.path = remove_dot_segments(R.path); - else - T.path = merge(Base.path, R.path); - T.path = remove_dot_segments(T.path); - endif; - T.query = R.query; - endif; - T.authority = Base.authority; - endif; - T.scheme = Base.scheme; - endif; - - T.fragment = R.fragment; - -5.2.3. Merge Paths - - The pseudocode above refers to a "merge" routine for merging a - relative-path reference with the path of the base URI. This is - accomplished as follows: - - o If the base URI has a defined authority component and an empty - path, then return a string consisting of "/" concatenated with the - reference's path; otherwise, - - - - - - - - -Berners-Lee, et al. Standards Track [Page 32] - -RFC 3986 URI Generic Syntax January 2005 - - - o return a string consisting of the reference's path component - appended to all but the last segment of the base URI's path (i.e., - excluding any characters after the right-most "/" in the base URI - path, or excluding the entire base URI path if it does not contain - any "/" characters). - -5.2.4. Remove Dot Segments - - The pseudocode also refers to a "remove_dot_segments" routine for - interpreting and removing the special "." and ".." complete path - segments from a referenced path. This is done after the path is - extracted from a reference, whether or not the path was relative, in - order to remove any invalid or extraneous dot-segments prior to - forming the target URI. 
Although there are many ways to accomplish - this removal process, we describe a simple method using two string - buffers. - - 1. The input buffer is initialized with the now-appended path - components and the output buffer is initialized to the empty - string. - - 2. While the input buffer is not empty, loop as follows: - - A. If the input buffer begins with a prefix of "../" or "./", - then remove that prefix from the input buffer; otherwise, - - B. if the input buffer begins with a prefix of "/./" or "/.", - where "." is a complete path segment, then replace that - prefix with "/" in the input buffer; otherwise, - - C. if the input buffer begins with a prefix of "/../" or "/..", - where ".." is a complete path segment, then replace that - prefix with "/" in the input buffer and remove the last - segment and its preceding "/" (if any) from the output - buffer; otherwise, - - D. if the input buffer consists only of "." or "..", then remove - that from the input buffer; otherwise, - - E. move the first path segment in the input buffer to the end of - the output buffer, including the initial "/" character (if - any) and any subsequent characters up to, but not including, - the next "/" character or the end of the input buffer. - - 3. Finally, the output buffer is returned as the result of - remove_dot_segments. - - - - - -Berners-Lee, et al. Standards Track [Page 33] - -RFC 3986 URI Generic Syntax January 2005 - - - Note that dot-segments are intended for use in URI references to - express an identifier relative to the hierarchy of names in the base - URI. The remove_dot_segments algorithm respects that hierarchy by - removing extra dot-segments rather than treat them as an error or - leaving them to be misinterpreted by dereference implementations. - - The following illustrates how the above steps are applied for two - examples of merged paths, showing the state of the two buffers after - each step. 
- - STEP OUTPUT BUFFER INPUT BUFFER - - 1 : /a/b/c/./../../g - 2E: /a /b/c/./../../g - 2E: /a/b /c/./../../g - 2E: /a/b/c /./../../g - 2B: /a/b/c /../../g - 2C: /a/b /../g - 2C: /a /g - 2E: /a/g - - STEP OUTPUT BUFFER INPUT BUFFER - - 1 : mid/content=5/../6 - 2E: mid /content=5/../6 - 2E: mid/content=5 /../6 - 2C: mid /6 - 2E: mid/6 - - Some applications may find it more efficient to implement the - remove_dot_segments algorithm by using two segment stacks rather than - strings. - - Note: Beware that some older, erroneous implementations will fail - to separate a reference's query component from its path component - prior to merging the base and reference paths, resulting in an - interoperability failure if the query component contains the - strings "/../" or "/./". - - - - - - - - - - - - - -Berners-Lee, et al. Standards Track [Page 34] - -RFC 3986 URI Generic Syntax January 2005 - - -5.3. Component Recomposition - - Parsed URI components can be recomposed to obtain the corresponding - URI reference string. Using pseudocode, this would be: - - result = "" - - if defined(scheme) then - append scheme to result; - append ":" to result; - endif; - - if defined(authority) then - append "//" to result; - append authority to result; - endif; - - append path to result; - - if defined(query) then - append "?" to result; - append query to result; - endif; - - if defined(fragment) then - append "#" to result; - append fragment to result; - endif; - - return result; - - Note that we are careful to preserve the distinction between a - component that is undefined, meaning that its separator was not - present in the reference, and a component that is empty, meaning that - the separator was present and was immediately followed by the next - component separator or the end of the reference. - -5.4. Reference Resolution Examples - - Within a representation with a well defined base URI of - - http://a/b/c/d;p?q - - a relative reference is transformed to its target URI as follows. 
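The pseudocode of Sections 5.2.2 through 5.3 can be sketched in Python as follows. This is an illustrative sketch rather than part of the specification: the function names parse, merge, remove_dot_segments, recompose, and resolve mirror the pseudocode but are otherwise assumptions, an undefined component is represented by None, and the splitting expression is the non-validating one from Appendix B, so no validation is performed.

```python
import re

# Non-validating component splitter in the style of Appendix B.
# Groups: scheme, authority, path, query, fragment (None = undefined).
_SPLIT = re.compile(
    r"^(?:([^:/?#]+):)?(?://([^/?#]*))?([^?#]*)(?:\?([^#]*))?(?:#(.*))?",
    re.DOTALL,
)


def parse(ref):
    return _SPLIT.match(ref).groups()


def remove_dot_segments(path):
    inp, out = path, []          # out holds segments with their leading "/"
    while inp:
        if inp.startswith("../"):            # step 2A
            inp = inp[3:]
        elif inp.startswith("./"):           # step 2A
            inp = inp[2:]
        elif inp.startswith("/./"):          # step 2B
            inp = inp[2:]
        elif inp == "/.":                    # step 2B
            inp = "/"
        elif inp.startswith("/../") or inp == "/..":   # step 2C
            inp = "/" + inp[4:]
            if out:
                out.pop()
        elif inp in (".", ".."):             # step 2D
            inp = ""
        else:                                # step 2E
            cut = inp.find("/", 1)
            if cut == -1:
                cut = len(inp)
            out.append(inp[:cut])
            inp = inp[cut:]
    return "".join(out)


def merge(base_authority, base_path, ref_path):
    if base_authority is not None and base_path == "":
        return "/" + ref_path
    # Keep everything up to and including the right-most "/".
    return base_path[: base_path.rfind("/") + 1] + ref_path


def recompose(scheme, authority, path, query, fragment):
    result = ""
    if scheme is not None:
        result += scheme + ":"
    if authority is not None:
        result += "//" + authority
    result += path
    if query is not None:
        result += "?" + query
    if fragment is not None:
        result += "#" + fragment
    return result


def resolve(base, ref, strict=True):
    b_scheme, b_auth, b_path, b_query, _ = parse(base)
    scheme, auth, path, query, fragment = parse(ref)
    if not strict and scheme == b_scheme:
        scheme = None            # non-strict parsers ignore a matching scheme
    if scheme is not None:
        path = remove_dot_segments(path)
    else:
        scheme = b_scheme
        if auth is not None:
            path = remove_dot_segments(path)
        else:
            auth = b_auth
            if path == "":
                path = b_path
                if query is None:
                    query = b_query
            elif path.startswith("/"):
                path = remove_dot_segments(path)
            else:
                path = remove_dot_segments(merge(b_auth, b_path, path))
    return recompose(scheme, auth, path, query, fragment)
```

For instance, resolve("http://a/b/c/d;p?q", "../g") yields "http://a/b/g", and passing strict=False turns "http:g" into "http://a/b/c/g" instead of leaving it as "http:g".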
- - - - - - - -Berners-Lee, et al. Standards Track [Page 35] - -RFC 3986 URI Generic Syntax January 2005 - - -5.4.1. Normal Examples - - "g:h" = "g:h" - "g" = "http://a/b/c/g" - "./g" = "http://a/b/c/g" - "g/" = "http://a/b/c/g/" - "/g" = "http://a/g" - "//g" = "http://g" - "?y" = "http://a/b/c/d;p?y" - "g?y" = "http://a/b/c/g?y" - "#s" = "http://a/b/c/d;p?q#s" - "g#s" = "http://a/b/c/g#s" - "g?y#s" = "http://a/b/c/g?y#s" - ";x" = "http://a/b/c/;x" - "g;x" = "http://a/b/c/g;x" - "g;x?y#s" = "http://a/b/c/g;x?y#s" - "" = "http://a/b/c/d;p?q" - "." = "http://a/b/c/" - "./" = "http://a/b/c/" - ".." = "http://a/b/" - "../" = "http://a/b/" - "../g" = "http://a/b/g" - "../.." = "http://a/" - "../../" = "http://a/" - "../../g" = "http://a/g" - -5.4.2. Abnormal Examples - - Although the following abnormal examples are unlikely to occur in - normal practice, all URI parsers should be capable of resolving them - consistently. Each example uses the same base as that above. - - Parsers must be careful in handling cases where there are more ".." - segments in a relative-path reference than there are hierarchical - levels in the base URI's path. Note that the ".." syntax cannot be - used to change the authority component of a URI. - - "../../../g" = "http://a/g" - "../../../../g" = "http://a/g" - - - - - - - - - - - - -Berners-Lee, et al. Standards Track [Page 36] - -RFC 3986 URI Generic Syntax January 2005 - - - Similarly, parsers must remove the dot-segments "." and ".." when - they are complete components of a path, but not when they are only - part of a segment. - - "/./g" = "http://a/g" - "/../g" = "http://a/g" - "g." = "http://a/b/c/g." - ".g" = "http://a/b/c/.g" - "g.." = "http://a/b/c/g.." - "..g" = "http://a/b/c/..g" - - Less likely are cases where the relative reference uses unnecessary - or nonsensical forms of the "." and ".." complete path segments. - - "./../g" = "http://a/b/g" - "./g/." 
= "http://a/b/c/g/" - "g/./h" = "http://a/b/c/g/h" - "g/../h" = "http://a/b/c/h" - "g;x=1/./y" = "http://a/b/c/g;x=1/y" - "g;x=1/../y" = "http://a/b/c/y" - - Some applications fail to separate the reference's query and/or - fragment components from the path component before merging it with - the base path and removing dot-segments. This error is rarely - noticed, as typical usage of a fragment never includes the hierarchy - ("/") character and the query component is not normally used within - relative references. - - "g?y/./x" = "http://a/b/c/g?y/./x" - "g?y/../x" = "http://a/b/c/g?y/../x" - "g#s/./x" = "http://a/b/c/g#s/./x" - "g#s/../x" = "http://a/b/c/g#s/../x" - - Some parsers allow the scheme name to be present in a relative - reference if it is the same as the base URI scheme. This is - considered to be a loophole in prior specifications of partial URI - [RFC1630]. Its use should be avoided but is allowed for backward - compatibility. - - "http:g" = "http:g" ; for strict parsers - / "http://a/b/c/g" ; for backward compatibility - - - - - - - - - - -Berners-Lee, et al. Standards Track [Page 37] - -RFC 3986 URI Generic Syntax January 2005 - - -6. Normalization and Comparison - - One of the most common operations on URIs is simple comparison: - determining whether two URIs are equivalent without using the URIs to - access their respective resource(s). A comparison is performed every - time a response cache is accessed, a browser checks its history to - color a link, or an XML parser processes tags within a namespace. - Extensive normalization prior to comparison of URIs is often used by - spiders and indexing engines to prune a search space or to reduce - duplication of request actions and response storage. - - URI comparison is performed for some particular purpose. 
Protocols - or implementations that compare URIs for different purposes will - often be subject to differing design trade-offs in regards to how - much effort should be spent in reducing aliased identifiers. This - section describes various methods that may be used to compare URIs, - the trade-offs between them, and the types of applications that might - use them. - -6.1. Equivalence - - Because URIs exist to identify resources, presumably they should be - considered equivalent when they identify the same resource. However, - this definition of equivalence is not of much practical use, as there - is no way for an implementation to compare two resources unless it - has full knowledge or control of them. For this reason, - determination of equivalence or difference of URIs is based on string - comparison, perhaps augmented by reference to additional rules - provided by URI scheme definitions. We use the terms "different" and - "equivalent" to describe the possible outcomes of such comparisons, - but there are many application-dependent versions of equivalence. - - Even though it is possible to determine that two URIs are equivalent, - URI comparison is not sufficient to determine whether two URIs - identify different resources. For example, an owner of two different - domain names could decide to serve the same resource from both, - resulting in two different URIs. Therefore, comparison methods are - designed to minimize false negatives while strictly avoiding false - positives. - - In testing for equivalence, applications should not directly compare - relative references; the references should be converted to their - respective target URIs before comparison. When URIs are compared to - select (or avoid) a network action, such as retrieval of a - representation, fragment components (if any) should be excluded from - the comparison. - - - - - -Berners-Lee, et al. Standards Track [Page 38] - -RFC 3986 URI Generic Syntax January 2005 - - -6.2. 
Comparison Ladder - - A variety of methods are used in practice to test URI equivalence. - These methods fall into a range, distinguished by the amount of - processing required and the degree to which the probability of false - negatives is reduced. As noted above, false negatives cannot be - eliminated. In practice, their probability can be reduced, but this - reduction requires more processing and is not cost-effective for all - applications. - - If this range of comparison practices is considered as a ladder, the - following discussion will climb the ladder, starting with practices - that are cheap but have a relatively higher chance of producing false - negatives, and proceeding to those that have higher computational - cost and lower risk of false negatives. - -6.2.1. Simple String Comparison - - If two URIs, when considered as character strings, are identical, - then it is safe to conclude that they are equivalent. This type of - equivalence test has very low computational cost and is in wide use - in a variety of applications, particularly in the domain of parsing. - - Testing strings for equivalence requires some basic precautions. - This procedure is often referred to as "bit-for-bit" or - "byte-for-byte" comparison, which is potentially misleading. Testing - strings for equality is normally based on pair comparison of the - characters that make up the strings, starting from the first and - proceeding until both strings are exhausted and all characters are - found to be equal, until a pair of characters compares unequal, or - until one of the strings is exhausted before the other. - - This character comparison requires that each pair of characters be - put in comparable form. For example, should one URI be stored in a - byte array in EBCDIC encoding and the second in a Java String object - (UTF-16), bit-for-bit comparisons applied naively will produce - errors. 
It is better to speak of equality on a character-for- - character basis rather than on a byte-for-byte or bit-for-bit basis. - In practical terms, character-by-character comparisons should be done - codepoint-by-codepoint after conversion to a common character - encoding. - - False negatives are caused by the production and use of URI aliases. - Unnecessary aliases can be reduced, regardless of the comparison - method, by consistently providing URI references in an already- - normalized form (i.e., a form identical to what would be produced - after normalization is applied, as described below). - - - - -Berners-Lee, et al. Standards Track [Page 39] - -RFC 3986 URI Generic Syntax January 2005 - - - Protocols and data formats often limit some URI comparisons to simple - string comparison, based on the theory that people and - implementations will, in their own best interest, be consistent in - providing URI references, or at least consistent enough to negate any - efficiency that might be obtained from further normalization. - -6.2.2. Syntax-Based Normalization - - Implementations may use logic based on the definitions provided by - this specification to reduce the probability of false negatives. - This processing is moderately higher in cost than character-for- - character string comparison. For example, an application using this - approach could reasonably consider the following two URIs equivalent: - - example://a/b/c/%7Bfoo%7D - eXAMPLE://a/./b/../b/%63/%7bfoo%7d - - Web user agents, such as browsers, typically apply this type of URI - normalization when determining whether a cached response is - available. Syntax-based normalization includes such techniques as - case normalization, percent-encoding normalization, and removal of - dot-segments. - -6.2.2.1. 
Case Normalization - - For all URIs, the hexadecimal digits within a percent-encoding - triplet (e.g., "%3a" versus "%3A") are case-insensitive and therefore - should be normalized to use uppercase letters for the digits A-F. - - When a URI uses components of the generic syntax, the component - syntax equivalence rules always apply; namely, that the scheme and - host are case-insensitive and therefore should be normalized to - lowercase. For example, the URI <HTTP://www.EXAMPLE.com/> is - equivalent to <http://www.example.com/>. The other generic syntax - components are assumed to be case-sensitive unless specifically - defined otherwise by the scheme (see Section 6.2.3). - -6.2.2.2. Percent-Encoding Normalization - - The percent-encoding mechanism (Section 2.1) is a frequent source of - variance among otherwise identical URIs. In addition to the case - normalization issue noted above, some URI producers percent-encode - octets that do not require percent-encoding, resulting in URIs that - are equivalent to their non-encoded counterparts. These URIs should - be normalized by decoding any percent-encoded octet that corresponds - to an unreserved character, as described in Section 2.3. - - - - - -Berners-Lee, et al. Standards Track [Page 40] - -RFC 3986 URI Generic Syntax January 2005 - - -6.2.2.3. Path Segment Normalization - - The complete path segments "." and ".." are intended only for use - within relative references (Section 4.1) and are removed as part of - the reference resolution process (Section 5.2). However, some - deployed implementations incorrectly assume that reference resolution - is not necessary when the reference is already a URI and thus fail to - remove dot-segments when they occur in non-relative paths. URI - normalizers should remove dot-segments by applying the - remove_dot_segments algorithm to the path, as described in - Section 5.2.4. - -6.2.3. 
Scheme-Based Normalization - - The syntax and semantics of URIs vary from scheme to scheme, as - described by the defining specification for each scheme. - Implementations may use scheme-specific rules, at further processing - cost, to reduce the probability of false negatives. For example, - because the "http" scheme makes use of an authority component, has a - default port of "80", and defines an empty path to be equivalent to - "/", the following four URIs are equivalent: - - http://example.com - http://example.com/ - http://example.com:/ - http://example.com:80/ - - In general, a URI that uses the generic syntax for authority with an - empty path should be normalized to a path of "/". Likewise, an - explicit ":port", for which the port is empty or the default for the - scheme, is equivalent to one where the port and its ":" delimiter are - elided and thus should be removed by scheme-based normalization. For - example, the second URI above is the normal form for the "http" - scheme. - - Another case where normalization varies by scheme is in the handling - of an empty authority component or empty host subcomponent. For many - scheme specifications, an empty authority or host is considered an - error; for others, it is considered equivalent to "localhost" or the - end-user's host. When a scheme defines a default for authority and a - URI reference to that default is desired, the reference should be - normalized to an empty authority for the sake of uniformity, brevity, - and internationalization. If, however, either the userinfo or port - subcomponents are non-empty, then the host should be given explicitly - even if it matches the default. - - Normalization should not remove delimiters when their associated - component is empty unless licensed to do so by the scheme - - - -Berners-Lee, et al. Standards Track [Page 41] - -RFC 3986 URI Generic Syntax January 2005 - - - specification. For example, the URI "http://example.com/?" 
cannot be - assumed to be equivalent to any of the examples above. Likewise, the - presence or absence of delimiters within a userinfo subcomponent is - usually significant to its interpretation. The fragment component is - not subject to any scheme-based normalization; thus, two URIs that - differ only by the suffix "#" are considered different regardless of - the scheme. - - Some schemes define additional subcomponents that consist of case- - insensitive data, giving an implicit license to normalizers to - convert this data to a common case (e.g., all lowercase). For - example, URI schemes that define a subcomponent of path to contain an - Internet hostname, such as the "mailto" URI scheme, cause that - subcomponent to be case-insensitive and thus subject to case - normalization (e.g., "mailto:Joe@Example.COM" is equivalent to - "mailto:Joe@example.com", even though the generic syntax considers - the path component to be case-sensitive). - - Other scheme-specific normalizations are possible. - -6.2.4. Protocol-Based Normalization - - Substantial effort to reduce the incidence of false negatives is - often cost-effective for web spiders. Therefore, they implement even - more aggressive techniques in URI comparison. For example, if they - observe that a URI such as - - http://example.com/data - - redirects to a URI differing only in the trailing slash - - http://example.com/data/ - - they will likely regard the two as equivalent in the future. This - kind of technique is only appropriate when equivalence is clearly - indicated by both the result of accessing the resources and the - common conventions of their scheme's dereference algorithm (in this - case, use of redirection by HTTP origin servers to avoid problems - with relative references). - - - - - - - - - - - - -Berners-Lee, et al. Standards Track [Page 42] - -RFC 3986 URI Generic Syntax January 2005 - - -7. Security Considerations - - A URI does not in itself pose a security threat. 
However, as URIs - are often used to provide a compact set of instructions for access to - network resources, care must be taken to properly interpret the data - within a URI, to prevent that data from causing unintended access, - and to avoid including data that should not be revealed in plain - text. - -7.1. Reliability and Consistency - - There is no guarantee that once a URI has been used to retrieve - information, the same information will be retrievable by that URI in - the future. Nor is there any guarantee that the information - retrievable via that URI in the future will be observably similar to - that retrieved in the past. The URI syntax does not constrain how a - given scheme or authority apportions its namespace or maintains it - over time. Such guarantees can only be obtained from the person(s) - controlling that namespace and the resource in question. A specific - URI scheme may define additional semantics, such as name persistence, - if those semantics are required of all naming authorities for that - scheme. - -7.2. Malicious Construction - - It is sometimes possible to construct a URI so that an attempt to - perform a seemingly harmless, idempotent operation, such as the - retrieval of a representation, will in fact cause a possibly damaging - remote operation. The unsafe URI is typically constructed by - specifying a port number other than that reserved for the network - protocol in question. The client unwittingly contacts a site running - a different protocol service, and data within the URI contains - instructions that, when interpreted according to this other protocol, - cause an unexpected operation. A frequent example of such abuse has - been the use of a protocol-based scheme with a port component of - "25", thereby fooling user agent software into sending an unintended - or impersonating message via an SMTP server. 
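One way to spot the port-based abuse described above is to compare a URI's explicit port with the port normally used by its scheme. The following Python sketch is illustrative only: the `DEFAULT_PORTS` table and the `is_suspicious_port` helper are assumptions of this example, not something defined by the specification, and a real application would load such a policy from configuration.

```python
from urllib.parse import urlsplit

# Hypothetical table of scheme default ports (an assumption of this sketch).
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21}


def is_suspicious_port(uri: str) -> bool:
    """Return True if the URI names a well-known port (0-1023) that is
    not the default port of its scheme."""
    parts = urlsplit(uri)
    port = parts.port  # None when the port is absent or empty
    if port is None or port > 1023:
        return False
    return DEFAULT_PORTS.get(parts.scheme) != port


print(is_suspicious_port("http://example.com:25/"))    # True
print(is_suspicious_port("http://example.com:80/"))    # False
print(is_suspicious_port("http://example.com:8080/"))  # False
```

A scheme that is missing from the table is treated as suspicious on any well-known port, which errs on the side of caution.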
- - Applications should prevent dereference of a URI that specifies a TCP - port number within the "well-known port" range (0 - 1023) unless the - protocol being used to dereference that URI is compatible with the - protocol expected on that well-known port. Although IANA maintains a - registry of well-known ports, applications should make such - restrictions user-configurable to avoid preventing the deployment of - new services. - - - - - - -Berners-Lee, et al. Standards Track [Page 43] - -RFC 3986 URI Generic Syntax January 2005 - - - When a URI contains percent-encoded octets that match the delimiters - for a given resolution or dereference protocol (for example, CR and - LF characters for the TELNET protocol), these percent-encodings must - not be decoded before transmission across that protocol. Transfer of - the percent-encoding, which might violate the protocol, is less - harmful than allowing decoded octets to be interpreted as additional - operations or parameters, perhaps triggering an unexpected and - possibly harmful remote operation. - -7.3. Back-End Transcoding - - When a URI is dereferenced, the data within it is often parsed by - both the user agent and one or more servers. In HTTP, for example, a - typical user agent will parse a URI into its five major components, - access the authority's server, and send it the data within the - authority, path, and query components. A typical server will take - that information, parse the path into segments and the query into - key/value pairs, and then invoke implementation-specific handlers to - respond to the request. As a result, a common security concern for - server implementations that handle a URI, either as a whole or split - into separate components, is proper interpretation of the octet data - represented by the characters and percent-encodings within that URI. - - Percent-encoded octets must be decoded at some point during the - dereference process. 
Applications must split the URI into its - components and subcomponents prior to decoding the octets, as - otherwise the decoded octets might be mistaken for delimiters. - Security checks of the data within a URI should be applied after - decoding the octets. Note, however, that the "%00" percent-encoding - (NUL) may require special handling and should be rejected if the - application is not expecting to receive raw data within a component. - - Special care should be taken when the URI path interpretation process - involves the use of a back-end file system or related system - functions. File systems typically assign an operational meaning to - special characters, such as the "/", "\", ":", "[", and "]" - characters, and to special device names like ".", "..", "...", "aux", - "lpt", etc. In some cases, merely testing for the existence of such - a name will cause the operating system to pause or invoke unrelated - system calls, leading to significant security concerns regarding - denial of service and unintended data transfer. It would be - impossible for this specification to list all such significant - characters and device names. Implementers should research the - reserved names and characters for the types of storage device that - may be attached to their applications and restrict the use of data - obtained from URI components accordingly. - - - - - -Berners-Lee, et al. Standards Track [Page 44] - -RFC 3986 URI Generic Syntax January 2005 - - -7.4. Rare IP Address Formats - - Although the URI syntax for IPv4address only allows the common - dotted-decimal form of IPv4 address literal, many implementations - that process URIs make use of platform-dependent system routines, - such as gethostbyname() and inet_aton(), to translate the string - literal to an actual IP address. Unfortunately, such system routines - often allow and process a much larger set of formats than those - described in Section 3.2.2. 
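To avoid handing such rare formats to a system routine, a host string can first be checked against the strict dotted-decimal form. The Python sketch below transcribes the dec-octet and IPv4address rules of the collected ABNF into a regular expression; the names `DEC_OCTET`, `IPV4_ADDRESS`, and `is_uri_ipv4address` are assumptions of this example.

```python
import re

# dec-octet: 0-9, 10-99, 100-199, 200-249, 250-255 (no leading zeros).
DEC_OCTET = r"(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])"

# IPv4address = dec-octet "." dec-octet "." dec-octet "." dec-octet
IPV4_ADDRESS = re.compile(r"(?:%s\.){3}%s" % (DEC_OCTET, DEC_OCTET))


def is_uri_ipv4address(host: str) -> bool:
    """Return True only for the four-part dotted-decimal form."""
    return IPV4_ADDRESS.fullmatch(host) is not None


for candidate in ["192.0.2.1", "127.1", "0x7f.0.0.1", "010.0.0.1", "2130706433"]:
    print(candidate, is_uri_ipv4address(candidate))
```

Only the first candidate is accepted; the remaining four are forms that some system routines would nevertheless translate into an address.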
- - For example, many implementations allow dotted forms of three - numbers, wherein the last part is interpreted as a 16-bit quantity - and placed in the right-most two bytes of the network address (e.g., - a Class B network). Likewise, a dotted form of two numbers means - that the last part is interpreted as a 24-bit quantity and placed in - the right-most three bytes of the network address (Class A), and a - single number (without dots) is interpreted as a 32-bit quantity and - stored directly in the network address. Adding further to the - confusion, some implementations allow each dotted part to be - interpreted as decimal, octal, or hexadecimal, as specified in the C - language (i.e., a leading 0x or 0X implies hexadecimal; a leading 0 - implies octal; otherwise, the number is interpreted as decimal). - - These additional IP address formats are not allowed in the URI syntax - due to differences between platform implementations. However, they - can become a security concern if an application attempts to filter - access to resources based on the IP address in string literal format. - If this filtering is performed, literals should be converted to - numeric form and filtered based on the numeric value, and not on a - prefix or suffix of the string form. - -7.5. Sensitive Information - - URI producers should not provide a URI that contains a username or - password that is intended to be secret. URIs are frequently - displayed by browsers, stored in clear text bookmarks, and logged by - user agent history and intermediary applications (proxies). A - password appearing within the userinfo component is deprecated and - should be considered an error (or simply ignored) except in those - rare cases where the 'password' parameter is intended to be public. - -7.6. 
Semantic Attacks - - Because the userinfo subcomponent is rarely used and appears before - the host in the authority component, it can be used to construct a - URI intended to mislead a human user by appearing to identify one - (trusted) naming authority while actually identifying a different - authority hidden behind the noise. For example - - - -Berners-Lee, et al. Standards Track [Page 45] - -RFC 3986 URI Generic Syntax January 2005 - - - ftp://cnn.example.com&story=breaking_news@10.0.0.1/top_story.htm - - might lead a human user to assume that the host is 'cnn.example.com', - whereas it is actually '10.0.0.1'. Note that a misleading userinfo - subcomponent could be much longer than the example above. - - A misleading URI, such as that above, is an attack on the user's - preconceived notions about the meaning of a URI rather than an attack - on the software itself. User agents may be able to reduce the impact - of such attacks by distinguishing the various components of the URI - when they are rendered, such as by using a different color or tone to - render userinfo if any is present, though there is no panacea. More - information on URI-based semantic attacks can be found in [Siedzik]. - -8. IANA Considerations - - URI scheme names, as defined by <scheme> in Section 3.1, form a - registered namespace that is managed by IANA according to the - procedures defined in [BCP35]. No IANA actions are required by this - document. - -9. Acknowledgements - - This specification is derived from RFC 2396 [RFC2396], RFC 1808 - [RFC1808], and RFC 1738 [RFC1738]; the acknowledgements in those - documents still apply. It also incorporates the update (with - corrections) for IPv6 literals in the host syntax, as defined by - Robert M. Hinden, Brian E. Carpenter, and Larry Masinter in - [RFC2732]. In addition, contributions by Gisle Aas, Reese Anschultz, - Daniel Barclay, Tim Bray, Mike Brown, Rob Cameron, Jeremy Carroll, - Dan Connolly, Adam M. 
Costello, John Cowan, Jason Diamond, Martin - Duerst, Stefan Eissing, Clive D.W. Feather, Al Gilman, Tony Hammond, - Elliotte Harold, Pat Hayes, Henry Holtzman, Ian B. Jacobs, Michael - Kay, John C. Klensin, Graham Klyne, Dan Kohn, Bruce Lilly, Andrew - Main, Dave McAlpin, Ira McDonald, Michael Mealling, Ray Merkert, - Stephen Pollei, Julian Reschke, Tomas Rokicki, Miles Sabin, Kai - Schaetzl, Mark Thomson, Ronald Tschalaer, Norm Walsh, Marc Warne, - Stuart Williams, and Henry Zongaro are gratefully acknowledged. - -10. References - -10.1. Normative References - - [ASCII] American National Standards Institute, "Coded Character - Set -- 7-bit American Standard Code for Information - Interchange", ANSI X3.4, 1986. - - - - - -Berners-Lee, et al. Standards Track [Page 46] - -RFC 3986 URI Generic Syntax January 2005 - - - [RFC2234] Crocker, D. and P. Overell, "Augmented BNF for Syntax - Specifications: ABNF", RFC 2234, November 1997. - - [STD63] Yergeau, F., "UTF-8, a transformation format of - ISO 10646", STD 63, RFC 3629, November 2003. - - [UCS] International Organization for Standardization, - "Information Technology - Universal Multiple-Octet Coded - Character Set (UCS)", ISO/IEC 10646:2003, December 2003. - -10.2. Informative References - - [BCP19] Freed, N. and J. Postel, "IANA Charset Registration - Procedures", BCP 19, RFC 2978, October 2000. - - [BCP35] Petke, R. and I. King, "Registration Procedures for URL - Scheme Names", BCP 35, RFC 2717, November 1999. - - [RFC0952] Harrenstien, K., Stahl, M., and E. Feinler, "DoD Internet - host table specification", RFC 952, October 1985. - - [RFC1034] Mockapetris, P., "Domain names - concepts and facilities", - STD 13, RFC 1034, November 1987. - - [RFC1123] Braden, R., "Requirements for Internet Hosts - Application - and Support", STD 3, RFC 1123, October 1989. - - [RFC1535] Gavron, E., "A Security Problem and Proposed Correction - With Widely Deployed DNS Software", RFC 1535, - October 1993. 
- - [RFC1630] Berners-Lee, T., "Universal Resource Identifiers in WWW: A - Unifying Syntax for the Expression of Names and Addresses - of Objects on the Network as used in the World-Wide Web", - RFC 1630, June 1994. - - [RFC1736] Kunze, J., "Functional Recommendations for Internet - Resource Locators", RFC 1736, February 1995. - - [RFC1737] Sollins, K. and L. Masinter, "Functional Requirements for - Uniform Resource Names", RFC 1737, December 1994. - - [RFC1738] Berners-Lee, T., Masinter, L., and M. McCahill, "Uniform - Resource Locators (URL)", RFC 1738, December 1994. - - [RFC1808] Fielding, R., "Relative Uniform Resource Locators", - RFC 1808, June 1995. - - - - -Berners-Lee, et al. Standards Track [Page 47] - -RFC 3986 URI Generic Syntax January 2005 - - - [RFC2046] Freed, N. and N. Borenstein, "Multipurpose Internet Mail - Extensions (MIME) Part Two: Media Types", RFC 2046, - November 1996. - - [RFC2141] Moats, R., "URN Syntax", RFC 2141, May 1997. - - [RFC2396] Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform - Resource Identifiers (URI): Generic Syntax", RFC 2396, - August 1998. - - [RFC2518] Goland, Y., Whitehead, E., Faizi, A., Carter, S., and D. - Jensen, "HTTP Extensions for Distributed Authoring -- - WEBDAV", RFC 2518, February 1999. - - [RFC2557] Palme, J., Hopmann, A., and N. Shelness, "MIME - Encapsulation of Aggregate Documents, such as HTML - (MHTML)", RFC 2557, March 1999. - - [RFC2718] Masinter, L., Alvestrand, H., Zigmond, D., and R. Petke, - "Guidelines for new URL Schemes", RFC 2718, November 1999. - - [RFC2732] Hinden, R., Carpenter, B., and L. Masinter, "Format for - Literal IPv6 Addresses in URL's", RFC 2732, December 1999. - - [RFC3305] Mealling, M. and R. Denenberg, "Report from the Joint - W3C/IETF URI Planning Interest Group: Uniform Resource - Identifiers (URIs), URLs, and Uniform Resource Names - (URNs): Clarifications and Recommendations", RFC 3305, - August 2002. - - [RFC3490] Faltstrom, P., Hoffman, P., and A. 
Costello,
-              "Internationalizing Domain Names in Applications (IDNA)",
-              RFC 3490, March 2003.
-
-   [RFC3513]  Hinden, R. and S. Deering, "Internet Protocol Version 6
-              (IPv6) Addressing Architecture", RFC 3513, April 2003.
-
-   [Siedzik]  Siedzik, R., "Semantic Attacks: What's in a URL?",
-              April 2001, <http://www.giac.org/practical/gsec/
-              Richard_Siedzik_GSEC.pdf>.
-
-Appendix A. Collected ABNF for URI
-
-   URI           = scheme ":" hier-part [ "?" query ] [ "#" fragment ]
-
-   hier-part     = "//" authority path-abempty
-                 / path-absolute
-                 / path-rootless
-                 / path-empty
-
-   URI-reference = URI / relative-ref
-
-   absolute-URI  = scheme ":" hier-part [ "?" query ]
-
-   relative-ref  = relative-part [ "?" query ] [ "#" fragment ]
-
-   relative-part = "//" authority path-abempty
-                 / path-absolute
-                 / path-noscheme
-                 / path-empty
-
-   scheme        = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
-
-   authority     = [ userinfo "@" ] host [ ":" port ]
-   userinfo      = *( unreserved / pct-encoded / sub-delims / ":" )
-   host          = IP-literal / IPv4address / reg-name
-   port          = *DIGIT
-
-   IP-literal    = "[" ( IPv6address / IPvFuture ) "]"
-
-   IPvFuture     = "v" 1*HEXDIG "." 1*( unreserved / sub-delims / ":" )
-
-   IPv6address   =                            6( h16 ":" ) ls32
-                 /                       "::" 5( h16 ":" ) ls32
-                 / [               h16 ] "::" 4( h16 ":" ) ls32
-                 / [ *1( h16 ":" ) h16 ] "::" 3( h16 ":" ) ls32
-                 / [ *2( h16 ":" ) h16 ] "::" 2( h16 ":" ) ls32
-                 / [ *3( h16 ":" ) h16 ] "::"    h16 ":"   ls32
-                 / [ *4( h16 ":" ) h16 ] "::"              ls32
-                 / [ *5( h16 ":" ) h16 ] "::"              h16
-                 / [ *6( h16 ":" ) h16 ] "::"
-
-   h16           = 1*4HEXDIG
-   ls32          = ( h16 ":" h16 ) / IPv4address
-   IPv4address   = dec-octet "." dec-octet "." dec-octet "." dec-octet
-
-   dec-octet     = DIGIT                 ; 0-9
-                 / %x31-39 DIGIT         ; 10-99
-                 / "1" 2DIGIT            ; 100-199
-                 / "2" %x30-34 DIGIT     ; 200-249
-                 / "25" %x30-35          ; 250-255
-
-   reg-name      = *( unreserved / pct-encoded / sub-delims )
-
-   path          = path-abempty    ; begins with "/" or is empty
-                 / path-absolute   ; begins with "/" but not "//"
-                 / path-noscheme   ; begins with a non-colon segment
-                 / path-rootless   ; begins with a segment
-                 / path-empty      ; zero characters
-
-   path-abempty  = *( "/" segment )
-   path-absolute = "/" [ segment-nz *( "/" segment ) ]
-   path-noscheme = segment-nz-nc *( "/" segment )
-   path-rootless = segment-nz *( "/" segment )
-   path-empty    = 0<pchar>
-
-   segment       = *pchar
-   segment-nz    = 1*pchar
-   segment-nz-nc = 1*( unreserved / pct-encoded / sub-delims / "@" )
-                 ; non-zero-length segment without any colon ":"
-
-   pchar         = unreserved / pct-encoded / sub-delims / ":" / "@"
-
-   query         = *( pchar / "/" / "?" )
-
-   fragment      = *( pchar / "/" / "?" )
-
-   pct-encoded   = "%" HEXDIG HEXDIG
-
-   unreserved    = ALPHA / DIGIT / "-" / "." / "_" / "~"
-   reserved      = gen-delims / sub-delims
-   gen-delims    = ":" / "/" / "?" / "#" / "[" / "]" / "@"
-   sub-delims    = "!" / "$" / "&" / "'" / "(" / ")"
-                 / "*" / "+" / "," / ";" / "="
-
-Appendix B. Parsing a URI Reference with a Regular Expression
-
-   As the "first-match-wins" algorithm is identical to the "greedy"
-   disambiguation method used by POSIX regular expressions, it is
-   natural and commonplace to use a regular expression for parsing the
-   potential five components of a URI reference.
-
-   The following line is the regular expression for breaking-down a
-   well-formed URI reference into its components.
-
-      ^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?
-       12            3  4          5       6  7        8 9
-
-   The numbers in the second line above are only to assist readability;
-   they indicate the reference points for each subexpression (i.e., each
-   paired parenthesis). We refer to the value matched for subexpression
-   <n> as $<n>. For example, matching the above expression to
-
-      http://www.ics.uci.edu/pub/ietf/uri/#Related
-
-   results in the following subexpression matches:
-
-      $1 = http:
-      $2 = http
-      $3 = //www.ics.uci.edu
-      $4 = www.ics.uci.edu
-      $5 = /pub/ietf/uri/
-      $6 = <undefined>
-      $7 = <undefined>
-      $8 = #Related
-      $9 = Related
-
-   where <undefined> indicates that the component is not present, as is
-   the case for the query component in the above example. Therefore, we
-   can determine the value of the five components as
-
-      scheme    = $2
-      authority = $4
-      path      = $5
-      query     = $7
-      fragment  = $9
-
-   Going in the opposite direction, we can recreate a URI reference from
-   its components by using the algorithm of Section 5.3.
-
-Appendix C. Delimiting a URI in Context
-
-   URIs are often transmitted through formats that do not provide a
-   clear context for their interpretation. For example, there are many
-   occasions when a URI is included in plain text; examples include text
-   sent in email, USENET news, and on printed paper. In such cases, it
-   is important to be able to delimit the URI from the rest of the text,
-   and in particular from punctuation marks that might be mistaken for
-   part of the URI.
-
-   In practice, URIs are delimited in a variety of ways, but usually
-   within double-quotes "http://example.com/", angle brackets
-   <http://example.com/>, or just by using whitespace:
-
-      http://example.com/
-
-   These wrappers do not form part of the URI.
-
-   In some cases, extra whitespace (spaces, line-breaks, tabs, etc.) may
-   have to be added to break a long URI across lines. The whitespace
-   should be ignored when the URI is extracted.
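The delimiting rules above, together with the regular expression of Appendix B, can be sketched in Python as follows. The helper names `strip_delimiters` and `split_uri_reference` are assumptions of this example, and the sketch deliberately handles only a single pair of wrapping double-quotes or angle brackets plus embedded whitespace.

```python
import re

# Regular expression from Appendix B, copied unchanged.
URI_REFERENCE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)


def strip_delimiters(text: str) -> str:
    """Drop one pair of wrapping double-quotes or angle brackets and
    remove all embedded whitespace."""
    text = text.strip()
    if len(text) >= 2 and (text[0], text[-1]) in {('"', '"'), ("<", ">")}:
        text = text[1:-1]
    return re.sub(r"\s+", "", text)


def split_uri_reference(reference: str) -> dict:
    """Map subexpressions $2, $4, $5, $7, and $9 to the five components.
    A component that is not present is reported as None."""
    match = URI_REFERENCE.match(reference)
    return {
        "scheme": match.group(2),
        "authority": match.group(4),
        "path": match.group(5),
        "query": match.group(7),
        "fragment": match.group(9),
    }


uri = strip_delimiters("<http://www.ics.uci.edu/pub/\n   ietf/uri/#Related>")
print(uri)
print(split_uri_reference(uri))
```

Because every part of the expression is optional, the match always succeeds; Python reports an absent component as `None`, which corresponds to `<undefined>` in Appendix B.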
- - No whitespace should be introduced after a hyphen ("-") character. - Because some typesetters and printers may (erroneously) introduce a - hyphen at the end of line when breaking it, the interpreter of a URI - containing a line break immediately after a hyphen should ignore all - whitespace around the line break and should be aware that the hyphen - may or may not actually be part of the URI. - - Using <> angle brackets around each URI is especially recommended as - a delimiting style for a reference that contains embedded whitespace. - - The prefix "URL:" (with or without a trailing space) was formerly - recommended as a way to help distinguish a URI from other bracketed - designators, though it is not commonly used in practice and is no - longer recommended. - - For robustness, software that accepts user-typed URI should attempt - to recognize and strip both delimiters and embedded whitespace. - - For example, the text - - Yes, Jim, I found it under "http://www.w3.org/Addressing/", - but you can probably pick it up from <ftp://foo.example. - com/rfc/>. Note the warning in <http://www.ics.uci.edu/pub/ - ietf/uri/historical.html#WARNING>. - - contains the URI references - - http://www.w3.org/Addressing/ - ftp://foo.example.com/rfc/ - http://www.ics.uci.edu/pub/ietf/uri/historical.html#WARNING - - - - - - - - - - - - - -Berners-Lee, et al. Standards Track [Page 52] - -RFC 3986 URI Generic Syntax January 2005 - - -Appendix D. Changes from RFC 2396 - -D.1. Additions - - An ABNF rule for URI has been introduced to correspond to one common - usage of the term: an absolute URI with optional fragment. - - IPv6 (and later) literals have been added to the list of possible - identifiers for the host portion of an authority component, as - described by [RFC2732], with the addition of "[" and "]" to the - reserved set and a version flag to anticipate future versions of IP - literals. 
Square brackets are now specified as reserved within the
-   authority component and are not allowed outside their use as
-   delimiters for an IP literal within host. In order to make this
-   change without changing the technical definition of the path, query,
-   and fragment components, those rules were redefined to directly
-   specify the characters allowed.
-
-   As [RFC2732] defers to [RFC3513] for definition of an IPv6 literal
-   address, which, unfortunately, lacks an ABNF description of
-   IPv6address, we created a new ABNF rule for IPv6address that matches
-   the text representations defined by Section 2.2 of [RFC3513].
-   Likewise, the definition of IPv4address has been improved in order to
-   limit each decimal octet to the range 0-255.
-
-   Section 6, on URI normalization and comparison, has been completely
-   rewritten and extended by using input from Tim Bray and discussion
-   within the W3C Technical Architecture Group.
-
-D.2. Modifications
-
-   The ad-hoc BNF syntax of RFC 2396 has been replaced with the ABNF of
-   [RFC2234]. This change required all rule names that formerly
-   included underscore characters to be renamed with a dash instead. In
-   addition, a number of syntax rules have been eliminated or simplified
-   to make the overall grammar more comprehensible. Specifications that
-   refer to the obsolete grammar rules may be understood by replacing
-   those rules according to the following table:
-
-   +----------------+--------------------------------------------------+
-   | obsolete rule  | translation                                      |
-   +----------------+--------------------------------------------------+
-   | absoluteURI    | absolute-URI                                     |
-   | relativeURI    | relative-part [ "?" query ]                      |
-   | hier_part      | ( "//" authority path-abempty /                  |
-   |                | path-absolute ) [ "?" query ]                    |
-   |                |                                                  |
-   | opaque_part    | path-rootless [ "?" query ]                      |
-   | net_path       | "//" authority path-abempty                      |
-   | abs_path       | path-absolute                                    |
-   | rel_path       | path-rootless                                    |
-   | rel_segment    | segment-nz-nc                                    |
-   | reg_name       | reg-name                                         |
-   | server         | authority                                        |
-   | hostport       | host [ ":" port ]                                |
-   | hostname       | reg-name                                         |
-   | path_segments  | path-abempty                                     |
-   | param          | *<pchar excluding ";">                           |
-   |                |                                                  |
-   | uric           | unreserved / pct-encoded / ";" / "?" / ":"       |
-   |                | / "@" / "&" / "=" / "+" / "$" / "," / "/"        |
-   |                |                                                  |
-   | uric_no_slash  | unreserved / pct-encoded / ";" / "?" / ":"       |
-   |                | / "@" / "&" / "=" / "+" / "$" / ","              |
-   |                |                                                  |
-   | mark           | "-" / "_" / "." / "!" / "~" / "*" / "'"          |
-   |                | / "(" / ")"                                      |
-   |                |                                                  |
-   | escaped        | pct-encoded                                      |
-   | hex            | HEXDIG                                           |
-   | alphanum       | ALPHA / DIGIT                                    |
-   +----------------+--------------------------------------------------+
-
-   Use of the above obsolete rules for the definition of scheme-specific
-   syntax is deprecated.
-
-   Section 2, on characters, has been rewritten to explain what
-   characters are reserved, when they are reserved, and why they are
-   reserved, even when they are not used as delimiters by the generic
-   syntax. The mark characters that are typically unsafe to decode,
-   including the exclamation mark ("!"), asterisk ("*"), single-quote
-   ("'"), and open and close parentheses ("(" and ")"), have been moved
-   to the reserved set in order to clarify the distinction between
-   reserved and unreserved and, hopefully, to answer the most common
-   question of scheme designers. Likewise, the section on
-   percent-encoded characters has been rewritten, and URI normalizers
-   are now given license to decode any percent-encoded octets
-   corresponding to unreserved characters. In general, the terms
-   "escaped" and "unescaped" have been replaced with "percent-encoded"
-   and "decoded", respectively, to reduce confusion with other forms of
-   escape mechanisms.
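The license given to normalizers above, decoding only those percent-encoded octets that correspond to unreserved characters, can be sketched in Python as follows. The function name `decode_unreserved` is an assumption of this example; every other percent-encoding, including encodings of reserved characters such as "/", is left exactly as written.

```python
import re

# unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-._~"
)

# pct-encoded = "%" HEXDIG HEXDIG
PCT_ENCODED = re.compile(r"%([0-9A-Fa-f]{2})")


def decode_unreserved(uri: str) -> str:
    """Decode percent-encoded octets that represent unreserved characters."""

    def replace(match):
        char = chr(int(match.group(1), 16))
        # Keep the original triplet unless it encodes an unreserved character.
        return char if char in UNRESERVED else match.group(0)

    return PCT_ENCODED.sub(replace, uri)


print(decode_unreserved("http://example.com/%7Euser/%41%2Fb"))
# http://example.com/~user/A%2Fb
```

Leaving "%2F" encoded matters because decoding it would turn data into a path delimiter and change how the URI is split into segments.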
- - The ABNF for URI and URI-reference has been redesigned to make them - more friendly to LALR parsers and to reduce complexity. As a result, - the layout form of syntax description has been removed, along with - the uric, uric_no_slash, opaque_part, net_path, abs_path, rel_path, - path_segments, rel_segment, and mark rules. All references to - "opaque" URIs have been replaced with a better description of how the - path component may be opaque to hierarchy. The relativeURI rule has - been replaced with relative-ref to avoid unnecessary confusion over - whether they are a subset of URI. The ambiguity regarding the - parsing of URI-reference as a URI or a relative-ref with a colon in - the first segment has been eliminated through the use of five - separate path matching rules. - - The fragment identifier has been moved back into the section on - generic syntax components and within the URI and relative-ref rules, - though it remains excluded from absolute-URI. The number sign ("#") - character has been moved back to the reserved set as a result of - reintegrating the fragment syntax. - - The ABNF has been corrected to allow the path component to be empty. - This also allows an absolute-URI to consist of nothing after the - "scheme:", as is present in practice with the "dav:" namespace - [RFC2518] and with the "about:" scheme used internally by many WWW - browser implementations. The ambiguity regarding the boundary - between authority and path has been eliminated through the use of - five separate path matching rules. - - Registry-based naming authorities that use the generic syntax are now - defined within the host rule. This change allows current - implementations, where whatever name provided is simply fed to the - local name resolution mechanism, to be consistent with the - specification. It also removes the need to re-specify DNS name - formats here. 
Furthermore, it allows the host component to contain - percent-encoded octets, which is necessary to enable - internationalized domain names to be provided in URIs, processed in - their native character encodings at the application layers above URI - processing, and passed to an IDNA library as a registered name in the - UTF-8 character encoding. The server, hostport, hostname, - domainlabel, toplabel, and alphanum rules have been removed. - - The resolving relative references algorithm of [RFC2396] has been - rewritten with pseudocode for this revision to improve clarity and - fix the following issues: - - - -Berners-Lee, et al. Standards Track [Page 55] - -RFC 3986 URI Generic Syntax January 2005 - - - o [RFC2396] section 5.2, step 6a, failed to account for a base URI - with no path. - - o Restored the behavior of [RFC1808] where, if the reference - contains an empty path and a defined query component, the target - URI inherits the base URI's path component. - - o The determination of whether a URI reference is a same-document - reference has been decoupled from the URI parser, simplifying the - URI processing interface within applications in a way consistent - with the internal architecture of deployed URI processing - implementations. The determination is now based on comparison to - the base URI after transforming a reference to absolute form, - rather than on the format of the reference itself. This change - may result in more references being considered "same-document" - under this specification than there would be under the rules given - in RFC 2396, especially when normalization is used to reduce - aliases. However, it does not change the status of existing - same-document references. - - o Separated the path merge routine into two routines: merge, for - describing combination of the base URI path with a relative-path - reference, and remove_dot_segments, for describing how to remove - the special "." and ".." segments from a composed path. 
The - remove_dot_segments algorithm is now applied to all URI reference - paths in order to match common implementations and to improve the - normalization of URIs in practice. This change only impacts the - parsing of abnormal references and same-scheme references wherein - the base URI has a non-hierarchical path. - -Index - - A - ABNF 11 - absolute 27 - absolute-path 26 - absolute-URI 27 - access 9 - authority 17, 18 - - B - base URI 28 - - C - character encoding 4 - character 4 - characters 8, 11 - coded character set 4 - - - -Berners-Lee, et al. Standards Track [Page 56] - -RFC 3986 URI Generic Syntax January 2005 - - - D - dec-octet 20 - dereference 9 - dot-segments 23 - - F - fragment 16, 24 - - G - gen-delims 13 - generic syntax 6 - - H - h16 20 - hier-part 16 - hierarchical 10 - host 18 - - I - identifier 5 - IP-literal 19 - IPv4 20 - IPv4address 19, 20 - IPv6 19 - IPv6address 19, 20 - IPvFuture 19 - - L - locator 7 - ls32 20 - - M - merge 32 - - N - name 7 - network-path 26 - - P - path 16, 22, 26 - path-abempty 22 - path-absolute 22 - path-empty 22 - path-noscheme 22 - path-rootless 22 - path-abempty 16, 22, 26 - path-absolute 16, 22, 26 - path-empty 16, 22, 26 - - - -Berners-Lee, et al. Standards Track [Page 57] - -RFC 3986 URI Generic Syntax January 2005 - - - path-rootless 16, 22 - pchar 23 - pct-encoded 12 - percent-encoding 12 - port 22 - - Q - query 16, 23 - - R - reg-name 21 - registered name 20 - relative 10, 28 - relative-path 26 - relative-ref 26 - remove_dot_segments 33 - representation 9 - reserved 12 - resolution 9, 28 - resource 5 - retrieval 9 - - S - same-document 27 - sameness 9 - scheme 16, 17 - segment 22, 23 - segment-nz 23 - segment-nz-nc 23 - sub-delims 13 - suffix 27 - - T - transcription 8 - - U - uniform 4 - unreserved 13 - URI grammar - absolute-URI 27 - ALPHA 11 - authority 18 - CR 11 - dec-octet 20 - DIGIT 11 - DQUOTE 11 - fragment 24 - gen-delims 13 - - - -Berners-Lee, et al. 
Standards Track [Page 58] - -RFC 3986 URI Generic Syntax January 2005 - - - h16 20 - HEXDIG 11 - hier-part 16 - host 19 - IP-literal 19 - IPv4address 20 - IPv6address 20 - IPvFuture 19 - LF 11 - ls32 20 - OCTET 11 - path 22 - path-abempty 22 - path-absolute 22 - path-empty 22 - path-noscheme 22 - path-rootless 22 - pchar 23 - pct-encoded 12 - port 22 - query 24 - reg-name 21 - relative-ref 26 - reserved 13 - scheme 17 - segment 23 - segment-nz 23 - segment-nz-nc 23 - SP 11 - sub-delims 13 - unreserved 13 - URI 16 - URI-reference 25 - userinfo 18 - URI 16 - URI-reference 25 - URL 7 - URN 7 - userinfo 18 - - - - - - - - - - - - -Berners-Lee, et al. Standards Track [Page 59] - -RFC 3986 URI Generic Syntax January 2005 - - -Authors' Addresses - - Tim Berners-Lee - World Wide Web Consortium - Massachusetts Institute of Technology - 77 Massachusetts Avenue - Cambridge, MA 02139 - USA - - Phone: +1-617-253-5702 - Fax: +1-617-258-5999 - EMail: timbl@w3.org - URI: http://www.w3.org/People/Berners-Lee/ - - - Roy T. Fielding - Day Software - 5251 California Ave., Suite 110 - Irvine, CA 92617 - USA - - Phone: +1-949-679-2960 - Fax: +1-949-679-2972 - EMail: fielding@gbiv.com - URI: http://roy.gbiv.com/ - - - Larry Masinter - Adobe Systems Incorporated - 345 Park Ave - San Jose, CA 95110 - USA - - Phone: +1-408-536-3024 - EMail: LMM@acm.org - URI: http://larry.masinter.net/ - - - - - - - - - - - - - - - -Berners-Lee, et al. Standards Track [Page 60] - -RFC 3986 URI Generic Syntax January 2005 - - -Full Copyright Statement - - Copyright (C) The Internet Society (2005). - - This document is subject to the rights, licenses and restrictions - contained in BCP 78, and except as set forth therein, the authors - retain all their rights. 
- - This document and the information contained herein are provided on an - "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS - OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET - ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, - INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE - INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED - WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. - -Intellectual Property - - The IETF takes no position regarding the validity or scope of any - Intellectual Property Rights or other rights that might be claimed to - pertain to the implementation or use of the technology described in - this document or the extent to which any license under such rights - might or might not be available; nor does it represent that it has - made any independent effort to identify any such rights. Information - on the IETF's procedures with respect to rights in IETF Documents can - be found in BCP 78 and BCP 79. - - Copies of IPR disclosures made to the IETF Secretariat and any - assurances of licenses to be made available, or the result of an - attempt made to obtain a general license or permission for the use of - such proprietary rights by implementers or users of this - specification can be obtained from the IETF on-line IPR repository at - http://www.ietf.org/ipr. - - The IETF invites any interested party to bring to its attention any - copyrights, patents or patent applications, or other proprietary - rights that may cover technology that may be required to implement - this standard. Please address the information to the IETF at ietf- - ipr@ietf.org. - - -Acknowledgement - - Funding for the RFC Editor function is currently provided by the - Internet Society. - - - - - - -Berners-Lee, et al. Standards Track [Page 61] - |