Diffstat (limited to 'xhtml1-20020801/guidelines.html')
-rw-r--r-- xhtml1-20020801/guidelines.html 229
1 files changed, 229 insertions, 0 deletions
diff --git a/xhtml1-20020801/guidelines.html b/xhtml1-20020801/guidelines.html
new file mode 100644
index 0000000..42bdfd1
--- /dev/null
+++ b/xhtml1-20020801/guidelines.html
@@ -0,0 +1,229 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
+"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
+<html lang="en" xml:lang="en" xmlns="http://www.w3.org/1999/xhtml">
+<head>
+<meta name="generator" content="HTML Tidy, see www.w3.org" />
+<title>XHTML 1.0 - HTML Compatibility Guidelines</title>
+<link rel="stylesheet" type="text/css" media="screen" href="xhtml.css" />
+<link rel="stylesheet" type="text/css" media="screen" href="W3C-REC.css" />
+</head>
+<body>
+<div class="navbar">[<a href="prohibitions.html">previous</a>] &#160; [<a href="acks.html">next</a>] &#160; [<a href="Cover.html#toc">table of contents</a>]
+
+<hr />
+</div>
+
+<h1><a name="guidelines" id="guidelines">C.</a> HTML Compatibility Guidelines</h1>
+
+<div class='subtoc'>
+<p><strong>Contents</strong></p>
+
+<ul class='toc'>
+<li class='tocline'>C.1. <a href="#C_1" class="tocxref">Processing Instructions and the XML Declaration</a></li>
+
+<li class='tocline'>C.2. <a href="#C_2" class="tocxref">Empty Elements</a></li>
+
+<li class='tocline'>C.3. <a href="#C_3" class="tocxref">Element Minimization and Empty Element Content</a></li>
+
+<li class='tocline'>C.4. <a href="#C_4" class="tocxref">Embedded Style Sheets and Scripts</a></li>
+
+<li class='tocline'>C.5. <a href="#C_5" class="tocxref">Line Breaks within Attribute Values</a></li>
+
+<li class='tocline'>C.6. <a href="#C_6" class="tocxref">Isindex</a></li>
+
+<li class='tocline'>C.7. <a href="#C_7" class="tocxref">The <code>lang</code> and <code>xml:lang</code> Attributes</a></li>
+
+<li class='tocline'>C.8. <a href="#C_8" class="tocxref">Fragment Identifiers</a></li>
+
+<li class='tocline'>C.9. <a href="#C_9" class="tocxref">Character Encoding</a></li>
+
+<li class='tocline'>C.10. <a href="#C_10" class="tocxref">Boolean Attributes</a></li>
+
+<li class='tocline'>C.11. <a href="#C_11" class="tocxref">Document Object Model and XHTML</a></li>
+
+<li class='tocline'>C.12. <a href="#C_12" class="tocxref">Using Ampersands in Attribute Values (and Elsewhere)</a></li>
+
+<li class='tocline'>C.13. <a href="#C_13" class="tocxref">Cascading Style Sheets (CSS) and XHTML</a></li>
+
+<li class='tocline'>C.14. <a href="#C_14" class="tocxref">Referencing Style Elements when serving as XML</a></li>
+
+<li class='tocline'>C.15. <a href="#C_15" class="tocxref">White Space Characters in HTML vs. XML</a></li>
+
+<li class='tocline'>C.16. <a href="#C_16" class="tocxref">The Named Character Reference &amp;apos;</a></li>
+</ul>
+</div>
+
+<p><strong>This appendix is informative.</strong></p>
+
+<p>This appendix summarizes design guidelines for authors who wish their XHTML documents to render on existing HTML user agents. <em>Note that this recommendation does not define how HTML-conforming
+user agents should process HTML documents. Nor does it define the meaning of the Internet Media Type <code>text/html</code>. For these definitions, see [<a class="nref" href=
+"references.html#ref-html4">HTML4</a>] and [<a class="nref" href="references.html#ref-rfc2854">RFC2854</a>] respectively.</em></p>
+
+<h2><a name="C_1" id="C_1">C.1.</a> Processing Instructions and the XML Declaration</h2>
+
+<p>Be aware that processing instructions are rendered on some user agents. Also, some user agents interpret the XML declaration to mean that the document is unrecognized XML rather than HTML, and
+therefore may not render the document as expected. For compatibility with these types of legacy browsers, you may want to avoid using processing instructions and XML declarations. Remember, however,
+that when the XML declaration is not included in a document, the document can only use the default character encodings UTF-8 or UTF-16.</p>
+
+<h2><a name="C_2" id="C_2">C.2.</a> Empty Elements</h2>
+
+<p>Include a space before the trailing <code>/</code> and <code>&gt;</code> of empty elements, e.g. <code class="greenmono">&lt;br&#160;/&gt;</code>, <code class="greenmono">&lt;hr&#160;/&gt;</code>
+and <code class="greenmono">&lt;img src="karen.jpg" alt="Karen"&#160;/&gt;</code>. Also, use the minimized tag syntax for empty elements, e.g. <code class="greenmono">&lt;br /&gt;</code>, as the
+alternative syntax <code class="greenmono">&lt;br&gt;&lt;/br&gt;</code> allowed by XML gives uncertain results in many existing user agents.</p>
+
+<h2><a name="C_3" id="C_3">C.3.</a> Element Minimization and Empty Element Content</h2>
+
+<p>Given an empty instance of an element whose content model is not <code>EMPTY</code> (for example, an empty title or paragraph) do not use the minimized form (e.g. use <code class="greenmono">
+&lt;p&gt; &lt;/p&gt;</code> and not <code class="greenmono">&lt;p&#160;/&gt;</code>).</p>
+
+<h2><a name="C_4" id="C_4">C.4.</a> Embedded Style Sheets and Scripts</h2>
+
+<p>Use external style sheets if your style sheet uses <code>&lt;</code> or <code>&amp;</code> or <code>]]&gt;</code> or <code>--</code>. Use external scripts if your script uses <code>&lt;</code> or
+<code>&amp;</code> or <code>]]&gt;</code> or <code>--</code>. Note that XML parsers are permitted to silently remove the contents of comments. Therefore, the historical practice of "hiding" scripts
+and style sheets within "comments" to make the documents backward compatible is unlikely to work as expected in XML-based user agents.</p>
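+
+<p>As a minimal sketch, a style sheet and a script can be moved out of the document and referenced as shown below. The file names <code>site.css</code> and <code>site.js</code> are hypothetical. Note that the <code>script</code> element is written with an explicit end tag, since its content model is not <code>EMPTY</code>.</p>
+
+<div class="good">
+<pre>
+&lt;head&gt;
+&lt;title&gt;An external style sheet and script example&lt;/title&gt;
+&lt;link rel="stylesheet" type="text/css" href="site.css" /&gt;
+&lt;script type="text/javascript" src="site.js"&gt;&lt;/script&gt;
+&lt;/head&gt;
+</pre>
+</div>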
+
+<h2><a name="C_5" id="C_5">C.5.</a> Line Breaks within Attribute Values</h2>
+
+<p>Avoid line breaks and multiple white space characters within attribute values. These are handled inconsistently by user agents.</p>
+
+<h2><a name="C_6" id="C_6">C.6.</a> Isindex</h2>
+
+<p>Don't include more than one <code>isindex</code> element in the document <code>head</code>. The <code>isindex</code> element is deprecated in favor of the <code>input</code> element.</p>
+
+<h2><a name="C_7" id="C_7">C.7.</a> The <code>lang</code> and <code>xml:lang</code> Attributes</h2>
+
+<p>Use both the <code>lang</code> and <code>xml:lang</code> attributes when specifying the language of an element. The value of the <code>xml:lang</code> attribute takes precedence.</p>
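+
+<p>For illustration, an element that carries both attributes with the same value might look like this (the French text is only a placeholder):</p>
+
+<div class="good">
+<pre>
+&lt;p lang="fr" xml:lang="fr"&gt;Bonjour tout le monde&lt;/p&gt;
+</pre>
+</div>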
+
+<h2><a name="C_8" id="C_8">C.8.</a> Fragment Identifiers</h2>
+
+<p>In XML, <abbr title="Uniform Resource Identifiers">URI</abbr>-references [<a class="nref" href="references.html#ref-rfc2396">RFC2396</a>] that end with fragment identifiers of the form <code>
+"#foo"</code> do not refer to elements with an attribute <code>name="foo"</code>; rather, they refer to elements with an attribute defined to be of type <code>ID</code>, e.g., the <code>id</code>
+attribute in HTML 4. Many existing HTML clients don't support the use of <code>ID</code>-type attributes in this way, so identical values may be supplied for both of these attributes to ensure
+maximum forward and backward compatibility (e.g., <code class="greenmono">&lt;a id="foo" name="foo"&gt;...&lt;/a&gt;</code>).</p>
+
+<p>Further, since the set of legal values for attributes of type <code>ID</code> is much smaller than for those of type <code>CDATA</code>, the type of the <code>name</code> attribute has been
+changed to <code>NMTOKEN</code>. This attribute is constrained such that it can only have the same values as type <code>ID</code>, or as the <code>Name</code> production in XML 1.0 Section 2.3,
+production 5. Unfortunately, this constraint cannot be expressed in the XHTML 1.0 DTDs. Because of this change, care must be taken when converting existing HTML documents. The values of these
+attributes must be unique within the document, valid, and any references to these fragment identifiers (both internal and external) must be updated should the values be changed during conversion.</p>
+
+<p>Note that the collection of legal values in XML 1.0 Section 2.3, production 5 is much larger than that permitted to be used in the <code>ID</code> and <code>NAME</code> types defined in HTML 4.
+When defining fragment identifiers to be backward-compatible, only strings matching the pattern <code>[A-Za-z][A-Za-z0-9:_.-]*</code> should be used. See <a href=
+"http://www.w3.org/TR/html4/types.html#h-6.2">Section 6.2</a> of [<a class="nref" href="references.html#ref-html4">HTML4</a>] for more information.</p>
+
+<p>Finally, note that XHTML 1.0 has deprecated the <code>name</code> attribute of the <code>a</code>, <code>applet</code>, <code>form</code>, <code>frame</code>, <code>iframe</code>, <code>
+img</code>, and <code>map</code> elements, and it will be removed from XHTML in subsequent versions.</p>
+
+<h2><a name="C_9" id="C_9">C.9.</a> Character Encoding</h2>
+
+<p>Historically, the character encoding of an HTML document is either specified by a web server via the charset parameter of the HTTP Content-Type header, or via a <code>meta</code> element in the
+document itself. In an XML document, the character encoding of the document is specified on the XML declaration (e.g., <code class="greenmono">&lt;?xml version="1.0" encoding="EUC-JP"?&gt;</code>).
+In order to portably present documents with specific character encodings, the best approach is to ensure that the web server provides the correct headers. If this is not possible, a document that
+wants to set its character encoding explicitly must include both an XML declaration with an encoding declaration and a <code>meta</code> http-equiv statement (e.g., <code class="greenmono">&lt;meta
+http-equiv="Content-type" content="text/html; charset=EUC-JP"&#160;/&gt;</code>). In XHTML-conforming user agents, the value of the encoding declaration of the XML declaration takes precedence.</p>
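+
+<p>A sketch of a document that declares its encoding in both places, reusing the EUC-JP value from the inline examples. The title, the language code, and the body content are illustrative assumptions:</p>
+
+<div class="good">
+<pre>
+&lt;?xml version="1.0" encoding="EUC-JP"?&gt;
+&lt;!DOCTYPE html
+ PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
+ "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"&gt;
+&lt;html xmlns="http://www.w3.org/1999/xhtml" xml:lang="ja" lang="ja"&gt;
+&lt;head&gt;
+&lt;meta http-equiv="Content-type" content="text/html; charset=EUC-JP" /&gt;
+&lt;title&gt;An encoding declaration example&lt;/title&gt;
+&lt;/head&gt;
+&lt;body&gt;
+&lt;p&gt;...&lt;/p&gt;
+&lt;/body&gt;
+&lt;/html&gt;
+</pre>
+</div>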
+
+<p>Note: be aware that if a document must include the character encoding declaration in a meta http-equiv statement, that document may always be interpreted by HTTP servers and/or user agents as
+being of the internet media type defined in that statement. If a document is to be served as multiple media types, the HTTP server must be used to set the encoding of the document.</p>
+
+<h2><a name="C_10" id="C_10">C.10.</a> Boolean Attributes</h2>
+
+<p>Some HTML user agents are unable to interpret boolean attributes when these appear in their full (non-minimized) form, as required by XML 1.0. Note this problem doesn't affect user agents
+compliant with HTML 4. The following attributes are involved: <code>compact</code>, <code>nowrap</code>, <code>ismap</code>, <code>declare</code>, <code>noshade</code>, <code>checked</code>, <code>
+disabled</code>, <code>readonly</code>, <code>multiple</code>, <code>selected</code>, <code>noresize</code>, <code>defer</code>.</p>
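+
+<p>For illustration, the full (non-minimized) form repeats the attribute name as its value. The element choices and the <code>name</code> value below are placeholders:</p>
+
+<div class="good">
+<pre>
+&lt;input type="checkbox" name="subscribe" checked="checked" /&gt;
+&lt;option selected="selected"&gt;Default choice&lt;/option&gt;
+</pre>
+</div>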
+
+<h2><a name="C_11" id="C_11">C.11.</a> Document Object Model and XHTML</h2>
+
+<p>The Document Object Model level 1 Recommendation [<a class="nref" href="references.html#ref-dom">DOM</a>] defines document object model interfaces for XML and HTML 4. The HTML 4 document object
+model specifies that HTML element and attribute names are returned in upper-case. The XML document object model specifies that element and attribute names are returned in the case they are specified.
+In XHTML 1.0, elements and attributes are specified in lower-case. This apparent difference can be addressed in two ways:</p>
+
+<ol>
+<li>User agents that access XHTML documents served as Internet media type <code>text/html</code> via the <abbr title="Document Object Model">DOM</abbr> can use the HTML DOM, and can rely upon element
+and attribute names being returned in upper-case from those interfaces.</li>
+
+<li>User agents that access XHTML documents served as Internet media types <code>text/xml</code>, <code>application/xml</code>, or <code>application/xhtml+xml</code> can also use the XML DOM.
+Elements and attributes will be returned in lower-case. Also, some XHTML elements may or may not appear in the object tree because they are optional in the content model (e.g. the <code>tbody</code>
+element within <code>table</code>). This occurs because in HTML 4 some elements were permitted to be minimized such that their start and end tags are both omitted (an SGML feature). This is not
+possible in XML. Rather than require document authors to insert extraneous elements, XHTML has made the elements optional. User agents need to adapt to this accordingly. For further information on
+this topic, see [<a class="nref" href="references.html#ref-dom2">DOM2</a>].</li>
+</ol>
+
+<h2><a name="C_12" id="C_12">C.12.</a> Using Ampersands in Attribute Values (and Elsewhere)</h2>
+
+<p>In both SGML and XML, the ampersand character ("&amp;") declares the beginning of an entity reference (e.g., &amp;reg; for the registered trademark symbol "&#174;"). Unfortunately, many HTML user
+agents have silently ignored incorrect usage of the ampersand character in HTML documents - treating ampersands that do not look like entity references as literal ampersands. XML-based user agents
+will not tolerate this incorrect usage, and any document that uses an ampersand incorrectly will not be "valid", and consequently will not conform to this specification. In order to ensure that
+documents are compatible with historical HTML user agents and XML-based user agents, ampersands used in a document that are to be treated as literal characters must themselves be expressed as an
+entity reference (e.g. "<code>&amp;amp;</code>"). For example, when the <code>href</code> attribute of the <code>a</code> element refers to a CGI script that takes parameters, it must be expressed as
+<code>http://my.site.dom/cgi-bin/myscript.pl?class=guest&amp;amp;name=user</code> rather than as <code>http://my.site.dom/cgi-bin/myscript.pl?class=guest&amp;name=user</code>.</p>
+
+<h2><a name="C_13" id="C_13">C.13.</a> Cascading Style Sheets (CSS) and XHTML</h2>
+
+<p>The Cascading Style Sheets level 2 Recommendation [<a class="nref" href="references.html#ref-css2">CSS2</a>] defines style properties which are applied to the parse tree of the HTML or XML
+documents. Differences in parsing will produce different visual or aural results, depending on the selectors used. The following hints will reduce this effect for documents which are served without
+modification as both media types:</p>
+
+<ol>
+<li>CSS style sheets for XHTML should use lower case element and attribute names.</li>
+
+<li>In tables, the tbody element will be inferred by the parser of an HTML user agent, but not by the parser of an XML user agent. Therefore you should always explicitly add a tbody element if it is
+referred to in a CSS selector.</li>
+
+<li>Within the XHTML namespace, user agents are expected to recognize the "id" attribute as an attribute of type ID. Therefore, style sheets should be able to continue using the shorthand "#"
+selector syntax even if the user agent does not read the DTD.</li>
+
+<li>Within the XHTML namespace, user agents are expected to recognize the "class" attribute. Therefore, style sheets should be able to continue using the shorthand "." selector syntax.</li>
+
+<li>CSS defines different conformance rules for HTML and XML documents; be aware that the HTML rules apply to XHTML documents delivered as HTML and the XML rules apply to XHTML documents delivered as
+XML.</li>
+</ol>
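+
+<p>The first two hints can be sketched as follows. The selector and the cell text are illustrative only. Because the <code>tbody</code> element is written out explicitly and all names are in lower case, the selector should match the same cell whether the document is parsed as HTML or as XML:</p>
+
+<div class="good">
+<pre>
+&lt;style type="text/css"&gt;
+ tbody td {
+ color: green;
+ }
+&lt;/style&gt;
+...
+&lt;table&gt;
+&lt;tbody&gt;
+&lt;tr&gt;&lt;td&gt;A cell matched in both parsing modes&lt;/td&gt;&lt;/tr&gt;
+&lt;/tbody&gt;
+&lt;/table&gt;
+</pre>
+</div>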
+
+<h2><a name="C_14" id="C_14">C.14.</a> Referencing Style Elements when serving as XML</h2>
+
+<p>In HTML 4 and XHTML, the <code>style</code> element can be used to define document-internal style rules. In XML, an XML stylesheet declaration is used to define style rules. In order to be
+compatible with this convention, <code>style</code> elements should have their fragment identifier set using the <code>id</code> attribute, and an XML stylesheet declaration should reference this
+fragment. For example:</p>
+
+<div class="good">
+<pre>
+&lt;?xml-stylesheet href="W3C-REC.css" type="text/css"?&gt;
+&lt;?xml-stylesheet href="#internalStyle" type="text/css"?&gt;
+&lt;!DOCTYPE html
+ PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
+ "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"&gt;
+&lt;html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"&gt;
+&lt;head&gt;
+&lt;title&gt;An internal stylesheet example&lt;/title&gt;
+&lt;style type="text/css" id="internalStyle"&gt;
+ code {
+ color: green;
+ font-family: monospace;
+ font-weight: bold;
+ }
+&lt;/style&gt;
+&lt;/head&gt;
+&lt;body&gt;
+&lt;p&gt;
+ This is text that uses our
+ &lt;code&gt;internal stylesheet&lt;/code&gt;.
+&lt;/p&gt;
+&lt;/body&gt;
+&lt;/html&gt;
+</pre>
+</div>
+
+<h2><a name="C_15" id="C_15">C.15.</a> White Space Characters in HTML vs. XML</h2>
+
+<p>Some characters that are legal in HTML documents are illegal in XML documents. For example, in HTML the form feed character (U+000C) is treated as white space, whereas in XHTML it is illegal because of
+XML's definition of characters.</p>
+
+<h2><a name="C_16" id="C_16">C.16.</a> The Named Character Reference &amp;apos;</h2>
+
+<p>The named character reference <code>&amp;apos;</code> (the apostrophe, U+0027) was introduced in XML 1.0 but does not appear in HTML. Authors should therefore use <code>&amp;#39;</code> instead of
+<code>&amp;apos;</code> to work as expected in HTML 4 user agents.</p>
+
+<hr />
+<div class="navbar">[<a href="prohibitions.html">previous</a>] &#160; [<a href="acks.html">next</a>] &#160; [<a href="Cover.html#toc">table of contents</a>]</div>
+</body>
+</html>
+