\input texinfo                  @c -*- texinfo -*-
@c %**start of header
@setfilename ../../info/nxml-mode.info
@settitle nXML Mode
@documentencoding UTF-8
@c %**end of header

@copying
This manual documents nXML mode, an Emacs major mode for editing
XML with RELAX NG support.

Copyright @copyright{} 2007--2015 Free Software Foundation, Inc.

@quotation
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with no
Invariant Sections, with the Front-Cover Texts being ``A GNU Manual,''
and with the Back-Cover Texts as in (a) below.  A copy of the license
is included in the section entitled ``GNU Free Documentation License''.

(a) The FSF's Back-Cover Text is: ``You have the freedom to copy and
modify this GNU manual.''
@end quotation
@end copying

@dircategory Emacs editing modes
@direntry
* nXML Mode: (nxml-mode).       XML editing mode with RELAX NG support.
@end direntry

@titlepage
@title nXML mode
@page
@vskip 0pt plus 1filll
@insertcopying
@end titlepage

@contents

@node Top
@top nXML Mode

@insertcopying

This manual is not yet complete.

@menu
* Introduction::
* Completion::
* Inserting end-tags::
* Paragraphs::
* Outlining::
* Locating a schema::
* DTDs::
* Limitations::
* GNU Free Documentation License:: The license for this documentation.
@end menu

@node Introduction
@chapter Introduction

nXML mode is an Emacs major-mode for editing XML documents.  It supports
editing well-formed XML documents, and provides schema-sensitive editing
using RELAX NG Compact Syntax.  To get started, visit a file containing an
XML document, and, if necessary, use @kbd{M-x nxml-mode} to switch to nXML
mode.  By default, @code{auto-mode-alist} and @code{magic-fallback-alist}
put buffers in nXML mode if they have recognizable XML content or file
extensions.  You may wish to customize the settings, for example to
recognize different file extensions.

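
As a minimal sketch of such a customization, a line like the following
in your init file should make Emacs visit files with an additional
extension in nXML mode.  The @samp{.xconf} extension is purely
hypothetical and stands for whatever extension your own XML documents
use.

@example
;; Hypothetical extension: open files ending in ".xconf" in nXML mode.
(add-to-list 'auto-mode-alist '("\\.xconf\\'" . nxml-mode))
@end example

@noindent
The regular expression is anchored with @samp{\\'} so that it matches
only at the end of the file name.
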
Once in nXML mode, you can type @kbd{C-h m} for basic information on the
mode.

The @file{etc/nxml} directory in the Emacs distribution contains some data
files used by nXML mode, and includes two files (@file{test-valid.xml} and
@file{test-invalid.xml}) that provide examples of valid and invalid XML
documents.

To get validation and schema-sensitive editing, you need a RELAX NG Compact
Syntax (RNC) schema for your document (@pxref{Locating a schema}).  The
@file{etc/schema} directory includes some schemas for popular document
types.  See @url{http://relaxng.org/} for more information on RELAX NG@.
You can use the @samp{Trang} program from
@url{http://www.thaiopensource.com/relaxng/trang.html} to
automatically create RNC schemas.  This program can:

@itemize @bullet
@item
infer an RNC schema from an instance document;
@item
convert a DTD to an RNC schema;
@item
convert a RELAX NG XML syntax schema to an RNC schema.
@end itemize

@noindent
To convert a RELAX NG XML syntax (@samp{.rng}) schema to an RNC
one, you can also use the XSLT stylesheet from
@url{https://github.com/oleg-pavliv/emacs/tree/master/xsl}.
@ignore
@c Original location, now defunct.
@url{http://www.pantor.com/download.html}.
@end ignore

To convert a W3C XML Schema to an RNC schema, you first need to convert it
to RELAX NG XML syntax using the RELAX NG converter tool @code{rngconv}
(built on top of MSV).  See @url{https://github.com/kohsuke/msv}
and @url{https://msv.dev.java.net/}.

For historical discussions only, see the mailing list archives at
@url{http://groups.yahoo.com/group/emacs-nxml-mode/}.  Please post all new
discussions to the @samp{help-gnu-emacs} and @samp{emacs-devel} mailing
lists.  Report any bugs with @kbd{M-x report-emacs-bug}.

@node Completion
@chapter Completion

Apart from real-time validation, the most important feature that nXML
mode provides for assisting in document creation is @dfn{completion}.
Completion assists the user in inserting characters at point, based on
knowledge of the schema and on the contents of the buffer before
point.

nXML mode adapts the standard GNU Emacs command for completion in a
buffer: @code{completion-at-point}, which is bound to @kbd{C-M-i} and
@kbd{M-@key{TAB}}.  Note that many window systems and window managers
use @kbd{M-@key{TAB}} themselves (typically for switching between
windows) and do not pass it to applications.  In that case, you should
type @kbd{C-M-i} or @kbd{@key{ESC} @key{TAB}} for completion, or bind
@code{completion-at-point} to a key that is convenient for you.  In
the following, I will assume that you type @kbd{C-M-i}.

nXML mode completion works by examining the symbol preceding point.
This is the symbol to be completed.  The symbol to be completed may be
empty.  Completion considers which symbols starting with the symbol to
be completed would be valid replacements for it, given the schema and
the contents of the buffer before point.  These symbols are the
possible completions.  An example may make this clearer.  Suppose the
buffer looks like this (where @point{} indicates point):

@example
<@point{}
@end example

@noindent
In this case, the symbol to be completed is empty, and the possible
completions are @samp{base}, @samp{isindex}, @samp{link}, @samp{meta},
@samp{script}, @samp{style}, @samp{title}.  Another example is:

@example
<@point{}
@end example

@noindent
@kbd{C-M-i} will yield

@example
@end example

@noindent
This says to use the schema @samp{xhtml.rnc} for a document with
namespace @samp{http://www.w3.org/1999/xhtml}, and to use the schema
@samp{docbook.rnc} for a document whose local name is @samp{book}.  If
the document element had both a namespace URI of
@samp{http://www.w3.org/1999/xhtml} and a local name of @samp{book},
then the matching rule that comes first will be used and so the schema
@samp{xhtml.rnc} would be used.
There is no precedence between different types of rule; the first
matching rule of any type is used.

As usual with XML-related technologies, resources are identified by
URIs.  The @samp{uri} attribute identifies the schema by specifying
the URI@.  The URI may be relative.  If so, it is resolved relative to
the URI of the schema locating file that contains the attribute.  This
means that if the value of the @samp{uri} attribute does not contain a
@samp{/}, then it will refer to a filename in the same directory as
the schema locating file.

@node Using the document's URI to locate a schema
@subsection Using the document's URI to locate a schema

A @samp{uri} rule locates a schema based on the URI of the document.
The @samp{uri} attribute specifies the URI of the schema.  The
@samp{resource} attribute can be used to specify the schema for a
particular document.  For example,

@example
<uri resource="spec.xml" uri="docbook.rnc"/>
@end example

@noindent
specifies that the schema for @samp{spec.xml} is @samp{docbook.rnc}.

The @samp{pattern} attribute can be used instead of the
@samp{resource} attribute to specify the schema for any document whose
URI matches a pattern.  The pattern has the same syntax as an absolute
or relative URI except that the path component of the URI can use a
@samp{*} character to stand for zero or more characters within a path
segment (i.e., any character other than @samp{/}).  Typically, the URI
pattern looks like a relative URI, but, whereas a relative URI in the
@samp{resource} attribute is resolved into a particular absolute URI
using the base URI of the schema locating file, a relative URI pattern
matches if it matches some number of complete path segments of the
document's URI ending with the last path segment of the document's
URI@.  For example,

@example
<uri pattern="*.xsl" uri="xslt.rnc"/>
@end example

@noindent
specifies that the schema for documents with a URI whose path ends
with @samp{.xsl} is @samp{xslt.rnc}.

A @samp{transformURI} rule locates a schema by transforming the URI of
the document.
The @samp{fromPattern} attribute specifies a URI pattern with the same
meaning as the @samp{pattern} attribute of the @samp{uri} element.
The @samp{toPattern} attribute is a URI pattern that is used to
generate the URI of the schema.  Each @samp{*} in the @samp{toPattern}
is replaced by the string that matched the corresponding @samp{*} in
the @samp{fromPattern}.  The resulting string is appended to the
initial part of the document's URI that was not explicitly matched by
the @samp{fromPattern}.  The rule matches only if the transformed URI
identifies an existing resource.  For example, the rule

@example
<transformURI fromPattern="*.xml" toPattern="*.rnc"/>
@end example

@noindent
would transform the URI @samp{file:///home/jjc/docs/spec.xml} into the
URI @samp{file:///home/jjc/docs/spec.rnc}.  Thus, this rule specifies
that to locate a schema for a document @samp{@var{foo}.xml}, Emacs
should test whether a file @samp{@var{foo}.rnc} exists in the same
directory as @samp{@var{foo}.xml}, and, if so, should use it as the
schema.

@node Using the document element to locate a schema
@subsection Using the document element to locate a schema

A @samp{documentElement} rule locates a schema based on the local name
and prefix of the document element.  For example, a rule

@example
<documentElement prefix="xsl" localName="stylesheet" uri="xslt.rnc"/>
@end example

@noindent
specifies that when the name of the document element is
@samp{xsl:stylesheet}, then @samp{xslt.rnc} should be used as the
schema.  Either the @samp{prefix} or @samp{localName} attribute may be
omitted to allow any prefix or local name.

A @samp{namespace} rule locates a schema based on the namespace URI of
the document element.  For example, a rule

@example
<namespace ns="http://www.w3.org/1999/XSL/Transform" uri="xslt.rnc"/>
@end example

@noindent
specifies that when the namespace URI of the document element is
@samp{http://www.w3.org/1999/XSL/Transform}, then @samp{xslt.rnc}
should be used as the schema.

@node Using type identifiers in schema locating files
@subsection Using type identifiers in schema locating files

Type identifiers allow a level of indirection in locating the schema
for a document.
Instead of associating the document directly with a schema URI, the
document is associated with a type identifier, which is in turn
associated with a schema URI@.  nXML mode does not constrain the format
of type identifiers.  They can simply be strings without any formal
structure or they can be public identifiers or URIs.  Note that these
type identifiers have nothing to do with the DOCTYPE declaration.
When comparing type identifiers, whitespace is normalized in the same
way as with the @samp{xsd:token} datatype: leading and trailing
whitespace is stripped; other sequences of whitespace are normalized
to a single space character.

Each of the rules described in previous sections that uses a
@samp{uri} attribute to specify a schema can instead use a
@samp{typeId} attribute to specify a type identifier.  The type
identifier can be associated with a URI using a @samp{typeId} element.
For example,

@example
@end example

@noindent
declares three type identifiers @samp{XHTML} (representing the default
variant of XHTML to be used), @samp{XHTML Strict} and @samp{XHTML
Transitional}.  Such a schema locating file would use
@samp{xhtml-strict.rnc} for a document whose namespace is
@samp{http://www.w3.org/1999/xhtml}.  But it is considerably more
flexible than a schema locating file that simply specified

@example
<namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml-strict.rnc"/>
@end example

@noindent
A user can easily use @kbd{C-c C-s C-t} to select between XHTML Strict
and XHTML Transitional.  Also, a user can easily add a catalog

@example
@end example

@noindent
that makes the default variant of XHTML be XHTML Transitional.

@node Using multiple schema locating files
@subsection Using multiple schema locating files

The @samp{include} element includes rules from another schema locating
file.  The behavior is exactly as if the rules from that file were
included in place of the @samp{include} element.  Relative URIs are
resolved into absolute URIs before the inclusion is performed.
For example,

@example
@end example

@noindent
includes the rules from @samp{rules.xml}.

The process of locating a schema takes as input a list of schema
locating files.  The rules in all these files and in the files they
include are resolved into a single list of rules, which are applied
strictly in order.  Sometimes this order is not what is needed.
For example, suppose you have two schema locating files, a private
file

@example
@end example

@noindent
followed by a public file

@example
@end example

@noindent
The effect of these two files is that the XHTML @samp{namespace} rule
takes precedence over the @samp{transformURI} rule, which is almost
certainly not what is needed.  This can be solved by adding an
@samp{applyFollowingRules} element to the private file.

@example
@end example

@node DTDs
@chapter DTDs

nXML mode is designed to support the creation of standalone XML
documents that do not depend on a DTD@.  Although it is common practice
to insert a DOCTYPE declaration referencing an external DTD, this has
undesirable side-effects.  It means that the document is no longer
self-contained.  It also means that different XML parsers may interpret
the document in different ways, since the XML Recommendation does not
require XML parsers to read the DTD@.  With DTDs, it was impractical to
get validation without using an external DTD or reference to a
parameter entity.  With RELAX NG and other schema languages, you can
simultaneously get the benefits of validation and standalone XML
documents.  Therefore, I recommend that you do not reference an
external DOCTYPE in your XML documents.

One problem is entities for characters.  Typically, as well as
providing validation, DTDs also provide a set of character entities
for documents to use.  Schemas cannot provide this functionality,
because schema validation happens after XML parsing.  The recommended
solution is to either use the Unicode characters directly, or, if this
is impractical, use character references.
nXML mode supports this by providing commands for entering characters
and character references using the Unicode names, and can display the
glyph corresponding to a character reference.

@node Limitations
@chapter Limitations

nXML mode has some limitations:

@itemize @bullet
@item
DTD support is limited.  Internal parsed general entities declared
in the internal subset are supported provided they do not contain
elements.  Other usage of DTDs is ignored.
@item
The restrictions on RELAX NG schemas in section 7 of the RELAX NG
specification are not enforced.
@end itemize

@node GNU Free Documentation License
@appendix GNU Free Documentation License
@include doclicense.texi

@bye