Diffstat (limited to 'Doc/library/xml.sax.handler.rst')
-rw-r--r-- Doc/library/xml.sax.handler.rst 402
1 files changed, 402 insertions, 0 deletions
diff --git a/Doc/library/xml.sax.handler.rst b/Doc/library/xml.sax.handler.rst
new file mode 100644
index 0000000000..bc287d1337
--- /dev/null
+++ b/Doc/library/xml.sax.handler.rst
@@ -0,0 +1,402 @@
+
+:mod:`xml.sax.handler` --- Base classes for SAX handlers
+========================================================
+
+.. module:: xml.sax.handler
+ :synopsis: Base classes for SAX event handlers.
+.. moduleauthor:: Lars Marius Garshol <larsga@garshol.priv.no>
+.. sectionauthor:: Martin v. Löwis <martin@v.loewis.de>
+
+
+.. versionadded:: 2.0
+
+The SAX API defines four kinds of handlers: content handlers, DTD handlers,
+error handlers, and entity resolvers. Applications normally only need to
+implement those interfaces whose events they are interested in; they can
+implement the interfaces in a single object or in multiple objects. Handler
+implementations should inherit from the base classes provided in the module
+:mod:`xml.sax.handler`, so that all methods get default implementations.
+
+
+.. class:: ContentHandler
+
+ This is the main callback interface in SAX, and the one most important to
+ applications. The order of events in this interface mirrors the order of the
+ information in the document.
+
+
+.. class:: DTDHandler
+
+ Handle DTD events.
+
+ This interface specifies only those DTD events required for basic parsing
+ (unparsed entities and attributes).
+
+
+.. class:: EntityResolver
+
+ Basic interface for resolving entities. If you create an object implementing
+ this interface and register it with your parser, the parser will call the
+ method in your object to resolve all external entities.
+
+
+.. class:: ErrorHandler
+
+ Interface used by the parser to present error and warning messages to the
+ application. The methods of this object control whether errors are immediately
+ converted to exceptions or are handled in some other way.
+
+In addition to these classes, :mod:`xml.sax.handler` provides symbolic constants
+for the feature and property names.
+
+
+.. data:: feature_namespaces
+
+ Value: ``"http://xml.org/sax/features/namespaces"`` --- true: Perform Namespace
+ processing. --- false: Optionally do not perform Namespace processing (implies
+ namespace-prefixes; default). --- access: (parsing) read-only; (not parsing)
+ read/write
+
+
+.. data:: feature_namespace_prefixes
+
+ Value: ``"http://xml.org/sax/features/namespace-prefixes"`` --- true: Report
+ the original prefixed names and attributes used for Namespace
+ declarations. --- false: Do not report attributes used for Namespace
+ declarations, and optionally do not report original prefixed names
+ (default). --- access: (parsing) read-only; (not parsing) read/write
+
+
+.. data:: feature_string_interning
+
+ Value: ``"http://xml.org/sax/features/string-interning"`` --- true: All element
+ names, prefixes, attribute names, Namespace URIs, and local names are interned
+ using the built-in intern function. --- false: Names are not necessarily
+ interned, although they may be (default). --- access: (parsing) read-only; (not
+ parsing) read/write
+
+
+.. data:: feature_validation
+
+ Value: ``"http://xml.org/sax/features/validation"`` --- true: Report all
+ validation errors (implies external-general-entities and
+ external-parameter-entities). --- false: Do not report validation errors. ---
+ access: (parsing) read-only; (not parsing) read/write
+
+
+.. data:: feature_external_ges
+
+ Value: ``"http://xml.org/sax/features/external-general-entities"`` --- true:
+ Include all external general (text) entities. --- false: Do not include
+ external general entities. --- access: (parsing) read-only; (not parsing)
+ read/write
+
+
+.. data:: feature_external_pes
+
+ Value: ``"http://xml.org/sax/features/external-parameter-entities"`` --- true:
+ Include all external parameter entities, including the external DTD subset. ---
+ false: Do not include any external parameter entities, even the external DTD
+ subset. --- access: (parsing) read-only; (not parsing) read/write
+
+
+.. data:: all_features
+
+ List of all features.
+
+
+.. data:: property_lexical_handler
+
+ Value: ``"http://xml.org/sax/properties/lexical-handler"`` --- data type:
+ xml.sax.sax2lib.LexicalHandler (not supported in Python 2) --- description: An
+ optional extension handler for lexical events like comments. --- access:
+ read/write
+
+
+.. data:: property_declaration_handler
+
+ Value: ``"http://xml.org/sax/properties/declaration-handler"`` --- data type:
+ xml.sax.sax2lib.DeclHandler (not supported in Python 2) --- description: An
+ optional extension handler for DTD-related events other than notations and
+ unparsed entities. --- access: read/write
+
+
+.. data:: property_dom_node
+
+ Value: ``"http://xml.org/sax/properties/dom-node"`` --- data type:
+ org.w3c.dom.Node (not supported in Python 2) --- description: When parsing,
+ the current DOM node being visited if this is a DOM iterator; when not parsing,
+ the root DOM node for iteration. --- access: (parsing) read-only; (not parsing)
+ read/write
+
+
+.. data:: property_xml_string
+
+ Value: ``"http://xml.org/sax/properties/xml-string"`` --- data type: String ---
+ description: The literal string of characters that was the source for the
+ current event. --- access: read-only
+
+
+.. data:: all_properties
+
+ List of all known property names.
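+
+As a minimal sketch of how these constants are typically used, a feature can be
+queried and changed before parsing starts. The sketch assumes the reader
+returned by ``xml.sax.make_parser()``, which is not described in this section;
+other readers may support a different set of features.
+

```python
import xml.sax
from xml.sax import handler

# make_parser() returns an XMLReader; which driver it selects is an
# assumption of this sketch.
parser = xml.sax.make_parser()

# Feature values can only be changed while the parser is not parsing.
namespaces_before = parser.getFeature(handler.feature_namespaces)
parser.setFeature(handler.feature_namespaces, True)
namespaces_after = parser.getFeature(handler.feature_namespaces)

print(handler.feature_namespaces)
print(bool(namespaces_before), bool(namespaces_after))
```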
+
+
+.. _content-handler-objects:
+
+ContentHandler Objects
+----------------------
+
+Users are expected to subclass :class:`ContentHandler` to support their
+application. The following methods are called by the parser on the appropriate
+events in the input document:
+
+
+.. method:: ContentHandler.setDocumentLocator(locator)
+
+ Called by the parser to give the application a locator for locating the origin
+ of document events.
+
+ SAX parsers are strongly encouraged (though not absolutely required) to supply a
+ locator. A parser that does so must supply the locator to the application by
+ invoking this method before invoking any of the other methods in the
+ :class:`ContentHandler` interface.
+
+ The locator allows the application to determine the end position of any
+ document-related event, even if the parser is not reporting an error. Typically,
+ the application will use this information for reporting its own errors (such as
+ character content that does not match an application's business rules). The
+ information returned by the locator is probably not sufficient for use with a
+ search engine.
+
+ Note that the locator will return correct information only during the invocation
+ of the events in this interface. The application should not attempt to use it at
+ any other time.
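+
+ The following sketch shows one way to keep the locator and query it while an
+ event is being delivered. The class name ``PositionTracker`` and the sample
+ document are hypothetical, and the exact positions reported depend on the
+ parser in use.
+

```python
import xml.sax
from xml.sax.handler import ContentHandler


class PositionTracker(ContentHandler):
    """Record the line number reported for each start tag."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.locator = None
        self.positions = []

    def setDocumentLocator(self, locator):
        # Called before any other event, if the parser supplies a locator.
        self.locator = locator

    def startElement(self, name, attrs):
        # The locator is only meaningful during an event callback.
        if self.locator is not None:
            self.positions.append((name, self.locator.getLineNumber()))


tracker = PositionTracker()
xml.sax.parseString(b"<doc>\n  <item/>\n</doc>", tracker)
print(tracker.positions)
```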
+
+
+.. method:: ContentHandler.startDocument()
+
+ Receive notification of the beginning of a document.
+
+ The SAX parser will invoke this method only once, before any other methods in
+ this interface or in :class:`DTDHandler` (except for :meth:`setDocumentLocator`).
+
+
+.. method:: ContentHandler.endDocument()
+
+ Receive notification of the end of a document.
+
+ The SAX parser will invoke this method only once, and it will be the last method
+ invoked during the parse. The parser shall not invoke this method until it has
+ either abandoned parsing (because of an unrecoverable error) or reached the end
+ of input.
+
+
+.. method:: ContentHandler.startPrefixMapping(prefix, uri)
+
+ Begin the scope of a prefix-URI Namespace mapping.
+
+ The information from this event is not necessary for normal Namespace
+ processing: the SAX XML reader will automatically replace prefixes for element
+ and attribute names when the ``feature_namespaces`` feature is enabled (it is
+ disabled by default; see :data:`feature_namespaces`).
+
+ There are cases, however, when applications need to use prefixes in character
+ data or in attribute values, where they cannot safely be expanded automatically;
+ the :meth:`startPrefixMapping` and :meth:`endPrefixMapping` events supply the
+ information to the application to expand prefixes in those contexts itself, if
+ necessary.
+
+ Note that :meth:`startPrefixMapping` and :meth:`endPrefixMapping` events are not
+ guaranteed to be properly nested relative to each other: all
+ :meth:`startPrefixMapping` events will occur before the corresponding
+ :meth:`startElement` event, and all :meth:`endPrefixMapping` events will occur
+ after the corresponding :meth:`endElement` event, but their order is not
+ guaranteed.
+
+
+.. method:: ContentHandler.endPrefixMapping(prefix)
+
+ End the scope of a prefix-URI mapping.
+
+ See :meth:`startPrefixMapping` for details. This event will always occur after
+ the corresponding :meth:`endElement` event, but the order of
+ :meth:`endPrefixMapping` events is not otherwise guaranteed.
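+
+ As a rough illustration, the handler below records prefix mapping events once
+ namespace processing has been switched on. ``PrefixTracker`` and the
+ ``urn:example:doc`` namespace are made up for this sketch, and
+ ``xml.sax.make_parser()`` is assumed to return a reader that supports
+ ``feature_namespaces``.
+

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces


class PrefixTracker(ContentHandler):
    """Record the prefix mapping events in the order they arrive."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.events = []

    def startPrefixMapping(self, prefix, uri):
        self.events.append(("start", prefix, uri))

    def endPrefixMapping(self, prefix):
        self.events.append(("end", prefix))


parser = xml.sax.make_parser()
# Prefix mapping events are only produced in namespace mode.
parser.setFeature(feature_namespaces, True)
tracker = PrefixTracker()
parser.setContentHandler(tracker)
parser.parse(io.BytesIO(b'<d:doc xmlns:d="urn:example:doc"><d:item/></d:doc>'))
print(tracker.events)
```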
+
+
+.. method:: ContentHandler.startElement(name, attrs)
+
+ Signals the start of an element in non-namespace mode.
+
+ The *name* parameter contains the raw XML 1.0 name of the element type as a
+ string and the *attrs* parameter holds an object of the :class:`Attributes`
+ interface (see :ref:`attributes-objects`) containing the attributes of
+ the element. The object passed as *attrs* may be re-used by the parser; holding
+ on to a reference to it is not a reliable way to keep a copy of the attributes.
+ To keep a copy of the attributes, use the :meth:`copy` method of the *attrs*
+ object.
+
+
+.. method:: ContentHandler.endElement(name)
+
+ Signals the end of an element in non-namespace mode.
+
+ The *name* parameter contains the name of the element type, just as with the
+ :meth:`startElement` event.
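+
+ A small sketch of a handler for non-namespace mode follows. The class name
+ ``ElementRecorder`` and the sample document are hypothetical; the point is
+ that the attributes are copied because the parser may reuse the *attrs*
+ object.
+

```python
import xml.sax
from xml.sax.handler import ContentHandler


class ElementRecorder(ContentHandler):
    """Track open elements and keep a copy of each element's attributes."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.open_elements = []  # names of elements that are currently open
        self.seen = []           # (name, attributes copy) in document order

    def startElement(self, name, attrs):
        self.open_elements.append(name)
        # copy() gives an object that stays valid after this callback returns.
        self.seen.append((name, attrs.copy()))

    def endElement(self, name):
        self.open_elements.pop()


recorder = ElementRecorder()
xml.sax.parseString(b'<doc><item id="a"/><item id="b"/></doc>', recorder)
for name, attributes in recorder.seen:
    print(name, dict(attributes.items()))
```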
+
+
+.. method:: ContentHandler.startElementNS(name, qname, attrs)
+
+ Signals the start of an element in namespace mode.
+
+ The *name* parameter contains the name of the element type as a ``(uri,
+ localname)`` tuple, the *qname* parameter contains the raw XML 1.0 name used in
+ the source document, and the *attrs* parameter holds an instance of the
+ :class:`AttributesNS` interface (see :ref:`attributes-ns-objects`)
+ containing the attributes of the element. If no namespace is associated with
+ the element, the *uri* component of *name* will be ``None``. The object passed
+ as *attrs* may be re-used by the parser; holding on to a reference to it is not
+ a reliable way to keep a copy of the attributes. To keep a copy of the
+ attributes, use the :meth:`copy` method of the *attrs* object.
+
+ Parsers may set the *qname* parameter to ``None``, unless the
+ ``feature_namespace_prefixes`` feature is activated.
+
+
+.. method:: ContentHandler.endElementNS(name, qname)
+
+ Signals the end of an element in namespace mode.
+
+ The *name* parameter contains the name of the element type, just as with the
+ :meth:`startElementNS` method; the same applies to the *qname* parameter.
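+
+ The namespace-mode callbacks might be used as in the following sketch. The
+ names ``NamespaceRecorder`` and ``urn:example:x`` are invented for
+ illustration, and the reader returned by ``xml.sax.make_parser()`` is assumed
+ to support ``feature_namespaces``.
+

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces


class NamespaceRecorder(ContentHandler):
    """Record (uri, localname) pairs for elements and their attributes."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.elements = []
        self.attributes = {}

    def startElementNS(self, name, qname, attrs):
        uri, localname = name  # uri is None when no namespace applies
        self.elements.append((uri, localname))
        # Attribute names are (uri, localname) tuples as well.
        self.attributes[localname] = dict(attrs.items())


parser = xml.sax.make_parser()
parser.setFeature(feature_namespaces, True)
recorder = NamespaceRecorder()
parser.setContentHandler(recorder)
parser.parse(io.BytesIO(
    b'<doc xmlns:x="urn:example:x"><x:item x:id="1" plain="2"/></doc>'))
print(recorder.elements)
```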
+
+
+.. method:: ContentHandler.characters(content)
+
+ Receive notification of character data.
+
+ The Parser will call this method to report each chunk of character data. SAX
+ parsers may return all contiguous character data in a single chunk, or they may
+ split it into several chunks; however, all of the characters in any single event
+ must come from the same external entity so that the Locator provides useful
+ information.
+
+ *content* may be a Unicode string or a byte string; the ``expat`` reader module
+ always produces Unicode strings.
+
+ .. note::
+
+ The earlier SAX 1 interface provided by the Python XML Special Interest Group
+ used a more Java-like interface for this method. Since most parsers used from
+ Python did not take advantage of the older interface, the simpler signature was
+ chosen to replace it. To convert old code to the new interface, use *content*
+ instead of slicing content with the old *offset* and *length* parameters.
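+
+ Because character data may arrive in several chunks, handlers usually buffer
+ it until the enclosing element ends. The sketch below does this for a
+ hypothetical ``title`` element; the class name ``TextCollector`` is likewise
+ an assumption.
+

```python
import xml.sax
from xml.sax.handler import ContentHandler


class TextCollector(ContentHandler):
    """Join the character chunks reported inside each <title> element."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.chunks = []
        self.titles = []

    def startElement(self, name, attrs):
        # Start a fresh buffer for every element.
        self.chunks = []

    def characters(self, content):
        # One text node may be delivered as several calls.
        self.chunks.append(content)

    def endElement(self, name):
        if name == "title":
            self.titles.append("".join(self.chunks))


collector = TextCollector()
xml.sax.parseString(b"<doc><title>Tom &amp; Jerry</title></doc>", collector)
print(collector.titles)
```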
+
+
+.. method:: ContentHandler.ignorableWhitespace(whitespace)
+
+ Receive notification of ignorable whitespace in element content.
+
+ Validating Parsers must use this method to report each chunk of ignorable
+ whitespace (see the W3C XML 1.0 recommendation, section 2.10): non-validating
+ parsers may also use this method if they are capable of parsing and using
+ content models.
+
+ SAX parsers may return all contiguous whitespace in a single chunk, or they may
+ split it into several chunks; however, all of the characters in any single event
+ must come from the same external entity, so that the Locator provides useful
+ information.
+
+
+.. method:: ContentHandler.processingInstruction(target, data)
+
+ Receive notification of a processing instruction.
+
+ The Parser will invoke this method once for each processing instruction found:
+ note that processing instructions may occur before or after the main document
+ element.
+
+ A SAX parser should never report an XML declaration (XML 1.0, section 2.8) or a
+ text declaration (XML 1.0, section 4.3.1) using this method.
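+
+ For illustration, the following sketch collects processing instructions from
+ a document that also carries an XML declaration. ``InstructionCollector`` and
+ the stylesheet instruction are hypothetical examples.
+

```python
import xml.sax
from xml.sax.handler import ContentHandler


class InstructionCollector(ContentHandler):
    """Record the target and data of every processing instruction."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.instructions = []

    def processingInstruction(self, target, data):
        self.instructions.append((target, data))


DOCUMENT = (b'<?xml version="1.0"?>'
            b'<?xml-stylesheet href="style.css" type="text/css"?>'
            b'<doc/>')

collector = InstructionCollector()
xml.sax.parseString(DOCUMENT, collector)
# The XML declaration is not reported through this method.
print(collector.instructions)
```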
+
+
+.. method:: ContentHandler.skippedEntity(name)
+
+ Receive notification of a skipped entity.
+
+ The Parser will invoke this method once for each entity skipped. Non-validating
+ processors may skip entities if they have not seen the declarations (because,
+ for example, the entity was declared in an external DTD subset). All processors
+ may skip external entities, depending on the values of the
+ ``feature_external_ges`` and ``feature_external_pes`` features.
+
+
+.. _dtd-handler-objects:
+
+DTDHandler Objects
+------------------
+
+:class:`DTDHandler` instances provide the following methods:
+
+
+.. method:: DTDHandler.notationDecl(name, publicId, systemId)
+
+ Handle a notation declaration event.
+
+
+.. method:: DTDHandler.unparsedEntityDecl(name, publicId, systemId, ndata)
+
+ Handle an unparsed entity declaration event.
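+
+The two methods can be exercised with a document whose internal DTD subset
+declares a notation and an unparsed entity. This is only a sketch: the class
+name ``DeclarationRecorder`` and the declarations are made up, and the reader
+from ``xml.sax.make_parser()`` is assumed to report DTD events.
+

```python
import io
import xml.sax
from xml.sax.handler import DTDHandler


class DeclarationRecorder(DTDHandler):
    """Record notation and unparsed entity declarations."""

    def __init__(self):
        self.notations = []
        self.unparsed_entities = []

    def notationDecl(self, name, publicId, systemId):
        self.notations.append((name, publicId, systemId))

    def unparsedEntityDecl(self, name, publicId, systemId, ndata):
        self.unparsed_entities.append((name, publicId, systemId, ndata))


DOCUMENT = b"""<!DOCTYPE doc [
  <!NOTATION gif SYSTEM "image/gif">
  <!ENTITY logo SYSTEM "logo.gif" NDATA gif>
]>
<doc/>"""

recorder = DeclarationRecorder()
parser = xml.sax.make_parser()
parser.setDTDHandler(recorder)
parser.parse(io.BytesIO(DOCUMENT))
print(recorder.notations, recorder.unparsed_entities)
```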
+
+
+.. _entity-resolver-objects:
+
+EntityResolver Objects
+----------------------
+
+
+.. method:: EntityResolver.resolveEntity(publicId, systemId)
+
+ Resolve the system identifier of an entity and return either the system
+ identifier to read from as a string, or an InputSource to read from. The default
+ implementation returns *systemId*.
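+
+ One possible use is to substitute local content for external entities. The
+ sketch below returns an empty ``InputSource`` for every request;
+ ``EmptyEntityResolver`` is a hypothetical name, and whether a parser consults
+ the resolver at all depends on the parser and on features such as
+ ``feature_external_ges``.
+

```python
import io
import xml.sax
from xml.sax.handler import EntityResolver
from xml.sax.xmlreader import InputSource


class EmptyEntityResolver(EntityResolver):
    """Answer every external entity request with empty content."""

    def __init__(self):
        self.requested = []

    def resolveEntity(self, publicId, systemId):
        self.requested.append((publicId, systemId))
        source = InputSource(systemId)
        # Supplying a stream means nothing has to be fetched from systemId.
        source.setByteStream(io.BytesIO(b""))
        return source


resolver = EmptyEntityResolver()
parser = xml.sax.make_parser()
parser.setEntityResolver(resolver)
```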
+
+
+.. _sax-error-handler:
+
+ErrorHandler Objects
+--------------------
+
+Objects with this interface are used to receive error and warning information
+from the :class:`XMLReader`. If you create an object that implements this
+interface and register it with your :class:`XMLReader`, the parser
+will call the methods in your object to report all warnings and errors. There
+are three levels of errors available: warnings, (possibly) recoverable errors,
+and unrecoverable errors. All methods take a :exc:`SAXParseException` as the
+only parameter. Errors and warnings may be converted to an exception by raising
+the passed-in exception object.
+
+
+.. method:: ErrorHandler.error(exception)
+
+ Called when the parser encounters a recoverable error. If this method does not
+ raise an exception, parsing may continue, but further document information
+ should not be expected by the application. Allowing the parser to continue may
+ allow additional errors to be discovered in the input document.
+
+
+.. method:: ErrorHandler.fatalError(exception)
+
+ Called when the parser encounters an error it cannot recover from; parsing is
+ expected to terminate when this method returns.
+
+
+.. method:: ErrorHandler.warning(exception)
+
+ Called when the parser presents minor warning information to the application.
+ Parsing is expected to continue when this method returns, and document
+ information will continue to be passed to the application. Raising an exception
+ in this method will cause parsing to end.
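+
+The following sketch combines the three methods in one handler that collects
+warnings and recoverable errors but stops on fatal errors by raising the
+exception it was given. The class name ``CollectingErrorHandler`` and the
+malformed sample document are assumptions made for this example.
+

```python
import xml.sax
from xml.sax import SAXParseException
from xml.sax.handler import ContentHandler, ErrorHandler


class CollectingErrorHandler(ErrorHandler):
    """Collect warnings and errors; abort parsing on fatal errors."""

    def __init__(self):
        self.warnings = []
        self.errors = []
        self.fatal_errors = []

    def warning(self, exception):
        self.warnings.append(exception)

    def error(self, exception):
        # Not raising here lets the parser continue if it is able to.
        self.errors.append(exception)

    def fatalError(self, exception):
        self.fatal_errors.append(exception)
        raise exception


problems = CollectingErrorHandler()
message = None
try:
    # The document is deliberately not well-formed.
    xml.sax.parseString(b"<doc><unclosed></doc>", ContentHandler(), problems)
    parsed_ok = True
except SAXParseException as exc:
    parsed_ok = False
    message = "line %d: %s" % (exc.getLineNumber(), exc.getMessage())

print(parsed_ok, message)
```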
+