author: milde <milde@929543f6-e4f2-0310-98a6-ba3bd3dd1d04> 2010-05-05 12:08:10 +0000
committer: milde <milde@929543f6-e4f2-0310-98a6-ba3bd3dd1d04> 2010-05-05 12:08:10 +0000
commit: 87cfa9d99f38c846ed02308f8c33d607f71450ff
tree: 5d4ef766985a033e1e77d0a294344d19404f2df2
parent: 31da62a25c31e64de43216e4a52fc08e6f4a132e
Drop the latex2e-special output_encoding default ("latin-1").
git-svn-id: http://svn.code.sf.net/p/docutils/code/trunk@6319 929543f6-e4f2-0310-98a6-ba3bd3dd1d04
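
To illustrate the effect of this change, here is a minimal sketch of how a Docutils output encoding could be turned into the `inputenc` line of the LaTeX preamble. It is not the writer's actual code: the names `latex_encoding_name` and `inputenc_line`, the shortened translation table, and the stripping of an error-handler suffix such as ":strict" are assumptions for illustration. The diff below only confirms the default rule (remove "-" and "_"), the latin1/latin2 table entries, and that no `inputenc` line is written for "ascii".

```python
# Python 3 sketch with hypothetical helper names; not the latex2e writer's code.

# Shortened translation table (assumption: the writer's own table is longer).
ENCODING_NAMES = {
    'iso-8859-1': 'latin1',  # west european
    'iso-8859-2': 'latin2',  # east european
}


def latex_encoding_name(docutils_encoding):
    """Translate a Docutils encoding name into an inputenc option."""
    # Assumption: an error handler suffix such as ":strict" is dropped.
    name = docutils_encoding.lower().split(':')[0]
    if name in ENCODING_NAMES:
        return ENCODING_NAMES[name]
    # Default rule: remove "-" and "_" characters.
    return name.replace('-', '').replace('_', '')


def inputenc_line(docutils_encoding):
    """Return the preamble line, or '' if inputenc is not needed."""
    name = latex_encoding_name(docutils_encoding)
    if name == 'ascii':
        return ''
    return r'\usepackage[%s]{inputenc}' % name
```

Under these assumptions, `inputenc_line('utf-8')` yields `\usepackage[utf8]{inputenc}`, which matches the updated expected test output, while `inputenc_line('latin-1')` reproduces the line written under the old default.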
-rw-r--r--  docutils/HISTORY.txt  3
-rw-r--r--  docutils/RELEASE-NOTES.txt  5
-rw-r--r--  docutils/docs/user/latex.txt  73
-rw-r--r--  docutils/docutils/writers/latex2e/__init__.py  163
-rw-r--r--  docutils/test/functional/expected/latex_docinfo.tex  2
-rw-r--r--  docutils/test/functional/expected/standalone_rst_latex.tex  380
-rw-r--r--  docutils/test/functional/input/data/latex_encoding.txt  2
-rw-r--r--  docutils/test/functional/input/data/unicode.txt  73
-rwxr-xr-x  docutils/test/test_writers/test_latex2e.py  16
9 files changed, 602 insertions, 115 deletions
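
The core of the writer change can be sketched as follows: characters that have a `textcomp` macro always register the package as a requirement, but they are only replaced by the macro when the LaTeX input encoding is not UTF-8. This is a simplified Python 3 illustration, not the code from the diff. The names `TEXTCOMP_CHARS`, `TEXTCOMP_PREAMBLE` and `encode_text` are hypothetical, the table is a three-entry excerpt, and the scope of the `utf8` condition is inferred from the new `textcomp` test case rather than from the (indentation-stripped) hunk.

```python
# Python 3 sketch with hypothetical names; a small excerpt of the mapping.
TEXTCOMP_CHARS = {
    0x00a4: r'\textcurrency{}',  # CURRENCY SIGN
    0x00b5: r'\textmu{}',        # MICRO SIGN
    0x20ac: r'\texteuro{}',      # EURO SIGN
}
TEXTCOMP_PREAMBLE = r'\usepackage{textcomp} % text symbol macros'


def encode_text(text, latex_encoding, requirements):
    """Return text prepared for LaTeX and record required packages."""
    table = {}
    # Assumption: macros replace the characters only for non-UTF-8 output.
    if not latex_encoding.startswith('utf8'):
        table.update(TEXTCOMP_CHARS)
    # The package is requested whenever such a character occurs.
    for ch in text:
        if ord(ch) in TEXTCOMP_CHARS:
            requirements['textcomp'] = TEXTCOMP_PREAMBLE
    return text.translate(table)
```

Called with the test string "2 µm is just 2/1000000 m" and the encoding "utf8", this sketch leaves the text unchanged and adds the `textcomp` line to `requirements`, which is the behaviour the new test in `test_latex2e.py` expects.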
diff --git a/docutils/HISTORY.txt b/docutils/HISTORY.txt
index e484fb233..30b05feba 100644
--- a/docutils/HISTORY.txt
+++ b/docutils/HISTORY.txt
@@ -55,9 +55,12 @@ Changes Since 0.6
- Fix hyperlink targets (labels) for images, figures, and tables.
- Apply [ 2961988 ] Load babel after inputenc and fontenc.
- Apply [ 2961991 ] Call hyperref with unicode option.
+ - Drop the special `output_encoding`__ default ("latin-1").
+ The Docutils-wide default (usually "UTF-8") is used instead.
__ docs/ref/restructuredtext.html#inline-literals
__ docs/user/config.html#docutils-footnotes
+__ docs/user/config.html#output_encoding
* docutils/writers/manpage.py
diff --git a/docutils/RELEASE-NOTES.txt b/docutils/RELEASE-NOTES.txt
index 4df71db05..5b79b4fb8 100644
--- a/docutils/RELEASE-NOTES.txt
+++ b/docutils/RELEASE-NOTES.txt
@@ -31,12 +31,15 @@ Components:
- Deprecate ``figure_footnotes`` setting.
- Rename ``use_latex_footnotes`` setting to `docutils_footnotes`__.
- New ``latex_preamble`` setting.
- - PDF standard fonts (Times/Helvetica/Courier) as default.
+ - Use PDF standard fonts (Times/Helvetica/Courier) as default.
- `hyperref` package called with ``unicode`` option (see the
`hyperref config tips`__ for how to override).
+ - Drop the special `output_encoding`__ default ("latin-1").
+ The Docutils-wide default (usually "UTF-8") is used instead.
__ docs/user/config.html#docutils-footnotes
__ docs/user/latex.html#hyperlinks
+__ docs/user/latex.html#output-encoding
General:
diff --git a/docutils/docs/user/latex.txt b/docutils/docs/user/latex.txt
index 6c2ee0971..8a47dd6c7 100644
--- a/docutils/docs/user/latex.txt
+++ b/docutils/docs/user/latex.txt
@@ -943,7 +943,7 @@ Example 1:
This will improve the look on screen with the default Computer Modern
fonts at the expense of problems with `search and text extraction`_
- The recommended workaround is to select a T1-encoded "Type 1" (vector)
+ The recommended way is to select a T1-encoded "Type 1" (vector)
font, for example `Latin Modern`_
Example 2:
@@ -1580,30 +1580,33 @@ Example:
text encoding
-------------
-The encoding of the LaTeX source file, i.e.
-Docutils' *output* encoding becomes the LaTeX *input* encoding.
+The encoding of the LaTeX source file is Docutils' *output* encoding
+and, at the same time, LaTeX's *input* encoding.
Option: output-encoding_
``--output-encoding=OUTPUT-ENCODING``
Default:
- latin1
+ "utf8"
- .. TODO: use the Docutils-wide default: output-encoding = input-encoding
+Example:
+ Encode the LaTeX source file with the ISO `latin-1` (west european)
+ 8-bit encoding (the default in Docutils versions up to 0.6.)::
+ --output-encoding=latin-1
-LaTeX comes with two packages for UTF-8 support,
+Note:
+ LaTeX comes with two packages for UTF-8 support,
-:utf8: by the standard `inputenc`_ package with only limited coverage
- (mainly accented chars, only few non-alphabetic symbols, no Greek or
- Cyrillic).
+ :utf8: by the standard `inputenc`_ package with only limited coverage
+ (mainly accented chars, no Greek).
-:utf8x: supported by the `ucs`_ package covers a wider range of Unicode
- characters than does "utf8". It is, however, a non-standard
- extension and no longer developed.
+ :utf8x: supported by the `ucs`_ package covers a wider range of Unicode
+ characters than does "utf8". It is, however, a non-standard
+ extension and no longer developed.
-Currently (in version 0.6), "utf8" is used if the output-encoding is
-any of "utf_8", "U8", "UTF", or "utf8".
+ Currently (in version 0.6), "utf8" is used if the output-encoding is
+ any of "utf_8", "U8", "UTF", or "utf8".
.. with utf8x:
If LaTeX issues a Warning about unloaded/unknown characters adding ::
@@ -1795,6 +1798,23 @@ If updating LaTeX is not an option, just remove the "px" from the length
specification. HTML/CSS will default to "px" while the `latexe2` writer
will add the fallback unit "bp".
+Error ``Symbol \textcurrency not provided`` ...
+```````````````````````````````````````````````
+
+The currency sign (\\u00a4) is not supported by all fonts (some have
+a Euro sign in its place). You might see an error like::
+
+ ! Package textcomp Error: Symbol \textcurrency not provided by
+ (textcomp) font family ptm in TS1 encoding.
+ (textcomp) Default family used instead.
+
+(which in case of font family "ptm" is a false positive). Add either
+
+:warn: turn the error into a warning and use the default symbol (bitmap), or
+:force,almostfull: use the symbol provided by the font at the user's
+ risk,
+
+to the document options or use a different font package.
Search and text extraction
``````````````````````````
@@ -1806,21 +1826,34 @@ umlauts) might fail. See font_ and `font encoding`_ (as well as
.. _Searching PDF files:
http://www.tex.ac.uk/cgi-bin/texfaq2html?label=srchpdf
-Unicode box drawing characters
-```````````````````````````````
+Unicode box drawing and block characters
+````````````````````````````````````````
+
+- Generate LaTeX code with `output-encoding`_ "utf-8".
+
+- Add the pmboxdraw_ package to the `style sheets`_.
+ (For shaded boxes also add the `color` package.)
- - generate LaTeX code with ``--output-encoding=utf-8:strict``.
+Unfortunately, this defines only a subset of the characters
+(see pmboxdraw.pdf_ for a list).
- - In the latex file, edit the preamble to load "ucs" with "postscript"
- option and also load the pstricks package::
+Alternatively:
+
+- In the latex file, edit the preamble to load ucs_ with "postscript"
+ option and also load the pstricks package::
- \usepackage[utf8]{inputenc}
+ \usepackage[postscript]{ucs}
+ \usepackage{pstricks}
+ \usepackage[utf8x]{inputenc}
- - Convert to PDF with ``latex``, ``dvips``, and ``ps2pdf``.
+- Convert to PDF with ``latex``, ``dvips``, and ``ps2pdf``.
+
+.. _pmboxdraw:
+ http://www.ctan.org/tex-archive/help/Catalogue/entries/pmboxdraw.html
+.. _pmboxdraw.pdf:
+ http://www.ctan.org/tex-archive/macros/latex/contrib/oberdiek/pmboxdraw.pdf
Bugs and open issues
--------------------
diff --git a/docutils/docutils/writers/latex2e/__init__.py b/docutils/docutils/writers/latex2e/__init__.py
index e6a0e233a..ab9569715 100644
--- a/docutils/docutils/writers/latex2e/__init__.py
+++ b/docutils/docutils/writers/latex2e/__init__.py
@@ -17,8 +17,8 @@ import os
import time
import re
import string
-from docutils import frontend, nodes, languages, writers, utils, transforms, io
-from docutils.writers.newlatex2e import unicode_map
+from docutils import frontend, nodes, languages, writers, utils, io
+from docutils.transforms import writer_aux
# compatibility module for Python <= 2.4
if not hasattr(string, 'Template'):
@@ -39,7 +39,7 @@ class Writer(writers.Writer):
r'\usepackage{courier}'])
settings_spec = (
'LaTeX-Specific Options',
- 'The LaTeX "--output-encoding" default is "latin-1:strict".',
+ None,
(('Specify documentclass. Default is "article".',
['--documentclass'],
{'default': 'article', }),
@@ -198,10 +198,8 @@ class Writer(writers.Writer):
{'default': None, }),
),)
- settings_defaults = {'output_encoding': 'latin-1',
- 'sectnum_depth': 0 # updated by SectNum transform
+ settings_defaults = {'sectnum_depth': 0 # updated by SectNum transform
}
-
relative_path_settings = ('stylesheet_path',)
config_section = 'latex2e writer'
@@ -225,7 +223,7 @@ class Writer(writers.Writer):
transform_list = writers.Writer.get_transforms(self)
# print transform_list
# Convert specific admonitions to generic one
- transform_list.append(transforms.writer_aux.Admonitions)
+ transform_list.append(writer_aux.Admonitions)
# TODO: footnote collection transform
# transform_list.append(footnotes.collect)
return transform_list
@@ -355,7 +353,7 @@ class SortableDict(dict):
"""Dictionary with additional sorting methods
Tip: use keys starting with '_' for sorting before small letters
- and with '~' for sorting after small letters.
+ and with '~' for sorting after small letters.
"""
def sortedkeys(self):
"""Return sorted list of keys"""
@@ -559,6 +557,11 @@ PreambleCmds.table = r"""\usepackage{longtable}
\setlength{\extrarowheight}{2pt}
\newlength{\DUtablewidth} % internal use in tables"""
+# Options [force,almostfull] prevent spurious error messages, see
+# de.comp.text.tex/2005-12/msg01855
+PreambleCmds.textcomp = """\
+\\usepackage{textcomp} % text symbol macros"""
+
PreambleCmds.documenttitle = r"""
%% Document title
\title{%s}
@@ -635,9 +638,10 @@ class Table(object):
Table style might be
- :standard: horizontal and vertical lines
- :booktabs: only horizontal lines (requires "booktabs" LaTeX package)
- :nolines: (or borderless) no lines
+ :standard: horizontal and vertical lines
+ :booktabs: only horizontal lines (requires "booktabs" LaTeX package)
+ :borderless: no borders around table cells
+ :nolines: alias for borderless
"""
def __init__(self,translator,latex_type,table_style):
self._translator = translator
@@ -936,7 +940,7 @@ class LaTeXTranslator(nodes.NodeVisitor):
self.docutils_footnotes = True
self.warn('`use_latex_footnotes` is deprecated. '
'The setting has been renamed to `docutils_footnotes` '
- 'and the alias will be removed in a future version.')
+ 'and the alias will be removed in a future version.')
self.figure_footnotes = settings.figure_footnotes
if self.figure_footnotes:
self.docutils_footnotes = True
@@ -1008,19 +1012,23 @@ class LaTeXTranslator(nodes.NodeVisitor):
# Process settings
# ~~~~~~~~~~~~~~~~
- # persistent requirements
- if self.font_encoding == '':
- fontenc_header = r'%\usepackage[OT1]{fontenc}'
+ # Static requirements
+ # TeX font encoding
+ if self.font_encoding:
+ encodings = [r'\usepackage[%s]{fontenc}' % self.font_encoding]
else:
- fontenc_header = r'\usepackage[%s]{fontenc}' % self.font_encoding
- self.requirements['_persistent'] = '\n'.join([
- fontenc_header,
- r'\usepackage[%s]{inputenc}' % self.latex_encoding,
+ encodings = [r'%\usepackage[OT1]{fontenc}'] # just a comment
+ # Docutils' output-encoding => TeX input encoding:
+ if self.latex_encoding != 'ascii':
+ encodings.append(r'\usepackage[%s]{inputenc}'
+ % self.latex_encoding)
+ self.requirements['_static'] = '\n'.join(
+ encodings + [
r'\usepackage{ifthen}',
- # multi-language support (language is in document settings)
+ # multi-language support (language is in document options)
'\\usepackage{babel}%s' % self.babel.setup,
])
- # page layout with typearea (if there are relevant document options).
+ # page layout with typearea (if there are relevant document options)
if (settings.documentclass.find('scr') == -1 and
(self.d_options.find('DIV') != -1 or
self.d_options.find('BCOR') != -1)):
@@ -1096,7 +1104,6 @@ class LaTeXTranslator(nodes.NodeVisitor):
"""Translate docutils encoding name into LaTeX's.
Default method is remove "-" and "_" chars from docutils_encoding.
-
"""
tr = { 'iso-8859-1': 'latin1', # west european
'iso-8859-2': 'latin2', # east european
@@ -1189,8 +1196,9 @@ class LaTeXTranslator(nodes.NodeVisitor):
# Unicode chars that are not recognized by LaTeX's utf8 encoding
unsupported_unicode_chars = {
0x00A0: ur'~', # NO-BREAK SPACE
- 0x00AD: ur'\-', # SOFT HYPHEN
- 0x2011: ur'\hbox{-}', # NON-BREAKING HYPHEN
+ 0x00AD: ur'\-', # SOFT HYPHEN
+ #
+ 0x2011: ur'\hbox{-}', # NON-BREAKING HYPHEN
0x21d4: ur'$\Leftrightarrow$',
# Docutils footnote symbols:
0x2660: ur'$\spadesuit$',
@@ -1216,10 +1224,87 @@ class LaTeXTranslator(nodes.NodeVisitor):
0x2665: ur'\ding{170}', # black heartsuit
0x2666: ur'\ding{169}', # black diamondsuit
}
- # TODO: replacements using textcomp
- ## textcomp_chars = {
- ## 0x00B5: ur'\textmu{}', # MICRO SIGN
- ## }
+ # recognized with 'utf8', if textcomp is loaded
+ textcomp_chars = {
+ # Latin-1 Supplement
+ 0x00a2: ur'\textcent{}', # ¢ CENT SIGN
+ 0x00a4: ur'\textcurrency{}', # ¤ CURRENCY SYMBOL
+ 0x00a5: ur'\textyen{}', # ¥ YEN SIGN
+ 0x00a6: ur'\textbrokenbar{}', # ¦ BROKEN BAR
+ 0x00a7: ur'\textsection{}', # § SECTION SIGN
+ 0x00a8: ur'\textasciidieresis{}', # ¨ DIAERESIS
+ 0x00a9: ur'\textcopyright{}', # © COPYRIGHT SIGN
+ 0x00aa: ur'\textordfeminine{}', # ª FEMININE ORDINAL INDICATOR
+ 0x00ac: ur'\textlnot{}', # ¬ NOT SIGN
+ 0x00ae: ur'\textregistered{}', # ® REGISTERED SIGN
+ 0x00af: ur'\textasciimacron{}', # ¯ MACRON
+ 0x00b0: ur'\textdegree{}', # ° DEGREE SIGN
+ 0x00b1: ur'\textpm{}', # ± PLUS-MINUS SIGN
+ 0x00b2: ur'\texttwosuperior{}', # ² SUPERSCRIPT TWO
+ 0x00b3: ur'\textthreesuperior{}', # ³ SUPERSCRIPT THREE
+ 0x00b4: ur'\textasciiacute{}', # ´ ACUTE ACCENT
+ 0x00b5: ur'\textmu{}', # µ MICRO SIGN
+ 0x00b6: ur'\textparagraph{}', # ¶ PILCROW SIGN # not equal to \textpilcrow
+ 0x00b9: ur'\textonesuperior{}', # ¹ SUPERSCRIPT ONE
+ 0x00ba: ur'\textordmasculine{}', # º MASCULINE ORDINAL INDICATOR
+ 0x00bc: ur'\textonequarter{}', # 1/4 FRACTION
+ 0x00bd: ur'\textonehalf{}', # 1/2 FRACTION
+ 0x00be: ur'\textthreequarters{}', # 3/4 FRACTION
+ 0x00d7: ur'\texttimes{}', # × MULTIPLICATION SIGN
+ 0x00f7: ur'\textdiv{}', # ÷ DIVISION SIGN
+ #
+ 0x0192: ur'\textflorin{}', # LATIN SMALL LETTER F WITH HOOK
+ 0x02b9: ur'\textasciiacute{}', # MODIFIER LETTER PRIME
+ 0x02ba: ur'\textacutedbl{}', # MODIFIER LETTER DOUBLE PRIME
+ 0x2016: ur'\textbardbl{}', # DOUBLE VERTICAL LINE
+ 0x2022: ur'\textbullet{}', # BULLET
+ 0x2030: ur'\textperthousand{}', # PER MILLE SIGN
+ 0x2031: ur'\textpertenthousand{}', # PER TEN THOUSAND SIGN
+ 0x2032: ur'\textasciiacute{}', # PRIME
+ 0x2033: ur'\textacutedbl{}', # DOUBLE PRIME
+ 0x2035: ur'\textasciigrave{}', # REVERSED PRIME
+ 0x2036: ur'\textgravedbl{}', # REVERSED DOUBLE PRIME
+ 0x203b: ur'\textreferencemark{}', # REFERENCE MARK
+ 0x203d: ur'\textinterrobang{}', # INTERROBANG
+ 0x2044: ur'\textfractionsolidus{}', # FRACTION SLASH
+ 0x2045: ur'\textlquill{}', # LEFT SQUARE BRACKET WITH QUILL
+ 0x2046: ur'\textrquill{}', # RIGHT SQUARE BRACKET WITH QUILL
+ 0x2052: ur'\textdiscount{}', # COMMERCIAL MINUS SIGN
+ 0x20a1: ur'\textcolonmonetary{}', # COLON SIGN
+ 0x20a3: ur'\textfrenchfranc{}', # FRENCH FRANC SIGN
+ 0x20a4: ur'\textlira{}', # LIRA SIGN
+ 0x20a6: ur'\textnaira{}', # NAIRA SIGN
+ 0x20a9: ur'\textwon{}', # WON SIGN
+ 0x20ab: ur'\textdong{}', # DONG SIGN
+ 0x20ac: ur'\texteuro{}', # EURO SIGN
+ 0x20b1: ur'\textpeso{}', # PESO SIGN
+ 0x20b2: ur'\textguarani{}', # GUARANI SIGN
+ 0x2103: ur'\textcelsius{}', # DEGREE CELSIUS
+ 0x2116: ur'\textnumero{}', # NUMERO SIGN
+ 0x2117: ur'\textcircledP{}', # SOUND RECORDING COPYRIGHT
+ 0x211e: ur'\textrecipe{}', # PRESCRIPTION TAKE
+ 0x2120: ur'\textservicemark{}', # SERVICE MARK
+ 0x2122: ur'\texttrademark{}', # TRADE MARK SIGN
+ 0x2126: ur'\textohm{}', # OHM SIGN
+ 0x2127: ur'\textmho{}', # INVERTED OHM SIGN
+ 0x212e: ur'\textestimated{}', # ESTIMATED SYMBOL
+ 0x2190: ur'\textleftarrow{}', # LEFTWARDS ARROW
+ 0x2191: ur'\textuparrow{}', # UPWARDS ARROW
+ 0x2192: ur'\textrightarrow{}', # RIGHTWARDS ARROW
+ 0x2193: ur'\textdownarrow{}', # DOWNWARDS ARROW
+ 0x2212: ur'\textminus{}', # MINUS SIGN
+ 0x2217: ur'\textasteriskcentered{}', # ASTERISK OPERATOR
+ 0x221a: ur'\textsurd{}', # SQUARE ROOT
+ 0x2422: ur'\textblank{}', # BLANK SYMBOL
+ 0x2423: ur'\textvisiblespace{}', # OPEN BOX
+ 0x25e6: ur'\textopenbullet{}', # WHITE BULLET
+ 0x25ef: ur'\textbigcircle{}', # LARGE CIRCLE
+ 0x266a: ur'\textmusicalnote{}', # EIGHTH NOTE
+ 0x26ad: ur'\textmarried{}', # MARRIAGE SYMBOL
+ 0x26ae: ur'\textdivorced{}', # DIVORCE SYMBOL
+ 0x27e8: ur'\textlangle{}', # MATHEMATICAL LEFT ANGLE BRACKET
+ 0x27e9: ur'\textrangle{}', # MATHEMATICAL RIGHT ANGLE BRACKET
+ }
# TODO: greek alphabet ... ?
# see also LaTeX codec
# http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/252124
@@ -1255,12 +1340,16 @@ class LaTeXTranslator(nodes.NodeVisitor):
text = self.babel.quote_quotes(text)
# Unicode chars:
table.update(unsupported_unicode_chars)
+ table.update(pifont_chars)
if not self.latex_encoding.startswith('utf8'):
table.update(unicode_chars)
- # Unicode chars that require a feature/package to render
- if [ch for ch in pifont_chars.keys() if unichr(ch) in text]:
- self.requirements['pifont'] = '\\usepackage{pifont}'
- table.update(pifont_chars)
+ table.update(textcomp_chars)
+ # Characters that require a feature/package to render
+ for ch in text:
+ if ord(ch) in pifont_chars:
+ self.requirements['pifont'] = '\\usepackage{pifont}'
+ if ord(ch) in textcomp_chars:
+ self.requirements['textcomp'] = PreambleCmds.textcomp
text = text.translate(table)
@@ -2280,6 +2369,7 @@ class LaTeXTranslator(nodes.NodeVisitor):
self.depart_inline(node)
def has_unbalanced_braces(self, string):
+ """Test whether there are unmatched '{' or '}' characters."""
level = 0
for ch in string:
if ch == '{':
@@ -2303,7 +2393,7 @@ class LaTeXTranslator(nodes.NodeVisitor):
if href.find('^^') != -1 or self.has_unbalanced_braces(href):
self.error(
'External link "%s" not supported by LaTeX.\n'
- ' (Must not contain "^^" or unbalanced braces.)' % href)
+ ' (Must not contain "^^" or unbalanced braces.)' % href)
if node['refuri'] == node.astext():
self.out.append(r'\url{%s}' % href)
raise nodes.SkipNode
@@ -2587,7 +2677,7 @@ class LaTeXTranslator(nodes.NodeVisitor):
if isinstance(node.parent, nodes.table):
self.pop_output_collector()
- def minitoc(self, title, depth):
+ def minitoc(self, node, title, depth):
"""Generate a local table of contents with LaTeX package minitoc"""
section_name = self.d_class.section(self.section_level)
# name-prefix for current section level
@@ -2598,7 +2688,8 @@ class LaTeXTranslator(nodes.NodeVisitor):
minitoc_name = minitoc_names[section_name]
except KeyError: # minitoc only supports part- and toplevel
self.warn('Skipping local ToC at %s level.\n' % section_name +
- ' Feature not supported with option "use-latex-toc"')
+ ' Feature not supported with option "use-latex-toc"',
+ base_node=node)
return
# Requirements/Setup
self.requirements['minitoc'] = PreambleCmds.minitoc
@@ -2640,7 +2731,7 @@ class LaTeXTranslator(nodes.NodeVisitor):
title = self.encode(node.pop(0).astext())
depth = node.get('depth', 0)
if 'local' in node['classes']:
- self.minitoc(title, depth)
+ self.minitoc(node, title, depth)
self.context.append('')
return
if depth:
diff --git a/docutils/test/functional/expected/latex_docinfo.tex b/docutils/test/functional/expected/latex_docinfo.tex
index 880589b0b..716eb5065 100644
--- a/docutils/test/functional/expected/latex_docinfo.tex
+++ b/docutils/test/functional/expected/latex_docinfo.tex
@@ -3,7 +3,7 @@
\usepackage{fixltx2e} % LaTeX patches, \textsubscript
\usepackage{cmap} % fix search and cut-and-paste in PDF
\usepackage[T1]{fontenc}
-\usepackage[latin1]{inputenc}
+\usepackage[utf8]{inputenc}
\usepackage{ifthen}
\usepackage{babel}
diff --git a/docutils/test/functional/expected/standalone_rst_latex.tex b/docutils/test/functional/expected/standalone_rst_latex.tex
index d91e4eade..50c425628 100644
--- a/docutils/test/functional/expected/standalone_rst_latex.tex
+++ b/docutils/test/functional/expected/standalone_rst_latex.tex
@@ -3,7 +3,7 @@
\usepackage{fixltx2e} % LaTeX patches, \textsubscript
\usepackage{cmap} % fix search and cut-and-paste in PDF
\usepackage[T1]{fontenc}
-\usepackage[latin1]{inputenc}
+\usepackage[utf8]{inputenc}
\usepackage{ifthen}
\usepackage{babel}
\usepackage{color}
@@ -11,11 +11,13 @@
\floatplacement{figure}{H} % place figures here definitely
\usepackage{graphicx}
\usepackage{multirow}
+\usepackage{pifont}
\usepackage{longtable}
\usepackage{array}
\setlength{\extrarowheight}{2pt}
\newlength{\DUtablewidth} % internal use in tables
\usepackage{tabularx}
+\usepackage{textcomp} % text symbol macros
%%% Custom LaTeX preamble
% PDF Standard Fonts
@@ -808,10 +810,10 @@ And this is the third paragraph.
%
\DUfootnotetext{id13}{id4}{*}{%
Footnotes may also use symbols, specified with a ``*'' label.
-Here's a reference to the next footnote:\DUfootnotemark{id14}{id15}{\dag{}}.
+Here's a reference to the next footnote:\DUfootnotemark{id14}{id15}{†}.
}
%
-\DUfootnotetext{id15}{id14}{\dag{}}{%
+\DUfootnotetext{id15}{id14}{†}{%
This footnote shows the next symbol in the sequence.
}
%
@@ -1346,7 +1348,7 @@ Here's one:
%
% Double-dashes -- "--" -- must be escaped somehow in HTML output.
%
-% Comments may contain non-ASCII characters:
+% Comments may contain non-ASCII characters: ä ö ü æ ø å
(View the HTML source to see the comment.)
@@ -1679,108 +1681,110 @@ width as the third line.
%___________________________________________________________________________
-\subsection*{3.4~~~Various non-ASCII characters%
+\subsection*{3.4~~~Non-ASCII characters%
\phantomsection%
- \addcontentsline{toc}{subsection}{3.4~~~Various non-ASCII characters}%
- \label{various-non-ascii-characters}%
+ \addcontentsline{toc}{subsection}{3.4~~~Non-ASCII characters}%
+ \label{non-ascii-characters}%
}
+Punctuation and footnote symbols
+
\leavevmode
\setlength{\DUtablewidth}{\linewidth}
\begin{longtable}[c]{|p{0.028\DUtablewidth}|p{0.424\DUtablewidth}|}
\hline
-
+–
&
-copyright sign
+en-dash
\\
\hline
-
+—
&
-registered sign
+em-dash
\\
\hline
-
+‘
&
-left pointing guillemet
+single turned comma quotation mark
\\
\hline
-
+’
&
-right pointing guillemet
+single comma quotation mark
\\
\hline
-\textendash{}
+‚
&
-en-dash
+low single comma quotation mark
\\
\hline
-\textemdash{}
+“
&
-em-dash
+double turned comma quotation mark
\\
\hline
-`
+”
&
-single turned comma quotation mark
+double comma quotation mark
\\
\hline
-'
+„
&
-single comma quotation mark
+low double comma quotation mark
\\
\hline
-\quotesinglbase{}
+†
&
-low single comma quotation mark
+dagger
\\
\hline
-\textquotedblleft{}
+‡
&
-double turned comma quotation mark
+double dagger
\\
\hline
-\textquotedblright{}
+\ding{169}
&
-double comma quotation mark
+black diamond suit
\\
\hline
-\quotedblbase
+\ding{170}
&
-low double comma quotation mark
+black heart suit
\\
\hline
-\dag{}
+$\spadesuit$
&
-dagger
+black spade suit
\\
\hline
-\ddag{}
+$\clubsuit$
&
-double dagger
+black club suit
\\
\hline
-\dots{}
+…
&
ellipsis
\\
\hline
-\texttrademark{}
+™
&
trade mark sign
\\
@@ -1793,11 +1797,307 @@ left-right double arrow
\hline
\end{longtable}
-The following line should not be wrapped, because it uses
-non-breakable spaces:
+The \DUroletitlereference{Latin-1 extended} Unicode block
+
+\leavevmode
+\setlength{\DUtablewidth}{\linewidth}
+\begin{longtable}[c]{|p{0.051\DUtablewidth}|p{0.028\DUtablewidth}|p{0.028\DUtablewidth}|p{0.028\DUtablewidth}|p{0.028\DUtablewidth}|p{0.028\DUtablewidth}|p{0.028\DUtablewidth}|p{0.028\DUtablewidth}|p{0.028\DUtablewidth}|p{0.028\DUtablewidth}|p{0.028\DUtablewidth}|}
+\hline
+
+%
+ &
+0
+ &
+1
+ &
+2
+ &
+3
+ &
+4
+ &
+5
+ &
+6
+ &
+7
+ &
+8
+ &
+9
+ \\
+\hline
+
+160
+ & &
+ &
+ &
+ & &
+ &
+ &
+ &
+ &
+ \\
+\hline
+
+170
+ &
+ &
+ &
+ &
+\-
+ &
+ &
+ &
+ &
+ &
+ &
+ \\
+\hline
+
+180
+ &
+ &
+ &
+ &
+ &
+ &
+ &
+ &
+ &
+ &
+ \\
+\hline
+
+190
+ &
+ &
+¿
+ &
+ &
+ &
+ &
+ &
+ &
+ &
+ &
+ \\
+\hline
+
+200
+ &
+ &
+ &
+ &
+ &
+ &
+ &
+ &
+ &
+ &
+ \\
+\hline
+
+210
+ &
+ &
+ &
+ &
+ &
+ &
+ &
+ &
+ &
+ &
+ \\
+\hline
+
+220
+ &
+ &
+ &
+ &
+ &
+ &
+ &
+ &
+ &
+ &
+ \\
+\hline
+
+230
+ &
+ &
+ &
+ &
+ &
+ &
+ &
+ &
+ &
+ &
+ \\
+\hline
+
+240
+ &
+ &
+ &
+ &
+ &
+ &
+ &
+ &
+ &
+ &
+ \\
+\hline
+
+250
+ &
+ &
+ &
+ &
+ &
+ &
+ÿ
+ & & & & \\
+\hline
+\end{longtable}
+%
+\begin{itemize}
+
+\item The following line should not be wrapped, because it uses
+no-break spaces (\textbackslash{}u00a0):
X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X
+\item Line wrapping with/without breakpoints marked by soft hyphens
+(\textbackslash{}u00ad):
+
+pdn\-derd\-mdtd\-ri\-schpdn\-derd\-mdtd\-ri\-schpdn\-derd\-mdtd\-ri\-schpdn\-derd\-mdtd\-ri\-schpdn\-derd\-mdtd\-ri\-sch
+
+pdnderdmdtdrischpdnderdmdtdrischpdnderdmdtdrischpdnderdmdtdrischpdnderdmdtdrisch
+
+\item The currency sign (\textbackslash{}u00a4) is not supported by all fonts
+(some have an Euro sign at its place). You might see an error
+like:
+%
+\begin{quote}{\ttfamily \raggedright \noindent
+!~Package~textcomp~Error:~Symbol~\textbackslash{}textcurrency~not~provided~by\\
+(textcomp)~~~~~~~~~~~~~~~~font~family~ptm~in~TS1~encoding.\\
+(textcomp)~~~~~~~~~~~~~~~~Default~family~used~instead.
+}
+\end{quote}
+
+(which in case of font family ptm is a false positive). Add either
+%
+\begin{DUfieldlist}
+\item[{warn:}]
+turn the error in a warning, use the default symbol (bitmap), or
+
+\item[{force,almostfull:}]
+use the symbol provided by the font at the users
+risk,
+
+\end{DUfieldlist}
+
+to the document options or use a different font package.
+
+\end{itemize}
+
%___________________________________________________________________________
@@ -1816,7 +2116,7 @@ The following characters play a special role in LaTeX and are called
%
\begin{quote}
-\# \$ \% \& \textasciitilde{} \_ \textasciicircum{} \{ \}
+\# \$ \% \& \textasciitilde{} \_ \textasciicircum{} \textbackslash{} \{ \}
\end{quote}
diff --git a/docutils/test/functional/input/data/latex_encoding.txt b/docutils/test/functional/input/data/latex_encoding.txt
index 6d5cc0b9e..1405a5ca7 100644
--- a/docutils/test/functional/input/data/latex_encoding.txt
+++ b/docutils/test/functional/input/data/latex_encoding.txt
@@ -6,7 +6,7 @@ The LaTeX Info pages lists under "2.18 Special Characters"
The following characters play a special role in LaTeX and are called
"special printing characters", or simply "special characters".
- # $ % & ~ _ ^ \ { }
+ # $ % & ~ _ ^ \\ { }
The special chars verbatim::
diff --git a/docutils/test/functional/input/data/unicode.txt b/docutils/test/functional/input/data/unicode.txt
index 4bdd57653..bed6a8c08 100644
--- a/docutils/test/functional/input/data/unicode.txt
+++ b/docutils/test/functional/input/data/unicode.txt
@@ -1,27 +1,70 @@
-Various non-ASCII characters
-----------------------------
+Non-ASCII characters
+--------------------
+
+Punctuation and footnote symbols
= ===================================
-© copyright sign
-® registered sign
-« left pointing guillemet
-» right pointing guillemet
– en-dash
— em-dash
-‘ single turned comma quotation mark
-’ single comma quotation mark
-‚ low single comma quotation mark
-“ double turned comma quotation mark
-” double comma quotation mark
-„ low double comma quotation mark
+‘ single turned comma quotation mark
+’ single comma quotation mark
+‚ low single comma quotation mark
+“ double turned comma quotation mark
+” double comma quotation mark
+„ low double comma quotation mark
† dagger
‡ double dagger
+♦ black diamond suit
+♥ black heart suit
+♠ black spade suit
+♣ black club suit
… ellipsis
™ trade mark sign
⇔ left-right double arrow
= ===================================
-The following line should not be wrapped, because it uses
-non-breakable spaces:
-X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X
+The `Latin-1 extended` Unicode block
+
+=== = = = = = = = = = =
+ .. 0 1 2 3 4 5 6 7 8 9
+--- - - - - - - - - - -
+160   ¡ ¢ £ ¥ ¦ § ¨ ©
+170 ª « ¬ ­ ® ¯ ° ± ² ³
+180 ´ µ ¶ · ¸ ¹ º » ¼ ½
+190 ¾ ¿ À Á Â Ã Ä Å Æ Ç
+200 È É Ê Ë Ì Í Î Ï Ð Ñ
+210 Ò Ó Ô Õ Ö × Ø Ù Ú Û
+220 Ü Ý Þ ß à á â ã ä å
+230 æ ç è é ê ë ì í î ï
+240 ð ñ ò ó ô õ ö ÷ ø ù
+250 ú û ü ý þ ÿ
+=== = = = = = = = = = =
+
+* The following line should not be wrapped, because it uses
+ no-break spaces (\\u00a0):
+
+ X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X
+
+* Line wrapping with/without breakpoints marked by soft hyphens
+ (\\u00ad):
+
+ pdn­derd­mdtd­ri­schpdn­derd­mdtd­ri­schpdn­derd­mdtd­ri­schpdn­derd­mdtd­ri­schpdn­derd­mdtd­ri­sch
+
+ pdnderdmdtdrischpdnderdmdtdrischpdnderdmdtdrischpdnderdmdtdrischpdnderdmdtdrisch
+
+* The currency sign (\\u00a4) is not supported by all fonts
+ (some have an Euro sign at its place). You might see an error
+ like::
+
+ ! Package textcomp Error: Symbol \textcurrency not provided by
+ (textcomp) font family ptm in TS1 encoding.
+ (textcomp) Default family used instead.
+
+ (which in case of font family ptm is a false positive). Add either
+
+ :warn: turn the error in a warning, use the default symbol (bitmap), or
+ :force,almostfull: use the symbol provided by the font at the users
+ risk,
+
+ to the document options or use a different font package.
diff --git a/docutils/test/test_writers/test_latex2e.py b/docutils/test/test_writers/test_latex2e.py
index 60e939194..cc2302bd3 100755
--- a/docutils/test/test_writers/test_latex2e.py
+++ b/docutils/test/test_writers/test_latex2e.py
@@ -1,3 +1,4 @@
#! /usr/bin/env python
+# -*- coding: utf8 -*-
# $Id$
@@ -50,7 +51,7 @@ parts = dict(
head_prefix = r"""\documentclass[a4paper,english]{article}
""",
requirements = r"""\usepackage[T1]{fontenc}
-\usepackage[latin1]{inputenc}
+\usepackage[utf8]{inputenc}
\usepackage{ifthen}
\usepackage{babel}
""",
@@ -79,6 +80,10 @@ r"""\usepackage{longtable}
\newlength{\DUtablewidth} % internal use in tables
"""))
+head_textcomp = head_template.substitute(
+ dict(parts, requirements = parts['requirements'] +
+r"""\usepackage{textcomp} % text symbol macros
+"""))
totest = {}
totest_latex_toc = {}
@@ -96,6 +101,15 @@ head + r"""
"""],
]
+totest['textcomp'] = [
+["2 µm is just 2/1000000 m",
+head_textcomp + r"""
+2 µm is just 2/1000000 m
+
+\end{document}
+"""],
+]
+
totest['table_of_contents'] = [
# input
["""\