| -rw-r--r-- | docutils/HISTORY.txt | 3 |
| -rw-r--r-- | docutils/RELEASE-NOTES.txt | 5 |
| -rw-r--r-- | docutils/docs/user/latex.txt | 73 |
| -rw-r--r-- | docutils/docutils/writers/latex2e/__init__.py | 163 |
| -rw-r--r-- | docutils/test/functional/expected/latex_docinfo.tex | 2 |
| -rw-r--r-- | docutils/test/functional/expected/standalone_rst_latex.tex | 380 |
| -rw-r--r-- | docutils/test/functional/input/data/latex_encoding.txt | 2 |
| -rw-r--r-- | docutils/test/functional/input/data/unicode.txt | 73 |
| -rwxr-xr-x | docutils/test/test_writers/test_latex2e.py | 16 |
9 files changed, 602 insertions, 115 deletions
diff --git a/docutils/HISTORY.txt b/docutils/HISTORY.txt index e484fb233..30b05feba 100644 --- a/docutils/HISTORY.txt +++ b/docutils/HISTORY.txt @@ -55,9 +55,12 @@ Changes Since 0.6 - Fix hyperlink targets (labels) for images, figures, and tables. - Apply [ 2961988 ] Load babel after inputenc and fontenc. - Apply [ 2961991 ] Call hyperref with unicode option. + - Drop the special `output_encoding`_ default ("latin-1"). + The Docutils wide default (usually "UTF-8") is used instead. __ docs/ref/restructuredtext.html#inline-literals __ docs/user/config.html#docutils-footnotes +__ docs/user/config.html#output_encoding * docutils/writers/manpage.py diff --git a/docutils/RELEASE-NOTES.txt b/docutils/RELEASE-NOTES.txt index 4df71db05..5b79b4fb8 100644 --- a/docutils/RELEASE-NOTES.txt +++ b/docutils/RELEASE-NOTES.txt @@ -31,12 +31,15 @@ Components: - Deprecate ``figure_footnotes`` setting. - Rename ``use_latex_footnotes`` setting to `docutils_footnotes`__. - New ``latex_preamble`` setting. - - PDF standard fonts (Times/Helvetica/Courier) as default. + - Use PDF standard fonts (Times/Helvetica/Courier) as default. - `hyperref` package called with ``unicode`` option (see the `hyperref config tips`__ for how to override). + - Drop the special `output_encoding`__ default ("latin-1"). + The Docutils wide default (usually "UTF-8") is used instead. 
__ docs/user/config.html#docutils-footnotes __ docs/user/latex.html#hyperlinks +__ docs/user/latex.html#output-encoding General: diff --git a/docutils/docs/user/latex.txt b/docutils/docs/user/latex.txt index 6c2ee0971..8a47dd6c7 100644 --- a/docutils/docs/user/latex.txt +++ b/docutils/docs/user/latex.txt @@ -943,7 +943,7 @@ Example 1: This will improve the look on screen with the default Computer Modern fonts at the expense of problems with `search and text extraction`_ - The recommended workaround is to select a T1-encoded "Type 1" (vector) + The recommended way is to select a T1-encoded "Type 1" (vector) font, for example `Latin Modern`_ Example 2: @@ -1580,30 +1580,33 @@ Example: text encoding ------------- -The encoding of the LaTeX source file, i.e. -Docutils' *output* encoding becomes the LaTeX *input* encoding. +The encoding of the LaTeX source file is Docutils' *output* encoding +but LaTeX' *input* encoding. Option: output-encoding_ ``--output-encoding=OUTPUT-ENCODING`` Default: - latin1 + "utf8" - .. TODO: use the Docutils-wide default: output-encoding = input-encoding +Example: + Encode the LaTeX source file with the ISO `latin-1` (west european) + 8-bit encoding (the default in Docutils versions up to 0.6.):: + --output-encoding=latin-1 -LaTeX comes with two packages for UTF-8 support, +Note: + LaTeX comes with two packages for UTF-8 support, -:utf8: by the standard `inputenc`_ package with only limited coverage - (mainly accented chars, only few non-alphabetic symbols, no Greek or - Cyrillic). + :utf8: by the standard `inputenc`_ package with only limited coverage + (mainly accented chars, no Greekv). -:utf8x: supported by the `ucs`_ package covers a wider range of Unicode - characters than does "utf8". It is, however, a non-standard - extension and no longer developed. + :utf8x: supported by the `ucs`_ package covers a wider range of Unicode + characters than does "utf8". It is, however, a non-standard + extension and no longer developed. 
-Currently (in version 0.6), "utf8" is used if the output-encoding is -any of "utf_8", "U8", "UTF", or "utf8". + Currently (in version 0.6), "utf8" is used if the output-encoding is + any of "utf_8", "U8", "UTF", or "utf8". .. with utf8x: If LaTeX issues a Warning about unloaded/unknown characters adding :: @@ -1795,6 +1798,23 @@ If updating LaTeX is not an option, just remove the "px" from the length specification. HTML/CSS will default to "px" while the `latexe2` writer will add the fallback unit "bp". +Error ``Symbol \textcurrency not provided`` ... +``````````````````````````````````````````````` + +The currency sign (\\u00a4) is not supported by all fonts (some have +an Euro sign at its place). You might see an error like:: + + ! Package textcomp Error: Symbol \textcurrency not provided by + (textcomp) font family ptm in TS1 encoding. + (textcomp) Default family used instead. + +(which in case of font family "ptm" is a false positive). Add either + +:warn: turn the error in a warning, use the default symbol (bitmap), or +:force,almostfull: use the symbol provided by the font at the users + risk, + +to the document options or use a different font package. Search and text extraction `````````````````````````` @@ -1806,21 +1826,34 @@ umlauts) might fail. See font_ and `font encoding`_ (as well as .. _Searching PDF files: http://www.tex.ac.uk/cgi-bin/texfaq2html?label=srchpdf -Unicode box drawing characters -``````````````````````````````` +Unicode box drawing and block characters +```````````````````````````````````````` + +- Generate LaTeX code with `output-encoding`_ "utf-8". + +- Add the pmboxdraw_ package to the `style sheets`_. + (For shaded boxes also add the `color` package.) - - generate LaTeX code with ``--output-encoding=utf-8:strict``. +Unfortunately, this defines only a subset of the characters +(see pmboxdraw.pdf_ for a list). 
- - In the latex file, edit the preamble to load "ucs" with "postscript" - option and also load the pstricks package:: +Alternatively: + +- In the latex file, edit the preamble to load ucs_ with "postscript" + option and also load the pstricks package:: - \usepackage[utf8]{inputenc} + \usepackage[postscript]{ucs} + \usepackage{pstricks} + \usepackage[utf8x]{inputenc} - - Convert to PDF with ``latex``, ``dvips``, and ``ps2pdf``. +- Convert to PDF with ``latex``, ``dvips``, and ``ps2pdf``. + +.. _pmboxdraw: + http://www.ctan.org/tex-archive/help/Catalogue/entries/pmboxdraw.html +.. _pmboxdraw.pdf: + http://www.ctan.org/tex-archive/macros/latex/contrib/oberdiek/pmboxdraw.pdf Bugs and open issues -------------------- diff --git a/docutils/docutils/writers/latex2e/__init__.py b/docutils/docutils/writers/latex2e/__init__.py index e6a0e233a..ab9569715 100644 --- a/docutils/docutils/writers/latex2e/__init__.py +++ b/docutils/docutils/writers/latex2e/__init__.py @@ -17,8 +17,8 @@ import os import time import re import string -from docutils import frontend, nodes, languages, writers, utils, transforms, io -from docutils.writers.newlatex2e import unicode_map +from docutils import frontend, nodes, languages, writers, utils, io +from docutils.transforms import writer_aux # compatibility module for Python <= 2.4 if not hasattr(string, 'Template'): @@ -39,7 +39,7 @@ class Writer(writers.Writer): r'\usepackage{courier}']) settings_spec = ( 'LaTeX-Specific Options', - 'The LaTeX "--output-encoding" default is "latin-1:strict".', + None, (('Specify documentclass. 
Default is "article".', ['--documentclass'], {'default': 'article', }), @@ -198,10 +198,8 @@ class Writer(writers.Writer): {'default': None, }), ),) - settings_defaults = {'output_encoding': 'latin-1', - 'sectnum_depth': 0 # updated by SectNum transform + settings_defaults = {'sectnum_depth': 0 # updated by SectNum transform } - relative_path_settings = ('stylesheet_path',) config_section = 'latex2e writer' @@ -225,7 +223,7 @@ class Writer(writers.Writer): transform_list = writers.Writer.get_transforms(self) # print transform_list # Convert specific admonitions to generic one - transform_list.append(transforms.writer_aux.Admonitions) + transform_list.append(writer_aux.Admonitions) # TODO: footnote collection transform # transform_list.append(footnotes.collect) return transform_list @@ -355,7 +353,7 @@ class SortableDict(dict): """Dictionary with additional sorting methods Tip: use key starting with with '_' for sorting before small letters - and with '~' for sorting after small letters. + and with '~' for sorting after small letters. 
""" def sortedkeys(self): """Return sorted list of keys""" @@ -559,6 +557,11 @@ PreambleCmds.table = r"""\usepackage{longtable} \setlength{\extrarowheight}{2pt} \newlength{\DUtablewidth} % internal use in tables""" +# Options [force,almostfull] prevent spurious error messages, see +# de.comp.text.tex/2005-12/msg01855 +PreambleCmds.textcomp = """\ +\\usepackage{textcomp} % text symbol macros""" + PreambleCmds.documenttitle = r""" %% Document title \title{%s} @@ -635,9 +638,10 @@ class Table(object): Table style might be - :standard: horizontal and vertical lines - :booktabs: only horizontal lines (requires "booktabs" LaTeX package) - :nolines: (or borderless) no lines + :standard: horizontal and vertical lines + :booktabs: only horizontal lines (requires "booktabs" LaTeX package) + :borderless: no borders around table cells + :nolines: alias for borderless """ def __init__(self,translator,latex_type,table_style): self._translator = translator @@ -936,7 +940,7 @@ class LaTeXTranslator(nodes.NodeVisitor): self.docutils_footnotes = True self.warn('`use_latex_footnotes` is deprecated. 
' 'The setting has been renamed to `docutils_footnotes` ' - 'and the alias will be removed in a future version.') + 'and the alias will be removed in a future version.') self.figure_footnotes = settings.figure_footnotes if self.figure_footnotes: self.docutils_footnotes = True @@ -1008,19 +1012,23 @@ class LaTeXTranslator(nodes.NodeVisitor): # Process settings # ~~~~~~~~~~~~~~~~ - # persistent requirements - if self.font_encoding == '': - fontenc_header = r'%\usepackage[OT1]{fontenc}' + # Static requirements + # TeX font encoding + if self.font_encoding: + encodings = [r'\usepackage[%s]{fontenc}' % self.font_encoding] else: - fontenc_header = r'\usepackage[%s]{fontenc}' % self.font_encoding - self.requirements['_persistent'] = '\n'.join([ - fontenc_header, - r'\usepackage[%s]{inputenc}' % self.latex_encoding, + encodings = [r'%\usepackage[OT1]{fontenc}'] # just a comment + # Docutils' output-encoding => TeX input encoding: + if self.latex_encoding != 'ascii': + encodings.append(r'\usepackage[%s]{inputenc}' + % self.latex_encoding) + self.requirements['_static'] = '\n'.join( + encodings + [ r'\usepackage{ifthen}', - # multi-language support (language is in document settings) + # multi-language support (language is in document options) '\\usepackage{babel}%s' % self.babel.setup, ]) - # page layout with typearea (if there are relevant document options). + # page layout with typearea (if there are relevant document options) if (settings.documentclass.find('scr') == -1 and (self.d_options.find('DIV') != -1 or self.d_options.find('BCOR') != -1)): @@ -1096,7 +1104,6 @@ class LaTeXTranslator(nodes.NodeVisitor): """Translate docutils encoding name into LaTeX's. Default method is remove "-" and "_" chars from docutils_encoding. 
- """ tr = { 'iso-8859-1': 'latin1', # west european 'iso-8859-2': 'latin2', # east european @@ -1189,8 +1196,9 @@ class LaTeXTranslator(nodes.NodeVisitor): # Unicode chars that are not recognized by LaTeX's utf8 encoding unsupported_unicode_chars = { 0x00A0: ur'~', # NO-BREAK SPACE - 0x00AD: ur'\-', # SOFT HYPHEN - 0x2011: ur'\hbox{-}', # NON-BREAKING HYPHEN + 0x00AD: ur'\-', # SOFT HYPHEN + # + 0x2011: ur'\hbox{-}', # NON-BREAKING HYPHEN 0x21d4: ur'$\Leftrightarrow$', # Docutils footnote symbols: 0x2660: ur'$\spadesuit$', @@ -1216,10 +1224,87 @@ class LaTeXTranslator(nodes.NodeVisitor): 0x2665: ur'\ding{170}', # black heartsuit 0x2666: ur'\ding{169}', # black diamondsuit } - # TODO: replacements using textcomp - ## textcomp_chars = { - ## 0x00B5: ur'\textmu{}', # MICRO SIGN - ## } + # recognized with 'utf8', if textcomp is loaded + textcomp_chars = { + # Latin-1 Supplement + 0x00a2: ur'\textcent{}', # ¢ CENT SIGN + 0x00a4: ur'\textcurrency{}', # ¤ CURRENCY SYMBOL + 0x00a5: ur'\textyen{}', # ¥ YEN SIGN + 0x00a6: ur'\textbrokenbar{}', # ¦ BROKEN BAR + 0x00a7: ur'\textsection{}', # § SECTION SIGN + 0x00a8: ur'\textasciidieresis{}', # ¨ DIAERESIS + 0x00a9: ur'\textcopyright{}', # © COPYRIGHT SIGN + 0x00aa: ur'\textordfeminine{}', # ª FEMININE ORDINAL INDICATOR + 0x00ac: ur'\textlnot{}', # ¬ NOT SIGN + 0x00ae: ur'\textregistered{}', # ® REGISTERED SIGN + 0x00af: ur'\textasciimacron{}', # ¯ MACRON + 0x00b0: ur'\textdegree{}', # ° DEGREE SIGN + 0x00b1: ur'\textpm{}', # ± PLUS-MINUS SIGN + 0x00b2: ur'\texttwosuperior{}', # ² SUPERSCRIPT TWO + 0x00b3: ur'\textthreesuperior{}', # ³ SUPERSCRIPT THREE + 0x00b4: ur'\textasciiacute{}', # ´ ACUTE ACCENT + 0x00b5: ur'\textmu{}', # µ MICRO SIGN + 0x00b6: ur'\textparagraph{}', # ¶ PILCROW SIGN # not equal to \textpilcrow + 0x00b9: ur'\textonesuperior{}', # ¹ SUPERSCRIPT ONE + 0x00ba: ur'\textordmasculine{}', # º MASCULINE ORDINAL INDICATOR + 0x00bc: ur'\textonequarter{}', # 1/4 FRACTION + 0x00bd: ur'\textonehalf{}', # 1/2 FRACTION 
+ 0x00be: ur'\textthreequarters{}', # 3/4 FRACTION + 0x00d7: ur'\texttimes{}', # × MULTIPLICATION SIGN + 0x00f7: ur'\textdiv{}', # ÷ DIVISION SIGN + # + 0x0192: ur'\textflorin{}', # LATIN SMALL LETTER F WITH HOOK + 0x02b9: ur'\textasciiacute{}', # MODIFIER LETTER PRIME + 0x02ba: ur'\textacutedbl{}', # MODIFIER LETTER DOUBLE PRIME + 0x2016: ur'\textbardbl{}', # DOUBLE VERTICAL LINE + 0x2022: ur'\textbullet{}', # BULLET + 0x2030: ur'\textperthousand{}', # PER MILLE SIGN + 0x2031: ur'\textpertenthousand{}', # PER TEN THOUSAND SIGN + 0x2032: ur'\textasciiacute{}', # PRIME + 0x2033: ur'\textacutedbl{}', # DOUBLE PRIME + 0x2035: ur'\textasciigrave{}', # REVERSED PRIME + 0x2036: ur'\textgravedbl{}', # REVERSED DOUBLE PRIME + 0x203b: ur'\textreferencemark{}', # REFERENCE MARK + 0x203d: ur'\textinterrobang{}', # INTERROBANG + 0x2044: ur'\textfractionsolidus{}', # FRACTION SLASH + 0x2045: ur'\textlquill{}', # LEFT SQUARE BRACKET WITH QUILL + 0x2046: ur'\textrquill{}', # RIGHT SQUARE BRACKET WITH QUILL + 0x2052: ur'\textdiscount{}', # COMMERCIAL MINUS SIGN + 0x20a1: ur'\textcolonmonetary{}', # COLON SIGN + 0x20a3: ur'\textfrenchfranc{}', # FRENCH FRANC SIGN + 0x20a4: ur'\textlira{}', # LIRA SIGN + 0x20a6: ur'\textnaira{}', # NAIRA SIGN + 0x20a9: ur'\textwon{}', # WON SIGN + 0x20ab: ur'\textdong{}', # DONG SIGN + 0x20ac: ur'\texteuro{}', # EURO SIGN + 0x20b1: ur'\textpeso{}', # PESO SIGN + 0x20b2: ur'\textguarani{}', # GUARANI SIGN + 0x2103: ur'\textcelsius{}', # DEGREE CELSIUS + 0x2116: ur'\textnumero{}', # NUMERO SIGN + 0x2117: ur'\textcircledP{}', # SOUND RECORDING COYRIGHT + 0x211e: ur'\textrecipe{}', # PRESCRIPTION TAKE + 0x2120: ur'\textservicemark{}', # SERVICE MARK + 0x2122: ur'\texttrademark{}', # TRADE MARK SIGN + 0x2126: ur'\textohm{}', # OHM SIGN + 0x2127: ur'\textmho{}', # INVERTED OHM SIGN + 0x212e: ur'\textestimated{}', # ESTIMATED SYMBOL + 0x2190: ur'\textleftarrow{}', # LEFTWARDS ARROW + 0x2191: ur'\textuparrow{}', # UPWARDS ARROW + 0x2192: 
ur'\textrightarrow{}', # RIGHTWARDS ARROW + 0x2193: ur'\textdownarrow{}', # DOWNWARDS ARROW + 0x2212: ur'\textminus{}', # MINUS SIGN + 0x2217: ur'\textasteriskcentered{}', # ASTERISK OPERATOR + 0x221a: ur'\textsurd{}', # SQUARE ROOT + 0x2422: ur'\textblank{}', # BLANK SYMBOL + 0x2423: ur'\textvisiblespace{}', # OPEN BOX + 0x25e6: ur'\textopenbullet{}', # WHITE BULLET + 0x25ef: ur'\textbigcircle{}', # LARGE CIRCLE + 0x266a: ur'\textmusicalnote{}', # EIGHTH NOTE + 0x26ad: ur'\textmarried{}', # MARRIAGE SYMBOL + 0x26ae: ur'\textdivorced{}', # DIVORCE SYMBOL + 0x27e8: ur'\textlangle{}', # MATHEMATICAL LEFT ANGLE BRACKET + 0x27e9: ur'\textrangle{}', # MATHEMATICAL RIGHT ANGLE BRACKET + } # TODO: greek alphabet ... ? # see also LaTeX codec # http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/252124 @@ -1255,12 +1340,16 @@ class LaTeXTranslator(nodes.NodeVisitor): text = self.babel.quote_quotes(text) # Unicode chars: table.update(unsupported_unicode_chars) + table.update(pifont_chars) if not self.latex_encoding.startswith('utf8'): table.update(unicode_chars) - # Unicode chars that require a feature/package to render - if [ch for ch in pifont_chars.keys() if unichr(ch) in text]: - self.requirements['pifont'] = '\\usepackage{pifont}' - table.update(pifont_chars) + table.update(textcomp_chars) + # Characters that require a feature/package to render + for ch in text: + if ord(ch) in pifont_chars: + self.requirements['pifont'] = '\\usepackage{pifont}' + if ord(ch) in textcomp_chars: + self.requirements['textcomp'] = PreambleCmds.textcomp text = text.translate(table) @@ -2280,6 +2369,7 @@ class LaTeXTranslator(nodes.NodeVisitor): self.depart_inline(node) def has_unbalanced_braces(self, string): + """Test whether there are unmatched '{' or '}' characters.""" level = 0 for ch in string: if ch == '{': @@ -2303,7 +2393,7 @@ class LaTeXTranslator(nodes.NodeVisitor): if href.find('^^') != -1 or self.has_unbalanced_braces(href): self.error( 'External link "%s" not supported by 
LaTeX.\n' - ' (Must not contain "^^" or unbalanced braces.)' % href) + ' (Must not contain "^^" or unbalanced braces.)' % href) if node['refuri'] == node.astext(): self.out.append(r'\url{%s}' % href) raise nodes.SkipNode @@ -2587,7 +2677,7 @@ class LaTeXTranslator(nodes.NodeVisitor): if isinstance(node.parent, nodes.table): self.pop_output_collector() - def minitoc(self, title, depth): + def minitoc(self, node, title, depth): """Generate a local table of contents with LaTeX package minitoc""" section_name = self.d_class.section(self.section_level) # name-prefix for current section level @@ -2598,7 +2688,8 @@ class LaTeXTranslator(nodes.NodeVisitor): minitoc_name = minitoc_names[section_name] except KeyError: # minitoc only supports part- and toplevel self.warn('Skipping local ToC at %s level.\n' % section_name + - ' Feature not supported with option "use-latex-toc"') + ' Feature not supported with option "use-latex-toc"', + base_node=node) return # Requirements/Setup self.requirements['minitoc'] = PreambleCmds.minitoc @@ -2640,7 +2731,7 @@ class LaTeXTranslator(nodes.NodeVisitor): title = self.encode(node.pop(0).astext()) depth = node.get('depth', 0) if 'local' in node['classes']: - self.minitoc(title, depth) + self.minitoc(title, node, depth) self.context.append('') return if depth: diff --git a/docutils/test/functional/expected/latex_docinfo.tex b/docutils/test/functional/expected/latex_docinfo.tex index 880589b0b..716eb5065 100644 --- a/docutils/test/functional/expected/latex_docinfo.tex +++ b/docutils/test/functional/expected/latex_docinfo.tex @@ -3,7 +3,7 @@ \usepackage{fixltx2e} % LaTeX patches, \textsubscript \usepackage{cmap} % fix search and cut-and-paste in PDF \usepackage[T1]{fontenc} -\usepackage[latin1]{inputenc} +\usepackage[utf8]{inputenc} \usepackage{ifthen} \usepackage{babel} diff --git a/docutils/test/functional/expected/standalone_rst_latex.tex b/docutils/test/functional/expected/standalone_rst_latex.tex index d91e4eade..50c425628 100644 --- 
a/docutils/test/functional/expected/standalone_rst_latex.tex +++ b/docutils/test/functional/expected/standalone_rst_latex.tex @@ -3,7 +3,7 @@ \usepackage{fixltx2e} % LaTeX patches, \textsubscript \usepackage{cmap} % fix search and cut-and-paste in PDF \usepackage[T1]{fontenc} -\usepackage[latin1]{inputenc} +\usepackage[utf8]{inputenc} \usepackage{ifthen} \usepackage{babel} \usepackage{color} @@ -11,11 +11,13 @@ \floatplacement{figure}{H} % place figures here definitely \usepackage{graphicx} \usepackage{multirow} +\usepackage{pifont} \usepackage{longtable} \usepackage{array} \setlength{\extrarowheight}{2pt} \newlength{\DUtablewidth} % internal use in tables \usepackage{tabularx} +\usepackage{textcomp} % text symbol macros %%% Custom LaTeX preamble % PDF Standard Fonts @@ -808,10 +810,10 @@ And this is the third paragraph. % \DUfootnotetext{id13}{id4}{*}{% Footnotes may also use symbols, specified with a ``*'' label. -Here's a reference to the next footnote:\DUfootnotemark{id14}{id15}{\dag{}}. +Here's a reference to the next footnote:\DUfootnotemark{id14}{id15}{†}. } % -\DUfootnotetext{id15}{id14}{\dag{}}{% +\DUfootnotetext{id15}{id14}{†}{% This footnote shows the next symbol in the sequence. } % @@ -1346,7 +1348,7 @@ Here's one: % % Double-dashes -- "--" -- must be escaped somehow in HTML output. % -% Comments may contain non-ASCII characters: +% Comments may contain non-ASCII characters: ä ö ü æ ø å (View the HTML source to see the comment.) @@ -1679,108 +1681,110 @@ width as the third line. 
%___________________________________________________________________________ -\subsection*{3.4~~~Various non-ASCII characters% +\subsection*{3.4~~~Non-ASCII characters% \phantomsection% - \addcontentsline{toc}{subsection}{3.4~~~Various non-ASCII characters}% - \label{various-non-ascii-characters}% + \addcontentsline{toc}{subsection}{3.4~~~Non-ASCII characters}% + \label{non-ascii-characters}% } +Punctuation and footnote symbols + \leavevmode \setlength{\DUtablewidth}{\linewidth} \begin{longtable}[c]{|p{0.028\DUtablewidth}|p{0.424\DUtablewidth}|} \hline - +– & -copyright sign +en-dash \\ \hline - +— & -registered sign +em-dash \\ \hline - +‘ & -left pointing guillemet +single turned comma quotation mark \\ \hline - +’ & -right pointing guillemet +single comma quotation mark \\ \hline -\textendash{} +‚ & -en-dash +low single comma quotation mark \\ \hline -\textemdash{} +“ & -em-dash +double turned comma quotation mark \\ \hline -` +” & -single turned comma quotation mark +double comma quotation mark \\ \hline -' +„ & -single comma quotation mark +low double comma quotation mark \\ \hline -\quotesinglbase{} +† & -low single comma quotation mark +dagger \\ \hline -\textquotedblleft{} +‡ & -double turned comma quotation mark +double dagger \\ \hline -\textquotedblright{} +\ding{169} & -double comma quotation mark +black diamond suit \\ \hline -\quotedblbase +\ding{170} & -low double comma quotation mark +black heart suit \\ \hline -\dag{} +$\spadesuit$ & -dagger +black spade suit \\ \hline -\ddag{} +$\clubsuit$ & -double dagger +black club suit \\ \hline -\dots{} +… & ellipsis \\ \hline -\texttrademark{} +™ & trade mark sign \\ @@ -1793,11 +1797,307 @@ left-right double arrow \hline \end{longtable} -The following line should not be wrapped, because it uses -non-breakable spaces: +The \DUroletitlereference{Latin-1 extended} Unicode block + +\leavevmode +\setlength{\DUtablewidth}{\linewidth} 
+\begin{longtable}[c]{|p{0.051\DUtablewidth}|p{0.028\DUtablewidth}|p{0.028\DUtablewidth}|p{0.028\DUtablewidth}|p{0.028\DUtablewidth}|p{0.028\DUtablewidth}|p{0.028\DUtablewidth}|p{0.028\DUtablewidth}|p{0.028\DUtablewidth}|p{0.028\DUtablewidth}|p{0.028\DUtablewidth}|} +\hline + +% + & +0 + & +1 + & +2 + & +3 + & +4 + & +5 + & +6 + & +7 + & +8 + & +9 + \\ +\hline + +160 + & & +¡ + & +¢ + & +£ + & & +¥ + & +¦ + & +§ + & +¨ + & +© + \\ +\hline + +170 + & +ª + & +« + & +¬ + & +\- + & +® + & +¯ + & +° + & +± + & +² + & +³ + \\ +\hline + +180 + & +´ + & +µ + & +¶ + & +· + & +¸ + & +¹ + & +º + & +» + & +¼ + & +½ + \\ +\hline + +190 + & +¾ + & +¿ + & +À + & +Á + & +Â + & +Ã + & +Ä + & +Å + & +Æ + & +Ç + \\ +\hline + +200 + & +È + & +É + & +Ê + & +Ë + & +Ì + & +Í + & +Î + & +Ï + & +Ð + & +Ñ + \\ +\hline + +210 + & +Ò + & +Ó + & +Ô + & +Õ + & +Ö + & +× + & +Ø + & +Ù + & +Ú + & +Û + \\ +\hline + +220 + & +Ü + & +Ý + & +Þ + & +ß + & +à + & +á + & +â + & +ã + & +ä + & +å + \\ +\hline + +230 + & +æ + & +ç + & +è + & +é + & +ê + & +ë + & +ì + & +í + & +î + & +ï + \\ +\hline + +240 + & +ð + & +ñ + & +ò + & +ó + & +ô + & +õ + & +ö + & +÷ + & +ø + & +ù + \\ +\hline + +250 + & +ú + & +û + & +ü + & +ý + & +þ + & +ÿ + & & & & \\ +\hline +\end{longtable} +% +\begin{itemize} + +\item The following line should not be wrapped, because it uses +no-break spaces (\textbackslash{}u00a0): X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X~X +\item Line wrapping with/without breakpoints marked by soft hyphens +(\textbackslash{}u00ad): + +pdn\-derd\-mdtd\-ri\-schpdn\-derd\-mdtd\-ri\-schpdn\-derd\-mdtd\-ri\-schpdn\-derd\-mdtd\-ri\-schpdn\-derd\-mdtd\-ri\-sch + +pdnderdmdtdrischpdnderdmdtdrischpdnderdmdtdrischpdnderdmdtdrischpdnderdmdtdrisch + +\item The currency sign (\textbackslash{}u00a4) is not supported by all fonts +(some have an Euro sign at its place). 
You might see an error +like: +% +\begin{quote}{\ttfamily \raggedright \noindent +!~Package~textcomp~Error:~Symbol~\textbackslash{}textcurrency~not~provided~by\\ +(textcomp)~~~~~~~~~~~~~~~~font~family~ptm~in~TS1~encoding.\\ +(textcomp)~~~~~~~~~~~~~~~~Default~family~used~instead. +} +\end{quote} + +(which in case of font family ptm is a false positive). Add either +% +\begin{DUfieldlist} +\item[{warn:}] +turn the error in a warning, use the default symbol (bitmap), or + +\item[{force,almostfull:}] +use the symbol provided by the font at the users +risk, + +\end{DUfieldlist} + +to the document options or use a different font package. + +\end{itemize} + %___________________________________________________________________________ @@ -1816,7 +2116,7 @@ The following characters play a special role in LaTeX and are called % \begin{quote} -\# \$ \% \& \textasciitilde{} \_ \textasciicircum{} \{ \} +\# \$ \% \& \textasciitilde{} \_ \textasciicircum{} \textbackslash{} \{ \} \end{quote} diff --git a/docutils/test/functional/input/data/latex_encoding.txt b/docutils/test/functional/input/data/latex_encoding.txt index 6d5cc0b9e..1405a5ca7 100644 --- a/docutils/test/functional/input/data/latex_encoding.txt +++ b/docutils/test/functional/input/data/latex_encoding.txt @@ -6,7 +6,7 @@ The LaTeX Info pages lists under "2.18 Special Characters" The following characters play a special role in LaTeX and are called "special printing characters", or simply "special characters". 
- # $ % & ~ _ ^ \ { } + # $ % & ~ _ ^ \\ { } The special chars verbatim:: diff --git a/docutils/test/functional/input/data/unicode.txt b/docutils/test/functional/input/data/unicode.txt index 4bdd57653..bed6a8c08 100644 --- a/docutils/test/functional/input/data/unicode.txt +++ b/docutils/test/functional/input/data/unicode.txt @@ -1,27 +1,70 @@ -Various non-ASCII characters ----------------------------- +Non-ASCII characters +-------------------- + +Punctuation and footnote symbols = =================================== -© copyright sign -® registered sign -« left pointing guillemet -» right pointing guillemet – en-dash — em-dash -‘ single turned comma quotation mark -’ single comma quotation mark -‚ low single comma quotation mark -“ double turned comma quotation mark -” double comma quotation mark -„ low double comma quotation mark +‘ single turned comma quotation mark +’ single comma quotation mark +‚ low single comma quotation mark +“ double turned comma quotation mark +” double comma quotation mark +„ low double comma quotation mark † dagger ‡ double dagger +♦ black diamond suit +♥ black heart suit +♠ black spade suit +♣ black club suit … ellipsis ™ trade mark sign ⇔ left-right double arrow = =================================== -The following line should not be wrapped, because it uses -non-breakable spaces: -X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X +The `Latin-1 extended` Unicode block + +=== = = = = = = = = = = + .. 
0 1 2 3 4 5 6 7 8 9 +--- - - - - - - - - - - +160 ¡ ¢ £ ¥ ¦ § ¨ © +170 ª « ¬ ® ¯ ° ± ² ³ +180 ´ µ ¶ · ¸ ¹ º » ¼ ½ +190 ¾ ¿ À Á Â Ã Ä Å Æ Ç +200 È É Ê Ë Ì Í Î Ï Ð Ñ +210 Ò Ó Ô Õ Ö × Ø Ù Ú Û +220 Ü Ý Þ ß à á â ã ä å +230 æ ç è é ê ë ì í î ï +240 ð ñ ò ó ô õ ö ÷ ø ù +250 ú û ü ý þ ÿ +=== = = = = = = = = = = + +* The following line should not be wrapped, because it uses + no-break spaces (\\u00a0): + + X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X + +* Line wrapping with/without breakpoints marked by soft hyphens + (\\u00ad): + + pdnderdmdtdrischpdnderdmdtdrischpdnderdmdtdrischpdnderdmdtdrischpdnderdmdtdrisch + + pdnderdmdtdrischpdnderdmdtdrischpdnderdmdtdrischpdnderdmdtdrischpdnderdmdtdrisch + +* The currency sign (\\u00a4) is not supported by all fonts + (some have an Euro sign at its place). You might see an error + like:: + + ! Package textcomp Error: Symbol \textcurrency not provided by + (textcomp) font family ptm in TS1 encoding. + (textcomp) Default family used instead. + + (which in case of font family ptm is a false positive). Add either + + :warn: turn the error in a warning, use the default symbol (bitmap), or + :force,almostfull: use the symbol provided by the font at the users + risk, + + to the document options or use a different font package. diff --git a/docutils/test/test_writers/test_latex2e.py b/docutils/test/test_writers/test_latex2e.py index 60e939194..cc2302bd3 100755 --- a/docutils/test/test_writers/test_latex2e.py +++ b/docutils/test/test_writers/test_latex2e.py @@ -1,3 +1,4 @@ +# -*- coding: utf8 -*- #! 
/usr/bin/env python # $Id$ @@ -50,7 +51,7 @@ parts = dict( head_prefix = r"""\documentclass[a4paper,english]{article} """, requirements = r"""\usepackage[T1]{fontenc} -\usepackage[latin1]{inputenc} +\usepackage[utf8]{inputenc} \usepackage{ifthen} \usepackage{babel} """, @@ -79,6 +80,10 @@ r"""\usepackage{longtable} \newlength{\DUtablewidth} % internal use in tables """)) +head_textcomp = head_template.substitute( + dict(parts, requirements = parts['requirements'] + +r"""\usepackage{textcomp} % text symbol macros +""")) totest = {} totest_latex_toc = {} @@ -96,6 +101,15 @@ head + r""" """], ] +totest['textcomp'] = [ +["2 µm is just 2/1000000 m", +head_textcomp + r""" +2 µm is just 2/1000000 m + +\end{document} +"""], +] + totest['table_of_contents'] = [ # input ["""\ |
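Note on the `encode()` hunk in docutils/writers/latex2e/__init__.py: the patch appears to separate two concerns. The replacement macros from `textcomp_chars` are only added to the translation table when the LaTeX input encoding is not "utf8", while the `textcomp` requirement is recorded whenever such a character occurs in the text. That would explain why the new `totest['textcomp']` case expects the µ to pass through unchanged but still expects `\usepackage{textcomp}` in the preamble. The following is a minimal standalone sketch of that logic, not the writer's actual code. It is written for Python 3 (the patch itself targets Python 2 and uses `ur''` literals), the tables are trimmed to a few entries, the names `PIFONT_CHARS`, `TEXTCOMP_CHARS` and the free function `encode()` with its `requirements` argument are hypothetical, and the nesting of `table.update(textcomp_chars)` under the `if` is an assumption, since the indentation is not recoverable from this page.

```python
# Trimmed, illustrative copies of two replacement tables from the patch.
# The real tables are class attributes of LaTeXTranslator and much longer.
PIFONT_CHARS = {
    0x2665: r'\ding{170}',        # black heartsuit
    0x2666: r'\ding{169}',        # black diamondsuit
}
TEXTCOMP_CHARS = {
    0x00a4: r'\textcurrency{}',   # CURRENCY SIGN
    0x00b5: r'\textmu{}',         # MICRO SIGN
    0x20ac: r'\texteuro{}',       # EURO SIGN
}


def encode(text, latex_encoding, requirements):
    """Replace special characters and record the packages they need.

    `requirements` is a plain dict standing in for the translator's
    requirements mapping (assumption made for this sketch).
    """
    table = {}
    # pifont replacements are always applied.
    table.update(PIFONT_CHARS)
    # With inputenc's "utf8" the characters can stay as they are,
    # provided textcomp is loaded; otherwise use the macros.
    if not latex_encoding.startswith('utf8'):
        table.update(TEXTCOMP_CHARS)
    # Record package requirements independently of the encoding.
    for ch in text:
        if ord(ch) in PIFONT_CHARS:
            requirements['pifont'] = r'\usepackage{pifont}'
        if ord(ch) in TEXTCOMP_CHARS:
            requirements['textcomp'] = (
                r'\usepackage{textcomp} % text symbol macros')
    # str.translate accepts a dict mapping code points to strings.
    return text.translate(table)
```

Under these assumptions, `encode('2 \u00b5m', 'utf8', reqs)` returns the text unchanged and leaves only a `textcomp` entry in `reqs`, while the same call with `'latin1'` returns `2 \textmu{}m`. Scanning the text once per call, as the patch does, replaces the earlier list comprehension over all `pifont_chars` keys.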
