author | G. Branden Robinson <g.branden.robinson@gmail.com> | 2023-04-24 19:58:08 -0500 |
---|---|---|
committer | G. Branden Robinson <g.branden.robinson@gmail.com> | 2023-04-24 23:58:16 -0500 |
commit | 0223aef4164a7b07cb933a397894878bb61773b5 | |
tree | 62de6e9befb4add57befe2d2a338f7be784179f8 | |
parent | bd22b5bd0d26d0e3191f25a291960c4595dc873b | |
[docs]: Reduce use of term "entity".

Doug McIlroy noted this vague term, which groff employs for multiple
purposes. Eliminate its application to input processing. There is now
no longer such a thing as an "entity" in the groff language.

* doc/groff.texi (Character Translations): Do it. Also clarify
"nothing" as "the dummy character".
(Using Symbols): Do it. Also recast explanation of difference between
characters and glyphs. Explicitly state that spaces aren't glyphs.
Document that `rchar` request can't remove definitions supplied by
font description files.
(Ligatures and Kerning): Speak of "special characters", not
"entities".
(Other Differences): Recast discussion of character-to-glyph
transformation. Stop qualifying characters as "input". Recast
discussion of example.
* font/devutf8/NOTES: Revise use of terminology. Perform a Kemper
notectomy. Wrap long lines.
* man/groff.7.man (Request short reference) <char>: Speak of a "special
character", not an "entity".
<rchar>: Document that request can't remove definitions supplied by
font description files.
* man/groff_diff.7.man (Implementation differences): Sync with our
Texinfo manual.

The use of "entity" to describe how a glyph gets mapped back to a
character (sequence) for the HTML and terminal output devices is
retained. That usage is restricted to discussion of output drivers
(code comments and function names notwithstanding).
-rw-r--r-- | doc/groff.texi | 86 |
-rw-r--r-- | font/devutf8/NOTES | 34 |
-rw-r--r-- | man/groff.7.man | 22 |
-rw-r--r-- | man/groff_diff.7.man | 73 |
4 files changed, 116 insertions, 99 deletions
diff --git a/doc/groff.texi b/doc/groff.texi index 67f2d31f8..7cf609238 100644 --- a/doc/groff.texi +++ b/doc/groff.texi @@ -6360,10 +6360,10 @@ not qualify, so our first attempt got a warning. @section Identifiers @cindex identifiers -An @dfn{identifier} is a label for an object of syntactical -importance:@: a register, name (macro, string, or diversion), typeface, -color, special character, character class, environment, or stream. -Valid identifiers consist of one or more ordinary characters. +An @dfn{identifier} labels a GNU @code{troff} datum such as a register, +name (macro, string, or diversion), typeface, color, special character, +character class, environment, or stream. Valid identifiers consist of +one or more ordinary characters. @cindex ordinary character @cindex character, ordinary An @slanted{ordinary character} is an input character that is not a @@ -9371,7 +9371,7 @@ foo bar @endExample @noindent -It is even possible to map the space character to nothing: +Even the space character can be mapped to the dummy character. @Example .tr aa \& @@ -9397,8 +9397,8 @@ affected by @code{tr}. @item Translating character to glyphs where one of them or both are undefined -is possible also; @code{tr} does not check whether the entities in its -argument do exist. +is possible also; @code{tr} does not check whether the elements of its +argument exist. @xref{Gtroff Internals}. @@ -10527,13 +10527,16 @@ this is font 1 again @cindex character, distinguished from glyph @cindex ligature A @dfn{glyph} is a graphical representation of a @dfn{character}. While -a character is an abstract entity containing semantic information, a -glyph is something that can be actually seen on screen or paper. 
It is -possible that a character has multiple glyph representation forms (for -example, the character `A' can be either written in a roman or an italic -font, yielding two different glyphs); sometimes more than one character -maps to a single glyph (this is a @dfn{ligature}---the most common is -`fi'). +a character is an abstraction of semantic information, a glyph is +something that can be seen on screen or paper. A character has many +possible representation forms (for example, the character `A' can be +written in an upright or slanted typeface, producing distinct +glyphs). Sometimes, a sequence of characters map to a single glyph:@: +this is a @dfn{ligature}---the most common is `fi'. + +Space characters never become glyphs in GNU @code{troff}. If not +discarded (as when trailing on text lines), they are represented by +horizontal motions in the output. @cindex symbol @cindex special fonts @@ -11064,16 +11067,15 @@ request, but before the already mounted special fonts. @xref{Character Classes}. @endDefreq -@DefreqList {rchar, c1 c2 @dots{}} -@DefreqListEndx {rfschar, f c1 c2 @dots{}} +@DefreqList {rchar, c @dots{}} +@DefreqListEndx {rfschar, f c @dots{}} @cindex removing glyph definition (@code{rchar}, @code{rfschar}) @cindex glyph, removing definition (@code{rchar}, @code{rfschar}) @cindex fallback glyph, removing definition (@code{rchar}, @code{rfschar}) -Remove the definitions of glyphs @var{c1}, @var{c2},@tie{}@dots{}, +Remove definition of each ordinary or special character @var{c}, undoing the effect of a @code{char}, @code{fchar}, or @code{schar} -request. - -Spaces and tabs are optional between @var{cn}@tie{}arguments. +request. Those supplied by font description files cannot be removed. +Spaces and tabs may separate @var{c}@tie{}arguments. The request @code{rfschar} removes glyph definitions defined with @code{fschar} for font@tie{}@var{f}. @@ -11399,8 +11401,8 @@ supported `ff', `ffi', and `ffl' ligatures. 
Advanced typesetters or @code{troff} does not support these (yet). Only the current font is checked for ligatures and kerns; neither -special fonts nor entities defined with the @code{char} request (and its -siblings) are taken into account. +special fonts nor special charcters defined with the @code{char} request +(and its siblings) are taken into account. @DefreqList {lg, [@Var{flag}]} @DefregListEndx {.lg} @@ -17217,21 +17219,20 @@ each rounded down to the nearest multiple of@tie{}12. @cindex characters, input, and output glyphs, compatibility with @acronym{AT&T} @code{troff} @cindex glyphs, output, and input characters, compatibility with @acronym{AT&T} @code{troff} In GNU @code{troff} there is a fundamental difference between -(unformatted) input characters and (formatted) output glyphs. -Everything that affects how a glyph is output is stored with the glyph -node; once a glyph node has been constructed, it is unaffected by any -subsequent requests that are executed, including @code{bd}, @code{cs}, -@code{tkf}, @code{tr}, or @code{fp} requests. Normally, glyphs are -constructed from input characters immediately before the glyph is added -to the current output line. Macros, diversions, and strings are all, in -fact, the same type of object; they contain lists of input characters -and glyph nodes in any combination. Special characters can be both: -before being added to the output, they act as input entities; -afterward, they denote glyphs. A glyph node does not behave like an -input character for the purposes of macro processing; it does not -inherit any of the special properties that the input character from -which it was constructed might have had. Consider the following -example. +(unformatted) characters and (formatted) glyphs. 
Everything that +affects how a glyph is output is stored with the glyph node; once a +glyph node has been constructed, it is unaffected by any subsequent +requests that are executed, including @code{bd}, @code{cs}, @code{tkf}, +@code{tr}, or @code{fp} requests. Normally, glyphs are constructed from +characters immediately before the glyph is added to an output line. +Macros, diversions, and strings are all, in fact, the same type of +object; they contain a sequence of intermixed character and glyph nodes. +Special characters transform from one to the other:@: before being added +to the output, they behave as characters; afterward, they are glyphs. A +glyph node does not behave like a character node when it is processed by +a macro:@: it does not inherit any of the special properties that the +character from which it was constructed might have had. For example, +the input @Example .di x @@ -17242,11 +17243,12 @@ example. @endExample @noindent -It prints @samp{\\} in GNU @code{troff}; each pair of input backslashes -is turned into one output backslash and the resulting output backslashes -are not interpreted as escape characters when they are reread. -@acronym{AT&T} @code{troff} would interpret them as escape characters -when they were reread and would end up printing one @samp{\}. +produces @samp{\\} in GNU @code{troff}. Each pair of backslashes +becomes one backslash @emph{glyph}; the resulting backslashes are thus +not interpreted as escape @emph{characters} when they are reread as the +diversion is output. @acronym{AT&T} @code{troff} @emph{would} interpret +them as escape characters when rereading them and end up printing one +@samp{\}. 
@cindex printing backslash (@code{\\}, @code{\e}, @code{\E}, @code{\[rs]}) @cindex backslash, printing (@code{\\}, @code{\e}, @code{\E}, @code{\[rs]}) diff --git a/font/devutf8/NOTES b/font/devutf8/NOTES index 0906b8ade..c20edebd4 100644 --- a/font/devutf8/NOTES +++ b/font/devutf8/NOTES @@ -1,47 +1,47 @@ -Note that all \[charXXX] entity names have been removed from the font files. -They don't make sense for Unicode. +All \[charXXX] special character names have been removed from the font +files. They don't make sense for Unicode. -The following entity from the original troff manual (by Ossanna and -Kernighan) is unmapped: +The following special character name from the AT&T troff manual by +Ossanna and Kernighan is unmapped: bs shaded solid ball (Bell System logo, AT&T logo) -Character 0x002D has not been given a name because its Unicode name +Code point 0x002D has not been given a name because its Unicode name HYPHEN-MINUS is so ambiguous that it is unusable for serious typographic -use. +work. \[wp] has been mapped to 0x2118, because according to Unicode 4.1's NamesList.txt, U+2118 SCRIPT CAPITAL P is really a Weierstrass 'p', neither SCRIPT nor CAPITAL. The following line could be added; \[space] is known to devps but is not -documented and not known to devdvi (actually, there is no space glyph within -the TeX system). +documented and not known to devdvi (actually, there is no space glyph +within the TeX system). space 24 0 0x0020 -devps maps \[*U] to 'Upsilon1', which is equivalent to 0x03D2. We map it to -0x03A5 instead. +devps maps \[*U] to 'Upsilon1', which is equivalent to 0x03D2. We map +it to 0x03A5 instead. -devps maps \[*W] to 'Omega', which is equivalent to either 0x2126 or 0x03A9. -We map it to 0x03A9. +devps maps \[*W] to 'Omega', which is equivalent to either 0x2126 or +0x03A9. We map it to 0x03A9. -devps maps \[*D] to 'Delta', which is equivalent to either 0x2206 or 0x0394. -We map it to 0x0394. 
+devps maps \[*D] to 'Delta', which is equivalent to either 0x2206 or +0x0394. We map it to 0x0394. Adding Unicode characters ------------------------- Assume you want to use a Unicode character not provided in the list, say -U+20AC. You need to do two things: +U+20AC. You need to do two things: - Add a line u20AC 24 0 0x20AC (the second column is computed as 24 * wcwidth(0x20AC)) to the file - R.proto, or, when groff is already installed, to the four fonts files in - $(prefix)/share/groff/<version>/font/devutf8/. + R.proto, or, when groff is already installed, to the four font + description files in $(prefix)/share/groff/<version>/font/devutf8/. - In your source file, use the notation \[u20AC] to access it. diff --git a/man/groff.7.man b/man/groff.7.man index 07944567d..af1ebbbb2 100644 --- a/man/groff.7.man +++ b/man/groff.7.man @@ -1254,8 +1254,11 @@ containing them is surrounded by parentheses. .\" ==================================================================== . .\" BEGIN Keep (roughly) parallel with groff.texi node "Identifiers". -An identifier is a label for an object of syntactical importance: -a register, +An +.I identifier +labels a GNU +.I troff \" GNU +datum such as a register, name (macro, string, @@ -2581,7 +2584,7 @@ Reset no-break control character to .REQ .c2 "o" Recognize ordinary character .I o -as the no-break control character. +as no-break control character. . .TPx .REQ .cc @@ -2646,7 +2649,7 @@ by moving its location to . .TPx .REQ .char "c contents" -Define entity +Define ordinary or special character .I c as .IR contents . @@ -3800,11 +3803,16 @@ Change post-vertical line spacing according to .scaleindicator p ). . .TPx -.REQ .rchar "c1 c2 \fR\&.\|.\|.\&\fP" -Remove the definitions of entities +.REQ .rchar "c1 c2 \fR.\|.\|.\&\fP" +Remove definition of each ordinary or special character .IR c1 , .IR c2 , -\&.\|.\|.\& +\&.\|.\|.\& defined by a +.request .char , +.request .fchar , +or +.request .schar +request. . 
.TPx .REQ .rd "prompt" diff --git a/man/groff_diff.7.man b/man/groff_diff.7.man index e1a17c598..56770f0ef 100644 --- a/man/groff_diff.7.man +++ b/man/groff_diff.7.man @@ -5453,45 +5453,48 @@ each rounded down to the nearest multiple of\~12. . . .P -In -.IR groff , -there is a fundamental difference between unformatted input -characters, and formatted output characters (glyphs). +In GNU +.I troff \" GNU +there is a fundamental difference between (unformatted) characters and +(formatted) glyphs. . -Everything that affects how a glyph is output is stored with the glyph; -once a glyph has been constructed, +Everything that affects how a glyph is output is stored with the glyph +node; +once a glyph node has been constructed, it is unaffected by any subsequent requests that are executed, -including the -.BR .bd , -.BR .cs , -.BR .tkf , -.BR .tr , +including +.BR bd , +.BR cs , +.BR tkf , +.BR tr , or -.B .fp +.B fp requests. . Normally, -glyphs are constructed from input characters immediately before the -glyph is added to the current output line. +glyphs are constructed from characters immediately before the glyph is +added to an output line. . Macros, diversions, and strings are all, in fact, the same type of object; -they contain lists of input characters and glyphs in any combination. +they contain a sequence of intermixed character and glyph nodes. . -Special characters can be both: before being added to the output, -they act as input entities; -afterwards, -they denote glyphs. +Special characters transform from one to the other: +before being added to the output, +they behave as characters; +afterward, +they are glyphs. . -A glyph does not behave like an input character for the purposes of -macro processing; -it does not inherit any of the special properties that the input -character from which it was constructed might have had. 
+A glyph node does not behave like a character node when it is processed +by a macro: +it does not inherit any of the special properties that the character +from which it was constructed might have had. . -Consider the following example. +For example, +the input . .RS .EX @@ -5503,17 +5506,21 @@ Consider the following example. .EE .RE . -It prints +produces .RB \[lq] \[rs]\[rs] \[rq] -in -.IR groff ; -each pair of input backslashes is turned into one output backslash and -the resulting output backslashes are not interpreted as escape -characters when they are reread. +in GNU +.IR troff . \" GNU +Each pair of backslashes becomes one backslash +.I glyph; +the resulting backslashes are thus not interpreted as escape +.I characters +when they are reread as the diversion is output. . -.RI AT&T\~ troff -would interpret them as escape characters when they were reread and -would end up printing one +AT&T +.I troff \" AT&T +.I would +interpret them as escape characters when rereading them and end up +printing one .RB \[lq] \[rs] \[rq]. . . |
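As a rough illustration of the `rchar` behavior this commit documents, the following input is a minimal sketch and is not taken from the groff sources. The special character name `foo` is hypothetical, chosen only for the example, and the last part assumes that the selected font's description file defines `\[em]`.

```roff
.\" Sketch only: "foo" is a hypothetical special character name.
.\" Define the special character \[foo] with the char request.
.char \[foo] (foo)
A\[foo]B
.br
.\" rchar undoes a definition made by char, fchar, or schar.
.rchar \[foo]
.\" \[foo] is now undefined again.
A\[foo]B
.br
.\" Per the revised documentation, rchar cannot remove a definition
.\" supplied by a font description file (assumed here for \[em]).
.rchar \[em]
A\[em]B
```

Formatted with something like `groff -Tutf8`, the second output line would be expected to lack the `(foo)` text, and GNU troff will typically warn that the special character is not defined. The `\[em]` on the last line should still produce an em dash, since its definition comes from the font description file rather than from a `char` request.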