author | Behdad Esfahbod <behdad@behdad.org> | 2015-08-31 09:53:16 +0100 |
---|---|---|
committer | Behdad Esfahbod <behdad@behdad.org> | 2015-08-31 09:53:16 +0100 |
commit | c424b41705b50055c7f92b268cf78a2680af73af (patch) | |
tree | 8d2f98ca31b93acf2ae41a351ce4f2fef1c2b5d6 | |
parent | 31594b98af0c9181982c77d8d3803753007f8fd4 (diff) | |
parent | 5470e744dd264c2dc33437a68d20bcf7c5ffb905 (diff) | |
download | harfbuzz-c424b41705b50055c7f92b268cf78a2680af73af.tar.gz |
Merge pull request #129 from simoncozens/docs
First two chapters. More to follow.
-rw-r--r-- | docs/usermanual-ch01.xml | 115 | ||||
-rw-r--r-- | docs/usermanual-ch02.xml | 182 | ||||
-rw-r--r-- | docs/usermanual-ch03.xml | 77 | ||||
-rw-r--r-- | docs/usermanual-ch04.xml | 18 | ||||
-rw-r--r-- | docs/usermanual-ch05.xml | 13 | ||||
-rw-r--r-- | docs/usermanual-ch06.xml | 8 |
6 files changed, 413 insertions, 0 deletions
diff --git a/docs/usermanual-ch01.xml b/docs/usermanual-ch01.xml new file mode 100644 index 00000000..1ee0cbee --- /dev/null +++ b/docs/usermanual-ch01.xml @@ -0,0 +1,115 @@ +<sect1 id="what-is-harfbuzz"> + <title>What is Harfbuzz?</title> + <para> + Harfbuzz is a <emphasis>text shaping engine</emphasis>. It solves + the problem of selecting and positioning glyphs from a font given a + Unicode string. + </para> + <sect2 id="why-do-i-need-it"> + <title>Why do I need it?</title> + <para> + Text shaping is an integral part of preparing text for display. It + is a fairly low-level operation; Harfbuzz is used directly by + graphics rendering libraries such as Pango, and the layout engines + in Firefox, LibreOffice and Chromium. Unless you are + <emphasis>writing</emphasis> one of these layout engines yourself, + you will probably not need to use Harfbuzz - normally higher-level + libraries will turn text into glyphs for you. + </para> + <para> + However, if you <emphasis>are</emphasis> writing a layout engine + or graphics library yourself, you will need to perform text + shaping, and this is where Harfbuzz can help you. Here are some + reasons why you need it: + </para> + <itemizedlist> + <listitem> + <para> + OpenType fonts contain a set of glyphs, indexed by glyph ID. + The glyph ID within the font does not necessarily relate to a + Unicode codepoint. For instance, some fonts have the letter + "a" as glyph ID 1. To pull the right glyph out of + the font in order to display it, you need to consult a table + within the font (the "cmap" table) which maps + Unicode codepoints to glyph IDs. Text shaping turns codepoints + into glyph IDs. + </para> + </listitem> + <listitem> + <para> + Many OpenType fonts contain ligatures: combinations of + characters which are rendered together. For instance, it's + common for the <literal>fi</literal> combination to appear in + print as the single ligature "ﬁ". 
Whether you should + render text as <literal>fi</literal> or "ﬁ" does not + depend on the input text, but on the capabilities of the font + and the level of ligature application you wish to perform. + Text shaping involves querying the font's ligature tables and + determining what substitutions should be made. + </para> + </listitem> + <listitem> + <para> + While ligatures like "ﬁ" are typographic + refinements, some languages <emphasis>require</emphasis> such + substitutions to be made in order to display text correctly. + In Tamil, when the letter "TTA" (ட) is + followed by "U" (உ), the combination should appear + as the single glyph "டு". The sequence of Unicode + characters "டஉ" needs to be rendered as a single + glyph from the font - text shaping chooses the correct glyph + from the sequence of characters provided. + </para> + </listitem> + <listitem> + <para> + Similarly, each Arabic character can have up to four different variants: + within a font, there may be glyphs for the initial, medial, + final, and isolated forms of each letter. Unicode only encodes + one codepoint per character, and so a Unicode string will not + tell you which glyph to use. Text shaping chooses the correct + form of the letter and returns the correct glyph from the font + that you need to render. + </para> + </listitem> + <listitem> + <para> + Other languages have marks and accents which need to be + rendered in certain positions around a base character. For + instance, the Moldovan language has the Cyrillic letter + "zhe" (ж) with a breve accent, like so: ӂ. Some + fonts will contain this character as an individual glyph, + whereas other fonts will not contain a zhe-with-breve glyph + but expect the rendering engine to form the character by + overlaying the two glyphs ж and ˘. Where you should draw the + combining breve depends on the height of the preceding glyph. 
+ Again, for Arabic, the correct positioning of vowel marks + depends on the height of the character on which you are + placing the mark. Text shaping tells you whether you have a + precomposed glyph within your font or if you need to compose a + glyph yourself out of combining marks, and if so, where to + position those marks. + </para> + </listitem> + </itemizedlist> + <para> + If this is something that you need to do, then you need a text + shaping engine: you could use Uniscribe if you are using Windows; + you could use CoreText on OS X; or you could use Harfbuzz. In the + rest of this manual, we are going to assume that you are the + implementor of a text layout engine. + </para> + </sect2> + <sect2 id="why-is-it-called-harfbuzz"> + <title>Why is it called Harfbuzz?</title> + <para> + Harfbuzz began its life as text shaping code within the FreeType + project (and you will see references to the FreeType authors + within the source code copyright declarations), but was then + abstracted out into its own project. This project is maintained by + Behdad Esfahbod, and named Harfbuzz. Originally, it was a shaping + engine for OpenType fonts - "Harfbuzz" is the Persian + for "open type". + </para> + </sect2> +</sect1>
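The first reason listed in this chapter, mapping Unicode codepoints to glyph IDs through the font's "cmap" table, can be pictured with a small sketch. This is a toy model only: `cmap_entry_t` and `toy_cmap_lookup` are hypothetical names, the table is a plain sorted array rather than a real OpenType cmap subtable, and Harfbuzz performs this lookup internally as part of shaping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One entry of a toy character map: Unicode codepoint -> glyph ID.
 * Hypothetical type, not part of Harfbuzz or OpenType. */
typedef struct {
    uint32_t codepoint;
    uint16_t glyph_id;
} cmap_entry_t;

/* Look up a codepoint in a table sorted by ascending codepoint.
 * Returns 0 when the codepoint is not mapped; glyph ID 0 is the
 * conventional "missing glyph" (.notdef) in OpenType fonts. */
uint16_t toy_cmap_lookup(const cmap_entry_t *table, size_t count,
                         uint32_t codepoint)
{
    assert(table != NULL || count == 0);

    size_t lo = 0, hi = count;          /* binary search over [lo, hi) */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (table[mid].codepoint == codepoint)
            return table[mid].glyph_id;
        if (table[mid].codepoint < codepoint)
            lo = mid + 1;
        else
            hi = mid;
    }
    return 0;
}
```

With a table in which U+0061 ("a") maps to glyph ID 1, as in the chapter's example, looking up 0x61 yields 1 and looking up an unmapped codepoint yields 0.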
\ No newline at end of file diff --git a/docs/usermanual-ch02.xml b/docs/usermanual-ch02.xml new file mode 100644 index 00000000..f0a161dd --- /dev/null +++ b/docs/usermanual-ch02.xml @@ -0,0 +1,182 @@ +<sect1 id="hello-harfbuzz"> + <title>Hello, Harfbuzz</title> + <para> + Here's the simplest Harfbuzz program that can possibly work. We will improve + it later. + </para> + <orderedlist numeration="arabic"> + <listitem> + <para> + Create a buffer and put your text in it. + </para> + </listitem> + </orderedlist> + <programlisting language="C"> + #include <hb.h> + hb_buffer_t *buf; + buf = hb_buffer_create(); + hb_buffer_add_utf8(buf, text, strlen(text), 0, strlen(text)); +</programlisting> + <orderedlist numeration="arabic"> + <listitem override="2"> + <para> + Guess the script, language and direction of the buffer. + </para> + </listitem> + </orderedlist> + <programlisting language="C"> + hb_buffer_guess_segment_properties(buf); +</programlisting> + <orderedlist numeration="arabic"> + <listitem override="3"> + <para> + Create a face and a font, using FreeType for now. + </para> + </listitem> + </orderedlist> + <programlisting language="C"> + #include <hb-ft.h> + FT_New_Face(ft_library, font_path, index, &face); + hb_font_t *font = hb_ft_font_create(face, NULL); +</programlisting> + <orderedlist numeration="arabic"> + <listitem override="4"> + <para> + Shape! + </para> + </listitem> + </orderedlist> + <programlisting language="C"> + hb_shape(font, buf, NULL, 0); +</programlisting> + <orderedlist numeration="arabic"> + <listitem override="5"> + <para> + Get the glyph and position information. + </para> + </listitem> + </orderedlist> + <programlisting language="C"> + hb_glyph_info_t *glyph_info = hb_buffer_get_glyph_infos(buf, &glyph_count); + hb_glyph_position_t *glyph_pos = hb_buffer_get_glyph_positions(buf, &glyph_count); +</programlisting> + <orderedlist numeration="arabic"> + <listitem override="6"> + <para> + Iterate over each glyph. 
+ </para> + </listitem> + </orderedlist> + <programlisting language="C"> + for (i = 0; i < glyph_count; ++i) { + glyphid = glyph_info[i].codepoint; + x_offset = glyph_pos[i].x_offset / 64.0; + y_offset = glyph_pos[i].y_offset / 64.0; + x_advance = glyph_pos[i].x_advance / 64.0; + y_advance = glyph_pos[i].y_advance / 64.0; + draw_glyph(glyphid, cursor_x + x_offset, cursor_y + y_offset); + cursor_x += x_advance; + cursor_y += y_advance; + } +</programlisting> + <orderedlist numeration="arabic"> + <listitem override="7"> + <para> + Tidy up. + </para> + </listitem> + </orderedlist> + <programlisting language="C"> + hb_buffer_destroy(buf); + hb_font_destroy(font); +</programlisting> + <sect2 id="what-harfbuzz-doesnt-do"> + <title>What Harfbuzz doesn't do</title> + <para> + The code above will take a UTF-8 string, shape it, and give you the + information required to lay it out correctly on a single + horizontal (or vertical) line using the font provided. That is the + extent of Harfbuzz's responsibility. + </para> + <para> + If you are implementing a text layout engine you may have other + responsibilities that Harfbuzz will not help you with: + </para> + <itemizedlist> + <listitem> + <para> + Harfbuzz won't help you with bidirectionality. If you want to + lay out text with mixed Hebrew and English, you will need to + ensure that the buffer provided to Harfbuzz has those + characters in the correct layout order. This will be different + from the logical order in which the Unicode text is stored. 
In + other words, the user will hit the keys in the following + sequence: + </para> + <programlisting> +A B C [space] ג ב א [space] D E F + </programlisting> + <para> + but will expect to see in the output: + </para> + <programlisting> +ABC אבג DEF + </programlisting> + <para> + This reordering is called <emphasis>bidi processing</emphasis> + ("bidi" is short for bidirectional), and there's an + algorithm in an annex to the Unicode Standard (UAX #9) which tells you how + to reorder a string from logical order into presentation order. + Before sending your string to Harfbuzz, you may need to apply the + bidi algorithm to it. Libraries such as ICU and fribidi can do + this for you. + </para> + </listitem> + <listitem> + <para> + Harfbuzz won't help you with text that contains different font + properties. For instance, if you have the string "a + <emphasis>huge</emphasis> breakfast", and you expect + "huge" to be italic, you will need to send three + strings to Harfbuzz: <literal>a</literal>, in your Roman font; + <literal>huge</literal> using your italic font; and + <literal>breakfast</literal> using your Roman font again. + Similarly if you change font, font size, script, language or + direction within your string, you will need to shape each run + independently and then output them independently. Harfbuzz + expects to shape a run of characters sharing the same + properties. + </para> + </listitem> + <listitem> + <para> + Harfbuzz won't help you with line breaking, hyphenation or + justification. As mentioned above, it lays out the string + along a <emphasis>single line</emphasis> of, notionally, + infinite length. If you want to find out where the potential + word, sentence and line break points are in your text, you + could use the ICU library's break iterator functions. + </para> + <para> + Harfbuzz can tell you how wide a shaped piece of text is, which is + useful input to a justification algorithm, but it knows nothing + about paragraphs, lines or line lengths. 
Nor will it adjust the + space between words to fit them proportionally into a line. If you + want to lay out text in paragraphs, you will probably want to send + each word of your text to Harfbuzz to determine its shaped width + after glyph substitutions, then work out how many words will fit + on a line, and then finally output each word of the line separated + by a space of the correct size to fully justify the paragraph. + </para> + </listitem> + </itemizedlist> + <para> + If you are a layout engine implementor, Harfbuzz will help you with the + interface between your text and your font, and that's something + that you'll need - what you then do with the glyphs that your font + returns is up to you. The example we saw above is enough to get us + started using Harfbuzz. Now we are going to use the remainder of + Harfbuzz's API to refine that example and improve our text shaping + capabilities. + </para> + </sect2> +</sect1>
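The paragraph-layout recipe described at the end of this chapter (measure each shaped word, see how many fit on a line, then stretch the spaces) can be sketched roughly as follows. This is one possible greedy approach under stated assumptions, not something Harfbuzz provides: `run_width` and `fit_line` are hypothetical helper names, and the advances are assumed to be 26.6 fixed-point values, as the division by 64.0 in step 6 of the example implies.

```c
#include <assert.h>
#include <stddef.h>

/* Width of one shaped word: the sum of its glyphs' horizontal
 * advances. The inputs stand in for the x_advance fields of the
 * positions returned after shaping that word (assumed 26.6 fixed
 * point, hence the division by 64.0). */
double run_width(const int *x_advances, size_t glyph_count)
{
    double width = 0.0;
    for (size_t i = 0; i < glyph_count; ++i)
        width += x_advances[i] / 64.0;
    return width;
}

/* Greedy line fill over pre-measured word widths.
 * Returns how many leading words fit within line_width when
 * separated by at least min_space, and stores in *space the
 * inter-word space that stretches those words to the full width.
 * A return value of 0 means even the first word is too wide; the
 * caller has to decide how to handle that case. */
size_t fit_line(const double *word_widths, size_t word_count,
                double line_width, double min_space, double *space)
{
    assert(space != NULL);

    double used = 0.0;   /* words plus minimum spaces so far */
    size_t n = 0;
    while (n < word_count) {
        double needed = word_widths[n] + (n > 0 ? min_space : 0.0);
        if (used + needed > line_width)
            break;
        used += needed;
        ++n;
    }

    if (n > 1) {
        /* Share the leftover width equally among the n - 1 gaps. */
        double words_only = used - (double)(n - 1) * min_space;
        *space = (line_width - words_only) / (double)(n - 1);
    } else {
        *space = min_space;
    }
    return n;
}
```

A caller would typically invoke `fit_line` repeatedly, advancing through the word array by the returned count, and would usually leave the last line of a paragraph unstretched; both choices are left out of the sketch.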
\ No newline at end of file diff --git a/docs/usermanual-ch03.xml b/docs/usermanual-ch03.xml new file mode 100644 index 00000000..66ec0a88 --- /dev/null +++ b/docs/usermanual-ch03.xml @@ -0,0 +1,77 @@ +<sect1 id="buffers-language-script-and-direction"> + <title>Buffers, language, script and direction</title> + <para> + The input to Harfbuzz is a series of Unicode characters, stored in a + buffer. In this chapter, we'll look at how to set up a buffer with + the text that we want and then customize the properties of the + buffer. + </para> + <sect2 id="creating-and-destroying-buffers"> + <title>Creating and destroying buffers</title> + <para> + As we saw in our initial example, a buffer is created and + initialized with <literal>hb_buffer_create()</literal>. This + produces a new, empty buffer object, instantiated with some + default values and ready to accept your Unicode strings. + </para> + <para> + Harfbuzz manages the memory of objects that it creates (such as + buffers), so you don't have to. When you have finished working on + a buffer, you can call <literal>hb_buffer_destroy()</literal>: + </para> + <programlisting language="C"> + hb_buffer_t *buffer = hb_buffer_create(); + ... + hb_buffer_destroy(buffer); +</programlisting> + <para> + This will destroy the object and free its associated memory - + unless some other part of the program holds a reference to this + buffer. If you acquire a Harfbuzz buffer from another subsystem + and want to ensure that it is not freed when someone + else destroys it, you should increase its reference count: + </para> + <programlisting language="C"> +void somefunc(hb_buffer_t *buffer) { + buffer = hb_buffer_reference(buffer); + ... 
+</programlisting> + <para> + And then decrease it once you're done with it: + </para> + <programlisting language="C"> + hb_buffer_destroy(buffer); +} +</programlisting> + <para> + To throw away all the data in your buffer and start from scratch, + call <literal>hb_buffer_reset(buffer)</literal>. If you want to + throw away the string in the buffer but keep the options, you can + instead call <literal>hb_buffer_clear_contents(buffer)</literal>. + </para> + </sect2> + <sect2 id="adding-text-to-the-buffer"> + <title>Adding text to the buffer</title> + <para> + Now we have a brand new Harfbuzz buffer. Let's start filling it + with text! From Harfbuzz's perspective, a buffer is just a stream + of Unicode codepoints, but your input string is probably in one of + the standard Unicode character encodings (UTF-8, UTF-16, UTF-32). + </para> + </sect2> + <sect2 id="setting-buffer-properties"> + <title>Setting buffer properties</title> + <para> + </para> + </sect2> + <sect2 id="what-about-the-other-scripts"> + <title>What about the other scripts?</title> + <para> + </para> + </sect2> + <sect2 id="customizing-unicode-functions"> + <title>Customizing Unicode functions</title> + <para> + </para> + </sect2> +</sect1>
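The reference-counting behaviour described in "Creating and destroying buffers" can be illustrated with a toy model. The sketch below is not Harfbuzz code: `toy_buffer_t` and the `toy_buffer_*` functions are hypothetical stand-ins that only mimic the create / reference / destroy pattern, on the assumption that a new object starts with one reference and is freed when the last reference is dropped.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a reference-counted object such as a
 * shaping buffer; only the count itself is modelled. */
typedef struct {
    int ref_count;
} toy_buffer_t;

/* A new object starts out owned by its creator: one reference. */
toy_buffer_t *toy_buffer_create(void)
{
    toy_buffer_t *b = malloc(sizeof *b);
    if (b != NULL)
        b->ref_count = 1;
    return b;
}

/* Take an additional reference and hand the same pointer back,
 * which allows the "buffer = reference(buffer)" idiom. */
toy_buffer_t *toy_buffer_reference(toy_buffer_t *b)
{
    assert(b != NULL);
    ++b->ref_count;
    return b;
}

/* Drop one reference. The memory is released only when the last
 * reference goes away. Returns 1 if the object was freed, else 0. */
int toy_buffer_destroy(toy_buffer_t *b)
{
    assert(b != NULL && b->ref_count > 0);
    if (--b->ref_count > 0)
        return 0;
    free(b);
    return 1;
}
```

In this model, a function that calls `toy_buffer_reference` on entry and `toy_buffer_destroy` on exit leaves the count unchanged, so the object survives even if its original owner destroys it in between.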
\ No newline at end of file diff --git a/docs/usermanual-ch04.xml b/docs/usermanual-ch04.xml new file mode 100644 index 00000000..c469147d --- /dev/null +++ b/docs/usermanual-ch04.xml @@ -0,0 +1,18 @@ +<sect1 id="fonts-and-faces"> + <title>Fonts and faces</title> + <sect2 id="using-freetype"> + <title>Using FreeType</title> + <para> + </para> + </sect2> + <sect2 id="using-harfbuzzs-native-opentype-implementation"> + <title>Using Harfbuzz's native OpenType implementation</title> + <para> + </para> + </sect2> + <sect2 id="using-your-own-font-functions"> + <title>Using your own font functions</title> + <para> + </para> + </sect2> +</sect1>
\ No newline at end of file diff --git a/docs/usermanual-ch05.xml b/docs/usermanual-ch05.xml new file mode 100644 index 00000000..6f501749 --- /dev/null +++ b/docs/usermanual-ch05.xml @@ -0,0 +1,13 @@ +<sect1 id="shaping-and-shape-plans"> + <title>Shaping and shape plans</title> + <sect2 id="opentype-features"> + <title>OpenType features</title> + <para> + </para> + </sect2> + <sect2 id="plans-and-caching"> + <title>Plans and caching</title> + <para> + </para> + </sect2> +</sect1>
\ No newline at end of file diff --git a/docs/usermanual-ch06.xml b/docs/usermanual-ch06.xml new file mode 100644 index 00000000..ca674c0c --- /dev/null +++ b/docs/usermanual-ch06.xml @@ -0,0 +1,8 @@ +<sect1 id="glyph-information"> + <title>Glyph information</title> + <sect2 id="names-and-numbers"> + <title>Names and numbers</title> + <para> + </para> + </sect2> +</sect1>
\ No newline at end of file |