
Pelican internals
#################

This section describes how Pelican works internally. As you'll see, it's quite
simple, but a bit of documentation doesn't hurt.  :)

The :doc:`report` section also contains an excerpt of a report the original
author wrote, with some information about the software design.

.. _report: :doc:`report`

Overall structure
=================

What Pelican does is take a list of files and process them into some sort of
output. Usually, the input files are reStructuredText and Markdown files, and
the output is a blog, but both input and output can be anything you want.

The logic is separated into different classes and concepts:

* **Writers** are responsible for writing files: .html files, RSS feeds, and so
  on. Since those operations are commonly used, the object is created once and
  then passed to the generators.

* **Readers** are used to read from various formats (HTML, Markdown and
  reStructuredText for now, but the system is extensible). Given a file, they
  return metadata (author, tags, category, etc.) and content (HTML-formatted).

* **Generators** generate the different outputs. For instance, Pelican comes
  with ``ArticlesGenerator`` and ``PageGenerator``. Given a configuration, they
  can do whatever they want. Most of the time, that means generating files
  from inputs.

* Pelican also uses templates, so it's easy to write your own theme. The
  syntax is `Jinja2 <https://palletsprojects.com/p/jinja/>`_ and is very easy to learn, so
  don't hesitate to jump in and build your own theme.
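
As a rough illustration of how these pieces fit together, here is a toy
pipeline in plain Python. It does not use Pelican's real classes:
``ToyReader``, ``ToyGenerator``, ``ToyWriter`` and ``build_demo`` are
hypothetical names, and the ``Title:`` header format is an assumption made for
the example only::

    import os
    import tempfile


    class ToyReader:
        """Turn a source file into (content, metadata)."""

        def read(self, source_path):
            with open(source_path, encoding="utf-8") as f:
                lines = f.read().splitlines()
            # Assumed format: first line is "Title: ...", the rest is the body.
            metadata = {"title": lines[0].split(":", 1)[1].strip()}
            content = "<p>" + " ".join(lines[1:]).strip() + "</p>"
            return content, metadata


    class ToyWriter:
        """Write rendered text into the output directory."""

        def __init__(self, output_path):
            self.output_path = output_path

        def write(self, name, text):
            path = os.path.join(self.output_path, name)
            with open(path, "w", encoding="utf-8") as f:
                f.write(text)
            return path


    class ToyGenerator:
        """Read every input file, then hand the result to the writer."""

        def __init__(self, reader, paths):
            self.reader = reader
            self.paths = paths

        def generate_output(self, writer):
            written = []
            for path in self.paths:
                content, metadata = self.reader.read(path)
                # A real theme would render a Jinja2 template here.
                html = "<h1>{}</h1>\n{}".format(metadata["title"], content)
                name = os.path.splitext(os.path.basename(path))[0] + ".html"
                written.append(writer.write(name, html))
            return written


    def build_demo():
        """Run the toy pipeline on one file and return the generated HTML."""
        with tempfile.TemporaryDirectory() as tmp:
            source = os.path.join(tmp, "hello.txt")
            with open(source, "w", encoding="utf-8") as f:
                f.write("Title: Hello\nFirst post.\n")
            writer = ToyWriter(tmp)  # created once, then passed along
            generator = ToyGenerator(ToyReader(), [source])
            (out_path,) = generator.generate_output(writer)
            with open(out_path, encoding="utf-8") as f:
                return f.read()

Calling ``build_demo()`` reads one file, renders it and writes ``hello.html``
into a temporary directory. The point is only the division of labour: the
reader knows the input format, the writer knows how to store output, and the
generator connects the two.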

How to implement a new reader?
==============================

Is there an awesome markup language you want to add to Pelican? Well, the only
thing you have to do is to create a class with a ``read`` method that returns
HTML content and some metadata.

Take a look at the Markdown reader::

    from pelican.readers import BaseReader
    from pelican.utils import pelican_open
    from markdown import Markdown

    class MarkdownReader(BaseReader):
        enabled = True

        def read(self, source_path):
            """Parse content and metadata of markdown files"""

            with pelican_open(source_path) as text:
                # Map each extension name to its (here empty) configuration.
                md_extensions = {'markdown.extensions.meta': {},
                                 'markdown.extensions.codehilite': {}}
                md = Markdown(extensions=list(md_extensions),
                              extension_configs=md_extensions)
                content = md.convert(text)

            # The meta extension stores every value as a list of lines;
            # only the first line of each value is used here.
            metadata = {}
            for name, value in md.Meta.items():
                name = name.lower()
                meta = self.process_metadata(name, value[0])
                metadata[name] = meta
            return content, metadata

Simple, isn't it?

If your new reader requires additional Python dependencies, then you should
wrap their ``import`` statements in a ``try...except`` block.  Then inside the
reader's class, set the ``enabled`` class attribute to mark import success or
failure. This makes it possible for users to continue using their favourite
markup method without needing to install modules for formats they don't use.
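
The guarded import described above could look roughly like this. It is a
sketch under assumptions, not code from Pelican: ``hypothetical_markup`` and
its ``to_html`` function are invented placeholders for whatever library your
format needs, and the small ``BaseReader`` class only stands in for
``pelican.readers.BaseReader`` so that the snippet runs on its own::

    try:
        import hypothetical_markup  # placeholder for an optional dependency
    except ImportError:
        hypothetical_markup = None


    class BaseReader:
        """Stand-in for ``pelican.readers.BaseReader`` (assumption)."""
        enabled = True


    class HypotheticalReader(BaseReader):
        # False when the import above failed, which marks the reader as
        # unavailable instead of crashing at import time.
        enabled = hypothetical_markup is not None

        def read(self, source_path):
            """Return (HTML content, metadata) for the given file."""
            with open(source_path, encoding="utf-8") as f:
                text = f.read()
            # ``to_html`` is an invented function of the placeholder module.
            content = hypothetical_markup.to_html(text)
            return content, {}

With this layout, importing the module never fails; only ``enabled`` changes
depending on whether the dependency is installed.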

How to implement a new generator?
=================================

Generators have two important methods. You don't have to implement both; only
the ones that exist will be called.

* ``generate_context`` is called first, for all the generators.
  Do whatever you have to do, and update the global context if needed. This
  context is shared between all generators, and will be passed to the
  templates. For instance, the ``PageGenerator`` ``generate_context`` method
  finds all the pages, transforms them into objects, and populates the context
  with them. Be careful *not* to output anything using this context at this
  stage, as other generators are still likely to change it.

* ``generate_output`` is then called. And guess what it is made for? Oh,
  generating the output.  :) This is where you may want to look at the context
  and call the methods of the ``writer`` object, which is passed as the first
  argument of this method. In the ``PageGenerator`` example, this method
  looks at all the pages recorded in the global context and writes a file to
  disk (using the writer method ``write_file``) for each page encountered.
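
The two phases can be sketched as follows. Everything here is illustrative
and assumed rather than taken from Pelican: ``RecordingWriter`` is a stand-in
that only records calls to ``write_file``, ``ToyPageGenerator`` and
``PageCounter`` are hypothetical generators, ``run`` is a simplified driver
loop, and the string ``"page"`` takes the place of a real template object::

    class RecordingWriter:
        """Stand-in writer: records calls instead of touching the disk."""

        def __init__(self):
            self.written = []

        def write_file(self, name, template, context, **kwargs):
            self.written.append((name, kwargs))


    class ToyPageGenerator:
        def __init__(self, context, titles):
            self.context = context  # shared by all generators
            self.titles = titles

        def generate_context(self):
            # Phase 1: only collect data; nothing is written yet.
            self.context["pages"] = [
                {"title": title, "save_as": title.lower() + ".html"}
                for title in self.titles
            ]

        def generate_output(self, writer):
            # Phase 2: every generator has updated the context by now.
            for page in self.context["pages"]:
                writer.write_file(page["save_as"], "page", self.context,
                                  page=page)


    class PageCounter:
        """Defines only ``generate_context``, so phase 2 skips it."""

        def __init__(self, context):
            self.context = context

        def generate_context(self):
            self.context["counter_ran"] = True


    def run(generators, writer):
        # Build the whole context first, then write; call only the
        # methods that a generator actually defines.
        for generator in generators:
            if hasattr(generator, "generate_context"):
                generator.generate_context()
        for generator in generators:
            if hasattr(generator, "generate_output"):
                generator.generate_output(writer)


    context = {}
    writer = RecordingWriter()
    run([ToyPageGenerator(context, ["About", "Contact"]),
         PageCounter(context)], writer)

After ``run`` returns, ``writer.written`` holds one entry per page, and
``PageCounter`` has contributed to the shared context without ever producing
output. Splitting the work this way is what lets one generator rely on data
that another generator added to the context.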