summaryrefslogtreecommitdiff
path: root/Doc/extending/newtypes_tutorial.rst
blob: b4bf9b9e6f75f826bcc3de48406c3f4d52885002 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
.. highlightlang:: c

.. _defining-new-types:

**********************************
Defining Extension Types: Tutorial
**********************************

.. sectionauthor:: Michael Hudson <mwh@python.net>
.. sectionauthor:: Dave Kuhlman <dkuhlman@rexx.com>
.. sectionauthor:: Jim Fulton <jim@zope.com>


Python allows the writer of a C extension module to define new types that
can be manipulated from Python code, much like the built-in :class:`str`
and :class:`list` types.  The code for all extension types follows a
pattern, but there are some details that you need to understand before you
can get started.  This document is a gentle introduction to the topic.


.. _dnt-basics:

The Basics
==========

The :term:`CPython` runtime sees all Python objects as variables of type
:c:type:`PyObject\*`, which serves as a "base type" for all Python objects.
The :c:type:`PyObject` structure itself only contains the object's
:term:`reference count` and a pointer to the object's "type object".
This is where the action is; the type object determines which (C) functions
get called by the interpreter when, for instance, an attribute gets looked up
on an object, a method called, or it is multiplied by another object.  These
C functions are called "type methods".

So, if you want to define a new extension type, you need to create a new type
object.

This sort of thing can only be explained by example, so here's a minimal, but
complete, module that defines a new type named :class:`Custom` inside a C
extension module :mod:`custom`:

.. note::
   What we're showing here is the traditional way of defining *static*
   extension types.  It should be adequate for most uses.  The C API also
   allows defining heap-allocated extension types using the
   :c:func:`PyType_FromSpec` function, which isn't covered in this tutorial.

.. literalinclude:: ../includes/custom.c

Now that's quite a bit to take in at once, but hopefully bits will seem familiar
from the previous chapter.  This file defines three things:

#. What a :class:`Custom` **object** contains: this is the ``CustomObject``
   struct, which is allocated once for each :class:`Custom` instance.
#. How the :class:`Custom` **type** behaves: this is the ``CustomType`` struct,
   which defines a set of flags and function pointers that the interpreter
   inspects when specific operations are requested.
#. How to initialize the :mod:`custom` module: this is the ``PyInit_custom``
   function and the associated ``custommodule`` struct.

The first bit is::

   typedef struct {
       PyObject_HEAD
   } CustomObject;

This is what a Custom object will contain.  ``PyObject_HEAD`` is mandatory
at the start of each object struct and defines a field called ``ob_base``
of type :c:type:`PyObject`, containing a pointer to a type object and a
reference count (these can be accessed using the macros :c:macro:`Py_REFCNT`
and :c:macro:`Py_TYPE` respectively).  The reason for the macro is to
abstract away the layout and to enable additional fields in debug builds.

.. note::
   There is no semicolon above after the :c:macro:`PyObject_HEAD` macro.
   Be wary of adding one by accident: some compilers will complain.

Of course, objects generally store additional data besides the standard
``PyObject_HEAD`` boilerplate; for example, here is the definition for
standard Python floats::

   typedef struct {
       PyObject_HEAD
       double ob_fval;
   } PyFloatObject;

The second bit is the definition of the type object. ::

   static PyTypeObject CustomType = {
       PyVarObject_HEAD_INIT(NULL, 0)
       .tp_name = "custom.Custom",
       .tp_doc = "Custom objects",
       .tp_basicsize = sizeof(CustomObject),
       .tp_itemsize = 0,
       .tp_flags = Py_TPFLAGS_DEFAULT,
       .tp_new = PyType_GenericNew,
   };

.. note::
   We recommend using C99-style designated initializers as above, to
   avoid listing all the :c:type:`PyTypeObject` fields that you don't care
   about and also to avoid caring about the fields' declaration order.

The actual definition of :c:type:`PyTypeObject` in :file:`object.h` has
many more :ref:`fields <type-structs>` than the definition above.  The
remaining fields will be filled with zeros by the C compiler, and it's
common practice to not specify them explicitly unless you need them.

We're going to pick it apart, one field at a time::

   PyVarObject_HEAD_INIT(NULL, 0)

This line is mandatory boilerplate to initialize the ``ob_base``
field mentioned above. ::

   .tp_name = "custom.Custom",

The name of our type.  This will appear in the default textual representation of
our objects and in some error messages, for example:

.. code-block:: pycon

   >>> "" + custom.Custom()
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   TypeError: can only concatenate str (not "custom.Custom") to str

Note that the name is a dotted name that includes both the module name and the
name of the type within the module. The module in this case is :mod:`custom` and
the type is :class:`Custom`, so we set the type name to :class:`custom.Custom`.
Using the real dotted import path is important to make your type compatible
with the :mod:`pydoc` and :mod:`pickle` modules. ::

   .tp_basicsize = sizeof(CustomObject),
   .tp_itemsize = 0,

This is so that Python knows how much memory to allocate when creating
new :class:`Custom` instances.  :c:member:`~PyTypeObject.tp_itemsize` is
only used for variable-sized objects and should otherwise be zero.

.. note::

   If you want your type to be subclassable from Python, and your type has the same
   :c:member:`~PyTypeObject.tp_basicsize` as its base type, you may have problems with multiple
   inheritance.  A Python subclass of your type will have to list your type first
   in its :attr:`~class.__bases__`, or else it will not be able to call your type's
   :meth:`__new__` method without getting an error.  You can avoid this problem by
   ensuring that your type has a larger value for :c:member:`~PyTypeObject.tp_basicsize` than its
   base type does.  Most of the time, this will be true anyway, because either your
   base type will be :class:`object`, or else you will be adding data members to
   your base type, and therefore increasing its size.

We set the class flags to :const:`Py_TPFLAGS_DEFAULT`. ::

   .tp_flags = Py_TPFLAGS_DEFAULT,

All types should include this constant in their flags.  It enables all of the
members defined until at least Python 3.3.  If you need further members,
you will need to OR the corresponding flags.

We provide a doc string for the type in :c:member:`~PyTypeObject.tp_doc`. ::

   .tp_doc = "Custom objects",

To enable object creation, we have to provide a :c:member:`~PyTypeObject.tp_new`
handler.  This is the equivalent of the Python method :meth:`__new__`, but
has to be specified explicitly.  In this case, we can just use the default
implementation provided by the API function :c:func:`PyType_GenericNew`. ::

   .tp_new = PyType_GenericNew,

Everything else in the file should be familiar, except for some code in
:c:func:`PyInit_custom`::

   if (PyType_Ready(&CustomType) < 0)
       return;

This initializes the :class:`Custom` type, filling in a number of members
to the appropriate default values, including :attr:`ob_type` that we initially
set to *NULL*. ::

   PyModule_AddObject(m, "Custom", (PyObject *) &CustomType);

This adds the type to the module dictionary.  This allows us to create
:class:`Custom` instances by calling the :class:`Custom` class:

.. code-block:: pycon

   >>> import custom
   >>> mycustom = custom.Custom()

That's it!  All that remains is to build it; put the above code in a file called
:file:`custom.c` and:

.. code-block:: python

   from distutils.core import setup, Extension
   setup(name="custom", version="1.0",
         ext_modules=[Extension("custom", ["custom.c"])])

in a file called :file:`setup.py`; then typing

.. code-block:: shell-session

   $ python setup.py build

at a shell should produce a file :file:`custom.so` in a subdirectory; move to
that directory and fire up Python --- you should be able to ``import custom`` and
play around with Custom objects.

That wasn't so hard, was it?

Of course, the current Custom type is pretty uninteresting. It has no data and
doesn't do anything. It can't even be subclassed.

.. note::
   While this documentation showcases the standard :mod:`distutils` module
   for building C extensions, it is recommended in real-world use cases to
   use the newer and better-maintained ``setuptools`` library.  Documentation
   on how to do this is out of scope for this document and can be found in
   the `Python Packaging User's Guide <https://packaging.python.org/tutorials/distributing-packages/>`_.


Adding data and methods to the Basic example
============================================

Let's extend the basic example to add some data and methods.  Let's also make
the type usable as a base class. We'll create a new module, :mod:`custom2` that
adds these capabilities:

.. literalinclude:: ../includes/custom2.c


This version of the module has a number of changes.

We've added an extra include::

   #include <structmember.h>

This include provides declarations that we use to handle attributes, as
described a bit later.

The  :class:`Custom` type now has three data attributes in its C struct,
*first*, *last*, and *number*.  The *first* and *last* variables are Python
strings containing first and last names.  The *number* attribute is a C integer.

The object structure is updated accordingly::

   typedef struct {
       PyObject_HEAD
       PyObject *first; /* first name */
       PyObject *last;  /* last name */
       int number;
   } CustomObject;

Because we now have data to manage, we have to be more careful about object
allocation and deallocation.  At a minimum, we need a deallocation method::

   static void
   Custom_dealloc(CustomObject *self)
   {
       Py_XDECREF(self->first);
       Py_XDECREF(self->last);
       Py_TYPE(self)->tp_free((PyObject *) self);
   }

which is assigned to the :c:member:`~PyTypeObject.tp_dealloc` member::

   .tp_dealloc = (destructor) Custom_dealloc,

This method first clears the reference counts of the two Python attributes.
:c:func:`Py_XDECREF` correctly handles the case where its argument is
*NULL* (which might happen here if ``tp_new`` failed midway).  It then
calls the :c:member:`~PyTypeObject.tp_free` member of the object's type
(computed by ``Py_TYPE(self)``) to free the object's memory.  Note that
the object's type might not be :class:`CustomType`, because the object may
be an instance of a subclass.

.. note::
   The explicit cast to ``destructor`` above is needed because we defined
   ``Custom_dealloc`` to take a ``CustomObject *`` argument, but the ``tp_dealloc``
   function pointer expects to receive a ``PyObject *`` argument.  Otherwise,
   the compiler will emit a warning.  This is object-oriented polymorphism,
   in C!

We want to make sure that the first and last names are initialized to empty
strings, so we provide a ``tp_new`` implementation::

   static PyObject *
   Custom_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
   {
       CustomObject *self;
       self = (CustomObject *) type->tp_alloc(type, 0);
       if (self != NULL) {
           self->first = PyUnicode_FromString("");
           if (self->first == NULL) {
               Py_DECREF(self);
               return NULL;
           }
           self->last = PyUnicode_FromString("");
           if (self->last == NULL) {
               Py_DECREF(self);
               return NULL;
           }
           self->number = 0;
       }
       return (PyObject *) self;
   }

and install it in the :c:member:`~PyTypeObject.tp_new` member::

   .tp_new = Custom_new,

The ``tp_new`` handler is responsible for creating (as opposed to initializing)
objects of the type.  It is exposed in Python as the :meth:`__new__` method.
It is not required to define a ``tp_new`` member, and indeed many extension
types will simply reuse :c:func:`PyType_GenericNew` as done in the first
version of the ``Custom`` type above.  In this case, we use the ``tp_new``
handler to initialize the ``first`` and ``last`` attributes to non-*NULL*
default values.

``tp_new`` is passed the type being instantiated (not necessarily ``CustomType``,
if a subclass is instantiated) and any arguments passed when the type was
called, and is expected to return the instance created.  ``tp_new`` handlers
always accept positional and keyword arguments, but they often ignore the
arguments, leaving the argument handling to initializer (a.k.a. ``tp_init``
in C or ``__init__`` in Python) methods.

.. note::
   ``tp_new`` shouldn't call ``tp_init`` explicitly, as the interpreter
   will do it itself.

The ``tp_new`` implementation calls the :c:member:`~PyTypeObject.tp_alloc`
slot to allocate memory::

   self = (CustomObject *) type->tp_alloc(type, 0);

Since memory allocation may fail, we must check the :c:member:`~PyTypeObject.tp_alloc`
result against *NULL* before proceeding.

.. note::
   We didn't fill the :c:member:`~PyTypeObject.tp_alloc` slot ourselves. Rather
   :c:func:`PyType_Ready` fills it for us by inheriting it from our base class,
   which is :class:`object` by default.  Most types use the default allocation
   strategy.

.. note::
   If you are creating a co-operative :c:member:`~PyTypeObject.tp_new` (one
   that calls a base type's :c:member:`~PyTypeObject.tp_new` or :meth:`__new__`),
   you must *not* try to determine what method to call using method resolution
   order at runtime.  Always statically determine what type you are going to
   call, and call its :c:member:`~PyTypeObject.tp_new` directly, or via
   ``type->tp_base->tp_new``.  If you do not do this, Python subclasses of your
   type that also inherit from other Python-defined classes may not work correctly.
   (Specifically, you may not be able to create instances of such subclasses
   without getting a :exc:`TypeError`.)

We also define an initialization function which accepts arguments to provide
initial values for our instance::

   static int
   Custom_init(CustomObject *self, PyObject *args, PyObject *kwds)
   {
       static char *kwlist[] = {"first", "last", "number", NULL};
       PyObject *first = NULL, *last = NULL, *tmp;

       if (!PyArg_ParseTupleAndKeywords(args, kwds, "|OOi", kwlist,
                                        &first, &last,
                                        &self->number))
           return -1;

       if (first) {
           tmp = self->first;
           Py_INCREF(first);
           self->first = first;
           Py_XDECREF(tmp);
       }
       if (last) {
           tmp = self->last;
           Py_INCREF(last);
           self->last = last;
           Py_XDECREF(tmp);
       }
       return 0;
   }

by filling the :c:member:`~PyTypeObject.tp_init` slot. ::

   .tp_init = (initproc) Custom_init,

The :c:member:`~PyTypeObject.tp_init` slot is exposed in Python as the
:meth:`__init__` method.  It is used to initialize an object after it's
created.  Initializers always accept positional and keyword arguments,
and they should return either ``0`` on success or ``-1`` on error.

Unlike the ``tp_new`` handler, there is no guarantee that ``tp_init``
is called at all (for example, the :mod:`pickle` module by default
doesn't call :meth:`__init__` on unpickled instances).  It can also be
called multiple times.  Anyone can call the :meth:`__init__` method on
our objects.  For this reason, we have to be extra careful when assigning
the new attribute values.  We might be tempted, for example to assign the
``first`` member like this::

   if (first) {
       Py_XDECREF(self->first);
       Py_INCREF(first);
       self->first = first;
   }

But this would be risky.  Our type doesn't restrict the type of the
``first`` member, so it could be any kind of object.  It could have a
destructor that causes code to be executed that tries to access the
``first`` member; or that destructor could release the
:term:`Global interpreter Lock` and let arbitrary code run in other
threads that accesses and modifies our object.

To be paranoid and protect ourselves against this possibility, we almost
always reassign members before decrementing their reference counts.  When
don't we have to do this?

* when we absolutely know that the reference count is greater than 1;

* when we know that deallocation of the object [#]_ will neither release
  the :term:`GIL` nor cause any calls back into our type's code;

* when decrementing a reference count in a :c:member:`~PyTypeObject.tp_dealloc`
  handler on a type which doesn't support cyclic garbage collection [#]_.

We want to expose our instance variables as attributes. There are a
number of ways to do that. The simplest way is to define member definitions::

   static PyMemberDef Custom_members[] = {
       {"first", T_OBJECT_EX, offsetof(CustomObject, first), 0,
        "first name"},
       {"last", T_OBJECT_EX, offsetof(CustomObject, last), 0,
        "last name"},
       {"number", T_INT, offsetof(CustomObject, number), 0,
        "custom number"},
       {NULL}  /* Sentinel */
   };

and put the definitions in the :c:member:`~PyTypeObject.tp_members` slot::

   .tp_members = Custom_members,

Each member definition has a member name, type, offset, access flags and
documentation string.  See the :ref:`Generic-Attribute-Management` section
below for details.

A disadvantage of this approach is that it doesn't provide a way to restrict the
types of objects that can be assigned to the Python attributes.  We expect the
first and last names to be strings, but any Python objects can be assigned.
Further, the attributes can be deleted, setting the C pointers to *NULL*.  Even
though we can make sure the members are initialized to non-*NULL* values, the
members can be set to *NULL* if the attributes are deleted.

We define a single method, :meth:`Custom.name()`, that outputs the objects name as the
concatenation of the first and last names. ::

   static PyObject *
   Custom_name(CustomObject *self, PyObject *Py_UNUSED(ignored))
   {
       if (self->first == NULL) {
           PyErr_SetString(PyExc_AttributeError, "first");
           return NULL;
       }
       if (self->last == NULL) {
           PyErr_SetString(PyExc_AttributeError, "last");
           return NULL;
       }
       return PyUnicode_FromFormat("%S %S", self->first, self->last);
   }

The method is implemented as a C function that takes a :class:`Custom` (or
:class:`Custom` subclass) instance as the first argument.  Methods always take an
instance as the first argument. Methods often take positional and keyword
arguments as well, but in this case we don't take any and don't need to accept
a positional argument tuple or keyword argument dictionary. This method is
equivalent to the Python method:

.. code-block:: python

   def name(self):
       return "%s %s" % (self.first, self.last)

Note that we have to check for the possibility that our :attr:`first` and
:attr:`last` members are *NULL*.  This is because they can be deleted, in which
case they are set to *NULL*.  It would be better to prevent deletion of these
attributes and to restrict the attribute values to be strings.  We'll see how to
do that in the next section.

Now that we've defined the method, we need to create an array of method
definitions::

   static PyMethodDef Custom_methods[] = {
       {"name", (PyCFunction) Custom_name, METH_NOARGS,
        "Return the name, combining the first and last name"
       },
       {NULL}  /* Sentinel */
   };

(note that we used the :const:`METH_NOARGS` flag to indicate that the method
is expecting no arguments other than *self*)

and assign it to the :c:member:`~PyTypeObject.tp_methods` slot::

   .tp_methods = Custom_methods,

Finally, we'll make our type usable as a base class for subclassing.  We've
written our methods carefully so far so that they don't make any assumptions
about the type of the object being created or used, so all we need to do is
to add the :const:`Py_TPFLAGS_BASETYPE` to our class flag definition::

   .tp_flags = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE,

We rename :c:func:`PyInit_custom` to :c:func:`PyInit_custom2`, update the
module name in the :c:type:`PyModuleDef` struct, and update the full class
name in the :c:type:`PyTypeObject` struct.

Finally, we update our :file:`setup.py` file to build the new module:

.. code-block:: python

   from distutils.core import setup, Extension
   setup(name="custom", version="1.0",
         ext_modules=[
            Extension("custom", ["custom.c"]),
            Extension("custom2", ["custom2.c"]),
            ])


Providing finer control over data attributes
============================================

In this section, we'll provide finer control over how the :attr:`first` and
:attr:`last` attributes are set in the :class:`Custom` example. In the previous
version of our module, the instance variables :attr:`first` and :attr:`last`
could be set to non-string values or even deleted. We want to make sure that
these attributes always contain strings.

.. literalinclude:: ../includes/custom3.c


To provide greater control, over the :attr:`first` and :attr:`last` attributes,
we'll use custom getter and setter functions.  Here are the functions for
getting and setting the :attr:`first` attribute::

   static PyObject *
   Custom_getfirst(CustomObject *self, void *closure)
   {
       Py_INCREF(self->first);
       return self->first;
   }

   static int
   Custom_setfirst(CustomObject *self, PyObject *value, void *closure)
   {
       PyObject *tmp;
       if (value == NULL) {
           PyErr_SetString(PyExc_TypeError, "Cannot delete the first attribute");
           return -1;
       }
       if (!PyUnicode_Check(value)) {
           PyErr_SetString(PyExc_TypeError,
                           "The first attribute value must be a string");
           return -1;
       }
       tmp = self->first;
       Py_INCREF(value);
       self->first = value;
       Py_DECREF(tmp);
       return 0;
   }

The getter function is passed a :class:`Custom` object and a "closure", which is
a void pointer.  In this case, the closure is ignored.  (The closure supports an
advanced usage in which definition data is passed to the getter and setter. This
could, for example, be used to allow a single set of getter and setter functions
that decide the attribute to get or set based on data in the closure.)

The setter function is passed the :class:`Custom` object, the new value, and the
closure.  The new value may be *NULL*, in which case the attribute is being
deleted.  In our setter, we raise an error if the attribute is deleted or if its
new value is not a string.

We create an array of :c:type:`PyGetSetDef` structures::

   static PyGetSetDef Custom_getsetters[] = {
       {"first", (getter) Custom_getfirst, (setter) Custom_setfirst,
        "first name", NULL},
       {"last", (getter) Custom_getlast, (setter) Custom_setlast,
        "last name", NULL},
       {NULL}  /* Sentinel */
   };

and register it in the :c:member:`~PyTypeObject.tp_getset` slot::

   .tp_getset = Custom_getsetters,

The last item in a :c:type:`PyGetSetDef` structure is the "closure" mentioned
above.  In this case, we aren't using a closure, so we just pass *NULL*.

We also remove the member definitions for these attributes::

   static PyMemberDef Custom_members[] = {
       {"number", T_INT, offsetof(CustomObject, number), 0,
        "custom number"},
       {NULL}  /* Sentinel */
   };

We also need to update the :c:member:`~PyTypeObject.tp_init` handler to only
allow strings [#]_ to be passed::

   static int
   Custom_init(CustomObject *self, PyObject *args, PyObject *kwds)
   {
       static char *kwlist[] = {"first", "last", "number", NULL};
       PyObject *first = NULL, *last = NULL, *tmp;

       if (!PyArg_ParseTupleAndKeywords(args, kwds, "|UUi", kwlist,
                                        &first, &last,
                                        &self->number))
           return -1;

       if (first) {
           tmp = self->first;
           Py_INCREF(first);
           self->first = first;
           Py_DECREF(tmp);
       }
       if (last) {
           tmp = self->last;
           Py_INCREF(last);
           self->last = last;
           Py_DECREF(tmp);
       }
       return 0;
   }

With these changes, we can assure that the ``first`` and ``last`` members are
never *NULL* so we can remove checks for *NULL* values in almost all cases.
This means that most of the :c:func:`Py_XDECREF` calls can be converted to
:c:func:`Py_DECREF` calls.  The only place we can't change these calls is in
the ``tp_dealloc`` implementation, where there is the possibility that the
initialization of these members failed in ``tp_new``.

We also rename the module initialization function and module name in the
initialization function, as we did before, and we add an extra definition to the
:file:`setup.py` file.


Supporting cyclic garbage collection
====================================

Python has a :term:`cyclic garbage collector (GC) <garbage collection>` that
can identify unneeded objects even when their reference counts are not zero.
This can happen when objects are involved in cycles.  For example, consider:

.. code-block:: pycon

   >>> l = []
   >>> l.append(l)
   >>> del l

In this example, we create a list that contains itself. When we delete it, it
still has a reference from itself. Its reference count doesn't drop to zero.
Fortunately, Python's cyclic garbage collector will eventually figure out that
the list is garbage and free it.

In the second version of the :class:`Custom` example, we allowed any kind of
object to be stored in the :attr:`first` or :attr:`last` attributes [#]_.
Besides, in the second and third versions, we allowed subclassing
:class:`Custom`, and subclasses may add arbitrary attributes.  For any of
those two reasons, :class:`Custom` objects can participate in cycles:

.. code-block:: pycon

   >>> import custom3
   >>> class Derived(custom3.Custom): pass
   ...
   >>> n = Derived()
   >>> n.some_attribute = n

To allow a :class:`Custom` instance participating in a reference cycle to
be properly detected and collected by the cyclic GC, our :class:`Custom` type
needs to fill two additional slots and to enable a flag that enables these slots:

.. literalinclude:: ../includes/custom4.c


First, the traversal method lets the cyclic GC know about subobjects that could
participate in cycles::

   static int
   Custom_traverse(CustomObject *self, visitproc visit, void *arg)
   {
       int vret;
       if (self->first) {
           vret = visit(self->first, arg);
           if (vret != 0)
               return vret;
       }
       if (self->last) {
           vret = visit(self->last, arg);
           if (vret != 0)
               return vret;
       }
       return 0;
   }

For each subobject that can participate in cycles, we need to call the
:c:func:`visit` function, which is passed to the traversal method. The
:c:func:`visit` function takes as arguments the subobject and the extra argument
*arg* passed to the traversal method.  It returns an integer value that must be
returned if it is non-zero.

Python provides a :c:func:`Py_VISIT` macro that automates calling visit
functions.  With :c:func:`Py_VISIT`, we can minimize the amount of boilerplate
in ``Custom_traverse``::

   static int
   Custom_traverse(CustomObject *self, visitproc visit, void *arg)
   {
       Py_VISIT(self->first);
       Py_VISIT(self->last);
       return 0;
   }

.. note::
   The :c:member:`~PyTypeObject.tp_traverse` implementation must name its
   arguments exactly *visit* and *arg* in order to use :c:func:`Py_VISIT`.

Second, we need to provide a method for clearing any subobjects that can
participate in cycles::

   static int
   Custom_clear(CustomObject *self)
   {
       Py_CLEAR(self->first);
       Py_CLEAR(self->last);
       return 0;
   }

Notice the use of the :c:func:`Py_CLEAR` macro.  It is the recommended and safe
way to clear data attributes of arbitrary types while decrementing
their reference counts.  If you were to call :c:func:`Py_XDECREF` instead
on the attribute before setting it to *NULL*, there is a possibility
that the attribute's destructor would call back into code that reads the
attribute again (*especially* if there is a reference cycle).

.. note::
   You could emulate :c:func:`Py_CLEAR` by writing::

      PyObject *tmp;
      tmp = self->first;
      self->first = NULL;
      Py_XDECREF(tmp);

   Nevertheless, it is much easier and less error-prone to always
   use :c:func:`Py_CLEAR` when deleting an attribute.  Don't
   try to micro-optimize at the expense of robustness!

The deallocator ``Custom_dealloc`` may call arbitrary code when clearing
attributes.  It means the circular GC can be triggered inside the function.
Since the GC assumes reference count is not zero, we need to untrack the object
from the GC by calling :c:func:`PyObject_GC_UnTrack` before clearing members.
Here is our reimplemented deallocator using :c:func:`PyObject_GC_UnTrack`
and ``Custom_clear``::

   static void
   Custom_dealloc(CustomObject *self)
   {
       PyObject_GC_UnTrack(self);
       Custom_clear(self);
       Py_TYPE(self)->tp_free((PyObject *) self);
   }

Finally, we add the :const:`Py_TPFLAGS_HAVE_GC` flag to the class flags::

   .tp_flags = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE | Py_TPFLAGS_HAVE_GC,

That's pretty much it.  If we had written custom :c:member:`~PyTypeObject.tp_alloc` or
:c:member:`~PyTypeObject.tp_free` handlers, we'd need to modify them for cyclic
garbage collection.  Most extensions will use the versions automatically provided.


Subclassing other types
=======================

It is possible to create new extension types that are derived from existing
types. It is easiest to inherit from the built in types, since an extension can
easily use the :c:type:`PyTypeObject` it needs. It can be difficult to share
these :c:type:`PyTypeObject` structures between extension modules.

In this example we will create a :class:`SubList` type that inherits from the
built-in :class:`list` type. The new type will be completely compatible with
regular lists, but will have an additional :meth:`increment` method that
increases an internal counter:

.. code-block:: pycon

   >>> import sublist
   >>> s = sublist.SubList(range(3))
   >>> s.extend(s)
   >>> print(len(s))
   6
   >>> print(s.increment())
   1
   >>> print(s.increment())
   2

.. literalinclude:: ../includes/sublist.c


As you can see, the source code closely resembles the :class:`Custom` examples in
previous sections. We will break down the main differences between them. ::

   typedef struct {
       PyListObject list;
       int state;
   } SubListObject;

The primary difference for derived type objects is that the base type's
object structure must be the first value.  The base type will already include
the :c:func:`PyObject_HEAD` at the beginning of its structure.

When a Python object is a :class:`SubList` instance, its ``PyObject *`` pointer
can be safely cast to both ``PyListObject *`` and ``SubListObject *``::

   static int
   SubList_init(SubListObject *self, PyObject *args, PyObject *kwds)
   {
       if (PyList_Type.tp_init((PyObject *) self, args, kwds) < 0)
           return -1;
       self->state = 0;
       return 0;
   }

We see above how to call through to the :attr:`__init__` method of the base
type.

This pattern is important when writing a type with custom
:c:member:`~PyTypeObject.tp_new` and :c:member:`~PyTypeObject.tp_dealloc`
members.  The :c:member:`~PyTypeObject.tp_new` handler should not actually
create the memory for the object with its :c:member:`~PyTypeObject.tp_alloc`,
but let the base class handle it by calling its own :c:member:`~PyTypeObject.tp_new`.

The :c:type:`PyTypeObject` struct supports a :c:member:`~PyTypeObject.tp_base`
specifying the type's concrete base class.  Due to cross-platform compiler
issues, you can't fill that field directly with a reference to
:c:type:`PyList_Type`; it should be done later in the module initialization
function::

   PyMODINIT_FUNC
   PyInit_sublist(void)
   {
       PyObject* m;
       SubListType.tp_base = &PyList_Type;
       if (PyType_Ready(&SubListType) < 0)
           return NULL;

       m = PyModule_Create(&sublistmodule);
       if (m == NULL)
           return NULL;

       Py_INCREF(&SubListType);
       PyModule_AddObject(m, "SubList", (PyObject *) &SubListType);
       return m;
   }

Before calling :c:func:`PyType_Ready`, the type structure must have the
:c:member:`~PyTypeObject.tp_base` slot filled in.  When we are deriving an
existing type, it is not necessary to fill out the :c:member:`~PyTypeObject.tp_alloc`
slot with :c:func:`PyType_GenericNew` -- the allocation function from the base
type will be inherited.

After that, calling :c:func:`PyType_Ready` and adding the type object to the
module is the same as with the basic :class:`Custom` examples.


.. rubric:: Footnotes

.. [#] This is true when we know that the object is a basic type, like a string or a
   float.

.. [#] We relied on this in the :c:member:`~PyTypeObject.tp_dealloc` handler
   in this example, because our type doesn't support garbage collection.

.. [#] We now know that the first and last members are strings, so perhaps we
   could be less careful about decrementing their reference counts, however,
   we accept instances of string subclasses.  Even though deallocating normal
   strings won't call back into our objects, we can't guarantee that deallocating
   an instance of a string subclass won't call back into our objects.

.. [#] Also, even with our attributes restricted to strings instances, the user
   could pass arbitrary :class:`str` subclasses and therefore still create
   reference cycles.